We need to know what’s happening with AI
Without transparency, society is flying blind toward potentially catastrophic AI capabilities
AI systems may be approaching genuinely dangerous capabilities. Companies are sounding the alarm: Anthropic and OpenAI warn that frontier AI models are “on the cusp” of helping non-experts create bioweapons. Google says that Gemini could reach a “critical capability level” — meaning it “may pose a significant risk of severe harm without appropriate mitigations” — for cybersecurity risks in the “next few months”. What once took years of expertise might soon be a few prompts away — and the world isn’t ready.
Yet as we hurtle towards these dangerous capabilities, companies are becoming less transparent about their work. Despite having committed to the Biden White House to “publish reports for all new significant model public releases,” Google released Gemini 2.5 Pro weeks before publishing information about its safety testing results, while OpenAI skipped releasing a system card for GPT-4.1 entirely.
Some argue that the White House commitments don’t specifically require companies to release their model cards alongside the new models. But even if Google didn’t technically break its commitment, the behavior is still concerning. For weeks, it was unclear if Google had conducted any safety testing, or if its new model possessed dangerous capabilities.
This opacity has real consequences. In the absence of binding regulation, transparency is the only viable check on AI companies’ practices. It allows external experts to identify flaws in safety testing protocols, spotting mistakes that even well-intentioned companies might make. (OpenAI’s testing protocols notably failed to catch GPT-4o’s recent sycophantic update, a mistake external experts might have spotted.)
Transparency is also critical for ensuring society can prepare for dangerous AI capabilities, by building up defences in other areas. When a company learns that its model can explain novel pathways to biological weapons, that information must reach biosecurity experts and government officials immediately. Right now, a model capable of generating bioweapons could be put on the market without the public — or politicians — even knowing. That’s a concerning position to be in.
If AI’s rapid progress continues, this could become a big problem. Ajeya Cotra, a senior program officer on technical AI safety at Open Philanthropy, worries that “companies are likely to end up breaking their [safety policies] (at least in spirit)” if AI reaches high risk levels in the next few years. (Open Philanthropy is Transformer’s primary funder.) Companies, she fears, simply won’t have time to prepare suitable mitigation strategies. Zach Stein-Perlman, an independent researcher who assesses labs’ safety efforts for AI Lab Watch, agrees, noting that Anthropic doesn’t have a plan for reaching the security levels required for AGI by 2026, the timeline on which CEO Dario Amodei predicts such systems could arrive. In a world with rapidly advancing AI capabilities, transparency may be the only way for the rest of us to know whether companies are responding adequately.
Amid this backdrop, a growing coalition of AI experts is advocating for increased transparency. Deep-learning pioneer Yoshua Bengio argues “we need more transparency in managing AI risks.” AI governance researchers Daniel Kokotajlo and Dean Ball — now a White House AI advisor — have said that transparency “is the key to making AI go well,” noting that deliberative regulation “is simply impossible if the public, and even many subject-matter experts, have no idea what is being built.” Even Gary Marcus, typically skeptical of AI safety concerns, testified to the Senate that independent oversight was “vital” given the “mind-boggling” sums of money at stake.
While experts disagree over what exactly meaningful transparency entails, three themes recur. First, companies should share enough concrete information about their models’ capabilities (including benchmark results, survey data, forecasts and red-teaming results) to enable external experts to evaluate the companies’ safety claims. Currently, disclosure practices vary widely in adequacy. While praising OpenAI for sharing its biorisk evals at all, one independent analysis concluded that “based on the multiple-choice proxy tests they actually used, I can’t tell if o1-preview meets OpenAI’s definition of ‘high risk.’ And I don’t know if OpenAI can tell either.” This challenge will only get harder as more robust evaluations and mitigations are required, such as expensive uplift studies. Companies should also disclose breakthrough capabilities the moment they’re discovered. “The public deserves disclosure closer in time to when the capabilities are noticed to get more advanced time to prepare,” argues Peter Wildeford, co-founder of the Institute for AI Policy and Strategy.
Second, companies should share more detailed information about their risk projections over the coming months and years, alongside detailed descriptions of the measures they are taking to understand and limit those risks. While companies already publish versions of these, known as preparedness frameworks, Stein-Perlman deems most “inadequate, barely better than nothing”. More detailed information would allow external evaluators to assess whether companies are adequately preparing for impending risks, and sound the alarm if not.
Finally, companies should guarantee whistleblower protections, particularly for employees who report on violations of the law or extreme risks. This would add a layer of credibility by increasing the likelihood that deceptive practices would be exposed. Alternatively, a third-party evaluator could conduct thorough audits to check for compliance. “The AI companies kind of both do the homework and grade the homework,” observes Wildeford. “There’s inherent conflicts of interest in that sort of system.”
At a time when companies are barely meeting their existing transparency commitments, asking for more might seem ambitious. But experts insist it’s an essential step towards tackling the risks that companies themselves are warning about — and, crucially, is much lighter-touch than actual regulation. “We’re not stepping in and telling [the companies] what they have to do,” Wildeford says. “What we are doing is requesting the necessary information that it takes to evaluate whether companies are telling the truth, making it harder for them to sweep things under the rug and pretend their models are safer and more reliable than they are.” In a world where companies can’t be held accountable for their safety commitments, it would be nice if society at least understood the coming risks.
Disclosure: The author’s partner works at Google DeepMind.
I appreciate the concerns raised here, and the push for greater transparency is understandable. But I can’t help feeling a bit frustrated. By now, saying “we need transparency” is becoming something of a mantra.
We all want the upsides of transparency (accountability, scrutiny, early warning) without the downsides, like leaking sensitive information across labs or to adversarial states. But hitting that Goldilocks zone is hard, and it’s only going to get harder as we see more soft nationalisation and geopolitical entrenchment.
We need practical mechanisms to create transparency. How do you safely share capability evaluations across labs? What kind of third-party audit frameworks actually work under NDAs or security constraints? Can governments mandate structured disclosure without driving models underground?
This piece does gesture at some of that, but I’d love to see the discourse shift even more toward implementable, scalable mechanisms.