Can AI embrace whistleblowing?
As Anthropic prepares to publish its whistleblowing policy, can the industry make the most of protecting those who speak out?
It is the stuff of journalist dreams, and executive nightmares – a news outlet publishes reams of damaging revelations about an organization, making liberal use of its internal documents, and crediting everything to a whistleblower.
Many of the biggest stories in tech have originated in just this way, though the companies concerned sometimes dispute whether the source is a “whistleblower” or merely a disgruntled employee. Meta faced revelations from Frances Haugen and Sarah Wynn-Williams. Google had a very public dispute with AI ethicist Timnit Gebru.
And the US government itself battled months of revelations from NSA contractor Edward Snowden – documents I covered extensively as part of the Guardian’s core reporting team on the story.
Those working on big, consequential projects in tech will always face internal disagreement and safety or ethics concerns, some of which will spill over to the media or regulators. Working out how to handle those concerns is a crucial challenge for the companies developing frontier AI models, where they come to the fore almost daily.
In theory, building robust channels to support and protect whistleblowers could serve as a useful pressure valve for AI giants, creating a mechanism to tackle problems early, improving staff satisfaction, and reducing the risk of people taking their fears, and their documents, elsewhere.
It is also likely the only way the public, regulators and even AI workers might find out about whether voluntary safety standards and assessments are being applied, or if companies are taking appropriate steps, such as effective cybersecurity, to protect the integrity of what they are building. Many potentially catastrophic risks could also emerge long before models are released to the public, making whistleblowers vital to identifying dangers from internal deployment.
But executives rightly fear the risk of making their companies unmanageable, or giving staff carte blanche to release trade secrets under the guise of whistleblowing. How might the balance be struck?
And Anthropic makes two
At least some of the major AI players seem to be betting that backing whistleblowers can pay off. Anthropic is set to publish its whistleblowing policy for employees imminently — likely as soon as this week — Transformer has learned, making it the second major AI company to publicly reveal how it will handle internal whistleblowing.
OpenAI, the first such company, disclosed its whistleblowing policy in October 2024. Anthropic’s disclosure will likely fuel an ongoing debate about the role of whistleblowing and internal dissent in safely developing and deploying advanced AI at scale.
Reading OpenAI’s policy, though, it’s hard not to wonder how much of it is PR rather than serious policy, especially as it was published after a damaging public row over highly restrictive non-disclosure agreements.
That row hit the headlines in July 2024, after insiders flagged concerns that OpenAI’s NDAs were so restrictive they appeared to require staff to seek permission before raising concerns with regulators. OpenAI insisted its agreements had never been intended to bar regulatory disclosures, and decided to publish its whistleblowing policy a few months later.
OpenAI’s whistleblower policy is quite general, and lacks detail on the specific protections and guarantees offered to employees who use it. Given that most OpenAI staff are employed at will and could lose lucrative future payouts if terminated, they may not feel protected enough to come forward. OpenAI’s whistleblower channels also run through the organization’s legal team — the same team that would respond on the company’s behalf, creating a major conflict of interest.
Anthropic’s policy, when it is released, will therefore face close scrutiny to see if it is any more robust. One organization sure to be doing that is the AI Whistleblowing Initiative (AIWI), which has been mounting a lobbying campaign both in public and behind closed doors to encourage tech firms to take this path.
AIWI’s founder Karl Koch said he particularly welcomed that Anthropic appeared to be opting for transparency without any direct pressure from regulators, or as the result of any public “scandals.”
“We hope more frontier companies will follow their lead in recognizing the value in creating transparency on internal whistleblowing systems for their employees which, ultimately, also benefits the company,” Koch told Transformer. “We are now excited to dig in to the policy and provide feedback on its strengths and room for improvement.”
One aspect of the policy Koch was particularly keen to examine was whether it offered what he calls “Level 2 transparency” – a commitment to publish statistics on the usage of internal whistleblowing channels, alongside how those reports were handled and resolved. OpenAI, currently the only major company to publish its policy, has not committed to this second level of transparency, and it is not yet clear whether Anthropic’s imminent policy will do so.
More broadly, Koch stressed the benefits of public policies, not least for making potential whistleblowers aware that there are internal options for raising concerns before they consider going directly to regulators or the media.
“Evidence shows that well-structured internal reporting and speak-up systems reduce misconduct, enable early detection of risks, and can prevent small issues from escalating into major crises,” he explains. “They also lead to higher employee satisfaction and loyalty, with second-order effects like improved research results and innovation.”
Unwelcome attention
Despite this, leading AI companies have taken drastically different approaches to whistleblowing, with some embracing legal protections — such as California’s SB 53 — and publishing whistleblowing policies, while others have a reputation for cracking down on dissenters.
In one instance, this apparently involved threatening retaliation against AIWI itself, after it contacted a tech company and ran a campaign asking its employees whether they would support the creation of internal whistleblower channels.
“One AI company served us a cease-and-desist letter surrounding our campaign on publishing details on their whistleblowing systems,” said Koch, who declined to name the company concerned while he was still trying to work with them behind the scenes. “This was despite us having reached out with offers for supporting the company before and after the campaign’s launch.
“While we were prepared for a reaction like this and remain comfortable with where we stand legally, it is unfortunate when valuable time and energy is spent on roadblocks rather than on collaboration to create transparency and, if applicable, strengthen internal channels.”
This potential antipathy or indifference to such policies appears to be shared by at least some others in the industry, especially given both Alphabet and Meta have been publicly burned by whistleblower disclosures to the media.
Meta appears to have taken a more aggressive stance toward former staff speaking out since Joel Kaplan replaced Nick Clegg as chief of global affairs. After the publication of Careless People, a book by Facebook’s former director of public policy Sarah Wynn-Williams, the company’s policy and comms staff denounced it on social media, and then enforced an arbitration clause in Wynn-Williams’ contract barring her from promoting it.
That clause remains in force: when Wynn-Williams made a rare public appearance last month, on stage with Cory Doctorow at the Barbican in London, she pointedly and theatrically sealed her lips and gestured to Doctorow to answer whenever the topic drifted even close to Facebook.
This new, cooler approach to internal dissent, and to safeguards more broadly, appears to extend to Meta’s AI division. When Meta announced layoffs in October of this year — including hundreds of staff working on AI — reports suggested the cuts had fallen particularly heavily on safety and privacy workers, especially those reviewing projects for potential risks in those areas.
Neither Meta nor Alphabet responded to queries about their willingness to publish AI whistleblowing policies, or on their approach to whistleblowing more generally.
Regulatory options
Given the industry’s mixed record on whistleblowing, it is perhaps unsurprising that legislators are considering other routes to protect disclosures and encourage transparency from frontier AI companies.
California, which is home to around two-thirds of the 50 top AI companies, has placed itself at the forefront of this effort through the passage of the Transparency in Frontier Artificial Intelligence Act, better known as SB 53. Governor Gavin Newsom signed the bill into law in late September, a year after vetoing a similar bill, SB 1047.
SB 1047, which also included enhanced protections for AI whistleblowers, had been roundly opposed by AI companies. xAI and Anthropic did eventually back it, though the latter criticized its whistleblowing protections in its letter of support. SB 53, which focused much more closely on transparency and whistleblowing, with far fewer measures imposing legal liability on AI firms, was a different story.
Several industry trade groups, including the Consumer Technology Association, still publicly opposed SB 53, but none of the individual AI companies overtly opposed the bill — and Anthropic openly supported it. SB 53 imposes various transparency requirements on frontier AI companies, and expands the whistleblower protections available to their staff. However, it lacks SB 1047’s broader definition of “employees” and sets higher thresholds for what counts as a protected disclosure.
At the time of the signing, Anthropic’s head of policy Jack Clark praised the bill for striking a balance, saying it introduced “meaningful transparency requirements for frontier AI companies without imposing prescriptive technical mandates” — an assessment Newsom tacitly endorsed by signing SB 53, rather than repeating his veto.
While SB 53 has been welcomed, or at least not met with much in the way of opposition from AI companies, there are ongoing concerns that state-by-state AI regulation could lead to a patchwork of differing compliance requirements that are almost impossible to meet — a concern that extends to specific whistleblowing protections.
For that reason, campaigners and some within frontier AI alike are looking to the federal government to codify AI whistleblowing procedures and protections. The most advanced such bill at present is the AI Whistleblower Protection Act, introduced into the Senate in May by Republican Senator Chuck Grassley, with a degree of bipartisan support.
“SB 53 which, of course, has a whistleblower protection component to it, is the first whistleblower protection [for AI] in the United States at the state level,” says Iskander Haykel, senior policy analyst at Americans for Responsible Innovation.
Haykel suggests that if a federal framework for such protections were to pass, how it defined risk would be crucial — especially for those who take seriously the edge cases of catastrophic risk from AI. Conventional frameworks protecting, for example, disclosures that an AI model could violate existing laws might not easily cover whistleblowers trying to raise those kinds of existential concerns, he notes.
Doing so might take some creativity. Public health might be one framework under which AI whistleblowing protections could be considered, but even that might not go far enough. “You might think that public health covers certain kinds of catastrophic risk,” he says. “But it’s less clear that issues around, for instance, emerging risk from AI persuasion or other kinds of exotic risk would be as clearly covered here under these terms.”
In practice, however, the prospect of new federal whistleblower protections passing feels unlikely in the short term, especially as the Trump administration focuses more on pre-emption — preventing states from regulating AI by framing it as a federal responsibility. The Trump White House had been on the verge of signing an executive order on the issue last month, but it has been delayed amid reports of internal rows over its scope and whether it intrudes on states’ rights.
Last resort
AI does not exist in a vacuum. Both Koch and Haykel — and another legal expert speaking on background — highlighted that there are widespread existing whistleblower protections across jurisdictions that aren’t specific to artificial intelligence but which might protect AI workers anyway.
The EU has a wide-ranging and strong Whistleblowing Directive, passed in 2019, providing legal protections to anyone raising potential breaches of EU law. Because it came into effect before the UK left the EU, many of its protections still apply there. (The EU’s AI Act also included specific protections for whistleblowers working at AI companies, and last month the EU launched a tool for whistleblowers to securely report concerns.)
Similarly, US protections for whistleblowers reporting financial irregularities might serve to protect whistleblowers in AI businesses, if they could frame their concerns to fit existing regulations and protections. Such systems might not be perfect, but could offer would-be objectors more of a safeguard than they believe themselves to have — or expose their employers to more risk than they had considered.
Still, passing specific protections might represent a rare win-win. Koch, of AIWI, stressed that when countries pass whistleblower protections, companies often benefit indirectly because they improve their internal reporting channels in turn — meaning senior leadership can become aware of potential problems earlier, and head off scandals before they even begin.
As a journalist who has worked extensively with whistleblowers, I find it striking how often they approach the media only as an absolute last resort. Usually, they have raised concerns informally, often more than once, and almost all have then tried and failed to use an official whistleblowing channel before they approach the media or legislators.
It is only when they feel they have been stymied after doing things ‘right’ that they turn to the media — and given how explosive stories become once filtered through the lens of journalism, perhaps they are right to view it as a nuclear option. Selfishly, as a working reporter, I benefit if AI companies continue to ignore whistleblowing, but that is unlikely to be good for the industry itself.
Similarly, at a time when voters are overwhelmingly skeptical about AI and supportive of its regulation, whistleblower protections are an opportunity to build confidence, and perhaps to avoid prescriptive regulation that might slow down progress.
Anthropic’s decision to support SB 53 and to publish its whistleblowing policy for public scrutiny in the coming days could prove to be a model for its bigger rivals. It remains to be seen whether it will satisfy the sector’s critics – or persuade Alphabet or Meta to do the same.