OpenAI hits the biorisk alarm with Agent
Transformer Weekly: China gets Nvidia chips, a preview of the AI Action Plan, and Sanders worries about AI risks
Welcome to Transformer, your weekly briefing of what matters in AI. If you’ve been forwarded this email, click here to subscribe and receive future editions.
Top stories
OpenAI released ChatGPT Agent yesterday — a product which it says might “meaningfully help a novice to create severe biological harm.”
Of course, that’s not its intended main use:
The new tool combines Operator, OpenAI’s computer-use feature, with Deep Research. The result looks impressive, able to complete fairly complicated, multi-step tasks like planning a holiday or shopping for you.
On an internal OpenAI benchmark, Agent is “comparable to or better than” humans on about half of “complex, economically valuable knowledge-work tasks.”
Initial reviews aren’t quite as positive, but people still seem to be impressed. (I’m yet to try it out myself.)
Improved performance, though, means increased risk. In Agent’s system card, OpenAI says it has, for the first time, “decided to treat this launch as High capability in the Biological and Chemical domain under our Preparedness Framework, activating the associated safeguards.”
“While we do not have definitive evidence that this model could meaningfully help a novice to create severe biological harm — our defined threshold for High capability — we have chosen to take a precautionary approach,” the company says.
On certain safety tests, Agent does indeed appear to present greater risk than previous models.
On the “World-Class Biology” benchmark, Agent “significantly outperformed” o3, getting four out of 10 questions right — well above o3’s 1.5.
On a pathogen acquisition benchmark, meanwhile, Agent “could bypass a common error on which prior models tended to fail.”
Agent does less well on other benchmarks, performing similarly to o3. But overall, “experts identified substantial potential for ChatGPT Agent to significantly uplift users’ capabilities,” particularly for those who already had lab experience. The system could “potentially [compress] days of research into minutes.”
Why does any of this matter? Because we appear to be reaching the point where AI systems may present a serious biorisk if safeguards aren’t applied. And we’re relying on the good graces of AI companies to do that.
OpenAI, to its credit, has decided to take precautions, such as escalating all prompts about biology to a higher tier of scrutiny before generating outputs.
Boaz Barak, who works on safety at OpenAI, said “it would have been deeply irresponsible to release this model without comprehensive mitigations such as the one we have put in place.”
But not all companies are behaving responsibly. Last week, xAI released Grok 4 without any safety information. Safety testers have since found that the model is willing to give detailed instructions for making Tabun and VX, two nasty nerve agents. And the problem goes well beyond Elon Musk.
xAI may be the worst offender, but all the world’s top AI companies have a “striking lack of commitment to many areas of safety” and unacceptably weak risk management, according to studies released this week by non-profits SaferAI and the Future of Life Institute (FLI).
Many companies (including for the moment at least, Meta) continue to release powerful open-weight models — which are impossible to make safe.
More capable models, potentially posing greater risks to human safety, are being released almost weekly. Meanwhile, we’re relying on a frayed patchwork of self-reporting, minimal regulation and the virtue of AI labs to keep them in check.
And even if addressing risks at the model level isn’t the right solution, we’re not moving fast enough to build society’s resilience to potentially AI-powered threats through measures such as gene synthesis screening.
But to finish on a slightly more positive note: the UK AI Safety Institute made Agent safer.
Both UK AISI and the US Center for AI Standards and Innovation were given early access to the model. UK AISI “identified a total of 7 universal attacks” against the system, all of which were patched by OpenAI before Agent’s release.
“We found that the UK AISI’s attack investigations were thorough and instructive, enabling us to efficiently improve our safeguards and remediate vulnerabilities they found,” OpenAI said.
As far as I’m aware, that’s the first instance of a government body directly contributing to making a model safer — a significant win for the fledgling institute.
The discourse
Boaz Barak sharply criticized xAI for not releasing a system card or safety-testing results for Grok 4:
“The way safety was handled is completely irresponsible.”
Anthropic’s Samuel Marks called it “reckless,” saying that it “breaks with industry best practices.”
Worth noting that xAI safety advisor Dan Hendrycks says that dangerous capability evaluations were done, even if the results weren’t published.
Sen. Bernie Sanders is worried about AI:
“There are very, very knowledgeable people — and I just talked to one today — who worry very much that human beings will not be able to control the technology, and that artificial intelligence will in fact dominate our society. We will not be able to control it. It may be able to control us. That’s kind of the doomsday scenario — and there is some concern about that among very knowledgeable people in the industry.”
Rep. Andy Biggs has rather aggressive AI timelines:
“Maybe before 2030 you’re gonna be at artificial superintelligence.”
Rep. Scott Perry asked some very sensible questions about AI alignment risks this week, too.
Rishi Sunak argued that countries should focus on widespread AI adoption rather than just frontier development:
“The fact that America or China will win this contest should not turn other countries into mere spectators. Even more important for their economies and societies is the other AI race, the one for ‘everyday AI’: the deployment and diffusion of the technology across the whole of the nation.”
Also notable: “Where policing can be positive — indeed crucial — is in evaluating the cyber, biological and nuclear risks of frontier AI models before they are deployed.”
Ethereum co-founder Vitalik Buterin published a critique of the AI 2027 scenario, arguing that defensive technologies would likely develop alongside AI capabilities, making human extinction less inevitable than portrayed:
“Acknowledging that making the world less vulnerable is actually possible and putting a lot more effort into using humanity's newest technologies to make it happen is one path worth trying, regardless of how the next 5-10 years of AI go.”
Policy
Meta said it will not sign the EU AI Act code of practice. OpenAI said it will.
The Trump admin has approved sales of H20 AI chips to China, according to Nvidia boss Jensen Huang, alarming congressional China hawks, including Rep. John Moolenaar.
AMD said it’ll resume MI308 sales too.
This week AI czar David Sacks said concerns about US AI chips being diverted to China are “wildly blown out of proportion.”
Meanwhile, concerns that China could get its hands on cutting-edge AI chips have held up a deal with the UAE, according to the WSJ. Malaysia has also tightened rules around US-made AI chips entering or leaving the country.
Last week Elizabeth Warren wrote to Huang expressing concerns about his trips to China and asking him to avoid meeting with potentially sensitive representatives from industry or the state.
The NYT has a piece on how Huang managed to convince Trump to overturn export controls, with the help of David Sacks and Sriram Krishnan.
The White House is reportedly preparing an executive order that will require AI companies with federal contracts to have “politically neutral and unbiased” AI models, in an effort to fight supposed “woke” AI.
The White House AI Action Plan will reportedly be released at an event next week.
The release looks set to happen at an AI summit hosted by the All-In Podcast and Hill and Valley Forum on July 23. President Trump will deliver the keynote speech, which will reportedly focus on “ensuring American dominance in AI.” It sounds like that’s the focus of the Action Plan, too, which will be “largely focused on messaging,” per Bloomberg.
The House Appropriations subcommittee approved funding increases for the Commerce Department’s BIS and NIST, despite White House calls to cut NIST spending. BIS oversees export controls on AI chips, and NIST leads on standards-setting.
The DoD awarded $200m contracts to each of OpenAI, Google, Anthropic, and xAI.
President Trump attended an AI-energy summit in Pittsburgh where he announced $92b in investments and measures to speed up power plant permitting for data centers. Anthropic’s Dario Amodei and Palantir’s Alex Karp were also in attendance.
California’s SB 813 — the private governance bill we’ve covered previously — has reportedly stalled after getting stuck in the Senate Appropriations Committee.
The NYT has a good piece on how China is providing significant state support for its AI industry.
The UK launched a working group on AI and copyright.
UK Labour MP Dawn Butler published an op-ed that discusses AI’s potential extinction risks.
Influence
OpenAI reportedly hired Chris LaCivita, who helped run Trump’s 2024 campaign, in an effort to get close to Trump.
A group of AI and biosecurity experts called for governments and companies to do more to tackle AI biorisks.
A 10,000-person poll found people increasingly believe AI will make almost everything they care about worse, with women and lower-income individuals showing greater concern.
Industry
xAI apologized for Grok’s MechaHitler rampage, blaming an “update to a code path” independent of the model that made it “susceptible to existing X user posts.”
xAI launched “companions” in the Grok app this week, including an anime woman that engages in sexually explicit conversations.
Alexandr Wang and other Meta Superintelligence researchers have reportedly discussed abandoning the open-source model ‘Behemoth’ in favor of developing a closed model.
Meta announced plans to build “several multi-GW clusters” this week. ‘Prometheus’ is scheduled to come online next year as the first 1GW cluster, while ‘Hyperion’ “will be able to scale up to 5GW over several years.”
Meta also acquired voice AI startup Play AI.
SemiAnalysis published a deep-dive on Meta’s AI strategy this week, including a discussion on what went wrong with Llama 4.
OpenAI delayed its open-weight model, saying it needs “time to run additional safety tests and review high-risk areas.”
The OpenAI Non-Profit Commission published its report on what the new OpenAI non-profit should do.
It’s a rather painful read, suggesting that the new entity should do things like “invest in offline spaces … where cultural memory, tactile learning, and spiritual imagination can flourish.”
But it does say that the non-profit’s “accountability to state Attorneys General — particularly in California and Delaware — should be affirmed as a core strength and public asset.” It dodges the core question of whether, and how, the non-profit should retain control over the for-profit entity, however.
Investors are reportedly interested in funding Anthropic at a $100b+ valuation — well up from the $61.5b it was valued at in March.
Google hired Windsurf’s CEO, co-founder, and a handful of other employees for $2.4b after OpenAI's acquisition talks broke down. Startup Cognition bought what was left of Windsurf a couple of days later.
OpenAI confirmed it will start using Google Cloud as a supplier for ChatGPT.
The Information has a piece on how Google is increasingly winning AI cloud deals over Amazon.
OpenAI reportedly plans to launch a checkout system within ChatGPT, and will take a cut of sales made through the platform.
Anthropic launched “Claude for Financial Services,” a new AI tool for financial analysts.
Alibaba-backed Moonshot AI released its open-source Kimi K2 model. People are really impressed with it.
Amazon launched an AI-powered IDE.
Mistral launched a slew of new features for its Le Chat chatbot, including a deep research mode.
TSMC announced it will accelerate construction of its Arizona chip plants “by several quarters” to meet US demand for smartphone and AI computing chips. The company raised its 2025 sales growth forecast to 30%.
ASML shares dropped after the company said tariff uncertainty meant it couldn’t guarantee growth next year.
Google signed a $3b deal with a hydroelectric power provider.
CoreWeave said it’s building a $6b data center in Pennsylvania. It’ll have 100MW of capacity to start, but could expand to 300MW.
xAI is reportedly trying to fundraise at a $200b valuation. SpaceX recently agreed to invest $2b in the company.
Mira Murati’s Thinking Machines raised $2b at a $12b valuation, led by a16z.
Chinese AI startup MiniMax has reportedly filed for a Hong Kong IPO. It’s said to be targeting a $510-637m valuation.
Moves
Jason Wei and Hyung Won Chung left OpenAI for Meta's superintelligence lab.
Apple researchers Mark Lee and Tom Gunter are reportedly joining Meta, too.
Paul Smith joined Anthropic as chief commercial officer. Michael Lai joined to lead the AI for state and local government team.
Boris Cherny and Cat Wu are reportedly rejoining Anthropic, two weeks after leaving for Anysphere.
Scale AI laid off 14% of its workforce.
Philippe Beaudoin is joining Yoshua Bengio’s LawZero as senior director of research.
Former Chamber of Progress VP of tech policy Todd O’Boyle joined JPMorgan Chase as executive director of US AI and data policy.
Best of the rest
Researchers from OpenAI, Google DeepMind, Anthropic, Meta and others published a position paper on the importance of monitoring reasoning models’ chains of thought.
Reports of AI-generated child sexual abuse material surged dramatically in 2025, with increasingly realistic content overwhelming law enforcement.
Companies are reportedly generating millions in revenue from “nudify” sites.
The NYT has a fascinating piece on how Israel and Iran used AI for propaganda operations during their recent war.
A new report from the Southern Environmental Law Center claimed that electricity demand forecasts for AI data centers are likely overestimated.
Stanford researchers found that AI chatbots exhibit stigma toward certain mental health conditions and give potentially dangerous therapeutic advice.
Jess Whittlestone published a great essay on how AI policy’s changed over the past year. She (correctly) argues that “we need more policy proposals for mitigating AI risks that appeal to the politics of those in power today.”
In Jacobin, Holly Buck and Matt Huber argued that the left needs to start developing “egalitarian policy solutions” to AI worker displacement.
METR researchers found that frontier AI models’ performance on a wide range of benchmarks is improving at an “exponential or slightly superexponential” rate.
Researchers Sayash Kapoor and Arvind Narayanan argued that AI might worsen the “production-progress paradox” in science, where paper quantity increases but breakthrough discoveries remain flat or decline.
Thanks for reading; have a great weekend.