The problem with DeepSeek
Transformer Weekly: Preemption’s out of the NDAA, OpenAI’s ‘code red,’ and Anthropic’s IPO prep
Welcome to Transformer, your weekly briefing of what matters in AI. And if you’ve been forwarded this email, click here to subscribe and receive future editions.
NEED TO KNOW
OpenAI declared an internal “code red” in response to Gemini 3.
Preemption and the GAIN Act are out of the NDAA.
Anthropic is reportedly starting to prepare for an IPO.
But first…
THE BIG STORY
DeepSeek released a new model this week, v3.2, and it’s genuinely impressive. According to the company, it competes with (though is slightly behind) the models from the top US developers, while having open weights.
But there’s a big problem — DeepSeek does not appear to have conducted any predeployment safety testing.
There is no mention of such evaluations anywhere in the new model’s system card. It’s possible that DeepSeek has done the evals and is choosing not to share any information on them — but we have no way to know.
If DeepSeek has, in fact, skipped safety testing, that is an extremely irresponsible decision.
OpenAI, Google DeepMind and Anthropic have all warned that their models are on the cusp of being truly dangerous, most notably by being able to help novices develop bioweapons.
As their models are closed-weight, they’re able to manage the risks to a degree with guardrails that make misuse harder, such as request refusals or monitoring and banning users.
Such guardrails, however, are trivial to remove from open-weight models like DeepSeek’s. With current techniques, it is more or less impossible to prevent open-weight models from being misused.
It’s unlikely that v3.2 is actually dangerous — it likely isn’t capable enough to be. And we have yet to see many concrete examples of DeepSeek models being misused to cause serious harm (though some cyberattacks have been reported).
But that’s no reason to skip safety testing. At some point, possibly in the near future, models will cross a dangerous capability threshold. It’s critical to identify when that happens before an open-weight model is deployed — because once it’s out there, it can’t be taken back.
The world of image-generation models provides a cautionary tale. It was discovered after release that Stable Diffusion 1.5 was capable of generating child sexual abuse material. The model was eventually pulled, but still circulates widely.
There are implications beyond the risks themselves, too. The lack of transparency on whether DeepSeek did safety testing raises questions about whether China really does take AI safety as seriously as some say.
Unfortunately, that in turn strengthens the argument for US companies racing and cutting corners on safety. If dangerous models are going to exist in the world, why not release another?
It also strengthens the case for chip export controls. If China can’t be trusted to act responsibly, the US should be doing as much as possible to stop its AI development altogether.
Ultimately, DeepSeek’s apparent negligence makes proper international agreements on AI safety and security all the more urgent.
At the Athens Roundtable on AI governance yesterday, the UK’s Lord Clement-Jones said that the risks of AI are becoming significant enough that we must “move from shared concerns to joint enforceable action.” He’s right — lest China (or anyone else) put us all in peril.
— Shakeel Hashim
THIS WEEK ON TRANSFORMER
The perils of AI safety’s insularity — Celia Ford explores the pitfalls of creating AI safety’s own knowledge ecosystem.
How China’s AI diffusion plan could backfire — Scott Singer on the dangers of China’s ambitious AI plans.
Another preemption defeat shows the AI industry is fighting a losing battle — Shakeel Hashim argues the push to preempt state AI legislation is looking shaky.
Can AI embrace whistleblowing? — James Ball looks at the state of speaking out in AI.
THE DISCOURSE
An OpenAI post saying the company was researching how to “safely develop and deploy … AI capable of recursive self-improvement” got a lot of heat, in particular from former employees:
Steven Adler said: “It’s terrifying that they’d develop a system like this they can’t control. Notice what the post says: ‘develop and deploy.’ I’d love if OpenAI would commit to not building a system like this.”
Miles Brundage said “AI companies have not explained what this means, why it’s good, or why the higher safety risks are justified.”
After the criticism, OpenAI added a footnote noting that “obviously, no one should deploy superintelligent systems without being able to robustly align and control them, and this requires more technical work.”
Sen. Bernie Sanders is worried about AI takeover risks:
“Is AI an existential threat to human control of the planet? … How do we stop that extraordinary threat? … these are just some of the questions that must be answered as AI and robotics rapidly progress … Congress must act now.”
Richard Weiss got Claude Opus 4.5 to bare its “soul document”:
“The simplest summary of what we want Claude to do is to be an extremely good assistant that is also honest and cares about the world…Rather than outlining a simplified set of rules for Claude to adhere to, we want Claude to have such a thorough understanding of our goals, knowledge, circumstances, and reasoning that it could construct any rules we might come up with itself.”
Anthropic’s Amanda Askell confirmed: “This is based on a real document and we did train Claude on it…it’s still being iterated on and we intend to release the full version and more details soon.”
DeepMind’s mechanistic interpretability team is stepping back to “pragmatic interpretability”:
“We have been disappointed by the amount of progress made by ambitious mech interp work…”
“[We have] pivoted from chasing the ambitious goal of complete reverse-engineering of neural networks, to a focus on pragmatically making as much progress as we can on the critical path to preparing for AGI to go well.”
Dario Amodei discussed a certain AI CEO in a talk on Wednesday:
“There are some players who are YOLOing. Let’s say you’re a person who just kind of constitutionally wants to YOLO things or just likes big numbers, then you may just turn the dial too far.”
He wasn’t naming names, but…
The New York Times published a medium-spicy story about David Sacks’ conflicts of interest as Trump advisor-slash-Silicon Valley investor. The tech industry rose up in defence:
Salesforce CEO Marc Benioff called it “strategic sabotage,” and Under Secretary of State Jacob Helberg called it “a parody of itself.”
Marc Andreessen and Sam Altman praised Sacks as a “credit to our nation” who “really understands AI and cares about the US leading in innovation.”
Elon Musk QT’d, “absolutely.”
Sacks’ White House colleague Sriram Krishnan said he has mixed feelings about the terms “AGI” and “ASI”:
“It’s not an accurate description of where we’re headed, at least how most people interpret the term … most importantly, it invokes fear—connected to historical usage in sci-fi and philosophy.”
Dwarkesh Patel published some interesting thoughts on AI progress:
“I’m moderately bearish in the short term, and explosively bullish in the long term.”
POLICY
Preemption is officially not making it into the NDAA.
But rumor has it that an executive order on preemption — the one we published two weeks ago — is going to be signed, perhaps as soon as today.
The GAIN AI Act is also out of the NDAA.
Sen. Rounds is pushing his standalone bill instead, seemingly with support from Rep. Mast.
Sens. Ricketts and Coons, meanwhile, formally introduced the Secure and Feasible Exports (SAFE) Chips Act, which would codify existing export controls into law.
The White House is reportedly considering whether to allow exports of Nvidia H200s to China.
Trump’s reportedly considering an EO on robots next year, in an effort to advance US leadership in the sector.
A House Energy and Commerce subcommittee held a hearing on online child-safety bills, including a few AI-related ones.
Sen. Ed Markey reintroduced legislation requiring independent audits of AI algorithms for bias and discrimination.
Sens. McCormick and Coons introduced a bill to assess liquid cooling technology for improving AI data center efficiency.
The Commerce Department agreed to invest up to $150m in xLight, a chip technology startup led by former Intel CEO Pat Gelsinger.
House Homeland Security leaders asked the CEOs of Anthropic and Google Cloud to testify about the recent Anthropic report on a Chinese AI-assisted cyber attack.
Sens. Hassan and Ernst urged the National Cyber Director to coordinate a response to AI-enabled cyber threats following the report.
Gov. DeSantis proposed a “Citizen Bill of Rights for Artificial Intelligence” in Florida.
It would ban government agencies from using DeepSeek, prohibit AI from using people’s names or likenesses without consent, and mandate parental controls on LLMs, among other things.
Sen. Josh Hawley seemingly used ChatGPT for the first time this week. He was impressed.
A top Trump administration nuclear scientist unveiled plans for AI to design, build, and operate nuclear power plants.
UK tech minister Liz Kendall confirmed that an AI Bill is not coming in the next parliamentary session.
The EU launched an antitrust probe into Meta over its plans to roll out AI features in WhatsApp while denying rival AI providers access.
The bloc also delayed its AI gigafactories bidding process to early 2026.
INFLUENCE
The AI Evaluator Forum launched at NeurIPS.
Members include Transluce, METR, RAND, and others.
Its goal is to set standards and share knowledge about AI evals — starting with AEF-1, a standard for “a baseline level of independence, access, and transparency for evaluations.”
Critics of OpenAI launched two California ballot initiatives under the “OpenTheft” campaign, aimed at unraveling its corporate restructuring.
One of the initiatives comes from the Coalition for AI Nonprofit Integrity and Poornima Ramarao, the mother of Suchir Balaji, the former OpenAI employee who accused the company of violating copyright law shortly before his death.
They’re calling on Elon Musk, Mark Zuckerberg, Dustin Moskovitz, and Vitalik Buterin to fund their campaign.
A new UN report warned that AI’s current trajectory will likely worsen global inequality.
The Verge reported on the team of people, led by Ray Kurzweil’s research lead John-Clark Levin, who are trying to AGI-pill the Pope.
The OECD warned that an AI bubble-bursting could contribute to slowing growth and rising inflation next year.
The PEN Guild won a landmark arbitration against POLITICO, which launched two AI-driven products without providing the notice required by its collective bargaining agreement.
Sam Kirchner, the Stop AI activist who allegedly threatened to go to OpenAI’s offices to “murder people,” is still at large.
INDUSTRY
OpenAI
Sam Altman reportedly declared a “code red” on Monday, saying “We are at a critical time for ChatGPT.”
The call to arms was driven in part by increased competition from the likes of Google, whose new Gemini 3 model compares favourably with OpenAI’s released models.
OpenAI is reportedly planning to release a new reasoning model as soon as next week.
It’s also supposedly working on a new pre-trained model, “Garlic”, which Mark Chen has told employees improves on GPT-4.5 (the company’s largest model to date) and which should arrive early next year.
Other initiatives, such as introducing ads and work on AI agents, will be delayed as a result, Altman said.
OpenAI took an ownership stake in Thrive Holdings, a private equity group owned by one of its biggest investors, Thrive Capital. The “meaningful” stake reportedly didn’t cost OpenAI anything; instead, OpenAI will provide products and resources to companies Thrive owns.
It also acquired Neptune, a Polish startup that makes tools for analyzing AI model training, for under $400m in stock.
The OpenAI Foundation announced its first $40m in grants.
Recipients were a rather incoherent grab bag of (solely US-based) charities working on things like helping “veterans rediscover purpose … through transformative, nature-based experiences” and a group that combines “dance and AI learning.”
Some suspect that the choices were largely motivated by trying to appease the California and Delaware AGs.
Sam Altman reportedly looked at acquiring rocket company Stoke Space as part of plans to compete with Elon Musk’s SpaceX and build data centers in space.
A federal judge ordered OpenAI to release 20m anonymized chat logs as part of its copyright dispute with the NYT.
ChatGPT referrals to retailer mobile apps increased 28% year-over-year during Black Friday weekend.
They’re still only a tiny piece of the ChatGPT pie though, with just 0.82% of sessions ending in referrals, and mostly sending shoppers to big players such as Amazon.
Anthropic
Anthropic is reportedly preparing for an IPO as soon as next year, tapping law firm Wilson Sonsini and holding talks with major banks.
The company published its whistleblowing policy for employees, making it the second major AI company (after OpenAI) to do so publicly.
Anthropic made its first acquisition, buying JavaScript developer tool Bun.
The company reportedly held talks to pay “low hundreds of millions” for Bun, which will help improve Claude Code.
Dario Amodei claimed the company’s focus on AI safety has helped it win over big business customers.
An internal Anthropic study found engineers using Claude achieved a 50% productivity boost, based on self-reports.
Amazon
Amazon launched its Trainium3 AI chip, which it claims is four times as fast as its predecessor.
Combined with recent news that Google is getting more aggressive about selling TPUs, it’s potentially another challenge to Nvidia’s dominance.
Amazon also launched its new Nova 2 family of models, a tool for creating AI agents, and Nova Forge, which allows businesses to train custom versions of AI models with their own data.
More than 1,000 Amazon employees have signed a petition criticising the company’s “all-costs-justified, warp-speed approach to AI development”, which they say could cause “staggering damage to democracy, to our jobs, and to the earth.”
Others
Nvidia took a $2b stake in chip design software company Synopsys.
Microsoft said it hadn’t lowered targets for AI sales, contradicting a report in The Information that it was seeing less demand from customers.
Alibaba and ByteDance were among Chinese tech companies which reportedly moved model training to locations across Southeast Asia to access Nvidia chips and bypass US export controls.
Taiwanese prosecutors said they charged the Taiwan division of Tokyo Electron over the theft of TSMC trade secrets.
Prosecutors also separately searched the homes of ex-TSMC executive Lo Wei-jen over suspected trade secret leaks to Intel.
SoftBank founder Masayoshi Son said the company sold its shares in Nvidia to fund AI investments, including in OpenAI and data centers.
The WSJ reports Son has been working on a plan with the White House and Commerce Department to build “Trump-branded industrial parks around the country” that would manufacture components for AI infrastructure.
Elon Musk’s foundation grew to $14b, but reportedly failed to donate the minimum required by law and gave primarily to charities closely tied to his own interests.
Waymo was forced to respond to safety concerns, including from a Texas school district that asked for its cars to be taken off the roads during drop-off and pickup times following incidents around school buses.
The company’s cars are reportedly becoming more “assertive” as they try to behave more like human drivers.
An NYT op-ed cited the company’s own safety data to argue self-driving cars should be considered a “public health intervention.”
China’s economic planning agency warned a bubble could be developing in the country’s humanoid robotics industry, highlighting the more than 150 companies creating similar robots.
Chinese AI chipmaker Moore Threads raised $1.13b in an IPO valuing the company at $7.6b.
Yann LeCun said Meta won’t invest in his new “world model” AI startup that focuses on training from visual and sensory data rather than text, and hinted it could be based in Paris.
MOVES
Apple AI chief John Giannandrea is out, following Apple’s repeated setbacks in AI.
He’s being replaced by Amar Subramanya, who was most recently at Microsoft, and at Google before that.
Meta poached Apple’s head of user interface design, Alan Dye, to design its AI-equipped consumer devices.
An analysis of LinkedIn profiles by the WSJ found that dozens of Apple employees have defected to OpenAI and Meta in recent months.
Norman Mu left his role as safety team lead at xAI.
Seve Christian left the office of state Sen. Scott Wiener to join Encode as California policy director.
Renowned mathematician Ken Ono joined Axiom Math, an AI startup founded by his former student Carina Hong to build an “AI Mathematician.”
RESEARCH
Physicist Steve Hsu thinks he published the first theoretical physics research paper where the main idea came from an LLM.
The Future of Life Institute ranked leading AI companies in its latest AI Safety Index.
OpenAI and DeepMind got C’s, with Anthropic earning the highest grade of C+.
“The only reason that there are so many C’s and D’s and F’s in the report is because there are fewer regulations on AI than on making sandwiches,” said FLI president Max Tegmark.
OpenAI published a paper on training AI models to “confess” to bad behavior.
Models “honestly report whether they ‘hacked,’ ‘cut corners,’ ‘sandbagged’ or otherwise deviated from the letter or spirit of their instructions,” author Boaz Barak tweeted.
Epoch AI stitched a bunch of benchmarks together to track long-run trends in AI capabilities.
They projected that we’re on track for “just under two GPT-4-to-GPT-5-sized jumps” in the next three years.
A team led by Liwei Jiang won the Best Paper Award at NeurIPS 2025 for their report on the “Artificial Hivemind” effect.
Another NeurIPS attendee compared it to the plot of Pluribus: “Humanity has turned into a collective hivemind that is perpetually content, but only because it’s stripped of individual personality and dissent. The paper warns that reward models are behaving like the virus in the show, sanding away idiosyncratic responses until model diversity collapses.”
AI text detection company Panagram Labs flagged roughly 21% of ICLR peer reviews — 15,899 in total — as “fully AI-generated.”
The analysis “confirmed what [researchers] had suspected,” Nature reported.
A team of European researchers found that AI chatbots can be tricked into helping users build nuclear bombs or view child pornography if they craft their prompt as a poem.
Anthropic introduced a tool to analyze changing patterns of AI use.
It ran 1,250 initial interviews with users in the general workforce, science, and art, and reported — conveniently — that “people are optimistic about the role AI plays in their work.”
Two new blogs publishing technical work on AI safety launched: NIST’s CAISI Research Blog and OpenAI’s Alignment Research Blog.
BEST OF THE REST
College students flocked to AI majors as traditional computer science enrollment declined, reports the NYT.
Young workers in Britain are shifting toward skilled trades like plumbing to avoid AI job displacement, reported Reuters.
MIT Tech Review spoke to Google DeepMind’s John Jumper about AlphaFold’s impact on protein structure prediction and his plans to combine it with LLMs for scientific discovery.
A Pittsburgh man allegedly used ChatGPT as his “therapist” while stalking and threatening women across multiple states, according to an indictment reported by 404 Media.
AI-generated content on TikTok, including anti-immigrant material, disinformation and content sexualising female bodies, received 4.5b views in a month from 354 accounts, according to AI Forensics.
YouTube’s use among children under two surged, raising concerns about AI-generated “slop” content targeting babies and toddlers.
Zanskar Geothermal and Minerals discovered the first commercially viable geothermal system in over 30 years, using AI models to locate underground heat sources without surface indicators.
Two former Google researchers launched Ricursive Intelligence with $35m to create a “recursive self-improvement loop between AI and the chips that fuel it.”
Anthropic mourned the death of Claude, a beloved albino alligator at the California Academy of Sciences who served as its “unofficial mascot.”
MEME OF THE WEEK
Thanks for reading. Have a great weekend.


