Eric Schmidt, Alexandr Wang and Dan Hendrycks released a splashy new paper on the geopolitics of superintelligence.

Its most notable new idea is the concept of “Mutual Assured AI Malfunction (MAIM): a deterrence regime resembling nuclear mutual assured destruction (MAD) where any state’s aggressive bid for unilateral AI dominance is met with preventive sabotage by rivals.”

According to the authors, “MAIM already describes the strategic picture AI superpowers find themselves in”. Why? Because states are, they argue, incentivized to stop their rivals from developing superintelligence for two reasons. “If, in a hurried bid for superiority, one state inadvertently loses control of its AI, it jeopardizes the security of all states.”

“Alternatively, if the same state succeeds in producing and controlling a highly capable AI, it likewise poses a direct threat to the survival of its peers.” This is because “advanced AI systems may drive technological breakthroughs that alter the strategic balance”, creating an “AI-enabled strategic monopoly on power”.



“Faced with the specter of superweapons and an AI-enabled strategic monopoly on power, some leaders may turn to preventive action. Rather than only relying on cooperation or seeking to outpace their adversaries, they may consider sabotage or datacenter attacks, if the alternative is to accept a future in which one’s national survival is perpetually at risk.”

How might states “maim” other countries’ AI projects? The authors lay out a bunch of pathways, escalating from espionage and covert sabotage to overt cyberattacks, kinetic attacks on data centers (though they stress these are “likely unnecessary”), or broader hostilities.

The conclusion, then, is that it’s against states’ interests to unilaterally race to build superintelligence, and that countries should instead pursue a detente — during which we can have a “slow, multilaterally supervised intelligence recursion—marked by a low risk tolerance and negotiated benefit-sharing—[in which nations] slowly proceed to develop a superintelligence and further increase human wellbeing”.

I highly recommend reading the whole paper: it’s one of very few attempts to seriously consider the geopolitics of advanced AI, and it’s a very interesting read. And of course it’s particularly notable because Schmidt and Wang are vocal China hawks with a bunch of DC influence. In particular, it nicely articulates a very obvious flaw with the “Manhattan Project for ASI” idea pushed by some folks: “[The Manhattan Project] facility, easily observed by satellite and vulnerable to preemptive attack, would inevitably raise alarm. China would not sit idle waiting to accept the US's dictates once they achieve superintelligence or wait as they risk a loss of control. The Manhattan Project assumes that rivals will acquiesce to an enduring imbalance or omnicide rather than move to prevent it.”



But while I think it’s right that the US shouldn’t assume it can unilaterally race to AGI or ASI, I’m less confident that the situation it lays out is as stable an equilibrium as MAD.

Consider why MAD works (insofar as it does). A few things are key:

1. If a nuke is launched at you, you know you’re screwed.

2. You can see a nuclear missile being launched at you.

3. If you see that missile, you can immediately retaliate with your own nuclear strike, which is guaranteed to hurt your adversary approximately just as much as they’ll hurt you.

It’s not obvious that these principles translate well to the MAIM situation.

1. It’s not necessarily the case that ASI will give states a “strategic monopoly on power”, and even if it does, it’s definitely not clear that other states (e.g. China) will believe that it will.

2. Unlike a nuclear missile, which you can see coming at you, you will not have nearly as clear a signal that your adversary is developing ASI, in part because the lines are so blurry. Herbie Bradley thinks that it’s possible states will have good intelligence from espionage, but even if that is the case, it’s not going to be obvious how to interpret the evidence (look at how much debate there is around whether we’re currently on a path to ASI!).

3. Your maiming isn't guaranteed to work (particularly if we move to a more decentralized training regime), and if it doesn’t work, you’re screwed (because now your adversary has ASI, and they know you just tried to destroy it).

I think this ends up leading to a situation that’s much worse than MAD. In a US-China rivalry world where the US is ahead, there are two implications. One is that China is constantly incentivized to do low-level sabotage of the US, even if the US isn’t actually trying to build ASI, because a) China can’t be confident enough that the US isn’t trying to build it, and b) the costs of cyberattacks etc are pretty low so they may as well just constantly try. That would lead to constant instability, and it wouldn’t even necessarily incentivize the US not to pursue ASI, because the cyberattacks might not even be that decisive in slowing down the US.

But while China might be willing to use these low-level sabotage techniques, I think they’d likely not be willing to take the risk of properly sabotaging a US ASI run (especially kinetically). As Michael Horowitz said on a recent ChinaTalk episode: “You need to be not just really confident, but almost absolutely certain that if somebody got to AGI first, that you're just done. That you can't be a fast follower, and probably that it negates your nuclear deterrent … If you even doubted a little bit that AGI would completely negate everything you have, then you might want to wait and see if you can catch up, rather than start a war.” And because the US knows this, and because the potential prize, however unlikely, is so big, the US would aim to build ASI anyway.
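To make the shape of that argument concrete, here is a rough toy model of the strike-or-wait decision. Everything in it is my own illustrative assumption rather than anything from the paper or the podcast: the function names, the payoff structure, and especially the numbers are hypothetical. The idea is simply that a state compares the expected loss from waiting (bad only if ASI turns out to be decisive) against the expected loss from striking (a certain war cost, plus the decisive loss if the sabotage fails).

```python
def strike_threshold(war_cost, decisive_loss, sabotage_success):
    """Minimum credence that ASI is decisive at which striking beats waiting.

    Toy model, all inputs hypothetical:
      war_cost         -- loss incurred by starting a conflict (paid for certain)
      decisive_loss    -- loss if the rival gets a decisive ASI advantage
      sabotage_success -- probability the strike actually stops the project
    Derived from: -war_cost - (1 - s) * p * D > -p * D  =>  p > war_cost / (s * D)
    """
    return war_cost / (sabotage_success * decisive_loss)


def prefers_strike(p_decisive, war_cost, decisive_loss, sabotage_success):
    """Return True if expected utility of striking exceeds that of waiting."""
    # Waiting only hurts if ASI really is decisive.
    eu_wait = -p_decisive * decisive_loss
    # Striking always costs a war, and still loses if the sabotage fails
    # and ASI turns out to be decisive.
    eu_strike = -war_cost - (1 - sabotage_success) * p_decisive * decisive_loss
    return eu_strike > eu_wait
```

With a made-up war cost of 60, a decisive loss of 100, and a 75% chance the sabotage works, the threshold credence comes out at 0.8: a state that is "only" 70% sure ASI is decisive would still prefer to wait. Raising the war cost or lowering the odds of successful sabotage pushes the threshold toward (or past) certainty, which is the intuition behind the "almost absolutely certain" bar in the quote above.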



Perhaps these problems can be resolved by the work Schmidt, Wang and Hendrycks suggest is necessary: clarifying the escalation ladder, better defining what a “destabilizing AI project” is, and working on transparency and verification techniques so states can be more confident about what their adversaries are doing. But I’m very unsure.