Book Review: 'If Anyone Builds It, Everyone Dies'
Eliezer Yudkowsky and Nate Soares’ new book should be an AI wakeup call — shame it’s such a chore to read
Eliezer Yudkowsky has all the makings of a figure from Greek tragedy. He started off his career trying to build artificial general intelligence, captivated by the prospect of technological and social progress a superhuman mind could bring. But he soon realized that the system he was trying to build could be very hard to control — and potentially deadly. So he pivoted to researching how to make AI safe and spreading word of its dangers. Crisis averted: instead of building the potential doom machine, he would save us from it.
Alas, the warnings of the potential power of AI proved not just alarming, but intoxicating. A young Greg Brockman was an avid reader of Yudkowsky’s work — and went on to co-found OpenAI. Dario Amodei, co-founder of Anthropic, was an early acolyte of the effective altruist and rationalist movements Yudkowsky helped spawn. Yudkowsky introduced Demis Hassabis and Shane Legg (himself a longtime devotee) to their first major backer; Peter Thiel’s resulting investment in DeepMind helped launch the very AI race Yudkowsky had spent years warning against. As Sam Altman once observed, twisting the knife: Yudkowsky has “done more to accelerate AGI than anyone else.”
The story could yet have a happy ending. As the AI race intensifies and we creep ever closer to danger, Yudkowsky could produce a masterpiece: a civilization-shifting book that jolts everyone awake, catalyzing concrete policy action and averting disaster. Humanity could yet pull through.
But If Anyone Builds It, Everyone Dies is not that book.
If Anyone Builds It, written by Yudkowsky and Nate Soares, is an attempt to explain why the authors and their colleagues at the Machine Intelligence Research Institute (MIRI) believe that “superhuman AI would kill us all.” It unfolds in three parts: an explanation of why artificial superintelligence would be dangerous; a concrete scenario of how we might all die; and an attempt to offer some solutions to the predicament we find ourselves in today.
The core arguments the pair advance are fairly intuitive. Intelligence correlates with power, and companies are trying to make more intelligent — and powerful — systems. We don’t really understand how such systems work, and are unable to reliably control their goals. It is likely, the authors argue, that such goals would therefore differ from and conflict with our own. This could end badly.
Yudkowsky and Soares do a decent enough job of outlining these arguments. They deploy some elegant analogies to human evolution — a designer of natural selection would neither have intended nor expected humans to invent non-calorific sweeteners — to explain why we can expect the preferences of advanced AI to be weird and unpredictable. They do an equally admirable job explaining the futility of predicting how a superintelligent AI might kill us:
Our best guess is that a superintelligence will come at us with weird technology that we didn’t even think was possible, that we didn’t understand was allowed by the rules. That is what has usually happened when groups with different levels of technological capabilities meet. It’d be like the Aztecs facing down guns.
The book is particularly incisive when decrying the attitude of many AI developers towards the risks, noting that if the same approach were applied to other safety-relevant domains, we’d consider it grossly negligent: “We’ll make them care about truth, and then we’ll be okay,” or “We’ll just have AI solve the ASI alignment problem for us,” the authors write, “are not what engineers sound like when they respect the problem, when they know exactly what they’re doing.”
Yet for all their strengths, Yudkowsky and Soares often move too fast, failing to spell out things that need spelling out (chains-of-thought and model distillation are both referred to, but never adequately explained). At other times they linger far too long on things they needn’t (five pages on how nuclear reactors work and why Chernobyl happened are particularly misplaced). They do not draw nearly enough on the rapidly expanding body of empirical evidence showing that many of their concerns have already begun to materialize. And though they make a valiant attempt to explain their worldview, the book rests on too many insufficiently justified assumptions. They assert that by default a superintelligence would have goals vastly different from our own — but they do not satisfactorily explain why those goals would necessarily result in our extermination. For all their analogies to evolution and human superiority over animals, they do not offer a compelling explanation for why humans keep monkeys around, even though eliminating them might better satisfy our aims.
The bigger issue, though, is not content but style. The book is littered with tortured sentence structures and writing that is unbearably smug and self-satisfied. Chapters begin with parables, often effective at conveying the point but needlessly self-referential. (“Imagine, if you would — though of course nothing like this ever happened, it being just a parable” is how one opens.) There are only so many times one can struggle through a clause like “but the researchers could not make it be true, that Sable would get…”
If Anyone Builds It is objectively short: 233 pages, not including references, a miracle by Yudkowsky’s standards. But the painful prose makes it feel interminable. The stylistic choices, regularly lapsing into fantasy-novel flourishes, do not project competence.
That is a problem, particularly when the policy remedies they demand are so draconian. Yudkowsky and Soares call for all GPU clusters larger than eight of 2024’s best chips to be monitored by an international authority. “It should not be legal … for people to continue publishing research into more efficient and powerful AI techniques,” they argue. And international treaties, backed by the threat of bombing data centers, must be drawn up to enforce these rules globally.
Given their beliefs, it is completely understandable why they call for such measures. If, as Yudkowsky and Soares argue, we might not get a “warning shot” before the development of superintelligence, it may indeed be sensible to shut everything down right now — after all, the next training run might be the one that brings extinction. But they do not spend nearly enough time explaining why these particular measures are the ones policymakers should take, or why other proposals fall short. They have also chosen to describe their preferred policies in the most shocking way possible. The authors would do well to note that all international agreements are implicitly backed by the threat of military force, rather than just talk about bombing data centers.
Style is in the eye of the beholder, of course, and what grates on me may well appeal to the general reader. Some initial reviews, even critical ones, have praised the book as “extremely readable.” It has received glowing testimonials from a wide range of prominent figures, and it is certainly better written than Yudkowsky’s blog posts. But If Anyone Builds It’s flaws undermine the message the authors argue is so existential, preventing it from being an effective (and much-needed) clarion call about the risks posed by AI.
Towards the end of the book, Yudkowsky and Soares offer a haunting image. “Imagine that every competing AI company is climbing a ladder in the dark. At every rung but the top one, they get five times as much money: 10 billion, 50 billion, 250 billion, 1.25 trillion dollars. But if anyone reaches the top rung, the ladder explodes and kills everyone. Also, nobody knows where the ladder ends.”
“Are we sure that the next rung in the AI escalation ladder is the last fatal step, and not a rung that brings fame and riches to whoever takes it first? No, we are not sure at all … But if we can’t stop climbing while uncertainty remains, we predictably die.”
Yudkowsky and Soares are right that until we figure out how to make advanced AI safe, we must, at some point, stop climbing. If Anyone Builds It is an attempt to bring about that stop. Given the stakes they describe — nothing less than human extinction — one wishes they had crafted their warning with greater care. Rather than undoing Yudkowsky’s inadvertent role in accelerating the very future he fears, this book may instead cement him as the tragic figure he seems destined to be: a Cassandra whose warnings were continually ignored until it was too late.
If Anyone Builds It, Everyone Dies, by Eliezer Yudkowsky and Nate Soares. Little, Brown and Company/Bodley Head; 272 pages; $30.