Nuclear brinksmanship is not a good AI x-risk strategy

In a recent article, Eliezer Yudkowsky advocates for the following measures to stop AI development:

Shut down all the large GPU clusters (the large computer farms where the most powerful AIs are refined). Shut down all the large training runs. Put a ceiling on how much computing power anyone is allowed to use in training an AI system, and move it downward over the coming years to compensate for more efficient training algorithms. No exceptions for anyone, including governments and militaries. Make immediate multinational agreements to prevent the prohibited activities from moving elsewhere. Track all GPUs sold. If intelligence says that a country outside the agreement is building a GPU cluster, be less scared of a shooting conflict between nations than of the moratorium being violated; be willing to destroy a rogue datacenter by airstrike.

Frame nothing as a conflict between national interests, have it clear that anyone talking of arms races is a fool. That we all live or die as one, in this, is not a policy but a fact of nature. Make it explicit in international diplomacy that preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange, and that allied nuclear countries are willing to run some risk of nuclear exchange if that’s what it takes to reduce the risk of large AI training runs.

In this passage, Eliezer explicitly calls for airstrikes on “rogue datacentres”, and for being willing to declare war on countries that build GPU clusters. The last sentences are somewhat vaguer, in that he does not specify which actions would both “run some risk of nuclear exchange” and “reduce the risk of large AI training runs”. In the context of the previous passages, the only interpretation that makes sense to me is “threaten to bomb datacentres in nuclear-armed countries”. If Eliezer meant something else, I am open to being corrected and will edit this post. Regardless, I still think the idea is worth critiquing, because some people have agreed with it, and I think it is a really, really, really bad idea.

Before I make my argument, I should say that how bad this idea is depends greatly on your beliefs about the inevitability of AI doom. It’s plausible that the proposal makes sense for Yudkowsky, given his belief that p(doom) is near 100%, and I’m not condemning him for making the argument from within that worldview. However, the majority of people here think the odds of AI apocalypse are much lower than 100%, so I will make the case from that perspective.

1. Nuclear blackmail doesn't work

Suppose Xi Jinping declares tomorrow that he believes AI is an imminent existential threat. He then issues an ultimatum to the US: dismantle OpenAI and ban AI research within 6 months, or China will launch airstrikes on Silicon Valley datacentres.

I would estimate the chance of the US complying with this ultimatum to be ridiculously low (<1%). The reasoning would go something like this:

There is a high chance that China is bluffing. If the US gives in to this ultimatum, there is nothing stopping China from making another ultimatum, and then another, and another, with the US effectively ceding its sovereignty to China. And not just China: other nations would see that the tactic worked and join in the fun. By multiplying the number of bombing-blackmail incidents, giving in might actually raise the chance of warfare more than holding out would.

In practice, of course, the response would involve rather more flag-waving, swearing, and other displays of nationalism and patriotism. The official position would amount to “bugger off, we do what we want, and if you bomb us we will bomb you back”.

In fact, the ultimatum would be more likely to accelerate AI research, as the US raced to find out whether AI could be used to defend against nukes (with safety almost certainly sacrificed along the way). The US would also probably start hiding datacentres where they are safe from airstrikes.

Now what happens when the ultimatum runs out? Either China backs down, in which case the ultimatum was worthless and actively counterproductive, or China bombs Silicon Valley, potentially starting nuclear Armageddon.

In this scenario, the risk of nuclear war has been raised significantly, but the risk of AI extinction has not been reduced at all. In fact, it has arguably increased, because everyone involved is now more desperate.

Now, the actual proposal would not be a single nation announcing a threat out of the blue. It would involve multiple nations signing treaties together, precommitting to attacks as a last resort. But what if a nuclear-armed country doesn’t sign on to the treaty and doesn’t respond to other avenues of negotiation? In effect, you’re back to the scenario above, which, as I’ve explained, doesn’t work at all.

2. If the world takes AI risk seriously, do we need threats?

A world in which most countries take AI risk seriously enough to risk nuclear war over it is a very different world from the one we live in today. With that level of concern, the opportunities for alternative plans are massive. We could pour billions or even trillions into alignment research, and have orders of magnitude more people working on the problem, including the best of the best in every field. If the world does go down the “ban clusters” route, then there are plenty of nonviolent options for dealing with rogue nations, given the massive resources available. For example, we could impose sanctions, or try to cut off their internet access or their semiconductor supply. I’m not advocating for these measures myself, but I am pointing out that they are far preferable to risking nuclear war, and that they are available options in such a world.

Your estimate of x-risk may be high in today’s world. But the chances of AI x-risk are conditional on what humanity actually does, and in the world described above, I think most people’s estimates would be substantially lower than they are now. This makes the math supporting nuclear brinksmanship even worse.
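To make that concrete, here is a minimal back-of-the-envelope sketch. Every number in it is a hypothetical placeholder chosen purely for illustration, not an estimate I’m defending; the point is only the structure of the comparison.

```python
# Crude expected-disvalue comparison between two policies.
# All probabilities are hypothetical placeholders, not real estimates.

def expected_disvalue(p_ai_doom, p_nuclear_war, nuclear_badness=0.5):
    """Treat AI extinction as disvalue 1.0 and a full nuclear exchange as
    `nuclear_badness` (an assumed relative weight), ignoring interactions."""
    return p_ai_doom + p_nuclear_war * nuclear_badness

# A world that takes AI risk seriously and uses only nonviolent enforcement:
baseline = expected_disvalue(p_ai_doom=0.05, p_nuclear_war=0.01)

# The same world, plus threats to bomb clusters in nuclear-armed states:
brinksmanship = expected_disvalue(p_ai_doom=0.04, p_nuclear_war=0.10)

print(baseline, brinksmanship)  # 0.055 vs 0.09 under these made-up numbers
```

Under these made-up numbers, even granting brinksmanship a modest reduction in p(doom), the added nuclear risk swamps the benefit once your starting p(doom) is well below 100%.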

3. Don’t do morally wrong things

In the wake of the FTX fraud scandal, EA has repeatedly tried to emphasize that you shouldn’t do ethically dubious things in pursuit of EA goals. I think the proposals above would involve doing ethically dubious things. For example, suppose a small country like Uganda tried to build an AI for medical discoveries, to cure a disease plaguing the country. Under this proposal, outsiders with whom Uganda has signed no treaty would order it to stop its research under threat of bombing, and then actually bomb it if it did not comply. It’s true that we could try to compensate such countries in other ways, but I still find this morally wrong.

4. Nuclear exchanges could be part of a rogue AI's plan

If we are already in a world with a scheming AI that wants to kill us all, then “start a nuclear war” seems like a fairly obvious move for it to make, assuming it has planned ahead. This is especially true if the world’s governments are all cooperating to suppress its available resources. As long as the AI can survive the exchange somewhere, blowing everything up gives it a pretty good chance of finishing the job.

With this in mind, if we are concerned about AI killing us all, we should also be making nuclear exchanges harder to trigger. Putting the line for nuclear brinksmanship at “a country builds GPU clusters” makes the AI’s job incredibly easy. It doesn’t even have to build any clusters in a nuclear-armed nation. All it has to do is launch an Iraq-war-WMD-style disinformation campaign that convinces the world the clusters exist, then watch as we militarily demand that Russia dismantle its non-existent hidden GPU clusters.

Conclusion

I hope the arguments above are enough to convince people that a policy of threatening to bomb clusters in nuclear-armed nations is a bad idea that should not be pursued. It’s possible that Eliezer was not even arguing for this, but since the idea has now been floated by implication, I think it’s important to refute it.