This proposal seems to have become extremely polarizing, more so and for different reasons than I would have expected after first reading it. I am more on the “this is pretty fine” side of the spectrum, and think some of the reasons it has been controversial are sort of superficial. Given this, though, I want to steelman the other side (I know Yudkowsky doesn’t like steelmanning, too bad, I do), with a few things that are plausibly bad about it that I don’t think are superficial or misreadings, as well as the start of my reasons for worrying less about them:
“While it’s true that this isn’t ‘the same as’ calling for outright violence, if we are at least a little bit on Orwell’s side on political violence, surely the position that significantly risking nuclear war is more acceptable than a few terrorists bombing a GPU center seems quite silly. If he supports the former but not the latter, that is quite an extreme position!”:
I’m sympathetic to this, in no small part because I lean Orwell on state violence in many cases, but I think it misunderstands Yudkowsky’s problem with the terrorists. It’s not that the fact that this is terrorism adds enough intrinsic badness to outweigh a greater chance of literal nuclear war; it’s that a legitimate state authority credibly being willing to go to nuclear war is likely to actually work, while terrorism is just a naïve tactic which will likely backfire. In fact nothing even rests on the idea that preventing AI is worth a nuclear war (though in a quite bad and now-deleted tweet Yudkowsky does argue for this, and given that he expects survivors of nuclear war but not of AI misalignment, nothing about this judgement rests on his cringe “reaching the stars” aside). If a NATO country is invaded, letting it be invaded is surely not as bad as global nuclear war, but supporters of Article 5 tacitly accept the cost of risking this outcome, because non-naïve consequentialism cares about credibly backing certain important norms, even when, in isolation, the cost of going through with them doesn’t look worth it.
“While there are international resolutions that involve credibly risking nuclear war, like Article 5, and there are international resolutions that involve punishing rogue states, like ones governing the development of weapons of mass destruction, the combination of these two is in practice not really there, so pointing to each in isolation fails to recognize the way this proposal pushes a difference in degrees all the way to basically a difference in kind”:
I am again sympathetic to this. What Yudkowsky is proposing here is kind of a big deal, and it involves a stricter international order than we have ever seen before. This is very troubling! It isn’t clear that there is a single difference in kind (except perhaps democracy) between Stalinism and a state that I would be mostly fine with. It’s largely about pushing state powers and state flaws that are tolerable at some degree to a point where they are no longer tolerable. I’m just not certain whether his proposal reaches this crucial level for me. One reason is that I’m not sure what level of international control really crosses that line for me, and risking war to prevent x-risk seems like a candidate okay thing for countries to apply a unique level of force to, certainly if you believe the things Yudkowsky does. The second reason, however, is that his actual proposal is ambiguous in crucial ways that I will cover in the third point below, so I would probably be okay with some but not other versions of it.
“Yes there is a difference between state force and random acts of violence, but it isn’t clear what general heuristic we can use to distinguish the two other than ‘one is carried out by a legitimate state authority and the other isn’t’. We know what this looks like at the country level because the relevant state authority is usually pretty clear, but on the international stage this is just not something where the difference between legitimate and illegitimate governance is super obvious! How many countries have to sign up? What portion of the world population do they have to represent? How powerful on the international stage do they already have to be? What agreement mechanism/mediating body is needed to grant this authority? Surely there is a difference in the types of ways Yudkowsky cares about between violence carried out by the mafia versus the local government, but on the international stage this sort of difference is very foggy. Given the huge difference he places on illegitimate versus legitimate force, Yudkowsky should have been specific about what it would take for such a governing agreement to be legitimate. Otherwise people can fill in the details however they think one should answer this question, and Yudkowsky specifically is shielded from the most relevant sort of criticism he could face for his proposal”:
This is the objection I am most sympathetic to, and the place I wish critics would focus most of their attention. If NATO agrees to this treaty, does that give them legitimate authority to threaten China with drone strikes, in a way that isn’t just like the mafia writing threatening letters to AI developers? What if China joins in with NATO; does this grant the authority to threaten Russia? Probably not in either case. But whereas the threshold here depends ambiguously on which countries sign up, it’s pretty clear when a country-wide law becomes legitimate, because there’s an agreed-upon legitimate process for passing it. These are questions deeply tied to any proposal like this, and it does bug me how little Yudkowsky has spelled this out. That said, I think this is sort of a problem for everyone? As I’ve said, and Yudkowsky has said, basically everyone distinguishes between state enforcement and random acts of civilian violence, and beyond that, basically everyone seems confused about how to apply this distinction at the international scale on ambiguous margins. Insofar as you want to apply something like this at the international scale sometimes, you have to live with this tension, and probably just remain a bit confused.