Hey Rob, thanks for writing this, and sorry for the slow response. In brief, I think you do misunderstand my views, in ways that Buck, Ryan and Habryka point out. I’ll clarify a little more.
Some areas where the criticism seems reasonable:
I think it’s fair to say that I worded the compute governance sentence poorly, in ways Habryka clarified.
I’m somewhat sympathetic to the criticism that there was a “missing mood” (cf e.g. here and here), given that a lot of people won’t know my broader views. I’m very happy to say: “I definitely think it will be extremely valuable to have the option to slow down AI development in the future,” as well as “the current situation is f-ing crazy”. (Though there was also a further vibe on Twitter of “we should be uniting rather than disagreeing”, which I think is a bad road to go down.)
Now, clarifying my position:
Here’s what I take IABI to be arguing (written by GPT5-Pro, on the basis of a pdf, in an attempt not to infuse my biases):
The book argues that building a superhuman AI would be predictably fatal for humanity and therefore urges an immediate, globally enforced halt to AI escalation—consolidating and monitoring compute under treaty, outlawing capability‑enabling research, and, if necessary, neutralizing rogue datacenters—while mobilizing journalists and ordinary citizens to press leaders to act.
And what readers will think the book is about (again written by GPT5-Pro):
A “shut‑it‑all‑down‑now” manifesto warning that any superintelligent AI will wipe us out unless governments ban frontier AI and are prepared to sabotage or bomb rogue datacenters—so the public and the press must demand it.
The core message of the book is not merely “AI x-risk is worryingly high” or “stopping or slowing AI development would be one good strategy among many.” I wouldn’t disagree with the former at all, and my disagreement with the latter would be more about the details.
Here’s a different perspective:
AI takeover x-risk is high, but not extremely high (e.g. 1%-40%). The right response is an “everything and the kitchen sink” approach — there are loads of things we can do that all help a bit in expectation (both technical and governance, including mechanisms to slow the intelligence explosion), many of which are easy wins, and right now we should be pushing on most of them.
This is my overall strategic picture. If the book had argued for that (or even just the “kitchen sink” approach part) then I might have disagreed with the arguments, but I wouldn’t feel, “man, people will come away from this with a bad strategic picture”.
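(To put the “all help a bit in expectation” point in toy-model terms, with numbers that are purely illustrative rather than my actual estimates, and assuming, unrealistically, that the interventions are independent:

$$P(\text{takeover}) \approx p_0 \prod_{i=1}^{n} f_i, \qquad \text{e.g. } 0.2 \times 0.9^{10} \approx 0.07,$$

i.e. ten interventions that each cut the remaining risk by ~10% would together cut a 20% baseline risk to ~7%. None of them is decisive on its own, but the portfolio matters a lot in expectation.)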
(I think the whole strategic picture would include:
There are a lot of other existential-level challenges, too (including human coups / concentration of power), and ideally the best strategies for reducing AI takeover risk shouldn’t aggravate these other risks.
But I think that’s fine not to discuss in a book focused on AI takeover risk.)
This is also the broad strategic picture, as I understand it, of e.g. Carl, Paul, Ryan, Buck. It’s true that I’m more optimistic than they are (on the 80k podcast I say 1-10% range for AI x-risk, though it depends on what exactly you mean by that) but I don’t feel deep worldview disagreement with them.
With that in mind, some reasons why I think the promotion of the Y&S view could be meaningfully bad:
If it means more people don’t pursue the better strategy of focusing on the easier wins.
Or they end up making the wrong tradeoffs. (e.g. intense centralisation of AI development in a way that makes misaligned human takeover risk more likely)
Or people might lapse into defeatism: “Ok we’re doomed, then: a decades-long international ban will never happen, so it’s pointless to work on AI x-risk.” (We already see this reaction to climate change, given doomerist messaging there. To be clear, I don’t think that sort of effect should be a reason for being misleading about one’s views.)
Overall, I feel pretty agnostic on whether Y&S shouting their message is on net good for the world.
I think I’m particularly triggered by all this because of a conversation I had last year with someone who takes AI takeover risk very seriously and could double AI safety philanthropy if they wanted to. I was arguing they should start funding AI safety, but the conversation was a total misfire because they conflated “AI safety” with “stop AI development”: their view was that that will never happen, and they were actively annoyed that they were hearing what they considered to be such a dumb idea. My guess was that EY’s TIME article was a big factor there.
Then, just to be clear, here are some cases where you misunderstand me, focusing only on the most severe misunderstandings:
he’s more or less calling on governments to sit back and let it happen
I really don’t think that!
He thinks feedback loops like “AIs do AI capabilities research” won’t accelerate us too much first.
I vibeswise disagree, because I expect massive acceleration and I think that’s *the* key challenge: See e.g. PrepIE, 80k podcast.
But there is a grain of truth in that my best guess is a more muted software-only intelligence explosion than some others predict. E.g. a best guess where, once AI fully automates AI R&D, we get 3-5 years of progress in 1 year (at current rates), rather than 10+ years’ worth, or rather than godlike superintelligence. This is the best analysis I know of on the topic. This might well be the cause of much of the difference in optimism between me and e.g. Carl.
(Note I still take the much larger software explosions very seriously (e.g. 10%-20% probability). And I could totally change my mind on this — the issue feels very live and open to me.)
Will thinks government compute monitoring is a bad idea
Definitely disagree with this one! In general, society having more options and levers just seems great to me.
he’s sufficiently optimistic that the people who build superintelligence will wield that enormous power wisely and well, and won’t fall into any traps that fuck up the future
Definitely disagree!
Like, my whole bag is that I expect us to fuck up the future even if alignment is fine!! (e.g. Better Futures)
He’s proposing that humanity put all of its eggs in this one basket
Definitely disagree! From my POV, it’s the IABI perspective that is closer to putting all the eggs in one basket, rather than advocating for the kitchen sink approach.
It seems hard to be more than 90% confident in the whole conjunction, in which case there’s a double-digit chance that the everyone-races-to-build-superintelligence plan brings the world to ruin.
But “10% chance of ruin” is not what EY&NS, or the book, is arguing for, and isn’t what I was arguing against. (You could logically have the view of “10% chance of ruin and the only viable way to bring that down is a global moratorium”, but I don’t know anyone who has that view.)
a conclusion like “things will be totally fine as long as AI capabilities trendlines don’t change.”
Also not true, though I am more optimistic than many on the takeover side of things.
to advocate that we race to build it as fast as possible
Also not true—e.g. I write here about the need to slow the intelligence explosion.
There’s a grain of truth in that I’m pretty agnostic on whether speeding up or slowing down AI development right now is good or bad. I flip-flop on it, but I currently lean towards thinking that speeding up at the moment is mildly good, for a few reasons: it stretches out the IE by bringing it forwards, it means there’s more of a compute constraint and so the software-only IE doesn’t go as far, and it means society wakes up earlier, giving more time to invest in the alignment of more-powerful AI.
(I think if we’d gotten to human-level algorithmic efficiency at the Dartmouth conference, that would have been good, as compute build-out is intrinsically slower and more controllable than software progress (until we get nanotech). And if we’d scaled up compute + AI to 10% of the global economy decades ago, and maintained it at that level, that also would have been good, as then the frontier pace would be at the rate of compute-constrained algorithmic progress, rather than the rate we’re getting at the moment from both algorithmic progress AND compute scale-up.)
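(A crude way of putting the same point, as a toy decomposition rather than a real model: the frontier pace is roughly

$$g_{\text{effective compute}} \;\approx\; g_{\text{algorithmic efficiency}} \;+\; g_{\text{physical compute}}.$$

If compute spend had already been pushed to its economic ceiling, the second term would be close to zero and the frontier would move at the compute-constrained algorithmic rate alone; at the moment we’re getting both terms at once.)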
In general, I think that how the IE happens and is governed is a much bigger deal than when it happens.
like, I still associate Will to some degree with the past version of himself who was mostly unconcerned about near-term catastrophes and thought EA’s mission should be to slowly nudge long-term social trends.
Which Will-version are you thinking of? Even in DGB I wrote about preventing near-term catastrophes as a top cause area.
I think Will was being unvirtuously cagey or spin-y about his views
Again really not intended! I think I’ve been clear about my views elsewhere (see previous links).
Ok, that’s all just spelling out my views. Going back, briefly, to the review. I said I was “disappointed” in the book — that was mainly because I thought that this was Y&S’s chance to give the strongest version of their arguments (though I understood they’d be simplified or streamlined), and the arguments I read were worse than I expected (even though I didn’t expect to find them terribly convincing).
Regarding your object-level responses to my arguments — I don’t think any of them really support the idea that alignment is so hard that AI takeover x-risk is overwhelmingly likely, or that the only viable response is to delay AI development by decades. E.g.
As Joe Collman notes, a common straw version of the If Anyone Builds It, Everyone Dies thesis is that “existing AIs are so dissimilar” to a superintelligence that “any work we do now is irrelevant,” when the actual view is that it’s insufficient, not irrelevant.
But if it’s a matter of “insufficiency”, the question is how one can be so confident that any work we do now (including with ~AGI assistance, including if we’ve bought extra time via control measures and/or deals with misaligned ~AGIs) is insufficient, such that the only thing that makes a meaningful difference to x-risk, even in expectation, is a global moratorium. And I’m still not seeing the case for that.
(I think I’m unlikely to respond further, but thanks again for the engagement.)
the better strategy of focusing on the easier wins
I feel that you are not really appreciating the point that such “easier wins” aren’t in fact wins at all, in terms of keeping us all alive. They might make some people feel better, but they are very unlikely to reduce AI takeover risk to, say, a comfortable 0.1%. (In fact, I don’t think they will reduce it to below 50%.)
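(To illustrate the size of the gap with made-up numbers: getting from 50% to 0.1% means cutting the risk by a factor of 500. Even granting each “easier win” an independent 30% cut to the remaining risk, which seems generous, you’d need

$$n \;\ge\; \frac{\ln(0.001/0.5)}{\ln(0.7)} \;\approx\; 17.4,$$

i.e. around 18 of them stacked on top of each other.)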
I think I’m particularly triggered by all this because of a conversation I had last year with someone who takes AI takeover risk very seriously and could double AI safety philanthropy if they wanted to. I was arguing they should start funding AI safety, but the conversation was a total misfire because they conflated “AI safety” with “stop AI development”: their view was that that will never happen, and they were actively annoyed that they were hearing what they considered to be such a dumb idea. My guess was that EY’s TIME article was a big factor there.
Well, hearing this, I am triggered that someone “who takes AI takeover risk very seriously” would think that stopping AI development was “such a dumb idea”! I’d question whether they actually take AI takeover risk seriously at all. Whether or not a Pause is “realistic” or “will never happen”, we have to try! It really is our only shot if we actually care about staying alive for more than another few years. More people need to realise this. And I still don’t understand how people can think that the default outcome of AGI/ASI is survival for humanity, or an OK outcome.
...the question is how one can be so confident that any work we do now (including with ~AGI assistance, including if we’ve bought extra time via control measures and/or deals with misaligned ~AGIs) is insufficient, such that the only thing that makes a meaningful difference to x-risk, even in expectation, is a global moratorium. And I’m still not seeing the case for that.
I’d flip this completely, and say: the question is why we should be so confident that any work we do now (including with AI assistance, including if we’ve bought extra time via control measures and/or deals with misaligned AIs) is sufficient to solve alignment, such that a global moratorium, the one thing that would make a meaningful difference to x-risk even in expectation, is unnecessary. I’m still not seeing the case for that.
(I think if we’d gotten to human-level algorithmic efficiency at the Dartmouth conference, that would have been good, as compute build-out is intrinsically slower and more controllable than software progress (until we get nanotech). And if we’d scaled up compute + AI to 10% of the global economy decades ago, and maintained it at that level, that also would have been good, as then the frontier pace would be at the rate of compute-constrained algorithmic progress, rather than the rate we’re getting at the moment from both algorithmic progress AND compute scale-up.)
This is an interesting thought experiment. I think it probably would’ve been bad, because it would’ve initiated an intelligence explosion. Sure, it would’ve started off very slowly, but it would’ve gathered steam inexorably, speeding up tech development, including compute scaling. And all this before anyone had even considered the alignment problem. After a couple of decades, perhaps humanity would already have been gradually disempowered past the point of no return.