I wasn’t exclusively looking at that line; I was also assuming that if Will liked some of the book’s core policy proposals but disliked others, then he probably wouldn’t have expressed such a strong blanket rejection. And I was looking at Will’s proposal here:
[IABIED skips over] what I see as the crucial period, where we move from the human-ish range to strong superintelligence[1]. This is crucial because it’s both the period where we can harness potentially vast quantities of AI labour to help us with the alignment of the next generation of models, and because it’s the point at which we’ll get a much better insight into what the first superintelligent systems will be like. The right picture to have is not “can humans align strong superintelligence”, it’s “can humans align or control AGI-”, then “can {humans and AGI-} align or control AGI” then “can {humans and AGI- and AGI} align AGI+” and so on.
This certainly sounds like a proposal that we advance AI as fast as possible, so that we can reach the point where productive alignment research is possible sooner.
The next paragraph then talks about “a gradual ramp-up to superintelligence”, which makes it sound like Will at least wants us to race to the level of superintelligence as quickly as possible, i.e., he wants the chain of humans-and-AIs-aligning-stronger-AIs to go at least that far:
Elsewhere, EY argues that the discontinuity question doesn’t matter, because preventing AI takeover is still a ‘first try or die’ dynamic, so having a gradual ramp-up to superintelligence is of little or no value. I think that’s misguided.
… Unless he thinks this “gradual ramp-up” should be achieved via switching over at some point from the natural continuous trendlines he expects from industry, to top-down government-mandated ratcheting up of a capability limit? But I’d be surprised if that’s what he had in mind, given the rest of his comment.
Wanting the world to race to build superintelligence as soon as possible also seems like it would be a not-that-surprising implication of his labs-have-alignment-in-the-bag claims.
And although it’s not totally clear to me how seriously he’s taking this hypothetical (versus whether he mainly intends it as a proof of concept), he does propose that we could build a superintelligent paperclip maximizer and plausibly be totally fine (because it’s risk averse and promise-keeping), and his response to “Maybe we won’t be able to make deals with AIs?” is:
I agree that’s a worry; but then the right response is to make sure that we can.
Not “in that case maybe we shouldn’t build a misaligned superintelligence”, but “well then we’d sure better solve the honesty problem!”.
All of this together makes me extremely confused if his real view is basically just “I agree with most of MIRI’s policy proposals but I think we shouldn’t rush to enact a halt or slowdown tomorrow”.
If his view is closer to that, then that’s great news from my perspective, and I apologize for the misunderstanding. I was expecting Will to just straightforwardly accept the premises I listed, and for the discussion to proceed from there.
I’ll add a link to your comment at the top of the post so folks can see your response, and if Will clarifies his view I’ll link to that as well.
Twitter says that Will’s tweet has had over a hundred thousand views, so if he’s a lot more pro-compute-governance, pro-slowdown, and/or pro-halt than he sounded in that message, I hope he says loud stuff in the near future to clarify his views to folks!