Quoting Scott Alexander here:

I agree it’s not necessarily a good idea to go around founding the Let’s Commit A Pivotal Act AI Company.
But I think there’s room for subtlety somewhere like “Conditional on you being in a situation where you could take a pivotal act, which is a small and unusual fraction of world-branches, maybe you should take a pivotal act.”
That is, if you are in a position where you have the option to build an AI capable of destroying all competing AI projects, then the moment you notice this you should update heavily in favor of short timelines (zero in your case, but everyone else should be close behind) and fast takeoff speeds (since your AI has these impressive capabilities). You should also update toward existing AI regulation being insufficient (since it was insufficient to prevent you).
Somewhere halfway between “found the Let’s Commit A Pivotal Act Company” and “if you happen to stumble into a pivotal act, take it”, there’s an intervention to spread a norm of “if a good person who cares about the world happens to stumble into a pivotal-act-capable AI, take the opportunity”. I don’t think this norm would necessarily accelerate a race. After all, bad people who want to seize power can take pivotal acts whether we want them to or not. The only people who are bound by norms are good people who care about the future of humanity. I, as someone with no loyalty to any individual AI team, would prefer that (good, norm-following) teams take pivotal acts if they happen to end up with the first superintelligence, rather than not doing that.
Another way to think about this is that all good people should be equally happy with any other good person creating a pivotal AGI, so they won’t need to race among themselves. They might be less happy with a bad person creating a pivotal AGI, but in that case you should race and you have no other option. I realize “good” and “bad” are very simplistic but I don’t think adding real moral complexity changes the calculation much.
I am more concerned about your point where someone rushes into a pivotal act without being sure their own AI is aligned. I agree this would be very dangerous, but it seems like a job for normal cost-benefit calculation: what’s the risk of your AI being unaligned if you act now, vs. someone else creating an unaligned AI if you wait X amount of time? Do we have any reason to think teams would be systematically biased when making this calculation?
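To make that cost-benefit framing concrete, here is a toy expected-risk comparison in Python. Everything in it is a hypothetical stand-in (the function names, the 10% misalignment chance, the 5% per-year competitor hazard rate); it is a minimal sketch of the comparison Scott describes, not a real estimate of any of these quantities.

```python
# Toy sketch of the "act now vs. wait" cost-benefit comparison.
# Every number below is a hypothetical placeholder, not a real estimate.

def risk_act_now(p_own_misaligned: float) -> float:
    """Risk if we act immediately: our own AI turns out to be unaligned."""
    return p_own_misaligned

def risk_wait(p_other_per_year: float, years: float) -> float:
    """Risk if we wait: some other project deploys an unaligned AI first.
    Assumes a constant per-year hazard rate, so the risk compounds."""
    return 1 - (1 - p_other_per_year) ** years

# Assumed placeholder inputs:
p_own = 0.10       # chance our current system is unaligned if we act now
p_other = 0.05     # per-year chance a competitor deploys an unaligned AI
wait_years = 3     # how long until we'd be confident enough in alignment

now, later = risk_act_now(p_own), risk_wait(p_other, wait_years)
print(f"act now: {now:.2f} vs. wait {wait_years}y: {later:.2f}")
# With these placeholder numbers: act now: 0.10 vs. wait 3y: 0.14
```

Under one set of assumptions waiting looks worse; under another, acting now does. The question of systematic bias is then a question about which of these inputs teams will predictably misestimate.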
I’m more confident than Scott that the first AGI systems will be capable enough to execute a pivotal act (though alignability is another matter!). And, unlike Scott, I think AGI orgs should take the option more seriously at an earlier date, and center more of their strategic thinking around this scenario class. But if you don’t agree with me there, I think you should endorse a position more like Scott’s.
The alternative seems to amount to writing off futures where early AGI systems are highly capable or impactful — giving up in advance, effectively deciding that endorsing a strategy that sounds weirdly extreme is a larger price to pay than human extinction. Phrased in those terms, this seems obviously absurd. (And more absurd still if you agree with me that it means writing off most possible futures.)
Nuclear weapons were an extreme technological development in their day, and mutually assured destruction (MAD) was an extreme and novel strategy developed in response to their novel properties. Strategically novel technologies force us to revise our strategies in counterintuitive ways. The responsible way to handle this is to seriously analyze the new strategic landscape, have conversations about it, and engage in dialogue between major players until we collectively have a clear-sighted picture of what strategy makes sense, even if that strategy sounds weirdly extreme relative to other strategic landscapes.
If there’s some alternative to intervening on AGI proliferation, that seems important to know as well. But if so, we should discover it via investigation, argument, and analysis of the strategic situation, rather than by encouraging a mindset under which most of the relevant strategy space is taboo or evil (and then just hoping that this part of the strategy space doesn’t end up being relevant).