This profile by 80k is pretty bad in that it glosses over all the intermediate steps and reduces everything to “But one day, every single person in the world suddenly dies.”
Universal Paperclips is slightly better about this, showing the process of the AI gaining our trust before betraying us, but the key power-grab step is still reduced to just “release the hypnodrones”.
There are other places that have fleshed out the details of how misaligned power-seeking might play out, such as Holden Karnofsky’s post AI Could Defeat All Of Us Combined.
That particular story, in which I write “one day, every single person in the world suddenly dies”, is about a fast-takeoff self-improvement scenario. In such scenarios, a sudden takeover is exactly what we should expect, and the intermediate steps set out by Holden and others don’t apply. Any guess about what sort of advanced technology would accomplish this necessarily makes the scenario less likely, and I think such guesses (e.g. “hypnodrones”) are extremely likely to be false and aren’t useful or informative.
For what it’s worth, I personally agree that slow-takeoff scenarios like those described by Holden (or indeed those I discuss in the rest of this article) are far more likely. That’s why I focus on many different ways in which an AI could take over, rather than on any particular failure story. And, as I discuss, any particular combination of steps is necessarily less likely than the claim that any or all of these capabilities could be used.
But a significant fraction of people working on AI existential safety disagree with both of us, and think that a scenario in which a sufficiently advanced system suddenly kills all humans really is the most likely way for this catastrophe to play out! That’s why I also included a story which doesn’t explain these intermediate steps, even though my inside view is that this is less likely to occur.
I’m one of the AI researchers worried about fast takeoff. Yes, it’s probably incorrect to pick any particular sudden-death scenario and say it’s how it’ll happen, but you can provide some guesses and a better illustration of one or more possibilities. For example, have you read Valuable Humans In Transit? https://qntm.org/transit