My threat model is something like: the very first AGIs will probably be near human-level and won't be too hard to limit/control. But in human society, tyrants are overrepresented among world leaders relative to their share of the population of people smart enough to lead a country. We'll probably end up inventing multiple versions of AGI, some of which may be straightforwardly turned into superintelligences and others not. The worst superintelligence we help to invent may win, and if it doesn't, that will probably be because a different one beats it (or reaches an unbeatable position first). Humans will probably be sidelined even if we survive a battle between super-AGIs. So it would be much safer not to invent them, but it's also hard to avoid inventing them! I have low confidence in my P(catastrophe) and I'm unsure how to increase my confidence.
But I prefer estimating P(catastrophe) over P(doom) because extinction is not all that concerns me. Some stories about AGI lead to extinction, others to mass death, others to dystopia (possibly followed by mass death later), others to utopia followed by catastrophe, and still others to a stable and wonderful utopia (with humanity probably sidelined eventually, which may even be a good thing). I think I could construct a story along any of these lines.
Well said. I also think it’s important to define what is meant by “catastrophe.” Just as an example, I personally would consider it catastrophic to see a future in which humanity is sidelined and subjugated by an AGI (even a “friendly,” aligned one), but many here would likely disagree with me that this would be a catastrophe. I’ve even heard otherwise rational (non-EA) people claim a future in which humans are ‘pampered pets’ of an aligned ASI to be ‘utopian,’ which just goes to show the level of disagreement.
To me, what matters is whether the AGIs are benevolent and have qualia/consciousness. If AGIs are just ordinary computers, albeit smart ones, I may agree with you; if they are conscious and benevolent, I'm okay with being a pet.
I'm not sure whether we could ever truly know if an AGI was conscious or experienced qualia (which are, by definition, not quantifiable). And you're probably right that being a pet of a benevolent ASI wouldn't be a miserable existence, but it is still an x-risk, because it permanently ends humanity's status as the dominant species.
I would caution against thinking the Hard Problem of Consciousness is unsolvable "by definition" (if it is ever solved, qualia will likely become quantifiable). I think the reasonable stance is to presume it is solvable. But until it is solved, we must not allow an AGI takeover; and even if AGIs stay under human control, that could lead to a previously unimaginable power imbalance between a few humans and the rest of us.
Opinions on this are pretty diverse. I largely agree with the bulleted list of things-you-think, and this article paints a picture of my current thinking.