Hiii! I’d break this down into two questions and give my answers to each. I don’t know whether my takes are still the most widely shared ones, but they’re the arguments that were most convincing to me.
Why focus on AI when it comes to s-risks: Tobias Baumann’s book and his blog posts are probably the best online sources on this topic. The basic idea is that during multipolar takeoffs, i.e. if several groups on Earth build two or more similarly powerful AIs (or for acausal reasons, or if AIs meet aliens at some later point), it seems likely that by default these AIs will be at war. I’ll defer to Tobias’s book when it comes to all the specific scenarios that can occur here and cause suffering.
For me, the urgency of this subarea of s-risks rests largely on its tractability. AI alignment requires that (1) we find a solution to inner alignment, (2) we find a solution to outer alignment, (3) we use those solutions to implement either corrigibility (not sure if that’s the latest term?) or something akin to human values, and (4) we convince every last group building AIs to do the same forever. I’m pessimistic that all of that can succeed, but people smarter than me still seem somewhat optimistic, so I could easily be missing something.
But if we aim at the bigger target of just avoiding wars (zero-sum conflicts) between AIs, the particular values of the AIs don’t matter. Whatever its values are, every rational AI would rather resolve a conflict in a positive-sum way than in a zero-sum way.
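To make that concrete with some made-up numbers: suppose two AIs contest a resource that each values at 100, each would win an all-out war with probability 0.5, and fighting costs each side 30 in expectation. War is then worth 0.5 · 100 − 30 = 20 to each of them, while a negotiated 50/50 split is worth 50, so any split that leaves each side with more than 20 beats fighting, no matter what the AIs actually value.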
CLR has already identified a bunch of problems, e.g., that the bargaining solutions that allow for positive-sum conflict resolution don’t work at all when different sides use different ones, i.e., everyone needs to agree on the same bargaining solution. But that is again an example of a problem where every AI is on our side. No AI that already has values wants to be aligned with different values. But every AI, regardless of its values, will want to pick a bargaining solution such that it can resolve conflicts in a mutually beneficial rather than internecine way. Perhaps it’ll be enough to publish a bunch of arguments for why, say, Nash is the Schelling point bargaining solution; if those end up in the training data that is used for most AIs, that may convince them that Nash really is the Schelling point among AIs trained on that data. But that’s all part of the research that still needs to be done.
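To illustrate why agreement on a single bargaining solution matters, here is a toy sketch of my own (not taken from CLR’s work): it computes two standard bargaining solutions, Nash and Kalai–Smorodinsky, for the same simple surplus-splitting problem. They pick different points on the Pareto frontier, so if each agent demands the share that its preferred solution prescribes, the combined demands exceed what’s available, even though both agents prefer any split to the disagreement outcome. All the numbers and utility functions are made up for illustration.

```python
# Toy illustration (my own construction): two agents split a surplus of 10 units.
# Agent 1's utility is linear in its share; agent 2's is concave (sqrt).
# The disagreement point -- fighting -- gives both a utility of 0.
import math

PIE = 10.0
GRID = [i / 1000 * PIE for i in range(1001)]  # candidate shares for agent 1

def utilities(x1):
    """Utilities of the two agents when agent 1 gets x1 and agent 2 gets PIE - x1."""
    return x1, math.sqrt(PIE - x1)

# Nash bargaining solution: maximize the product of gains over the disagreement point (0, 0).
nash_x1 = max(GRID, key=lambda x1: utilities(x1)[0] * utilities(x1)[1])

# Kalai-Smorodinsky solution: equalize each agent's gain as a fraction of its ideal gain.
u1_ideal, u2_ideal = PIE, math.sqrt(PIE)
ks_x1 = min(GRID, key=lambda x1: abs(utilities(x1)[0] / u1_ideal - utilities(x1)[1] / u2_ideal))

print(f"Nash solution:              agent 1 gets {nash_x1:.2f}, agent 2 gets {PIE - nash_x1:.2f}")
print(f"Kalai-Smorodinsky solution: agent 1 gets {ks_x1:.2f}, agent 2 gets {PIE - ks_x1:.2f}")
# If agent 1 demands its Nash share and agent 2 demands its KS share, the demands
# exceed the pie -- the split fails even though both solutions are "fair".
print(f"Combined mismatched demands: {nash_x1 + (PIE - ks_x1):.2f} > {PIE}")
```

The grid search just stands in for the closed-form solutions to keep the sketch short and self-contained.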
Personally, I also disvalue certain intense forms of suffering at least 10^8 times as much as death (for short durations, where I have intuitions), but that varies between people.
You could say that neglectedness also favors s-risks from AI over x-risks from AI, since there are probably hundreds of people working on alignment but only some 30 or fewer working on cooperative AI. But really, both are ridiculously neglected.
When it comes to other sources of s-risks, “AI war”–type agential s-risks seem worse to me than natural ones and easier to avert than incidental ones.
Tractability of long-term interventions: I don’t think s-risks are special here, and I don’t have any novel thoughts on the topic. They’re one dystopian lock-in among other, less bad ones. My favorite resource here is Tarsney’s “The Epistemic Challenge to Longtermism.” It makes a lot of very conservative assumptions, but even so it can’t rule out that we can have a sizeable effect over hundreds to thousands of years. An AI war or its repercussions lasting for hundreds or thousands of years is sufficiently bad in my book, but the conditions under which such a war would occur will be so different from ours today that I find it hard to tell whether the effects will be more or less enduring. Subjectively, time probably passes more slowly for AIs (since they think so much faster than we do), so they may also be subject to more drift. But if you send otherwise dumb von Neumann probes out into space at high speeds so no one can catch up with them, I would intuitively guess that they could keep going for a long time before they all fail.