As I understand it, there are two parts to the case for a focus on AI safety research:
1. If we do achieve AGI and the AI safety/alignment problem isn't solved by then, it poses grave, even existential, risks to humanity. Given these grave risks, and some nontrivial probability of AGI in the medium term, it makes sense to focus on AI safety.
2. If we are able to achieve a safe and aligned AGI, then many other problems will go away, or at least get much better or simpler to solve. So focusing on other cause areas may not matter that much anyway if a safe/aligned AGI is likely in the near term.
I’ve seen a lot of fleshing out of 1; in recent times, it seems to be the dominant reason for the focus on AI safety in effective altruist circles, though 2 (perhaps without the focus on “safe”) is a likely motivation for many of those working on AI development.
The sentiment of 2 is echoed in many texts on superintelligence. For instance, from the preface of Nick Bostrom’s Superintelligence:
> In this book, I try to present the challenges presented by the prospect of superintelligence, and how we might best respond. This is quite possibly the most important and most daunting challenge humanity has ever faced. And—whether we succeed or fail—it is probably the last challenge we will ever face.
Similar sentiments are found in Bostrom’s Letter from Utopia.
Historical aside: MIRI’s motivation for working on AI started off closer to 2 and gradually shifted toward 1, an evolution you can see in the timeline of MIRI that I financed and partly wrote.
Another note: whereas 1 is a strong argument for AI safety even at low but nontrivial probabilities of AGI, 2 becomes a strong argument only at moderately high probabilities of AGI over a short time horizon. So if one assigns a low probability to AGI in the near term, only 1 may be a compelling argument, even if both 1 and 2 are true.
So, question: what are some interesting analyses involving 2 and their implications for the relative prioritization of AI safety and other causes that safe, aligned AI might solve? The template question I’m interested in, for any given cause area C:
Would safe, aligned AGI help radically with the goals of cause C? Does this consideration meaningfully impact current prioritization of (and within) cause C? And does it cause anybody interested in cause C to focus more on AI safety?
Examples of cause area C for which I’m particularly interested in answers to the question include:
- Animal welfare
- Life extension
- Global health
Thanks to Issa Rice for the Superintelligence quote and many of the other links!
Brian Tomasik believes that work on AI alignment may itself be dangerous: a “near miss” in AI alignment could cause vastly more suffering than a paperclip maximizer. In his article on his donation recommendations, he estimates that organizations like MIRI may have a ~38% chance of doing active harm.