Hi! Some assorted thoughts on this post:
You say that “my opinion is that Nuclear War is among the most likely causes of death for a person my age in the Northern Hemisphere”. I think I agree with this in a literal sense, but most of your post strikes me as more pessimistic than this statistic alone. Based on actuarial tables, the risk of dying at 45 years old (from accidents, disease, etc.) is about 0.5% per year for the average person. Since that 0.5% is spread across many causes, for nuclear war to be the biggest single risk of death, the odds of dying in a nuclear war probably need to be at least 0.2% per year.
A 0.2% chance of nuclear-war death per year actually lines up pretty well with this detailed post by a forecasting group. They estimated in October that the situation in Ukraine has perhaps a 0.5% chance of escalating to a full-scale nuclear war in which major NATO cities like London are hit. Obviously there is big uncertainty and plenty of room for debate at many steps of a forecast like this, but my point is that something like a 0.2% yearly risk of experiencing full-scale nuclear war sounds believable. (Of course, most years will be less dangerous than the current Ukraine war, but a handful of years, like a potential future showdown over Taiwan, will contain most of the risk.)
But wait a second: this argument cuts both ways! What if My Most Likely Reason to Die Young is AI X-Risk?! AI systems aren’t very powerful right now, so nuclear war clearly has a better chance of killing me today, in 2023. But over the next, say, thirty years, it’s not clear to me whether nuclear risk, moseying along at perhaps a 0.5% chance per year and adding up to roughly a 15% chance of war by 2053, is greater than the total risk of AI catastrophe over the same period. (There is a quick sanity check of this arithmetic at the end of this list.)
So far, we’ve been talking about personal probability of death. But many EAs are concerned both with the lives of currently living people like ourselves, and with the survival of human civilization as a whole, so that humanity’s overall potential is not lost. (Your mention of the greatest risk of death for people “in the northern hemisphere” hints at this.) Obviously a full-scale nuclear war would have a devastating impact on civilization. But it nevertheless seems unlikely to literally extinguish all of humanity, which would give civilization a chance to bounce back and try again. (Of course, opinions differ about how severe the effects of nuclear war / nuclear winter would be, and how easily civilization could bounce back. See Luisa’s excellent series of posts about this for much more detail!) By contrast, scenarios involving superintelligent AI seem more likely to end in the complete extinction of human life. So that’s one reason we might not want to trade ambient nuclear risk for superintelligent AI risk, even if both gave a 15% chance of personal death by 2053.
Totally unrelated side note, but IMO the Fermi paradox doesn’t argue against the idea that alien civilizations are getting taken over by superintelligent AIs that rapidly expand to colonize the universe. That’s because if those AI civilizations are expanding at a reasonable fraction of the speed of light, we wouldn’t see them coming! So we’d logically expect to observe a “vast galactic silence” even if the universe is actually chock full of rapidly expanding civilizations that are about to overtake Earth and destroy us. For more on this, read about Robin Hanson’s “grabby aliens” model: full website here, or an entertaining video explanation here.
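(For what it’s worth, here is a minimal back-of-the-envelope sketch of the risk arithmetic from the bullets above. It is purely illustrative: the 0.5%-per-year figure is just the rough estimate quoted above, and it treats each year as independent with a constant risk.)

```python
# Quick sanity check of the numbers above (illustrative assumptions only)

annual_nuclear_war_risk = 0.005   # assumed ~0.5% chance per year of full-scale nuclear war
years = 30                        # roughly 2023 through 2053

# Chance of at least one full-scale nuclear war over the period,
# treating years as independent with a constant annual risk.
cumulative_risk = 1 - (1 - annual_nuclear_war_risk) ** years
print(f"Cumulative risk over {years} years: {cumulative_risk:.1%}")
# prints "Cumulative risk over 30 years: 14.0%" -- same ballpark as the ~15% above
```

The real risk is of course lumpy rather than constant, concentrated in crisis years like the ones mentioned above, so this is only a rough consistency check.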
Alright, that is a lot of bullet points! Forgive me if this comes across as harsh criticism; it is not intended that way at all, just as a rapid-fire list of responses and thoughts on a thought-provoking post. Also forgive me for not trying to make the case for the plausibility of AI risk, since I’m guessing you’re already familiar with some of the arguments. (If not, there are many great explainers out there, including waitbutwhy, Cold Takes, and some long FAQs by Eliezer Yudkowsky and Scott Alexander.)
Ultimately I agree with you that one of the aspirational goals of AI technology (if we can solve the seemingly impossible challenge of understanding and controlling something vastly smarter than ourselves) is to use superintelligent AI to finally end all forms of existential risk and achieve a position of “existential security”, from which humanity can go on to build a thriving and diverse super-civilization. But I personally feel that AI is probably more dangerous than nuclear war (both to my individual odds of dying a natural death in old age, and to humanity’s chances of surviving to achieve its long-term potential), so I would happily trade an extra decade of nuclear risk for the precious opportunity for humanity to do more alignment research during an FLI-style pause on new AI capability deployments.
As for my proposed “alternative exit strategy”, I agree with you that civilization as it stands today seems woefully inadequate to safely handle either nuclear weapons or advanced AI technology for very long. Personally, I am optimistic about trying to create new, experimental institutions (like better forms of voting, or governments run in part by prediction markets) that could level up civilization’s adequacy and competence, creating a wiser civilization better equipped to handle these dangerous technologies. I recognize that this strategy, too, would be very difficult to pull off, and that any benefits might arrive too late to help if advanced AI shows up soon. But at least it is another strategy in the portfolio of efforts to mitigate existential risk.
Dear Mr. Wagner,
Do you have any canonical reference for AI alignment research? I have read Eliezer Yudkowsky’s FAQ and was surprised by how few technical details it discusses. His arguments amount to “we are building alien squids and they will eat us all”. But they are not squids, and we have not trained them to prey on mammals; we have trained them to navigate across symbols. The AIs we are training are not as alien as a giant squid, but far more alien: they are not even trained for self-preservation.
MR suggests that there is no peer-reviewed literature on AI risk:
https://marginalrevolution.com/marginalrevolution/2023/04/from-the-comments-on-ai-safety.html
“The only peer-reviewed paper making the case for AI risk that I know of is: https://onlinelibrary.wiley.com/doi/10.1002/aaai.12064. Though note that my paper (the second you linked) is currently under review at a top ML conference.”
But perhaps I can read something comprehensive (a PDF, if possible), rather than navigating posts, FAQs and similar material. Currently my understanding of AI risk is based on technical knowledge of Reinforcement Learning for games and multi-agent systems. I have no knowledge or intuition about other kinds of systems, and I want to engage with the “state of the art” (in a compact format) before I write a post focused on the AI alignment side.
Yes, it is definitely a little confusing how EA and AI safety often organize themselves via online blog posts instead of papers, books, etc., like other fields! Here are two papers that seek to give a comprehensive overview of the problem:
This one, by Richard Ngo at OpenAI along with some folks from UC Berkeley and the University of Oxford, is a technical overview of why modern deep-learning techniques might lead to various alignment problems, like deceptive behavior, that could be catastrophic in very powerful systems.
Alternatively, this paper by Joseph Carlsmith at Open Philanthropy is a more philosophical overview that tries to lay out the big-picture argument that powerful, agentic AI is likely to be developed and that safe deployment/control would present a number of difficulties.
There are also lots of papers and reports about individual technical topics in the behavior of existing AI systems: goal misgeneralization (Shah et al., 2022); power-seeking (Turner et al., 2021); specification gaming (Krakovna et al., 2020); mechanistic interpretability (Olsson et al., 2022; Meng et al., 2022); and ML safety divided into robustness, monitoring, alignment, and external safety (Hendrycks et al., 2022). But these are probably more in-the-weeds than you are looking for.
Not technically a paper (yet?), but there have been several surveys of expert machine-learning researchers on questions like “when do you think AGI will be developed?”, “how good/bad do you think this will be for humanity overall?”, etc, which you might find interesting.