Artificial Intelligence as exit strategy from the age of acute existential risk
This article argues that, given the baseline level of existential risk implied by nuclear weapons, the development of Artificial Intelligence (AI) probably implies a net reduction in existential risk. So-called Artificial General Intelligence (AGI) could replace human political systems and solve the worst alignment problem: the one that human groups have with respect to one another.
The Age of Existential Risk
If we had to describe our historical moment in a few words, not from the perspective of years or decades but from that of our existence as a species, it should be called the age of acute existential risk.
In the last two hundred years, Humanity has experienced an immense expansion of its material capabilities that has intensified its ecological domination and has taken us out of the Malthusian demographic regime in which all other living species are trapped.
On August 6th, 1945, with the first use of a nuclear weapon on a real target, Humanity became aware that its material capabilities now encompassed the possibility of self-extinction. The following decades saw a steady increase in the destructive capacity of nuclear arsenals and several incidents where an escalation of political tension or a technical failure threatened to bring down the sword of Damocles.
An important feature of nuclear war is that it is a funnel for many other sub-existential risks. Financial, ecological, and geopolitical crises, while threatening neither human civilization nor its survival, substantially increase the risk of war, and wars can escalate into a nuclear exchange. Setting aside the possibility of nuclear war, the risks of a more populous, hotter world with growing problems of political legitimacy are partly mitigated by technology and economic interconnection. But the risk of nuclear war amplifies the other purely historical and environmental risks and turns them into existential risks.
If the risk of nuclear war does not decline over time, an accident, whether technical or political, will happen sooner or later. Each of these 77 years since Hiroshima and Nagasaki is a miracle and a tribute to human reason and self-restraint. But without an exit strategy, sustained levels of nuclear war risk doom our technological and post-Malthusian civilization to be an ephemeral phenomenon.
In my opinion, we can classify nuclear war risk exit strategies into two types: i) organic stabilization and ii) a technological deus ex machina.
Organic stabilization refers to a set of social processes, linked to human development, that naturally reduce the risk of nuclear war. In the first place, in industrial societies the activities with the highest added value are linked to human capital. Consequently, the incentives for conquest and war are drastically reduced in a world where wealth is made by work, education, and technology, compared to a world where land is the main source of wealth. Additionally, economic development implies a lower propensity for violence, both at the individual and at the group level. Economic interdependence and inter-elite permeability (which have increased over the last two centuries) are also a necessary condition for definitive pacification.
The other way out of acute existential risk is some technological deus ex machina. In my view, AI can be that game changer.
In the next section I outline my subjective view of how the risk of nuclear conflict has evolved during my lifetime, and in the following one I argue that these advances have proven to be limited and fragile, and that it is necessary to promote all forms of technological progress, and especially AI, to get out of this stage of acute existential risk as soon as possible.
Organic exit: nuclear war risk in my lifetime
I was born in 1977, and therefore I was six years old in the year of the highest risk of nuclear war in history: in 1983, the KGB launched the most extensive operation in Soviet intelligence history to assess the probability that NATO was preparing a preventive nuclear war; in September of that year the Petrov incident took place, and in November, Moscow seriously considered that the NATO winter maneuvers (Able Archer 83) were cover for an all-out war against the USSR. Nuclear tension lasted for several more years amid an intensification of the Cold War that included: i) the Strategic Defense Initiative, ii) the dismantling of Soviet technological intelligence (Vetrov's leak), iii) the subsequent sabotage of the Trans-Siberian gas pipeline, and iv) the shooting down of the Korean airliner KAL 007.
During the 1990s, after the end of the Socialist Bloc, nuclear war disappeared from the collective consciousness, although political instability and ultra-nationalist and neo-communist currents in Russia suggest that in the final decade of the 20th century the risk of nuclear war remained high. With the election of Vladimir Putin, an apparently authoritarian modernizer with an agenda of internal consolidation, the Russian-origin nuclear war risk seemed to fade.
In parallel, since the mid-eighties, China established a system of collective leadership, then successfully carried out a transition to a market economy, and finally the Communist Party appeared to open up to the homegrown moneyed class: in a few decades China seemed to have traveled the road from an absolute communist monarchy to a census-based bourgeois republic. Beyond the Middle East issues (which never implied nuclear risk), between 1991 and the mid-2010s the global trends were: i) universal economic convergence and interdependence, ii) consociational (not necessarily democratic) governance in the great powers, and iii) increasing permeability among the national elites of the different countries. For twenty-five years (those that go from my adolescence to my middle age) I saw the consolidation of a post-Malthusian, technocratic, and post-national world, where wealth depended above all on capital and technology. Of course, I was vaguely aware of what is obvious today: organic stabilization of existential risk is a soft solution to a hard problem, but everything looked so well behaved that Hegelian complacency (the opium of elites) seemed a sensible position.
The general trends thus described were real, and they are more structural than the post-Ukraine shock leads us to believe. However, since the mid-2010s, Xi Jinping has succeeded in replacing collective governance with his absolute power, and in Russia the "authoritarian modernizer" has become a "totalitarian warmonger". Additionally, the leaks in the hull of the nuclear non-proliferation regime have widened with the development of North Korea's nuclear arsenal and, probably within a few years, that of Iran.
The lesson of these decades of success is bleak: human institutions have an error margin clearly above what is tolerable for the risks of the Nuclear Age. Nuclear war can mean billions of deaths and the fall "into the abyss of a new Dark Age made more sinister and perhaps more prolonged by the lights of a perverted science" (Churchill's famous description of the consequences of a Nazi victory fits a post-nuclear world perfectly).
Even if we overcome the Russian-Ukrainian war, these years have shown that the probability of democratic regression is high even in developed countries. Only open societies have been able to generate the kind of international ties that can definitively lower international tensions. Autocracies may temporarily ally, but their elites are nationalized and do not have the systems of reciprocal social influence and commitment on which the "Pax Democratica" is based. Apart from their intrinsic flaws (which technology is going to sharpen), autocracies have no natural pathway toward nuclear war risk reduction, and their resilience means that the trust we can place in organic social progress to overcome the era of acute existential risk is limited.
Of course, economic stabilization, institutional innovation, and democratizing and internationalist activism are not useless: they are the only way in which the vast majority of Humanity can participate in the task of surviving the nuclear age. The organic path is not totally impracticable even as a definitive solution (social science is developing and can offer new forecasting and governance mechanisms safe enough for a nuclear world). Furthermore, each day that we survive by opportunistic means is one more day of life, and one more day to find a definitive solution.
But looking back at these four and a half decades, and given the regression towards autocracy and international chaos in less than ten years, my opinion is that Nuclear War is among the most likely causes of death for a person my age in the Northern Hemisphere.
Deus ex machina: the technological exit
That is why it makes no sense to fear the great technological transformations ahead. Humanity is on the verge of a universal catastrophe, so in reality, accelerationism is the only prudent strategy.
I have serious doubts that Artificial General Intelligence (AGI) is close: cars have not been able to drive autonomously in big cities, and the spectacular results in robotics from Boston Dynamics are not yet being seen in civilian life nor on the battlefield. It's very easy to point to major successes (like ChatGPT), but the failures are prominent as well.
An additional argument for considering AI risk as remote is the Fermi Paradox. Unlike nuclear war, AI risk is not a Fermi Paradox explanation. If an alien civilization were destroyed by the development of AI, the genocidal AI would still be in place to expand (even faster than the original alien species) across the Universe. So, while nuclear war is a very likely alien killer, AI is only an alien replacer. The vast galactic silence we observe suggests a substantially higher nuclear risk than AI risk. Probably, the period between the first nuclear detonation and the development of AGI is simply too risky for the majority of intelligent species, and we have been extremely lucky so far (or we live in a low-probability branch of Everett's multiverse where 77 years of nuclear war risk have not materialized).
In any case, a super-human intelligence is the definitive governance tool: it would be capable of proposing social and political solutions superior to those that the human mind can develop, it would have a wide predictive superiority, and since it is not human it would not have particularistic incentives. All of this would give an AGI immense political legitimacy: a government-oriented AGI would give the countries that follow its lead a decisive advantage. During the Cold War, Asimov already saw AI (see the short story "The Evitable Conflict") as a possible way to achieve a "convergence of systems" that would overcome the ideological confrontation through an ideal technocracy.
Despite fears about the alignment of interests between AI and Humanity, what we know for sure is that the most intractable problem is alignment among humans, and with nuclear weapons that problem is also existential. Technological progress has already given us the tools of our own destruction. The safest way out for Mankind is forward, upward, and ever faster, because from this height the fall is already deadly.
Apart from AGI, there are several technologies for nuclear war risk mitigation. Decentralized manufacturing and mini-nuclear power plants could lead to a world without large concentrations of population and with moderate economic interdependence, that is, without the large bottlenecks that would be the main military targets in the event of a strategic nuclear war. Cheap rockets (like those developed by SpaceX) could allow the development of anti-missile shields, leading to a viable Strategic Defense Initiative. Should the worst happen, artificial food synthesis could allow survival through a nuclear winter. This portfolio of nuclear resilience technologies should be developed in parallel with the paths of organic mitigation of existential risk. AI can accelerate them even if we do not succeed in producing the AGI that can solve the human alignment problem.
The risks that AGI implies for Humanity are serious, but they should not be assessed without considering that it is the most promising path out of the age of acute existential risk. Those who support a ban on this technology should at least propose their own alternative exit strategy.
In my view, we are already on the brink of destruction, so we should recklessly gamble for resurrection.
Hi! Some assorted thoughts on this post:
You say that "my opinion is that Nuclear War is among the most likely causes of death for a person my age in the Northern Hemisphere". I think I agree with this in a literal sense, but most of your post strikes me as more pessimistic than this statistic alone. Based on actuarial tables, the risk of dying at 45 years old (from accidents, disease, etc) is about 0.5% for the average person. So, in order to be the biggest single risk of death, the odds of dying in a nuclear war probably need to be at least 0.2% per year.
A 0.2% chance of nuclear-war-death per year actually lines up pretty well with this detailed post by a forecasting group? They estimated in October that the situation in Ukraine maybe has around a 0.5% chance of escalating to a full-scale nuclear war in which major NATO cities like London are getting hit. Obviously there is big uncertainty and plenty of room for debate on many steps of a forecast like this, but my point is that something like a 0.2% yearly risk of experiencing full-scale nuclear war sounds believable. (Of course, most years will be less dangerous than the current Ukraine war, but a handful of years, like a potential future showdown over Taiwan, will obviously contain most of the risk.)
But, wait a second—this argument cuts both ways!! What if My Most Likely Reason to Die Young is AI X-Risk?! AI systems aren't very powerful right now, so nuclear war definitely has a better chance of killing me right now, in 2023. But over the next, say, thirty years, it's not clear to me if nuclear risk, moseying along at perhaps a 0.5% chance per year and adding up to roughly a 15% chance of war by 2053, is greater than the total risk of AI catastrophe by 2053.
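(For concreteness, a minimal sketch of the cumulative-risk arithmetic above, assuming a constant and independent annual risk; the percentages are just the illustrative inputs from the preceding paragraphs, not estimates in their own right.)

```python
# Minimal sketch: cumulative probability of an event over many years,
# assuming a constant, independent annual probability (a simplification).

def cumulative_risk(annual_risk: float, years: int) -> float:
    """Probability of the event happening at least once in `years` years."""
    return 1.0 - (1.0 - annual_risk) ** years

# ~0.2%/year: the rough threshold for nuclear war to rival the largest
# single "ordinary" cause of death around age 45 (illustrative figure).
print(f"{cumulative_risk(0.002, 30):.1%}")  # ~5.8% over 30 years

# ~0.5%/year: the figure behind the 'roughly 15% by 2053' comparison.
print(f"{cumulative_risk(0.005, 30):.1%}")  # ~14% over 30 years
```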
So far, we’ve been talking about personal probability-of-death. But many EAs are concerned both with the lives of currently living people like ourselves, and with the survival of human civilization as a whole so that humanity’s overall potential is not lost. (Your mention of greatest risk of death for people “in the northern hemisphere” hints at this.) Obviously a full-scale nuclear war would have a devastating impact on civilization. But it nevertheless seems unlikely to literally extinguish all humanity, thus giving civilization a chance to bounce back and try again. (Of course, opinions differ about how severe the effects of nuclear war / nuclear winter would be, and how easy it would be for civilization to bounce back. See Luisa’s excellent series of posts about this for much more detail!) By contrast, scenarios involving superintelligent AI seem more likely to eventually lead to completely extinguishing human life. So, that’s one reason we might not want to trade ambient nuclear risk for superintelligent AI risk, even if they both gave a 15% chance of personal death by 2053.
Totally unrelated side-note, but IMO the Fermi paradox doesn't argue against the idea that alien civilizations are getting taken over by superintelligent AIs that rapidly expand to colonize the universe. That's because if the AI civilizations are expanding at a reasonable fraction of the speed of light, we wouldn't see them coming! So, we'd logically expect to observe a "vast galactic silence" even if the universe is actually chock full of rapidly-expanding civilizations which are about to overtake the earth and destroy us. For more info on this, read about Robin Hanson's "grabby aliens" model—full website here, or entertaining video explanation here.
Alright, that is a lot of bullet points! Forgive me if this post comes across as harsh criticism—that is not at all how I am intending it, rather just as a rapid-fire list of responses and thoughts to this thought-provoking post. Also forgive me for not trying to make the case for the plausibility of AI risk, since I'm guessing you're already familiar with some of the arguments. (If not, there are many great explainers out there, including waitbutwhy, Cold Takes, and some long FAQs by Eliezer Yudkowsky and Scott Alexander.)
Ultimately I agree with you that one of the aspirational goals of AI technology (if we can solve the seemingly impossibly difficult challenge of understanding and being able to control something vastly smarter than ourselves) is to use superintelligent AI to finally end all forms of existential risk and achieve a position of “existential security”, from which humanity can go on to build a thriving and diverse super-civilization. But I personally feel like AI is probably more dangerous than nuclear war (both to my individual odds of dying a natural death in old age, and to humanity’s chances of surviving to achieve its long-term potential), so I would be happy to trade an extra decade of nuclear risk for the precious opportunity for humanity to do more alignment research during an FLI-style pause on new AI capabilities deployments.
As for my proposed “alternative exit strategy”, I agree with you that civilization as it stands today seems woefully inadequate to safely handle either nuclear weapons or advanced AI technology for very long. Personally I am optimistic about trying to create new, experimental institutions (like better forms of voting, or governments run in part by prediction markets) that could level-up civilization’s adequacy/competence and create a wiser civilization better equipped to handle these dangerous technologies. But I recognize that this strategy, too, would be very difficult to accomplish and any benefits might arrive too late to help with situations where AI shows up soon. But at least it is another strategy in the portfolio of efforts that are trying to mitigate existential risk.
Dear Mr. Wagner,
Do you have any canonical reference for AI alignment research? I have read Eliezer Yudkowsky's FAQ and I was surprised by how few technical details are discussed. His arguments are very much "we are building alien squids and they will eat us all". But they are not squids, and we have not trained them to prey on mammals, but to navigate across symbols. The AIs we are training are not as alien as a giant squid, but far more alien: they are not even trained for self-preservation.
Marginal Revolution suggests that there is almost no peer-reviewed literature on AI risk:
https://marginalrevolution.com/marginalrevolution/2023/04/from-the-comments-on-ai-safety.html
“The only peer-reviewed paper making the case for AI risk that I know of is: https://onlinelibrary.wiley.com/doi/10.1002/aaai.12064. Though note that my paper (the second you linked) is currently under review at a top ML conference.”
But perhaps I can read something comprehensive (a PDF, if possible), rather than depending on navigating posts, FAQs, and similar material. Currently my understanding of AI risk is based on technical knowledge about Reinforcement Learning for games and multi-agent systems. I have no knowledge nor intuition about other kinds of systems, and I want to engage with the "state of the art" (in compact format) before I write a post focused on the AI alignment side.
Yes, it is definitely a little confusing how EA and AI safety often organize themselves via online blog posts instead of papers / books / etc like other fields! Here are two papers that seek to give a comprehensive overview of the problem:
This one, by Richard Ngo at OpenAI along with some folks from UC Berkeley and the University of Oxford, is a technical overview of why modern deep-learning techniques might lead to various alignment problems, like deceptive behavior, that could be catastrophic in very powerful systems.
Alternatively, this paper by Joseph Carlsmith at Open Philanthropy is a more philosophical overview that tries to lay out the big-picture argument that powerful, agentic AI is likely to be developed and that safe deployment/control would present a number of difficulties.
There are also lots of papers and reports about individual technical topics in the behavior of existing AI systems: research in goal misgeneralization (Shah et al., 2022); power-seeking (Turner et al., 2021); specification gaming (Krakovna et al., 2020); mechanistic interpretability (Olsson et al. (2022), Meng et al. (2022)); and ML safety divided into robustness, monitoring, alignment, and external safety (Hendrycks et al., 2022). But these are probably more in-the-weeds than you are looking for.
Not technically a paper (yet?), but there have been several surveys of expert machine-learning researchers on questions like “when do you think AGI will be developed?”, “how good/bad do you think this will be for humanity overall?”, etc, which you might find interesting.
Great post! It is so easy to get focused on the bad that we forget to look for the path towards the good, and I want to see more of this kind of thinking.
One little note about AGI:
"cars have not been able to drive autonomously in big cities"...
I think that autonomous car driving is a very bad metric for AGI, because humans are hyper-specialized at the traits that allow it—any organism with hyper-specialized traits shouldn't be expected to be easily matched by a 'general' intelligence without specialized training!
In order to drive a car, you need to:
1. Understand complex visual information as you are moving through a very complex environment, in wildly varying conditions, and respond almost instantly to changes to keep safe
2. Know the right path to move an object through a complex environment to avoid dangers, infer the intentions of other objects based on their movement, and calculate this incredibly fast
3. Coordinate with other actors on the road in a way that allows harmonious, low-risk movement to meet a common objective
It turns out these are all hard problems—ones that Homo sapiens was evolutionarily designed to solve in order to survive as persistence hunters, working in a group, following prey through forests and savannah, and sharing the proceeds when the gazelle collapsed from exhaustion! Our brain's circuits are designed for these tasks and excel at them, and they do it so seamlessly that we don't even realize how hard driving is! (You know how you are completely exhausted after a long drive? It's hard!)
It’s easy to not notice how hard something is when your unconscious is designed to do the hard work effortlessly :)
Best,
Kristopher
First, I will answer the "nuclear war is not existential" issue. Even a full NATO-Russia exchange, in the worst nuclear winter case, would not kill everybody. But what kind of societies would be left after the shock? Military aristocracies, North Korea-like totalitarian regimes, large tracts of Somalian anarchy waiting to be invaded by their imperialist neighbors, etc. Nothing else can keep political coherence after such a shock. Nuclear supremacy will be the only natural goal of any surviving political entity.
The problem with nuclear weapons is that they are an unavoidable step in technological progress. At some point, you have the "Godlike powers" and the "medieval institutions", no matter how many times you iterate. Let's simplify: if you need 1,000 years to recover from a nuclear war, and (given the intractability of the human alignment problem) a major nuclear war happens every 150 years, you are in some new kind of Malthusian trap (more specifically, a nuclear-fueled Hobbesian trap).
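(A minimal sketch of that simplified arithmetic; the 1,000-year and 150-year figures are just the illustrative numbers above, and the assumption that wars arrive as a Poisson process is an added simplification, not part of the original argument.)

```python
import math

# Minimal sketch of the simplified "nuclear-fueled Hobbesian trap" above:
# recovery takes ~1,000 years, while major nuclear wars arrive (by assumption)
# as a Poisson process with a mean interval of ~150 years.

recovery_years = 1000      # assumed time needed to rebuild after a major exchange
mean_war_interval = 150    # assumed average years between major nuclear wars

# Probability of completing a full recovery before the next war interrupts it.
p_uninterrupted_recovery = math.exp(-recovery_years / mean_war_interval)
print(f"{p_uninterrupted_recovery:.2%}")  # ~0.13%: recovery is essentially never completed
```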
In reality, I don't expect a post-nuclear-war world to be one of 1,000 years of recovery followed by another major nuclear war (the typical "A Canticle for Leibowitz" story), but rather a world of totalitarian bellicism, with frequent nuclear exchanges and the whole society oriented toward war. At some point, if AGI is possible, some country will develop it, with the kind of purpose that guarantees it to be Skynet.
As a consequence, if we have no prospect of an alternative solution to the human alignment problem, my view is that we should try to develop AGI as soon as possible, because we are the best version of Mankind that can develop it (we, the Mankind of 2023, and we, the Western democratic world: both "we").
I agree with the idea that nuclear wars, whether small or large, would probably push human civilization in a bad, slower-growth, more zero-sum and hateful, more-warlike direction. And thus, the idea of civilizational recovery is not as bright a silver lining as it seems (although it is still worth something).
I disagree that this means that we should “try to develop AGI as soon as possible”, which connotes to me “tech companies racing to deploy more and more powerful systems without much attention paid to alignment concerns, and spurred on by a sense of economic competition rather than cooperating for the good of humanity, or being subject to any kind of democratic oversight”.
I don’t think we should pause AI development indefinitely—because like you say, eventually something would go wrong, whether a nuclear war or someone skirting the ban to train a dangerous superintelligent AI themselves. But I would be very happy to “pause” for a few years while the USA / western world figures out a regulatory scheme to restrain the sense of an arms race between tech companies, and puts together some sort of “manhattan/apollo project for alignment”. Then we could spend a decade working hard on alignment, while also developing AI capabilities in a more deliberate, responsible, centralized way. At the end of that decade I think we would still be ahead of China and everyone else, and I think we would have put humanity in a much better position than if we tried to rush to get AGI “as soon as possible”.
Nuclear war would be very bad, but not truly an existential risk:
https://www.navalgazing.net/Nuclear-Weapon-Destructiveness
The argument for AGI as a tool to overcome existential risk was developed, years before this post, by Kyrtin Atreides:
http://dx.doi.org/10.13140/RG.2.2.26522.62407
Now, the second issue: “As for my proposed “alternative exit strategy”, I agree with you that civilization as it stands today seems woefully inadequate to safely handle either nuclear weapons or advanced AI technology for very long. Personally I am optimistic about trying to create new, experimental institutions (like better forms of voting, or governments run in part by prediction markets) that could level-up civilization’s adequacy/competence and create a wiser civilization better equipped to handle these dangerous technologies.”
First of all, I consider myself to have some expert knowledge about the "human alignment" problem, and this post is precisely about my reasons for pessimism (both rational and historical-intuitive). Even if macro trends were uniformly positive (which they are not), human governance systems are simply not reliable enough. And in any case, if you are lucky enough to create an ideal republic, let's say in Switzerland (they are almost there!), how would you "export" it to Russia, China, and North Korea? It is not only a problem of designing governance systems. You need to deploy your system against entrenched and ruthless political elites.
I agree that hoping for ideal societies is a bit of a pipe dream. But there is some reason for hope. China and Russia, for instance, were both essentially forced to abandon centrally-planned economies and adopt some form of capitalism in order to stay competitive with a faster-growing western world. Unfortunately, the advantages of democracy vs authoritarianism (although there are many) don’t seem quite as overwhelming as the advantages of capitalism vs central planning. (Also, if you are the authoritarian in charge, maybe you don’t mind switching economic systems, but you probably really want to avoid switching political systems!)
But maybe if the West developed even better governing institutions (like "futarchy", a form of governance based partly on democratic voting and partly on prediction markets), or otherwise did a great job of solving our own problems (like generating lots of cheap, clean energy, or adopting Georgist and Yimby policies to lower the cost of housing and thereby boost the growth of western economies), once again we might pressure our geopolitical competitors to reform their own institutions in order to keep up.
Alternatively, if Russia/China/etc didn’t reform, I would expect them to eventually fall further and further behind (like North Korea vs South Korea); they’d still have nukes of course, but after many decades of falling behind, eventually I’d expect the US could field some technology—maybe aligned AI like you say, or maybe just really effective missile-defence—that would help end the era of nuclear risk (or at least nuclear risk from backwards, fallen-behind countries).
I agree with all your political positions! Let’s run for office together!
Now, more seriously, thank you very much for the links.