The recent OpenAI announcement of research progress has put artificial intelligence safety at the forefront of many of our minds. There is now the prospect of this new technology “remaking society” in my lifetime (something I never thought I would live to see), and that organization has explicitly stated that its goal is to create an artificial general intelligence (AGI) and raise it to a superintelligent level (ASI), with the overall aim of, to paraphrase former and current CEO Sam Altman, making human lives easier and happier.
As I am sure all of you know, AI safety and existential risk are two of the main issues that Effective Altruism (EA) seeks to address. I am brand new to the EA community and ideology, and what convinced me to join was seeing the way EAs apply reason, logic, and science to moral questions. This approach is so refreshing in a world where morality is so often said to be rooted in the unfalsifiable, and where moral debates are often highly emotional and self-referential. To put it another way, EA has made progress in a field (morality, value, even hope) long considered by many to be beyond the realm of reason. The EA perspective has been like opening a window to greater understanding, and it comes with an amazing community as well.
So my question for those of you who consider yourselves in favor of the existence of AI on Earth in any sense (whether you are for governance, regulation, safety, alignment, or “acceleration”) is this: how do you mitigate the existential risk to a negligible level?
Say that research, alignment efforts, and AI regulation lead to the development of an artificial general intelligence (AGI) released to the public in 10 years’ time. Say it had a 99 percent chance of being aligned with the good of humanity and other animal species on Earth (with a one percent chance of unpredictable, potentially hazardous outcomes). Would this level of existential risk be acceptable to you? Keep in mind that we would never accept a one percent chance that an airplane would crash, because of the obvious tragic consequences of aviation accidents. We would never even accept a 0.01% chance of a crash. We test, re-test, and regulate air travel so intensely because of the very large consequences that would come with a mistake. According to the 2022 Expert Survey on Progress in AI (ESPAI), the median respondent put the chance of uncontrolled advanced AI causing human extinction (or similarly severe disempowerment) at around ten percent.
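To make the aviation comparison concrete, here is a rough expected-loss sketch in Python. It uses the one percent and 0.01% figures from the paragraph above; the world population and passenger count are my own round-number assumptions, and it counts only human deaths, so treat it as an illustration of the arithmetic and not as a risk estimate.

```python
# Hypothetical illustration only: both constants below are assumed round figures.
WORLD_POPULATION = 8_000_000_000   # assumption, not taken from the post
PASSENGERS_PER_FLIGHT = 200        # assumption, a typical airliner load


def expected_deaths(probability: float, people_at_risk: int) -> float:
    """Expected deaths = probability of the catastrophe times people exposed."""
    return probability * people_at_risk


# The post's hypothetical one percent chance of a misaligned AGI.
agi_loss = expected_deaths(0.01, WORLD_POPULATION)
# The post's 0.01% crash chance, applied to a single flight.
flight_loss = expected_deaths(0.0001, PASSENGERS_PER_FLIGHT)

print(f"AGI scenario:  {agi_loss:,.0f} expected deaths")
print(f"Single flight: {flight_loss:.2f} expected deaths")
```

Under these assumptions the one percent scenario comes to roughly 80 million expected deaths, against 0.02 for the flight we would already refuse to board.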
An unsafe AGI can kill far, far more than even the worst air accident. It can kill more conscious beings than train crashes, shipwrecks, terror attacks, pandemics, and even nuclear wars combined. It can kill every sentient being on Earth and render the planet permanently uninhabitable by any biological lifeform. AI (and more specifically AGI/ASI) could also find a way to leave planet Earth, eventually consuming other sentient beings in different star systems, even in the absence of superluminal travel. And surveyed experts estimate a significant chance that a catastrophe of this kind will happen before the end of the 21st century.
So my question is: can AGI/ASI safely exist at all? And if so, what level of existential risk are you willing to accept?
As I see it, the only acceptable level of existential risk is zero. Therefore all AI research and development should be permanently suspended and all existing AIs shut down.
A lot of people say this, but I have never seen any compelling evidence to back this claim up. To be clear, I’m referring to the claim that an AI could achieve this in a short amount of time without being noticed and stopped.
As far as I know, not a single big-name AI researcher, not even among those concerned with AI safety, believes in FOOM (a nigh-unbounded intelligence explosion). I have looked extensively at molecular nanotech research, and I do not believe it can be invented in a short amount of time by a non-godlike AI.
Without molecular nanotech, I do not see a reliable path for an AGI to defeat humanity. Every other method appears to me to be heavily luck-based.
As someone who cares deeply about the safety and flourishing of living beings, I personally think the default position toward existential risk should be to assume something is a risk until it can be demonstrated that it isn’t.
We don’t have any direct experience of an AGI/ASI, but theoretically, it could increase its own intelligence and effective power exponentially over a very short (by human standards) timescale. Furthermore, since AGI is by definition more intelligent than the average human in most domains (with ASI exceeding any human’s capabilities), I doubt we as humans can make any strong statements about what such a machine can or can’t do. In light of all this, and of my view that x-risks associated with technology should be assumed to exist until proven otherwise, it seems rational to call for a global pause on AI development and a ban on new models, at least until more research can be done to determine the inherent safety levels associated with self-improving, agentic machine intelligence systems.
I agree that molecular nanotechnology could make AI much riskier, but an unfriendly AGI wouldn’t need nanobots to eliminate humanity. Nuclear warheads, power plant meltdowns, engineered pandemics, and so on are all ways it could accomplish such a goal.
Again, I respect that you don’t see a path for an AGI to defeat humanity. I just want to remind you of the stakes here. If you (and others in the pro-AI camp) are wrong, our entire species dies (possibly along with all complex life on Earth). This isn’t really something we can afford to gamble on.
Yes, actually, we can. It can’t move faster than the speed of light. It can’t create an exact simulation of my brain with no brain scan. It can’t invent working nanotechnology without a lab and a metric shit-ton of experimentation.
Intelligence is not fucking magic. Being very smart does not give you a bypass to the laws of physics, or logistics, or computational complexity.
Nuclear warheads require humans to push the button. Engineered pandemics have a tradeoff, where highly deadly diseases will burn themselves out before killing everyone, and highly spreadable diseases are not as deadly. Merely killing 95% of humanity would not be enough to defeat us. The AI needs electricity: we don’t.
You will not be able to shut down AI development with such incredibly weak arguments and no supporting evidence.
I am all for safety and research. But if you want to advocate for drastic action, you need to actually make a case for it. And that means not handwaving away the obvious questions, like “how on earth could an AI kill everyone, when everyone has a pretty high interest in not being killed, and is willing to take drastic action to avoid it?”
It seems unlikely that we’ll ever get AI x-risk down to negligible levels, but it’s currently striking how high a risk is being tolerated by those building (and regulating) the technology, when compared to, as you say, aviation, and also nuclear power (<1 catastrophic accident in 100,000 years being what’s usually aimed for). I think at the very least we need to reach a global consensus on what level of risk we are willing to tolerate before continuing with building AGI.
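As a rough illustration of what a tolerance like “<1 catastrophic accident in 100,000 years” implies, here is a minimal Python sketch that converts a per-year probability into a cumulative one. Reading the nuclear target as a flat, independent annual probability is my own simplifying assumption, and `cumulative_risk` is just an illustrative helper name.

```python
def cumulative_risk(annual_probability: float, years: int) -> float:
    """Chance of at least one catastrophe over `years`,
    assuming every year is independent and equally risky."""
    return 1.0 - (1.0 - annual_probability) ** years


# Assumption: read the "<1 in 100,000 years" target as a per-year probability.
nuclear_target = 1 / 100_000

century = cumulative_risk(nuclear_target, 100)
print(f"Risk over a century at that target: {century:.4%}")
```

On that reading, the nuclear-style target allows about a 0.1% chance of catastrophe per century, which gives a sense of how far apart that standard is from risk estimates in the percent range.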
There are a few things to consider.
1. One of the best ways to prevent the creation of a misaligned, “unfriendly” AGI (or to limit its power if it is created) is to build an aligned, “friendly” AGI first.
2. Similarly, biological superintelligence could prevent or provide protection from a misaligned AGI.
3. The alignment problem might turn out to be much easier than the biggest pessimists currently believe. It isn’t self-evident that alignment is super hard. A lot of the arguments that alignment is super hard are highly theoretical and not based on empirical evidence. GPT-4, for example, seems to be aligned and “friendly”.
4. “Friendly” AGI could mitigate all sorts of other global catastrophic risks like asteroids and pandemics. It could also do things like help end factory farming (which is quite arguably a global catastrophe) by accelerating the kind of research New Harvest funds. On top of that, it could help end global poverty (another global catastrophe) by accelerating global economic growth.
5. Pausing or stopping AI development globally might just be impossible or nearly impossible. It certainly seems extremely hard.
6. Even if it could be achieved and enforced, a global ban on AI development would create a situation where the least conscientious and most dangerous actors (those violating international law) would be the most likely to create AGI. This would perversely increase existential risk.
You have certainly given me some wonderful food for thought!
To me, (5) and (6) seem like the most relevant points here. If AI development can’t be realistically stopped (or if the chance of stopping it is so low that it isn’t worth the effort), then you’re right that, paradoxically, bans on AI development can increase x-risk by “driving it underground.”
(2) is also intriguing to me. A biological, engineered superintelligence (especially with an organic substrate) is a very interesting concept, but seems so far away technologically it may as well be sci-fi. It also raises a lot of ethical questions for me, since its development process will probably involve great harm to animal subjects, who have interests in not suffering.
Further away and more sci-fi than AGI?