I think (as do others) that advanced AI could have really big undesired impacts like causing the extinction of people. I also think, with higher confidence, that advanced AI is likely to have some large impacts on the way that people live, without saying exactly what these impacts are likely to be. AI X-risk seems to be regarded as one of the most important potential impacts for AI safety researchers to focus on, particularly by people who think that promoting a long and prosperous future for humans and other living things is a top priority. Considering the amount of work on AI X-risk overall (not just within EA), should a lot more attention be given to AI X-risk? What other AI impacts should receive a lot more attention alongside X-risk?
I am interested in impacts that are explained in a manner that is nearly concrete enough to be the subject of a prediction tournament or prediction market, though some flaws are acceptable. For example, the impact “AI causes the extinction of people in the next 1000 years” has at least two flaws from the point of view of a prediction tournament: first, establishing that AI is responsible for an extinction event might not be straightforward, and second, if people are extinct then there will be no one to resolve the question. However, it’s concrete enough for my purposes.
Please propose impacts as an answer to this question, and only propose one potential impact per answer. You can also include reasons why you think the identified impact is a priority. If you want to discuss multiple impacts, or say something other than proposing an impact to consider, please post it as a comment instead. And, to reiterate, I’m interested in impacts you think should receive more attention overall, not just more attention within the EA community.
Advanced AI may allow totalitarian regimes to solidify their power. For example, large language models could be used to monitor all online communication or to distribute personalised propaganda.
Suffering risks. S-risks are arguably a far more serious issue than extinction risk, as the scope of the suffering could be infinite. The fact that there is a risk of a misaligned superintelligence creating a hellish dystopia on a cosmic scale, with more intense suffering than has ever existed in history, means that even if the probability of this happening is small, it is balanced by the extreme disutility of the outcome. S-risks are also highly neglected relative to their potential extreme disutility. It could even be argued that it would be rational to completely dedicate your life to reducing S-risks because of this. The only organizations I’m aware of that are directly working on reducing S-risks are the Center on Long-Term Risk and the Center for Reducing Suffering. One possible way AI could lead to astronomical suffering is if there is a “near miss” in AI alignment, where the AI alignment problem is partially solved, but not entirely. Other potential sources of S-risks may include malevolence, or an AI that includes religious hells when aligned to reflect the values of humanity.
Other S-risks that may or may not sound more plausible are suffering simulations (maybe an AI comes to the conclusion that a good way to study humans is to simulate earth at the time of the Black Death) or suffering subroutines (maybe reinforcement learners that are able to suffer enable faster or more efficient algorithms).
FWIW, infinities could go either way if you recognize moral goods that can aggregate by summing. I think infinities seem more likely for suffering than for goods if your views are ethically asymmetric and assign more weight to suffering, especially if some kinds of suffering are infinitely bad but no goods are infinitely good (or there are no goods at all), or if goods can only offset but not outweigh bads.
To preface my criticism I’ll say I think concrete ways that AI may cause great suffering do deserve attention.
But:
The scope is surely not infinite. The heat death of the universe and the finite number of atoms in it pose a limit.
Unless you think unaligned AIs will somehow be inclined to not only ignore what people want, but actually keep them alive and torture them—which sounds implausible to me—how’s this not Pascal’s mugging?
We can’t say for certain that travel to other universes is impossible, so we can’t rule it out as a theoretical possibility. As for the heat death of the universe, Alexey Turchin created this chart of theoretical ways that the heat death of the universe could be survivable by our descendants.
The entities that are being subjected to the torture wouldn’t necessarily be “people” per se. I am talking about conscious entities in general. Solving the alignment problem from the perspective of hedonistic utilitarianism would involve the superintelligence having consciousness-centric values and the ability to create and preserve conscious states with high levels of valence. If a superintelligence with consciousness-centric values that can create large amounts of bliss is realistically possible, the possibility of a consciousness-centric superintelligence that creates large amounts of suffering isn’t necessarily that much less realistic. If you believe that a superintelligence causing torture is implausible, you also have to accept that a superintelligence creating a utopia is also implausible.
It should be mentioned that all (or at least most) ideas to survive the heat death of the universe involve speculative physics. Moreover, you have to deal with infinities. If everyone is suffering but there is one sentient being that experiences a happy moment every million years, does this mean that there is an infinite amount of suffering and an infinite amount of happiness and the future is of neutral value? If any future with an infinite amount of suffering is bad, does this mean that it is good if sentient life does not exist forever? There is no obvious answer to these questions.
How’s this argument different from saying, for example, that we can’t rule out God’s existence so we should take him into consideration? Or that we can’t rule out the possibility of the universe being suddenly magically replaced with a utilitarian optimal one?
The linked post is basically a definition of what “survival” means, without any argument on how any of it is at all plausible.
I believe neither is plausible by mistake.
If you want to reduce the risk of going to some form of hell as much as possible, you ought to determine what sorts of “hells” have the highest probability of existing, and to what extent avoiding said hells is tractable. As far as I can tell, the “hells” that seem to be the most realistic are hells resulting from bad AI alignment, and hells resulting from living in a simulation. Hells resulting from bad AI alignment can be plausibly avoided by contributing in some way to solving the AI alignment problem. It’s not clear how hells resulting from living in a simulation could be avoided, but it’s possible that ways to avoid these sorts of hells could be discovered with further analysis of different theoretical types of simulations we may be living in, such as in this map. Robin Hanson explored some of the potential utilitarian implications of the simulation hypothesis in his article How To Live In A Simulation. Furthermore, mind enhancement could potentially reduce S-risks. If you manage to improve your general thinking abilities, you could potentially discover a new way to reduce S-risks.
A Christian or a Muslim could argue that you ought to convert to their religion in order to avoid going to hell. But a problem with Pascal’s Wager-type arguments is the issue of tradeoffs. It’s not clear that practicing a religion is the optimal way to avoid hell/S-risks. The time spent going to church, praying, and otherwise being dedicated to your religion is time not spent thinking about AI safety and strategizing ways to avoid S-risks. Working on AI safety, strategizing ways to avoid S-risks, and trying to improve your thinking abilities would probably be more effective at reducing your risk of going to some sort of hell than, say, converting to Christianity would.
It mentions finding ways to travel to other universes, sending information to other universes, creating a superintelligence to figure out ways to avoid heat death, convincing the creators of the simulation to not turn it off, etc. While these hypothetical ways to survive heat death do involve a lot of speculative physics, they are more than just “defining survival”.
Yet we live in a reality where happiness and suffering exist seemingly by mistake. Your nervous system is the result of millions of years of evolution, not the result of an intelligent designer.
Impact: AI causes the extinction of people in the next 1000 years.
Why is this a priority? Extinction events are very bad from the point of view of people who want the future to be big and utopian. The 1000-year time frame (I think) is long enough to accommodate most timelines for very advanced AI, but short enough that we don’t have to worry about “a butterfly flaps its wings and 10 million years later everyone is dead” type scenarios. While it is speculative, it does not seem reasonable, given what we know right now, to assign this event a vanishingly low probability. Finally, my impression is that while it is taken seriously in and near the EA community, outside the community it is largely not taken as seriously as reasonable estimates of its subjective likelihood and severity would warrant.