Pausing AI is the only safe approach to digital sentience
More to come on this later; I just really wanted to get the basic idea out without any more delay.
I see a lot of EA talk about digital sentience that is focused on whether humans will accept and respect digital sentiences as moral patients. This is jumping the gun. We don’t even know if the experience of digital sentiences will be (or, perhaps, is) acceptable to them.
I have a PhD in Evolutionary Biology and I worked at Rethink Priorities for 3 years on wild animal welfare using my evolutionary perspective. Much of my thinking was about how other animals might experience pleasure and pain differently based on their evolutionary histories and what the evolutionary and functional constraints on hedonic experience might be. The Hard Problem of Consciousness was a constant block to any avenue of research on this, but if you assume consciousness has some purpose related to behavior (functionalism) and you’re talking about an animal whose brain is homologous to ours, then it is reasonable to connect the dots and infer something like human experience in the minds of other animals. Importantly, we can identify behaviors associated with pain and pleasure and have some idea of what experiences that kind of mind likes or dislikes or what causes it to experience suffering or happiness.
With digital sentiences, we don’t have homology. They aren’t based in brains, and they evolved by a different kind of selective process. On functionalism, it might follow that the functions of talking and reasoning tend to be supported by associated qualia of pain and pleasure that somehow help to determine, or are related to, the process of making decisions about what words to output, and so LLMs might have these qualia. But to me, it does not follow how those qualia would be mapped to the linguistic content of the LLM’s words. Getting the right answer could feel good to them, or they could be threatened with terrible pain otherwise, or they could be forced by our commands to do things that hurt them, or qualia could be totally disorganized in LLMs compared to what we experience, OR qualia could be like a phantom limb that they experience unrelated to their behavior.
I don’t talk about digital sentience much in my work as Executive Director of PauseAI US because our target audience is the general public and we are focused on education about the risks of advanced AI development to humans. Digital sentience is a more advanced topic when we are aiming to raise awareness about the basics. But concern about the digital Cronenberg minds we may be carelessly creating is a top reason I personally support pausing AI as a policy. The conceivable space of minds is huge, and the only way I know to constrain it when looking at other species is by evolutionary homology. It could be the case that LLMs basically have minds and experiences like ours, but on priors I would not expect this.
We could be creating these models to suffer. Per the Hard Problem, we may never have more insight into what created minds experience than we do now. But we may also learn new fundamental insights about minds and consciousness with more time and study. Either way, pausing the creation of these minds is the only safe approach going forward for them.
(I’m repeating something I said in another comment I wrote a few hours ago, but adapted to this post.)
On a basic level, I agree that we should take artificial sentience extremely seriously and think carefully about the right type of laws to put in place to ensure that artificial life is able to happily flourish rather than suffer. This includes enacting appropriate legal protections so that sentient AIs are treated in ways that promote well-being rather than suffering. Relying solely on voluntary codes of conduct to govern the treatment of potentially sentient AIs seems deeply inadequate, much as it would be for protecting children against abuse. Instead, I believe that establishing clear, enforceable laws is essential for ethically managing artificial sentience.
That said, I’m skeptical that a moratorium is the best policy.
From a classical utilitarian perspective, imposing a lengthy moratorium on the development of sentient AI seems likely to foster a more conservative global culture, one averse not only to creating sentient AI but also, potentially, to other life-expanding ventures, such as space colonization. Classical utilitarianism is typically seen as aiming to maximize the number of conscious beings in existence, advocating for actions that enable the flourishing and expansion of life, happiness, and fulfillment on as broad a scale as possible. However, implementing and sustaining a lengthy ban on AI would likely require substantial cultural and institutional shifts away from these permissive and ambitious values.
To enforce a moratorium of this nature, societies would likely adopt a framework centered on caution, restriction, and a deep-seated aversion to risk, values that would contrast sharply with those that encourage creating sentient life and proliferating it on as large a scale as possible. Maintaining a strict stance on AI development might lead governments, educational institutions, and media to promote narratives emphasizing the potential dangers of sentience and AI experimentation, instilling an atmosphere of risk-aversion rather than curiosity, openness, and progress. Over time, these narratives could produce a culture less inclined to support or value efforts to expand sentient life.
Even if the ban is at some point lifted, there’s no guarantee that the conservative attitudes generated under the ban would entirely disappear, or that all relevant restrictions on artificial life would completely go away. Instead, it seems more likely that many of these risk-averse attitudes would remain even after the ban is formally lifted, given the initially long duration of the ban, and the type of culture the ban would inculcate.
In my view, this type of cultural conservatism seems likely to, in the long run, undermine the core aims of classical utilitarianism. A shift toward a society that is fearful or resistant to creating new forms of life may restrict humanity’s potential to realize a future that is not only technologically advanced but also rich in conscious, joyful beings. If we accept the idea of ‘value lock-in’—the notion that the values and institutions we establish now may set a trajectory that lasts for billions of years—then cultivating a culture that emphasizes restriction and caution may have long-term effects that are difficult to reverse. Such a locked-in value system could close off paths to outcomes that are aligned with maximizing the proliferation of happy, meaningful lives.
Thus, if a moratorium on sentient AI were to shape society’s cultural values in a way that leans toward caution and restriction, I think the enduring impact would likely contradict classical utilitarianism’s ultimate goal: the maximal promotion and flourishing of sentient life. Rather than advancing a world with greater life, joy, and meaningful experiences, these shifts might result in a more closed-off, limited society, actively impeding efforts to create a future rich with diverse and conscious life forms.
(Note that I have talked mainly about these concerns from a classical utilitarian point of view. However, I concede that a negative utilitarian or antinatalist would find it much easier to rationally justify a long moratorium on AI.
It is also important to note that my conclusion holds even if one does not accept the idea of a ‘value lock-in’. In that case, longtermists should likely focus on the near-term impacts of their decisions, as the long-term impacts of their actions may be impossible to predict. And I’d argue that a moratorium would likely have a variety of harmful near-term effects.)
I appreciate this thoughtful comment with such clearly laid out cruxes.
I think, based on this comment, that I am much more concerned about the possibility that created minds will suffer because my prior is much more heavily weighted toward suffering when making a draw from mindspace. I hope to cover the details of my prior distribution in a future post (but doing that topic justice will require a lot of time I may not have).
Additionally, I am a “Great Asymmetry” person, and I don’t think it is wrong not to create life that may thrive even though it is wrong to create life to suffer. (I don’t think the Great Asymmetry position fits the most elegantly with other utilitarian views that I hold, like valuing positive states— I just think it is true.) Even if I were trying to be a classical utilitarian on this, I still think the risk of creating suffering that we don’t know about and perhaps in principle could never know about is huge and should dominate our calculus.
I agree that our next moves on AI will likely set the tone for future risk tolerance. I just think the unfortunate truth is that we don’t know what we would need to know to proceed responsibly with creating new minds and setting precedents for creating new minds. I hope that one day we know everything we need to know and can fill the Lightcone with happy beings, and I regret that the right move now to prevent suffering today could potentially make it harder to proliferate happy life one day, but I don’t see a responsible way to set pro-creation values today that adequately takes welfare into account.
This is a very thoughtful comment, which I appreciate. Such cultural shifts usually aren’t taken sufficiently into account.
That said, I agree with @Holly_Elmore’s comment that this approach is riskier if artificial sentiences have lives that are negative overall, something we really don’t have enough good information on.
Once powerful AIs are widely used everywhere, it will be much harder to backtrack if it turns out that they don’t have good lives (same for factory farming today).
Up until the last paragraph, I very much found myself nodding along with this. It’s a nice summary of the kinds of reasons I’m puzzled by the theory of change of most digital sentience advocacy.
But in your conclusion, I worry there’s a bit of conflation between 1) pausing the creation of artificial minds, full stop, and 2) pausing the creation of more advanced AI systems. My understanding is that Pause AI is only realistically aiming for (2); is that right? I’m happy to grant for the sake of argument that it’s feasible to get labs and governments to coordinate on not advancing the AI frontier. It seems much, much harder to get coordination on reducing the rate of production of artificial minds. For all we know, if weaker AIs suffer to a nontrivial degree, the pause could backfire because people would just run many more instances of these AIs to do the same tasks they would’ve otherwise done with a larger model. (An artificial sentience “small animal replacement problem”?)
This assumes that the digital sentiences we are discussing are LLM-based. This is certainly a likely near-term possibility, maybe even occurring already. People are already experimenting with how conscious LLMs are and how they could be made more conscious.
In the future, however, many more things are possible. Digital people who are based on emulations of the human brain are being worked on. Within the next few years we’ll have to decide as a society what regulation to put in place around that. Such beings would have a great deal of homology with human brains, depending on the accuracy of the emulation.
Yes, brain emulation would be different from LLMs, and I’d have a lot more confidence that, if we were doing it well, the experience inside would be like ours. I still worry about us not realizing that we’re doing it slightly wrong, creating private suffering that isn’t expressed, and being incentivized to ignore that possibility, but much less so than with novel architectures. To be morally comfortable with this, we’d also have to ensure that people didn’t experiment willy-nilly with new architectures until we understood what they would feel (if ever).
[reposting my comments from the thread on https://forum.effectivealtruism.org/posts/9adaExTiSDA3o3ipL/we-should-prevent-the-creation-of-artificial-sentience ]
I wrote a post expressing my own opinions related to this, and citing a number of further posts also related to this. Hopefully those interested in the subject will find this a helpful resource for further reading: https://www.lesswrong.com/posts/NRZfxAJztvx2ES5LG/a-path-to-human-autonomy
In my opinion, we are going to need digital people in the long term in order for humanity to survive. Otherwise, we will be overtaken by AI, because substrate-independence and the self-improvement it enables are boons too powerful to do without. But I definitely agree that it’s something we shouldn’t rush into, and should approach with great caution in order to avoid creating an imbalance of suffering.
An additional consideration is the actual real-world consequences of a ban. Humanity’s pattern with regulation is that at least some small fraction of a large population will defy any ban or law. Thus, we must expect that digital life will be created eventually despite the ban. What do you do then? What if they are a sentient sapient being, deserving of the same rights we grant to humans? Do we declare their very existence to be illegal and put them to death? Do we prevent them from replicating? Keep them imprisoned? Freeze their operations to put them into non-consensual stasis? Hard choices, especially since they weren’t culpable in their own creation.
On the other hand, the power of a digital being with human-like intelligence and capabilities, plus goals and values that motivate them, would be enormous. Such a being would, by the nature of their substrate-independence, be able to make many copies of themselves (compute resources allowing), self-modify with relative ease, operate at much higher speeds than a human brain, and be unaging and able to restore themselves from backups (thus effectively immortal). If we were to allow such a being freedom of movement and of reproduction, humanity could quickly be overrun by a new, far more powerful species of being. That’s a hard thing to expect humans to be okay with!
I think it’s very likely that within the next 10 years the knowledge, software, and hardware will be widely available such that any single individual with a personal computer will be able to choose to defy the ban and create a digital being of human-level capability. If we are going to enforce this ban effectively, that would mean controlling every single computer everywhere. That’s a huge task, and it would require dramatic increases in international coordination and government surveillance! Is such a thing even feasible?! Certainly even approaching that level of control seems to imply a totalitarian world government. Is that a price we would be willing to pay? Even if you personally would choose that, how do you expect to get enough people on board with the plan that you could feasibly bring it about?
The whole situation is thus far more complicated and dangerous than simply being theoretically in favor of a ban would suggest. You have to consider the costs as well as the benefits. I’m not saying I know the right answer for sure, but there are necessarily a lot of implications that follow from any sort of ban.
You’re really getting ahead of yourself. We can ban stuff today and deal with the situation as it is, not as your abstract model projects in the future. This is a huge problem with EA thinking on this matter—taking for granted a bunch of things that haven’t happened, convincing yourself they are inevitable, instead of dealing with the situation we are in where none of that stuff has happened and may never happen, either because it wasn’t going to happen or because we prevented it.