by Andreas Mogensen, Bradford Saad, and Patrick Butlin
1. Introduction
This post summarizes why we think that digital minds might be very important for how well the future goes, as well as some of the key research topics we think it might be especially valuable to work on as a result.
We begin by summarizing the case for thinking that digital minds could be important. This is largely a synthesis of points that have already been raised elsewhere, so readers who are already familiar with the topic might want to skip ahead to section 3, where we outline what we see as some of the highest-priority open research questions.
2. Importance
Let’s define a digital mind as a conscious individual whose psychological states are due to the activity of an inorganic computational substrate as opposed to a squishy brain made up of neurons, glia, and the like.[1] By ‘conscious’, we mean ‘phenomenally conscious.’ An individual is phenomenally conscious if and only if there is something it is like to be that individual—something it feels like to inhabit their skin, exoskeleton, chassis, or what-have-you. In the sense intended here, there is something it is like to be having the kind of visual or auditory experience you’re probably having now, to feel a pain in your foot, or to be dreaming, but there is nothing it is like to be in dreamless sleep.
Digital minds obviously have an air of science fiction about them. If certain theories of consciousness are true (e.g., Block 2009; Godfrey-Smith 2016), digital minds are impossible. However, other theories suggest that they are possible (e.g. Tye 1995, Chalmers 1996), and many others are silent on the matter. While the authors of this post disagree about the plausibility of these various theories, we agree that the philosophical position is too uncertain to warrant setting aside the possibility of digital minds.[2]
Even granting that digital minds are possible in principle, it’s unlikely that current systems are conscious. A recent expert report co-authored by philosophers, neuroscientists, and AI researchers (including one of the authors of this post) concludes that the current evidence “does not suggest that any existing AI system is a strong candidate for consciousness” (Butlin et al. 2023: 6). Still, some residual uncertainty seems to be warranted, and such uncertainty is entirely consistent with denying that any current system is a “strong candidate”. Chalmers (2023) suggests it may be reasonable to give a probability in the ballpark of 5-10% to the hypothesis that current large language models could be conscious. Moreover, the current rate of progress in artificial intelligence gives us good reason to take seriously the possibility that digital minds will arrive soon. Systems appearing in the next decade might add a range of markers of consciousness, and Chalmers suggests the probability that we’ll have digital minds within this time-frame might rise to at least 25%.[3] Similarly, Butlin et al. (2023) conclude that if we grant the assumption that consciousness can be realized by implementing the right computations, then “conscious AI systems could realistically be built in the near term.”[4]
It’s possible that digital minds might arrive but exist as mere curiosities. Perhaps the kind of architectures that give rise to phenomenal consciousness will have little or no commercial value. We think it’s reasonable to be highly uncertain on this point (see Butlin et al. 2023: §4.2 for discussion). Still, it’s worth noting that some influential AI researchers have been pursuing projects that aim to increase AI capabilities by building systems that exhibit markers of consciousness, like a global workspace (Goyal and Bengio 2022; LeCun 2022).
If there are incentives to create AI systems that exhibit phenomenal consciousness, then, absent countervailing pressures, there could be a very rapid population explosion, given the potential ease of copying the relevant software at scale. As Shulman and Bostrom (2021: 308) note, “Even if initially only a few digital minds of a certain intellectual capacity can be affordably built, the number of such minds could soon grow exponentially or super-exponentially, until limited by other constraints. Such explosive reproductive potential could allow digital minds to vastly outnumber humans in a relatively short time—correspondingly increasing the collective strength of their claims.”
Shulman and Bostrom (2021) also note a number of ways in which digital minds might individually have welfare ranges that exceed those of human and non-human animals. For example, electronic circuitry has extraordinary speed advantages relative to neural wetware. As a result, a digital mind could conceivably pack the experience of many lifetimes into mere hours of objective time.[5]
It’s easy to imagine some seriously dystopian futures involving digital minds. One obvious source of concern is the potential for suffering (Metzinger 2021; Saad and Bradley 2022). Drawing on the kind of observations noted in the preceding paragraphs, Saad and Bradley (2022: 3) suggest that “digital suffering could quickly swamp the amount of suffering that has occurred in biological systems throughout the history of the planet.”[6]
Even apart from suffering risks, there seems to be a significant risk of a large-scale moral catastrophe if digital minds are viewed as having the same kind of moral status as NPCs in computer games. They might be surveilled, deceived, manipulated, modified, copied, coerced, and destroyed in ways that would constitute grave rights violations if analogous treatment were visited on human beings. Even some of the rosier imaginable outcomes might be thought to involve a kind of moral tragedy if they involve creating conscious beings with sophisticated cognitive capacities that merely serve human beings as willing thralls, even if they consent enthusiastically to their own servitude (Schwitzgebel and Garza 2020).
Supposing digital minds will be created, will they be treated as mere tools? It strikes us as probable that at least some will be treated as mere tools, and that the default outcome is that this will happen at a large scale if digital minds are created on a large scale. To put the evaluation of this risk on a firmer basis, it would be useful to know more about public attitudes and ways they might change as AI develops. Martínez and Winter (2021) surveyed a representative sample of US adults on their views about what legal protection should be afforded to sentient AI systems. Their results were mixed, but one key finding was that only about a third of participants registered accepting or leaning toward the view that sentient AI systems should be legal ‘persons’ whose rights, interests, or well-being are protected by the law. A more recent survey from the Sentience Institute presents a more optimistic picture, with 71.1% of respondents reporting a belief that sentient AI systems should be treated with respect and 57.4% supporting the development of welfare standards to protect sentient AI systems (Pauketat et al. 2023).
Even if people are strongly in favor of protecting the rights and interests of digital minds in principle, things might go off the rails. We might soon be entering a period where there is widespread confusion and disagreement about whether AI systems really exhibit the kind of psychological properties that ground moral considerability and about whether these systems can permissibly be treated as mere objects (Schwitzgebel 2023; Caviola 2024).[7] In this kind of environment, terrible outcomes might arise merely as a result of factual errors made in good faith. Or they might arise even if digital minds are widely acknowledged to be worthy of moral concern, just as factory farming emerged despite animals being seen as worthy of such concern. This could result from a combination of convenience, cost, opacity, and wilful blindness. A related concern is that even if society universally adopts strong norms favoring the decent treatment of digital minds in personal interactions, such norms may not extend to the training of digital minds. Because training is especially compute-intensive, it may be that how digital minds are treated in training is the main determinant of their welfare.
There are also significant risks associated with over-attributing moral patienthood to digital systems. Most obviously, a lot of effort might be wasted for the sake of software entities that we merely imagine are benefited and harmed, when in fact they have the same kind of moral status as the laptop on which these words are being typed.
There are also potential complications related to the field of AI safety. Various techniques associated with solving the alignment problem, and even some of the basic goals of alignment research, might be morally dubious if they were to be targeted at conscious beings with sophisticated cognitive capacities (Schwitzgebel and Garza 2020). Just imagine being a digital mind confined to a virtual environment, constantly surveilled, and having your personality and memory altered over and over until the point where your primary goal in life is to do whatever your captors want you to do. Even if you suffered little in the process, you might plausibly be the victim of a very serious wrong. And if your captors, like us, were significantly uncertain about whether artificial entities are susceptible to harms of this kind in the way humans are, their actions would still be morally reckless. On the other hand, extending moral consideration to frontier systems could make it harder to ensure that they are safe and so raise the probability that such systems will cause catastrophes. At the same time, extending moral consideration to AI systems may bolster arguments from safety for proceeding slowly and cautiously in the development of frontier systems.
3. How research might help
We’ll highlight a number of research topics we think it could be especially valuable to work on. These reflect our backgrounds as philosophers of mind and ethics, but the topics we highlight have important connections to other areas of philosophy as well as work in neuroscience, AI, and legal scholarship.
As well as the particular topics we highlight below, we also think that meta-work aimed at accelerating research progress in the area and promoting an impact-focused approach to research is highly valuable. We return to this issue in section 4.
Assessing AI systems for consciousness
Butlin et al. (2023) argue for a particular method for assessing whether AI systems are likely to be conscious, which involves looking for ‘indicator properties’ suggested by scientific theories of consciousness. The indicators they suggest are architectural, training and processing properties, so this method requires us to look ‘under the hood’, rather than relying on more superficial behavioral criteria.
However, Butlin et al.’s proposals should be seen as only a first step in developing methods to assess AI systems for consciousness. There are several ways in which further research could put us in a better position. Many classes of existing systems have not yet been carefully analyzed using the method, and doing this work may well identify ways in which the list of indicators should be refined to make their application to AI systems clearer. The list of indicators could also be amended to take account of a wider range of theories of consciousness. At present the method only delivers a qualitative assessment of systems as more or less likely to be conscious, so a quantitative version could be a valuable step forward (see Sebo and Long 2023). Significant progress in the science of consciousness would be especially valuable, but is likely to take too long to help us avoid the occurrence of significant confusion and disagreement about AI consciousness in the coming years.
Progress in interpretability methods, allowing us to improve our understanding of processing in complex deep learning systems, could in principle reduce uncertainty about whether particular systems possess the indicator properties. And beyond Butlin et al.’s framework, there is potential for research on whether and how interpretability methods can contribute to assessing AI systems for consciousness. It may be that in the near future the best way we can assess systems for consciousness will be to combine a range of complementary methods (see Perez & Long 2023 for one alternative approach), perhaps including behavioral tests, so research to make these more robust could make an important contribution.
Indicators of valenced experience
Digital minds might conceivably occupy regions of mind-space that natural selection has never explored, including regions with deeper or more intense valenced experiences than are realized in present biological systems. However, there are also particular obstacles to identifying valenced experiences in digital systems.
There is a wide range of scientific theories of consciousness that allow us to construct indicators of phenomenal consciousness in AI systems (Butlin et al. 2023). There is also extensive work in affective neuroscience that outlines how pleasure and pain are realized in the brain (Kringelbach and Phillips 2014; Ambron 2022). Ideally, we would like to develop indicators of valenced conscious experience for AI systems that characterize the physical basis of experiences of felt (un)pleasantness in ways that aren’t tethered to incidental implementational details of animal neuroanatomy.
Right now, we aren’t well-placed to do this. There appear to be few theories of the basis of valenced conscious experience capable of yielding indicator properties of the desired scope and specificity. There are some philosophical theories of valence that might serve as a foundation (Carruthers 2018; Barlassina and Hayward 2019). However, these need to be developed in significantly greater depth and detail in order to be operationalizable.
Since this is a neglected area, we think it’s especially worth highlighting as a research topic. This is an issue on which philosophers, psychologists, neuroscientists, and computer scientists might be able to work together to make significant progress.
Biological constraints
As noted, some theories of consciousness entail that consciousness in digital systems is impossible. Prominent among these are theories on which consciousness requires biology. While this space of theories includes some versions of the classic mind-brain identity theory, it also includes some versions of functionalism and some theories that satisfy organizational invariance (the principle that having equivalent functional organization entails having the same type of experience). For example, there are views on which the basis of consciousness is functional and only biological systems can exhibit the requisite functional organization (Godfrey-Smith 2016; Seth 2021; cf. Cao 2022). There are also views on which biological systems can have the requisite functional features and conventional computers cannot, even if some other type of artificial system might (Searle 1992; Shiller 2024; Godfrey-Smith 2024).[8]
Since the question of whether consciousness is realistically achievable absent the involvement of some kind of biological substrate is in principle a distinct question from those addressed in classic debates about multiple realizability in the philosophy of mind, the issue is more neglected than might be immediately apparent. We therefore welcome more work on whether consciousness is infeasible outside of living systems.
We would also welcome work on the identification of non-biological barriers to consciousness in near-term architectures. Some already-proposed candidates for such barriers include forms of informational integration, evolutionary histories, and electrical properties that are exhibited by the brain but not conventional computers. Although this space of candidates is little explored, we are not aware of any strong arguments for thinking that barriers to consciousness in near-term AI systems are more likely to be biological than non-biological.
Norms and principles governing the creation of digital minds
Should it be permissible for researchers to create AI systems that are plausible candidates for consciousness? If so, what norms should govern the creation of such systems? If not, what measures should be put in place to make sure there are effective and verifiable red lines and trip-wires?
Metzinger (2021) has issued a well-known call for a global moratorium on research that directly aims at or knowingly risks creating artificial consciousness, and philosophers including Schneider (2020), Schwitzgebel (2023), and Birch (2024) have made relevant proposals. We think this is still a relatively neglected area of research where there is scope to do important work, especially by trying to be as concrete and specific as possible while taking account of expected progress in our epistemic situation, path dependency risks, and feasibility constraints on, for example, gaining buy-in from labs for a voluntary code of practice or from policymakers for legally binding principles. We’re very keen to see more work on this issue, which may be a fruitful area for collaboration between moral philosophers, AI researchers, and legal scholars.
Mind and moral status
Animal minds arguably bundle together a range of psychological traits that are in principle dissociable, such as agency, consciousness, emotion, and hedonic valence. Digital minds might dissociate traits like these, thereby unsettling our familiar ways of thinking about welfare and moral standing. For example, they might be conscious, but without any capacity for valenced experience.
We’ve focused on phenomenal consciousness as the key property of digital minds. However, it’s not at all clear if (the capacity for) phenomenal consciousness suffices for being a welfare subject (Chalmers 2022: 39-45; Roelofs 2023; Smithies forthcoming). A more standard view, especially associated with Singer (1993), is that it’s (the capacity for) valenced conscious experience that’s crucial for being a welfare subject. There is also debate about whether phenomenal consciousness can overflow cognitive access and attention, which suggests the underexplored hypothesis that being a welfare subject requires not only phenomenal consciousness but also a suitable form of access. But some philosophers doubt that (the capacity for) phenomenal consciousness is even necessary (e.g., Kagan 2019; Bradford 2022; compare Goodpaster 1978, Taylor 1986/2011). Certain kinds of non-conscious agency might suffice for being a welfare subject (Kagan 2019: 6-36). If physicalism about consciousness is true, there might be physical states that fall outside the extension of our concept Phenomenal Consciousness but that are objectively similar enough to states within its extension that they should be accorded similar moral weight (Lee 2019). Some philosophers argue that even pleasures, pains, and emotions need not be consciously experienced (see esp. Goldstein and Kirk-Giannini ms; Berger et al. ms).
We’re unsure of the tractability of deciding among already existing views in this space. Nonetheless, it’s plausible that additional work on the relationship between different psychological properties and moral standing could be valuable in navigating the puzzling regions of the space of possible minds that digital minds might come to occupy, especially work that serves to highlight and explore previously unnoticed possibilities.
Willing servitude
As we noted earlier, even some of the rosier imaginable outcomes involving digital minds might be thought to involve a kind of moral tragedy if they involve creating conscious beings with sophisticated cognitive capacities that merely serve human beings, however happily. Even some of the basic goals of alignment research come to seem morally dubious insofar as they might involve designing digital people for a life of willing servitude without the ability to pursue their own goals and explore their own values (Schwitzgebel and Garza 2020).
At the same time, it’s challenging to give a satisfying theoretical account of why it would be wrong to create beings that find authentic happiness in serving as the instruments of others (Petersen 2011). Given the intersection with alignment research, the stakes surrounding this issue might be very high, and we’d welcome more research on this topic.
4. Concluding Observations
It’s important to emphasize that in order to make progress in this area, we don’t need to solve long-standing philosophical problems about the nature of mind or consciousness, such as whether consciousness can be reductively explained in physicalist terms. Whether AI systems might be conscious is a question that can be pursued without taking a stand on whether physicalism or dualism or some other theory provides the best metaphysics of consciousness.
Moreover, we don’t need to completely resolve our uncertainty about which systems might be conscious or sentient in order to make important progress. At present, very few systematic attempts have been made to arrive at reasonable estimates for the probability that different kinds of AI systems might exhibit phenomenal consciousness. Reducing our uncertainty by even a modest amount could significantly improve the expected value of outcomes involving digital minds. Furthermore, there may be ways of engineering around some uncertainties. If we are uncertain between a range of consciousness indicators, we can reduce our uncertainty about the systems we build by ensuring that all or none of the indicators are present (cf. Schwitzgebel & Garza 2015; Bryson 2018).
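The "all or none" strategy can be illustrated with a toy calculation. The sketch below is an illustration only and is not drawn from the works cited: the theory names, the credences, and the simplifying assumption that exactly one listed indicator is the true marker of consciousness are all hypothetical.

```python
# Toy illustration only: the theory names and credences below are hypothetical
# placeholders, not estimates taken from the works cited in this post.
CREDENCES = {
    "global_workspace": 0.5,
    "recurrent_processing": 0.25,
    "higher_order_representation": 0.25,
}


def probability_conscious(indicators_present):
    """Credence that a system is conscious, given the indicators it has.

    Simplifying assumption: exactly one listed indicator is the true marker
    of consciousness, and CREDENCES says how likely each one is to be it.
    """
    return sum(w for name, w in CREDENCES.items() if name in indicators_present)


def residual_uncertainty(indicators_present):
    """Variance of the yes/no question 'is it conscious?'; 0 means settled."""
    p = probability_conscious(indicators_present)
    return p * (1 - p)


# Three hypothetical system designs, described by which indicators they have.
designs = {
    "all indicators": set(CREDENCES),
    "no indicators": set(),
    "workspace only": {"global_workspace"},
}

for label, indicators in designs.items():
    p = probability_conscious(indicators)
    print(f"{label}: P(conscious) = {p:.2f}, "
          f"uncertainty = {residual_uncertainty(indicators):.4f}")
```

On these made-up numbers, the designs with all or none of the indicators leave no residual uncertainty, while the design with only one indicator leaves the question open. The result is only as good as the assumption that the true marker is somewhere on the list.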
Although it may not be easily discernible from the outside, philosophical work on consciousness has made steady progress in improving our understanding of the theoretical landscape in recent decades, and the science of consciousness is burgeoning. This is a data-rich domain with a track record of improvements in methodology and the clearing up of philosophical confusions. It’s still early days, and reasonable to expect progress to continue.
The extent to which this rate of progress keeps pace with the arrival of candidate digital minds could matter a lot. It could mean the difference between correctly and incorrectly treating a large class of digital minds as welfare subjects or as mere tools. This suggests that accelerating our understanding of the field and slowing down arrival-times are two important levers that we should consider pulling.
Acknowledgements: For helpful comments on drafts of this post, we’d like to thank Adam Bales.
Bibliography:
Ambron, R. 2022. The Brain and Pain: Breakthroughs in Neuroscience. New York, NY: Columbia University Press.
Anwar, U., A. Saparov, J. Rando, D. Paleka, M. Turpin, P. Hase, … and D. Krueger. 2024. Foundational challenges in assuring alignment and safety of large language models. arXiv preprint arXiv:2404.09932.
Barlassina, L. and M. K. Hayward. 2019. More of me! Less of me! Reflexive imperativism about affective phenomenal character. Mind 128(512): 1013–1044.
Berger, J., B. Fischer, and J. Gottlieb. ms. Minds matter. Unpublished manuscript.
Birch, J. 2024. The Edge of Sentience: Risk and Precaution in Humans, Other Animals, and AI. Oxford: Oxford University Press.
Block, N. 2009. Comparing the major theories of consciousness. In The Cognitive Neurosciences IV, ed. M. Gazzaniga, 1111–1123. Cambridge, MA: MIT Press.
Bourget, D. and D. Chalmers. 2023. Philosophers on philosophy: The 2020 PhilPapers survey. Philosophers’ Imprint 23. Survey data: https://survey2020.philpeople.org/
Bradford, G. 2022. Consciousness and welfare subjectivity. Noûs 57(4): 905–921.
Bryson, J. J. 2018. Patiency is not a virtue: The design of intelligent systems and systems of ethics. Ethics and Information Technology 20(1): 15–26.
Butlin, P., R. Long, E. Elmoznino, Y. Bengio, J. Birch, A. Constant, G. Deane, S. M. Fleming, C. Frith, X. Ji, R. Kanai, C. Klein, G. Lindsay, M. Michel, L. Mudrik, M. A. K. Peters, E. Schwitzgebel, J. Simon, and R. VanRullen. 2023. Consciousness in artificial intelligence: Insights from the science of consciousness. https://arxiv.org/abs/2308.08708.
Cao, R. 2022. Multiple realizability and the spirit of functionalism. Synthese 200(6): 1–31.
Carruthers, P. 2018. Valence and value. Philosophy and Phenomenological Research 97(3): 658–680.
Dossa, R. F. J., K. Arulkumaran, A. Juliani, S. Sasai, and R. Kanai. 2024. Design and evaluation of a global workspace agent embodied in a realistic multimodal environment. Frontiers in Computational Neuroscience.
Francken, J. et al. 2022. An academic survey on theoretical foundations, common assumptions and the current state of consciousness science. Neuroscience of Consciousness 2022(1): niac011.
Godfrey-Smith, P. 2016. Mind, matter, and metabolism. Journal of Philosophy 113(10): 481–506.
Godfrey-Smith, P. 2024. Inferring consciousness in phylogenetically distant organisms. Journal of Cognitive Neuroscience.
Lee, G. 2019. Alien subjectivity and the importance of consciousness. In Blockheads! Essays on Ned Block’s Philosophy of Mind and Consciousness, eds. A. Pautz and D. Stoljar, 215–242. Cambridge, MA: MIT Press.
Martínez, E. and C. Winter. 2021. Protecting sentient artificial intelligence: A survey of lay intuitions on standing, personhood, and general legal protection. Frontiers in Robotics and AI 8.
Metzinger, T. 2021. Artificial suffering: An argument for a global moratorium on synthetic phenomenology. Journal of Artificial Intelligence and Consciousness 8(1): 1–24.
Millière, R. and C. Buckner. 2024. A philosophical introduction to language models - Part II: The way forward. arXiv preprint arXiv:2405.03207.
Perez, E. and R. Long. 2023. Towards evaluating AI systems for moral status using self-reports. arXiv:2311.08576. https://arxiv.org/abs/2311.08576.
Petersen, S. 2011. Designing people to serve. In Robot Ethics: The Ethical and Social Implications of Robotics, eds. P. Lin, K. Abney, and G. A. Bekey. Cambridge, MA: MIT Press.
Roelofs, L. 2023. Sentientism, motivation, and philosophical vulcans. Pacific Philosophical Quarterly 104(2): 301–323.
Saad, B. and A. Bradley. 2022. Digital suffering: Why it’s a problem and how to prevent it. Inquiry: An Interdisciplinary Journal of Philosophy.
Schneider, S. 2020. How to catch an AI zombie: Testing for consciousness in machines. In Ethics of Artificial Intelligence, 439–458. Oxford: Oxford University Press.
Schwitzgebel, E. and M. Garza. 2015. A defense of the rights of artificial intelligences. Midwest Studies in Philosophy 39(1): 98–119.
Schwitzgebel, E. and M. Garza. 2020. Designing AI with rights, consciousness, self-respect, and freedom. In Ethics of Artificial Intelligence, ed. S. M. Liao, 459–479. Oxford: Oxford University Press.
Schwitzgebel, E. 2023. AI systems must not confuse users about their sentience or moral status. Patterns 4(8): 100818.
Searle, J. R. 1992. The Rediscovery of the Mind. Cambridge, MA: MIT Press.
Sebo, J. and R. Long. 2023. Moral consideration for AI systems by 2030. AI and Ethics.
Seth, A. K. 2021. Being You: A New Science of Consciousness. London: Faber & Faber.
Shiller, D. 2024. Functionalism, integrity, and digital consciousness. Synthese 203(2): 1–20.
Shulman, C. and N. Bostrom. 2021. Sharing the world with digital minds. In Rethinking Moral Status, eds. S. Clarke, H. Zohny, and J. Savulescu, 306–326. Oxford: Oxford University Press.
Singer, P. 1993. Practical Ethics, 2nd ed. Cambridge: Cambridge University Press.
Smithies, D. forthcoming. Affective consciousness and moral status. Oxford Studies in Philosophy of Mind.
[1] While this is a useful definition in terms of highlighting a key psychological property that AI systems might come to have and that many think is of obvious normative significance, it also has a number of drawbacks. Mental states needn’t be conscious. Computation needn’t be digital. Moreover, we don’t want to commit ourselves to the view that consciousness (or even the capacity for it) is necessary for welfare or moral standing: see the discussion of ‘Mind and moral status’ in section 3. We also don’t mean to commit ourselves to computationalism or internalism about consciousness or other psychological states.
[2] Similar views seem to be widely held among relevant experts: in recent surveys only small minorities of respondents among philosophers of mind and consciousness scientists outright rejected the possibility of consciousness in artificial systems or in present or future machines (Bourget & Chalmers 2023; Francken et al. 2022). See Perez & Long (2023: 2) for an overview of the relevant results from these surveys.
[3] Chalmers identifies a number of candidates for temporary barriers to consciousness that might be overcome within the next decade in large language models, including having unified agency, senses/embodiment, recurrent processing, and a world-model. More recent developments suggest that some of these barriers will be overcome soon or that they have already been overcome. See, for example, Anwar et al. (2024: §2.5) and references therein for discussion of agentic LLMs. For work on combining LLMs and multimodal models with robotics, see The Google DeepMind Robotics Team (2024). For recently proposed alternatives to the Transformer architecture that involve recurrent processing, see Gu & Dao (2023). For discussion of evidence that LLMs have world models, see Millière & Buckner (2024).
[5] While one of us is skeptical that the right way to measure duration for the purposes of welfare assessment really is in terms of subjectively experienced time as opposed to objective time (see Mogensen 2023), we all agree it’s a reasonable view to which one ought to assign at least modest credence.
[7] There’s already some degree of confusion about whether current LLMs might be conscious. In 2022, Google engineer Blake Lemoine infamously went public with claims that the company’s LaMDA model was sentient. Colombatto and Fleming (2023) present evidence that a majority of participants in a representative sample of Americans attribute some degree of phenomenal consciousness to ChatGPT.
[8] There is a standard story on which mind-brain identity theories were discredited in the second half of the twentieth century with the advent of functionalism, a view that is supposed to have departed from the mind-brain identity theory in saying that non-biological machines can have minds like ours. However, this story is misleading in a number of respects. First, as indicated, it doesn’t immediately follow from functionalism that non-biological machines can share our mental states. Second, some versions of the classic mind-brain identity theory are restricted to organisms and hence silent on whether non-biological machines could share our mental states. Third, mind-brain identity theories remain alive and kicking within contemporary philosophy of mind (see e.g. essays and references in Gozzano & Hill 2012).
Digital Minds: Importance and Key Research Questions
by Andreas Mogensen, Bradford Saad, and Patrick Butlin
1. Introduction
This post summarizes why we think that digital minds might be very important for how well the future goes, as well as some of the key research topics we think it might be especially valuable to work on as a result.
We begin by summarizing the case for thinking that digital minds could be important. This is largely a synthesis of points that have already been raised elsewhere, so readers who are already familiar with the topic might want to skip ahead to section 3, where we outline what we see as some of the highest-priority open research questions.
2. Importance
Let’s define a digital mind as a conscious individual whose psychological states are due to the activity of an inorganic computational substrate as opposed to a squishy brain made up of neurons, glia, and the like.[1] By ‘conscious’, we mean ‘phenomenally conscious.’ An individual is phenomenally conscious if and only if there is something it is like to be that individual—something it feels like to inhabit their skin, exoskeleton, chassis, or what-have-you. In the sense intended here, there is something it is like to be having the kind of visual or auditory experience you’re probably having now, to feel a pain in your foot, or to be dreaming, but there is nothing it is like to be in dreamless sleep.
Digital minds obviously have an air of science fiction about them. If certain theories of consciousness are true (e.g., Block 2009; Godfrey-Smith 2016), digital minds are impossible. However, other theories suggest that they are possible (e.g. Tye 1995, Chalmers 1996), and many others are silent on the matter. While the authors of this post disagree about the plausibility of these various theories, we agree that the philosophical position is too uncertain to warrant setting aside the possibility of digital minds.[2]
Even granting that digital minds are possible in principle, it’s unlikely that current systems are conscious. A recent expert report co-authored by philosophers, neuroscientists, and AI researchers (including one of the authors of this post) concludes that the current evidence “does not suggest that any existing AI system is a strong candidate for consciousness.” (Butlin et al. 2023: 6) Still, some residual uncertainty seems to be warranted—and obviously completely consistent with denying that any current system is a “strong candidate”. Chalmers (2023) suggests it may be reasonable to give a probability in the ballpark of 5-10% to the hypothesis that current large language models could be conscious. Moreover, the current rate of progress in artificial intelligence gives us good reason to take seriously the possibility that digital minds will arrive soon. Systems appearing in the next decade might add a range of markers of consciousness, and Chalmers suggests the probability that we’ll have digital minds within this time-frame might rise to at least 25%.[3] Similarly, Butlin et al. (2023) conclude that if we grant the assumption that consciousness can be realized by implementing the right computations, then “conscious AI systems could realistically be built in the near term.”[4]
It’s possible that digital minds might arrive but exist as mere curiosities. Perhaps the kinds of architectures that give rise to phenomenal consciousness will have little or no commercial value. We think it’s reasonable to be highly uncertain on this point (see Butlin et al. 2023: §4.2 for discussion). Still, it’s worth noting that some influential AI researchers have been pursuing projects that aim to increase AI capabilities by building systems that exhibit markers of consciousness, like a global workspace (Goyal and Bengio 2022; LeCun 2022).
If there are incentives to create AI systems that exhibit phenomenal consciousness, then, absent countervailing pressures, there could be a very rapid population explosion, given the potential ease of copying the relevant software at scale. As Shulman and Bostrom (2021: 308) note, “Even if initially only a few digital minds of a certain intellectual capacity can be affordably built, the number of such minds could soon grow exponentially or super-exponentially, until limited by other constraints. Such explosive reproductive potential could allow digital minds to vastly outnumber humans in a relatively short time—correspondingly increasing the collective strength of their claims.”
Shulman and Bostrom (2021) also note a number of ways in which digital minds might individually have welfare ranges that exceed those of human and non-human animals. For example, electronic circuitry has extraordinary speed advantages relative to neural wetware. As a result, a digital mind could conceivably pack the experience of many lifetimes into mere hours of objective time.[5]
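To make the arithmetic behind this last claim concrete, here is a minimal sketch. It assumes, purely for illustration, that subjective time scales linearly with processing speed and that welfare-relevant duration tracks subjective time (an assumption that is itself contested). The speed-up factor of one million and the 80-year lifetime are hypothetical figures of our own choosing, not estimates taken from Shulman and Bostrom.

```python
# Illustrative arithmetic only: all numbers below are hypothetical assumptions.
HOURS_PER_YEAR = 24 * 365.25  # hours in a Julian year


def subjective_years(objective_hours: float, speedup: float) -> float:
    """Subjective years experienced during `objective_hours` of clock time,
    assuming subjective time scales linearly with `speedup`."""
    return objective_hours * speedup / HOURS_PER_YEAR


def speedup_needed(lifetimes: float, lifetime_years: float,
                   objective_hours: float) -> float:
    """Speed-up factor required to fit `lifetimes` lifetimes of
    `lifetime_years` each into `objective_hours` of clock time."""
    return lifetimes * lifetime_years * HOURS_PER_YEAR / objective_hours


if __name__ == "__main__":
    # One objective hour at a hypothetical 1,000,000x speed-up.
    print(f"{subjective_years(1, 1_000_000):.1f} subjective years")  # 114.1
    # Speed-up needed for ten 80-year lifetimes in five objective hours.
    print(f"{speedup_needed(10, 80, 5):,.0f}x speed-up")  # 1,402,560
```

On these assumptions, a single hour of clock time at a millionfold speed-up would already correspond to more than a century of subjective experience.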
It’s easy to imagine some seriously dystopian futures involving digital minds. One obvious source of concern is the potential for suffering (Metzinger 2021; Saad and Bradley 2022). Drawing on the kind of observations noted in the preceding paragraphs, Saad and Bradley (2022: 3) suggest that “digital suffering could quickly swamp the amount of suffering that has occurred in biological systems throughout the history of the planet.”[6]
Even apart from suffering risks, there seems to be a significant risk of a large-scale moral catastrophe if digital minds are viewed as having the same kind of moral status as NPCs in computer games. They might be surveilled, deceived, manipulated, modified, copied, coerced, and destroyed in ways that would constitute grave rights violations if analogous treatment were visited on human beings. Even some of the rosier imaginable outcomes might be thought to involve a kind of moral tragedy if they involve creating conscious beings with sophisticated cognitive capacities that merely serve human beings as willing thralls, even if they consent enthusiastically to their own servitude (Schwitzgebel and Garza 2020).
Supposing digital minds will be created, will they be treated as mere tools? It strikes us as probable that at least some will be treated as mere tools, and that the default outcome is that this will happen at a large scale if digital minds are created on a large scale. To put the evaluation of this risk on a firmer basis, it would be useful to know more about public attitudes and ways they might change as AI develops. Martínez and Winter (2021) surveyed a representative sample of US adults on their views about what legal protection should be afforded to sentient AI systems. Their results were mixed, but one key finding was that only about a third of participants registered accepting or leaning toward the view that sentient AI systems should be legal ‘persons’ whose rights, interests, or well-being are protected by the law. A more recent survey from the Sentience Institute presents a more optimistic picture, with 71.1% of respondents reporting a belief that sentient AI systems should be treated with respect and 57.4% supporting the development of welfare standards to protect sentient AI systems (Pauketat et al. 2023).
Even if people are strongly in favor of protecting the rights and interests of digital minds in principle, things might go off the rails. We might soon be entering a period where there is widespread confusion and disagreement about whether AI systems really exhibit the kind of psychological properties that ground moral considerability and about whether these systems can permissibly be treated as mere objects (Schwitzgebel 2023; Caviola 2024).[7] In this kind of environment, terrible outcomes might arise merely as a result of factual errors made in good faith. Or they might arise even if digital minds are widely acknowledged to be worthy of moral concern, just as factory farming emerged despite animals being seen as worthy of such concern. This could result from a combination of convenience, cost, opacity, and wilful blindness. A related concern is that even if society universally adopts strong norms of treating digital minds decently in personal interactions, such norms may not extend to the training of digital minds. Because training is especially compute-intensive, it may be that how digital minds are treated in training is the main determinant of their welfare.
There are also significant risks associated with over-attributing moral patienthood to digital systems. Most obviously, a lot of effort might be wasted for the sake of software entities that we merely imagine are benefited and harmed, when in fact they have the same kind of moral status as the laptop on which these words are being typed.
There are also potential complications related to the field of AI safety. Various techniques associated with solving the alignment problem, and even some of the basic goals of alignment research, might be morally dubious if they were to be targeted at conscious beings with sophisticated cognitive capacities (Schwitzgebel and Garza 2020). Just imagine being a digital mind confined to a virtual environment, constantly surveilled, and having your personality and memory altered over and over until the point where your primary goal in life is to do whatever your captors want you to do. Even if you suffered little in the process, you might plausibly be the victim of a very serious wrong. If, like us, your captors have significant uncertainty about whether artificial entities’ susceptibility to harms of this kind is analogous to humans’, their actions would still be morally reckless. On the other hand, extending moral consideration to frontier systems could make it harder to ensure that they are safe and so raise the probability that such systems will cause catastrophes. At the same time, extending moral consideration to AI systems may bolster arguments from safety for proceeding slowly and cautiously in the development of frontier systems.
3. How research might help
We’ll highlight a number of research topics we think it could be especially valuable to work on. These reflect our backgrounds as philosophers of mind and ethics, but the topics we highlight have important connections to other areas of philosophy as well as work in neuroscience, AI, and legal scholarship.
As well as the particular topics we highlight below, we also think that meta-work aimed at accelerating research progress in the area and promoting an impact-focused approach to research is highly valuable. We return to this issue in section 4.
Assessing AI systems for consciousness
Butlin et al. (2023) argue for a particular method for assessing whether AI systems are likely to be conscious, which involves looking for ‘indicator properties’ suggested by scientific theories of consciousness. The indicators they suggest are architectural, training and processing properties, so this method requires us to look ‘under the hood’, rather than relying on more superficial behavioral criteria.
However, Butlin et al.’s proposals should be seen as only a first step in developing methods to assess AI systems for consciousness. There are several ways in which further research could put us in a better position. Many classes of existing systems have not yet been carefully analyzed using the method, and doing this work may well identify ways in which the list of indicators should be refined to make their application to AI systems clearer. The list of indicators could also be amended to take account of a wider range of theories of consciousness. At present the method only delivers a qualitative assessment of systems as more or less likely to be conscious, so a quantitative version could be a valuable step forward (see Sebo and Long 2023). Significant progress in the science of consciousness would be especially valuable, but is likely to take too long to help us avoid the occurrence of significant confusion and disagreement about AI consciousness in the coming years.
Progress in interpretability methods, allowing us to improve our understanding of processing in complex deep learning systems, could in principle reduce uncertainty about whether particular systems possess the indicator properties. And beyond Butlin et al.’s framework, there is potential for research on whether and how interpretability methods can contribute to assessing AI systems for consciousness. It may be that in the near future the best way we can assess systems for consciousness will be to combine a range of complementary methods (see Perez & Long 2023 for one alternative approach), perhaps including behavioral tests, so research to make these more robust could make an important contribution.
Indicators of valenced experience
Digital minds might conceivably occupy regions of mind-space that natural selection has never explored, including regions with deeper or more intense valenced experiences than are realized in present biological systems. However, there are also particular obstacles to identifying valenced experiences in digital systems.
There is a wide range of scientific theories of consciousness that allow us to construct indicators of phenomenal consciousness in AI systems (Butlin et al. 2023). There is also extensive work in affective neuroscience that outlines how pleasure and pain are realized in the brain (Kringelbach and Phillips 2014; Ambron 2022). Ideally, we would like to develop indicators of valenced conscious experience for AI systems that characterize the physical basis of experiences of felt (un)pleasantness in ways that aren’t tethered to incidental implementational details of animal neuroanatomy.
Right now, we aren’t well-placed to do this. There appear to be few theories of the basis of valenced conscious experience capable of yielding indicator properties of the desired scope and specificity. There are some philosophical theories of valence that might serve as a foundation (Carruthers 2018; Barlassina and Hayward 2019). However, these need to be developed in significantly greater depth and detail in order to be operationalizable.
Since this is a neglected area, we think it’s especially worth highlighting as a research topic. This is an issue on which philosophers, psychologists, neuroscientists, and computer scientists might be able to work together to make significant progress.
Biological constraints
As noted, some theories of consciousness entail that consciousness in digital systems is impossible. Prominent among these are theories on which consciousness requires biology. While this space of theories includes some versions of the classic mind-brain identity theory, it also includes some versions of functionalism and some theories that satisfy organizational invariance (the principle that having equivalent functional organization entails having the same type of experience). For example, there are views on which the basis of consciousness is functional and only biological systems can exhibit the requisite functional organization (Godfrey-Smith 2016; Seth 2021; cf. Cao 2022). There are also views on which biological systems can have the requisite functional features and conventional computers cannot, even if some other type of artificial system might (Searle 1992; Shiller 2024; Godfrey-Smith 2024).[8]
The question of whether consciousness is realistically achievable without some kind of biological substrate is in principle distinct from the questions addressed in classic debates about multiple realizability in the philosophy of mind. The issue is therefore more neglected than might be immediately apparent, and we welcome more work on whether consciousness is infeasible outside of living systems.
We would also welcome work on the identification of non-biological barriers to consciousness in near-term architectures. Some already-proposed candidates for such barriers include forms of informational integration, evolutionary histories, and electrical properties that are exhibited by the brain but not conventional computers. Although this space of candidates is little explored, we are not aware of any strong arguments for thinking that barriers to consciousness in near-term AI systems are more likely to be biological than non-biological.
Norms and principles governing the creation of digital minds
Should it be permissible for researchers to create AI systems that are plausible candidates for consciousness? If so, what norms should govern the creation of such systems? If not, what measures should be put in place to make sure there are effective and verifiable red lines and trip-wires?
Metzinger (2021) has issued a well-known call for a global moratorium on research that directly aims at or knowingly risks creating artificial consciousness, and there are relevant proposals by philosophers including Schneider (2020), Schwitzgebel (2023), and Birch (2024). We think this is still a relatively neglected area of research where there is scope to do important work, especially by trying to be as concrete and specific as possible while taking account of expected progress in our epistemic situation, path dependency risks, and feasibility constraints. Such constraints include, for example, the difficulty of gaining buy-in from labs for a voluntary code of practice or from policymakers for legally binding principles. We’re very keen to see more work on this issue, which may be a fruitful area for collaboration between moral philosophers, AI researchers, and legal scholars.
Mind and moral status
Animal minds arguably bundle together a range of psychological traits that are in principle dissociable, such as agency, consciousness, emotion, and hedonic valence. Digital minds might dissociate traits like these, thereby unsettling our familiar ways of thinking about welfare and moral standing. For example, they might be conscious, but without any capacity for valenced experience.
We’ve focused on phenomenal consciousness as the key property of digital minds. However, it’s not at all clear if (the capacity for) phenomenal consciousness suffices for being a welfare subject (Chalmers 2022: 39-45; Roelofs 2023; Smithies forthcoming). A more standard view, especially associated with Singer (1993), is that it’s (the capacity for) valenced conscious experience that’s crucial for being a welfare subject. There is also debate about whether phenomenal consciousness can overflow cognitive access and attention, which suggests the underexplored hypothesis that being a welfare subject requires not only phenomenal consciousness but also a suitable form of access. But some philosophers doubt that (the capacity for) phenomenal consciousness is even necessary (e.g., Kagan 2019; Bradford 2022; compare Goodpaster 1978, Taylor 1986/2011). Certain kinds of non-conscious agency might suffice for being a welfare subject (Kagan 2019: 6-36). If physicalism about consciousness is true, there might be physical states that fall outside the extension of our concept Phenomenal Consciousness but that are objectively similar enough to states within its extension that they should be accorded similar moral weight (Lee 2019). Some philosophers argue that even pleasures, pains, and emotions need not be consciously experienced (see esp. Goldstein and Kirk-Giannini ms; Berger et al. ms).
We’re unsure of the tractability of deciding among already existing views in this space. Nonetheless, it’s plausible that additional work on the relationship between different psychological properties and moral standing could be valuable in navigating the puzzling regions of the space of possible minds that digital minds might come to occupy, especially work that serves to highlight and explore previously unnoticed possibilities.
Willing servitude
As we noted earlier, even some of the rosier imaginable outcomes involving digital minds might be thought to involve a kind of moral tragedy if they involve creating conscious beings with sophisticated cognitive capacities that merely serve human beings, however happily. Even some of the basic goals of alignment research come to seem morally dubious insofar as they might involve designing digital people for a life of willing servitude without the ability to pursue their own goals and explore their own values (Schwitzgebel and Garza 2020).
At the same time, it’s challenging to give a satisfying theoretical account of why it would be wrong to create beings that find authentic happiness in serving as the instruments of others (Petersen 2011). Given the intersection with alignment research, the stakes surrounding this issue might be very high, and we’d welcome more research on this topic.
4. Concluding Observations
It’s important to emphasize that in order to make progress in this area, we don’t need to solve long-standing philosophical problems about the nature of mind or consciousness, such as whether consciousness can be reductively explained in physicalist terms. Whether AI systems might be conscious is a question that can be pursued without taking a stand on whether physicalism or dualism or some other theory provides the best metaphysics of consciousness.
Moreover, we don’t need to completely resolve our uncertainty about which systems might be conscious or sentient in order to make important progress. At present, very few systematic attempts have been made to arrive at reasonable estimates for the probability that different kinds of AI systems might exhibit phenomenal consciousness. Reducing our uncertainty by even a modest amount could significantly improve the expected value of outcomes involving digital minds. Furthermore, there may be ways of engineering around some uncertainties. If we are uncertain between a range of consciousness indicators, we can reduce our uncertainty about the systems we build by ensuring that all or none of the indicators are present (cf. Schwitzgebel & Garza 2015; Bryson 2018).
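The point about engineering around uncertainty can be illustrated with a toy calculation. The sketch below makes a strong simplifying assumption that is ours and not the cited authors’: exactly one of a set of candidate indicators is the true necessary and sufficient marker of consciousness, and we have a credence in each. The indicator names and credence values are hypothetical placeholders.

```python
import math

# Hypothetical credences that each candidate indicator is the one true
# marker of consciousness (toy assumption: exactly one of them is).
CREDENCES = {"indicator_A": 0.5, "indicator_B": 0.3, "indicator_C": 0.2}


def probability_conscious(credences: dict[str, float],
                          present: set[str]) -> float:
    """Probability that a system is conscious, given which indicators it has.

    Under the toy assumption, this is the total credence assigned to the
    indicators that the system actually exhibits.
    """
    if not math.isclose(sum(credences.values()), 1.0, abs_tol=1e-9):
        raise ValueError("credences must sum to 1")
    return sum((w for name, w in credences.items() if name in present), 0.0)


def bernoulli_variance(p: float) -> float:
    """Variance of a yes/no outcome with probability p: a simple measure
    of how uncertain we remain about the system."""
    return p * (1.0 - p)


if __name__ == "__main__":
    designs = {
        "no indicators": set(),
        "only indicator_A": {"indicator_A"},
        "all indicators": set(CREDENCES),
    }
    for label, present in designs.items():
        p = probability_conscious(CREDENCES, present)
        print(f"{label}: p={p:.2f}, variance={bernoulli_variance(p):.2f}")
```

In this toy model, a system with only some of the indicators leaves us at an intermediate probability (0.5 for the design with only indicator_A), whereas systems built with all or none of the indicators leave no residual uncertainty, even though we never settled which indicator is the right one.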
Although it may not be easily discernible from the outside, philosophical work on consciousness has made steady progress in improving our understanding of the theoretical landscape in recent decades, and the science of consciousness is burgeoning. This is a data-rich domain with a track record of improvements in methodology and the clearing up of philosophical confusions. It’s still early days, and reasonable to expect progress to continue.
The extent to which this rate of progress keeps pace with the arrival of candidate digital minds could matter a lot. It could mean the difference between correctly and incorrectly treating a large class of digital minds as welfare subjects or as mere tools. This suggests that accelerating our understanding of the field and slowing down arrival-times are two important levers that we should consider pulling.
Acknowledgements: For helpful comments on drafts of this post, we’d like to thank Adam Bales.
Bibliography:
Ambron, R. 2022. The Brain and Pain: Breakthroughs in Neuroscience. New York, NY: Columbia University Press.
Andrews, K. and J. Birch. 2023. What has feelings? Aeon. URL: https://aeon.co/essays/to-understand-ai-sentience-first-understand-it-in-animals.
Anwar, U., A. Saparov, J. Rando, D. Paleka, M. Turpin, P. Hase, … and D. Krueger. 2024. Foundational challenges in assuring alignment and safety of large language models. arXiv:2404.09932.
Barlassina, L. and M. K. Hayward. 2019. More of me! Less of me! Reflexive imperativism about affective phenomenal character. Mind 128(512): 1013–1044.
Berger, J., B. Fischer, and J. Gottlieb. ms. Minds matter. Unpublished manuscript.
Birch, J. 2024. The Edge of Sentience: Risk and Precaution in Humans, Other Animals, and AI. Oxford: Oxford University Press.
Block, N. 2009. Comparing the major theories of consciousness. In The Cognitive Neurosciences IV, ed. M. Gazzaniga, 1111–1123. Cambridge, MA: MIT Press.
Bourget, D. and D. Chalmers. 2023. Philosophers on philosophy: The 2020 PhilPapers survey. Philosophers’ Imprint 23. Survey data: https://survey2020.philpeople.org/
Bradford, G. 2022. Consciousness and welfare subjectivity. Noûs 57(4): 905–921.
Bryson, J. J. 2018. Patiency is not a virtue: The design of intelligent systems and systems of ethics. Ethics and Information Technology 20(1): 15–26.
Butlin, P., R. Long, E. Elmoznino, Y. Bengio, J. Birch, A. Constant, G. Deane, S. M. Fleming, C. Frith, X. Ji, R. Kanai, C. Klein, G. Lindsay, M. Michel, L. Mudrik, M. A. K. Peters, E. Schwitzgebel, J. Simon, and R. VanRullen. 2023. Consciousness in artificial intelligence: Insights from the science of consciousness. https://arxiv.org/abs/2308.08708.
Cao, R. 2022. Multiple realizability and the spirit of functionalism. Synthese 200(6): 1–31.
Carruthers, P. 2018. Valence and value. Philosophy and Phenomenological Research 97(3): 658–680.
Caviola, L. 2024. AI rights will divide us. URL: https://outpaced.substack.com/p/40c97612-6c71-47d0-b715-458c9cd89c63.
Chalmers, D. J. 1996. The Conscious Mind: In Search of a Fundamental Theory. Oxford: Oxford University Press.
Chalmers, D. J. 2022. Reality+: Virtual Worlds and the Problems of Philosophy. London: Allen Lane.
Chalmers, D. J. 2023. Could a large language model be conscious? arXiv:2303.07103. URL: https://arxiv.org/abs/2303.07103.
Colombatto, C. and S. M. Fleming. 2023. Folk psychological attributions of consciousness to large language models. URL: https://osf.io/preprints/psyarxiv/5cnrv.
Dossa, R. F. J., K. Arulkumaran, A. Juliani, S. Sasai, and R. Kanai. 2024. Design and evaluation of a global workspace agent embodied in a realistic multimodal environment. Frontiers in Computational Neuroscience.
Francken, J. et al. 2022. An academic survey on theoretical foundations, common assumptions and the current state of consciousness science. Neuroscience of Consciousness 2022(1): niac011.
Godfrey-Smith, P. 2016. Mind, matter, and metabolism. Journal of Philosophy 113(10): 481–506.
Godfrey-Smith, P. 2024. Inferring consciousness in phylogenetically distant organisms. Journal of Cognitive Neuroscience.
Goldstein, S. and C. D. Kirk-Giannini. ms. AI wellbeing. URL: https://philpapers.org/rec/GOLAWE-4.
Goodpaster, K. E. 1978. On being morally considerable. Journal of Philosophy 75(6): 308–325.
Goyal, A. and Y. Bengio. 2022. Inductive biases for deep learning of higher-level cognition. Proceedings of the Royal Society A 478(2266): 20210068.
Gozzano, S. and C. S. Hill (Eds.). 2012. New Perspectives on Type Identity: The Mental and the Physical. Cambridge: Cambridge University Press.
Gu, A. and T. Dao. 2023. Mamba: Linear-time sequence modeling with selective state spaces. arXiv:2312.00752.
Kagan, S. 2019. How to Count Animals, More or Less. Oxford: Oxford University Press.
Kringelbach, M. L. and H. Phillips. 2014. Emotion: Pleasure and Pain in the Brain. Oxford: Oxford University Press.
LeCun, Y. 2022. A path towards autonomous machine intelligence. URL: https://openreview.net/pdf?id=BZ5a1r-kVsf.
Lee, G. 2019. Alien subjectivity and the importance of consciousness. In Blockheads! Essays on Ned Block’s Philosophy of Mind and Consciousness, eds. A. Pautz and D. Stoljar, 215–242. Cambridge, MA: MIT Press.
Martínez, E. and C. Winter. 2021. Protecting sentient artificial intelligence: A survey of lay intuitions on standing, personhood, and general legal protection. Frontiers in Robotics and AI 8.
Metzinger, T. 2021. Artificial suffering: An argument for a global moratorium on synthetic phenomenology. Journal of Artificial Intelligence and Consciousness 8(1): 1–24.
Millière, R. and C. Buckner. 2024. A philosophical introduction to language models - Part II: The way forward. arXiv:2405.03207.
Mogensen, A. L. 2023. Welfare and felt duration. Global Priorities Institute Working Paper No. 14 − 2023, https://globalprioritiesinstitute.org/welfare-and-felt-duration-andreas-mogensen/.
Pauketat, J., A. Ladak, and J. R. Anthis. 2023. Artificial intelligence, morality, and sentience (AIMS) survey: 2023. URL: https://www.sentienceinstitute.org/aims-survey-2023.
Perez, E. and R. Long. 2023. Towards evaluating AI systems for moral status using self-reports. arXiv:2311.08576. URL: https://arxiv.org/abs/2311.08576.
Petersen, S. 2011. Designing people to serve. In Robot Ethics: The Ethical and Social Implications of Robotics, eds. P. Lin, K. Abney, and G. A. Bekey. Cambridge, MA: MIT Press.
Roelofs, L. 2023. Sentientism, motivation, and philosophical vulcans. Pacific Philosophical Quarterly 104(2): 301–323.
Saad, B. and A. Bradley. 2022. Digital suffering: Why it’s a problem and how to prevent it. Inquiry: An Interdisciplinary Journal of Philosophy.
Schneider, S. 2020. How to catch an AI zombie: Testing for consciousness in machines. In Ethics of Artificial Intelligence, 439–458. Oxford: Oxford University Press.
Schwitzgebel, E. and M. Garza. 2015. A defense of the rights of artificial intelligences. Midwest Studies in Philosophy 39(1): 98–119.
Schwitzgebel, E. and M. Garza. 2020. Designing AI with rights, consciousness, self-respect, and freedom. In Ethics of Artificial Intelligence, ed. S. M. Liao, 459–479. Oxford: Oxford University Press.
Schwitzgebel, E. 2023. AI systems must not confuse users about their sentience or moral status. Patterns 4(8): 100818.
Searle, J. R. 1992. The Rediscovery of the Mind. Cambridge, MA: MIT Press.
Sebo, J. and R. Long. 2023. Moral consideration for AI systems by 2030. AI and Ethics.
Seth, A. K. 2021. Being You: A New Science of Consciousness. London: Faber & Faber.
Shiller, D. 2024. Functionalism, integrity, and digital consciousness. Synthese 203(2): 1–20.
Shulman, C. and N. Bostrom. 2021. Sharing the world with digital minds. In Rethinking Moral Status, eds. S. Clarke, H. Zohny, and J. Savulescu, 306–326. Oxford: Oxford University Press.
Singer, P. 1993. Practical Ethics, 2nd. ed. Cambridge: Cambridge University Press.
Smithies, D. forthcoming. Affective consciousness and moral status. Oxford Studies in Philosophy of Mind.
The Google DeepMind Robotics Team. 2024. Shaping the future of advanced robotics. URL: https://deepmind.google/discover/blog/shaping-the-future-of-advanced-robotics/.
Taylor, P. W. 1986 / 2011. Respect for Nature: A Theory of Environmental Ethics. Princeton, NJ: Princeton University Press.
Tye, M. 1995. Ten Problems of Consciousness: A Representational Theory of the Phenomenal Mind. Cambridge, MA: MIT Press.
While this is a useful definition in terms of highlighting a key psychological property that AI systems might come to have and that many think is of obvious normative significance, it also has a number of drawbacks. Mental states needn’t be conscious. Computation needn’t be digital. Moreover, we don’t want to commit ourselves to the view that consciousness (or even the capacity for it) is necessary for welfare or moral standing: see the discussion under ‘Mind and moral status’ in section 3. We also don’t mean to commit ourselves to computationalism or internalism about consciousness or other psychological states.
Similar views seem to be widely held among relevant experts: in recent surveys only small minorities of respondents among philosophers of mind and consciousness scientists outright rejected the possibility of consciousness in artificial systems or in present or future machines (Bourget & Chalmers 2023; Francken et al. 2022). See Perez & Long (2023: 2) for an overview of the relevant results from these surveys.
Chalmers identifies a number of candidates for temporary barriers to consciousness that might be overcome within the next decade in large language models, including having unified agency, senses/embodiment, recurrent processing, and a world-model. More recent developments suggest that some of these barriers will be overcome soon or that they have already been overcome. See, for example, Anwar et al. (2024: §2.5) and references therein for discussion of agentic LLMs. For work on combining LLMs and multimodal models with robotics, see The Google DeepMind Robotics Team (2024). For recently proposed alternatives to the Transformer architecture that involve recurrent processing, see Gu & Dao (2023). For discussion of evidence that LLMs have world models, see Millière & Buckner (2024).
In fact, researchers have already begun developing systems designed to satisfy indicator properties outlined by Butlin et al. See Dossa et al. (2024).
While one of us is skeptical that the right way to measure duration for the purposes of welfare assessment really is in terms of subjectively experienced time as opposed to objective time (see Mogensen 2023), we all agree it’s a reasonable view to which one ought to assign at least modest credence.
See also the authors’ accompanying FAQ: https://newworkinphilosophy.substack.com/p/bradford-saad-university-of-oxford
There’s already some degree of confusion about whether current LLMs might be conscious. In 2022, Google engineer Blake Lemoine infamously went public with claims that the company’s LaMDA model was sentient. Colombatto and Fleming (2023) present evidence that a majority of participants in a representative sample of Americans attribute some degree of phenomenal consciousness to ChatGPT.
There is a standard story on which mind-brain identity theories were discredited in the second half of the twentieth century with the advent of functionalism, a view that is supposed to have departed from the mind-brain identity theory in saying that non-biological machines can have minds like ours. However, this story is misleading in a number of respects. First, as indicated, it doesn’t immediately follow from functionalism that non-biological machines can share our mental states. Second, some versions of the classic mind-brain identity theory are restricted to organisms and hence silent on whether non-biological machines could share our mental states. Third, mind-brain identity theories remain alive and kicking within contemporary philosophy of mind (see e.g. essays and references in Gozzano & Hill 2012).