Google could build a conscious AI in three months
Summary: Theories of consciousness do not present significant technical hurdles to building conscious AI systems. Recent advances in AI relate to capacities that aren’t obviously relevant to consciousness. Satisfying major theories with current technology has been and will remain quite possible. The potentially short timelines to plausible digital consciousness mean that issues relating to digital minds are more pressing than they might at first seem. Key ideas are bolded. You can get the main points just by skimming these.
Viability
Claim 1: Google could build a conscious AI in as little as three months if it wanted to.
Claim 2: Microsoft[1] could have done the same in 1990.
I’m skeptical of both of these claims, but I think something in the ballpark is true. Google could assemble a small team of engineers to quickly prototype a system that existing theories, straightforwardly applied, would predict is conscious. The same is true for just about any tech company, today or in 1990.[2]
Philosophers and neuroscientists have little to say about why a digital system that implemented fairly simple patterns of information processing would not be conscious. Even if individual theorists might have a story to tell about what was missing, they would probably not agree on what that story was.
The most prominent theories of consciousness lay out relatively vague requirements for mental states to be conscious. The requirements for consciousness (at least for the more plausible theories[3]) generally have to do with patterns of information storage, access, and processing. Theorists typically want to accommodate our uncertainty about the low-level functioning of the human brain and also allow for consciousness in species with brains rather different from ours. This means that their theories involve big picture brain architectures, not specific cellular structures.
Take the Global Workspace Theory: roughly put, conscious experiences result from stored representations in a centralized cognitive workspace. That workspace is characterized by its ability to broadcast its contents to a variety of (partially) modularized subsystems, which can in turn submit future contents to the workspace. According to the theory, any system that uses such an architecture to route information is conscious.
A software system that included a global workspace would be easy to build. All you have to do is set up some modules with the right rules for access to a global repository. To be convincing, these modules should resemble the modules of human cognition, but it isn’t obvious which kinds of faculties matter: perhaps modules for memory, perception, motor control, introspection, etc. You need these modules to be able to feed information into the global workspace and receive information from it in turn. The modules also need to be able to make use of the information they receive, which requires that the workspace contents be in a form the different modules can use.
Critically for my point, complexity and competence aren’t desiderata for consciousness. The modules with access to the workspace don’t need to perform their assigned duties particularly well. Given no significant requirements on complexity or competence, a global workspace architecture could be achieved in a crude way quite quickly by a small team of programmers. It doesn’t rely on any genius, or any of the technological advances of the past 30 years.
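To make the claim concrete, here is a minimal sketch of what such a crude architecture might look like. The module names, the salience-based competition rule, and the random stub behavior are my own illustrative assumptions, not anything Global Workspace Theory itself specifies; the point is only how little engineering the bare architecture demands.

```python
# A deliberately crude sketch of a global-workspace-style architecture.
# The module names, the salience rule, and the random stub "cognition" are
# illustrative assumptions, not requirements of Global Workspace Theory.
import random
from dataclasses import dataclass


@dataclass
class Content:
    source: str      # which module submitted this content
    payload: str     # the information itself
    salience: float  # used to decide which submission wins the workspace


class Module:
    """A minimal cognitive module that can receive broadcasts and submit content."""

    def __init__(self, name):
        self.name = name
        self.inbox = []  # broadcasts received from the workspace

    def receive(self, content):
        self.inbox.append(content)

    def propose(self):
        # A real module would do memory lookup, perception, motor planning, etc.
        # Here it just occasionally submits a trivially generated item.
        if random.random() < 0.5:
            return Content(self.name, f"signal from {self.name}", random.random())
        return None


class GlobalWorkspace:
    """Holds one content at a time and broadcasts it to every module."""

    def __init__(self, modules):
        self.modules = modules
        self.current = None

    def cycle(self):
        # Modules compete for access; the most salient submission wins.
        proposals = [p for m in self.modules if (p := m.propose()) is not None]
        if proposals:
            self.current = max(proposals, key=lambda c: c.salience)
        # The winning content is broadcast back to all modules.
        if self.current is not None:
            for m in self.modules:
                m.receive(self.current)


if __name__ == "__main__":
    modules = [Module(n) for n in ("perception", "memory", "motor", "introspection")]
    workspace = GlobalWorkspace(modules)
    for _ in range(5):
        workspace.cycle()
        print("workspace holds:", workspace.current)
```

Nothing in this sketch is intelligent or competent; it just shuttles trivial strings between four stub modules. That is the point: the bare architecture, as the theory describes it, is not a serious engineering challenge.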
More generally:
1.) Consciousness does not depend on general intelligence, mental flexibility, organic unpredictability, or creativity.
These traits distinguish humans from current computer programs. There are no programs that can produce creative insights outside of very constrained contexts. Perhaps because of this, we may use these traits as a heuristic guide to consciousness when in doubt. In science fiction, for instance, we often implicitly assess the status of alien and artificial creatures without knowing anything about their internal structures. We naturally regard the artificial systems that exhibit the same sorts of creativity and flexibility as having conscious minds. However, these heuristics are not grounded in theory.
There are no obvious reasons why these traits should have to be correlated with consciousness in artificial systems. Nothing about consciousness obviously requires intelligence or mental flexibility. In contrast, there might be good reasons why you’d expect systems that evolved by natural selection to be conscious if and only if they were intelligent, flexible, and creative. For instance, it might be that the architectures that allow for consciousness are most useful with intelligent systems, or help to generate creativity. But even if this is so, it doesn’t show that such traits have to travel together in structures that we design and build ourselves. Compare: since legs are generally used to get around, we should expect most animals with legs to be pretty good at using them to walk or run. But we could easily build robots that had legs but fell over whenever they tried to go anywhere. The fact that they are clumsy doesn’t mean they lack legs.
2.) Consciousness does not require, and is not made easier by, neural networks.
Neural networks are exciting because they resemble human brains and because they allow for artificial cognition that resembles human cognition in being flexible and creative. However, most theorists accept that consciousness is multiply realizable, meaning that consciousness can be produced in many different kinds of systems, including systems that don’t use neurons or anything like neurons.
There is no obvious reason why neural networks should be better able to produce the kinds of information architectures that are thought to be characteristic of consciousness. Most plausible major theories of consciousness have nothing to say about neurons or what they might contribute, so it is unclear why neural networks should be any more likely than other architectures to lead to consciousness.
Reception
Even though I think a tech company could build a system that checked all the boxes of current theories, I doubt it would convince people that the AI was really conscious (though not for particularly good reasons). If that’s right, it provides reason to think no company will try any time soon. Plausibly, a company would only set out to make a conscious system if it could convince its audience that it may have succeeded.
We can divide the question of reception into two parts: How would the public respond and how would experts respond?
Tech companies may soon be able to satisfy the letter of most of the current major non-biological theories of consciousness, but any AI developed soon would probably still remind us more of a computer than an animal. It might make simple mistakes suggestive of imperfect computer algorithms. It might be limited to a very specific domain of behaviors. If it controlled a robot body, its movements might be jerky and conspicuously mechanical. Consider the biases people have about animals like fish that don’t share our physiology. It seems likely that people would be even more biased against crude AIs.
The AI wouldn’t necessarily have language skills capable of expressing its feelings. If it did, it might talk about its consciousness in a way that mimics us rather than as the result of organic introspection[4]. This might lead to the same sorts of mistakes that make LaMDA’s claims to consciousness so implausible (e.g. talking about how delicious ice cream is despite never having tried it). The fact that a system is just mimicking us when talking about its conscious experiences doesn’t mean it lacks them—human actors (e.g. in movies) still have feelings, even if you can’t trust their reports—but it seems to me that it would make claims about its consciousness a tough sell to the general public.
The Public
The candidate system I’m imagining would probably not, by itself, convince the general public that artificial consciousness had arrived. People have ways of thinking about minds and machines and use various simple and potentially misleading heuristics for differentiating them. On these heuristics, crude systems that passed consciousness hurdles would still, I expect, be grouped with the machines, because of their computer-like behavioral quirks and because people aren’t used to thinking about computers as conscious.
On the other hand, systems that present the right behavioral profile may be regarded by the public as conscious regardless of theoretical support. If a system does manage to hook into the right heuristics, or if it reminds us more of an animal than a computer, people might be generally inclined to regard it as conscious, particularly if they can interact with it and if experts aren’t generally dismissive. People are primed to anthropomorphize. We do it with our pets, with the weather, even with dots moving on a screen.
The Experts
I suspect that most experts who have endorsed theories of consciousness wouldn’t be inclined to attribute consciousness to a crude system that satisfied the letter of their theories. It is reputationally safer (in terms of both public perceptions and academic credibility) not to endorse consciousness in systems that give off a computer vibe. There is a large kooky side to consciousness research that the more conservative mainstream likes to distinguish itself from. So, many theorists will likely want some grounds on which to deny or at least suspend judgement about consciousness in crude implementations of their favored architectures. On the other hand, the threat of kookiness may lose its bite if the public is receptive to an AI being conscious.
Current theories are missing important criteria that might be relevant to artificial consciousness because they’re defined primarily with the goal of distinguishing conscious from unconscious states of human brains (or possibly conscious human brains from unconscious animals, or household objects). They aren’t built to distinguish humans from crude representations of human architectures. It is open to most theorists to expand their theories to exclude digital minds. Alternatively, they may simply profess not to know what to make of apparent digital minds (e.g. level-headed mysterianism). This is perhaps safer and more honest, but if widely adopted, means the public would be on its own in assessing consciousness.
Implications
The possibility that a tech company could soon develop a system that was plausibly conscious according to popular theories should be somewhat unsettling. The main barriers to this happening seem to have more to do with companies’ lack of desire to build conscious systems than with technical limitations. The skeptical reception such systems are likely to receive is good—it provides a disincentive that buys us more time. However, these thoughts are very tentative. There might be ways of taking advantage of our imperfect heuristics to encourage people to accept AI systems as conscious.
The overall point is that timelines for apparent digital consciousness may be very short. While there are presently no large groups interested in producing digital consciousness, the situation could quickly change if consciousness becomes a selling point and companies think harder about how to market their products as conscious, such as for chatbot friends or artificial pets. There is no clear technological hurdle to creating digital consciousness. Whether we think we have succeeded may have more to do with our imperfect heuristics than with any technical achievement.
We’re not ready, legally or socially, for potentially sentient artificial creatures to be created and destroyed at will for commercial purposes. Given the current lack of attention to digital consciousness, we’re not in a good position to even agree about which systems might need our protection or what protections are appropriate. This is a problem.
In the short run, worries about sentient artificial systems are dwarfed by the problems faced by humans and animals. However, there are longtermist considerations that suggest we should care more about digital minds now than we currently do. How we decide to incorporate artificial systems into our society may have a major impact on the shape of the future. That decision is likely to be highly sensitive to the initial paths we take.
Because of the long-term importance of digital minds, the people who propose and evaluate theories of consciousness need to think harder about applications to artificial systems. Three months (or three years) will not be nearly enough time to develop better theories about consciousness or to work out what policies we should put in place given our lack of certainty.
[1] Brian Tomasik makes the case that Microsoft may have done so unintentionally.
[2] Theories of consciousness have advanced further since 1990 than has the technology relevant to implementing them. Developers in 1990 would have had a much less clear idea about what to try to build.
[3] I include among the more plausible theories the Global Workspace Theory, the various mid-level representationalist theories (e.g. Prinz’s AIR, Tye’s PANIC), first-order representationalist theories, and higher-order theories that require metarepresentation (attention-tracking theories, HOT theory, dual content theory, etc.). I don’t find IIT plausible, despite its popularity, and am not sure what effect its inclusion would have on the present arguments. Error theories and indeterminacy theories are plausible, but introduce a range of complications beyond the scope of this post. Some philosophers have maintained that biological aspects of the brain are necessary for consciousness, but this view generally doesn’t include a specific story about exactly which critical element is missing.
[4] We aren’t inclined to talk about our conscious experiences in the customary ways unprompted. We acquire ways of framing consciousness and mental states from our culture as children, so much of what we do is mimicking. Nevertheless, the frames we have acquired were developed by people with brains like ours, so the fact that we’re mimicking others (to whatever extent we are) isn’t problematic in the way that it is for an AI.
Comments

I would find this more persuasive if it had thorough references to the existing science of consciousness. I was under the impression that we still don’t know what the necessary conditions for consciousness are, and there are many competing theories on the topic. Stating that one theory is correct doesn’t answer that key question for me.
We definitely don’t, and I hope I haven’t committed myself to any one theory. The point is that the most developed views provide few obstacles. Those views tend to highlight different facets of human cognitive architecture. For instance, it may be some form of self-representation that matters, or the accessibility of representations to various cognitive modules. I didn’t stress this enough: of the many views, we may not know which is right, but it wouldn’t be technically hard to satisfy them all. After all, human cognitive architecture satisfies every plausible criterion of consciousness.
On the other hand, it is controversial whether any of the developed views are remotely right. There are some people who think we’ve gone in the wrong direction. However, these people generally don’t have specific alternative proposals that clearly come down one way or another on AI systems.
I think this is a key point that counts strongly against the possibility of building conscious AI so soon, as in the title. Some theories of consciousness are basically theories of human (and sometimes animal) consciousness, really just explaining which neural correlates predict subjective report, but do not claim those minimal neural correlates generate consciousness in any system, so building a computer or writing software that just meets their minimally stated requirements should not be taken as generating consciousness. I think Global Workspace Theory and Attention Schema Theory are theories like this, although Graziano, the inventor of AST, also describes how consciousness is generated in the general case, i.e. illusionism, and how AST relates to this, i.e. the attention schema is where the “illusion” is generated in practice in animals. (I’m not as familiar with other theories, so won’t comment on them.)
If you overextend such theories of consciousness, you can get panpsychism, and we already have artificial or otherwise non-biological consciousness:
https://www.unil.ch/files/live/sites/philo/files/shared/DocsPerso/EsfeldMichael/2007/herzog.esfeld.gerstner.07.pdf
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6074090/
https://forum.effectivealtruism.org/posts/YE9d4yab9tnkK5yfd/physical-theories-of-consciousness-reduce-to-panpsychism (I just updated this post to say that I think I was overextending)
From the 50s to the 90s, there was a lot of debate about the basic nature of consciousness and its relation to the brain. The theory that emerged from that debate as the most plausible physicalist contender suggested that it was something about the functional structure of the brain that matters for consciousness. Materials aren’t relevant, just abstract patterns. These debates were very high level and the actual functional structures responsible for consciousness weren’t much discussed, but they suggested that we could fill in the details and use those details to tell whether AI systems were conscious.
From the 90s to the present, there have been a number of theories developed that look like they aim to fill in the details of the relevant functional structures that are responsible for consciousness. I think these theories are most plausibly read as specific versions of functionalism. But you’re right, the people who have developed them often haven’t committed fully either to functionalism or to the completeness of the functional roles they describe. They would probably resist applying them in crude ways to AIs.
The theorists who might resist the application of modern functionalist theories to digital minds are generally pretty silent on what might be missing in the functional story they tell (or what else might matter apart from functional organization). I think this would undercut any authority they might have in denying consciousness to such systems, or even raising doubts about it.
Suppose that Replika produced a system that had perceptual faculties and a global workspace, that tracked its attention and utilized higher-order representations of its own internal states in deciding what to do. Suppose they announced to the media that they had created a digital person, and charged users $5 an hour to talk to it. Suppose that Replika told journalists that they had worked hard to implement Graziano’s theory in their system, and yes, it was built out of circuits, but everything special about human consciousness was coded in there. What would people’s reactions be? What would Graziano say about it? I doubt he could come up with many compelling reasons to think it wasn’t conscious, even if he could say that his theory wasn’t technically intended to apply to such systems. This leaves the curious public in the “who can really say?” camp, reminiscent of solipsism or doubts about the consciousness of dogs. I think they’d fall back on problematic heuristics.
If they weren’t silent, they would be proposing further elaborated theories.
Maybe, but I don’t think we would be justified in giving much weight to the extension of their theory of consciousness to artificial entities. Taking GWT, for example, many things could qualify as a global workspace.
So far, I think this would be possible with <1000 neurons, and very probably 100 or fewer, if you interpret all of those things minimally. For example, a single neuron to represent an internal state and another neuron for a higher-order representation of that internal state. Of course, maybe that’s still conscious, but the bar seems very low.
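To illustrate how low that bar is on a maximally minimal reading, here is a toy sketch; the encoding is my own illustration, and nothing in the theories prescribes it.

```python
# A deliberately minimal (and surely too crude) reading of "an internal state
# plus a higher-order representation of it". Purely illustrative.
internal_state = 0.7                                  # stands in for a first-order internal state
higher_order = ("representation_of", internal_state)  # a second item that merely points at the first

print(higher_order)  # ('representation_of', 0.7)
```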
“everything special about human consciousness was coded in there” sounds like whole brain emulation, and I think Graziano and most functionalists should and would endorse its consciousness. Generally, though, I think Graziano and other illusionists would want to test whether it treats its own internal states or information processing as having or seeming to have properties consciousness seems to have, like being mysterious/unphysical/ineffable. Maybe some more specific tests, too. Wilterson and Graziano have already come up with artificial agents with attention schemas, which should be conscious if you overextended the neural correlate claims of AST too literally and forgot the rest, but would not be conscious according to Graziano (or AST), since they wouldn’t pass the illusionist tests. Graziano wrote:
I suppose my claim about AST in my first comment is inaccurate. AST is in fact a general theory of consciousness, but:
1. the general theory is basically just illusionism (AFAIK), and
2. having an attention schema is neither necessary nor sufficient for consciousness in general, but it is both in animals in practice.
Oh, apparently Wilterson and Graziano did admit that the artificial agent in their paper is kind of conscious, in the section “Is the Artificial Agent Conscious? Yes and No.”
This requires an extremely simplistic theory of representation, but yeah, if you allow any degree of crudeness you might get consciousness in very simple systems.
I suppose you could put my overall point this way: current theories present very few technical obstacles, so it would take little effort to build a system that would be difficult to rule out as conscious. Even if you think we need more criteria to avoid getting stuck with panpsychism, we don’t have those criteria and so can’t wield them to do any work in the near future.
I mean everything that is plausibly relevant according to current theories, which is a relatively short list. There is a big gulf between everything people have suggested is necessary for consciousness and a whole brain emulation.
It has been a while since I’ve read Graziano—but if I recall correctly (and as your quote illustrates) he likes both illusionism and an attention schema theory. Since illusionism denies consciousness, he can’t take AST as a theory of what consciousness is; he treats it instead as a theory of the phenomena that lead us to puzzle mistakenly about consciousness. If that is right, he should think that any artificial mind with an AST architecture, even a pretty crude one, might be led to make mistakes about mind-brain relations, and that this isn’t indicative of any further interesting phenomenon. The question of the consciousness of artificial systems is settled decisively in the negative by illusionism.
This is also my impression of the theories with which I’m familiar, except illusionist ones. I think only illusionist theories actually give plausible accounts of consciousness in general, as far as I’m aware, and I think they probably rule out panpsychism, but I’m not sure (if small enough animal brains are conscious, and counterfactual robustness is not necessary, then you might get panpsychism again).
Fair. That’s my impression, too.
I guess this is a matter of definitions. I wouldn’t personally take illusionism as denying consciousness outright; instead, illusionism says that consciousness does not actually have the apparently inaccessible, ineffable, unphysical or mysterious properties people often attribute to it, and it’s just the appearance/depiction/illusion of such properties that makes a system conscious. At any rate, whether consciousness is a real phenomenon or not, however we define it, I would count systems that have illusions of consciousness, or specifically illusions of conscious evaluations (pleasure, suffering, “conscious” preferences), as moral patients and consider their interests in the usual ways. (Maybe with some exceptions that don’t count, like giant lookup tables and some other systems that don’t have causal structures at all resembling our own.) This is also Luke Muehlhauser’s approach in his 2017 Report on Consciousness and Moral Patienthood.
I agree that this sounds semantic. I think of illusionism as a type of error theory, but people in this camp have always been somewhat cagey about what they’re denying, and there is a range of interesting theories.
Interesting. Do you go the other way too? E.g. if a creature doesn’t have illusions of consciousness, then it isn’t a moral patient?
Assuming illusionism is true, then yes, I think only those with illusions of consciousness are moral patients.
It seems like this may be a non-standard interpretation of illusionism. Being under illusions of consciousness isn’t necessary for consciousness according to Frankish; what is necessary is that if a sufficiently sophisticated introspective/monitoring system were connected to the system in the right way, then that would generate illusions of consciousness. See, e.g., his talks:
https://youtu.be/xZxcair9oNk?t=3590
https://www.youtube.com/watch?v=txiYTLGtCuM
https://youtu.be/me9WXTx6Z-Q
I suspect now that this is also how AST is supposed to be understood, based on the artificial agents paper.
I do wonder if this is setting the bar too low, though. Humphrey seems to set a higher bar, where some kind of illusion is in fact required, but also mammals and birds probably have them.
I think we get into a definitional problem. What exactly do we mean by “illusion” or “belief”? If an animal has a “spooky” attention schema, and cognitive access to it, then plausibly the animal has beliefs about it of some kind. If an animal or system believes something is good or bad or whatever, is that not an illusion, too, and is that not enough?
Also see section 6.2 in https://www.openphilanthropy.org/research/2017-report-on-consciousness-and-moral-patienthood for discussion of some specific theories and why they don’t answer the hard problem.
Weak downvoted because I think the title is somewhat misleading given the content of the post (and I think you know this, since you replied below: ‘Perhaps I oversold the provocative title.’) I think we should discourage clickbait of this nature on the Forum.
Fair enough
It sounds like you’re giving IIT approximately zero weight in your all-things-considered view. I find this surprising, given IIT’s popularity amongst people who’ve thought hard about consciousness, and given that you seem aware of this.
Additionally, I’d be interested to hear how your view may have updated in light of the recent empirical results from the IIT-GNWT adversarial collaboration:
From my experience, there is a significant difference in the popularity of IIT by field. In philosophy, where I got my training, it isn’t a view that is widely held. Partly because of this bias, I haven’t spent a whole lot of time thinking about it. I have read the seminal papers that introduce the formal model and give the philosophical justifications, but I haven’t looked much into the empirical literature. The philosophical justifications seem very weak to me—the formal model seems very loosely connected to the axioms of consciousness that supposedly motivate it. And without much philosophical justification, I’m wary of the empirical evidence. The human brain is messy enough that I expect you could find evidence to confirm IIT whether or not it is true, if you look long enough and frame your assumptions correctly. That said, it is possible that existing empirical work does provide a lot of support for IIT that I haven’t taken the time to appreciate.
If you don’t buy the philosophical assumptions, as I do not, then I don’t think you should update much on the IIT-GNWT adversarial collaboration results. IIT may have done better, but if the two views being compared were essentially picked out of a hat as representatives of different philosophical kinds, then the fact that one view does better doesn’t say much about the kinds. It seems weird to me to compare the precise predictions of theories that are so drastically different in their overall view of things. I don’t really like the idea of adversarial approaches across different frameworks. I would think it makes more sense to compare nearby theories to one another.
This is very informative to me, thanks for taking the time to reply. For what it’s worth, my exposure to theories of consciousness is from the neuroscience + cognitive science angle. (I very nearly started a PhD in IIT in Anil Seth’s lab back in 2020.) The overview of the field I had in my head could be crudely expressed as: higher-order theories and global workspace theories are ~dead (though, on the latter, Baars and co. have yet to give up); the exciting frontier research is in IIT and predictive processing and re-entry theories.
I’ve been puzzled by the mentions of GWT in EA circles—the noteworthy example here is how philosopher Rob Long gave GWT a fair amount of air time in his 80k episode. But given EA’s skew toward philosopher-types, this now makes a lot more sense.
I am wondering if the fact that theories of consciousness relate more to the overall architecture of a system than to its low-level workings is a limitation that should be strongly considered, particularly if there are other theories that are more focused on low-level cellular functioning. For example, I’ve seen a theory from renowned theoretical physicist Roger Penrose (video below) stating that consciousness is a quantum process. If this is the case, then current computers wouldn’t be conscious because they aren’t quantum devices at the level of individual transistors or circuits. Therefore, no matter what the overall neural architecture is, even in a complicated AI, the system wouldn’t be conscious.
Another interesting point is that the way we incorporate AI into society may be affected by whether the AIs we build are generally sentient. If they are, and we have to respect their rights when developing, implementing, or shutting down such systems, that may create an incentive to do these things more slowly. I think slowing down may be good for the world at large: plunging into the AI revolution while we still only have black-box methods of designing and training AIs—that is, while digital neuroscience hasn’t progressed enough for us to decode the neural circuits of AIs and know what they’re actually thinking when they make decisions (https://forum.effectivealtruism.org/posts/rJRw78oihoT5paFGd/high-level-hopes-for-ai-alignment)—seems very dangerous.
Wow, that is a strong claim!
Could these conscious AI also have affective experience?
Perhaps I oversold the provocative title. But I do think that affective experiences are much harder, so even if there is a conscious AI it is unlikely to have the sorts of morally significant states we care about. While I think it is plausible that current theories of consciousness are relatively close to complete, I’m less convinced that current theories of valence are plausible as relatively complete accounts. There has been much less work in this direction.
Which makes me wonder how anyone expects to identify whether software entities have affective experience.
Is there any work in this direction that you like and can recommend?
There is a growing amount of work in philosophy investigating the basic nature of pain that seems relevant to identifying important valenced experiences in software entities. What the Body Commands by Colin Klein is a representative and reasonably accessible book-length introduction that pitches one of the current major theories of pain. Applying it to conscious software entities wouldn’t be too hard. Otherwise, my impression is that most of the work is too recent and too niche to have accessible surveys yet.
Overall, I should say that I’m not particularly sympathetic to the theories that people have come up with here, but you might disagree, and I don’t think you have much reason to take my word for it. In any case, they are trying to answer the right questions.