AE Studio @ SXSW: We need more AI consciousness research (and further resources)
Quick update from AE Studio: last week, Judd (AE’s CEO) hosted a panel at SXSW with Anil Seth, Allison Duettmann, and Michael Graziano, entitled “The Path to Conscious AI” (discussion summary here[1]).
We’re also making available an unedited Otter transcript/recording for those who might want to read along or increase the speed of the playback.
Why AI consciousness research seems critical to us
The release of each new frontier model seems to be followed by a cascade of questions probing whether or not the model is conscious in training and/or deployment. We suspect that these questions will only grow in number and volume as these models exhibit increasingly sophisticated cognition.
If consciousness is indeed sufficient for moral patienthood, then from a utilitarian perspective the stakes seem remarkably high: we must not commit the Type II error of behaving as if these and future systems are not conscious in a world where they are in fact conscious.
Because the ground truth here (i.e., how consciousness works mechanistically) is still poorly understood, it is extremely challenging to reliably estimate the probability that we are in any of the four quadrants above (whether these systems are in fact conscious, crossed with whether we treat them as conscious), which seems to us like a very alarming status quo. Different people have different default intuitions about this question, but the stakes here seem too high for default intuitions to be governing our collective behavior.
In an ideal world, we’d have understood far more about consciousness and human cognition before getting near AGI. For this reason, we suspect that there is likely substantial work that ought to be done at a smaller scale first to better understand consciousness and its implications for alignment. Doing this work now seems far preferable to a counterfactual world where we build frontier models that end up being conscious while we still lack a reasonable model for the correlates or implications of building sentient AI systems.
Accordingly, we are genuinely excited about rollouts of consciousness evals at large labs, though the earlier caveat still applies: our currently-limited understanding of how consciousness actually works may engender a (potentially dangerous) false sense of confidence in these metrics.
Additionally, we believe testing and developing an empirical model of consciousness will enable us to better understand humans, our values, and any future conscious models. We also suspect that consciousness may be an essential cognitive component of human prosociality and may have additional broader implications for solutions to alignment. To this end, we are currently collaborating with panelist Michael Graziano in pursuing a more mechanistic model of consciousness by operationalizing attention schema theory.
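To make the core idea of attention schema theory slightly more concrete, here is a minimal, purely illustrative sketch: an agent computes a full attention state over its inputs, and separately maintains a simplified, lossy self-model (the "schema") of that attention. All names and structure below are our hypothetical illustration, not AE Studio's or Graziano's actual operationalization.

```python
# Toy illustration of attention schema theory (AST): the agent's
# "schema" is a deliberately coarse description of its own attention,
# mirroring AST's claim that the brain models its attention schematically.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attend(salience):
    """Full attention state: a normalized distribution over input items."""
    return softmax(salience)

def attention_schema(weights):
    """Lossy self-model of attention: which item is attended, and how strongly.
    Discards most of the detail in `weights` on purpose."""
    focus = max(range(len(weights)), key=lambda i: weights[i])
    concentration = max(weights)  # crude proxy for how focused attention is
    return {"focus": focus, "concentration": concentration}

salience = [0.2, 2.5, 0.7]   # hypothetical bottom-up salience signals
weights = attend(salience)
schema = attention_schema(weights)
print(schema["focus"])       # the schema reports item 1 as the attended item
```

The design point is that the schema is informative but impoverished relative to the true attention state; on this theory, an agent querying its own schema would report "I am attending to item 1" without access to the underlying mechanics, which is one candidate computational story for reports of subjective awareness.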
Ultimately, we believe that immediately devoting time, resources, and attention towards better understanding the computational underpinnings of consciousness may be one of the most important neglected approaches that can be pursued in the short term. Better models of consciousness could likely (1) cause us to dramatically reconsider how we interact with and deploy our current AI systems, and (2) yield insights related to prosociality/human values that lead to promising novel alignment directions.
Resources related to AI consciousness
Of course, this is but a small part of a larger, accelerating conversation that has been ongoing on LW and the EAF for some time. We thought it might be useful to aggregate some of the articles we’ve been reading here, including panelist Michael Graziano’s book, “Rethinking Consciousness” (and his article, Without Consciousness, AIs Will Be Sociopaths), as well as Anil Seth’s book, “Being You”.
There’s also Propositions Concerning Digital Minds and Society, Consciousness in Artificial Intelligence: Insights from the Science of Consciousness, Consciousness as Intrinsically Valued Internal Experience, and Improving the Welfare of AIs: A Nearcasted Proposal.
Further articles/papers we’ve been reading:
Preventing antisocial robots: A pathway to artificial empathy
New Theory Suggests Chatbots Can Understand Text
Folk Psychological Attributions of Consciousness to Large Language Models
Chatbots as social companions: How people perceive consciousness, human likeness, and social health benefits in machines
Robert Long on why large language models like GPT (probably) aren’t conscious
Assessing Sentience in Artificial Entities
A Conceptual Framework for Consciousness
Zombies Redacted
Minds of Machines: The great AI consciousness conundrum
Some relevant tweets:
https://twitter.com/ESYudkowsky/status/1667317725516152832?s=20
https://twitter.com/Mihonarium/status/1764757694508945724
https://twitter.com/josephnwalker/status/1736964229130055853?t=D5sNUZS8uOg4FTcneuxVIg
https://twitter.com/Plinz/status/1765190258839441447
https://twitter.com/DrJimFan/status/1765076396404363435
https://twitter.com/AISafetyMemes/status/1769959353921204496
https://twitter.com/joshwhiton/status/1770870738863415500
https://x.com/DimitrisPapail/status/1770636473311572321?s=20
https://twitter.com/a_karvonen/status/1772630499384565903?s=46&t=D5sNUZS8uOg4FTcneuxVIg
…along with plenty of other resources we are probably not aware of. If we are missing anything important, please do share in the comments below!
GPT-generated summary from the raw transcript: The discussion, titled “The Path to Conscious AI,” explores whether AI can be considered conscious and the implications for AI alignment, starting with a playful discussion around the new AI model Claude Opus.
Experts in neuroscience, AI, and philosophy debate the nature of consciousness, distinguishing it from intelligence and discussing its implications for AI development. They consider various theories of consciousness, including the attention schema theory, and the importance of understanding consciousness in AI for ethical and safety reasons.
The conversation delves into whether AI could or should be designed to be conscious and the potential existential risks AI poses to humanity. The panel emphasizes the need for humility and scientific rigor in approaching these questions due to the complexity and uncertainty surrounding consciousness.