Research Fellow at the Center for AI Safety
rgb
What to think when a language model tells you it’s sentient
Writing about my job: Research Fellow, FHI
Key questions about artificial sentience: an opinionated guide
80k podcast episode on sentience in AI systems
The pretty hard problem of consciousness
How to make independent research more fun (80k After Hours)
I work at Trajan House and I wanted to comment on this:
But a great office gives people the freedom to not worry about what they need for work, a warm environment in which they feel welcome and more productive, and supports them in ways they did not think were necessary.
By these metrics, Trajan House is a really great office! I’m so grateful for the work that Jonathan and the other operations staff do. It definitely makes me happier and more productive.
Trajan House in 2022 is a thriving hub of work, conversation, and fun.
Unsurprisingly, I agree with a lot of this! It’s nice to see these principles laid out clearly and concisely:
You write
AI welfare is potentially an extremely large-scale issue. In the same way that the invertebrate population is much larger than the vertebrate population at present, the digital population has the potential to be much larger than the biological population in the future.
Do you know of any work that estimates these sizes? There are various places that people have estimated the ‘size of the future’ including potential digital moral patients in the long run, but do you know of anything that estimates how many AI moral patients there could be by (say) 2030?
I think that the post should explain briefly, or even just link to, what a “superforecaster” is, and if possible explain how and why this serves as an independent check.
The superforecaster panel is imo a credible signal of good faith, but people outside of the community may think “superforecasters” just means something arbitrary and/or weird and/or made up by FTX.
(The post links to Tetlock’s book, but not in the context of explaining the panel)
Since the article is paywalled, it may be helpful to excerpt the key parts or say what you think Searle’s argument is. I imagine the trivial inconvenience of having to register will prevent a lot of people from checking it out.
I read that article a while ago, but can’t remember exactly what it says. To the extent that it is rehashing Searle’s arguments that AIs, no matter how sophisticated their behavior, necessarily lack understanding / intentionality / something like that, I think that Searle’s arguments are just not that relevant to work on AI alignment.
Basically, I think what Chalmers says in his paper The Singularity: A Philosophical Analysis is right:
As for the Searle and Block objections, these rely on the thesis that even if a system duplicates our behavior, it might be missing important “internal” aspects of mentality: consciousness, understanding, intentionality, and so on. Later in the paper, I will advocate the view that if a system in our world duplicates not only our outputs but our internal computational structure, then it will duplicate the important internal aspects of mentality too. For present purposes, though, we can set aside these objections by stipulating that for the purposes of the argument, intelligence is to be measured wholly in terms of behavior and behavioral dispositions, where behavior is construed operationally in terms of the physical outputs that a system produces. The conclusion that there will be AI++ in this sense is still strong enough to be interesting. If there are systems that produce apparently superintelligent outputs, then whether or not these systems are truly conscious or intelligent, they will have a transformative impact on the rest of the world. (emph mine)
Thanks Darius! It was my pleasure.
[Replying separately with comments on progress on the pretty hard problem; the hard problem; and the meta-problem of consciousness]
The meta-problem of consciousness is distinct from both a) the hard problem: roughly, the fundamental relationship between the physical and the phenomenal, and b) the pretty hard problem: roughly, knowing which systems are phenomenally conscious.
The meta-problem is c) explaining “why we think consciousness poses a hard problem, or in other terms, the problem of explaining why we think consciousness is hard to explain” (6)
The meta-problem has a very interesting relationship to the hard problem. To see what this relationship is, we need a distinction between the “hard problem” of explaining consciousness and what Chalmers calls the ‘easy’ problems of explaining “various objective behavioural or cognitive functions such as learning, memory, perceptual integration, and verbal report”.
(Much like ‘pretty hard’, the ‘easy’ label is tongue in cheek: the easy problems are tremendously difficult, and thousands of brilliant people with expensive fancy machines are constantly hard at work on them.)
Ease of the easy problems: “the easy problems are easy because we have a standard paradigm for explaining them. To explain a function, we just need to find an appropriate neural or computational mechanism that performs that function. We know how to do this at least in principle.”
Hardness of the hard problem: “Even after we have explained all the objective functions that we like, there may still remain a further question: why is all this functioning accompanied by conscious experience?...the standard methods in the cognitive sciences have difficulty in gaining purchase on the hard problem.”
The meta-problem is interesting because it is deeply related to the hard problem, but it is strictly speaking an ‘easy’ problem: it is about explaining certain cognitive and behavioral functions. For example: thinking “I am currently seeing purple and it seems strange to me that this experience could simply be explained in terms of physics” or “It sure seems like Mary in the black and white room lacks knowledge of what it’s like to see red”; or sitting down and writing “boy, consciousness sure is puzzling, I bet I can get funding to work on this.”
Chalmers hopes that cognitive science can get traction on the meta-problem, by explaining how these cognitive functions and behaviors come about in ‘topic-neutral’ terms that don’t commit to any particular metaphysical theory of consciousness. And then if we have a solution to the meta-problem, this might shed light on the hard problem.
One particularly intriguing connection is that it seems like a) a solution to the meta-problem should at least be possible, and b) if it is, then it gives us a really good reason not to trust our beliefs about consciousness! The argument runs:
1. A solution to the meta-problem is possible, so there is a correct explanation of our beliefs about consciousness that is independent of consciousness.
2. If there is a correct explanation of our beliefs about consciousness that is independent of consciousness, those beliefs are not justified.
3. Therefore, our beliefs about consciousness are not justified.
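To make the structure explicit, here is a schematic rendering in simple propositional form. The labels M, I, and J are my own shorthand, not Chalmers’s notation:

```latex
% M = a solution to the meta-problem is possible
% I = there is a correct explanation of our beliefs about consciousness
%     that is independent of consciousness
% J = our beliefs about consciousness are justified
\begin{align*}
  &\text{P1: } M \\
  &\text{P2: } M \rightarrow I \\
  &\text{P3: } I \rightarrow \neg J \\
  &\text{C: } \therefore \neg J && \text{(modus ponens, applied twice)}
\end{align*}
```

The argument is plainly valid, so everything turns on whether to accept the premises.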
Part of the reason for my aforementioned growing interest in illusionism is that I think this argument is pretty good. Chalmers came up with it and elaborated it, even though he is not an illusionist, and I like his elaboration of it more than his replies!
Well, I looked it up and found a free pdf, and it turns out that Searle does consider this counterargument.
Why is it so important that the system be capable of consciousness? Why isn’t appropriate behavior enough? Of course for many purposes it is enough. If the computer can fly airplanes, drive cars, and win at chess, who cares if it is totally nonconscious? But if we are worried about a maliciously motivated superintelligence destroying us, then it is important that the malicious motivation should be real. Without consciousness, there is no possibility of its being real.
But I find the arguments that he then gives in support of this claim quite unconvincing / I don’t understand exactly what the argument is. Notice that Searle’s argument is based on comparing a spell-checking program on a laptop with human cognition. He claims that reflecting on the difference between the human and the program establishes that it would never make sense to attribute psychological states to any computational system at all. But that comparison doesn’t seem to show that at all.
And it certainly doesn’t show, as Searle thinks it does, that computers could never have the “motivation” to pursue misaligned goals, in the sense that Bostrom needs to establish that powerful AGI could be dangerous.
I should say: while Searle is not my favorite writer on these topics, I think these sorts of questions at the intersection of phil mind and AI are quite important and interesting, and it’s cool that you are thinking about them. (Then again, I *would* think that, given my background.) And it’s important to scrutinize the philosophical assumptions (if any) behind AI risk arguments.
That’s a great point. A related point that I hadn’t really clocked until someone pointed it out to me recently, though it’s obvious in retrospect, is that (EA aside) in an academic department it is structurally unlikely that you will have a colleague who shares your research interests to a large extent: it’s rare that a department is big enough to have two people doing the same thing, since departments need coverage of their whole field for teaching and supervision.
Small correction: Jonathan Birch is at LSE, not QMUL. Lars Chittka, the co-lead of the project, is at QMUL.
That’s a great question. I’ll reply separately with my takes on progress on a) the pretty hard problem, b) the hard problem, and c) something called the meta-problem of consciousness [1].
[1] With apologies for introducing yet another ‘problem’ to distinguish between, when I’ve already introduced two! (Perhaps you can put these three problems into Anki?)
Progress on the pretty hard problem
This is my attempt to explain Jonathan Birch’s recent proposal for studying invertebrate consciousness. Let me know if it makes rough sense!
The problem with studying animal consciousness is that it is hard to know how much we can extrapolate from what we know about what suffices for human consciousness. Let’s grant that we know from experiments on humans that you will be conscious of a visual perception if you have a neural system for broadcasting information to multiple sub-systems in the brain (this is the Global Workspace Theory mentioned above), and that the visual perception is broadcast. Great: now we know that this sophisticated human Global Workspace suffices for consciousness. But how much of it is necessary? How much simpler could the Global Workspace be and still result in consciousness?
When we try to take a theory of consciousness “off the shelf” and apply it to animals, we face a choice of how strict to be. We could say that the Global Workspace must be as complicated as the human case. Then no animals count as conscious. We could say that the Global Workspace can be very simple. Then maybe even simple programs count as conscious. To know how strict or liberal to be in applying the theory, we need to know what animals are conscious. Which is the very question!
Some people try to get around this by proposing tests for consciousness that avoid the need for theory—the Turing Test would be an example of this in the AI case. But these usually end up sneaking theory in the backdoor.
Here’s Birch’s proposal for getting around this impasse:
1. Make a minimal theoretical assumption about consciousness, the ‘facilitation hypothesis’:

Phenomenally conscious perception, relative to unconscious perception, facilitates a “cluster” of cognitive abilities.

It’s a cluster because it seems like “the abilities will come and go together, co-varying in a way that depends on whether or not a stimulus is consciously perceived” (8). Empirically we have evidence that some abilities in the cluster include trace conditioning, rapid reversal learning, and cross-modal learning.

2. Look for these clusters of abilities in animals.

3. See if things which are able to make perceptions unconscious in humans, like flashing them quickly, seem to ‘knock out’ that cluster in animals. If we can make the clusters come and go like this, it’s a pretty reasonable inference that the cause is consciousness coming and going.
As I understand it, Birch (a philosopher) is currently working with scientists to flash stuff at bees and so forth. I think Birch’s research proposal is a great conceptual advance, I find the empirical research itself very exciting, and I’m curious to see what comes out of it.
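To illustrate the shape of the inference in step 3, here is a toy Python sketch. Everything in it is my own illustration, assuming made-up ability scores and an arbitrary degradation threshold; the `cluster_knocked_out` helper and the numbers are hypothetical, not anything from Birch’s paper:

```python
# Toy model of step 3: check whether a whole cluster of abilities
# co-varies with a masking manipulation (e.g., flashing stimuli quickly).
# All names, scores, and thresholds below are made up for illustration.

CLUSTER = ["trace_conditioning", "rapid_reversal_learning", "cross_modal_learning"]

def cluster_knocked_out(unmasked, masked, drop_threshold=0.5):
    """Return True if every ability in the cluster degrades together under masking.

    `unmasked` and `masked` map ability name -> performance score in [0, 1].
    On the facilitation hypothesis, it is the co-variation of the whole
    cluster, not any single ability, that carries the evidential weight.
    """
    return all(masked[a] < unmasked[a] * drop_threshold for a in CLUSTER)

# Hypothetical numbers for a bee-like subject:
unmasked_scores = {"trace_conditioning": 0.8,
                   "rapid_reversal_learning": 0.7,
                   "cross_modal_learning": 0.6}
masked_scores = {"trace_conditioning": 0.2,
                 "rapid_reversal_learning": 0.1,
                 "cross_modal_learning": 0.2}

if cluster_knocked_out(unmasked_scores, masked_scores):
    print("Cluster comes and goes with masking: some evidence that "
          "conscious perception is being switched on and off.")
```

A real study would of course involve careful behavioral measures and statistics rather than a single threshold; the point is just that the evidence comes from the whole cluster rising and falling together.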
I suggest that “why I don’t trust pseudonymous forecasters” would be a more appropriate title. When I saw the title, I expected an argument that would apply to all/most forecasting, but this worry is only about a particular subset.
This is very interesting! I’m excited to see connections drawn between AI safety and the law / philosophy of law. It seems there are a lot of fruitful insights to be had.
You write,
The rules of Evidence have evolved over long experience with high-stakes debates, so their substantive findings on the types of arguments that prove problematic for truth-seeking are relevant to Debate.
Can you elaborate a bit on this?
I don’t know anything about the history of these rules about evidence. But why think that over this history, these rules have trended towards truth-seeking per se? I wouldn’t be surprised if the rules have evolved to better serve the purposes of the legal system over time, but presumably the relationship between this end and truth-seeking is quite complex. Also, people changing the rules could be mistaken about what sorts of evidence do in fact tend to lead to wrong decisions.
I think all of this is compatible with your claim. But I’d like to hear more!
Feedback: I find the logo mildly unsettling. I think it triggers my face detector, and I see sharp teeth. A bit like the Radiohead logo.
On the other hand, maybe this is just a sign of some deep unwellness in my brain. Still, if even a small percentage of people get this feeling from the logo, could be worth reconsidering.
Just wanted to say that I really appreciated this post. As someone who followed the campaign with interest, but not super closely, I found it very informative about the campaign. And it covered all of the key questions I have been vaguely wondering about re: EAs running for office.