Research Fellow at the Center for AI Safety
rgb
How to make independent research more fun (80k After Hours)
Hi Brian! Thanks for your reply. I think you’re quite right to distinguish between your flavor of panpsychism and the flavor I was saying doesn’t entail much about LLMs. I’m going to update my comment above to make that clearer, and sorry for running together your view with those others.
80k podcast episode on sentience in AI systems
Ah, thanks! Well, even if it wasn’t appropriately directed at your claim, I appreciate the opportunity to rant about how panpsychism (and related views) don’t entail AI sentience :)
The Brian Tomasik post you link to considers the view that fundamental physical operations may have moral weight (call this view “Physics Sentience”).
[Edit: see Tomasik’s comment below. What I say below is true of a different sort of Physics Sentience view like constitutive micropsychism, but not necessarily of Brian’s own view, which has somewhat different motivations and implications]
But even if true, [many versions of] Physics Sentience [though not necessarily Tomasik's] don't have straightforward implications about which high-level systems, like organisms and AI systems, also constitute a sentient subject of experience. Consider: on Physics Sentience, a human being touching a stove is experiencing pain; but a pan touching a stove is not experiencing pain. On Physics Sentience, the pan is made up of sentient matter, but this doesn't mean that the pan qua pan is also a moral patient, another subject of experience that will suffer if it touches the stove.
To apply this to the LLM case:

1. Physics Sentience will hold that the hardware on which LLMs run is sentient—after all, it's a bunch of fundamental physical operations.
2. But Physics Sentience will also hold that the hardware on which a giant lookup table runs is sentient, to the same extent and for the same reason.
3. Physics Sentience is silent on whether there's a difference between (1) and (2), in the way that there's a difference between the human and the pan.
The same thing holds for other panpsychist views of consciousness, fwiw. Panpsychist views that hold that fundamental matter is conscious don't tell us anything, by themselves, about which animals or AI systems are sentient. They just say that those systems are made of conscious (or proto-conscious) matter.
I like it! I think one thing the post itself could have been clearer on is that reports could be indirect evidence for sentience, in that they are evidence of certain capabilities that are themselves evidence of sentience. To give an example (though it's still abstract): the ability of LLMs to fluently mimic human speech → evidence for capability C → evidence for sentience. You can imagine the same thing for parrots: the ability to say "I'm in pain" → evidence of learning and memory → evidence of sentience. But what they aren't is reports of sentience.
So maybe say at the beginning: they aren't "strong evidence" or "straightforward evidence"
Thanks for the comment. A couple replies:
I want to clarify that these are examples of self-reports about consciousness and not evidence of consciousness in humans.
Self-report is evidence of consciousness in the Bayesian sense (and in common parlance): in a wide range of scenarios, if a human says they are conscious of something, you should have a higher credence than if they do not say they are. And in the scientific sense: it's commonly and appropriately taken as evidence in scientific practice; here is Chalmers's "How Can We Construct a Science of Consciousness?" on the practice of using self-reports to gather data about people's conscious experiences:
Of course our access to this data depends on our making certain assumptions: in particular, the assumption that other subjects really are having conscious experiences, and that by and large their verbal reports reflect these conscious experiences. We cannot directly test this assumption; instead, it serves as a sort of background assumption for research in the field. But this situation is present throughout other areas of science. When physicists use perception to gather information about the external world, for example, they rely on the assumption that the external world exists, and that perception reflects the state of the external world. They cannot directly test this assumption; instead, it serves as a sort of background assumption for the whole field. Still, it seems a reasonable assumption to make, and it makes the science of physics possible. The same goes for our assumptions about the conscious experiences and verbal reports of others. These seem to be reasonable assumptions to make, and they make the science of consciousness possible.
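Fwiw, the Bayesian point can be made concrete with toy numbers (all hypothetical, chosen only to illustrate the direction of the update): as long as a self-report is more likely given consciousness than given its absence, the report raises your credence.

```python
# Toy Bayesian update: does a self-report raise credence in consciousness?
# All probabilities here are hypothetical illustrations, not empirical estimates.

prior = 0.5                  # P(conscious), before hearing any report
p_report_if_conscious = 0.9  # P(report | conscious)
p_report_if_not = 0.1        # P(report | not conscious)

# Bayes' rule: P(conscious | report) = P(report | conscious) * P(conscious) / P(report)
evidence = prior * p_report_if_conscious + (1 - prior) * p_report_if_not
posterior = prior * p_report_if_conscious / evidence

print(round(posterior, 2))  # → 0.9, i.e. the report raised credence above the prior
```

The direction of the update depends only on the likelihood ratio, which is why the claim is compatible with self-report being weaker evidence in unusual cases (e.g. systems trained to mimic such reports).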
I suppose it's true that self-reports can't budge someone from the hypothesis that other actual people are p-zombies, but few people (if any) think that. From the SEP:
Few people, if any, think zombies actually exist. But many hold that they are at least conceivable, and some that they are possible....The usual assumption is that none of us is actually a zombie, and that zombies cannot exist in our world. The central question, however, is not whether zombies can exist in our world, but whether they, or a whole zombie world (which is sometimes a more appropriate idea to work with), are possible in some broader sense.
So yeah: my take is that no one, including anti-physicalists who discuss p-zombies like Chalmers, really thinks that we can’t use self-report as evidence, and correctly so.
Agree, that’s a great pointer! For those interested, here is the paper and here is the podcast episode.
[Edited to add a nit-pick: the term ‘meta-consciousness’ is not used, it’s the ‘meta-problem of consciousness’, which is the problem of explaining why people think and talk the way they do about consciousness]
Thank you!
What to think when a language model tells you it’s sentient
I enjoyed this excerpt and the pointer to the interview, thanks. It might be helpful to say in the post who Jim Davies is.
That may be right—an alternative would be to taboo the word in the post, and just explain that they are going to use people with an independent, objective track record of being good at reasoning under uncertainty.
Of course, some people might be (wrongly, imo) skeptical of even that notion, but I suppose there's only so much one can do to get everyone on board. It's a tricky balance of making it accessible to outsiders while still just saying what you believe about how the contest should work.
I think that the post should explain briefly, or even just link to, what a “superforecaster” is. And if possible explain how and why this serves an independent check.
The superforecaster panel is imo a credible signal of good faith, but people outside of the community may think “superforecasters” just means something arbitrary and/or weird and/or made up by FTX.
(The post links to Tetlock’s book, but not in the context of explaining the panel)
I think you mean “schisms”
You write,
Those who do see philosophical zombies as possible don’t have a clear idea of how consciousness relates to the brain, but they do think...that consciousness is something more than just the functions of the brain. In their view, a digital person (an uploaded human mind which runs on software) may act like a conscious human, and even tell you all about its ‘conscious experience’, but it is possible that it is in fact empty of experience.
It’s consistent to think that p-zombies are possible but to think that, given the laws of nature, digital people would be conscious. David Chalmers is someone who argues for both views.
It might be useful to clarify that the question of
(a) whether philosophical zombies are metaphysically possible (and the closely related question of physicalism about consciousness)
is actually somewhat orthogonal to the question of
(b) whether uploads that are functionally isomorphic to humans would be conscious
David Chalmers thinks that philosophical zombies are metaphysically possible, and that consciousness is not identical to the physical. But he also argues that, given the laws of nature in this world, uploaded minds, of sufficiently fine-grained functional equivalence to human minds, that act and talk like conscious humans would be conscious. In fact, he’s the originator of the ‘fading qualia’ argument that Holden appeals to in his post.
On the other side, Ned Block thinks that zombies are not possible, and is a physicalist. But he also thinks that only biologically instantiated minds can be conscious.
Here’s Chalmers (2010) on the distinction between the two issues:
I have occasionally encountered puzzlement that someone with my own property dualist views (or even that someone who thinks that there is a significant hard problem of consciousness) should be sympathetic to machine consciousness. But the question of whether the physical correlates of consciousness are biological or functional is largely orthogonal to the question of whether consciousness is identical to or distinct from its physical correlates. It is hard to see why the view that consciousness is restricted to creatures with our biology should be more in the spirit of property dualism! In any case, much of what follows is neutral on questions about materialism and dualism.
You might be interested in this LessWrong shortform post by Harri Besceli, “The best and worst experiences you had last week probably happened when you were dreaming.” Including a comment from gwern.
Thanks for the post! Wanted to flag a typo: “ To easily adapt to performing complex and difficult math problems, Minerva has That’s not to say that Minerva is an AGI—it clearly isn’t.”
Well, I looked it up and found a free pdf, and it turns out that Searle does consider this counterargument.
Why is it so important that the system be capable of consciousness? Why isn’t appropriate behavior enough? Of course for many purposes it is enough. If the computer can fly airplanes, drive cars, and win at chess, who cares if it is totally nonconscious? But if we are worried about a maliciously motivated superintelligence destroying us, then it is important that the malicious motivation should be real. Without consciousness, there is no possibility of its being real.
But I find the arguments that he then gives in support of this claim quite unconvincing; indeed, it's not clear to me exactly what the argument is. Notice that Searle's argument is based on comparing a spell-checking program on a laptop with human cognition. He claims that reflecting on the difference between the human and the program establishes that it would never make sense to attribute psychological states to any computational system at all. But that comparison doesn't seem to show that at all.
And it certainly doesn’t show, as Searle thinks it does, that computers could never have the “motivation” to pursue misaligned goals, in the sense that Bostrom needs to establish that powerful AGI could be dangerous.
I should say—while Searle is not my favorite writer on these topics, I think these sorts of questions at the intersection of phil mind and AI are quite important and interesting, and it's cool that you are thinking about them. (Then again, I *would* think that given my background). And it's important to scrutinize the philosophical assumptions (if any) behind AI risk arguments.
Feedback: I find the logo mildly unsettling. I think it triggers my face detector, and I see sharp teeth. A bit like the Radiohead logo.
On the other hand, maybe this is just a sign of some deep unwellness in my brain. Still, if even a small percentage of people get this feeling from the logo, it could be worth reconsidering.
Hi Timothy! I agree with your main claim that “assumptions [about sentience] are often dubious as they are based on intuitions that might not necessarily ‘track’ sentience”, shaped as they are by potentially unreliable evolutionary and cultural factors. I also think it’s a very important point! I commend you for laying it out in a detailed way.
I’d like to offer a piece of constructive criticism if I may. I’d add more to the piece that answers, for the reader:
what kind of piece am I reading? What is going to happen in it?
why should I care about the central points? (as indicated, I think there are many reasons to care, and could name quite a few myself)
how does this piece relate to what other people say about this topic?
While getting ‘right to the point’ is a virtue, I feel like more framing and intro would make this piece more readable, and help prospective readers decide if it’s for them.
[meta-note: if other readers disagree, please do of course vote ‘disagree’ on this comment!]