It seems reasonable to guess that modern language models aren’t conscious in any morally relevant sense. But it seems odd to use that as the basis for a reductio of arguments about consciousness, given that we know nothing about the consciousness of language models.
Put differently: if a line of reasoning would suggest that language models are conscious, then I feel like the main update should be about consciousness of language models rather than about the validity of the line of reasoning. If you think that e.g. fish are conscious based on analysis of their behavior rather than evolutionary analogies with humans, then I think you should apply the same reasoning to ML systems.
I don’t think it’s plausible that biological brains are necessary for consciousness. It seems extremely likely to me that a big neural network can in principle be conscious without any special biological bells and whistles, and it seems clear that SGD could find conscious models.
I don’t think the fact that language models say untrue things shows they have no representation of the world (in fact, for a pre-trained model that would be a clearly absurd inference—they are trained to predict what someone else would say and then sample from that distribution, which will of course lead to confidently saying false things when the predicted speaker can know things the model does not!).
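To make that concrete, here is a minimal sketch (assuming the Hugging Face `transformers` library and public GPT-2 weights, neither of which is part of the original point, and an illustrative prompt of my choosing): a pre-trained model just samples a continuation from its distribution over what the predicted speaker would plausibly say next, so it will confidently complete a first-person claim whose truth only that speaker could know.

```python
# Minimal sketch: sampling from a pre-trained LM's predictive distribution.
# The model imitates whoever it predicts is speaking; it does not assert beliefs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Only the (imagined) speaker could know the true answer here, yet the model
# will confidently name some country or capital, because that is what the
# predicted speaker would do.
prompt = "The capital of the country I was born in is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    output = model.generate(
        **inputs,
        do_sample=True,            # sample from the distribution, not argmax
        top_p=0.95,
        max_new_tokens=12,
        pad_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```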
That all said, I think it’s worth noting and emphasizing that existing language models’ statements about their own consciousness are not evidence that they are conscious, and more generally that the relationship between a language model’s inner life and its utterances is completely unlike the relationship between a human’s inner life and their utterances (because they are trained to produce these utterances by mimicking humans, and they would make similar utterances regardless of whether they are conscious). A careful analysis of how models generalize out of distribution, or of surprisingly high accuracy on some kinds of prediction tasks, could provide evidence of consciousness, but we don’t have that kind of evidence right now.
Thanks for this response. It seems like we are coming at this topic from very different starting assumptions. If I’m understanding you correctly, you’re saying that we have no idea whether LLMs are conscious, so it doesn’t make sense to draw any inferences from them to other minds.
That’s fair enough, but I’m starting from the premise that LLMs in their current form are almost certainly not conscious. Of course, I can’t prove this. It’s my belief based on my understanding of their architecture. I’m very much not saying they lack consciousness because they aren’t instantiated in a biological brain. Rather, I don’t think that GPUs performing parallel searches through a probabilistic word space by themselves are likely to support consciousness.
Stepping back a bit: I can’t know if any animal other than myself is conscious, even fellow humans. I can only reason through induction that consciousness is a feature of my brain, so other animals that have brains similar in construction to mine may also have consciousness. And I can use the observed output of those brains—behavior—as an external proxy for internal function. This makes me highly confident that, for example, primates are conscious, with my uncertainty growing with greater evolutionary distance.
Now along come LLMs to throw a wrench in that inductive chain. LLMs are—in my view—zombies that can do things previously only humans were capable of. And the truth is, a mosquito’s brain doesn’t really have all that much in common with a human’s. So now I’m even more uncertain—is complex behavior really a sign of interiority? Does having a brain made of neurons really put lower animals on a continuum with humans? I’m not sure anymore.
“Rather, I don’t think that GPUs performing parallel searches through a probabilistic word space by themselves are likely to support consciousness.”
This seems like the crux. It feels like a big neural network run on a GPU, trained to predict the next word, could definitely be conscious. So to me this is just a question about the particular weights of large language models, not something that can be established a priori based on architecture.