I agree that the text an LLM outputs shouldn’t be thought of as communicating with the LLM “behind the mask” itself.
But I don’t agree that it’s impossible in principle to say anything about the welfare of a sentient AI. Could we not develop some guesses about AI welfare by getting a much better understanding of animal welfare? (For example, we might learn much more about when brains are suffering, and this could be suggestive of what to look for in artificial neural nets.)
It’s also not completely clear to me what the relationship is between the sentient being “behind the mask” and the “role-played character”, especially if we imagine conscious, situationally-aware future models. Right now, it’s certainly useful to see the text output by an LLM as simulating a character, which has nothing to do with the reality of the LLM itself, but could that be related to the LLM not being conscious of itself? I feel confused.
Also, even if it were impossible in principle to evaluate the welfare of a sentient AI, you might still want to act differently in some circumstances:
Some ethical views see creating suffering as worse than creating the same amount of pleasure.
Empirically, in animals, it seems to me that the total amount of suffering is probably more than the total amount of pleasure. So we might worry that this could also be the case for ML models.
I should not have said it’s in principle impossible to say anything about the welfare of LLMs, since that is too strong a statement. Still, we are very far from being able to say such a thing; our understanding of animal welfare is laughably bad, and animal brains don’t look anything like the neural networks of LLMs. Maybe there would be something to say in 100 years (or post-singularity, whichever comes first), but there’s nothing interesting to say in the near future.
Empirically, in animals, it seems to me that the total amount of suffering is probably more than the total amount of pleasure. So we might worry that this could also be the case for ML models.
This is a weird EA-only intuition that is not really shared by the rest of the world, and I worry about whether cultural forces (or “groupthink”) are involved in this conclusion. I don’t know whether the total amount of suffering is more than the total amount of pleasure, but it is worth noting that the revealed preference of living things is nearly always to live. The suffering is immense, but so is the joy; EAs sometimes sound depressed to me when they say most life is not worth living.
To extrapolate from the dubious “most life is not worth living” to “LLMs’ experience is also net bad” strikes me as an extremely depressed mentality, and one that reminds me of Tomasik’s “let’s destroy the universe” conclusion. I concede that logically this could be correct! I just think the evidence is so weak that it says more about the speaker than about LLMs.
I agree the notion that wild animals suffer is primarily an EA notion and considered weird by most other people. But I think most people find it weird to even examine the question at all, rather than thinking that wild animals have overall joyful lives, so I don’t think this is evidence that EAs are wrong about the bottom line. (It’s mild evidence that EAs are wrong to consider the issue, but I just feel like the argument for the inside view is quite strong, and people’s reasons for taking a different view seem quite transparently bad.)
I reject the “depression” characterisation, because I don’t think my life is overall unpleasant. It’s just that I think the goodness of my life rests significantly on a lot of things that I have that most animals don’t, mainly reliable access to food, shelter, and sleep, and protection from physical harm. I would be happy to live in a world where most sentient beings had a life like mine, but I don’t.
(I’m not sure what to extrapolate about LLMs.)
the revealed preference of living things is nearly always to live
That’s because almost no living things have the ability to conceive of, or execute on, alternative options.
Consider a hypothetical squirrel whose life is definitely not worth living (say, they are subjected to torture daily). Would you expect this squirrel to commit suicide?
I don’t know—it’s a good question! It probably depends on the suicide method available. I think if you give the squirrel some dangerous option to escape the torture, like “swim across this lake” or “run past a predator”, it’d probably try to take it, even with a low chance of success and high chance of death. I’m not sure, though.
You do see distressed animals engaging in self-destructive behavior, like birds plucking out their own feathers. (Birds in the wild tend not to do this, hence presumably they are not sufficiently distressed.)
Yeah, I agree that many animals can & will make tradeoffs where there’s a chance of death, even a high chance (though I’m not confident they’d be aware that what they’re doing carries some chance of death; I’m not sure many animals have a mental concept of death similar to ours. Some might, but it’s definitely not a given).
I also agree that animals engage in self-destructive behaviours, e.g. feather pulling, chewing/biting, pacing, refusing food when sick, eating things that are bad for them, excessive licking at wounds, pulling on limbs when stuck, etc.
I’m just not sure that any of these are undertaken with the purpose or intent of ending their own life, even when they have that effect. I’d guess it’s quite hard to grasp “I’d be better off dead”, because that requires a concept of death and of no longer being conscious, plus the ability to reason causally from taking a particular action to your eventual death.
To be clear, I’ve not done any research here on animal suicide & concepts of death, & I’m not all that confident, but I overall think the lack of mass animal suicides is at best extremely weak evidence that animal lives are mostly worth living.