If AIs are conscious, then they likely deserve moral consideration
AIs could have negligible welfare (in expectation) even if they are conscious. They may not be sentient even if they are conscious, or have negligible welfare even if they are sentient. I would say the (expected) total welfare of a group (individual welfare times population) matters much more for its moral consideration than the probability of consciousness of its individuals. Do you have any plans to compare the individual (expected hedonistic) welfare of AIs, animals, and humans? You do not mention this in the section "What's next".
The choice of a prior is often somewhat arbitrary and intended to reflect a state of ignorance about the details of the system. The final (posterior) probability the model generates can vary significantly depending on what we choose for the prior. Therefore, unless we are confident in our choices of priors, we shouldn't be confident in the final probabilities.
Do you have any ideas for how to decide on the priors for the probability of sentience? I agree that decisions about priors are often very arbitrary, and I worry that different choices will have significantly different implications.
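To illustrate why I worry about this, here is a minimal sketch (my own, not from your report) in which the same evidence, summarised as a single overall Bayes factor, is combined with different priors:

```python
# Minimal sketch (not from the report): how the posterior probability of
# consciousness varies with the prior, holding the evidence fixed.
# The evidence is summarised as a single overall Bayes factor, which is an
# assumption for illustration; the report's model is more structured than this.

def posterior(prior: float, bayes_factor: float) -> float:
    """Update a prior probability by a Bayes factor via the odds form of Bayes' rule."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1 + posterior_odds)

bayes_factor = 10  # evidence favouring consciousness 10:1 (illustrative)
for prior in [0.001, 0.01, 1/6, 0.5]:
    print(f"prior = {prior:.3f} -> posterior = {posterior(prior, bayes_factor):.3f}")

# prior = 0.001 -> posterior = 0.010
# prior = 0.010 -> posterior = 0.092
# prior = 0.167 -> posterior = 0.667
# prior = 0.500 -> posterior = 0.909
```

With the same Bayes factor of 10, the posterior ranges from about 1 % to about 91 % depending on the prior, which is why I worry about how the priors are chosen.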
[...] We report what each perspective concludes, then combine these conclusions based on how credible experts find each perspective.
[...]
Which theory of consciousness is right matters a lot. Because different stances give strikingly different judgments about the probability of LLM consciousness, significant changes in the weights given to stances will yield significant differences in the results of the Digital Consciousness Model. [...]
I like that you report the results for each perspective. People usually give weights of at least 0.1/"number of models", which are not much smaller than the uniform weight of 1/"number of models", but this can easily lead to huge mistakes. As a silly example, if I asked random 7-year-olds whether the gravitational force between 2 objects is proportional to "distance"^-2 (the correct answer), "distance"^-20, or "distance"^-200, I imagine a significant fraction would pick the exponents of -20 and -200. Assuming 60 % picked -2, 20 % picked -20, and 20 % picked -200, a respondent may naively conclude the mean exponent of -45.2 (= 0.6*(-2) + 0.2*(-20) + 0.2*(-200)) is reasonable. Alternatively, a respondent may naively conclude an exponent of -9.19 (= 0.933*(-2) + 0.0333*(-20) + 0.0333*(-200)) is reasonable, giving a weight of 3.33 % (= 0.1/3) to each of the 2 wrong exponents, equal to 10 % of the uniform weight, and the remaining weight of 93.3 % (= 1 - 2*0.0333) to the correct exponent. Yet there is lots of empirical evidence against the exponents of -45.2 and -9.19 which the respondents are not aware of. The right conclusion would be that the respondents have no idea about the right exponent, or how to weight the various models, because they would not be able to adequately justify their picks.

This is also why I am sceptical that the absolute value of the welfare per unit time of animals is bound to be relatively close to that of humans, as one may naively infer from the welfare ranges Rethink Priorities (RP) initially presented, or the ones in Bob Fischer's book about comparing welfare across species, where there seems to be only 1 line about the weights: "We assigned 30 percent credence to the neurophysiological model, 10 percent to the equality model, and 60 percent to the simple additive model".
Mistakes like the one illustrated above happen when the weights of models are guessed independently of their outputs. People are often sensitive to astronomical outputs, but not to the astronomically low weights they imply. How do you ensure that the weights of the models used to estimate the probability of consciousness are reasonable, and sensitive to their outputs? I would model the weights as very wide distributions to represent very high model uncertainty.
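To make the sensitivity to the weights concrete, here is a minimal sketch of my gravity example above (the numbers are just the ones from my example, not anything from your model):

```python
# Toy illustration of the gravity example above (my own sketch, not a claim
# about how the Digital Consciousness Model aggregates): small weights on
# extreme models still dominate the weighted mean of their outputs.

exponents = [-2, -20, -200]  # candidate exponents; only -2 is correct

def weighted_mean(weights, values):
    return sum(w * v for w, v in zip(weights, values))

# Weights equal to the share of respondents picking each exponent.
print(weighted_mean([0.6, 0.2, 0.2], exponents))                      # about -45.2
# Weights of 10 % of the uniform weight (0.1/3) on each wrong exponent.
print(weighted_mean([1 - 2 * 0.1 / 3, 0.1 / 3, 0.1 / 3], exponents))  # about -9.2
# For the weighted mean to stay within 1 % of -2, the weight on the -200 model
# (with the rest on -2) has to be far smaller than 0.1/3:
w = (2.02 - 2) / (200 - 2)
print(w)  # about 0.0001
```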
AIs could have negligible welfare (in expectation) even if they are conscious. They may not be sentient even if they are conscious, or have negligible welfare even if they are sentient. I would say the (expected) total welfare of a group (individual welfare times population) matters much more for its moral consideration than the probability of consciousness of its individuals. Do you have any plans to compare the individual (expected hedonistic) welfare of AIs, animals, and humans? You do not mention this in the section "What's next".
This is an important caveat. While our motivation for looking at consciousness largely stems from its relation to moral status, we don't think that establishing that AIs are conscious would entail that they have significant states that count strongly one way or the other for our treatment of them, and establishing that they aren't conscious wouldn't entail that we should feel free to treat them however we like.
We think that estimates of consciousness still play an important practical role. Work on AI consciousness may help us to achieve consensus on reasonable precautionary measures and motivate future research directions with a more direct upshot. I don't think the results of this model can be directly plugged into any kind of BOTEC, and they should be treated with care.
Do you have any ideas for how to decide on the priors for the probability of sentience? I agree that decisions about priors are often very arbitrary, and I worry that different choices will have significantly different implications.
We favored a 1/6 prior for consciousness relative to every stance, and we chose that fairly early in the process. To some extent, you can check the prior against what you update to on the basis of your evidence. Given an assignment of evidence strength and an opinion about what it should say about something that satisfies all of the indicators, you can backwards infer the prior needed to update to the right posterior. That prior is basically implicit in your choices about evidential strength. We didn't explicitly set our prior this way, but we would probably have reconsidered our choice of 1/6 if it was giving really implausible results for humans, chickens, and ELIZA across the board.
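To make that back-inference concrete, here is a minimal sketch, assuming the total evidential strength is collapsed into a single Bayes factor (the model itself is more structured than this, and the numbers below are purely illustrative):

```python
# Minimal sketch of the back-inference described above, assuming the total
# strength of the evidence is summarised as a single Bayes factor (the
# actual model decomposes the evidence across indicators and stances).

def implied_prior(target_posterior: float, bayes_factor: float) -> float:
    """Prior needed so that updating by `bayes_factor` lands on `target_posterior`."""
    posterior_odds = target_posterior / (1 - target_posterior)
    prior_odds = posterior_odds / bayes_factor
    return prior_odds / (1 + prior_odds)

# Example: if satisfying all of the indicators should leave you at roughly 0.9,
# and you judge that evidence to be worth a Bayes factor of about 45,
# the implicit prior is close to 1/6.
print(implied_prior(0.9, 45))  # about 0.167
```

Under those illustrative numbers, a 1/6 prior and that evidence assignment land at about 0.9 for something satisfying all of the indicators; if the corresponding results had looked badly off for humans, chickens, or ELIZA, that would have been a reason to revisit the prior (or the evidence assignments).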
The right conclusion would be that the respondents have no idea about the right exponent, or how to weight the various models, because they would not be able to adequately justify their picks.
There is a tension here between producing probabilities we think are right and producing probabilities which could reasonably act as a consensus conclusion. I have my own favorite stance, and I think I have good reason for it, but I didn't try to convince anyone to give it more weight in our aggregation. Insofar as we're aiming in the direction of something that could achieve broad agreement, we don't want to give too much weight to our own views (even if we think we're right). Unfortunately, among people with significant expertise in this area, there is broad and fairly fundamental disagreement. We think that it is still valuable to shoot for consensus, even if that means everyone will think it is flawed (by giving too much weight to different stances).
Thanks for this work. I find it valuable.