AIs could have negligible welfare (in expectation) even if they are conscious. They may not be sentient even if they are conscious, or have negligible welfare even if they are sentient. I would say the (expected) total welfare of a group (individual welfare times population) matters much more for its moral consideration than the probability of consciousness of its individuals. Do you have any plans to compare the individual (expected hedonistic) welfare of AIs, animals, and humans? You do not mention this in the section “What’s next”.
This is an important caveat. While our motivation for looking at consciousness comes largely from its relation to moral status, we don’t think that establishing that AIs are conscious would entail that they have significant states that count strongly one way or the other for our treatment of them, and establishing that they aren’t conscious wouldn’t entail that we should feel free to treat them however we like.
We think that estimates of consciousness still play an important practical role. Work on AI consciousness may help us to achieve consensus on reasonable precautionary measures and motivate future research directions with a more direct upshot. That said, I don’t think the results of this model can be directly plugged into any kind of BOTEC, and they should be treated with care.
Do you have any ideas for how to decide on the priors for the probability of sentience? I agree decisions about priors are often very arbitrary, and I worry they will have significantly different implications.
We favored a 1⁄6 prior for consciousness relative to every stance, and we chose that fairly early in the process. To some extent, you can check the prior against what you update to on the basis of your evidence. Given an assignment of evidence strength and an opinion about what it should say about something that satisfies all of the indicators, you can work backwards to infer the prior needed to arrive at the right posterior. That prior is basically implicit in your choices about evidential strength. We didn’t explicitly set our prior this way, but we would probably have reconsidered our choice of 1⁄6 if it had given really implausible results for humans, chickens, and ELIZA across the board.
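As an illustration of this back-inference (with made-up numbers, not the report's actual evidence assignments), the odds form of Bayes' rule lets you solve for the prior implied by a target posterior and an assumed overall evidence strength:

```python
# Sketch with hypothetical numbers: back out the prior implied by a target
# posterior and an assumed total evidence strength, using the odds form of
# Bayes' rule: posterior_odds = likelihood_ratio * prior_odds.

def implied_prior(posterior: float, likelihood_ratio: float) -> float:
    """Prior probability that updates to `posterior` given `likelihood_ratio`."""
    posterior_odds = posterior / (1 - posterior)
    prior_odds = posterior_odds / likelihood_ratio
    return prior_odds / (1 + prior_odds)

# E.g. if satisfying all indicators should yield a ~0.9 posterior and the
# combined evidence is assumed to carry a likelihood ratio of ~45, the
# implied prior is ~1/6:
print(round(implied_prior(0.9, 45), 3))  # → 0.167
```

Both the 0.9 target and the likelihood ratio of 45 are placeholders; the point is only that any (evidence strength, desired posterior) pair pins down a prior you can sanity-check.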
The right conclusion would be that the respondents have no idea about the right exponent, or how to weight the various models, because they would not be able to adequately justify their picks.
There is a tension here between producing probabilities we think are right and producing probabilities which could reasonably act as a consensus conclusion. I have my own favorite stance, and I think I have good reason for it, but I didn’t try to convince anyone to give it more weight in our aggregation. Insofar as we’re aiming in the direction of something that could achieve broad agreement, we don’t want to give too much weight to our own views (even if we think we’re right). Unfortunately, among people with significant expertise in this area, there is broad and fairly fundamental disagreement. We think that it is still valuable to shoot for consensus, even if that means everyone will think it is flawed (by giving too much weight to stances other than their own).
I have my own favorite stance, and I think I have good reason for it, but I didn’t try to convince anyone to give it more weight in our aggregation. Insofar as we’re aiming in the direction of something that could achieve broad agreement, we don’t want to give too much weight to our own views (even if we think we’re right).
Thanks, Derek. To clarify, I do not have a view about which models should get more weight. I just think that, when results differ a lot across models, the top priority should be further research to decrease the uncertainty, instead of acting based on a consensus view represented by best guesses for the weights of the models.
I would model the weights of the models as very wide distributions to represent very high model uncertainty.
In particular, I would model the weights of the stances as distributions instead of point estimates. As you note in the report, there was a lot of variation across the 13 experts you surveyed.
I wonder what exactly you asked the experts. I think the above would underestimate uncertainty if you just asked them to rate plausibility from 0 to 10, and there were experts reporting 0. Have you considered having a range of possible responses on a logarithmic scale, ranging from a weight/probability of e.g. 10^-6 to 1?
Thanks, Vasco. And thanks for helping us think through what we can do better. Some thoughts on this:
We considered several framings, scales, and options to give experts. Since they were evaluating a lot of stances and we wanted experts to really know what we meant, we prioritised giving them context and then asking the simplified general question of plausibility, with an intuitive scale. The exact question was ‘how plausible do you find X stance?’, asked just after fully describing X. We also asked them for general notes and comments, and they didn’t seem to find that part of the survey particularly confusing (perhaps to your and my surprise). More broadly, I agree with you that sometimes perfectly defining terms and scales can help some people think through it but not everyone, and the science on how much it helps is mixed.
We didn’t find that people were responding with zero plausibility very much at all. As you can see from the results, almost all respondents found most, if not all, stances at least a little bit plausible. I agree that had we found a lot of concentration around the very high or very low plausibility, having some sort of logarithmic scale could help distinguish results.
I’m not sure what you have in mind in terms of modelling the stances’ weights as distributions instead of point estimates. Perhaps you mean something like leveraging those distributions via some sort of Monte Carlo, where weights are drawn from these distributions and the process is repeated many times, then aggregated. That indeed sounds more sophisticated and could help track uncertainty, but I suspect it would make very little difference. In particular, I think so because we observed that unweighted pooling of results across all stances is surprisingly similar to the pool weighted by experts; they are the same if you squint.
We didn’t find that people were responding with zero plausibility very much at all.
Thanks for clarifying, Arvo. I wonder how people decided between a plausibility of 0⁄10 and 1⁄10. It could be that people picked 0 for a plausibility lower than 0.5/10, or that they interpreted it as almost impossible, and therefore sometimes picked 1⁄10 even for a plausibility lower than 0.5/10. A logarithmic scale would allow experts to specify plausibilities much lower than 1⁄10 (e.g. 10^-6/10) without having to pick 0, although I do not know whether they would actually pick such values.
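One way such a scale could be implemented (a hypothetical sketch, not something from the actual survey) is to map integer slider positions to powers of ten, so an expert can express very low plausibilities without choosing an exact zero:

```python
# Hypothetical log-scale response mapping: slider positions 0..6 map to
# plausibilities 10**-6 .. 10**0, so respondents can report values far
# below 1/10 without having to pick 0.
def slider_to_plausibility(position: int, n_steps: int = 6) -> float:
    """Map an integer slider position (0 = least plausible) to 10**(position - n_steps)."""
    return 10.0 ** (position - n_steps)

print([slider_to_plausibility(p) for p in range(7)])
# → [1e-06, 1e-05, 0.0001, 0.001, 0.01, 0.1, 1.0]
```

The 10^-6 floor and the number of steps are arbitrary choices; the design question is just how much resolution the low end of the scale should have.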
I’m not sure what you have in mind in terms of modelling the stances’ weights as distributions instead of point estimates. Perhaps you mean something like leveraging those distributions via some sort of Monte Carlo, where weights are drawn from these distributions and the process is repeated many times, then aggregated.
Yes, this is what I had in mind. Denoting by W_i and P_i the distributions for the weight and probability of consciousness for stance i, I would calculate the final distribution for the probability of consciousness from (W_1*P_1 + W_2*P_2 + … + W_13*P_13)/(W_1 + W_2 + … + W_13).
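A minimal Monte Carlo sketch of this aggregation, with purely hypothetical weight and probability distributions standing in for the survey-derived ones, could look like:

```python
# Sketch of the proposed Monte Carlo aggregation (all distributions here are
# hypothetical placeholders): on each draw, sample stance weights W_i and
# per-stance probabilities of consciousness P_i, then form the weighted
# average (sum_i W_i * P_i) / (sum_i W_i).
import numpy as np

rng = np.random.default_rng(0)
n_stances, n_draws = 13, 100_000

# Hypothetical wide weight distributions (log-uniform over [1e-3, 1]) and
# hypothetical Beta distributions for each stance's probability of consciousness.
W = np.exp(rng.uniform(np.log(1e-3), np.log(1.0), size=(n_draws, n_stances)))
P = rng.beta(2, 5, size=(n_draws, n_stances))

# Each entry of `mixture` is one draw of the final probability of consciousness.
mixture = (W * P).sum(axis=1) / W.sum(axis=1)

print(mixture.mean(), np.quantile(mixture, [0.05, 0.95]))
```

With independent weights, the mean of `mixture` stays close to the unweighted pool, but the 5th–95th percentile interval shows the extra spread that point-estimate weights would hide.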
That indeed sounds more sophisticated and could help track uncertainty, but I suspect it would make very little difference. In particular, I think so because we observed that unweighted pooling of results across all stances is surprisingly similar to the pool weighted by experts; they are the same if you squint.
I think the mean of the final distribution for the probability of consciousness would be very similar. However, the final distribution would be more spread out. I do not know how much more spread out it would be, but I agree it would help track uncertainty better.