Assuming that future agents are mostly indifferent towards the welfare of their “tools”, their actions would affect powerless beings only via (in expectation random) side-effects. It is thus relevant to know the “default” level of welfare of powerless beings.
By “in expectation random”, do you mean 0 in expectation? I think there are reasons to expect the effect to be negative (individually), based on our treatment of nonhuman animals. Our indifference to chicken welfare has led to severe deprivation in confinement, more cannibalism in open but densely packed systems, the spread of diseases, artificial selection causing chronic pain and other health issues, and live boiling. I expect chickens’ wild counterparts (red junglefowl) to have greater expected utility individually, and plausibly positive expected utility (from a classical hedonistic perspective, although I’m not sure either way). Optimization for productivity usually seems to come at the cost of individual welfare.
Even for digital sentience designed with the capacity to suffer, and especially if we mistakenly believe them not to be sentient, we might expect their welfare to decrease as we demand more from them, regardless of our intentions and their “default” level of welfare, since there may not be enough instrumental value for us to recalibrate their affective responses or redesign them for higher welfare. The conditions in which they are used may become significantly harsher than the conditions for which they were initially designed.
It’s also very plausible that many of our digital sentiences will be designed through evolutionary/genetic algorithms or other search algorithms that optimize for some performance (“fitness”) metric, and because these approaches are so computationally expensive, we may be likely to reuse the digital sentiences, with only minor adjustments, outside of the environments for which they were optimized. This is already being done with deep neural networks today.
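To make the reuse point concrete, here is a minimal toy sketch (entirely hypothetical, not taken from the article or any real system): a simple genetic algorithm evolves agents to fit one environment, and the best agent is then scored, unchanged, in a shifted environment. Its score in the new environment is predictably worse, which is the analogue of reusing an optimized system under conditions it wasn’t optimized for.

```python
import random

# Entirely hypothetical toy model: "agents" are parameter vectors, and an
# environment is a vector of demands. Fitness is higher the closer an agent's
# parameters are to the environment's demands.
ENV_A = [0.2, 0.9, 0.5, 0.1]   # environment the agents are optimized for
ENV_B = [0.8, 0.1, 0.9, 0.7]   # shifted environment they are later reused in

def fitness(agent, env):
    # Negative squared distance: 0 is a perfect fit, more negative is worse.
    return -sum((a - e) ** 2 for a, e in zip(agent, env))

def evolve(env, pop_size=50, generations=200, mutation=0.05):
    # Minimal genetic algorithm: keep the better half, refill with mutated copies.
    pop = [[random.random() for _ in env] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda a: fitness(a, env), reverse=True)
        survivors = pop[: pop_size // 2]
        offspring = [
            [g + random.gauss(0, mutation) for g in random.choice(survivors)]
            for _ in range(pop_size - len(survivors))
        ]
        pop = survivors + offspring
    return max(pop, key=lambda a: fitness(a, env))

best = evolve(ENV_A)
print("score in ENV_A (optimized for):", round(fitness(best, ENV_A), 3))  # near 0
print("score in ENV_B (reused in):    ", round(fitness(best, ENV_B), 3))  # clearly worse
```

The low score in the second environment is only a stand-in for the welfare worry, of course; the sketch just illustrates that what optimization produces is specific to the environment it was run in.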
Similarly, we might expect more human suffering (individually) from AGI with goals orthogonal to our welfare, which is an argument against positive expected human welfare.
By “in expectation random”, do you mean 0 in expectation?
Yes, that’s what we meant.
I am not sure I understand your argument. You seem to say the following:
- Post-humans will put “sentient tools” into harsher conditions than the ones the tools were optimized for.
- If “sentient tools” are put into these conditions, their welfare decreases (compared with the situations they were optimized for).
My answer: The complete “side-effects” (in the sense of the article) on sentient tools comprise both bringing them into existence and using them. The relevant question seems to be whether this package is positive or negative compared with the counterfactual (no sentient tools). Humanity might bring sentient tools into conditions that are worse for the tools than the conditions they were optimized for, but even those conditions might still be positive overall.
Apart from that, I am not sure if the two assumptions listed as bullet points above will actually hold for the majority of “sentient tools”. I think that we know very little about the way tools will be created and used in the far future, which was one reason for assuming “zero in expectation” side-effects.
Isn’t it equally justified to assume that their welfare in the conditions they were originally optimized/designed for is 0 in expectation? If anything, it makes more sense to me to make assumptions about this setting first, since it’s easier to understand their motivations and experiences in this setting based on their value for the optimization process.
Apart from that, I am not sure if the two assumptions listed as bullet points above will actually hold for the majority of “sentient tools”.
We can ignore any set of tools that has zero total wellbeing in expectation; what’s left could still dominate the expected value of the future. We can look at sets of sentient tools that we might think could be biased towards positive or negative average welfare:
1. the set of sentient tools used in harsher conditions,
2. the set used in better conditions,
3. the set optimized for pleasure, and
4. the set optimized for pain.
Of course, there are many other sets of interest, and they aren’t all mutually exclusive.
The expected value of the future could be extremely sensitive to beliefs about these sets (their sizes and average welfares). (And this could be a reason to prioritize moral circle expansion instead.)
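As a toy illustration of that sensitivity (all numbers below are made up, purely to show the arithmetic), the total can be treated as a sum over such sets of size × average welfare, and a modest shift in the assumed average welfare of one large set flips the sign of the whole estimate:

```python
# All figures below are made-up placeholders, only to show how the arithmetic
# EV = sum_i (size_i * average_welfare_i) responds to a shift in one belief.

def total_ev(sets):
    """sets: list of (population_size, average_welfare) pairs."""
    return sum(size * welfare for size, welfare in sets)

baseline = [
    (1e12, +0.5),   # tools used in better conditions
    (1e12, -0.4),   # tools used in harsher conditions
    (1e9,  +5.0),   # tools optimized for pleasure
    (1e9,  -5.0),   # tools optimized for pain
]
print(total_ev(baseline))      # +1e11: positive under these assumptions

# Believing the harsher-condition tools average -0.7 instead of -0.4
# flips the sign of the whole estimate.
pessimistic = [(1e12, +0.5), (1e12, -0.7), (1e9, +5.0), (1e9, -5.0)]
print(total_ev(pessimistic))   # -2e11: negative
```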
These are all very good points. I agree that this part of the article is speculative, and you could easily come to a different conclusion.
Overall, I still think that this argument alone (part 1.2 of the article) points in the direction of extinction risk reduction being positive. Although the conclusion does depend on the “default level of welfare of sentient tools” that we are discussing in this thread, it depends more critically on whether future agents’ preferences will be aligned with ours.
But I never gave this argument (part 1.2) that much weight anyway. I think that the arguments later in that article (part 2 onwards, I listed them in my answer to Jacy’s comment) are more robust and thus more relevant. So maybe I somewhat disagree with your statement:
The expected value of the future could be extremely sensitive to beliefs about these sets (their sizes and average welfares). (And this could be a reason to prioritize moral circle expansion instead.)
To some degree this statement is, of course, true. The uncertainty gives some reason to deprioritize extinction risk reduction. But while the expected value of the future (with (post-)humanity) might be quite sensitive to these beliefs, the expected value of extinction risk reduction efforts is not the same as the expected value of the future. You also need to consider what would happen if humanity goes extinct (non-human animals, S-risks by omission), non-extinction long-term effects of global catastrophes, option value,… (see my comments to Jacy). So the question of whether to prioritize moral circle expansion is maybe not extremely sensitive to “beliefs about these sets [of sentient tools]”.
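As a rough back-of-the-envelope sketch of that distinction (every number below is an arbitrary placeholder, not an estimate from the article or this thread): the expected value of an extinction risk reduction effort scales with the change in survival probability it buys and with the difference between the future with and without humanity, plus the other terms mentioned above, rather than tracking the expected value of the future with humanity alone.

```python
# All numbers are arbitrary placeholders; the point is only the structure of
# the calculation, not its output.

delta_p_survival = 1e-6             # change in P(no extinction) the effort buys
ev_future_with_humanity = 1.0e15    # the quantity that is sensitive to the "sets" above
ev_future_without_humanity = -1e13  # e.g. non-human animals, S-risks by omission
other_terms = 1e7                   # non-extinction effects of catastrophes, option value, ...

ev_effort = (
    delta_p_survival * (ev_future_with_humanity - ev_future_without_humanity)
    + other_terms
)
print(ev_effort)  # its sign and size need not track ev_future_with_humanity alone
```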