I don’t think it’s obvious that this is negative in expectation. I’m not at all confident that negative valence is easier to induce than positive valence today (though I think it’s probably true), but even conditional on that being true, I also think it’s a weird quirk of biology that negative valence may be more common than positive valence in evolved animals. Naively, I would guess that the experiences of tool AIs (that we may wrongly believe not to be sentient, or are otherwise callous towards) are in expectation zero. However, this may be enough for hedonic utilitarians with a moderate negative lean (3-10x, say) to believe that suffering overrides happiness in those cases.
It might be 0 in expectation to a classical utilitarian in the conditions for which they are adapted, but I expect it to go negative if the tools are initially developed through evolution (or some other optimization algorithm for design) and RL (for learning and optimizing individual behaviour), and then used in different conditions. Think of “sweet spots”: if you raise temperatures, more animals die of hyperthermia; if you lower them, more die of hypothermia. Furry animals have been selected to have the right amount of fur for the temperatures they’re exposed to, and sentient tools may be similarly adapted. I think optimization algorithms will tend towards local maxima like this (although by local maxima I mean with respect to conditions, while the optimization algorithm is optimizing genes; I don’t have a rigorous proof connecting the two, but a toy illustration follows).
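To make the gene/condition connection a bit more concrete, here’s a toy sketch. It’s entirely my own construction: a made-up quadratic welfare function and crude hill climbing standing in for evolution. It shows a trait optimized for one condition yielding lower welfare when the condition shifts in either direction:

```python
import random

# Toy model: welfare depends on the mismatch between an evolved trait
# (e.g. fur thickness) and an environmental condition (e.g. temperature).

def welfare(trait, condition):
    # Peak welfare when the trait matches the condition; quadratic falloff.
    return -(trait - condition) ** 2

def evolve_trait(condition, generations=200, pop_size=50, noise=0.5):
    # Crude hill climbing standing in for evolution (or any optimizer
    # acting on "genes") under a fixed condition.
    population = [random.uniform(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda t: welfare(t, condition), reverse=True)
        survivors = population[: pop_size // 2]
        population = survivors + [
            t + random.gauss(0, noise) for t in survivors  # mutated offspring
        ]
    return max(population, key=lambda t: welfare(t, condition))

adapted_condition = 20.0  # the condition the trait was optimized for
trait = evolve_trait(adapted_condition)

for shift in (-5, 0, +5):
    c = adapted_condition + shift
    print(f"condition shift {shift:+}: welfare = {welfare(trait, c):.2f}")
# Welfare is (near-)maximal at shift 0 and drops for shifts in *either*
# direction: the trait sits at a local maximum with respect to conditions.
```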
On the other hand, environmental conditions which are good to change in one direction and bad in the other should cancel in expectation under a random change (with a uniform prior over direction), and conditions that improve with a change in either direction don’t seem stable (or maybe I just can’t think of any), so they should be rarer than conditions which are bad to change in either direction. Is there any kind of condition such that a change in either direction is positive, e.g. where both increasing and decreasing the temperature are good?
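The cancellation claim can be made slightly more precise with a second-order sketch (assuming, purely for illustration, that welfare W is a smooth function of a single condition c, with the adapted value c* a local maximum):

```latex
% Sketch assumptions (mine, not from the comment above): welfare W is smooth
% in a single condition c; the adapted value c^* is a local maximum, so
% W'(c^*) = 0 and W''(c^*) <= 0; the perturbation \epsilon is small, with
% mean zero and variance \sigma^2.
\mathbb{E}\left[ W(c^* + \epsilon) \right]
  \approx W(c^*) + W'(c^*)\,\mathbb{E}[\epsilon]
          + \tfrac{1}{2}\, W''(c^*)\,\mathbb{E}[\epsilon^2]
  = W(c^*) + \tfrac{1}{2}\, W''(c^*)\,\sigma^2
  \le W(c^*)
```

The first-order term is the part that cancels in expectation; the curvature term is the “bad in both directions” part, and it is nonpositive at any local maximum. A condition where changes in either direction are good would correspond to a local minimum of welfare, which doesn’t seem like a state adaptation would leave a population in, matching the instability intuition above.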
This is also a (weak) theoretical argument that wild animal welfare is negative on average, because environmental conditions are constantly changing.
Fair enough on the rest.