Thanks for your discussion of the Moral Weight Project’s methodology, Carl. (And to everyone else for the useful back-and-forth!) We have some thoughts about this important issue and we’re keen to write more about it. Perhaps 2024 will provide the opportunity!
For now, we’ll just make one brief point, which is that it’s important to separate two questions. The first concerns the relevance of the two envelopes problem to the Moral Weight Project. The second concerns alternative ways of generating moral weights. We considered the two envelopes problem at some length when we were working on the Moral Weight Project and concluded that our approach was still worth developing. We’d be glad to revisit this and appreciate the challenge to the methodology.
However, even if it turns out that the methodology has issues, it’s an open question how best to proceed. We grant the possibility that, as you suggest, more neurons = more compute = the possibility of more intense pleasures and pains. But it’s also possible that more neurons = more intelligence = less biological need for intense pleasures and pains, as other cognitive abilities can provide the relevant fitness benefits, effectively muting the intensities of those states. Or perhaps there’s some very low threshold of cognitive complexity for sentience, beyond which all variation in behavior is due to non-hedonic capacities. Or perhaps cardinal interpersonal utility comparisons are impossible. And so on. In short, while it’s true that there are hypotheses on which elephants have massively more intense pains than fruit flies, there are also hypotheses on which the opposite is true and on which equality is (more or less) true. Once we account for all these hypotheses, it may still work out that elephants and fruit flies differ by a few orders of magnitude in expectation, but perhaps not by five or six. Presumably, we should all want some approach, whatever it is, that avoids being mugged by whatever low-probability hypothesis posits the largest difference between humans and other animals.
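To make that worry concrete, here’s a minimal sketch with made-up probabilities and intensity ratios (purely illustrative, not anyone’s actual estimates) of how the expected elephant-to-fruit-fly comparison can flip depending on which animal’s welfare unit the expectation is taken in:

```python
# Toy illustration of the normalization problem. The probabilities and ratios
# below are hypothetical, chosen only to show the structure of the issue.

# Each hypothesis: (probability, elephant pain intensity / fruit fly pain intensity)
hypotheses = [
    (0.3, 1_000_000.0),  # more neurons -> far more intense pains
    (0.4, 1.0),          # rough equality past a low sentience threshold
    (0.3, 0.1),          # more intelligence -> less need for intense pains
]

# Expectation taken with the fruit fly's welfare unit held fixed:
elephant_per_fly = sum(p * r for p, r in hypotheses)   # ~300,000

# Expectation taken with the elephant's welfare unit held fixed:
fly_per_elephant = sum(p / r for p, r in hypotheses)   # ~3.4

print(elephant_per_fly, fly_per_elephant)
```

The first expectation is driven almost entirely by the single high-ratio hypothesis, while the second barely registers it; the same credences deliver very different verdicts depending on the unit, which is one way of seeing the “mugging” concern.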
That said, you’ve raised some significant concerns about methods that aggregate over different relative scales of value. So, we’ll be sure to think more about the degree to which this is a problem for the work we’ve done—and, if it is, how much it would change the bottom line.
Thank you for the comment, Bob.
I agree that I am also disagreeing on the object level, as Michael made clear with his comments (I do not think I am talking about a tiny chance, although I do not think the RP discussions characterized my views as I would), as well as on some other methodological issues besides two envelopes (related to the object-level ones). E.g. I would not want to treat a highly networked AI mind (with billions of bodies and computation directing them in a unified way, on the scale of humanity) as having a millionth or a billionth of the welfare of the same set of robots and computations with less integration (and less overlap of shared features, or top-level control), ceteris paribus.
Indeed, I would be wary of treating the integrated mind as though the welfare stakes for it were half or a tenth as great, seeing that as a potential source of moral catastrophe, like ignoring the welfare of minds not based on proteins. E.g. having tasks involving suffering and frustration done by large integrated minds, and pleasant ones done by tiny minds, while increasing the amount of mental activity in the former. It sounds like the combination of object-level and methodological takes attached to these reports would favor almost completely ignoring the integrated mind.
Incidentally, in a world where small animals are numerous and being treated extremely badly, I can see a temptation to err in their favor, since even overestimates of their importance could shift things in the right marginal policy direction. But thinking about the potential moral catastrophes on the other side helps sharpen the motivation to get it right.
In practice, I don’t prioritize moral weights issues in my work, because I think the most important decisions hinging on them will come in an era with AI-aided mature sciences of mind, philosophy, and epistemology. And as I have written, regardless of your views about small minds and large minds, it won’t be the case that e.g. humans are utility monsters of impartial hedonism (rather than something bigger, smaller, or otherwise different), and the grounds for focusing on helping humans won’t be terminally impartial-hedonistic in nature. But from my viewpoint, baking in the assumption that integration (and unified top-level control or mental overlap of some parts of computation) close to eliminates mentality or welfare (vs. less integrated collections of computations) seems bad in a non-Pascalian fashion.
(Speaking for myself only.)
FWIW, I think something like conscious subsystems (in huge numbers in one neural network) is more plausible by design in future AI. It just seems unlikely in animals because all of the apparent subjective value seems to happen at roughly the highest level where everything is integrated in an animal brain.
Felt desire seems to (largely) be motivational salience, a top-down/voluntary attention control function driven by high-level interpretations of stimuli (e.g. objects, social situations), so it occurs relatively late in processing. Hedonic states similarly depend on high-level interpretations.
Or, according to Attention Schema Theory, attention models evolved for the voluntary control of attention. It’s not clear what the value would be for an attention model at lower levels of organization before integration.
And evolution will select against realizing functions unnecessarily if they have additional costs, so we would need a positive argument that the necessary functions are realized earlier, or multiple times in parallel, in a way that overcomes or doesn’t incur such additional costs.
So, it’s not that integration necessarily reduces value; it’s that, in animals, all the morally valuable stuff happens after most of the integration, and apparently only once or in small numbers.
In artificial systems, the morally valuable stuff could instead be implemented separately by design at multiple levels.
EDIT:
I think there’s still a crux about whether realizing the same function the same number of times but “to a greater degree” makes it more morally valuable. I think there are some ways of “to a greater degree” that don’t matter, and some that could. If it’s only sort of (vaguely) true that a system is realizing a certain function, or it realizes some but not all of the functions possibly necessary for some type of welfare in humans, then we might discount it for only meeting lower precisifications of the vague standards. But adding more neurons just doing the same things:
1. doesn’t make it more true that it realizes the function or the type of welfare (e.g. adding more neurons to my brain wouldn’t make it more true that I can suffer),
2. doesn’t clearly increase welfare ranges, and
3. doesn’t have any other clear reason why it should make a moral difference (I think you disagree with this, based on your examples).
But maybe we don’t actually need good specific reasons to assign non-tiny probabilities to neuron count scaling for 2 or 3, and then neuron count scaling dominates in expectation, depending on what we’re normalizing by, as you suggest.
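As a minimal sketch of that last point, with purely hypothetical probabilities and an arbitrary neuron-count ratio, just to show the dependence on what we normalize by:

```python
# Hypothetical numbers only: a modest credence in "welfare range scales with
# neuron count" can dominate the expectation under one normalization and
# barely matter under another.

neuron_ratio = 10_000  # suppose animal A has 10,000x animal B's neurons

hypotheses = [
    (0.1, neuron_ratio),  # welfare range scales linearly with neuron count
    (0.9, 1.0),           # equal welfare ranges once both are sentient
]

# Normalizing by B's welfare unit, the 10% scaling hypothesis dominates:
a_relative_to_b = sum(p * r for p, r in hypotheses)          # 1000.9

# Normalizing by A's welfare unit instead, it barely matters:
a_relative_to_b_alt = 1 / sum(p / r for p, r in hypotheses)  # ~1.11

print(a_relative_to_b, a_relative_to_b_alt)
```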