PAVs and total views are different theories, so the comparisons are intertheoretic by definition. Even if they agree on many rankings (in fixed population cases, say), they do so for different reasons. The value being compared is also of a different kind: total utilitarian value is non-comparative, while PA value is comparative.
So what I’m doing here is assigning categories such as “astronomically bad”, “very bad”, “bad”, “neutral”, “good” etc. to acts under different ethical views—which seems easy enough.
These vague categories might be useful, and they do seem somewhat intuitive to me, but:
“Astronomically bad” effectively references the size of an affected population and hints at aggregation, so I’m not sure it’s a valid category for intertheoretic comparisons at all. Astronomically bad things are also not consistently worse than things that are not astronomically bad under all views, especially lexical views and some deontological views. On leximin (or another lexical view), an outcome can be astronomically bad because an astronomically large (sub)population is made worse off, yet still be better than another outcome that isn’t astronomically bad, because that other outcome leaves a small (sub)population much worse off and leximin only attends to the worst off. “Astronomically bad” might still be okay to use for person-affecting utilitarianism (PAU) vs total utilitarianism, though.
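To make the leximin point concrete, here’s a minimal sketch with made-up welfare profiles (10**6 stands in for an astronomically large number; this is only meant as an illustration):

```python
# A toy leximin comparison: sort welfare levels from worst-off to best-off
# and compare lexicographically (equal population sizes assumed).
def leximin_prefers_a(profile_a, profile_b):
    for wa, wb in zip(sorted(profile_a), sorted(profile_b)):
        if wa != wb:
            return wa > wb
    return True  # equally good

HUGE = 10**6  # stand-in for an astronomically large population

# Outcome A: an astronomically large population each made slightly worse off
# (welfare 9 instead of 10), i.e. "astronomically bad" in aggregate terms.
outcome_a = [9] * HUGE + [10] * 100
# Outcome B: not astronomically bad, but a small subpopulation is left very badly off.
outcome_b = [0] * 100 + [10] * HUGE

print(leximin_prefers_a(outcome_a, outcome_b))
# True: leximin prefers the "astronomically bad" outcome, since it only
# attends to the worst off.
```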
“Infinitely bad” (or “infinitely bad of a certain cardinality”) could be used to a similar effect, making lexical views dominate over classical utilitarianism (unless you use lexically “amplified” versions of classical utilitarianism, too). Things can break down if we have infinitely many different lexical thresholds, though, since there might not be a common scale to put them on if the thresholds’ orders are incompatible. But if we at least allow pairwise comparisons where there are only finitely many thresholds, classical utilitarianism would still be dominated by lexical threshold utilitarian views with finitely many thresholds, and considering them all together would, I’d guess, effectively give us leximin anyway.
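As a rough illustration of the dominance claim, here’s a minimal sketch (with made-up numbers) of putting a single-threshold lexical view and classical utilitarianism on a common scale where value is a (lexical, ordinary) pair compared lexicographically and expectations are taken componentwise:

```python
# Value as a pair (lexical component, ordinary component); Python tuples
# compare lexicographically, which matches the lexical ordering.
def ev(credences, values):
    """Credence-weighted componentwise sum of (lexical, ordinary) value pairs."""
    return (sum(c * v[0] for c, v in zip(credences, values)),
            sum(c * v[1] for c, v in zip(credences, values)))

credences = [0.01, 0.99]  # [single-threshold lexical view, classical utilitarianism]

# Option 1 prevents suffering below the lexical threshold; option 2 instead
# produces a huge amount of ordinary value. (Made-up numbers.)
option_1 = ev(credences, [(1.0, 0.0), (0.0, 0.0)])
option_2 = ev(credences, [(0.0, 1e9), (0.0, 1e9)])

print(option_1 > option_2)
# True: the lexical view wins whenever it gets any credence at all, unless
# classical utilitarianism is also "amplified" onto the lexical component.
```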
These kinds of intuitive vague categories aren’t precise enough to fix exactly one normalization for each theory for the purpose of maximizing some kind of expected value across theories, and the results will be sensitive to which normalizations are chosen, a choice that is itself somewhat arbitrary. If you used precise categories instead, you’d still face arbitrariness in assigning acts to categories on each view.
Comparisons between theories A and B, theories B and C, and theories A and C might not be consistent with each other, unless you find a single common scale for all three theories. This limits the kinds of categories you can use to those that are universally applicable, if you want to take expected values across all theories at once. You also still need the categories and the theories to be roughly cardinally (ratio-scale) interpretable to use expected values across theories with intertheoretic comparisons, but some theories are not cardinally interpretable at all.
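As a toy numerical illustration of the consistency worry (the exchange rates here are entirely made up):

```python
# Pairwise normalizations fix exchange rates between theories' units.
rate_ab = 2.0   # 1 unit of theory A's value = 2 units of theory B's
rate_bc = 3.0   # 1 unit of B = 3 units of C
rate_ac = 10.0  # a direct A-vs-C normalization, fixed separately

# For all three pairwise scales to come from one common scale, we'd need
# rate_ac == rate_ab * rate_bc; otherwise there is no single common scale.
print(rate_ab * rate_bc, rate_ac)  # 6.0 vs 10.0: inconsistent
```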
Vague categories like “very bad” that don’t reference objective cardinal numbers (even imprecisely) will probably not be scope-sensitive in a way that makes the total view dominate over PAVs. On a PAV according to which death is bad, killing 50% of people would plausibly hit the highest badness category, or close to it. The gaps between the categories won’t be clear or even necessarily consistent across theories. So I think you really need to reference cardinal numbers in these categories if you want the total view to dominate PAVs with this kind of approach.
Expected values don’t even make sense on some theories, namely those which are not cardinally interpretable, so it’s strange to entertain such theories (and with them the possibility that expected value reasoning is wrong) and then force them into an expected value framework anyway. If you entertain the possibility of expected value reasoning being wrong at the normative level, you should probably do so for handling moral uncertainty, too.
Some comparisons really do seem pretty arbitrary. Consider weak negative hedonistic total utilitarianism (weak NU) vs classical utilitarianism (CU), where on the weak NU view pleasure matters 1/X times as much as suffering, or equivalently, suffering matters X times more than pleasure. There are at least two possible normalizations here: (a) suffering matters equally on each view, but pleasure matters X times less on weak NU than on CU; and (b) pleasure matters equally on each view, but suffering matters X times more on weak NU relative to pleasure on each view. When X is large enough, the vague intuitive categories probably won’t settle between these, and you need some way to resolve the problem. If you include both comparisons, then you’re effectively splitting one of the views into two with different cardinal strengths. To me, this undermines intertheoretic comparisons: you end up with two views that make exactly the same recommendations for (basically) the same reasons, but have different cardinal strengths. Where do these differences in cardinal strengths come from? MacAskill, Bykvist and Ord call these “amplifications” of theories in their book, and I think they suggest these will come from some universal absolute scale common across theories (chapter 6, section VII), but they don’t explain where this scale actually comes from.
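Here’s a minimal sketch, with made-up numbers, of how the choice between normalizations (a) and (b) can flip the verdict:

```python
# An act creates pleasure P and suffering S, relative to doing nothing (value 0).
P, S = 100.0, 10.0
X = 100.0             # on weak NU, suffering matters X times as much as pleasure
credence_cu, credence_nu = 0.5, 0.5

cu = P - S            # classical utilitarianism, the same under both normalizations

# (a) a unit of suffering counts the same on both views,
#     so pleasure counts 1/X as much on weak NU.
nu_a = P / X - S
# (b) a unit of pleasure counts the same on both views,
#     so suffering counts X times as much on weak NU.
nu_b = P - X * S

ev_a = credence_cu * cu + credence_nu * nu_a
ev_b = credence_cu * cu + credence_nu * nu_b
print(ev_a, ev_b)     # 40.5 vs -405.0: the act looks good under (a), bad under (b)
```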
My understanding is that those who support such intertheoretic comparisons only do so in limited cases anyway and so would want to combine them with another approach where intertheoretic comparisons aren’t justified. My impression is also that using intertheoretic comparisons but saying nothing when intertheoretic comparisons aren’t justified is the least general/applicable approach of those typically discussed, because it requires ratio-scale comparisons. You can use variance voting with interval-scale comparisons, and you can basically always use moral parliament or “my favourite theory”.
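For comparison, here’s a minimal sketch of variance voting over a toy option set (made-up numbers): each theory’s choiceworthiness function is rescaled to mean 0 and variance 1 across the options, which only requires interval-scale comparability, and then credence-weighted sums are taken.

```python
import statistics

def variance_normalize(values):
    """Rescale one theory's choiceworthiness over the option set to mean 0, variance 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Choiceworthiness of three options under two theories (made-up numbers).
theory_1 = [0.0, 1.0, 2.0]
theory_2 = [5.0, -100.0, 0.0]
credences = [0.6, 0.4]

normalized = [variance_normalize(t) for t in (theory_1, theory_2)]
scores = [sum(c * t[i] for c, t in zip(credences, normalized)) for i in range(3)]
print(scores)  # choose the option with the highest credence-weighted score
```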
Some of the above objections are similar to those in this chapter by MacAskill, Bykvist and Ord, and the book generally.