Thanks for your reply! :)
I think that in practice no one does A.
This is true, but we could all be mistaken. This doesn’t seem unlikely to me, considering that our brains simply were not built to handle such incredibly small probabilities and incredibly large magnitudes of disutility. That said, I won’t practically bite the bullet, any more than people who would choose torture over dust specks probably do, or any more than pure impartial consequentialists truly sacrifice all their own frivolities for altruism. (This latter case is often excused as just avoiding burnout, but I seriously doubt the level of self-indulgence of the average consequentialist EA, myself included, is anywhere close to altruistically optimal.)
In general—and this is something I seem to disagree with many in this community about—I think following your ethics or decision theory through to its honest conclusions tends to make more sense than assuming the status quo is probably close to optimal. There is of course some reflective equilibrium involved here; sometimes I do revise my understanding of the ethical/decision theory.
This is similar to how you might dismiss this proof that 1+1=3 even if you cannot see the error.
To the extent that I assign nonzero probability to mathematically absurd statements (based on precedents like these), I don’t think there’s very high disutility in acting as if 1+1=2 in a world where it’s actually true that 1+1=3. But that could be a failure of my imagination.
It is, however, a bit of a dissatisfying answer, as it is not very rigorous: it is unclear when a conclusion is so absurd as to require outright objection.
This is basically my response. I think there’s some meaningful distinction between good applications of reductio ad absurdum and relatively hollow appeals to “common sense,” though, and the dismissal of Pascal’s mugging strikes me as more the latter.
For example, you could worry about future weapons technology that could destroy the world and try to explore what this would look like – but you can safely say it is very unlikely to look like your explorations.
I’m not sure I follow how this helps. People who accept giving in to Pascal’s mugger don’t dispute that the very bad scenario in question is “very unlikely.”
This might allow you to avoid the Pascal mugger and invest appropriate time into more general, more flexible evil wizard protection.
I think you might be onto something here, but I’d need the details fleshed out because I don’t quite understand the claim.
I don’t call the happiness itself “slight,” I call it “slightly more” than the suffering (edit: and also just slightly more than the happiness per person in world A). I acknowledge the happiness is tremendous. But it comes along with just barely less tremendous suffering. If that’s not morally compelling to you, fine, but really the point is that there appears (to me at least) to be quite a strong moral distinction between 1,000,001 happiness minus 1,000,000 suffering, and 1 happiness.
The Repugnant Conclusion is worse than I thought
At the risk of belaboring the obvious to anyone who has considered this point before: The RC glosses over the exact content of happiness and suffering that are summed up to the quantities of “welfare” defining world A and world Z. In world A, each life with welfare 1,000,000 could, on one extreme, consist purely of (a) good experiences that sum in intensity to a level 1,000,000, or on the other, (b) good experiences summing to 1,000,000,000 minus bad experiences summing (in absolute value) to 999,000,000. Similarly, each of the lives of welfare 1 in world Z could be (a) purely level 1 good experiences, or (b) level 1,000,001 good experiences minus level 1,000,000 bad experiences.
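The point that totals gloss over content can be put in one line of arithmetic. A minimal sketch, using only the hypothetical welfare figures from the paragraph above (the function name `net_welfare` is mine, for illustration):

```python
# Total utilitarianism scores a life only by its net welfare (good minus bad),
# so these two very different lives are counted as exactly equal.

def net_welfare(good, bad):
    """Net welfare = total good experiences minus total bad experiences."""
    return good - bad

# World A life, reading (a): pure happiness, no suffering at all.
life_a = net_welfare(1_000_000, 0)
# World A life, reading (b): vast happiness offset by vast suffering.
life_b = net_welfare(1_000_000_000, 999_000_000)
assert life_a == life_b == 1_000_000  # totalism cannot tell them apart

# World Z life, reading (b): happiness just barely exceeding tremendous agony,
# yet counted as a "net-positive" life of welfare 1.
z_life = net_welfare(1_000_001, 1_000_000)
assert z_life == 1
```

The asserts make the gloss explicit: any purely additive axiology is indifferent between the (a) and (b) readings of each world, which is exactly what the (a, b) pairing below exploits.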
To my intuitions, it’s pretty easy to accept the RC if our conception of worlds A and Z is the pair (a, a) from the (of course non-exhaustive) possibilities above, even more so for (b, a). However, the RC is extremely unpalatable if we consider the pair (a, b). This conclusion, which is entailed by any plausible non-negative total utilitarian view, is that a world of tremendous happiness with absolutely no suffering is worse than a world of many beings each experiencing just slightly more happiness than those in the first, but along with tremendous agony.
To drive home how counterintuitive that is, we can apply the same reasoning often applied against NU views: Suppose the level 1,000,001 happiness in each being in world Z is compressed into one millisecond of some super-bliss, contained within a life of otherwise unremitting misery. There doesn’t appear to be any temporal ordering of the experiences of each life in world Z such that this conclusion isn’t morally absurd to me. (Going out with a bang sounds nice, but not nice enough to make the preceding pure misery worth it; remember this is a millisecond!) This is even accounting for the possible scope neglect involved in considering the massive number of lives in world Z. Indeed, multiplying these lives seems to make the picture more horrifying, not less.
Again, at the risk of sounding obvious: The repugnance of the RC here is that on total non-NU axiologies, we’d be forced to consider the kind of life I just sketched a “net-positive” life morally speaking. Worse, we’re forced to consider an astronomical number of such lives better than a (comparatively small) pure utopia.
 “Negative” here includes lexical and lexical threshold views.
 I’m setting aside possible defenses based on the axiological importance of duration. This is because (1) I’m quite uncertain about that point, though I share the intuition, and (2) it seems any such defense rescues NU just as well. I.e. one can, under this principle, maintain that 1 hour of torture-level suffering is impossible to morally outweigh, but 1 millisecond isn’t.
At the time I thought I was explaining [Pascal’s mugging] badly but reading more on this topic I think it is just a non-problem: it only appears to be a problem to those whose only decision making tool is an expected value calculation.
This is quite a strong claim IMO. Could you explain exactly which other decision making tool(s) you would apply to Pascal’s mugging that makes it not a problem? The descriptions of the tools in stories 1 and 2 are too vague for me to clearly see how they’d apply here.
Indeed, if anything, some of those tools strengthen the case for giving in to Pascal’s mugging. E.g. “developing a set of internally consistent descriptions of future events based on each uncertainty, then developing plans that are robust to all options”: if you can’t reasonably rule out the possibility that the mugger is telling the truth, paying the mugger seems a lot more robust. Ruling out that possibility in the literal thought experiment doesn’t seem obviously counterintuitive to me, but the standard stories for x- and s-risks don’t seem so absurd that you can treat them as probability 0 (more on this below). Appealing to the possibility that one’s model is just wrong, which does cut against naive EV calculations, doesn’t seem to help here.
I can imagine a few candidates, but none seem satisfactory to me:
“Very small probabilities should just be rounded down to zero.” I can’t think of a principled basis for selecting the threshold for a “very small” probability, at least not one that doesn’t commit us to absurd conclusions, like that you shouldn’t wear a seatbelt because the probability of a car crash on any given trip is very low. This rule also seems contrary to maximin robustness.
“Very high disutilities are practically impossible.” I simply don’t see sufficiently strong evidence in favor of this to outweigh the high disutility conditional on the mugger telling the truth. If you want to say my reply is just smuggling expected value reasoning in through the backdoor, well, I don’t really consider this a counterargument. Declaring a hard rule like this one, which treats some outcomes as impossible absent a mathematical or logical argument, seems epistemically hubristic and is again contrary to robustness.
“Don’t do anything that extremely violates common sense.” Intuitive, but I don’t think we should expect our common sense to be well-equipped to handle situations involving massive absolute values of (dis)utility.
Do you think this is highly implausible even if you account for:
the opportunities to reduce other people’s extreme suffering that a person committing suicide would forego,
the extreme suffering of one’s loved ones this would probably increase,
plausible views of personal identity on which risking the extreme suffering of one’s future self is ethically similar to, if not the same as, risking it for someone else,
relatedly, views of probability where the small measure of worlds with a being experiencing extreme suffering are as “real” as the large measure without, and
the fact that even non-negative utilitarian views will probably consider some forms of suffering so bad that small risks of them would outweigh any upsides that a typical human experiences, for oneself (ignoring effects on other people)?
I don’t think that if someone rejects the rationality of trading off neutrality for a combination of happiness and suffering, they need to explain every case of this. (Analogously, the fact that people often do things for reasons other than maximizing pleasure and minimizing pain isn’t an argument against ethical hedonism, just psychological hedonism.) Some trades might just be frankly irrational or mistaken, and one can point to biases that lead to such behavior.
If we reject either of these premises, we must also reject the overwhelming importance of shaping the far future.
Perhaps a nitpick (on a post that is otherwise very well done!), but as phrased this doesn’t appear true. Rejecting either of those premises only entails rejecting the overwhelming importance of populating the far future with lots of happy lives. You could still consider the far future overwhelmingly ethically important in that you want to prevent it from being worse than extinction, for example.
I’m glad “distillation” is emphasized as well in the acronym, because I think it resolves an important question about competitiveness. My initial impression, from the pitch of IA as “solve arbitrarily hard problems with aligned AIs by using human-endorsed decompositions,” was that this wouldn’t work because explicitly decomposing tasks this way in deployment sounds too slow. But distillation in theory solves that problem, because the decomposition from the training phase becomes implicit. (Of course, it raises safety risks too, because we need to check that the compression of this process into a “fast” policy didn’t compromise the safety properties that motivated decomposition in the training in the first place.)
Under this interpretation I would say my position is doubt that positive welfare exists in the first place. There’s only the negation or absence of negative welfare. So to my ears it’s like arguing 5 x 0 > 1 x 0. (Edit: Perhaps a better analogy, if suffering is like dust that can be removed by the vacuum-cleaner of happiness, it doesn’t make sense to say that vacuuming a perfectly clean floor for 5 minutes is better than doing so for 1 minute, or not at all.)
Taken in isolation I can see how counterintuitive this sounds, but in the context of observations about confounders and the instrumental value of happiness, it’s quite sensible to me compared with the alternatives. In particular, it doesn’t commit us to biting the bullets I mentioned in my last comment, doesn’t violate transitivity, and accounts for the procreation asymmetry intuition. The main downside I think is the implication that death is not bad for the dying person themselves, but I don’t find this unacceptable considering: (a) it’s quite consistent with e.g. Epicurean and Buddhist views, not “out there” in the history of philosophy, and (b) practically speaking every life is entangled with others so that even if my death isn’t a tragedy to myself, it is a strong tragedy to people who care about or depend on me.
Maybe your intuition that the latter is better than the former is confounded by the pleasant memories of this beautiful sight, which could remove suffering from their life in the future. Plus the confounder I mentioned in my original comment.
Of course one can cite confounders against suffering-focused intuitions as well (e.g. the tendency of the worst suffering in human life to be much more intense than the best happiness). But for me the intuition that C > B when all these confounders are accounted for really isn’t that strong—at least not enough to outweigh the very repugnant conclusion, utility monster, and intuition that happiness doesn’t have moral importance of the sort that would obligate us to create it for its own sake.
Any reasonable theory of population ethics must surely accept that C is better than B.
I dispute this, at least if we interpret the positive-welfare lives as including only happiness (of varying levels) but no suffering. If a life contains no suffering, such that additional happiness doesn’t play any palliative role or satisfy any frustrated preferences or cravings, I’m quite comfortable saying that this additional happiness doesn’t add value to the life (hence B = C).
I suspect the strength of the intuition in favor of judging C > B comes from the fact that in reality, extra happiness almost always does play a palliative role and satisfies preferences. But a defender of the procreation asymmetry (not the neutrality principle, which I agree with Michael is unpalatable) doesn’t need to dispute this.
Instrumental to causing them to have a frustrated preference. If they weren’t born, they wouldn’t have that preference.
Is it bad to have created that mind?
It doesn’t personally affect anyone. And they personally don’t care about having been created (again: they don’t have any preference about their existence). So is it bad to have created them?
I don’t know if I’m missing something obvious, but even though the birth itself doesn’t violate this mind’s preference, their birth creates a preference that cannot be fulfilled. So (under the usual psychology of what it’s like to have a frustrated preference) it is instrumentally bad to have created that mind.
Asymmetries need not be deontological; they could be axiological. A pure consequentialist could maintain that negative experiences are lexically worse than absence of good experiences, all else equal (in particular, controlling for the effects of good experiences on the prevalence of negative experiences). This is controversial, to be sure, but not inconsistent with consequentialism and hence not vulnerable to Will’s argument.
It seems to me plausible that anyone who uses the word agony in the standard sense is committing her/himself to agony being undesirable. This is not an argument for irreducible normativity, but it may give you a feeling that there is some intrinsic connection underlying the set of self-evident cases.
Could you please clarify this? As someone who is mainly convinced of irreducible normativity by the self-evident badness of agony—in particular, considering the intuition that someone in agony has reason to end it even if they don’t consciously “desire” that end—I don’t think this can be dissolved as a linguistic confusion.
It’s true that for all practical purposes humans seem not to desire their own pain/suffering. But in my discussions with some antirealists they have argued that if a paperclip maximizer, for example, doesn’t want not to suffer (by hypothesis all it wants is to maximize paperclips), then such a being doesn’t have a reason to avoid suffering. That to me seems patently unbelievable. Apologies if I’ve misunderstood your point!
We could also ask how many days of one’s human life one would be willing to forgo to experience some duration of time as another species. This approach would allow us to assign cardinal numbers to the value of animal lives.
I hope I’m not being too obvious here, but I’ve seen people frequently speak of animals “mattering” X times as much as a human, say, without drawing this distinction: we’d need to be very careful to distinguish what we mean by value of life. For prioritizing which lives to save, this quote perhaps makes sense. But not if “value of animal lives” is meant to correspond to how much we should prioritize alleviating different animals’ suffering. I wouldn’t trade days of my life to experience days of a very poor person’s life, but that doesn’t mean my life is more valuable in the sense that helping me is more important. Quite the opposite: the less value there is in a human’s/animal’s life, the more imperative it is to help them (in non-life-saving ways), for reasons of diminishing returns at least.
I would strongly encourage surveys about intuitions of this sort to precisely ask about tradeoffs of experiences, rather than “value of life” (as in the Norwood and Lusk survey that you cite).
Do you think they would have a similar response to intervening in the lives of young children in X oppressed group (or any group for that matter)? That seems to be a relevantly similar case to wild animals, in terms of their lack of capacity to self-govern and vulnerability.
Excellent and important, if sobering, work! I’ve gotten the sense that very general social psychology arguments about animal advocacy strategy can go either way (foot in the door vs door in the face, etc.), so it’s refreshing to see specific studies on this that tell me something not at all obvious. I like the preregistration and use of FDR control. Some minor remarks:
“the power (the risk of false negative results)”—I believe this should be the complement of that risk
“If the AFFT articles encourage the view that animal-free alternatives are unnatural, they could strengthen one of the key justifications for animal product consumption.”—Seems like your results for the model with an interaction between reading about AFFT and preference for naturalness have some implications for this. In that model reading about AFFT is no longer significant, nor is the interaction. But I suppose under this hypothesis you’d expect a noticeable negative interaction: the stronger one’s preference for naturalness, the more strongly reading about AFFT decreases their AFO.
the reason you maintain and continue to value the relationship is not so circumstantial, and has more to do with your actual relationship with that other person
Right, but even so it seems like a friend who cares for you because they believe caring for you is good, and better than the alternatives, is “warmer” than one who doesn’t think this but merely follows some partiality (or again, bias) toward you.
I suppose it comes down to conflicting intuitions on something like “unconditional love.” Several people, not just hardcore consequentialists, find that concept hollow and cheap, because loving someone unconditionally implies you don’t really care who they are, in any sense other than the physical continuity of their identity. Conditional love identifies the aspects of the person actually worth loving, and that seems more genuine to me, though less comforting to someone who wants (selfishly) to be loved no matter what they do.
I suppose the point is that you don’t recognize that reason as an ethical one; it’s just something that happens to explain your behaviour in practice, not what you think is right.
Yeah, exactly. It would be an extremely convenient coincidence if our feelings for partial friendship etc., which evolved in small communities where these feelings were largely sufficient for social cohesion, just happened to be the ethically best things for us to follow—when we now live in a world where it’s feasible for someone to do a lot more good by being impartial.
Edit: seems based on one of your other comments that we actually agree more than I thought.