The Intersection of Moral Weights and Logarithmic Pain
[Draft Amnesty post. I explored this question about two years ago and hoped to find some kind of resolution, but quickly found myself out of my depth and abandoned it. I still think the core question is important and underexplored, so I’m posting this in the hope that someone finds it interesting enough to pick up.]
This post is about the intersection of three important ideas in animal welfare prioritisation. There are a few ways that they can interact, and I’m not sure which is the right interpretation.
The three ideas
Rethink Priorities’ Moral Weight estimates, which suggest that, in a unit of time, a pig can suffer 52% as much as a human (the figures have large error bars and rest on assumptions such as hedonistic utilitarianism, among other things).
The idea that the unpleasantness of pain increases superlinearly with its intensity (i.e. an 8⁄10 on the pain scale is more than twice as bad as a 4⁄10). The Qualia Research Institute was the first org I heard talking about this, but the idea predates them.
Welfare Footprint’s Cumulative Pain Framework, where pain is quantified as the cumulative time spent in negative states of different intensities.
I am interested in how these three ideas interact. I will explain the problem in the way I first encountered it, when thinking about insects.
The problem
RP’s welfare range project estimates that black soldier flies have a “moral weight” of 1%, relative to a human’s defined moral weight of 100%. The human-suffering-equivalent of a black soldier fly suffering at full capacity for 100 hours could be interpreted in two ways:
A human suffering at full capacity for 1 hour
A human suffering at 1% capacity for 100 hours.
Another way of expressing this difference:
Use the moral weight to scale the duration: the fly’s 100 hours at full capacity becomes 1 human hour at full capacity pain (intensity kept constant)
Use the moral weight to scale the intensity: the fly’s full capacity becomes 1% of human capacity pain for 100 hours (duration kept constant)
If the unpleasantness of pain scaled linearly with its intensity, then these two views are equivalent. However, if unpleasantness scales superlinearly (the logarithmic pain hypothesis), then option 1 is far far worse than option 2.
In rough mathematical terms (assuming the scaling is exponential), the amount of suffering in these two scenarios is given by:
human suffering at 100% capacity for 1 hour ∝ 1 · e^100
human suffering at 1% capacity for 100 hours ∝ 100 · e^1
And option 1 is clearly far far larger than option 2.
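To make the gap concrete, here’s a minimal sketch of the toy model in code. The e^intensity unpleasantness function and the 1% moral weight are the illustrative assumptions from above; nothing here reflects RP’s actual methodology.

```python
import math

MORAL_WEIGHT = 0.01  # RP's black soldier fly estimate, relative to a human

def scale_duration(intensity_pct, hours):
    # Option 1: keep the intensity, multiply the duration by the moral weight
    return (hours * MORAL_WEIGHT) * math.exp(intensity_pct)

def scale_intensity(intensity_pct, hours):
    # Option 2: keep the duration, multiply the intensity by the moral weight
    return hours * math.exp(intensity_pct * MORAL_WEIGHT)

# A fly suffering at full capacity (100%) for 100 hours:
opt1 = scale_duration(100, 100)   # 1 hour at 100%: 1 * e^100
opt2 = scale_intensity(100, 100)  # 100 hours at 1%: 100 * e^1
print(opt1 > opt2)  # True, by a factor of roughly e^99
```

Under linear scaling (replace e^i with i) the two options would be exactly equal, which is why the interpretation question only bites if superlinearity is real.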
Why this matters
This difference is incredibly important, as it could point us towards focusing far more on smaller, more numerous animals (e.g. insects, shrimp), and less on larger, less numerous animals (humans, pigs) — or vice versa, depending on which interpretation is correct.
I don’t have good arguments either way, so I’m very uncertain which is the right way to think about this. I think it ultimately depends on how the RP numbers were designed. I read through all of RP’s writing on the topic when they first published it, and I didn’t see anything that would suggest one interpretation over the other, though it’s very possible that I missed something.
Why the 10-point scale is probably compressing something huge
Here’s a simple thought experiment that made me take superlinearity more seriously. In an incredibly underwhelming skateboarding-related incident, I broke my leg, and at the time I rated my pain as 6⁄10. If I had been unlucky enough to break both legs simultaneously, I imagine I would have rated it something like 9⁄10, despite the physical damage being roughly twice as large. Obviously I couldn’t rate it 12⁄10. The scale has a ceiling, and that ceiling forces compression. In the mercifully unlikely scenario where I had broken every bone in my body, I presumably would have rated it 10⁄10, less than twice as bad as breaking only my leg. Clearly something fishy is going on when we rate our pain.
https://existentialcomics.com/comic/290
There are studies showing this effect: e.g. one study reported that 80% of women who had recently given birth preferred 12 hours at 4⁄10 over 6 hours at 8⁄10, where linear scaling would predict indifference.
This suggests that if people really did anchor 10⁄10 to the worst pain imaginable (say, torture), then all normal life pain would barely register above 1⁄10, making the scale nearly useless for everyday purposes. Instead, people spread their ratings across the range they actually encounter, which means the gaps between adjacent points on the scale must represent vastly different amounts of actual suffering as you move upward. The difference between 9⁄10 and 10⁄10 is probably enormous compared to the difference between 1⁄10 and 2⁄10.
This becomes especially vivid if you compare 1 minute of excruciating torture (10/10 pain) against 10 minutes of a very mild headache (1/10). It seems obvious that the torture is far, far worse — not just 10x worse adjusted for duration, which is what a linear scale would imply.
The Princess and the Pea problem
There’s a related puzzle about interpersonal comparisons. Imagine someone so sheltered from discomfort that she becomes incredibly sensitive, rating minor issues as high on her pain scale. Is she just using a different scale to everyone else — speaking a different language, so we should normalize her ratings when aggregating? Or has she actually increased her sensitivity in a qualia sense, becoming a kind of utility monster where her stubbing her toe is genuinely worse than a regular person being involved in a car crash?
The strongest counterargument I’ve encountered
I discussed this question with Bob Fischer, who led the Rethink Priorities Moral Weight Project, and he wasn’t convinced. His argument, as I understood it, went roughly like this:
Yes, superlinearity in self-reported pain might be a real effect, but this is an artifact of how humans use pain scales, not a feature of the fundamental experience of pain. And crucially, non-human animals don’t self-report, so this non-linearity is irrelevant for animal welfare. Animal welfare pain assessment (such as the Cumulative Pain Framework) relies on behavioural indicators. If a researcher judges an animal to be at 10% of its capacity, they simply mean 1⁄10 as bad as its worst state — there’s no question about whether 100% is “really” 10x worse, because that’s just what the numbers mean by construction. The human self-report distortions are irrelevant.
Why I’m not fully convinced by this counterargument
I find this argument powerful but not decisive, for a few reasons:
1. Human self-report is itself behaviour, just the verbal kind. The counterargument draws a sharp line between human self-report (nonlinear, distorted) and animal behavioural indicators (clean, cardinal). But self-report is a form of behaviour too, and we already know that this verbal behaviour relates nonlinearly to the underlying experience. Why should we expect nonverbal behaviour to be any different? This doesn’t show that pain itself is superlinear, but it does mean we can’t dismiss the possibility just because we’re using behavioural indicators instead of self-report.
2. There are evolutionary arguments that superlinearity is species-general. One reason to expect superlinear pain is that organisms need fine-grained discrimination among the mild pains they encounter frequently (is my sore ankle worse than my sunburnt neck?) while also being capable of registering extreme pain. This could lead to a scale where most of the discriminatory resolution is concentrated at the lower intensities, with extreme pain compressed at the top. This evolutionary pressure would apply to any organism that needs to make trade-offs between competing mild threats, not just humans.
Conclusion
I am still very uncertain whether the superlinearity is physiological, or just a reporting artifact.
This increases my uncertainty intervals when comparing pain across species.
Overall, I am surprised by how little attention the idea of logarithmic pain receives in EA circles: if true, it is an incredibly important effect, and would push us towards focusing more on reducing extreme pain (even when its duration is short).
Appendices
This whole thing is further complicated by the fact that the experience of duration might not scale in a linear way either (it could be superlinear due to sensitisation with exposure to pain, or sublinear due to habituation). The subjective perception of time when in pain is definitely worthy of research, as it could inform questions of prioritisation between pain of varying duration.
Interestingly, when the intensity is low enough, the hierarchy between Options 1 and 2 switches, and Option 2 actually gives the larger result. E.g. if the insect were suffering at only 0.1% of its maximal capacity for 100 hours, then:
human suffering at 0.1% capacity for 1 hour ∝ 1 · e^0.1 ≈ 1.1
human suffering at 0.001% capacity for 100 hours ∝ 100 · e^0.001 ≈ 100.1
And in this case Option 2 is far greater than Option 1. I’m probably taking the exponential model too literally here, but it’s a curious artefact.
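Under the same assumed exponential model, the crossover intensity can be computed in closed form. Everything here follows from the toy model above, not from RP’s estimates.

```python
import math

# Option 1 (scale duration) gives w*T*e^i; Option 2 (scale intensity) gives
# T*e^(w*i). Setting them equal: ln(w) + i = w*i, so i = -ln(w) / (1 - w).
w = 0.01  # moral weight
crossover = -math.log(w) / (1 - w)
print(round(crossover, 2))  # 4.65: below ~4.65% intensity, Option 2 dominates
```

So under this model, for any pain milder than roughly 5% of the human maximum intensity, scaling the intensity yields the larger human-equivalent suffering.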
I’m confused about what “superlinearity” is even supposed to mean here.
In the intro you distinguish “unpleasantness” and “intensity”, and say that one grows superlinearly with the other, but how are these two things even defined to begin with? And what is the difference between them? Defining one scale for measuring pain is hard enough, but before we can evaluate this “superlinear” claim we first need to define two!
In the examples with humans, I can see what the claim is. There are at least two ways you could try to define a pain scale: (i) self-report on a scale of 1-10, and (ii) something that more consistently tracked actual preferences with respect to gambles or experiences of different duration, and in this example the claim is that (ii) grows super-linearly with (i).
But this just seems like a claim about the limitations of the self-report 1-10 scale, which is only relevant for humans (I think I’m probably agreeing with the summary of Bob Fischer’s take here).
In the case of non-humans, it’s not that I disagree, but I don’t even understand what the claim is that is being made?
Thanks for your clear question! You’re right, I should have been much clearer in what I meant.
Concretely, I was thinking about the cumulative pain framework, and how it has 4 different intensities of pain (Annoying, Hurtful, Disabling, and Excruciating), and what the relative unpleasantness of the levels might be.
There’s a longer report here about that question, and I’m very sympathetic to the view that the 4 pain ratings should increase in a very superlinear way https://forum.effectivealtruism.org/posts/C2qiY9hwH3Xuirce3/short-agony-or-long-ache-comparing-sources-of-suffering-that
And in this post I was trying to understand how the moral-weight numbers would interact with the pain-intensity-weightings (whether they should apply to the duration or the intensity dimension, as that makes a big difference).
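For concreteness, here’s what a weighted cumulative-pain total could look like with hypothetical superlinear category weights (the numbers are purely illustrative, not WFI’s or the linked post’s):

```python
# Hypothetical superlinear unpleasantness weights for the four Cumulative
# Pain Framework categories (illustrative values only):
WEIGHT = {"Annoying": 1, "Hurtful": 10, "Disabling": 100, "Excruciating": 10_000}

def weighted_cumulative_pain(hours_in_category):
    # total = time spent in each state, weighted by its relative unpleasantness
    return sum(WEIGHT[c] * h for c, h in hours_in_category.items())

# Made-up lifetime totals for a single animal:
total = weighted_cumulative_pain(
    {"Annoying": 500, "Hurtful": 60, "Disabling": 5, "Excruciating": 0.05}
)
print(total)  # 2100.0
```

With weights this steep, three minutes of Excruciating contributes as much to the total as 500 hours of Annoying, which is exactly the kind of trade-off the superlinearity question decides.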
Did this answer your question, or was there a more fundamental crux I missed?
I’m open to the idea that I’m barking up the wrong tree. It’s been a few years since I really sat down to think all this through.
Thank you for your reply and clarification!
If the claim is that the gap between ‘Disabling’ and ‘Excruciating’ should be larger than the gap between ‘Annoying’ and ‘Hurtful’, then that makes sense to me, and seems interesting.
But it sounds like this wasn’t a numerical scale to begin with? So this again just feels like a claim about how we should go about assigning numbers to those categories (if we need numbers), rather than a claim that pain unpleasantness is ‘superlinear’ in some objective sense?
Defining what a numerical score for pain means seems like a hard problem. From my perspective, it seems like it should be defined so that the being concerned would be indifferent between a day of 2*x and 2 days of x. I think this is the notion you are referring to as ‘unpleasantness’. The question then for any other pain metric is just: “how well does it measure this?”. I’m still not sure it makes sense to ask “How does pain intensity scale with unpleasantness?”, since then we would first have to define a numerical scale for pain intensity in some different way, and I’m still not sure how we begin to do that?
I suppose there is another interesting complication here, which is that you could also try to define your pain scale in terms of preferences among gambles. For example, the pain scale should be defined so that a rational being is indifferent between a 100% chance of x and a 50% chance of 2*x. And then you’re confronted with the question of whether this should give you the same answer as defining it in terms of preferences among durations. My feeling is that it should be the same (something about personal identity not being a ‘further fact’ and applying the standard utilitarian aggregation approach to person-moments rather than persons..?) but it would be interesting to explore points of view where those two potential scale definitions differ. That doesn’t feel quite the same as ‘intensity’ vs ‘unpleasantness’ though. More like two different definitions of ‘unpleasantness’.
Ah, I see where you’re coming from. You’re saying that the real problem is deciding where to place the intensity categories (Annoying, Hurtful, etc) on the number line of pain, rather than pretending those categories form their own dimension called intensity that then maps to another thing called unpleasantness.
The way I was thinking about it:
The way you’re thinking about it:
I think the reason I was drawn to the intensity perspective is that for humans it seems real, and that’s where we have our best understanding (due to the advantages of self-report, more psychophysics studies, and introspecting on our own experiences), so I was thinking about translating our (still very limited) understanding of that model to the non-human space. But maybe you’re right that it would be better to build a simpler model from scratch around the non-human limitations.
I like the property of the pain scale you mentioned, where it trades off linearly against duration. That would mean the whole ambiguity of the moral-weights/log-pain intersection that this post was about would disappear. And yes, I share your intuition that it would coincide with your gambling definition (although I would also be interested in any special cases where they come apart).
Thanks for pushing me on this, it helped clarify my own vague thoughts about it!
Yeah… I wish we would just say that the 4⁄10 is actually lower than a 4, and have these scores directly track what you mean by “unpleasantness”, since that is what we care about. But that’s not how people use the /10 scale, unfortunately. And that’s understandable: if they did, they would seldom say that they’re suffering above a 1⁄10.[1]
And yes. When researchers/people assign welfare ranges, they think they’re tracking “unpleasantness”, but I also suspect they are actually tracking what you mean by “intensity” to a large extent, which may lead to very misguided cross-species welfare tradeoffs. I am extremely skeptical of the following counter-view you describe:
Maybe that’s what they mean, but I suspect their estimates are still deeply biased by the “unpleasantness”/“intensity” confusion.
To be clear, though, I don’t want people to take away that we should care less about insects and shrimp. There are so many other considerations. If anything, this should make us less confident in precise-ish moral weight estimates (and maybe look for projects robust to this uncertainty).
That’s a very important problem you raise! Thank you for this. :)
I guess that’s why the /10 scale measures what you mean by “intensity,” even though I agree with Toby it’s not clear what it’s even supposed to be.
I have the intuition that I would choose 10 hours of pain of intensity X over 1 hour of pain of intensity 10X. I’m not sure that I can justify this intuition: I suspect it might be irrational by definition, but I also suspect that many other people share it. And if that is in fact the case, it suggests that superlinear scaling might be a fundamental property of pain itself, as opposed to an artifact of the way we verbalize pain on the 1-to-10 scale.
If I had more time I’d try to tease out my intuition further. At first blush, I think it boils down to the idea that there is some threshold of pain (perhaps the point at which the pain becomes “unbearable”) that is qualitatively different from any pain below it, in such a way that for almost any choice between y hours of below-the-threshold pain and z hours of above-the-threshold pain, I would choose the former. This probably violates basic axioms of rationality, not to mention my own generally utilitarian beliefs, but I feel the pull of the intuition nevertheless.
It also occurs to me that my intuition is closely related to the intuition underlying the pinprick argument against negative utilitarianism: https://www.utilitarianism.com/pinprick-argument.html.
Here are a few ideas that come to my mind:
Like most biological processes, pain systems act like sigmoid curves. They have sensitive tipping points (activation of nociceptors, activation of a matrix of cortical regions, etc) and plateaus. There is probably a maximum plateau that one system can achieve: excruciating pain may feel “infinitely” painful (we might trade anything to make it stop), but it’s mostly just a powerful enough activation of specific brain regions that completely overtakes other regions. These plateaus are probably comparable to WFI’s pain intensities.
10-point scales are so, so biased, yes. Having cluster headaches or fibromyalgia can make you question your own scale and revise how bad a broken leg really feels. However, they still give an indication of the current state of the person’s welfare. If someone says 10 for a bruise, it probably means the person is in extreme distress and needs immediate help, not a lecture on how there are probably worse pains. The question is rather how this person can achieve lower levels of pain (actual painkillers, reassurance, other types of care...). Imagine two people with broken legs both saying they are currently feeling 9⁄10 pain, one of them having already experienced cluster headaches and the other having never experienced a severe injury. Would you treat them differently?
We have much the same issues for animals. We can observe behavioural changes to infer internal welfare states, but the states themselves are completely inaccessible to us. And we know animals prone to chronic pain or depression can show fewer behavioural cues, just like humans. I think the Moral Weights estimates only look for the potential maximum plateau of pain that one species can achieve (the bucket analogy). This depends on genetics, morphology, and physiology, not on welfare at one point in time. It’s like comparing the “real” 10⁄10 for every species. Then we can try to deduce where a given moment’s experience sits on that scale. Compared with humans, where a 10⁄10 would be torture, a broken leg could be a 6⁄10; but if the person says 9⁄10, is their experience less morally relevant?
I don’t think we can have a definite answer without a better understanding of sentience and negative internal states. Comparing moral weights this way can give us a sense of what to do in worst-case scenarios. Shrimp may weigh less than humans because their welfare range may be smaller, but if their living conditions put them continuously at their maximum limit while most humans live with reasonable levels of pain, that actually suggests prioritizing resources to reduce shrimp suffering. Plus, calculating how much negative welfare an intervention can avert gives us a sense of how much it should cost in order to be effective.
Thanks for sharing your draft!