Is RP’s Moral Weights Project too animal friendly? Four critical junctures

I really appreciate the RP Moral Weights Project, and before I say anything else I’d like to thank the amazing RP crew for their extremely thoughtful and kind responses to this critique. Because of their great response, I feel a little uncomfortable even publishing this, as I respect both the project and the integrity of the researchers.[1]

This project helped me appreciate what it might mean for animals to suffer, and gave a framework for comparing animal and human suffering. RP’s impressive project is now a crux in the EA “space”: 80,000 Hours recently used these numbers as a key factor in elevating factory farming to a “most pressing world problem”, and many forum posts use RP’s median welfare estimates as a central number in cost-effectiveness analyses. Of the first 15 posts this debate week, 7 referenced the project.

As I considered their methodology, I reflected that the outcome hinged on a series of difficult and important junctures, which led to the surprising (to many) conclusion that the moral weight of animals might not be so different from that of humans. However, through my biased, anthropocentric lens[2], it seemed to me that these key junctures lean towards favoring animals.

I present four critical junctures where I think the Moral Weights Project favored animals. I don’t argue that any of their decisions are necessarily wrong, only that each decision shifts the project outcome in an animal-friendly direction, sometimes by at least an order of magnitude.[3]


Juncture 1 – Animal-Friendly Researchers (Unclear multiplier)

“Our team was composed of three philosophers, two comparative psychologists (one with expertise in birds; another with expertise in cephalopods), two fish welfare researchers, two entomologists, an animal welfare scientist, and a veterinarian.”

It’s uncontroversial to assume that researchers’ findings tend towards their priors, i.e. what they already believe[4]. This trend is obvious in contentious political subjects: in immigration research, classically conservative think tanks are more likely to emphasise problems with immigration than libertarian or liberal ones[5]. In the Moral Weights Project, the researchers either have histories of animal advocacy or are at best neutral; I couldn’t find anyone who had previously expressed public skepticism about animals having high moral weight. Major contributors Bob Fischer and Adam Shriver have bodies of work which support animal welfare.

Bob Fischer: “… I should acknowledge that I’m not above motivated reasoning either, having spent a lot of the last 12 years working on animal-related issues. In my own defence, I’ve often been an animal-friendly critic of pro-animal arguments, so I think I’m reasonably well-placed to do this work.”

Could this bias have been mitigated? I’m a fan of adversarial research collaborations. That might have been impossible here, but perhaps one or two researchers with somewhat animal-welfare-critical opinions could have been included on the team. Also, this particular project had more important subjective junctures and decisions to be made than most projects, which could mean even more potential for researcher bias.

Juncture 2 – Assuming Hedonism (1.5x–10x multiplier)

The researchers chose to assume hedonism, which is likely to favor animals more than other single moral frameworks or a blend of frameworks.

At this juncture, there was one statement I thought questionable.

“We suggest that, compared to hedonism, an objective list theory might 3x our estimate of the differences between humans’ and nonhumans’ welfare ranges. But just to be cautious, let’s suppose it’s 10x. While not insignificant, that multiplier makes it far from clear that the choice of a theory of welfare is going to be practically relevant.”

When we build a model like this, I don’t think we should consider whether any single decision is large enough to be “practically relevant”, especially not at an early stage of the process when there are multiple junctures still to come. Instead, we should do the best we can to include all variables we think might be important, no matter how small. Then, at the end of the entire project, we can reflect on what might have happened had we made different decisions, and how much those decisions might affect the final result.

Juncture 3 – Dismissing Neuron Count (1.5x–5x multiplier)

Perhaps the most controversial juncture was to largely dismiss neuron count. To RP’s credit, they understood its importance and devoted an in-depth article to explaining why they don’t think neuron count is a very useful proxy for animal moral weight. I’m not going to discuss their arguments or counter-arguments, but neuron count is one of the more concrete, objective measures we have for comparing animals to humans; it correlates fairly well with intelligence and with our intuitions, and it has been widely used in the past as a proxy for moral weight.

Pigs have about 200x fewer neurons than humans, while RP’s final median moral weight for pigs is only 2x smaller than humans’. This huge gulf means that even giving neuron count a more significant weighting would have made a meaningful difference.

Interestingly, in RP’s neuron count analysis they state, “Given this, we suggest that the best role for neuron counts in an assessment of moral weight is as a weighted contributor, one among many, to an overall estimation of moral weight”, yet they don’t end up including neuron counts in what I would consider a meaningful way.[6]
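To give a feel for the sensitivity, here is a toy calculation of my own (emphatically not RP’s model): blend a behavior-based welfare range with the human-relative neuron count ratio via a weighted geometric mean. The 0.5 welfare range and 1/200 neuron ratio are the rough pig figures above; the weight w is hypothetical.

```python
# Toy sensitivity check (my sketch, not RP's actual methodology):
# blend a behavior-based welfare range with the human-relative
# neuron count ratio using a weighted geometric mean.

def blended_welfare_range(behavioral: float, neuron_ratio: float, w: float) -> float:
    """w = 0 reproduces the behavior-only estimate; w = 1 is pure neuron count."""
    return behavioral ** (1 - w) * neuron_ratio ** w

pig_behavioral = 0.5        # roughly RP's median welfare range for pigs
pig_neuron_ratio = 1 / 200  # pigs have ~200x fewer neurons than humans

for w in (0.0, 0.1, 0.3, 0.5):
    est = blended_welfare_range(pig_behavioral, pig_neuron_ratio, w)
    print(f"neuron-count weight {w:.1f} -> welfare range ~{est:.3f}")
# weight 0.0 -> 0.500; 0.1 -> ~0.316; 0.3 -> ~0.126; 0.5 -> 0.050
```

Even at a 30% weight the pig estimate falls roughly 4x. An arithmetic blend, by contrast, would dilute the neuron count signal far more, which may be part of why the effective weighting in RP’s models appears so small (see footnote [6]).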

Juncture 4 – Not Discounting Behavioral Proxies (2x–10x multiplier)[7]

RP put in a groundbreaking effort to scour animal behavior research to determine how similar animals’ welfare-related behavior is to human behavior. They call these “Hedonic proxies” and “Cognitive proxies”. If an animal exhibits a clear positive response on any behavioral proxy, RP effectively assumes, for scoring purposes, that the proxy translates to an experience of equivalent intensity to a human’s[8].

They applied no discount to these behavioral proxies. For example, if pigs display anxiety-like behavior, fear-like behavior and flexible self-protective behavior, their score for these proxies is the same as a human’s. My counter-assumption is that where humans display anxiety, fear or self-protective behavior, both the behaviors themselves and the corresponding experiences are likely to be more intense, or larger in size, than those of a pig, chicken or shrimp exhibiting the same behaviors[9]. Phil Trammell explores this here, saying “Even if humans and flies had precisely the same qualitative kinds of sensory and cognitive experiences, and could experience everything just as intensely, we might differ in our capacities for welfare because our experiences are of different sizes”.

I understand this would be difficult to discount in a logical, systematic way[10], but not discounting at all means animals are likely favored.
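To make the arithmetic concrete, here is a minimal sketch (my own hypothetical numbers, not RP’s methodology) of how a uniform intensity discount on behavioral proxies would flow straight through to the welfare range. Because the discount enters multiplicatively, a 0.5 discount is a 2x reduction and a 0.1 discount a 10x reduction, which is the shape of my guessed range above.

```python
# Hypothetical sketch (not RP's method): score an animal as the
# fraction of behavioral proxies it displays, with each displayed
# proxy scaled by an intensity discount relative to humans.

def proxy_score(proxies_present: list[bool], intensity_discount: float) -> float:
    """Fraction of proxies displayed, each scaled by a discount in (0, 1]."""
    return intensity_discount * sum(proxies_present) / len(proxies_present)

# Illustrative only: an animal displaying 54 of 90 proxies.
present = [True] * 54 + [False] * 36

for d in (1.0, 0.5, 0.1):
    print(f"intensity discount {d} -> score {proxy_score(present, d):.2f}")
# 1.0 -> 0.60 (RP-style: presence counts fully); 0.5 -> 0.30; 0.1 -> 0.06
```

A per-proxy rather than uniform discount would be messier to implement, but identical in spirit.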

To RP’s credit, they acknowledge this issue as a potential criticism at the end of their project: “You’re assessing the proxies as either present or absent, but many of them obviously come either in degrees or in qualitatively different forms.”[11]

After the project decided to assume hedonism and dismiss neuron count, the cumulative percentage of these 90 behavioral proxies became the basis for their welfare range estimates. Although the team used a number of models in their final analysis, these models were mostly based on different weightings of these same behavioral proxies[12]. Median final welfare ranges are therefore fairly well approximated by the simple formula:

(Behavioral proxy percent) × (Probability of sentience) ≈ Median welfare range
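In code, the approximation is a one-liner; the species names and input numbers below are purely illustrative stand-ins of mine, not RP’s published figures.

```python
# The simple approximation described above: median welfare range is
# roughly the behavioral proxy percentage times the probability of
# sentience. All inputs are illustrative, not RP's published numbers.

def approx_welfare_range(proxy_percent: float, p_sentience: float) -> float:
    return proxy_percent * p_sentience

examples = {
    "species A (pig-like)":    (0.60, 0.90),
    "species B (shrimp-like)": (0.30, 0.50),
}

for name, (proxy_pct, p_sent) in examples.items():
    print(f"{name}: ~{approx_welfare_range(proxy_pct, p_sent):.2f}")
# species A: ~0.54; species B: ~0.15
```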

The graph below shows how similar RP’s final moral weights are to this simple formula.

Final Reflections

At four critical junctures in their Moral Weights Project, RP chose animal-friendly options ahead of alternatives. If less animal-friendly options had been chosen at some or all of these junctures, then the median welfare ranges could have been lower by between 5 and 500 times.
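For transparency, here is how I reconstruct that headline range by multiplying the per-juncture multipliers above (Juncture 1 is excluded as unquantified; footnotes [14] and [15] discuss how the junctures interact rather than multiplying fully independently):

```python
# Combining the per-juncture multipliers stated above; Juncture 1
# (researcher selection) is left out because it is unquantified.
low = 1.5 * 1.5 * 2   # hedonism x neuron count x proxy discount = 4.5
high = 10 * 5 * 10    # = 500
print(f"median welfare ranges ~{low}x to {high}x lower")  # ~4.5x to 500x
```

Rounding the low end up gives the 5–500x quoted above.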

Bob Fischer pointed out three other junctures that I didn’t include here, where RP didn’t favor animals (full response in the appendix)[13]. These include not allowing animals capacities that humans might lack, assigning zero value to animals on behavioral proxies with no available evidence, and giving considerable weight to neurophysiological proxies. My inclination, though, is that these are less important junctures, as I don’t think any of them would swing the final results by more than 2x.

I’ll stress again that I’m not arguing here that any of RP’s decisions were necessarily wrong, only that they were animal-friendly.

Again, an enormous thanks to the RP team for their generous feedback. I look forward to any response.

  1. ^

    I also really like that RP prioritise engaging with forum responses, probably more than any other big EA org, and seem to post all their major work here.

  2. ^

    I run a human welfare organization and have devoted my adult life so far to this cause, so within the EA community I’m on the human-biased side. Interestingly, though, in the general population I would be comfortably in the top 1% for animal friendliness.

  3. ^

    Where possible I’ve tried to quantify how much these decisions might have influenced their final median welfare range, based on what the RP researchers themselves expressed.

  4. ^

    Jeff Kaufman and the discussion thread get into this in a bit more detail here: https://forum.effectivealtruism.org/posts/H6hxTrgFpb3mzMPfz

  5. ^
  6. ^

    Neuron counts were included, but I suspect the neuron count weighting is under 5%, which I don’t consider meaningful (I couldn’t figure it out exactly).

  7. ^

    I just guessed this multiplier, as RP didn’t have a range here.

  8. ^

    Phil Trammell puts it a slightly different way: “Regarding hedonic intensity, their approach is essentially to make a list of ways in which humans can feel good or bad, at least somewhat related to the list of capacities above, and then to look into how many items on the list a given nonhuman species seems to check off.”

  9. ^

    I understand there are differences of opinion here as to whether humans are likely to have more intense experiences than soldier flies, but on balance, assuming no difference still likely favours animals.

  10. ^

    Perhaps the complexity of the behavior could have been used to compare it with the human equivalent? This, though, wouldn’t be possible purely through reviewing published research.

  11. ^
  12. ^

    Neurophysiological proxies were also included, but given far less weight than behavioural proxies, as is made clear by the graph below.

  13. ^

    1) “We could have allowed capacities that humans lack, such as chemoreception or magnetoreception, to count toward animals’ welfare ranges. Instead, we used humans as the index species, assuming that humans have every proxy with certainty. This precludes the possibility of any other animal having welfare ranges larger than humans. In other words, from the very outset of the project, we assumed humans were at least tied for having the highest welfare range.

    2) It’s plausible that animals possess some of the traits for which we weren’t able to find evidence—as well as some of the traits for which we found negative evidence. Nevertheless, we effectively treated all unknowns and negative judgments as evidence for larger welfare range differences. We could have made some attempt to guess at how many of the proxies animals possess, or simply assign very low credences to proxy possession across the board, but we opted not to do that.

    3) We gave considerable weight to certain neurophysiological proxies that, in our estimation, entail that there are implausibly large differences in the possible intensities of valenced states. We did this largely out of respect for the judgment of others who have used them as proxies for differences in welfare ranges. Without these proxies, the differences between our welfare range estimates would be smaller still.”

  14. ^

    Here I just guess a multiplier, as unlike the other two junctures I can’t base it on RP’s numbers. I included a null 1x multiplier here because, if neuron count were weighted higher at an earlier juncture, that could arguably already account for not discounting behavioral proxies.

  15. ^

    This isn’t 10,000 (10 x 100 x 10) because I feel that if neuron counts were already heavily weighted, this would probably make the lack of discounting of behavioural proxies largely obsolete, so I counted this as x1 rather than x10.