Vasco Grilo🔸 comments on Rethink Priorities’ Welfare Range Estimates

Vasco Grilo🔸 3 Mar 2023 18:41 UTC
2 points
0 ∶ 0
Hi Bob,
Great work!
I think it would be nice to have all the estimates in the table here with 3 significant digits, in order not to propagate errors. I understand more digits may give a sense of false precision, but you provide the 5th and 95th percentiles in the same table, so I suppose the uncertainty is already being conveyed.
Why do you give estimates for the median moral weight, instead of the mean moral weight? Normally, we care about expectations...
- Bob Fischer 3 Mar 2023 21:38 UTC
  5 points
  0 ∶ 0
  Parent
  Thanks, Vasco!
  Short version: I want to discourage people from using these numbers in any context where that level of precision might be relevant. That is, if the sign of someone’s analysis turns on three significant digits, then I doubt that their analysis is action-relevant.
  As for medians rather than means, our main concern there was just that means tend to be skewed toward extremes. But we can generate the means if it’s important!
  Finally, I should stress that I’m seeing people use these “moral weights” roughly as follows: “100 humans = ~33 chickens (100*.332= ~33).” This is not the way they’re intended to be used. Minimally, they should be adjusted by lifespan and average welfare levels, as they are estimates of welfare ranges rather than all-things-considered estimates of the strength of our moral reasons to benefit members of one species rather than another.
  - Vasco Grilo🔸 5 Mar 2023 15:16 UTC
    2 points
    0 ∶ 0
    Parent
    Hi again,
    Sorry, I forgot to touch on this point:
    As for medians rather than means, our main concern there was just that means tend to be skewed toward extremes. But we can generate the means if it’s important!
    Do you think the extremes of your moral weight distributions are reasonable? If so, even if the mean is skewed towards them, it would become more accurate. Anyways, I would say sharing the mean would be important, such that people could see how much influence extremes have (i.e. how heavy-tailed is the moral weight distribution).
    - Bob Fischer 9 Mar 2023 14:04 UTC
      4 points
      0 ∶ 0
      Parent
      Sorry for the slow reply, Vasco. Here are the means you requested. My vote is that if people are looking for placeholder moral weights, they should use our 50th-pct numbers, but I don’t have very strong feelings on that. And I know you know this, but I do want to stress for any other readers that these numbers are not “moral weights” as that term is often used in EA. Many EAs want one number per species that captures the overall strength of their moral reason to help members of that species relative to all others, accounting for moral uncertainty and a million other things. We aren’t offering that. The right interpretation of these numbers is given in the main post as well as in our Intro to the MWP.
      - Vasco Grilo🔸 3 Feb 2025 15:36 UTC
        2 points
        0 ∶ 0
        Parent
        My vote is that if people are looking for placeholder moral weights, they should use our 50th-pct numbers, but I don’t have very strong feelings on that.
        Are there concrete reasons for neglecting large welfare ranges which have not been considered in the estimation of the welfare range distributions? If not, why should one use the medians instead of the means? The mean welfare range of shrimp is 4.67 (= 0.21/0.045) times the median welfare range of shrimp, whereas the mean welfare range of chickens is 1.00 (= 0.368/0.368) times the median welfare range of chickens. So using medians instead of means makes the cost-effectiveness of helping chickens as a fraction of that of helping shrimp 4.67 times as high.
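A minimal sketch of the arithmetic in the comment above, using only the means and medians quoted there. It assumes, for illustration only, that cost-effectiveness scales linearly with the welfare range and that everything else is held fixed; the variable names are hypothetical.

```python
# Welfare range figures quoted in the comment above (mean, median).
shrimp_mean, shrimp_median = 0.21, 0.045
chicken_mean, chicken_median = 0.368, 0.368

# Ratio of mean to median for each species.
shrimp_ratio = shrimp_mean / shrimp_median
chicken_ratio = chicken_mean / chicken_median

# Assumption: cost-effectiveness is proportional to the welfare range,
# so the chicken-to-shrimp comparison reduces to a ratio of welfare ranges.
chicken_vs_shrimp_means = chicken_mean / shrimp_mean
chicken_vs_shrimp_medians = chicken_median / shrimp_median

# How much switching from means to medians favours chickens over shrimp.
shift = chicken_vs_shrimp_medians / chicken_vs_shrimp_means

print(f"shrimp mean/median:  {shrimp_ratio:.2f}")
print(f"chicken mean/median: {chicken_ratio:.2f}")
print(f"medians vs means shift: {shift:.2f}")
```

Because the chicken mean and median coincide, the shift equals the shrimp mean-to-median ratio.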
        Laura Duffy 3 Feb 2025 17:13 UTC
        5 points
        1 ∶ 0
        Parent
        Hi Vasco,
        Thanks for the good question! I think it’s important to note that there are (at least) 3 types of model choices and uncertainty at work:
        a) we have a good deal of uncertainty about each theory of welfare represented in the model,
        b) we don’t have a ton of confidence that the function we included to represent each theory of welfare is accurate (especially the undiluted experiences function, which partially drives the high mean results),
        c) we could have uncertainty that our approach to estimating welfare ranges in general is correct, but we’ve not included this overall model uncertainty. For instance, our model has no “prior” welfare ranges for each species, so the distribution output by the calculation entirely determines our judgement of the welfare range of the species involved. We also might be uncertain that simply taking a weighted mixture of each theory of welfare is a good way to arrive at an overall judgement of welfare ranges. Etc.
        
        Our preliminary method used in this project incorporates model uncertainty in the form of (a) by mixing together the separate distributions generated by each theory of welfare, but we don’t incorporate model uncertainty in the ways specified by (b) or (c). I think these additional layers of uncertainty are epistemically important, and incorporating them would likely serve to “dampen” the effect that the mean result of the model affects our all-things-considered judgement about the welfare capacity of any species. Using the median is a quick (though not super rigorous or principled) way of encoding that conservatism/additional uncertainty into how you apply the moral weight project’s results in real life. But there are other ways to aggregate the estimates, which could (and likely would) be better than using the median.
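To illustrate why the mean and median of a mixture can diverge, here is a rough sketch with two entirely hypothetical lognormal components standing in for two theories of welfare. The weights and parameters are made up for illustration and are not taken from the project's model; the point is only that a heavy-tailed component pulls the mixture mean far more than the mixture median.

```python
import math
from statistics import NormalDist

# Hypothetical components: (weight, mu, sigma) of the log welfare range.
# These numbers are illustrative assumptions, not project estimates.
components = [
    (0.5, math.log(0.05), 0.5),  # narrow component centred near 0.05
    (0.5, math.log(0.5), 1.5),   # wide, heavy-tailed component
]

def mixture_mean(comps):
    # The mean of a lognormal is exp(mu + sigma^2 / 2); a mixture mean
    # is the weighted average of the component means.
    return sum(w * math.exp(mu + sigma ** 2 / 2) for w, mu, sigma in comps)

def mixture_cdf(x, comps):
    # The mixture CDF is the weighted average of the component CDFs.
    return sum(w * NormalDist(mu, sigma).cdf(math.log(x)) for w, mu, sigma in comps)

def mixture_median(comps, lo=1e-9, hi=1e3, iters=200):
    # Bisection on a log scale for the point where the CDF crosses 0.5.
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if mixture_cdf(mid, comps) < 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

mean_value = mixture_mean(components)
median_value = mixture_median(components)
print(f"mixture mean:   {mean_value:.3f}")
print(f"mixture median: {median_value:.3f}")
```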
        
        Vasco Grilo🔸 3 Feb 2025 19:00 UTC
        4 points
        0 ∶ 0
        Parent
        Thanks for the good reply too, Laura.
        I think these additional layers of uncertainty are epistemically important, and incorporating them would likely serve to “dampen” the effect that the mean result of the model affects our all-things-considered judgement about the welfare capacity of any species.
        I tend to agree.
        But there are other ways to aggregate the estimates, which could (and likely would) be better than using the median.
        I wondered whether it would be better for you to aggregate the results from the different models with the geometric mean of odds. For example, if models 1 and 2 implied a probability of 50 % and 90 % of the welfare range being smaller than 0.2, corresponding to odds of 1 (= 0.5/(1 − 0.5)) and 9 (= 0.9/(1 − 0.9)), the aggregated model would imply odds of 3 (= (1*9)^0.5) of the welfare range being smaller than 0.2, corresponding to a probability of 75 % (= 1/(1 + ¹⁄₃)). There is some evidence for using the geometric mean of odds, so I believe an approach like this combined with using the means of the aggregated distributions would be better than your approach of using the medians of the final distributions at the end.
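The worked example above can be sketched in a few lines. This is an unweighted geometric mean of odds applied to the two probabilities from the comment; the function names are hypothetical, and equal weighting of the models is an assumption.

```python
import math

def prob_to_odds(p):
    # Odds in favour of an event with probability p, for 0 < p < 1.
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")
    return p / (1.0 - p)

def odds_to_prob(odds):
    return odds / (1.0 + odds)

def pool_geometric_mean_of_odds(probs):
    # Average the log-odds, then map back to a probability.
    log_odds = [math.log(prob_to_odds(p)) for p in probs]
    mean_log_odds = sum(log_odds) / len(log_odds)
    return odds_to_prob(math.exp(mean_log_odds))

# Models 1 and 2: probability that the welfare range is smaller than 0.2.
model_probs = [0.5, 0.9]
pooled = pool_geometric_mean_of_odds(model_probs)
arithmetic = sum(model_probs) / len(model_probs)

print(f"geometric mean of odds: {pooled:.2f}")
print(f"arithmetic mean of probabilities: {arithmetic:.2f}")
```

Repeating this at each threshold, rather than only at 0.2, would give an aggregated cumulative distribution from which a mean could then be computed.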
      - Vasco Grilo🔸 9 Mar 2023 15:07 UTC
        2 points
        0 ∶ 0
        Parent
        Thanks for clarifying and sharing the means, Bob! There are some significant differences to the medians for some species, so it looks like it would be important to see whether the extremes of the distributions are being well represented.
  - Vasco Grilo🔸 4 Mar 2023 17:42 UTC
    2 points
    0 ∶ 0
    Parent
    Thanks for clarifying!
    Short version: I want to discourage people from using these numbers in any context where that level of precision might be relevant.
    I thought this would be the reason. That being said, I still think it makes sense to present the results with 2 or 3 significant digits whenever the uncertainty is already being conveyed. For example, if I say the mean moral weight is 1.00, and the 5th and 95th percentiles are 0.00100 and 1.00 k, it should be clear that the result is pretty uncertain, even though all numbers have 3 significant digits.
    That is, if the sign of someone’s analysis turns on three significant digits, then I doubt that their analysis is action-relevant.
    I agree in general, but wonder whether for some cases it may matter in a non-crucial way. For example, the ratio between 1.50 and 2.49 is 0.602 without rounding, but 1 if we round both numbers to 1 significant digit (both become 2). An error of a factor of 0.602 may not be crucial, but it will not necessarily be totally negligible either.
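A quick sketch of the rounding example above. The helper `round_sig` is hypothetical, and it relies on Python's built-in `round`, which rounds exact halves to the nearest even digit (so 1.5 becomes 2.0 here).

```python
from math import floor, log10

def round_sig(x, digits):
    # Round x to the given number of significant digits.
    if x == 0:
        return 0.0
    return round(x, digits - 1 - floor(log10(abs(x))))

a, b = 1.50, 2.49

exact_ratio = a / b
ratio_1_digit = round_sig(a, 1) / round_sig(b, 1)
ratio_3_digits = round_sig(a, 3) / round_sig(b, 3)

print(f"exact ratio:            {exact_ratio:.3f}")
print(f"inputs rounded to 1 sd: {ratio_1_digit:.3f}")
print(f"inputs rounded to 3 sd: {ratio_3_digits:.3f}")
```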
    Finally, I should stress that I’m seeing people use these “moral weights” roughly as follows: “100 humans = ~33 chickens (100*.332= ~33).” This is not the way they’re intended to be used.
    Ahah, I agree! They are supposed to be used as follows: “100 chickens = 100*0.332 humans = 33.2 humans”. One should always be careful not to interpret the moral weight of chickens relative to humans as that of humans relative to chickens, and also present the final result with 3 significant digits instead of 2.
    Jokes aside, when I read “[based on RP’s median moral weights] 100 chickens = 33.2 humans”, I assume that the duration and intensity of experience (relative to the moral weight) are the same for both humans and chickens, because that is what the moral weight alone tells us. However, if one says “saving x humans equals saving y chickens”, I agree the moral weights have to be combined with other variables, because now we are describing the consequences of actions instead of just a direct comparison of experiences.