Hi Bob,
Great work!
I think it would be nice to have all the estimates in the table here with 3 significant digits, in order not to propagate errors. I understand more digits may give a sense of false precision, but you provide the 5th and 95th percentiles in the same table, so I suppose the uncertainty is already being conveyed.
Why do you give estimates for the median moral weight, instead of the mean moral weight? Normally, we care about expectations...
Thanks, Vasco!
Short version: I want to discourage people from using these numbers in any context where that level of precision might be relevant. That is, if the sign of someone’s analysis turns on three significant digits, then I doubt that their analysis is action-relevant.
As for medians rather than means, our main concern there was just that means tend to be skewed toward extremes. But we can generate the means if it’s important!
Finally, I should stress that I’m seeing people use these “moral weights” roughly as follows: “100 humans = ~33 chickens (100*.332= ~33).” This is not the way they’re intended to be used. Minimally, they should be adjusted by lifespan and average welfare levels, as they are estimates of welfare ranges rather than all-things-considered estimates of the strength of our moral reasons to benefit members of one species rather than another.
Hi again,
Sorry, I forgot to touch on this point:
Do you think the extremes of your moral weight distributions are reasonable? If so, even if the mean is skewed towards them, it would become more accurate. Anyways, I would say sharing the mean would be important, such that people could see how much influence extremes have (i.e. how heavy-tailed is the moral weight distribution).
Sorry for the slow reply, Vasco. Here are the means you requested. My vote is that if people are looking for placeholder moral weights, they should use our 50th-pct numbers, but I don’t have very strong feelings on that. And I know you know this, but I do want to stress for any other readers that these numbers are not “moral weights” as that term is often used in EA. Many EAs want one number per species that captures the overall strength of their moral reason to help members of that species relative to all others, accounting for moral uncertainty and a million other things. We aren’t offering that. The right interpretation of these numbers is given in the main post as well as in our Intro to the MWP.
Are there concrete reasons for neglecting large welfare ranges which have not been considered in the estimation of the welfare range distributions? If not, why should one use the medians instead of the means? The mean welfare range of shrimp is 4.67 (= 0.21/0.045) times the median welfare range of shrimp, whereas the mean welfare range of chickens is 1.00 (= 0.368/0.368) times the median welfare range of chickens. So using medians instead of means makes the cost-effectiveness of helping chickens as a fraction of that of helping shrimp 4.67 times as high.
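The arithmetic in this comment can be checked in a few lines of Python. This is only a sketch: the four figures are the ones quoted above, while the variable names are illustrative and nothing here comes from the project's own code.

```python
# Welfare-range figures quoted in the comment above (relative to humans).
shrimp_mean, shrimp_median = 0.21, 0.045
chicken_mean, chicken_median = 0.368, 0.368

# How far the mean sits above the median for each species.
shrimp_ratio = shrimp_mean / shrimp_median      # about 4.67
chicken_ratio = chicken_mean / chicken_median   # exactly 1.0

# Chicken welfare range as a multiple of the shrimp one, under each summary statistic.
chicken_vs_shrimp_median = chicken_median / shrimp_median
chicken_vs_shrimp_mean = chicken_mean / shrimp_mean

# Factor by which switching from means to medians favours chickens over shrimp.
shift = chicken_vs_shrimp_median / chicken_vs_shrimp_mean

print(f"{shrimp_ratio:.2f} {chicken_ratio:.2f} {shift:.2f}")
```

Because the chicken mean and median coincide, the whole shift comes from the shrimp ratio, which is why the factor is again about 4.67.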
Hi Vasco,
Thanks for the good question! I think it’s important to note that there are (at least) 3 types of model choices and uncertainty at work:
a) we have a good deal of uncertainty about each theory of welfare represented in the model,
b) we don’t have a ton of confidence that the function we included to represent each theory of welfare is accurate (especially the undiluted experiences function, which partially drives the high mean results),
c) we could have uncertainty that our approach to estimating welfare ranges in general is correct, but we’ve not included this overall model uncertainty. For instance, our model has no “prior” welfare ranges for each species, so the distribution output by the calculation entirely determines our judgement of the welfare range of the species involved. We also might be uncertain that simply taking a weighted mixture of each theory of welfare is a good way to arrive at an overall judgement of welfare ranges. Etc.
Our preliminary method used in this project incorporates model uncertainty in the form of (a) by mixing together the separate distributions generated by each theory of welfare, but we don’t incorporate model uncertainty in the ways specified by (b) or (c). I think these additional layers of uncertainty are epistemically important, and incorporating them would likely serve to “dampen” the effect that the mean result of the model has on our all-things-considered judgement about the welfare capacity of any species. Using the median is a quick (though not super rigorous or principled) way of encoding that conservatism/additional uncertainty into how you apply the moral weight project’s results in real life. But there are other ways to aggregate the estimates, which could (and likely would) be better than using the median.
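To illustrate why a weighted mixture can have a mean well above its median, here is a toy sketch. Everything in it is hypothetical: the two components, their credences, and their ranges are made up for illustration and are not the project's theories, weights, or code.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# HYPOTHETICAL components, not the project's actual theories or weights.
# Each entry: (credence in the theory, sampler for that theory's welfare range).
components = [
    (0.9, lambda: random.uniform(0.0, 0.1)),  # most credence: small welfare range
    (0.1, lambda: random.uniform(0.0, 2.0)),  # little credence: possibly very large
]

def sample_mixture(n):
    """Draw n values by picking a theory in proportion to its credence, then sampling it."""
    weights = [w for w, _ in components]
    samplers = [s for _, s in components]
    return [random.choices(samplers, weights=weights)[0]() for _ in range(n)]

draws = sample_mixture(100_000)
mixture_mean = statistics.mean(draws)
mixture_median = statistics.median(draws)
print(f"mean = {mixture_mean:.3f}, median = {mixture_median:.3f}")
```

In this toy setup the low-credence, wide component pulls the mean to more than twice the median, which is the kind of sensitivity that choosing the median sidesteps.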
Thanks for the good reply too, Laura.
I tend to agree.
I wondered whether it would be better for you to aggregate the results from the different models with the geometric mean of odds. For example, if models 1 and 2 implied a probability of 50 % and 90 % of the welfare range being smaller than 0.2, corresponding to odds of 1 (= 0.5/(1 - 0.5)) and 9 (= 0.9/(1 - 0.9)), the aggregated model would imply odds of 3 (= (1*9)^0.5) of the welfare range being smaller than 0.2, corresponding to a probability of 75 % (= 1/(1 + 1/3)). There is some evidence for using the geometric mean of odds, so I believe an approach like this combined with using the means of the aggregated distributions would be better than your approach of using the medians of the final distributions at the end.
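A minimal sketch of the pooling step described above, assuming equal weights across models (as in the two-model example) and probabilities strictly between 0 and 1. The function names are illustrative, not from any library.

```python
import math

def prob_to_odds(p):
    """Convert a probability in (0, 1) to odds."""
    return p / (1.0 - p)

def odds_to_prob(odds):
    """Convert odds back to a probability."""
    return odds / (1.0 + odds)

def geo_mean_odds(probs):
    """Pool probabilities via the equally weighted geometric mean of their odds."""
    log_odds = [math.log(prob_to_odds(p)) for p in probs]
    pooled_odds = math.exp(sum(log_odds) / len(log_odds))
    return odds_to_prob(pooled_odds)

# The example from the comment: models at 50 % and 90 % for "welfare range < 0.2".
pooled = geo_mean_odds([0.5, 0.9])
print(f"{pooled:.2f}")
```

This pools a single threshold. Aggregating whole distributions this way would mean repeating the step at each threshold of the cumulative distribution, which is left out of the sketch.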
Thanks for clarifying and sharing the means, Bob! There are some significant differences to the medians for some species, so it looks like it would be important to see whether the extremes of the distributions are being well represented.
Thanks for clarifying!
I thought this would be the reason. That being said, I still think it makes sense to present the results with 2 or 3 significant digits whenever the uncertainty is already being conveyed. For example, if I say the mean moral weight is 1.00, and the 5th and 95th percentiles are 0.00100 and 1.00 k, it should be clear that the result is pretty uncertain, even though all numbers have 3 significant digits.
I agree in general, but wonder whether for some cases it may matter in a non-crucial way. For example, the ratio between 1.50 and 2.49 is 0.602 without rounding, but 1 if we round both numbers to 2. An error of a factor of 0.602 may not be crucial, but it will not necessarily be totally negligible either.
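The rounding example can be reproduced as follows. The helper `round_sig` is an illustrative name rather than a standard function, and it assumes a non-zero input.

```python
import math

def round_sig(x, digits):
    """Round a non-zero number to the given count of significant digits."""
    magnitude = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - magnitude)

a, b = 1.50, 2.49

exact_ratio = a / b                               # ratio of the unrounded values
coarse_ratio = round_sig(a, 1) / round_sig(b, 1)  # both inputs round to 2.0 first

# Python's round() uses round-half-to-even, which still sends 1.5 to 2.
print(round_sig(exact_ratio, 3), coarse_ratio)
```

Rounding each input to one significant digit before dividing turns a ratio of about 0.602 into exactly 1, which is the propagated error the comment points to.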
Ahah, I agree! They are supposed to be used as follows: “100 chickens = 100*0.332 humans = 33.2 humans”. One should always be careful not to interpret the moral weight of chickens relative to humans as that of humans relative to chickens, and also present the final result with 3 significant digits instead of 2.
Jokes apart, when I read “[based on RP’s median moral weights] 100 chickens = 33.2 humans”, I assume that the duration and intensity of experience (relative to the moral weight) are the same for both humans and chickens, because that is what the moral weight alone tells us. However, if one says “saving x humans equals saving y chickens”, I agree the moral weights have to be combined with other variables, because now we are describing the consequences of actions instead of just a direct comparison of experiences.