I’m a Senior Researcher for Rethink Priorities, a Professor of Philosophy at Texas State University, a Director of the Animal Welfare Economics Working Group, the Treasurer for the Insect Welfare Research Society, and the President of the Arthropoda Foundation. I work on a wide range of theoretical and applied issues related to animal welfare. You can reach me here.
Thanks a bunch for your question, Matt. I can speak to the philosophical side of this; Laura has some practical comments below. I do think you’re right that—and in fact our team discussed the possibility that—we ought to be treating the welfare range estimates as correlated variables. However, we weren’t totally sure that that’s the best way forward, as it may treat the models with more deference than makes sense.
Here’s the rough thought. We need to distinguish between (a) philosophical theories about the relationship between the proxies and welfare ranges and (b) models that attempt to express the relationship between proxies and welfare range estimates. We assume that there’s some correct theory about the relationship between the proxies and welfare ranges, but while there might be a best model for expressing the relationship between proxies and welfare range estimates, we definitely don’t assume that we’ve found it. In part, this is because of ordinary points about uncertainty. Additionally, it’s because the philosophical theories underdetermine the models: lots of models are compatible with any given philosophical theory; so, we just had to choose representative possibilities. (The 1-point-per-proxy and aggregation-by-addition approaches, for instance, are basically justified by appeal to simplicity and ignorance. But, of course, the philosophical theory behind them is compatible with many other scoring and aggregation methods.)

So, there’s a worry that if we set things up the way you’re describing, we’re treating the models as though they were the philosophical theories, whereas it might make more sense not to do that and then make other adjustments for practical purposes in specific decision contexts if we’re worried about this.
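To make the underdetermination point concrete, here’s a minimal sketch of how two scoring-and-aggregation models can be compatible with the same underlying theory yet yield different estimates. The proxy list, the weights, and all numbers below are made up for illustration; they are not our actual proxies or data.

```python
import numpy as np

# Hypothetical proxy observations for one species: 1 = proxy present, 0 = absent.
# These values are illustrative only, not actual welfare-range data.
proxies = np.array([1, 1, 0, 1, 0, 1, 1, 0])

# Model A: 1-point-per-proxy scoring, aggregation by addition,
# normalized by the number of proxies.
score_additive = proxies.sum() / proxies.size

# Model B: an equally theory-compatible alternative -- unequal (made-up)
# weights on the same proxies, still aggregated by summation.
weights = np.array([0.3, 0.2, 0.1, 0.1, 0.1, 0.1, 0.05, 0.05])
score_weighted = (weights * proxies).sum() / weights.sum()

print(score_additive, score_weighted)  # the two models disagree
```

Both models respect the same philosophical claim (more proxies, larger welfare range), but they produce different numbers, which is why picking one is a modeling choice rather than a theoretical commitment.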
Laura’s practical notes on this:
A change like the one you’re suggesting would likely decrease the variance in the estimates of f(): if you assume the welfare ranges are independent variables, you’ll get samples where the undiluted experiences model is dominating the welfare range for, say, shrimp, while the neuron count model is dominating the welfare range for pigs. A quick practical way of dealing with this would be to discard values of f() below the 2.5th percentile or above the 97.5th percentile.
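That truncation step can be sketched as follows. The lognormal draws below are stand-ins for Monte Carlo samples of two species’ welfare ranges (drawn independently); the distributions and parameters are hypothetical, not our actual estimates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative stand-ins for independent Monte Carlo samples of two
# species' welfare ranges (hypothetical parameters).
wr_shrimp = rng.lognormal(mean=-2.0, sigma=1.0, size=100_000)
wr_pig = rng.lognormal(mean=-0.5, sigma=1.0, size=100_000)

f = wr_shrimp / wr_pig  # samples of the ratio f()

# Keep only the central 95% of f(), discarding both tails.
lo, hi = np.percentile(f, [2.5, 97.5])
f_trimmed = f[(f >= lo) & (f <= hi)]

print(f_trimmed.var() < f.var())  # trimming the tails reduces the variance
```

This is the blunt version of the fix: it doesn’t recover the model-level correlation, it just clips the extreme mismatched samples.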
Or, even better, I suggest sorting each species’ welfare range samples from least to greatest, then using the pair of ith-indexed welfare ranges for the ith estimate of f(). Since each welfare model is given the same weight, I predict this’ll most accurately match up welfare range values from the same welfare model. (E.g., the first 11% will be neuron count welfare ranges, etc.)
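Here’s a minimal sketch of that rank-matching idea, again using hypothetical lognormal samples in place of the real welfare range distributions. Pairing sorted samples induces the strongest possible positive correlation between the two sets of draws, which is what shrinks the spread of the ratio.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative independent samples of two species' welfare ranges
# (hypothetical parameters, not actual estimates).
wr_shrimp = rng.lognormal(mean=-2.0, sigma=1.0, size=10_000)
wr_pig = rng.lognormal(mean=-0.5, sigma=1.0, size=10_000)

# Rank-matching: the ith smallest shrimp sample is paired with the ith
# smallest pig sample. If each welfare model gets equal weight and tends
# to produce similarly ranked values, matched pairs likely come from the
# same model.
f_matched = np.sort(wr_shrimp) / np.sort(wr_pig)

# Compare against naive independent pairing.
f_independent = wr_shrimp / wr_pig
print(f_matched.var() < f_independent.var())
```

Note the limitation: rank-matching only approximates same-model pairing to the extent that the models’ output ranges don’t overlap across species.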
Ultimately, however, given all the uncertainty in whether our models are accurately tracking reality, it might not be advisable to reduce the variance as such.