Thank you for tackling a very important problem. But currently I feel I’d be lost when trying to apply this model because many factors need more explanation. For example, how does the cortisol level weigh against the dopamine level? And what levels are good? How should the various listed factors be measured and weighted to assess anxiety? Etc.
Some examples of this model being applied would be very helpful for understanding the model. Is that the next step in your research?
Yes indeed, that is the next step. We plan on applying this system to ~15 animal situations and doing a 1-5 hour report on each. This would cover both different animals (e.g. wild rats and factory-farmed cows) and different welfare situations for the same animal (e.g. a report each for battery-caged laying hens vs enriched-cage laying hens).
On biological markers specifically, from the research we have done so far, it’s very hard to find any consistent biological markers, not to mention situations where we have several that we can cross-compare on the same animal. Generally a good score might look like “some cortisol tests have been done on rats in an ideal living situation vs wild rats and the cortisol levels are about the same.” If the same study were done but the cortisol levels were much higher in the wild rats, that would be an indication of lower wild rat welfare.
I wanted to echo all of Saulius’ points (including the thanks for doing this!).
To clarify your response here: all of the rankings are essentially subjective judgements, based on whatever evidence you have available in that category? So in the example above, if those cortisol tests were somehow your only evidence in the “index of biological markers” category, you would just decide a score that you felt represented the appropriate level of badness for the wild rat “index of biological markers” score?
I’m also wondering if you’re going to use the method to compare humans to non-human animals? Some of the biological measures we could use fall down when we think about how humans fit in, e.g. neuron count. Including humans in comparative measures seems valuable for reflecting on/testing intuitions we might otherwise have about cross-species comparisons.
Re: biological markers, the ideal situation would be multiple markers for the animal in an ideal life vs their current life vs a perfectly unideal life; scores would then be given based on how their current life compares. In practice, sometimes we have found data on a happy life vs a standard life for an animal and can get some sense of how far apart these are, but often we have found no applicable data at all for this section. Our reports are very time-capped (5 hours or less depending on the importance of the animal), so we do not dive deep into the mechanisms.
Humans from different situations will be ranked as well. I agree having them as a comparative measure for cross-species comparison allows for much easier intuition checks.
Thanks for providing the examples! A couple of questions:
1) Can I check I’ve understood: the “Estimated population size” and “Odds of feeling pain” columns are not factored into the “total welfare score” (which is made up of adding together scores from the various criteria which then end up somewhere between −100 and +100) at all; they are to be used separately.
So if you wanted to work out whether sparing 10 broiler chickens or 20 beef cows from existence was more impactful, you’d have to multiply your result by the odds of feeling pain etc. E.g. for chickens: 10 * −56 * 0.7 = −392 units of suffering prevented. For beef cows: 20 * −20 * 0.75 = −300 units of suffering prevented. So sparing chickens is slightly better by this metric (also: note that people might not agree that the rough estimates from the OPP on consciousness mean the same thing as “odds of feeling pain,” e.g. if you subscribe to consciousness eliminativism, although I haven’t read the OPP report in a while so might be misremembering the specifics).
2) I don’t understand where the “range” figure comes from?
1) As you correctly observed, we didn’t adjust welfare points for population size and odds of feeling pain in this spreadsheet. But we just published another report summarizing our animal prioritization research, where we aggregated information about baseline welfare points, population size, odds of feeling pain, neglectedness, and amount of suffering caused by a smaller number of specific reasons.
Generally, when we are calculating the cost-effectiveness of a given intervention we take into account the number of welfare points “gained” (baseline welfare points changed counterfactually by the intervention) multiplied by odds of feeling pain and number of animals affected.
We also need to adjust for length of life. For example, suppose the baseline welfare points (WP) per year are −20 for a cow and −56 for a broiler chicken. A beef cow spends 402 days on a farm, so its WP would be multiplied by the fraction of a year it spends on the farm: 402 days / 365 days in a year = 110%. A broiler chicken spends 42 days, so its WP would be multiplied by 12%. This results in:
Cow: −22 welfare points per lifetime of an individual
Broiler chicken: −6.72 welfare points per lifetime of an individual.
2) The range is the minimum and maximum values of welfare points as rated by our external reviewers. “Total welfare score” (second column) is an average of internal and external reviewers’ ratings.
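For anyone who wants to reproduce the arithmetic in this exchange, here is a minimal Python sketch of the two steps described above: scaling yearly welfare points by time spent on the farm, and multiplying welfare points by odds of feeling pain and number of animals. The function names are hypothetical and not taken from the reports; the 0.7 and 0.75 odds are the figures used in the question above. Note that the exact ratio 42/365 is about 11.5%, so the unrounded chicken figure comes out near −6.44 rather than the −6.72 obtained with the rounded 12%.

```python
DAYS_PER_YEAR = 365


def lifetime_welfare_points(wp_per_year, days_on_farm):
    """Scale yearly welfare points by the fraction of a year spent on the farm."""
    return wp_per_year * days_on_farm / DAYS_PER_YEAR


def expected_impact(welfare_points, odds_of_pain, n_animals):
    """Welfare points multiplied by odds of feeling pain and number of animals."""
    return welfare_points * odds_of_pain * n_animals


# Lifetime adjustment, using the per-year figures from the thread.
cow_lifetime = lifetime_welfare_points(-20, 402)      # about -22.03
chicken_lifetime = lifetime_welfare_points(-56, 42)   # about -6.44 (unrounded)

# Unadjusted comparison from the question: 10 broilers vs 20 beef cows.
chickens_total = expected_impact(-56, 0.7, 10)        # about -392
cows_total = expected_impact(-20, 0.75, 20)           # about -300

print(round(cow_lifetime, 2), round(chicken_lifetime, 2))
print(round(chickens_total, 1), round(cows_total, 1))
```

Passing the lifetime-adjusted values into `expected_impact` instead of the per-year scores would give a per-lifetime comparison; which input is appropriate depends on the question being asked.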
Also, I think the link “WAS research had a great summary” does not link to where you intended.
Thanks. Fixed.
We have applied this system to 15 different animals/breeds and recently posted the summary of our research here.