It's worth pointing out that ACE's estimates/models (mostly weighted factor models, including ACE's versions of Scale-Tractability-Neglectedness, or STN) are often already pretty close to being BOTECs, but aren't quite BOTECs. I'd guess the smallest fix to make them more scope-sensitive is just to turn them into BOTECs, or whatever parts of them you can[1], whenever that isn't too much extra work. BOTECs and other quantitative models force you to pick factors and to scale and combine them in ways that are more scope-sensitive.
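To illustrate the scope-sensitivity point, here is a toy sketch (not ACE's actual model; the factor names, weights, and numbers are all made up) contrasting a weighted sum of bounded 1-7 scores with a BOTEC that multiplies real quantities:

```python
# Toy comparison (not ACE's model; all numbers and weights invented):
# a weighted sum of bounded 1-7 scores vs. a product of real quantities.

# Two hypothetical campaigns that differ ~100x in animals affected per year.
campaigns = {
    "A": {"animals_per_year": 1e5, "welfare_gain": 0.3, "p_implementation": 0.6},
    "B": {"animals_per_year": 1e7, "welfare_gain": 0.3, "p_implementation": 0.6},
}

def weighted_factor_score(scale_score, impact_score, likelihood_score,
                          weights=(0.4, 0.4, 0.2)):
    """Weighted sum of bounded 1-7 scores: a ~100x difference in scale
    can only move the total by a point or two."""
    return (weights[0] * scale_score
            + weights[1] * impact_score
            + weights[2] * likelihood_score)

def botec(animals_per_year, welfare_gain, p_implementation, years=3):
    """Product of real quantities: expected welfare-improved animal-years."""
    return animals_per_year * welfare_gain * p_implementation * years

# On a 1-7 scale, A might get Scale=4 and B Scale=6:
print(weighted_factor_score(4, 5, 4), weighted_factor_score(6, 5, 4))  # 4.4 vs 5.2 (~18% apart)
# The BOTEC preserves the ~100x difference:
print(botec(**campaigns["A"]), botec(**campaigns["B"]))  # 5.4e4 vs 5.4e6
```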
For the cost-effectiveness criterion, ACE makes judgements about the quality of charities' achievements with Achievement Quality Scores. For corporate outreach and producer outreach, ACE already scores factors from which direct average-impact BOTECs could fairly easily be built with some small changes, which I'd recommend:
- Score "Scale (1-7)" = "How many locations and animals are estimated to be affected by the commitments/campaign, if successful?" in terms of the number of animals (or animal life-years) per year of counterfactual impact, instead of on a 1-7 scale.
- Ideally, "Impact on animals (1-7)" should be scored quantitatively using Welfare Footprint Project's approach (some rougher estimates here and here) instead of on a 1-7 scale, but this is a lower priority than the other changes. Welfare improvements per animal or per year of animal life can probably vary by much more than a factor of 7, though, and can even be negative, so I'd probably at least adjust the range to be symmetric around 0 and let researchers select 0 or values very close to it.
The BOTEC is then just the product of "Impact on animals (1-7)" (the average[2] welfare improvement with successful implementation), "Scale", "Likelihood of implementation (%)", expected welfare range, and the number of years of counterfactual impact (until similar welfare improvements for the animals would have happened anyway and made these redundant). Similar BOTECs could be done for the direct impacts of other interventions.
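Spelled out as a sketch (the variable names and numbers here are mine and hypothetical, not ACE's), that product would look something like:

```python
# Hypothetical sketch of the proposed Achievement Quality BOTEC; all names
# and numbers are illustrative, not ACE's.

def achievement_botec(avg_welfare_improvement,  # per animal (or animal life-year),
                                                # as a fraction of the welfare range
                      animals_per_year,         # from the rescored Scale factor
                      p_implementation,         # "Likelihood of implementation (%)"
                      expected_welfare_range,   # e.g. relative to humans
                      counterfactual_years):    # until similar improvements would
                                                # have happened anyway
    """Expected welfare gain, in welfare-range-weighted animal-years."""
    return (avg_welfare_improvement * animals_per_year * p_implementation
            * expected_welfare_range * counterfactual_years)

# Hypothetical corporate commitment (numbers made up):
print(achievement_botec(0.2, 2e6, 0.7, 0.3, 5))  # 420,000 welfare-range-weighted animal-years
```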
For groups aiming to influence decision-making or funding in the near term with research, like Faunalytics, ACE could also highlight some of the most important decisions that have been (or are not too unlikely to be) informed by their research, so that we can independently judge how they compare to corporate outreach or other interventions. ACE could also use RP's model, or something similar, to get impact BOTECs for comparisons with more direct work.
For other charities, ACE could also think about how to turn its models into BOTECs or quantitative models of important outcomes. These can be intermediate outcomes or outputs that aren't necessarily comparable across all interventions, if impact for animals is too speculative to model but the potential upside is high enough and the potential downside small enough.[1]
For the Impact Potential criterion, ACE uses STN a lot and cites the 80,000 Hours article that explains how to get a BOTEC by interpreting and scoring the factors in specific ways. ACE could just follow that procedure, and then the STN estimates would be BOTECs.
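My reading of that article (this is a paraphrase, and the numbers below are invented) is that the factors are defined as ratios whose intermediate units cancel when multiplied, so a quantitatively scored STN assessment is already a BOTEC for marginal cost-effectiveness:

```python
# Paraphrase of 80,000 Hours' quantitative factor definitions (see their
# article for the exact versions); all numbers below are invented.
#   Scale         ~ good done if the problem were fully solved
#   Tractability  ~ fraction of the problem solved by doubling resources
#   Neglectedness ~ doublings of resources bought by an extra dollar
# Multiplying them, the intermediate units cancel, leaving good done per
# extra dollar, i.e. a marginal cost-effectiveness BOTEC.

def stn_botec(scale, tractability, neglectedness):
    """Marginal cost-effectiveness implied by quantitatively scored STN."""
    return scale * tractability * neglectedness

scale = 1e9              # welfare units gained if the problem were fully solved
tractability = 0.03      # fraction of the problem solved per doubling of resources
neglectedness = 1 / 1e7  # a doubling costs ~$10M, so doublings per extra dollar

print(stn_botec(scale, tractability, neglectedness))  # ~3 welfare units per extra dollar
```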
That being said, STN is easy to misapply in general (e.g. various critiques here), and I'd be careful about relying on it even if you were to follow 80,000 Hours' procedure to get BOTECs. For example, only a tiny share of a huge but relatively intractable problem, like wild animal welfare/suffering, may be tractable at all, so it's easy to overestimate the combination of Scale and Tractability in such cases. See also Joey's Why we look at the limiting factor instead of the problem scale and Saulius's Why I No Longer Prioritize Wild Animal Welfare. STN can be useful for guiding what to investigate further and for filtering charities for review, but I'd probably go for BOTECs done in other ways, like those above to replace Achievement Quality Scores, with more detailed theories of change.
For example, you could do a BOTEC of the number of additional engagement-weighted animal advocates, which could feed into a BOTEC for impact on animals; but if going from engagement-weighted advocates to animals is too speculative, you stop at engagement-weighted advocates. This could be refined further by weighting by country scores, as in the sketch below.
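Such an intermediate-outcome BOTEC might look like the following (the engagement categories, weights, country scores, and counts are all invented for illustration):

```python
# Hypothetical intermediate-outcome BOTEC: additional engagement-weighted
# (and country-weighted) animal advocates. All categories, weights, country
# scores, and counts are invented for illustration.

engagement_weights = {"newsletter_signup": 0.05, "volunteer": 0.5, "career_change": 5.0}
country_scores = {"BR": 1.5, "US": 1.0}  # e.g. reflecting farmed animal numbers and tractability

# Estimated counterfactual additional advocates from a program:
additional_advocates = [
    {"country": "BR", "level": "volunteer", "count": 120},
    {"country": "US", "level": "career_change", "count": 4},
]

total = sum(a["count"] * engagement_weights[a["level"]] * country_scores[a["country"]]
            for a in additional_advocates)
print(total)  # 120*0.5*1.5 + 4*5.0*1.0 = 110.0 weighted advocates
```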
Per animal or per animal life-year, to match Scale.
It seems ACE did so for the Scale factor, but gave no specific quantitative interpretation of the others.
Hi Michael, thanks a lot for the helpful comments, and for taking the time to be so thorough in your feedback. We've been thinking a lot about how to produce proxies for impact that can be meaningfully compared with one another, with BOTECs being one possible way to help achieve that, so it's really useful to get your views. We'll talk these through as a team as we consider improvements to our process for the coming years.
- Max