Probability distributions of Cost-Effectiveness can be misleading


The “mean” cost-effectiveness of interventions with uncertain impact can be misleading, sometimes significantly.
We want to consider mean(cost)/mean(effect), not mean(cost/effect). (Edit: we actually want mean(effect/cost).)
Edit: there’s a lot of discussion going on in the comments. I still think the core point is correct, especially if you consider yourself part of a group that’s funding lots of interventions (e.g. EA), but I’m much less sure of it. Will add an appendix tonight.
I found this comment particularly enlightening, and unrelatedly agree with others that using value per dollar instead of dollars per value can help in most cases.

Edit 2: someone who knows a lot more than me about this offered to help write a better post, stay tuned!

Edit 3: that didn’t happen, but I now think that the top comment is right: we almost always want to use mean(effect/cost).

Epistemic status (how sure I am of this): I’m pretty confident about the main claim, but still confused about the details; I end the post with some questions.

Minimal extreme example:

Let’s say that you have a magical intervention that has:

  • A 1/3 chance of saving 1 life

  • A 1/3 chance of saving 100 lives

  • A 1/3 chance of saving 199 lives

All for the known cost of $10,000.
It would be an amazing intervention! If you run hundreds of similar interventions, you save lives at a cost-effectiveness of $100/life: the expected value is 100 lives saved, and the cost is always $10,000.

But here is what happens if you model it in Guesstimate:

You get a mean cost per life of $3,400, changing the useful value by a factor of 34![1]
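The gap is easy to reproduce outside Guesstimate; here is a minimal numpy sketch of the example above:

```python
import numpy as np

# The three equally likely outcomes of the minimal example
effects = np.array([1, 100, 199])  # lives saved
cost = 10_000                      # dollars, known with certainty

mean_of_ratio = np.mean(cost / effects)  # what Guesstimate shows: mean(cost/effect)
ratio_of_means = cost / effects.mean()   # what we care about: mean(cost)/mean(effect)

print(round(mean_of_ratio))   # ~3383 dollars per life
print(round(ratio_of_means))  # 100 dollars per life
```

The $3,400 figure is driven almost entirely by the 1-life outcome, where dividing by a tiny effect makes the ratio explode.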

This is obvious in hindsight: Guesstimate shows the “mean” cost-effectiveness, mean(cost/effect), instead of what we care about, which is mean(cost)/mean(effect).[2]
Looking at the 5th and 95th percentiles helps in many cases, but not in scenarios where there is a very small probability of a very large effect and a significant probability of a small effect. Minimal Guesstimate example with a 4.8% chance of saving 1,000 lives and a 95.2% chance of saving 1 life.
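A quick simulation of that second example (assuming the same known $10,000 cost as before, which is my addition, not stated in the example) shows why percentiles don’t rescue us here: both the 5th and 95th percentiles of cost-per-life land on the small-effect outcome, hiding the big upside entirely.

```python
import numpy as np

rng = np.random.default_rng(0)

# 4.8% chance of saving 1,000 lives, 95.2% chance of saving 1 life,
# for an assumed known cost of $10,000
cost = 10_000
effects = rng.choice([1000, 1], size=1_000_000, p=[0.048, 0.952])

cost_per_life = cost / effects
p5, p95 = np.percentile(cost_per_life, [5, 95])
print(p5, p95)                # both $10,000: the 4.8% upside is invisible
print(cost / effects.mean())  # ~$204 per life, the useful number
```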

Some practical examples of very small chances of huge value might be deworming or policy interventions. For those, mean(cost/effect) and mean(cost)/mean(effect) might differ by orders of magnitude.

Three recent examples:

If you want to check another Guesstimate model

For most models, you can just manually calculate mean(cost)/mean(effect), since the means are shown in the Guesstimate UI.
For more complex cases, if you are comfortable with Python, you can port a Guesstimate model to numpy using https://guesstimate_to_squiggle/ and add .mean() liberally (very MVP; let me know if it doesn’t work with a model you want to try).
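As a rough illustration of what such a port looks like (the distributions and variable names here are made up for the sketch, not taken from any real model), the key fix is calling .mean() on cost and effect separately before dividing:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000

# Hypothetical model: both cost and effect are uncertain (lognormal guesses)
cost = rng.lognormal(mean=np.log(10_000), sigma=0.5, size=n)
effect = rng.lognormal(mean=np.log(100), sigma=1.0, size=n)

misleading = (cost / effect).mean()   # mean of the ratio distribution
useful = cost.mean() / effect.mean()  # ratio of the means

print(misleading, useful)  # the first is substantially larger
```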

Possible solutions / mitigations:

  • If costs are constant in your model, consider looking at the value per dollar (or per $1,000) instead of dollars per value, so the denominator is constant. The minimal example would become https://models/20682
    Edit: this is by far the most favored approach in the comments, and it should cover most cases.
    My view is that this is useful partly because huge uncertainties in costs are rare.

  • If you’re interested in a single number for some sense of “expected cost-effectiveness”, take the expected value and the expected cost and divide those numbers instead of dividing the distributions (if the distributions can be considered independent).

  • Other ideas? I’m definitely not an expert in any of this and there’s probably a nice mathematical/statistical solution that I can’t think of! Please comment if you think of anything!
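For the minimal example above, the first two mitigations agree (a sketch; lives per $1,000 is just a unit choice):

```python
import numpy as np

effects = np.array([1, 100, 199])  # lives saved, equally likely
cost = 10_000                      # dollars, constant

# Mitigation 1: value per $1,000 — safe because the denominator is constant
lives_per_1000 = np.mean(effects / cost) * 1000
print(lives_per_1000)  # ≈ 10 lives per $1,000

# Mitigation 2: divide the expected value by the expected cost
print(effects.mean() / cost * 1000)  # also ≈ 10
```

Note that 10 lives per $1,000 is exactly the $100/life from the start of the post, just inverted.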

Some questions I still have:

  • How can we express the uncertainty around cost-effectiveness if the ratio distribution is hard to reason about and has misleading moments?

  • How could the UI in Guesstimate or some potential alternative indicate to the user when to use mean(f(X)) and when to use f(mean(X)) for nonlinear functions f, to prevent people from making this very common mistake?
    We might want to use the former for e.g. the value of cash transfers

Really curious to know if anyone has ideas!

Huge thanks to Sam Nolan, Justis Mills, and many others for fleshing out the main idea, editing, and correcting mistakes.

This work is licensed under a Creative Commons Attribution 4.0 International License.

  1. ^

    Edit: this used to say “Underestimating the actual effectiveness by a factor of 34”. But I don’t think that this value is more “actual” than the other, just much more useful.

  2. ^

    Assuming independence between cost and effect

    Edit: several commenters pointed out that I’m implicitly considering sum(effects)/sum(costs) over many interventions, and it’s not obvious at all that that’s what we want in most cases.
    I still think that’s what we want in almost every case, but there’s some interesting discussion going on in comments