The derivation is wrong
First thing: unless I’m making a terrible mistake, your derivation for focusing on mean(cost)/mean(effect) is just mathematically wrong. It treats cost and effect as fixed numbers—you cannot divide a random variable by N, because N isn’t meaningful when talking about distributions. In the footnote you mention treating cost and effect as independent, which acknowledges that they are random variables, but that then invalidates the derivation.
Am I completely wrong? I can’t see how this works.
This is not expected value—that could be bad
Second thing: do we actually care about mean(cost)/mean(effect)? In another comment you justify it because it’s ∑cost/∑effect if we sum over different interventions. That does not mean it’s the expected value of each intervention! It’s just the total cost over the total effect. This does not have any direct link to expected value.
In fact, expected value is exactly why we don’t want to squash variability in the variables. Say cost is $1 or $1,000 with 50% probability each, and (independently) effect is 1 life or 1,000 lives with 50% probability each. The four equally likely outcomes give cost/effect values of 1, 0.001, 1000, and 1, so mean(cost/effect) = 0.25 × (1 + 0.001 + 1000 + 1) ≈ $250/life. Whereas mean(cost)/mean(effect) = $500.5/$500.5 = $1/life.
Why are these so different? Because mean(cost)/mean(effect) neglects the “tail risk”, the 25% chance that we spend $1000 and only save 1 life. This terrible situation is exactly why we do expected value calculations, because it matters and should be factored into our cost-effectiveness calculations.
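The gap is easy to check numerically. Here is a minimal Monte Carlo sketch of the toy example above (plain Python; the sample size is arbitrary):

```python
import random

random.seed(0)
N = 200_000

# Toy example from above: cost is $1 or $1000, effect is 1 or 1000
# lives, each with 50% probability, drawn independently.
costs = [random.choice([1, 1000]) for _ in range(N)]
effects = [random.choice([1, 1000]) for _ in range(N)]

# Expected value of the ratio: keeps the tail risk (the 25% chance of
# spending $1000 to save only 1 life) in the average.
mean_of_ratio = sum(c / e for c, e in zip(costs, effects)) / N

# Ratio of the means: squashes the variability before dividing.
ratio_of_means = (sum(costs) / N) / (sum(effects) / N)

print(f"mean(cost/effect)       ≈ {mean_of_ratio:.1f} $/life")   # ~250
print(f"mean(cost)/mean(effect) ≈ {ratio_of_means:.2f} $/life")  # ~1
```

The two summaries of the very same pair of distributions differ by a factor of roughly 250.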
That said, ∑cost/∑effect could have some philosophical grounding as a quantity we care about. I would love to see more elaboration on that in the post and a full defense of it. That would be really interesting and definitely worth a post to itself!
There are better solutions to unstable estimates
The best fix is: compute mean(effect/cost), not mean(cost/effect). This works because the denominator (cost) will never be zero. I have never seen a cost distribution that includes zero; it doesn’t make sense for philanthropic applications. In fact, if there were an intervention that had zero cost and improved lives, we could all retire.
Yes, costs can still be low and this can make effect/cost very high. This is not a bug, it’s a feature. This is what generates fat-tailed distributions of cost-effectiveness. The most cost-effective interventions have modest effects and very low costs.
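For intuition, here is a sketch with made-up distributions (lognormal cost around $100 and normal effect around 5 lives; both hypothetical, not taken from the post). The point is that a strictly positive denominator keeps the Monte Carlo average stable:

```python
import math
import random

random.seed(1)
N = 100_000

# Hypothetical intervention: cost is lognormal around $100 (never zero),
# effect is normal around 5 lives (it can land arbitrarily close to 0).
costs = [random.lognormvariate(math.log(100), 1.0) for _ in range(N)]
effects = [random.gauss(5, 3) for _ in range(N)]

# Denominator bounded away from zero: stable across reruns.
mean_effect_per_cost = sum(e / c for e, c in zip(effects, costs)) / N

# Denominator can be ~0: a single unlucky draw can dominate the whole
# average, so this estimate swings wildly from seed to seed.
mean_cost_per_effect = sum(c / e for c, e in zip(costs, effects)) / N

print(f"mean(effect/cost) ≈ {mean_effect_per_cost:.3f} lives/$")
print(f"mean(cost/effect) ≈ {mean_cost_per_effect:.0f} $/life (unstable)")
```

Rerunning the second average with different seeds gives wildly different answers; the first barely moves.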
∑cost/∑effect could have some philosophical grounding as a quantity we care about
I agree it’s the main point of the post (we want to choose interventions in a way that maximizes the total effect). I thought it was a unanimous opinion, but apparently it’s not?
The best fix is: compute mean(effect/cost), not mean(cost/effect)
I agree it helps in many cases, where the cost distribution spans fewer orders of magnitude than the effect distribution. Sadly it doesn’t solve all cases. E.g. for policy interventions, cost estimates can have uncertainty across orders of magnitude: see https://forum.effectivealtruism.org/posts/h2N9qEbvQ6RHABcae/a-critical-review-of-open-philanthropy-s-bet-on-criminal?commentId=NajaYiQD7KhAJyBcp
In the model in the post, both numerator and denominator have very high uncertainties
your derivation for focusing on mean(cost)/mean(effect) is just mathematically wrong
It’s definitely not rigorous or formal.
I thought anyone with a math background would find the topic obvious, and most of the value would be in making the post accessible to casual estimators with many simple and informal examples.
My main argument for focusing on mean(cost)/mean(effect) is that I want to get the most value with a finite amount of resources; I don’t really care about the EV of cost/effect, since that’s not valuable in itself. Maybe I could write total cost / total effect in that line, to keep it simple while making it less mathematically malformed?
I do not think anyone with a math background would find this obvious. Judging by the comments on this post and the feedback you said you received, I think you should update your beliefs on whether this claim is obvious at all.
In fact, I think the focus on examples detracts from the post. Examples can be misleading. Picking an example with a fixed numerator or a fixed denominator ignores the tail risk that I described in my comment, so the example serves to obscure and not explain.
I don’t really understand why you think it’s common sense to focus on this quantity. Given that you’re proposing an alternative to expected value calculations, the burden is on you to explain why it’s a good alternative. I highly encourage you to do that in a separate post: I believe the title and content of this post are misleading, given that you are proposing a new concept while rhetorically treating it like the one most EAs are used to.
Substantively speaking, one issue with total cost over total effect is that it is strictly a sampling quantity. For small N, we are never guaranteed that total cost ≈ N × mean(cost): that equality only emerges in the limit of the law of large numbers, and is not something you can take for granted. Unless we run hundreds of interventions, there is a strong chance that total cost over total effect differs substantially from mean(cost)/mean(effect), where mean() is taken as the true mean of the distribution.
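A quick simulation makes this concrete (reusing the toy $1-or-$1000 distributions from my earlier comment, purely as an illustration). With 5 interventions, the realized total cost over total effect scatters across orders of magnitude around mean(cost)/mean(effect) = 1; only with thousands does it settle down:

```python
import random

random.seed(2)

def portfolio_ratio(n):
    # n independent interventions; cost is $1 or $1000, effect is
    # 1 or 1000 lives, each with 50% probability.
    costs = [random.choice([1, 1000]) for _ in range(n)]
    effects = [random.choice([1, 1000]) for _ in range(n)]
    return sum(costs) / sum(effects)  # realized total cost / total effect

small = [portfolio_ratio(5) for _ in range(1000)]     # small portfolios
large = [portfolio_ratio(10_000) for _ in range(20)]  # huge portfolios

# Spread (max/min) of the realized ratio across portfolios:
spread_small = max(small) / min(small)  # typically spans orders of magnitude
spread_large = max(large) / min(large)  # close to 1

print(f"N=5:     spread ≈ {spread_small:,.0f}x")
print(f"N=10000: spread ≈ {spread_large:.2f}x")
```

So for a realistic number of interventions, the quantity being optimized is itself highly uncertain.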
It’s okay for cost estimates to span many orders of magnitude. As long as they are not zero, mean(effect/cost) will be well defined.
I do not think anyone with a math background would find this obvious. Judging by the comments on this post and the feedback you said you received, I think you should update your beliefs on whether this claim is obvious at all.
I was completely wrong, indeed!
Will think about the comments for a few hours and write an appendix tonight.
Do you agree that the main practical takeaway for non-experts reading this post should be “Be very careful using mean(cost/effect), especially if the effect can be small”?
I think the focus on examples detracts from the post. Examples can be misleading.
I disagree. The first example is exaggerated, but it’s a very common issue: I think around a third of Guesstimate models have some version of it (see the three recent examples in the post).
Will respond to the other parts of the comments in the appendix, since many other commenters raised similar points. I found this comment from Jérémy particularly clear: https://forum.effectivealtruism.org/posts/SesLZfeYsqjRxM6gq/probability-distributions-of-cost-effectiveness-can-be?commentId=nA3mJoj2fToXtX7pY
I think the main practical takeaway should be to use mean(effect/cost) unless you have a really good reason not to. I agree mean(cost/effect) is a bad metric: it is only even defined when the effect distribution excludes zero and negative values, and it would be unreasonable for a realistic effect distribution to exclude them.
I agree it’s the main point of the post (we want to choose interventions in a way that maximizes the total effect). I thought it was a unanimous opinion, but apparently it’s not?
I think most people would agree that we want to maximise mean(“effect”) subject to “cost” ≤ “maximum cost”. The crucial question is how to handle this optimisation problem when “effect”, “cost” and “maximum cost” are distributions. The alternatives seem to be:
Maximising mean(“effect”) subject to mean(“cost”) ≤ mean(“maximum cost”), which seems equivalent to maximising mean(“effect”)/mean(“cost”), as proposed in this post.
Maximising mean(“effect”/”cost”), as proposed in some comments of this post.
Thinking at the margin, these approaches seem equivalent.