Hi David, I really enjoyed this post. Your comment on the potentially infinite standard errors of ratio distributions is something I have been mulling over for the last few months.
Because the numbers going into the ratio are themselves averages from samples of Indonesians, each comes with its own margin of error... As far as the math goes, if there’s a nontrivial chance that Inpres led to zero additional years of schooling, then there’s a nontrivial chance that the ratio of wage increase to schooling increase is infinite.
Outside of the context of weakly identified instrumental variable regressions, I’m wondering how much GiveWell takes this into account in its cost-effectiveness analysis, and how much EA in general should be considering this. In one sense, what we care about is not E(benefit)/E(cost) but rather E(benefit/cost); i.e., if we reproduced the program elsewhere, we would want it to have a similar cost-effectiveness. If what’s inside the expectation of the latter is the ratio of two independent normals, then we get a Cauchy-like distribution, whose tails are so fat that its mean and standard error are undefined (a quick simulation sketch follows below). Am I right to say that cost-effectiveness analysis only has meaning if either
the distributions of the numerator or denominator are not independent normals, but perhaps skewed (e.g. the ratio of two independent Gammas gives a beta-prime/F-type distribution, which does have a well-defined standard error when the denominator’s shape parameter is large enough), or
the denominator, the distribution of “cost”, ends up converging to a constant (e.g. GAVI, where we have a well-defined/nonrandom cost that’s mandated by the government).
Does that imply that simply picking projects with the highest E(benefit)/E(lives saved) may not be the best solution? Have you read any good papers on this topic? (Or perhaps this isn’t an issue at all?) The only thing I could find is this abstract, which has seen 0 citations in the two decades since it was published.
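To make the ratio point concrete, here is a minimal simulation sketch (the numbers are made up for illustration, not taken from GiveWell): with a normal denominator that puts non-negligible mass near zero, the sample mean of benefit/cost never settles down, even though E(benefit)/E(cost) is perfectly well behaved.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Hypothetical, illustrative numbers only.
benefit = rng.normal(loc=10.0, scale=3.0, size=n)   # e.g. wage gain estimate
cost = rng.normal(loc=1.0, scale=0.5, size=n)        # e.g. extra years of schooling

# Ratio of expectations: stable across seeds.
print(f"E(benefit)/E(cost) ~ {benefit.mean() / cost.mean():.2f}")

# Expectation of the ratio: dominated by the rare draws where cost is near 0,
# so it jumps around wildly from seed to seed (Cauchy-like behaviour).
print(f"mean of benefit/cost over {n:,} draws ~ {(benefit / cost).mean():.2f}")
```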
(Also, this is trivial, but I thought the photos added a really nice touch compared to the typical academic journal article.)
Interesting question! Certainly it is the nonlinearities in the cost-effectiveness analysis that make uncertainty matter to an expected value maximizer. If we thought that the cost-effectiveness of an intervention was best modeled as the sum of two uncertain variables (a simple example of a linear model), then the expected value of the intervention would be the sum of the expected values of the two variables. Their uncertainty would not matter.
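A toy sketch of that contrast, with arbitrary illustrative numbers: for a sum, the expected value ignores the spread of the inputs, while for a ratio it drifts as the denominator’s uncertainty grows.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

a = rng.normal(5.0, 1.0, n)            # first uncertain input
for sd in (0.01, 0.2, 0.4):            # increasing uncertainty in the second input
    b = rng.normal(2.0, sd, n)
    # Linear model: E[a + b] stays at ~7 no matter how spread out b is.
    # Nonlinear (ratio) model: E[a / b] creeps above 2.5 as b's spread grows.
    print(f"sd={sd}: E[a+b] ~ {(a + b).mean():.3f}, E[a/b] ~ {(a / b).mean():.3f}")
```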
The most serious effort I know of to incorporate uncertainty into the GiveWell cost-effectiveness analysis is this post by Sam Nolan, Hannah Rokebrand, and Tanae Rao. I was surprised at how little it changed the expected values: typically by 10-15%, I think, though I’m finding it a little hard to tell.
I think when the denominator is cost, rather than the impact of school construction on years of schooling, our uncertainty range is less likely to put much weight on the possibility that the true value is zero. Cost might even be modeled as log-normal, so that it can never be zero. In this case, there would be little weight on ~infinite cost-effectiveness.
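As a rough illustration of that point (again with made-up numbers): a log-normal cost is strictly positive and its reciprocal has finite moments, so the simulated mean of benefit/cost stabilizes instead of blowing up.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

benefit = rng.normal(10.0, 3.0, n)
# Log-normal cost centred near 1: strictly positive, negligible mass near zero.
cost = rng.lognormal(mean=0.0, sigma=0.3, size=n)

print(f"E(benefit)/E(cost) ~ {benefit.mean() / cost.mean():.2f}")
# Unlike the normal-denominator case, this is stable across seeds, though it
# sits a bit above E(benefit)/E(cost) because of Jensen's inequality.
print(f"E(benefit/cost)    ~ {(benefit / cost).mean():.2f}")
```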
Thank you again for taking the time to share your thoughts. I hadn’t seen that link before, and you make a fair point that using distributions often doesn’t change the end conclusions. I think it would be interesting to explore how Jensen’s inequality comes into play with this, and the effects of differing sample sizes.
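For what it’s worth, the direction Jensen’s inequality pushes is easy to state under the simplifying assumptions that benefit B and cost C are independent and C > 0 (a sketch, not a claim about the actual model): E(B/C) = E(B)·E(1/C) ≥ E(B)/E(C), because 1/c is convex for c > 0. The size of the gap is driven by Var(C), which is exactly what shrinks as the sample behind the cost estimate grows, so differing sample sizes show up directly in how far E(B/C) sits above E(B)/E(C).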
Ah, another article. It seems uncertainty analysis is getting more traction: https://www.metacausal.com/givewells-uncertainty-problem/
But am I reading right that that one doesn’t push through to a concrete demonstration of impacts on expected values of interventions?
You’re right. His critique is mostly about the decision cutoff rule, and it assumes that GiveWell has accurately measured the point estimate given the data. The link you provided earlier, on the other hand, shows that taking uncertainty into account can cause the point estimate itself to shift.