Hello! Thanks for showing interest in my post.
First of all, I don’t represent GiveWell or anyone else but myself, so all of this is more or less speculation.
My best guess as to why GiveWell does not quantify uncertainty in their estimates is that the technology to do this is still somewhat primitive. The most mature candidate I see is Causal, but even then it’s difficult to see how one might do something like run multiple parallel analyses of the same program in different countries. GiveWell has a lot of requirements that their host platform needs to meet. Google Sheets has the benefit that it can be used, understood, and edited by anyone. I’m currently working on Squiggle with QURI to sweeten the deal for quantifying uncertainty explicitly, but there’s a long way to go before it becomes something that could be readily understood and trusted to be stable like Google Sheets.
On a second note, I would also say that providing lower and upper estimates of cost-effectiveness for its top charities wouldn’t actually be that valuable, in the sense that it wouldn’t influence any real-world decisions. I know that I decided to spend hours making the GiveDirectly quantification, but in truth, very little information is gained from it directly. The main reason I did it is that it makes a great proof of concept for usage in non-GiveWell fields which need it much more.
There are two reasons why there is so little information gained from it:
First, the uncertainty of GiveDirectly and other GiveWell supported charities is not actually that high (about an order of magnitude for GiveDirectly, I expect over 2-3 orders of magnitude for the others). For instance, I never expected in my quantification of uncertainty in GiveDirectly that there would be practically any probability mass of it being more effective than AMF, at least before accounting for things like moral uncertainty.
Second, my uncertainty about my chosen uncertainties is really high. If you strip away how fancy my work looks and just look at what I’ve contributed in comparison to what GiveWell has done, I’ve practically copied GiveWell’s work and pulled some numbers out of thin air for uncertainty with the help of Nuno. Some of the Bayesian analysis is done under questionable assumptions.
I see much more value in quantifying uncertainty when we might expect the uncertainty to be much larger, for instance, when dealing with moral uncertainty, or animal welfare/longtermist interventions.
Wildly guessing, but I don’t think it’s a technological issue.
GiveWell does publish upper and lower estimates for some of their analyses, at least they did for malnutrition interventions: https://docs.google.com/spreadsheets/d/1IdZLSBgEK46vc7cX9C7KnFgUcOk_M0UIYJ8go_DrvS0/edit#gid=1468241237 (see the bottom of the sheet: 4x to 19x cash).
Many of their CEAs (e.g. New Incentives) are just one column. Even for the ones that have one column per country, they could have multiple sheets for upper and lower bounds.
I agree with your second point. I think GiveWell’s mission is no longer informing many small donors, but informing OpenPhil (and maybe other big players), and OpenPhil cares mostly about GiveWell’s best guess about “what does the most good”.
I disagree that the uncertainty is “not actually that high”, and that moral uncertainty should be considered separately. Once moral uncertainty is considered, the impact can vary by orders of magnitude. See https://blog.givewell.org/2008/08/22/dalys-and-disagreement/ (very old blog post, but I think the main point still stands), and https://forum.effectivealtruism.org/posts/3h3mscSSTwGs6qbei/givewell-s-charity-recommendations-require-taking-a. I think many donors would be interested in seeing those kinds of uncertainties somewhere.
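“The uncertainty of GiveDirectly and other GiveWell supported charities is not actually that high (about an order of magnitude for GiveDirectly, I expect over 2-3 orders of magnitude for the others).”
That seems pretty high to me! When I’ve seen GiveDirectly used as a point of comparison for other global health/poverty charities, those charities are usually described as 1-10x more effective (i.e. people care about distinctions within one order of magnitude).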
One useful takeaway would be knowing whether some interventions have much wider ranges than others, and whether that says something about the strength of the evidence. If AMF is 6-10x and deworming is 1-20x (where 1x is the point estimate of cash transfer cost-effectiveness), then deworming might have a higher point estimate of cost-effectiveness than AMF. But the large uncertainty suggests that maybe this is because we have much less evidence, and not because the true cost-effectiveness is much larger. So a risk-averse donor could prioritize AMF on certainty.
In other words, we can favor more certain interventions, even within GiveWell top charities, because they are more robust to the risk that we have got it all wrong. They are less likely to be overturned by a new study. That seems pretty valuable.
Also, an order of magnitude is really large.
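To make the AMF versus deworming comparison above concrete, here is a minimal Python sketch. It assumes, purely for illustration, that each range (6-10x and 1-20x cash) is a uniform distribution, which is a crude stand-in and not anything GiveWell publishes. The `summarize` helper is hypothetical. It reports the mean alongside the 10th percentile as one possible downside measure a risk-averse donor might look at.

```python
import random
import statistics

def summarize(low, high, n=100_000, seed=0):
    """Sample a range as a uniform distribution (an illustrative assumption)
    and return (mean, 10th percentile) in multiples of cash."""
    rng = random.Random(seed)
    draws = [rng.uniform(low, high) for _ in range(n)]
    mean = statistics.fmean(draws)
    # quantiles(..., n=10) returns the nine decile cut points; the first is the 10th percentile.
    p10 = statistics.quantiles(draws, n=10)[0]
    return mean, p10

# Ranges taken from the hypothetical in the comment above.
amf_mean, amf_p10 = summarize(6, 10)
dew_mean, dew_p10 = summarize(1, 20)

print(f"AMF:       mean {amf_mean:.1f}x, 10th percentile {amf_p10:.1f}x")
print(f"Deworming: mean {dew_mean:.1f}x, 10th percentile {dew_p10:.1f}x")
```

Under these assumptions deworming has the higher mean but the lower 10th percentile, which is the pattern that would lead a donor focused on downside risk to prefer AMF. A different distributional assumption (e.g. lognormal) could change the ranking of the means, so the shape of the distribution matters as much as the endpoints.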
Adding uncertainty to a single intervention may not be too informative. Still, I think it’s more informative than you imply for comparing interventions—especially if you’re considering other decision frameworks for allocating funds beyond giving your money to the one with the highest average cost-effectiveness.
E.g., if you have a framework where you allocate your money in proportion to the probability that each intervention has the highest cost-effectiveness, then uncertainty quantification would be essential. I’m not sure anyone supports a rule like this.
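As a rough sketch of what such a rule would involve, the Python below estimates each intervention's probability of being the most cost-effective by Monte Carlo and splits a budget in those proportions. Everything here is an assumption for illustration: the ranges are the hypothetical 6-10x and 1-20x figures from earlier in the thread, they are treated as 90% intervals of lognormal distributions, the draws are treated as independent, and the helper names and the budget of 1000 are made up.

```python
import math
import random

Z90 = 1.6449  # standard normal quantile bounding a central 90% interval

def lognormal_params(low, high):
    """Lognormal (mu, sigma) whose 5th and 95th percentiles are low and high."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z90)
    return mu, sigma

def prob_best(intervals, n=50_000, seed=0):
    """Estimate, by independent sampling, how often each option comes out on top."""
    rng = random.Random(seed)
    params = {name: lognormal_params(lo, hi) for name, (lo, hi) in intervals.items()}
    wins = dict.fromkeys(intervals, 0)
    for _ in range(n):
        draws = {name: rng.lognormvariate(mu, s) for name, (mu, s) in params.items()}
        wins[max(draws, key=draws.get)] += 1
    return {name: count / n for name, count in wins.items()}

# Hypothetical 90% intervals, in multiples of cash.
intervals = {"AMF": (6, 10), "deworming": (1, 20)}
shares = prob_best(intervals)
budget = 1000  # hypothetical budget
allocation = {name: budget * share for name, share in shares.items()}
print(allocation)
```

The point is only that this rule cannot be applied from point estimates alone: the shares depend entirely on the spread assumed for each intervention.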
Another potentially more real-world example: imagine you’re a grantmaker choosing between 10 interventions that are all 10x more cost-effective than GiveDirectly, but vary in uncertainty. If you’re a Bayesian with more sceptical priors than analysts, you will favour the relatively less uncertain analyses.
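The sceptical-prior argument can be sketched with a standard normal-normal update in log space. This is only an illustration under stated assumptions: the prior is centred on 1x cash with a log-scale standard deviation of 1, each analysis reports 10x, and the analyst uncertainties below are invented. The function name `posterior_multiplier` is hypothetical.

```python
import math

def posterior_multiplier(estimate, est_sd, prior_multiplier=1.0, prior_sd=1.0):
    """Posterior median multiplier after a normal-normal update on log cost-effectiveness.

    estimate and prior_multiplier are in multiples of cash;
    est_sd and prior_sd are standard deviations on the natural-log scale.
    """
    prior_precision = 1 / prior_sd**2
    est_precision = 1 / est_sd**2
    # Precision-weighted average of the prior mean and the analyst's estimate.
    log_mean = (
        prior_precision * math.log(prior_multiplier)
        + est_precision * math.log(estimate)
    ) / (prior_precision + est_precision)
    return math.exp(log_mean)

# Same 10x headline estimate, increasingly uncertain analyses (invented values).
for est_sd in (0.25, 0.5, 1.0, 2.0):
    print(f"analyst sd {est_sd}: posterior {posterior_multiplier(10, est_sd):.2f}x")
```

The more uncertain the analysis, the more the posterior is pulled back toward the prior, so interventions with identical 10x headline figures end up ranked by how tight their estimates are. When the analyst's uncertainty equals the prior's, the posterior lands at the geometric mean of the two, i.e. the square root of 10.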
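“For instance, I never expected in my quantification of uncertainty in GiveDirectly that there would be practically any probability mass of it being more effective than AMF.”
Really? What do you mean by “practically”? If we crunched the numbers, I guess there’d be a single-digit percentage chance that GiveDirectly is more impactful than AMF.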
Great points, thanks!
I would just like to emphasise your point that for other GiveWell top charities, we should expect a higher uncertainty than for GiveDirectly, especially the ones working on deworming. So modelling them could be more valuable.