Yes, that seems like the main thing we disagree about. It also seems like we disagree about the likely impact of deworming.
I would also like to do that, but unfortunately I don’t have time to do it properly; my time is now focused on longterm-relevant stuff.
I think the point here is that the growth decelerations don’t seem like they would be big enough to push the back-of-the-envelope calcs on the World Bank, the IMF and all economists in Hauke’s and my post below the RCT stuff.
Thanks for clarifying. First (this is not a criticism of you), this is different to the view taken by Open Phil, which does try to compare its rich world policy to AMF and is what prompted the main post. So at least Open Phil can’t think it is an apples to oranges comparison. Second, I don’t really get how it is an apples to oranges comparison. You seem to be saying that the judgement of the probability is too high, but that is different to the metric used to compare the options being incomparable. I don’t understand why you think the probability estimate is too high. The back of the envelope on the World Bank and IMF seems like a pretty good guide. You really have to believe some implausible things to think that the probability of success is low enough. On the ‘all economists’ one, for example, the estimate is extremely conservative because it only counts the economic gains in China and assumes that is the only thing that all economists achieved.
The reason I cited the Cochrane review was that GiveWell also cited it in their review of deworming. My main aim is to dispute GiveWell’s reasoning as that is what is driving a lot of EA money, which I think could do much more good elsewhere. On external validity, I’m going off the Vivalt paper.
It’s a natural way to interpret GiveWell’s views because the only charities they recommend, or (I think) have ever recommended, have been tested by RCTs. All of the charities on their standout list have been tested by RCTs. This suggests at least that they put extremely high weight on RCT evidence. Even if they are not at the extreme of complete agnosticism sans RCT, they are very close to it. To use an example from Lant, Chris Blattman said that the best investment to fight world poverty would be to run an RCT comparing giving people cash or giving people chickens. I find this stance extremely hard to understand without the implicit premise of agnosticism in the absence of RCTs. I don’t think it would be difficult to find similar claims by Duflo and Banerjee with a bit of extra time. I do think that this goes beyond RCTs and extends to an over-reliance on empirical studies, as I argued here.
I’m not sure which direction RCTs will go in the future. The publication incentives are to do that type of work.
Another point of agreement: the economics profession currently focuses too much on empirical work. Meanwhile my own personal view is that people like Esther and Chris B are slightly ‘too far’ in the pro-RCT camp, and that people like Lant (and you) are ‘too far’ in the anti-RCT camp. But I don’t see anyone in this discussion as being extreme (except possibly Lant...); healthy disagreement is to be expected and encouraged. Note that Esther and Abhijit’s most recent book tackles macro issues like migration, trade, climate change, and yes growth—using RCTs when possible / relevant but also plenty of other results (including lots of theory! Abhijit started life as a theorist, like I did). Meanwhile Chris has a forthcoming book on war and peace (macro level! no easy RCTs) for which he uses other approaches like machine learning. You can find all sorts of quotes, but the proof is in the pudding. Final point on this is that one can easily combine RCTs with admin data, ML, etc, and researchers (including me) are doing more and more of that, which imo is great—it’s not always one or the other.
As you say, the efficacy of deworming seems to be a point of disagreement between us. Again pulling back somewhat, you link to Eva’s paper as supporting your claim that RCTs have minimal external validity, but her paper is about all forms of impact evaluation (and she notes in the conclusion that the subset of RCTs aren’t special). So this would be extremely damning for economics if true, but her results don’t support your claim. For instance she notes that bednets and conditional cash transfers seem to do very well on this front. More relevantly, her point (as I read it) is to see how much of the nominal variation in effect sizes can be explained by other contextual variables, and she finds that typically a nontrivial amount of it can be. This is good news for external validity, since it means we can often explain / predict the differences even when they do arise.
I think I haven’t been very clear about ‘apples to oranges’: I agree that these can and should absolutely be compared. I just felt like the way you were doing it glossed over an important difference. I can write a check to AMF and feel very confident that something will change in the world; we can then debate the expected magnitude of the impact of that change. But I can’t write a check to “growth reform in the developing world”, so even before we debate the relative benefits of changing immigration policy vs distributing bednets we have to calculate the probability that the desired policy will get implemented. I realize you’re fully aware of this, but that’s the part I keep coming back to, because that’s the part where I’m pessimistic (partly having worked for the US government, although for a counterargument I liked this recent forum post) and where I suspect our intuitions disagree. Mostly you keep talking about the benefits of more migration and of GDP growth (which are great!) and not so much about how we sit down and estimate the likelihoods of bringing those about. I’ll admit that the “pessimistic” estimate of 1% on ICRIER in the original post with Hauke really made me distrust everything afterward, since the pessimistic estimate in that case is a negative number and a plausible median estimate seems to be about 1 in a million.
On China I suppose my main point is still that I think it’s simply very very hard to quantitatively estimate most of this. Just because you (or I, or anyone) thinks that something is extremely conservative (when you admit you haven’t put in as much time on all this as you’d like, and indeed it’s not your job to do so) doesn’t make it so. In this specific case, if you forced me to take a stand, my best guess is to agree with you that economists have helped push policy in a better direction and that that made a big difference to global welfare. Even if I felt more confident about that, what is the counterfactual you are comparing to? Did some NGO or the WB cause that to happen on the margin, or would economists have tried to learn about the world and influence policy anyway? Are there similar opportunities going forward? The Taliban says they want economics expertise, so perhaps. But I don’t think we know the answers to these questions (yet), even within orders of magnitude, and whether or not this type of approach will beat RCT-type approaches depends entirely on those particular probabilities.
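To make concrete why the comparison hinges on those probabilities, here is a minimal sketch of the break-even calculation. It assumes a simple expected-value model in which a policy campaign either succeeds or has no effect, which is cruder than anything argued above. The function names and all value and cost figures are hypothetical and purely illustrative; only the two probabilities (1% and 1 in a million) come from the discussion.

```python
def break_even_probability(direct_value_per_dollar: float,
                           policy_value_if_success: float,
                           campaign_cost: float) -> float:
    """Smallest success probability at which a policy campaign matches a
    direct grant, assuming the campaign is worth nothing if it fails.

    Expected value per dollar of the campaign is p * V / C, so the
    break-even point is p = d * C / V.
    """
    if policy_value_if_success <= 0 or campaign_cost <= 0:
        raise ValueError("value and cost must be positive")
    return direct_value_per_dollar * campaign_cost / policy_value_if_success


def policy_beats_direct(p_success: float,
                        direct_value_per_dollar: float,
                        policy_value_if_success: float,
                        campaign_cost: float) -> bool:
    """True if the campaign's expected value per dollar exceeds the direct grant's."""
    threshold = break_even_probability(
        direct_value_per_dollar, policy_value_if_success, campaign_cost
    )
    return p_success > threshold


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show how sensitive the answer is.
    d = 10.0   # benefit units per dollar for the direct grant (assumed)
    V = 1e9    # benefit units if the policy is adopted (assumed)
    C = 1e6    # dollars spent on the campaign (assumed)

    print("break-even probability:", break_even_probability(d, V, C))
    for p in (0.05, 1e-6):
        print(p, "->", policy_beats_direct(p, d, V, C))
```

With these made-up inputs the two sides of the disagreement land on opposite sides of the threshold, which is the point: the ranking is driven by the probability estimate, not by the size of the benefits.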