Lant Pritchett’s “smell test”: is your impact evaluation asking questions that matter?

Aaron Gertler 🔸 18 Mar 2020 23:38 UTC

36 points

Cause prioritization Randomized controlled trials Global health & development Impact assessment Economic growth

Lant Pritchett is a development economist and master of backhanded compliments. I’m always looking for new frames to use when I think about causes, and his “smell test” fits the bill.

In short: Think about countries that have been successful, economically. What are some things these countries do? And does the development program you favor actually make developing countries more similar to developed countries?

This is, of course, not a fully general argument against RCT-driven interventions in the GiveWell mold. But I found it an interesting supplement to the Forum’s recent debate around economic growth research.

****

In 2006 I was in West Bengal with a World Bank team and was asking questions of a group of women about a “livelihoods” program that built and financed women’s self-help groups as a means of increasing women’s productivity and incomes. After asking them questions for an hour or so I asked them if they had any questions for me or the team—after all, they had been so gracious to answer our nosey questions we would be rude to not allow them to ask us anything they wanted to know. After an awkward silence, one woman said “You all are from countries that are much richer and doing much better than our country so your country’s women’s self-help groups must also be much better, tell us how women’s self-help groups work in your country.”

I’m American. Along on the team was a German woman, another man from New Zealand, and a woman from the UK. We all looked at each other blankly as none of us had any idea whether there had ever been, at any time in our countries’ history, such a thing as “women’s self-help groups” (much less a government program for promoting them). We also had no idea how to explain that, yes, all of our countries are now developed but no, all of our countries did this without a major role from women’s self-help groups at any time (or if there were a role we development experts were collectively ignorant of it), but yes, women’s self-help groups promote development.

My four-fold “smell test” for what is important to development

I have a four-fold criteria for whether something is potentially an important determinant of development, or more narrowly, just economic growth, and I am happy if “thing X” that I am proposing is “good for development” can satisfy all four (and then can move on from these simple facts about potential importance to tease out complicated questions of proximal, distal, and reverse causality).

One, countries differ in their level of development by an order of magnitude. Countries that are developed should have more of thing X than countries that aren’t. If Denmark and Canada don’t have more of thing X than Mali or Nepal I am kind of suspicious.

Two, since now developed countries are almost an order of magnitude more developed than they were in 1870 I am happy if there is more of thing X in developed countries now than 140 years ago. If Germany and Japan don’t have more of thing X (or at least the same amount) than they did in 1870 I am kind of suspicious.

Three, since over the period since 1950 some countries have seen their development improve incredibly rapidly and others have seen almost no progress I am happy if thing X is more prevalent in rapid development successes than in development failures. If Korea and Taiwan don’t have more of thing X than Haiti and Nigeria then I am kind of suspicious.

Four, since countries change in their pace of development (and this is particularly true of economic growth, less so of human development indicators) dramatically over time, I am happy if there is more of thing X in a country in periods when development progress is rapid than in periods when development progress is slow. If China doesn’t have more of thing X after 1978 than before 1978 (as growth accelerated by 3.3 ppa) or if Cote d’Ivoire doesn’t have less of thing X after 1978 than before 1978 (as growth decelerated by 3.7 ppa) then I am kind of suspicious.

These four of course don’t resolve the debates or details about the respective roles of macroeconomic management, policy approaches to external markets (e.g. trade, capital, ideas), security of property rights, infrastructure, accumulation of human capital, technological change, capability in the product space, or “institutions” (or, more deeply, what is cause and what is consequence amongst these elements themselves). But nearly all contenders in debates about economic growth or development more broadly pass 2 or 3—and sometimes all four—of these “smell tests” of at least potentially being an important determinant.
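The four checks above can be read as a simple checklist. A minimal sketch follows, assuming each check is reduced to a yes/no answer; the class and field names are illustrative labels of my own, not Pritchett's, and the example candidate is hypothetical rather than a real finding.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SmellTest:
    """Four yes/no checks for a candidate "thing X" (names are illustrative)."""
    more_in_developed_than_poor_countries: bool    # test one
    at_least_as_much_now_as_in_1870: bool          # test two
    more_in_rapid_successes_than_failures: bool    # test three
    more_in_fast_periods_than_slow_periods: bool   # test four

    def passed(self) -> int:
        # Count how many of the four checks the candidate satisfies.
        return sum(getattr(self, f.name) for f in fields(self))

    def suspicious(self) -> List[str]:
        # Names of the checks that failed, i.e. the reasons to be "kind of suspicious".
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Hypothetical candidate that passes the first three checks but not the fourth.
candidate = SmellTest(True, True, True, False)
print(candidate.passed())
print(candidate.suspicious())
```

Passing the checks only marks something as potentially important; as the text says, it does not settle questions of causality.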

Are interventions being evaluated important for development?

Eva Vivalt (2014) has written a paper that is so good it deserves several blog posts to discuss its interesting findings. She and her team have asked the important question about the generalizability of the findings from “rigorous impact evaluations” (including RCTs). In order to do so, her team surveyed 621 papers (not all of which could be used in her analysis). That is an impressive number. Suppose the typical productivity of an academic or research economist is three original completed papers per year. Then 621 papers represent 207 person-years of research. Alternatively, think of the inclusive cost (opportunity cost of researcher time plus money costs) per impact evaluation.
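The person-years figure is a one-line calculation. A small sketch, using only the two numbers given in the text; the `total_cost` helper and its `cost_per_evaluation` parameter are hypothetical, since no cost figure is stated.

```python
papers_surveyed = 621            # from the text
papers_per_researcher_year = 3   # the text's supposition about productivity

# Person-years of research implied by the surveyed papers.
person_years = papers_surveyed / papers_per_researcher_year
print(person_years)


def total_cost(n_evaluations: int, cost_per_evaluation: float) -> float:
    """Inclusive cost of a set of evaluations (hypothetical helper).

    cost_per_evaluation is an assumption the reader must supply:
    researcher opportunity cost plus money costs for one evaluation.
    """
    return n_evaluations * cost_per_evaluation
```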

I would encourage you to fill in this table with the 20 programs on which Vivalt (2014) finds enough rigorous impact evaluations for comparison.

After the table is filled in (don’t cheat or I’ll send a nudge mobile phone reminder to alter your behavior) ask yourself: why has much of the best and brightest talent of a generation of development economists been devoted to producing rigorous impact evaluations about these 20 topics?


lucy.ea8 21 Mar 2020 6:56 UTC
3 points
Interesting: replacing “thing X” with “basic education” reads as follows:
My four-fold “smell test” for what is important to development
I have a four-fold criteria for whether something is potentially an important determinant of development, or more narrowly, just economic growth, and I am happy if “basic education” that I am proposing is “good for development” can satisfy all four (and then can move on from these simple facts about potential importance to tease out complicated questions of proximal, distal, and reverse causality).
One, countries differ in their level of development by an order of magnitude. Countries that are developed should have more of basic education than countries that aren’t. If Denmark and Canada don’t have more of basic education than Mali or Nepal I am kind of suspicious.
Two, since now developed countries are almost an order of magnitude more developed than they were in 1870 I am happy if there is more of basic education in developed countries now than 140 years ago. If Germany and Japan don’t have more of basic education (or at least the same amount) than they did in 1870 I am kind of suspicious.
Three, since over the period since 1950 some countries have seen their development improve incredibly rapidly and others have seen almost no progress I am happy if basic education is more prevalent in rapid development successes than in development failures. If Korea and Taiwan don’t have more of basic education than Haiti and Nigeria then I am kind of suspicious.
Four, since countries change in their pace of development (and this is particularly true of economic growth, less so of human development indicators) dramatically over time, I am happy if there is more of basic education in a country in periods when development progress is rapid than in periods when development progress is slow. If China doesn’t have more of basic education after 1978 than before 1978 (as growth accelerated by 3.3 ppa) or if Cote d’Ivoire doesn’t have less of basic education after 1978 than before 1978 (as growth decelerated by 3.7 ppa) then I am kind of suspicious.
Basic education easily passes the first three tests. It also passes the final one, allowing for a time delay of about 20 years (roughly the time it takes a kid to go through school and start working).
human development indicators
Good to see Lant Pritchett give a nod to human development indicators (and indirectly to the Human Development Index).
accumulation of human capital, technological change, capability in the product space, or “institutions” (or, more deeply, what is cause and what is consequence amongst these elements themselves).
Good to see that room is left for human capital to be a cause and not merely a consequence, as most in EA seem to think
But nearly all contenders in debates about economic growth or development
Good to see subtle acknowledgement that “economic growth” and “development” can be different.
As a starting point EA should think from a human development standpoint, and not silently drop education from the definition of development.
cole_haus 19 Mar 2020 6:58 UTC
2 points
I remain pretty confused by this line of argument. I basically parse it as: we should strive to make the actions of developing countries similar to the (best) actions of developed countries. But actions seem of merely instrumental interest and what we actually care about is states (conditions) that are conducive to development.

The recommendations from these two perspectives (actions vs states) converge only insofar as the best actions are invariant across states. But this is quite a big claim and contradicted by e.g. Rodrik who insists that “Institutional innovations do not travel well”.

It seems like the development interventions we commonly see can be readily justified by the state-based view. For example, no, we didn’t see widespread deployment of insecticidal nets in the US, but, yes, we did see deliberate effort to achieve and good returns from achieving a low burden of infectious disease in the US. No, we didn’t have women’s self-help groups, but, yes, we did achieve a state of increased gender equality and of increased integration of women into the formal economy.

TL;DR: Why would we expect the same actions to produce the same end state given different starting states?
- Aaron Gertler 🔸 25 Mar 2020 0:47 UTC
  4 points
  I see his argument as:
  “To create conditions conducive to development, we should have a moderately strong prior in favor of doing things almost every developed country has done, and a moderately strong prior against doing things almost no developed countries have done.”
  I’m not familiar with Rodrik’s work, but my mental model of Pritchett would claim that we should try to find similarities between very different countries that successfully developed, and that such similarities do exist. (My model could be way off, and it doesn’t account for most of how I judge development projects.)
  I actually didn’t read Pritchett as having anything against LLINs, because “stopping malarial mosquitoes from biting people” seems like a thing developed countries generally do. (If he’s actually against LLINs and a big promoter of eradication strategies, I’m reading him wrong.)
  I also imagine him trying to think backwards from end states: “What would a developed, wealthy Kenya look like? What sorts of work do people do in this hypothetical country? What role do women’s self-help groups play? If they’ve faded away, what role would they have had in enabling development? Why do we think they’d have had that role if we don’t have evidence that women’s self-help groups have enabled development in other places?”
brb243 24 Mar 2020 19:54 UTC
1 point
I guess that for the last column, this cannot be proven—too many variables can influence economic development so that one cannot be isolated controlling for all others.
For column 1, I guess scholarships, contract teachers, irrigation, performance pay, if any.
For column 2, perhaps similar to guess in column 1, plus why would we test these areas in developed, as opposed to developing, economies? Findings from developed economies may not be generalizable to developing countries.
For column 3, I guess the finance- and tech-related areas.
Plus, the working link is here.
The findings on page 27 are shocking. They speak for bed nets (they help with malaria) and for conditional and unconditional cash transfers (they help with schooling); micronutrients should help with diarrhea and anemia. Otherwise not much impact can be proven. I guess that is an argument for AMF and GD.
ishi 21 Mar 2020 10:48 UTC
1 point
I can’t really tell what the article is about, but it appears to be saying that devoting a lot of resources and talent to academic economists doing rigorous RCT evaluations of programs is ‘ineffective’ or ‘inefficient’ (a waste). (I think the recent Nobel econ prizes were for this, so this might be a critique of them.) I think the same point is often made about a lot of rigorous economics: many view it as primarily an aesthetic or mathematical exercise that some economists value more than development or economic policy.
I think there is a place for aesthetics, mathematics, RCTs and evaluations of them, as well as other forms of policy research and interventions.
But you sort of have to figure out what mix is the best.
I also don’t think the ‘smell tests’ are well worded. I think academic specialties often have their own dialects (as does EA), and they are often almost mutually incomprehensible. Ecological economists and neoclassicals often have different dialects, and the same theorem in math can sometimes be proven many ways, but people can only understand some of them.
- Aaron Gertler 🔸 25 Mar 2020 0:46 UTC
  3 points
  To clarify, Lant Pritchett is a development economist criticizing other development economists here. He’s the only person I’ve heard use “smell test” in this particular field, but it’s also a pretty common expression for applying “common sense” to check whether an idea seems good, across many different domains.
  - ishi 31 Mar 2020 23:10 UTC
    1 point
    I was only commenting on the particular wording of the ‘smell test’ in developmental economics. I use a smell test to decide if I need to throw food away (which I try not to do), or wash clothes, or if somehow a dead mouse is in my apt. I leave mice alone, but they are poisoned by what they eat or die of old age; I don’t think they live very long, maybe 2 years.
    Developmental economics (definitely not my area) I associate with Jeffrey Sachs, William Easterly, Amartya Sen, and Partha Dasgupta. One can add Jagdish Bhagwati and more than I can remember. There are more recent ones. I just know these from books and articles. (There are math modelers, anti-globalization/neoliberalism activists, etc.; they all have books and articles.)