Does Economic Growth Meaningfully Improve Well-being? An Optimistic Re-Analysis of Easterlin’s Research: Founders Pledge

Acknowledgments

I would like to thank Michael Plant, Matt Lerner and Rosie Bettle for their helpful comments and advice.

Summary

Understanding the relationship between wellbeing and economic growth is of key importance to Effective Altruism (e.g. see Hillebrandt and Hallstead, Clare and Goth). In particular, a key disagreement concerns the Easterlin Paradox: the finding that happiness[1] varies with income across countries and between individuals, but does not seem to vary significantly with a country’s income as it changes over time. Michael Plant recently wrote an excellent post summarizing this research. He ends up mostly agreeing with Richard Easterlin’s latest paper, which argues that the Easterlin Paradox still holds and suggests that we should look to approaches other than economic growth to boost happiness. I agree with Michael Plant that life satisfaction is a valid and reliable measure, that it should be a key goal of policy and philanthropy, and that boosting income does not increase it as much as we might naively expect. In fact, we at Founders Pledge highly value and regularly use Michael Plant’s and Happier Lives Institute’s (HLI) research, and we believe income is only a small part of what interventions should aim at. However, my interpretation of the practical implications of Easterlin’s research differs from Easterlin’s in three ways, which I argue in this post:

  1. Easterlin finds small coefficients in his preferred regressions of changes in countries’ happiness on changes in GDP. He concludes that these coefficients have low “economic significance” and that increasing economic growth is not a good way to make people happier. However, even if we take these coefficients at face value, they still represent a very meaningful increase in wellbeing within the effective altruism framework, consistent with the impacts of unconditional cash transfers on individuals. The benefits become very large when aggregated across all the people in a country for many years.

  2. We also have reason to doubt Easterlin’s results, in that they are highly sensitive to small changes in methodology. We perform two variations on his regression that fully accept his methodology of only including “full cycle” countries, but update it slightly, reversing the result. If we replicate his results counting one more country as a “transition” economy, the Easterlin paradox largely disappears. If we repeat his analysis with new data from 2020 instead of 2019, the paradox also seems to largely disappear.

  3. It may be difficult to find things we can influence whose change over time will have a higher correlation to a country’s change in happiness than changes in GDP. Even if we accept that boosting GDP does not meaningfully increase happiness, other potential means of boosting national happiness may increase it even less. If we rerun Easterlin’s analysis using three interventions Easterlin and Plant suggest (health, pollution, and a comprehensive welfare state), their implied impacts on national happiness are either much smaller than the impact of GDP or negative. However, I have low confidence in this conclusion, and think that identifying the interventions most likely to have an impact on happiness is a very valuable project.

1. Taking Easterlin’s results at face value and estimating impact

Easterlin and O’Connor (2022) rely on two regressions for their conclusions, both comparing annual changes in a country’s happiness to annual changes in per capita GDP. The first measures happiness using a “life satisfaction” survey question on a smaller set of countries from 1981-2019 and the second uses a “best possible life” survey question on a larger set of countries from 2005-2019. After excluding some of the countries in the dataset, the authors find that a one percentage point increase in the annual GDP growth rate increases happiness by 0.001 and 0.0024 points in the two regressions, respectively. They conclude that these coefficients imply that it would take 500-1000 years of one percentage point higher GDP growth to increase happiness by one point, and that they have low “economic significance.” At first glance, these numbers do seem negligible.

However, once we compare these numbers to what we would expect from the literature on the happiness impacts of cash transfers, we find that they are no smaller than we should expect. Despite being small, these numbers are not exactly 0, and to get a sense of their practical implications we need to convert them to units more familiar in effective altruism. If we want to compare the impact of economic growth to the impact of interventions like cash transfers or deworming, it is helpful to convert the happiness impact of one percentage point higher growth to units capturing the happiness impact of doubling income. In order to do this, we have to consider that it would take 71 years to create an additional doubling of income by boosting growth by one percentage point. Therefore, a doubling of income would produce a 0.07 point increase in happiness using Easterlin’s first regression and a 0.17 point increase using the second. In comparison, HLI’s meta-analysis suggests that providing a cash transfer that doubles income for an individual leads to a 0.1 standard deviation increase in subjective well being. This equates to roughly 0.2 life satisfaction points for the recipient of the transfer. When we use HLI’s methodology to adjust for the fact that other household members likely experience smaller benefits, we get an expected increase of 0.14 life satisfaction points for an average person. So one of Easterlin’s estimates is lower than the impact of a cash transfer, and one is higher. The overall picture appears to be consistent with changes in GDP providing as much happiness as changes in individual income resulting from cash transfers. GiveDirectly, which provides unconditional cash transfers, has historically been one of GiveWell’s top charities, and generally seems like a very good use of money even if it is not the very best.
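The unit conversion in the paragraph above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the 71-year figure comes from comparing 3% growth with 2% growth (the baseline used later in this section), and it treats each regression coefficient as happiness points gained per year of one-percentage-point-higher growth. The function names are hypothetical.

```python
import math

def years_for_extra_doubling(base_growth=0.02, boosted_growth=0.03):
    """Years until income on the boosted path is twice income on the base path.

    Assumption: the extra doubling is measured between a 2% and a 3% growth
    path; other baselines give a slightly different number of years.
    """
    log_gap_per_year = math.log(1 + boosted_growth) - math.log(1 + base_growth)
    return math.log(2) / log_gap_per_year

def points_per_doubling(coef_per_year, base_growth=0.02, boosted_growth=0.03):
    """Convert 'points per year of +1pp growth' into 'points per income doubling'."""
    return coef_per_year * years_for_extra_doubling(base_growth, boosted_growth)

years = years_for_extra_doubling()
life_satisfaction = points_per_doubling(0.001)    # first regression
best_possible_life = points_per_doubling(0.0024)  # second regression

print(round(years), round(life_satisfaction, 2), round(best_possible_life, 2))
# prints: 71 0.07 0.17
```

Under these assumptions the two coefficients bracket the 0.14-point household-adjusted estimate for cash transfers, which is the comparison the paragraph relies on.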

The happiness impacts of boosting GDP become very large when we take individual impacts that are comparable to GiveDirectly and aggregate them for a whole country for many years. Let us consider the impact of boosting incomes for a whole country with the same population as Ethiopia. We assume that we can find an intervention that boosts GDP growth by one percentage point for 40 years, and that the happiness impacts of this are as small as estimated by Easterlin. Only considering effects over forty years is a fairly arbitrary choice, picked to match GiveWell’s methodology of valuing income increases for 40 years, discounted at 4% annually. I think this is a fairly conservative choice, as some economic research suggests very long-term persistence of changes to GDP. We sum the discounted happiness boost across the entire population. The impact of boosting annual GDP growth from 2% to 3% would produce[2] the equivalent of approximately 400 million person-years of doubled income. HLI and GiveWell each independently estimate that the most cost effective interventions they have identified are approximately 10 times as cost effective as GiveDirectly at improving well-being. Using this multiple, the current costs of GiveDirectly suggest that EA as a community should be happy to spend $10 billion to boost GDP growth in Ethiopia by 1 percentage point for forty years. The amount would be even higher if we incorporated the likely impact of higher GDP on health and education. This is more than ten times as much as all of the money EA is likely to move this year, and likely more than the annual funding of all economics professors worldwide, the IMF, and development economics at the World Bank combined. This does not have any conclusive implications for whether boosting growth in a country like Ethiopia is tractable at these funding levels. However, it does suggest that the well-being benefits are very significant from an EA perspective, in contrast to Easterlin’s interpretation.
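For readers who want to follow the aggregation, here is a rough sketch of one reading of the calculation described in note 2. It is not the original spreadsheet: the exact treatment of the 8-year delay and the 40-year window is an assumption, the final small downward adjustment mentioned in the note is left out, and the function and constant names are hypothetical. It should therefore not be expected to reproduce the figures above exactly, although under this reading the total also comes out in the hundreds of millions of person-years of doubled income.

```python
import math

def discounted_doublings_per_person(base_growth=0.02, boosted_growth=0.03,
                                    delay_years=8, benefit_years=40,
                                    discount_rate=0.04):
    """Discounted person-years of doubled income accruing to one person.

    Assumed reading of note 2: benefits begin the year after `delay_years`,
    the log-income gap widens by the same amount each year, and each year's
    gap is discounted back by the number of calendar years elapsed.
    """
    annual_log_gain = math.log(1 + boosted_growth) - math.log(1 + base_growth)
    total_log_units = 0.0
    for k in range(1, benefit_years + 1):
        calendar_year = delay_years + k
        log_income_gap = k * annual_log_gain  # cumulative gap after k years
        total_log_units += log_income_gap / (1 + discount_rate) ** calendar_year
    return total_log_units / math.log(2)  # convert ln units to doublings

POPULATION = 115_000_000           # population figure used in note 2
COST_PER_DOUBLING_USD = 294 / 10   # one tenth of the GiveDirectly cost in note 2

per_person = discounted_doublings_per_person()
total_doublings = per_person * POPULATION
implied_budget_usd = total_doublings * COST_PER_DOUBLING_USD

print(f"doublings per person: {per_person:.2f}")
print(f"total person-years of doubled income: {total_doublings:,.0f}")
print(f"implied budget (USD): {implied_budget_usd:,.0f}")
```

Changing `delay_years` or `benefit_years` moves the total noticeably, which is one reason the headline number is best read as an order-of-magnitude estimate.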

2. Easterlin’s estimates of impact become much larger with small changes in methodology.

The previous section looks at the impacts suggested by Easterlin’s methodology if we take it at face value. However, this methodology generates lower regression coefficients than most similarly reasonable alternative specifications. We compare Easterlin’s results with those we get if we rerun his analysis making a different choice about whether we consider India a transition economy, and then by rerunning his analysis with updated happiness survey data. Additionally, we compare Easterlin’s headline results with alternative versions he presents in his paper. The coefficients from these alternative versions are more consistent with the idea that GDP gains over time yield as large a happiness increase as cross-sectional data would lead us to expect than they are with the Easterlin Paradox. Therefore, I don’t think this latest paper should update us much away from the intuitive idea that higher incomes lead to more well-being.

Cross sectional data suggests that we should expect a 0.5 point increase in happiness from a doubling in income. If we look at a regression of Cantril Ladder “best possible life” scores against GDP on a log scale, the coefficient implies slightly under 0.5 points per GDP doubling.

Cantril Ladder versus GDP (Sacks et al. 2010)

Similarly, if we look at a graph of Cantril Ladder scores for individuals versus their incomes (figure 1 in Michael Plant’s post), we can estimate around 0.5 points per income doubling. In contrast, if we look at Easterlin’s .0024 regression coefficient, it only implies an increase of 0.17 points per income doubling.[3] This is close enough to 0 that it is reasonable for Easterlin to classify it as a paradox when compared to the 0.5 point estimates from alternative sources of data. However, when I rerun Easterlin’s analysis classifying one additional ambiguous case as a transition economy, or using newer data, the coefficients increase. The new results are closer to 0.5 than they are to 0, and don’t seem to imply the existence of a paradox.

Easterlin argues that we need to exclude countries that transitioned from socialist to capitalist economies from our analysis in order to remove the noise created by countries that only start to conduct happiness surveys just as their economies plummet with the start of the transition to capitalism. Most of the countries he excludes are Eastern European, but he also considers China a transition economy. I think it would be reasonable to put India in the same category as China, since both countries experienced a more gradual transition from socialism than did Eastern Europe with the collapse of the Soviet Union. If we repeat Easterlin’s analysis with his data, but exclude India along with China, we get an estimate of 0.3 life satisfaction points per income doubling. So the estimate moves from being closer to 0 to being closer to 0.5 after a minor methodological adjustment.

Next, I replicate Easterlin’s analysis with newly available 2020 “best possible life” scores, instead of the 2019 data in the original paper. In this regression I accept all of his methodological choices about which transition economies to exclude, and how to decide whether a country needs to be excluded for insufficient data. The new regression implies an impact of 0.3 life satisfaction points per income doubling.[4] Once again, this version of the analysis is closer to being consistent with the cross sectional data (0.5 points per income doubling), than it is to a paradox (0 points).
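The general shape of this kind of regression can be sketched as follows. This is a simplified illustration, not Easterlin and O’Connor’s actual procedure or the exact code behind the numbers above: the annualization formulas, the use of a plain least-squares line, and all function names are assumptions. The default of 71 years per extra doubling is the figure from section 1.

```python
import numpy as np

def annualized_level_change(start, end, years):
    """Average change per year in a happiness score between two survey years."""
    start, end, years = (np.asarray(a, dtype=float) for a in (start, end, years))
    return (end - start) / years

def annualized_growth_pct(start, end, years):
    """Compound annual growth rate of GDP per capita, in percent."""
    start, end, years = (np.asarray(a, dtype=float) for a in (start, end, years))
    return ((end / start) ** (1.0 / years) - 1.0) * 100.0

def ols_slope_intercept(x, y):
    """Least-squares line y = slope * x + intercept, one point per country."""
    slope, intercept = np.polyfit(np.asarray(x, dtype=float),
                                  np.asarray(y, dtype=float), deg=1)
    return slope, intercept

def slope_to_points_per_doubling(slope, years_per_extra_doubling=71.0):
    """Convert points per year of +1pp growth into points per income doubling."""
    return slope * years_per_extra_doubling
```

In use, each country contributes one pair: its annualized GDP growth as `x` and its annualized happiness change as `y`. Dropping or adding a handful of countries changes the fitted slope, which is the sensitivity this section is concerned with.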

Similarly, if we look at alternative versions of the regressions included in Easterlin and O’Connor (2022), almost all of them have much higher coefficients than the main result. Easterlin makes two key methodological choices. The first is excluding transition economies. For both the “life satisfaction” and “best possible life” regressions, not excluding the transition economies would imply an impact of 0.4 life satisfaction points per income doubling. The second choice Easterlin makes is excluding all countries from the “best possible life” regression that have fewer than 12 years of data available. When he includes the 8 additional countries with 10 or 11 years of data, the impact also goes up to 0.4. I think Easterlin makes good arguments for these two choices.[5] However, I think we have to consider how sensitive his conclusion is to judgment calls when deciding how much to believe that there is a surprising paradox[6] in the happiness data.

3. The happiness impact of alternative interventions is smaller than the impact of GDP.

Easterlin concludes his latest paper by suggesting that, even though he does not believe GDP growth has a meaningful impact on happiness, there are a number of better interventions. Michael Plant adds some suggestions to the list in his post, coming up with a set of potential interventions that includes:

“...job security, a comprehensive welfare state, getting citizens to be healthy, and encouraging long-term relationships…[taking] mental health and palliative care more seriously…improved air quality, reduced noise, more green and blue space (blue spaces being water), and getting people to commute smaller distances (Diener et al. 2019). Social interactions could be enhanced via urban design, reducing corruption, increasing transparency, supporting healthy family relationships, and maybe even things like progressive taxation.”

All of these sound like promising ideas, and are a good research agenda for future investigation. However, it may be difficult to find one of these measures that has a higher impact on country-level happiness than GDP using Easterlin’s methodology. To perform an exploratory analysis, I start with Easterlin’s data from his “best possible life” regression (taking his relatively low estimated impacts at face value, as I do in section 1). I then choose three interventions from Michael Plant’s list that seem to have a fair amount of annual data available on OurWorldInData.org: health, pollution and a comprehensive welfare state.[7] I replace annual GDP growth in Easterlin’s regression with annual growth on these three metrics, and perform a separate analysis for each one.[8] Each regression looks at annualized changes in a country’s Cantril Ladder scores versus annualized changes in the specified metric for the past 12-14 years. The health regression estimates how much a decrease in the number of years people in a country lose to ill health corresponds to increases in happiness. This regression produces coefficients that are either an order of magnitude smaller than the GDP regression, or negative, depending on whether we exclude countries that have less than 12 years of data. In both cases the r-squared of the regression is essentially 0. There does not appear to be a way to interpret these results to suggest that changes in health have a higher impact on national happiness than changes in GDP. The pollution regression repeats the methodology for health, but looks only at the changes in the years of life lost to pollution. This analysis actually shows negative results of a magnitude similar to the positive results of the GDP regression. This would imply that increases in pollution are actually associated with countries getting happier. For example, the Republic of Congo and Benin both had large annual increases in happiness despite increasing levels of pollution.[9] The comprehensive welfare state regression examines the impact of changes in a score of whether a country has an adequate safety net. This analysis also shows negative results; however, there are very few countries and years for which this data is available and the data appears to be of low quality, suggesting that we should not read too much into this result. In all three of these analyses we do not find any evidence consistent with any of these metrics having a higher impact on national happiness than changes in GDP.
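Note 8 describes rescaling each alternative metric so that its coefficient can be read on the same footing as the GDP coefficient. A minimal sketch of that step, assuming a sample standard deviation and a simple sign flip for metrics where lower values are better (such as DALY rates), might look like this. The function name and signature are hypothetical.

```python
import numpy as np

def rescale_to_gdp_units(metric_change, gdp_change, lower_is_better=True):
    """Rescale annualized metric changes to match the spread of GDP changes.

    The output has the same sample standard deviation as `gdp_change`, and
    its sign is flipped when lower raw values mean improvement, so that
    larger numbers always imply improvement.
    """
    metric_change = np.asarray(metric_change, dtype=float)
    gdp_change = np.asarray(gdp_change, dtype=float)
    metric_sd = np.std(metric_change, ddof=1)
    if metric_sd == 0:
        raise ValueError("metric_change has no variation to rescale")
    factor = np.std(gdp_change, ddof=1) / metric_sd
    sign = -1.0 if lower_is_better else 1.0
    return sign * factor * metric_change
```

The rescaled series can then take the place of GDP growth as the explanatory variable. For the safety-net adequacy score, where higher values are better, `lower_is_better=False` would be the natural setting.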

I do not have a high level of confidence in these initial results. There are likely better sources of data, and better methodologies to employ. However, I do think they suggest that it may be difficult to find any interventions of this kind which will imply a larger impact on happiness than GDP using Easterlin’s methodology.

4. Conclusion

Easterlin’s estimates of the impact of GDP growth on happiness are not as small as they initially appear. They are consistent with experimental data from individual cash transfers, and imply large welfare gains when aggregated for an entire country. When I consider slight variations in methodological choices that Easterlin makes, or update his data for 2020, the estimated impacts get much bigger. This leads me to decrease my belief in the existence of an Easterlin Paradox that we need to explain. But even if we accept Easterlin’s estimates, it may be difficult to find other things we can influence that will have a larger measured impact on happiness than GDP growth. I find three of the more promising potential ways to boost national happiness to have a smaller impact than boosting GDP. Of course, other interventions may prove to be far more tractable than boosting GDP, even if they have a lower impact on happiness. Also, we can likely find better sources of evidence than regressions with fewer than a hundred datapoints. So my conclusion is not that different from Easterlin’s and Michael Plant’s, in that I do think the interventions they propose are very promising routes to explore towards increasing happiness. I just don’t think the data warrants dismissing GDP growth as a potentially even more promising route.

Notes


  1.

    I use happiness in this post interchangeably with both life satisfaction and Cantril Ladder “best possible life” scores. Easterlin does not discuss measures of affect or other more immediate metrics, so happiness is not meant to refer to those here.

  2.

    This spreadsheet calculation starts with one individual benefiting from increased GDP growth, and assumes that benefits of the GDP growth intervention start accruing after 8 years (just as the benefits of deworming start to accrue 8 years after). On the 9th year, the benefit is the difference between growth of 3% versus 2%, discounted by 4% for 9 years. The benefit is quantified as ln of percent income (to be consistent with GiveWell income boosting CEAs). The 10th year has further growth of the same amount, and is discounted by 4% for 10 years. We continue this estimation for 40 years and sum across all years. We then multiply by 115 million, the population of Ethiopia. The total ln impact is then converted to income doublings by dividing by ln(2). We consider the cost of an income doubling to be 1/10 of the $294 GiveWell uses for GiveDirectly. We discount the final result slightly because the average estimated impact in Easterlin’s two regressions is slightly below the impact of cash transfers on happiness.

  3.

    I focus on replicating the Cantril Ladder “best possible life” regression from the Gallup survey within Easterlin and O’Connor because the data is more readily available online than the World Values Survey “life satisfaction” regression. I am also not able to benefit from the longer time series in the World Values Survey because my potential interventions in section 3 have limited historical data. After arriving at regression coefficients, I convert them to “life satisfaction impact of an income doubling” here.

  4.

    The regression is not an exact match to Easterlin’s because I use a different source for GDP, the new dataset includes more countries, and I use a simplified methodology for estimating annual happiness change (I do this because I believe it is more consistent with how we calculate annual GDP change, and for simplicity). However, my 2019 coefficient roughly matches Easterlin’s.

  5.

    The reason that Easterlin excludes economies that have less than 12 years of data and that recently transitioned from socialism to capitalism is that he does not consider them to be full-cycle countries. Easterlin accepts that short term fluctuations in the business cycle have impacts on happiness in the short term. However, he thinks that once we zoom out to longer patterns of GDP growth across an entire business cycle, these impacts disappear. Since economies that transitioned from socialism have generally been growing since happiness surveys for these countries have become available, they have not experienced a full business cycle including a “bust.” Similarly, Easterlin does not consider countries with less than 12 years of data to have had a full business cycle (although he does not specify why the cutoff should be 12 rather than 13 or 11).

  6.

    Easterlin argues that this contradiction can be resolved by considering that people evaluate their lives in comparison to those around them. While this could explain the contradiction with cross-sectional data of individuals within a country, I do not think it does a good job of explaining the cross-sectional data at the country level, or the data from cash transfers. If people were evaluating their lives in comparison to those around them, then people in poorer countries would not be much less happy than people in rich countries at a given point in time (especially in older data sets before the spread of Western media). People would also not get any happier when they and everyone in their village received a cash transfer from GiveDirectly.

  7.

    I proxy health with “DALYs-rate-from-all-causes per 100,000,” pollution with “DALYs-particulate-matter per 100,000,” and a comprehensive welfare state with “adequacy-of-social-safety-net-programs.”

  8.

    I begin the analysis by pulling all happiness scores, and the years in which they were recorded, from Easterlin’s appendix. In order to estimate the changes in each new metric, I pull data for all available years from OurWorldInData. I then do a lookup for the country-year pairs corresponding to the starting and ending year for each country in Easterlin’s data. Finally, I annualize the changes from the first to last year using the same methodology as Easterlin uses for GDP. I also multiply the annual changes by a factor so that they have the same standard deviation as the changes of GDP, and so that larger numbers imply improvement. I do this in order to have a consistent interpretation of coefficients. However, the conclusions are not sensitive to this methodological choice.

  9.

    Of course, this does not mean that decreasing pollution levels is bad for happiness. The more likely explanation is that higher pollution in low and middle income countries is associated with more industrial jobs and more cars, both of which probably make people happier. Surprisingly, though, controlling for GDP growth does not seem to do much to reduce the coefficient on pollution.