(Apologies in advance if you end up seeing two versions of this!)
Hi David, thank you for digging in so deep! Let me respond with the same amount of effort.
First, I was not aware of the EA happiness institute (great!), but I do know about the Oxford centre, because it’s run by one of the authors of the 2023 WHR. I haven’t approached them yet because I wanted to kick the tires as hard as possible on the research first, but I welcome critiques of that strategy.
Re: the report and its claims (and please interpret all of my frustration as being very much at the report itself, not you!): While they do not *say* their model is definitive, they unambiguously *act* like it is. For god’s sake, they’re publishing something they call “The World Happiness Report,” and they’ve been doing so for a decade, with exactly the same model. As a researcher I find this particularly galling, because it just feels disingenuous. The data I use has been available for every year of the report’s publication, and yet the 2023 report starts with a table of contents, a brief introduction, and a full-page, full-color, adorable heart-filled cartoon that says GDP is a central factor in satisfaction! It is listed, measured, and implied to be first. By contrast, my measurably more accurate model finds that GNI (not even GDP!) explains 2.5% of model variation and is only the *eleventh* most important factor in the model. Sure, they have caveats, but they’re very much in the fine print, and they don’t make any difference anyway if there’s no alternative provided.
Next: I agree there is an important overlap, both conceptual and literal, in the variables we use. In particular, for the variables on Feelings of Freedom, Healthy Life Expectancy, and Social Support, I use the WHR data itself (in part because FoF and Support are only *available* from them). And you are right that several of the variables could be considered as pieces of other variables. I try to highlight this in the table towards the end of the paper because I think it’s important, and also to show how many categories, even high-level ones, the WHR is missing.
But I am adamant that the real value of models like this is in their ability to *improve* satisfaction, not just describe it, and actionability requires more precision than “Health.” I deeply believe it’s only in the nuances that we can actually see the outlines of a solution. The nuances can also dramatically change the interpretation of the very large but very vague variables. For example, “Social Support” on its own sounds like the moral is “just be nice to people.” However, when you realize that the model also has huge contributions from Gay and Lesbian Social Acceptance and Political Power for Women, the moral becomes “be nice to people … including minorities, not just your own in-group … and actually don’t just be nice to them, but meaningfully share real power.” This implies a dramatically different narrative, different policies, and different interventions.
Re: causality specifically, I address that in much more detail in the paper, but since you bring up noisy data I want to emphasize that I put a *lot* of work into making sure that the chosen variables are not noise. I ran dozens of iterations (well, thousands in total!) that randomly omitted rows, and randomly omitted columns, to test the robustness of the results, and the variables I report are the ones that are selected every. single. time. In fact, I have reason to believe that the breadth of the search and the strong filter for robustness make me much *less* susceptible to spurious variables than the WHR. In these thousands of tests, there are two out of six WHR variables that I estimate to be around thirtieth most important to prediction, and so they don’t belong anywhere near the top six as the WHR claims. That’s a third of their reported model!
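For readers who want to see the shape of that kind of robustness check, here is a minimal sketch in Python. It is not the code from the paper. The selector (top-k columns by absolute correlation with the target), the function names, the 80% subsampling fractions, and the synthetic data are all assumptions made for illustration; the paper's actual model and selection procedure may differ substantially.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def select_top_k(columns, target, k):
    """Hypothetical stand-in selector: the k columns most correlated with the target."""
    ranked = sorted(
        columns, key=lambda name: abs(pearson(columns[name], target)), reverse=True
    )
    return set(ranked[:k])


def always_selected(columns, target, k=2, n_runs=200,
                    row_frac=0.8, col_frac=0.8, seed=0):
    """Return the columns that were selected in every run in which they were offered."""
    rng = random.Random(seed)
    names = sorted(columns)
    n_rows = len(target)
    offered = {name: 0 for name in names}
    chosen = {name: 0 for name in names}
    for _ in range(n_runs):
        # Randomly omit some rows and some columns, then re-run the selection.
        rows = rng.sample(range(n_rows), max(2, int(row_frac * n_rows)))
        cols = rng.sample(names, max(1, int(col_frac * len(names))))
        sub = {name: [columns[name][i] for i in rows] for name in cols}
        sub_target = [target[i] for i in rows]
        picked = select_top_k(sub, sub_target, k)
        for name in cols:
            offered[name] += 1
            chosen[name] += int(name in picked)
    return {n for n in names if offered[n] > 0 and chosen[n] == offered[n]}


def make_synthetic(n_rows=300, n_noise=4, seed=1):
    """Synthetic data (an assumption): two real signals plus pure-noise columns."""
    rng = random.Random(seed)
    columns = {
        "signal_a": [rng.gauss(0, 1) for _ in range(n_rows)],
        "signal_b": [rng.gauss(0, 1) for _ in range(n_rows)],
    }
    for j in range(n_noise):
        columns[f"noise_{j}"] = [rng.gauss(0, 1) for _ in range(n_rows)]
    target = [
        2.0 * a - 2.0 * b + 0.1 * rng.gauss(0, 1)
        for a, b in zip(columns["signal_a"], columns["signal_b"])
    ]
    return columns, target


if __name__ == "__main__":
    columns, target = make_synthetic()
    print(sorted(always_selected(columns, target)))
```

On data built this way, the two signal columns should be the ones that survive, while a noise column that sneaks in when a real signal is omitted fails the "every single time" filter. Requiring selection in every run is a deliberately strict rule; a softer threshold (say, 95% of runs) trades some of that strictness for sensitivity.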
Last, I’m very impressed that you got into the finance data! You’ve inspired me to take a closer look. I would still defend its inclusion like this: it showed up in every iteration of the robustness test, it’s strongly significant in the model, statistics is about aggregates, not about North Korea, and I use 1,964 observations over a period of 18 years. (Plus, North Korea is definitely not included in the Gallup World Poll, so it’s moot for these conclusions anyway.) The variable is also completely consistent with the substance of the other VDem variables: people need political power (women’s political power), they need that political power to be meaningful, not decorative (no shadow government), and they need a way to *achieve* that political power, hence public financing for elections. For all of these reasons I think it belongs in the model. It’s true that researchers like Nicholas Carnes emphasize that a naive implementation of funding is unlikely to help on its own, but I would argue that none of the discovered variables will be successful if implemented naively.
Still, as above, I think the real contribution comes from the fact that financing provides not just an outcome but a path, if also a reminder that each specific action can only have a small impact. You may say you want Women’s Political Power, okay, great, but how? Significant public financing for campaigns is a concrete resource, with a concrete action, and a concrete objective. In other words, there’s actually an extremely clear causal mechanism for public financing, one that’s much clearer than for many of the other variables. And that’s something that “Healthy Life Expectancy” just doesn’t provide.
I just lost a very long response—I’m trying to comment just to see if this gets eaten too. Hopefully it’s just a moderation thing …
Response to David T (so I don’t forget, too)
And thank you again!