Hello Luke, thanks for this, which was illuminating. I’ll make an initial clarifying comment and then go on to the substantive issues of disagreement.
“At least for me, I don’t think this is a case of an EA funder repeatedly ignoring work by e.g. Michael Plant. I think it’s a case of me following the debate over the years and disagreeing on the substance after having familiarized myself with the literature.”
I’m not sure what you mean here. Are you saying GiveWell didn’t repeatedly ignore the work? That Open Phil didn’t? Something else? As I set out in another comment, my experience with GiveWell staff was of being ignored by people who weren’t that familiar with the relevant literature. FWIW, I don’t recall the concerns you raise in your notes being raised with me. I had no interactions with Open Phil staff prior to 2021 (for those reading, Luke and I have never spoken), so I’m not able to comment on that.
Onto the substantive issues. Would you be prepared to state more precisely what your concerns are, and what sort of evidence would change your mind? Reading your comments and your notes, I’m not sure exactly what your objections are and, insofar as I do understand them, they don’t seem like strong objections.
You mention “weakly validated measures” as an issue but in the text you say “for some scales, reliability and validity have been firmly established”, which implies to me you think (some) scales are validated. So which scales are you worried about, to what extent, and why? Are they so non-validated we should think they contain no information? If some scales are validated, why not just use those ones? By analogy, we wouldn’t give up on measuring temperature if we thought only some of our thermometers were broken. I’m not sure if we’re even on the same page about what it is to ‘validate’ a measure of something (I can elaborate, if helpful).
On “unconvincing intervention studies”, I take it you’re referring to your conversation notes with Sonja Lyubomirsky. The ‘happiness interventions’ you talk about are really just those from the field of ‘positive psychology’ where, basically, you take mentally healthy people and try to get them to change their thoughts and behaviours to be happier, such as by writing down what they’re grateful for. This implies a very narrow interpretation of ‘happiness interventions’. Reducing poverty or curing diseases are ‘happiness interventions’ in my book because they increase happiness, but they are certainly not positive psychology interventions. One can coherently think that subjective wellbeing measures, e.g. self-reported happiness, are valid and capture something morally important but deny that gratitude journalling etc. are particularly promising ways, in practice, of increasing it. Also, there’s a big difference between the lab-style experiments psychologists run and the work economists tend to do looking at large panel and cohort data sets.
Regarding “one entire literature using the wrong statistical test for decades”, again, I’m not sure exactly what you mean. Is the point about ‘item response theory’? I confess that’s not something that gets discussed in the academic world of subjective wellbeing measurement; I don’t think I’ve ever heard it mentioned. After a quick look, it seems to be a method to relate scores of psychometric tests to real-world performance. That seems to be a separate methodological ballgame from concerns about the relationship between how people feel and how they report those feelings on a numerical scale, e.g. when we ask “how happy are you, 0-10?”. Subjective wellbeing researchers do talk about the issue of ‘scale cardinality’, i.e., roughly, does your “7/10” feel the same to you as my “7/10” does to me? This issue has been starting to get quite a bit of attention in just the last couple of years but has, I concede, been rather neglected by the field. I’ve got a working paper on this under review which is (I think) the first comprehensive review of the problem.
To me, it looks like in your initial investigation you had the bad luck to run into a couple of dead ends and, quite understandably given those, didn’t go further. But I hope you’ll let me try to explain further to you why I think happiness research (like happiness itself) is worth taking seriously!
Hi Michael,
I don’t have much time to engage on this, but here are some quick replies:
I don’t know anything about your interactions with GiveWell. My comment about ignoring vs. not-ignoring arguments about happiness interventions was about me / Open Phil, since I looked into the literature in 2015 and have read various things by you since then. I wouldn’t say I ignored those posts and arguments, I just had different views than you about likely cost-effectiveness etc.
On “weakly validated measures,” I’m talking in part about lack of IRT validation studies for SWB measures used in adults (NIH funded such studies for SWB measures in kids but not adults, IIRC), but also about other things. The published conversation notes only discuss a small fraction of my findings/thoughts on the topic.
On “unconvincing intervention studies” I mean interventions from the SWB literature, e.g. gratitude journals and the like. Personally, I’m more optimistic about health and anti-poverty interventions for the purpose of improving happiness.
On “wrong statistical test,” I’m referring to the section called “Older studies used inappropriate statistical methods” in the linked conversation notes with Joel Hektner.
TBC, I think happiness research is worth engaging with and has things to teach us, and I think there may be some cost-effective happiness interventions out there. As I said in my original comment, I moved on to other topics not because I think the field is hopeless, but because it was in a bad enough state that it didn’t make sense for me to prioritize it at the time.
Hello Luke,
Thanks for this too. I appreciate you’ve since moved on to other things, so this isn’t really your topic to engage on anymore. However, I’ll make two comments.
First, you said you read various things in the area, including by me, since 2015. It would have been really helpful (to me) if, given you had different views, you had engaged at the time and set out where you disagreed and what sort of evidence would have changed your mind.
Second, and similarly, I would really appreciate it if the current team at Open Philanthropy could more precisely set out their perspective on all this. I did have a few interactions with various Open Phil staff in 2021, but I wouldn’t say I’ve got anything like canonical answers on what their reservations are about (1) measuring outcomes in terms of SWB (Alex Berger’s recent technical update didn’t comment on this) and (2) doing more research or grantmaking into the things that, from the SWB perspective, seem overlooked.
This is an interesting conversation, but it’s veering off into a separate topic. I wish there was a way to “rebase” these spin-off discussions into a different place, for better organisation.