To me (as someone who has funded the Happier Lives Institute) I just think it should not have taken founding an institute and six years of repeating this message (and feeling largely ignored and dismissed by existing EA orgs) to reach the point we are at now.
I think expecting orgs and donors to change direction is certainly a very high bar. But I don’t think we can currently pride ourselves on being a community that pivots and changes direction when new data (e.g. on subjective wellbeing) is made available to us.
FWIW, one of my first projects at Open Phil, starting in 2015, was to investigate subjective well-being interventions as a potential focus area. We never published a page on it, but we did publish some conversation notes. We didn’t pursue it further because my initial findings were that there were major problems with the empirical literature, including weakly validated measures, unconvincing intervention studies, one entire literature using the wrong statistical test for decades, etc. I concluded that there might be cost-effective interventions in this space, perhaps especially after better measure validation studies and intervention studies are conducted, but my initial investigation suggested it would take a lot of work for us to get there, so I moved on to other topics.
At least for me, I don’t think this is a case of an EA funder repeatedly ignoring work by e.g. Michael Plant — I think it’s a case of me following the debate over the years and disagreeing on the substance after having familiarized myself with the literature.
That said, I still think some happiness interventions might be cost-effective upon further investigation, and I think our Global Health & Well-Being team has been looking into the topic again as that team has gained more research capacity in the past year or two.
Hello Luke, thanks for this, which was illuminating. I’ll make an initial clarifying comment and then go on to the substantive issues of disagreement.
At least for me, I don’t think this is a case of an EA funder repeatedly ignoring work by e.g. Michael Plant — I think it’s a case of me following the debate over the years and disagreeing on the substance after having familiarized myself with the literature.
I’m not sure what you mean here. Are you saying GiveWell didn’t repeatedly ignore the work? That Open Phil didn’t? Something else? As I set out in another comment, my experience with GiveWell staff was of being ignored by people who weren’t all that familiar with the relevant literature (FWIW, I don’t recall the concerns you raise in your notes being raised with me). I’ve not had interactions with Open Phil staff prior to 2021 (for those reading, Luke and I have never spoken), so I’m not able to comment regarding that.
Onto the substantive issues. Would you be prepared to state more precisely what your concerns are, and what sort of evidence would change your mind? Reading your comments and your notes, I’m not sure exactly what your objections are and, insofar as I do, they don’t seem like strong objections.
You mention “weakly validated measures” as an issue but in the text you say “for some scales, reliability and validity have been firmly established”, which implies to me you think (some) scales are validated. So which scales are you worried about, to what extent, and why? Are they so non-validated we should think they contain no information? If some scales are validated, why not just use those ones? By analogy, we wouldn’t give up on measuring temperature if we thought only some of our thermometers were broken. I’m not sure if we’re even on the same page about what it is to ‘validate’ a measure of something (I can elaborate, if helpful).
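To illustrate one ingredient of what I mean by ‘validating’ a measure: internal-consistency reliability, often summarised by Cronbach’s alpha. A toy sketch with made-up numbers (real validation also involves test–retest reliability, convergent and discriminant validity, and so on):

```python
# Toy sketch of Cronbach's alpha, one common summary of a scale's
# internal-consistency reliability. Data below are invented for
# illustration only.

def cronbach_alpha(items):
    """items: one list of scores per scale item, same respondents in order."""
    k = len(items)          # number of items in the scale
    n = len(items[0])       # number of respondents

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    # Each respondent's total score across all items
    totals = [sum(item[i] for item in items) for i in range(n)]
    item_var_sum = sum(var(item) for item in items)
    return (k / (k - 1)) * (1 - item_var_sum / var(totals))

# Three 0-10 wellbeing items answered by four hypothetical respondents:
items = [
    [7, 3, 8, 2],
    [6, 4, 9, 1],
    [7, 2, 8, 3],
]
print(round(cronbach_alpha(items), 2))  # → 0.97 (items track each other closely)
```

High alpha here just means the items move together across respondents; it says nothing by itself about whether they track happiness rather than something else, which is why validity is a separate question from reliability.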
On “unconvincing intervention studies”, I take it you’re referring to your conversation notes with Sonja Lyubomirsky. The ‘happiness interventions’ you talk about are really just those from the field of ‘positive psychology’ where, basically, you take mentally healthy people and try to get them to change their thoughts and behaviours to be happier, such as by writing down what they’re grateful for. This implies a very narrow interpretation of ‘happiness interventions’. Reducing poverty or curing diseases are ‘happiness interventions’ in my book because they increase happiness, but they are certainly not positive psychology interventions. One can coherently think that subjective wellbeing measures, e.g. self-reported happiness, are valid and capture something morally important but deny that gratitude journalling etc. are particularly promising ways, in practice, of increasing it. Also, there’s a big difference between the lab-style experiments psychologists run and the work economists tend to do looking at large panel and cohort data sets.
Regarding “one entire literature using the wrong statistical test for decades”, again, I’m not sure exactly what you mean. Is the point about ‘item response theory’? I confess that’s not something that gets discussed in the academic world of subjective wellbeing measurement; I don’t think I’ve ever heard it mentioned. After a quick look, it seems to be a method to relate scores of psychometric tests to real-world performance. That seems to be a separate methodological ballgame from concerns about the relationship between how people feel and how they report those feelings on a numerical scale, e.g. when we ask “how happy are you, 0-10?”. Subjective wellbeing researchers do talk about the issue of ‘scale cardinality’, i.e., roughly, does your “7/10” feel the same to you as my “7/10” does to me? This issue has started to get quite a bit of attention in just the last couple of years but has, I concede, been rather neglected by the field. I’ve got a working paper on this under review which is (I think) the first comprehensive review of the problem.
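To make the cardinality worry concrete, here is a toy illustration (the reporting functions are entirely hypothetical, not drawn from any real study): two people with identical underlying wellbeing can give different 0-10 reports simply because they use the scale differently.

```python
# Toy illustration of 'scale cardinality': identical latent wellbeing,
# different habits of mapping feelings onto a 0-10 scale.
# The linear reporting functions below are purely hypothetical.

def report(latent, stretch, shift):
    """Map latent wellbeing in [0, 1] to a 0-10 self-report."""
    score = 10 * (stretch * latent + shift)
    return max(0, min(10, round(score)))

# Both respondents have the same latent wellbeing of 0.7:
alice = report(0.7, stretch=1.0, shift=0.0)  # uses the full scale
bob = report(0.7, stretch=0.6, shift=0.2)    # avoids the scale's endpoints

print(alice, bob)  # → 7 6: different numbers for identical feelings
```

If respondents differ systematically in their reporting functions, comparing or averaging raw scores across people (or countries, or time) can mislead, which is exactly why the cardinality question matters for using these measures in cost-effectiveness work.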
To me, it looks like in your initial investigation you had the bad luck to run into a couple of dead ends and, quite understandably given those, didn’t go further. But I hope you’ll let me try to explain further to you why I think happiness research (like happiness itself) is worth taking seriously!
I don’t have much time to engage on this, but here are some quick replies:
I don’t know anything about your interactions with GiveWell. My comment about ignoring vs. not-ignoring arguments about happiness interventions was about me / Open Phil, since I looked into the literature in 2015 and have read various things by you since then. I wouldn’t say I ignored those posts and arguments, I just had different views than you about likely cost-effectiveness etc.
On “weakly validated measures,” I’m talking in part about lack of IRT validation studies for SWB measures used in adults (NIH funded such studies for SWB measures in kids but not adults, IIRC), but also about other things. The published conversation notes only discuss a small fraction of my findings/thoughts on the topic.
On “unconvincing intervention studies” I mean interventions from the SWB literature, e.g. gratitude journals and the like. Personally, I’m more optimistic about health and anti-poverty interventions for the purpose of improving happiness.
On “wrong statistical test,” I’m referring to the section called “Older studies used inappropriate statistical methods” in the linked conversation notes with Joel Hektner.
TBC, I think happiness research is worth engaging with and has things to teach us, and I think there may be some cost-effective happiness interventions out there. As I said in my original comment, I moved on to other topics not because I think the field is hopeless, but because it was in a bad enough state that it didn’t make sense for me to prioritize it at the time.
Thanks for this too. I appreciate you’ve since moved on to other things, so this isn’t really your topic to engage on anymore. However, I’ll make two comments.
First, you said you read various things in the area, including by me, since 2015. It would have been really helpful (to me) if, given you had different views, you had engaged at the time and set out where you disagreed and what sort of evidence would have changed your mind.
Second, and similarly, I would really appreciate it if the current team at Open Philanthropy could more precisely set out their perspective on all this. I did have a few interactions with various Open Phil staff in 2021, but I wouldn’t say I’ve got anything like canonical answers on what their reservations are about (1) measuring outcomes in terms of SWB (Alex Berger’s recent technical update didn’t comment on this) and (2) doing more research or grantmaking into the things that, from the SWB perspective, seem overlooked.
This is an interesting conversation, but it’s veering off into a separate topic. I wish there were a way to “rebase” these spin-off discussions into a different place, for better organisation.
Do you feel that existing data on subjective wellbeing is so compelling that it’s an indictment of EA for GiveWell/Open Phil not to have funded more work in that area? (Founders Pledge released their report in early 2019 and was presumably working on it much earlier, so they wouldn’t seem to be blameworthy.)
I can’t say much more here without knowing the details of how Michael/others’ work was received when they presented it to funders. The situation I’ve outlined seems to be compatible both with “this work wasn’t taken seriously enough” and “this work was taken seriously, but seen as a weaker thing to fund than the things that were actually funded” (which is, in turn, compatible with “funders were correct in their assessment” and “funders were incorrect in their assessment”).
That Michael felt dismissed is moderate evidence for “not taken seriously enough”. That his work (and other work like it) got a bunch of engagement on the Forum is weak evidence for “taken seriously” (what the Forum cares about =/= what funders care about, but the correlation isn’t 0). I’m left feeling uncertain about this example, but it’s certainly reasonable to argue that mental health and/or SWB hasn’t gotten enough attention.
(Personally, I find the case for additional work on SWB more compelling than the case for additional work on mental health specifically, and I don’t know the extent to which HLI was trying to get funding for one vs. the other.)
Do you feel that existing data on subjective wellbeing is so compelling that it’s an indictment on EA for GiveWell/OpenPhil not to have funded more work in that area?
TL;DR: Hard to judge. Maybe: yes for GiveWell, no for Open Phil, mixed for the EA community as a whole.
I will slightly dodge the question and answer a separate one: are these orgs doing enough exploratory-type research? (I think this is the more pertinent question; and although subjective wellbeing is worth looking into, it is not clear it is at the very top of the list of things that might change how we think about doing good.)
First, a massive caveat: I do not know for sure. It is hard to judge, and knowing exactly how seriously various orgs have looked into topics is very hard to do from the outside. So take the below with a pinch of salt. That said:
Open Phil – A-OK.
Open Phil (neartermists) generally seem good at exploring new areas and experimenting (and, as Luke highlights, did look into this).
GiveWell – hmmm could do better.
GiveWell seem to have a pattern of saying they will do more exploratory research (e.g. into policy) and then not doing it (mentioned here; I think 2020 saw some, but minimal, progress).
I am genuinely surprised GiveWell have not found things better than anti-malaria and deworming (sure, there are limits on how effective scalable charities can be, but it seems odd that our first guesses are still the top recommendations).
There is limited catering to anyone who is not a classical utilitarian – for example, if you care about wellbeing (e.g. years lived with disability) but not lives saved, it is unclear where to give.
EA in general – so-so.
There has been interest from EAs (individuals, Charity Entrepreneurship, Founders Pledge, EAG) in the value of happiness, addressing mental health issues, etc.
It is not just Michael. I get the sense the folk working on Improving Institutional Decision Making (IIDM) have struggled to get traction and funding and support too. (Although maybe promoters of new cause areas within EA always feel their ideas are not taken seriously.)
The EA community (not just GiveWell) seems very bad at catering to folk who are not roughly classical (or negative-leaning) utilitarians (something I struggled with when working as a community builder).
I do believe there is a lack of exploratory research happening given the potential benefits (see here and here). Maybe Rethink are changing this.
Not sure I really answered the question. And anyway, none of those points are strong evidence so much as me trying to explain my current intuitions. But maybe I said something of interest.