Three thoughts. First, it’s not really the case that EAs use QALYs/DALYs. GWWC and GiveWell used to use them, but GWWC no longer exists as an independent entity and GiveWell now use their own metric. 80k mostly focus on the far future, so QALYs/DALYs aren’t of primary interest to them. Have I missed someone? I think Founders Pledge do use them. I’m not sure what goes on ‘under the hood’ for The Life You Can Save’s recommendations.
Second, even if you wanted to use the experience sampling method (ESM) as your measure of wellbeing, you couldn’t, because there isn’t enough data on it. Only two academic projects have tried to collect data en masse: trackyourhappiness and mappiness. The former is now defunct (Killingsworth works for Microsoft now, I believe) and the latter isn’t actively being used (I spoke to its creator, George MacKerron, a couple of months ago). I discuss this in a previous forum post. The best we can do, I think, if we want to use a subjective wellbeing (SWB) measure, is life satisfaction.
Third, I think ESM is the theoretically ideal measure of happiness and thus EA—indeed, everyone—should use it as the outcome measure of impact (I assume wellbeing consists in happiness). What follows is that ESM is superior to all other measures of wellbeing, including QALYs/DALYs, wealth, etc. I’m hoping to do some research using ESM at some point in the future if I can.
First, it’s not really the case that EAs use QALYs/DALYs. GWWC and GiveWell used to use them, but GWWC no longer exists as an independent entity and GiveWell now use their own metric.
This is a good point.
I think that GWWC & GiveWell’s earlier use of QALYs created a lot of path dependence, such that current EA prioritization remains influenced by the QALY framework even though no organization explicitly uses it at present.
Considering an alternate timeline can help draw out the path dependence:
Imagine a world where the DCP project started in 2019, rather than 1993. In 2019, lots of people have smartphones, so experience sampling is a viable method.
The DCP researchers decide to use experience sampling instead of retrospective surveys to determine QALY weights.
Because they’re using experience sampling, mental health disorders are more highly weighted in the DCP.
In this world, GiveWell gets started in 2026. GiveWell takes a look at the DCP research, and sees that mental health disorders are highly weighted. So, they decide to prioritize research into mental health interventions.
Even after GiveWell moves away from the DCP weightings (in 2036, or whenever), mental health interventions remain their main focus, because that’s where they have the most granular models & the strongest network.
I think that GWWC & GiveWell’s earlier use of QALYs created a lot of path dependence, such that current EA prioritization remains influenced by the QALY framework even though no organization explicitly uses it at present.
I find this to be the most plausible explanation of what has happened. Your counterfactual story is rather helpful!
A minor correction: GiveWell uses DALYs to measure mortality and morbidity. (Well, for malaria they actually don’t look at the impact of prevention on morbidity, only mortality, since the former is relatively small; see row 22 here.) Maybe what you had in mind is their “moral weights”, which they use to convert between life years and income.
As cole_haus points out below, ESM results would feed into disability weights (which are used to construct DALYs) and so affect how health interventions are prioritized. Currently, disability weights are derived from hypothetical surveys using the methods described in cole_haus’ comment; a major issue is that most respondents haven’t experienced the conditions they are rating. ESM would correct that.
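To make the mechanism concrete, here is a minimal sketch of how a disability weight enters a DALY total. It uses the simplified form DALYs = YLL + YLD, with YLD taken as prevalent cases times the disability weight, and ignores discounting and age-weighting. The function name and every number below are hypothetical, chosen only to show the direction of the effect.

```python
def dalys(deaths, years_lost_per_death, prevalent_cases, disability_weight):
    """Simplified DALYs = YLL + YLD (no discounting or age-weighting)."""
    yll = deaths * years_lost_per_death          # years of life lost to mortality
    yld = prevalent_cases * disability_weight    # years lived with disability
    return yll + yld

# Hypothetical condition: 10 deaths, 30 years lost each, 1,000 prevalent cases.
survey_based = dalys(10, 30.0, 1000, 0.10)  # weight from a hypothetical-scenario survey (assumed)
esm_informed = dalys(10, 30.0, 1000, 0.25)  # weight revised upward by ESM data (assumed)

print(survey_based, esm_informed)
```

Under these made-up inputs, the same condition accounts for more DALYs once the weight is revised upward, which is how ESM data could shift prioritization without changing the DALY framework itself.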
To use ESM results as inputs into disability weights, though, you’d want a representative sample. Looking at app users is a first step, but ideally you’d do representative sampling, or at least weighting; otherwise you only capture people who would use the app. Having a large enough sample to break down by medical condition is also a challenge. (To do all of this properly, I suggest partnering with academics, or at least professional researchers experienced in the relevant statistical analysis. Someone mentioned lack of demand from users being a potential issue; perhaps users can be incentivized.)
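As an illustration of the weighting point, here is a minimal post-stratification sketch: average the scores within each stratum, then combine the stratum means using population shares rather than sample shares. The strata, scores, and population shares are all hypothetical, and a real analysis would use more strata and proper survey software.

```python
from collections import defaultdict

def poststratified_mean(responses, population_shares):
    """responses: list of (stratum, score); population_shares: stratum -> population share."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stratum, score in responses:
        totals[stratum] += score
        counts[stratum] += 1
    # A stratum with no respondents cannot be reweighted, so fail loudly.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no respondents in strata: {sorted(missing)}")
    return sum(share * totals[s] / counts[s] for s, share in population_shares.items())

# Hypothetical app sample skewed young: three users under 40, one aged 40 or over.
responses = [("under_40", 7.0), ("under_40", 8.0), ("under_40", 6.0), ("40_plus", 4.0)]
population_shares = {"under_40": 0.5, "40_plus": 0.5}  # assumed census shares

raw_mean = sum(score for _, score in responses) / len(responses)
weighted_mean = poststratified_mean(responses, population_shares)
print(raw_mean, weighted_mean)
```

The raw mean is pulled toward the over-represented group, while the weighted mean is not. Weighting only corrects for the characteristics you stratify on, so it does not remove the deeper worry that app users differ from non-users in unobserved ways.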
Another way to solve the hypothetical bias issue is to look at surveys that include happiness metrics, record other characteristics of respondents, and have nationally representative samples, such as the Gallup World Poll (whose results are used in the World Happiness Report) and the World Values Survey. (Both are mentioned here.) The individual-level data can be used to examine the relationship between medical conditions and happiness (this paper uses similar data to look at income and happiness, and this paper looks at the impact of relatives dying on happiness). I believe you can access the individual-level data through some university libraries. Again, though, there’s the challenge of having a large enough sample to break down by medical condition, and these surveys probably don’t have detailed information on medical conditions. (Perhaps one advantage of an app is that you can track someone over time, e.g. before and after a medical condition occurs, which you won’t be able to do with these surveys if they don’t have a panel.)
As you laid out in this comment, it looks like experience sampling is not getting strong uptake in academia.
Here’s a short argument:
(a) Experience-sampling is theoretically the best way to measure happiness
(b) It’s feasible to build experience-sampling infrastructure, e.g. Natália’s mobile app proposal
(c) Academics & other stakeholders aren’t planning to build such infrastructure
Ergo: building experience-sampling infrastructure is a neglected, tractable, and impactful intervention that EA could undertake
I think your short argument misses the point. The obstacle isn’t the lack of such infrastructure—I imagine academics could use the existing tools if they asked politely or created their own—but the lack of demand for such infrastructure.
I’m imagining that EA could provide the demand for such infrastructure (EA cause prioritizers would be its customers).