I think the fact that SWB measures differ across cultures is actually a good sign that these measures capture what they are supposed to capture… In fact, I would be more concerned if different people with different views and circumstances did not, as you say, ‘differ substantially.’
My claim is not “SWB is empirically different between cultures therefore SWB is bad”. My claim is, I suspect that cultural factors cause people to choose different numbers for reasons orthogonal to what they actually want. For example, maybe Alice wants to be a career woman instead of her current role as a housewife (and would make choices to this effect if she had an opportunity), but she reports high life satisfaction because she feels that is expected of her (and it’s not like reporting a low number would help her). Or, maybe people in Fooland consistently report higher life satisfaction than people in Baristan (because they have lower expectations of how life should be), but nobody from Baristan wants to move to Fooland and everyone from Fooland wants to move to Baristan if they can (because life is actually better in Baristan).
I think these differences, attributable to culture or individual variance, are not likely to be of concern for what I would imagine would be the more common ways WELLBYs could be used. Most cost effectiveness analyses rely on RCTs or comparable designs with pre and post measures.
I agree that directly comparing “pre” to “post” SWB might work okay for many interventions, because the intervention doesn’t affect the confounding factors, as long as you’re comparing different interventions applied to similar populations. I would still rely more on asking people directly how much this intervention helped them / how much their life improved over this period (as opposed to comparing numbers reported at different points of time)[1]. And, we should still be vigilant about situations in which the confounders cannot be ignored (e.g. interventions that cause cultural shifts). And, there might be a non-linear relationship between SWB and decision-utility which should be somehow divulged if we are averaging these numbers.
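To illustrate the last point about non-linearity, here is a minimal sketch. Everything in it is an assumption made up for illustration: the squared mapping `to_utility`, the two interventions, and the (pre, post) reports are hypothetical, not data or a claim about the true relationship. It only shows that the ranking of two interventions by average SWB change can flip once a non-linear mapping to decision-utility is applied.

```python
def mean(values):
    return sum(values) / len(values)

def to_utility(score):
    # Hypothetical convex mapping from a 0-10 SWB report to decision-utility.
    # The true mapping (if any) is unknown; this is only an illustration.
    return score ** 2

# Hypothetical (pre, post) SWB reports for two interventions.
intervention_a = [(2, 4), (3, 5)]  # larger raw gains, low starting point
intervention_b = [(7, 8), (8, 9)]  # smaller raw gains, high starting point

def mean_raw_gain(pairs):
    # Average of post-minus-pre on the reported scale.
    return mean([post - pre for pre, post in pairs])

def mean_utility_gain(pairs):
    # Average of post-minus-pre after mapping each report to utility.
    return mean([to_utility(post) - to_utility(pre) for pre, post in pairs])

raw_a, raw_b = mean_raw_gain(intervention_a), mean_raw_gain(intervention_b)
util_a, util_b = mean_utility_gain(intervention_a), mean_utility_gain(intervention_b)

print("raw gain:     A =", raw_a, " B =", raw_b)
print("utility gain: A =", util_a, " B =", util_b)
```

With these made-up numbers, A looks better on raw scores while B looks better under the assumed mapping; a concave mapping would push the other way.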
In my reading, there’s a large body of research suggesting these are stable, yet in practice your ‘revealed’ preference at $5 is likely to be different than at $10.
I’m guessing you are not talking about things like, how much free time you would exchange for an additional $1? Because that’s consistent with constant preferences? So, Alice has $5 and Bob has $10, they are asked to choose between X and Y, and they have predictably different preferences despite the fact that post-X-Alice has the same wealth (and other circumstances) as post-X-Bob, and the same for Y? And this despite somehow controlling for confounders that are correlated both with the causes of Alice’s and Bob’s wealth and with their preferences?
I imagine such things can happen, in which case I would try to add hindsight judgements and judgements of people who experienced different circumstances into the mix. I expect that as people become more informed and experienced they roughly converge to some stable set of preferences, and the tradeoffs that don’t converge are not really important. If I’m wrong and they are important, then we need to use the revealed preferences of people in those particular circumstances (which, yes, might include SWB, might also include other parameters).
[1] Even under optimistic assumptions about SWB, this seems less noisy. Under pessimistic assumptions, I can imagine e.g. people implicitly interpreting the question as comparing their life to their neighbors (which were also affected by the intervention) or comparing their life now to their life in the past (which was still after the intervention), in which case SWB has no signal at all.
Thanks so much for replying, I learned a lot from your response and its clarity helped me update my thinking.
My claim is, I suspect that cultural factors cause people to choose different numbers for reasons orthogonal to what they actually want.
Thanks, the specificity here helped me understand your view better. I suppose with the examples you give—I would expect these to be exceptions rather than norms (because if e.g. wanting to have a career was the norm, over enough time, that would tend to become culturally normative and even in the process of it becoming a more normative view the difference with a SWB measure should diminish). And more broadly, interventions that have large samples and aim for generalizability should be reasonably representative and also diminish this as a concern.
I suppose I’m also thinking about the potential difference in specific SWB scales. Something like the SWLS scale or the single item measures would not be very domain specific but scales based around the e.g. Wheel of Life tradition tell you a lot more about different facets of your life (e.g. you can see a high overall score but a low one for job satisfaction), so it seems to me that with the right scales and enough items you can address culture or other variance even further.
I’m guessing you are not talking about things like, how much free time you would exchange for an additional $1? Because that’s consistent with constant preferences? So, Alice has $5 and Bob has $10, they are asked to choose between X and Y, and they have predictably different preferences despite the fact that post-X-Alice has the same wealth (and other circumstances) as post-X-Bob, and the same for Y? And this despite somehow controlling for confounders that are correlated both with the causes of Alice’s and Bob’s wealth and with their preferences?
Thanks again for responding with such precision. What I was unable to articulate well is that your individual preferences are not stable (or I suppose: per person, rather than across people), i.e. Alice when she has $5 will exchange a different amount of free time for an extra $1 than when Alice has $10.
I agree with everything else you’ve said and especially with:
I would still rely more on asking people directly how much this intervention helped them / how much their life improved over this period (as opposed to comparing numbers reported at different points of time)
I think this is a hugely underappreciated point. I think some of the SWB measures target this issue somewhat but in a limited fashion. I’d love to see more qualitative interviews and participatory / or co-production interventions. I am always surprised by how many interventions say they cannot ascertain a causal mechanism quantitatively and so do not attempt to… well, ask people what worked and didn’t.
Thanks so much for replying, I learned a lot from your response and its clarity helped me update my thinking.
You’re very welcome, I’m glad it was useful!
I would expect these to be exceptions rather than norms (because if e.g. wanting to have a career was the norm, over enough time, that would tend to become culturally normative and even in the process of it becoming a more normative view the difference with a SWB measure should diminish).
I’m much more pessimistic. The processes that determine what is culturally normative are complicated, there are many examples of norms that discriminate against certain groups or curtail freedoms lasting over time, and if you’re optimizing for the near future then “over enough time” is not a satisfactory solution.
I suppose I’m also thinking about the potential difference in specific SWB scales. Something like the SWLS scale or the single item measures would not be very domain specific but scales based around the e.g. Wheel of Life tradition tell you a lot more about different facets of your life (e.g. you can see a high overall score but a low one for job satisfaction), so it seems to me that with the right scales and enough items you can address culture or other variance even further.
I don’t know how those scales work, but (as I wrote in my reply to Joel), I would be much more optimistic about scales that are relative i.e. ask you to compare your well-being in situation A to situation B (whether these situations are familiar or hypothetical) rather than absolute (in which case it’s not clear what’s the reference frame).
What I was unable to articulate well is that your individual preferences are not stable (or I suppose: per person, rather than across people), i.e. Alice when she has $5 will exchange a different amount of free time for an extra $1 than when Alice has $10.
This is considered a consistent preference in standard (VNM) decision theory. It is entirely consistent that U($6 and X free time) > U($5 and Y free time) but U($11 and X free time) < U($10 and Y free time).
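As a concrete instance, here is a minimal sketch with one fixed utility function that produces exactly this pattern. The functional form (log in money, linear in free time), the weight 0.13, and the values X = 7 and Y = 8 hours are all assumptions chosen for illustration; any utility with diminishing marginal value of money can behave similarly.

```python
import math

def utility(money, free_time, time_weight=0.13):
    # Hypothetical utility: logarithmic in money, linear in free time.
    # The form and the weight are illustrative assumptions.
    return math.log(money) + time_weight * free_time

X, Y = 7.0, 8.0  # hours of free time, with X < Y (one hour is given up)

# With $5: does Alice prefer $6 with X free time over $5 with Y free time?
accepts_at_5 = utility(6, X) > utility(5, Y)

# With $10: does the same Alice prefer $11 with X over $10 with Y?
accepts_at_10 = utility(11, X) > utility(10, Y)

print("trades an hour for $1 when she has $5: ", accepts_at_5)
print("trades an hour for $1 when she has $10:", accepts_at_10)
```

The same function `utility` is used in both comparisons, so the differing choices at $5 and $10 reflect a single stable preference ordering over (money, free time) bundles rather than a change in preferences.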