If you give a new mom [in the US] a few hundred dollars a month or a homeless man one thousand dollars a month, that's gotta show up in the data, right?
Alas.
A few years back we got really serious about studying cash transfers, and rigorous research began in cities all across America. Some programs targeted the homeless, some new mothers and some families living beneath the poverty line. The goal was to figure out whether sizable monthly payments help people lead better lives, get better educations and jobs, care more for their children and achieve better health outcomes.
Many of the studies are still ongoing, but, at this point, the results aren't "uncertain." They're pretty consistent and very weird. Multiple large, high-quality randomized studies are finding that guaranteed income transfers do not appear to produce sustained improvements in mental health, stress levels, physical health, child development outcomes or employment. Treated participants do work a little less, but shockingly, this doesn't correspond with either lower stress levels or higher overall reported life satisfaction.
Homeless people, new mothers and low-income Americans all over the country received thousands of dollars. And it's practically invisible in the data. On so many important metrics, these people are statistically indistinguishable from those who did not receive this aid.
I cannot stress enough how shocking I find this, and I want to be clear that this is not "we got some weak counterevidence." These are careful, well-conducted studies. They are large enough to rule out even small positive effects, and they are all very similar. This is an amount of evidence that in almost any other context we'd consider definitive.
[...]
Overall, the larger and more credible studies in this space have tended to find worse effects.
Kelsey Piper wrote a nice article on recent results of cash transfers in the US: Giving people money helped less than I thought it would
[...]
While it's sad that the lives of people in those studies didn't improve, I think this is some evidence that Giving isn't demanding, and that giving 10% wouldn't worsen the life of a median person in the US in a measurable way.
I expected low effects based on background assumptions like utility being sublogarithmic in income, but I didn't expect the effect to be actually zero (at the level of power that the very big studies could detect).
I'd be interested in replicating these trials in other developed countries. It could just be that the US is unusually wealthy.
Of course, like you say, this is further evidence we should increase foreign aid, since money could do far more good in very poor countries than very rich ones.
I haven't run the numbers, and this is not my field, so the below is very low confidence, but now that you mention it I wouldn't be surprised if isoelastic utility were enough to explain the lack of results.
LLMs claim that if the effect size is 10 times smaller, you need a sample size 100x larger to have the same statistical significance (someone correct me if this is wrong)
So if a GiveDirectly RCT in Kenya needs a sample size of 2,000 individuals to detect a statistically significant effect, an RCT in the US where you expect the effect to be 10x smaller would need 200,000 individuals, which is intractable[1].
Another intuition is that the effects of cash transfers in LMICs are significant but not huge, and IIRC many experts claim that ~5 years after the transfer there are negligible effects on subjective well-being, so it wouldn't take much for the effect to become undetectable.
But again, this is all an uninformed vague guess.
Edit: Gemini raises a good point that variance could also be higher in the US. If the standard deviation of well-being for beneficiaries in the US is 2x larger than in Kenya, and the effect is 10x smaller, I think you'd need a 400x larger sample size, not "just" 100x.
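Both scaling claims follow from the standard two-sample power formula: required sample size per arm is roughly 2((z_{α/2}+z_β)·σ/δ)², so it grows with (σ/δ)². A minimal sketch (the baseline effect and s.d. are arbitrary placeholders; only the ratios matter):

```python
# Standard normal quantiles for alpha = 0.05 (two-sided) and 80% power,
# hardcoded so the sketch needs no external libraries.
Z_ALPHA = 1.959964  # Phi^{-1}(0.975)
Z_BETA = 0.841621   # Phi^{-1}(0.80)

def n_per_arm(delta, sigma):
    """Individuals per arm needed to detect effect `delta` with outcome s.d. `sigma`."""
    return 2 * ((Z_ALPHA + Z_BETA) * sigma / delta) ** 2

base = n_per_arm(delta=1.0, sigma=1.0)         # arbitrary baseline
small = n_per_arm(delta=0.1, sigma=1.0)        # effect 10x smaller
small_noisy = n_per_arm(delta=0.1, sigma=2.0)  # effect 10x smaller AND s.d. 2x larger

print(small / base)        # 100.0
print(small_noisy / base)  # 400.0
```

So a 10x smaller effect needs 100x the sample, and a 10x smaller effect with double the standard deviation needs 400x, regardless of the significance threshold and power chosen.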
Hmm, I'd guess the s.d.s to be lower in the US than in Kenya, for what it's worth. Under-five child mortality rates are about 10x higher in Kenya than in the US, and I expect stuff like that to bleed through to other places.
But even if we assume a smaller s.d. (check: is there a smaller s.d., empirically?), this might be a major problem. The top paper on OpenResearch says they can rule out health improvements greater than 0.023-0.028 standard deviations from giving $1,000/month. I'm not sure how that compares to household incomes, but let's assume that household income is $2,000-$3,000/month for the US recipients, so the transfer is 33-50% of household income.
From Googling around, the GiveDirectly studies show mental health effects around a quarter of a standard deviation from a doubling of income.
In other words, the study can rule out the effect sizes that theory predicts whenever theory predicts effects >0.1x the effect size in the GiveDirectly experiments.
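The 0.1x threshold is just the ratio of the study's minimum detectable effect (0.023-0.028 s.d.) to the GiveDirectly effect (~0.25 s.d.):

```python
# Ratio of the study's minimum detectable effect to the GiveDirectly effect,
# both in standard-deviation units (figures as quoted in the comments above).
gd_effect = 0.25
detectable = (0.023, 0.028)
ratios = [d / gd_effect for d in detectable]
print(ratios)  # [0.092, 0.112]
```

i.e. roughly 0.09x-0.11x, which rounds to the ">0.1x" used above.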
Does theory predict effect sizes >0.1x that of the GiveDirectly experiments? Well, it depends on Ć, the risk-aversion constant[1]! If risk aversion is between 0 (linear, aka insane) and 1 (logarithmic), we should predict changes >0.41x to >0.58x that of the GD experiments. So we can rule out linear, logarithmic, and super-logarithmic utility!
But if Ć = 1.5, then theory predicts changes on the scale of 0.076x to 0.128x that of the GD experiments. I.e., exactly at the boundary of whether it's possible to detect an effect at all!
If Ć = 2, then theory predicts changes on the scale of 0.014x to 0.028x, much smaller than the experiments are powered to detect.
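These ratios can be sketched in a few lines under isoelastic (CRRA) utility, where the predicted effect scales with the utility gain of the transfer. Assumptions: US recipients at $2,000-$3,000/month getting $1,000/month (as above), the GD transfer roughly doubling recipient income, and a Kenyan baseline of about $80/month (my illustrative guess, not a figure from the studies):

```python
import math

def effect_ratio(eta, y_us, transfer, y_kenya):
    """Predicted US effect as a fraction of the GiveDirectly effect, assuming
    well-being effects are proportional to CRRA utility gains,
    u(c) = c**(1-eta) / (1-eta), with the log case at eta == 1.
    The GD transfer is assumed to double recipient income."""
    f = transfer / y_us  # US transfer as a fraction of income
    if eta == 1:
        return math.log(1 + f) / math.log(2)
    # Income-level scaling times the within-country proportional gain.
    scale = (y_us / y_kenya) ** (1 - eta)
    return scale * ((1 + f) ** (1 - eta) - 1) / (2 ** (1 - eta) - 1)

# Sweep eta; pair $3,000 income with f = 1/3 and $2,000 income with f = 1/2.
for eta in (1.0, 1.5, 2.0):
    lo = effect_ratio(eta, y_us=3000, transfer=1000, y_kenya=80)
    hi = effect_ratio(eta, y_us=2000, transfer=1000, y_kenya=80)
    print(f"eta={eta}: {lo:.3f}x to {hi:.3f}x the GD effect")
```

Under these (hypothetical) incomes the sweep lands near the figures above: ~0.41-0.58x at eta = 1, ~0.07-0.13x at eta = 1.5, and ~0.01-0.03x at eta = 2; the exact numbers are sensitive to the assumed Kenyan baseline.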
For what it's worth, before the studies I would've guessed a risk-aversion constant across countries to be somewhere between 1.2 and 2, so this study updates me some but not a lot.
@Kelsey Piper and others, did you or the study authors pre-register your beliefs on what risk-aversion constant you expected?
Rendering the Greek constant economists use for risk aversion as Ć, since it otherwise doesn't render correctly on my laptop.
I might be misreading, but on a quick skim Sam Nolan's analysis seemed pertinent; then I noticed you'd already commented. Sam's reply still seems useful to me, in particular the data here.
although none of those countries are low-income, so your concern re: OOD generalisation still applies.
lmao, when I commented 3 years ago I said
and then I just did an out-of-country and out-of-distribution generalization with no caveats! I can be really silly sometimes lol.
Thank you for running the numbers!
I'm not sure about using these results to update your estimates of Ć (as there are too many other differences between the US and LMICs, e.g. access to hospitals, no tuberculosis). But it does seem that reasonable values of Ć would explain most of the lack of effects, especially for the study where mothers received "just" $333/month and similar ones.
Maybe it was just the title of the essay, but I was surprised when I read it to see no mention of giving à la GiveDirectly (i.e., to people in extreme poverty). I take it that these results don't imply much of an update there? Am I wrong?
I think these studies are just more evidence on the difference between US poverty and global poverty.
Some comments from the post: