Thanks for this work! I found it interesting :)
To make this a bit easier for others like me who are interested in looking at the underlying calculations you’ve done:
CEA for Imagine Worldwide (has links to other calculations that I’ve seen)
Aggregation of the evidence for Income Effects of Education Interventions
I have a couple of questions. First, why does it make sense to assume that ln(income) grows linearly with test-score gains measured in SDs? Second, if that’s the case, would it make more sense to use a geometric mean rather than the arithmetic mean for estimating the effect? (That is, the average on the log scale, which would make sense if the noise is unbiased and normal on that scale.) Changing from the arithmetic to the geometric mean resulted in a small reduction of the %income/SD, from 19.1 to 15.6.
This is an interesting question. My answer is that I think of this exercise as a rough kind of meta-analysis, where results are combined in a (weighted) arithmetic mean.
I think the reason geometric means don’t work well in these kinds of exercises is that there are all sorts of differences and errors in individual studies that make it very likely that some of them will show zero (or negative) effect. Once this happens, your geometric mean goes to zero (or breaks). I don’t think it makes sense to say something like “if because of noise the effect size in one of my many studies happens to show 0% instead of 1%, my meta-analysis effect should be 0% instead of 10%.”
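To make the concern above concrete, here is a minimal sketch of what happens when a geometric mean is taken directly over the added percentages. The study effects are hypothetical numbers chosen only to illustrate the point; they are not taken from the actual aggregation, and `geometric_mean` is an illustrative helper rather than anything from the original spreadsheet.

```python
import math

def geometric_mean(values):
    """Unweighted geometric mean; only well defined for strictly positive inputs."""
    if any(v < 0 for v in values):
        # A negative input has no real logarithm, so the calculation "breaks".
        return float("nan")
    if any(v == 0 for v in values):
        # A single zero factor drives the whole product (and the mean) to zero.
        return 0.0
    return math.exp(sum(math.log(v) for v in values) / len(values))

# Hypothetical per-study effects, in % income gain per SD (assumed for illustration).
effects_clean = [1.0, 10.0, 20.0]
effects_noisy = [0.0, 10.0, 20.0]  # same studies, but noise turns 1% into 0%

for label, effects in (("clean", effects_clean), ("noisy", effects_noisy)):
    arithmetic = sum(effects) / len(effects)
    print(label, "arithmetic:", arithmetic, "geometric:", geometric_mean(effects))
```

The arithmetic mean barely moves between the two cases, while the geometric mean of the raw percentages collapses to zero as soon as one study reads 0%.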
Oh, it seems like we’ve both made the same mistake 😊
If one SD results in a 10% increase, then I think the relevant effect size should be 110% and 2 SD be (110%)^2 rather than 120%, so that generally the logarithm of the effect would be linear with the test results increase (in SDs). Then, I think it makes more sense to approximate it as a geometric mean of these (all numbers > 0).
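As a quick illustration of the compounding argument above, the sketch below uses the 10%-per-SD example from this comment and assumes that gains multiply across SDs. The function name `income_factor` is made up for this example.

```python
import math

def income_factor(per_sd_gain, sds):
    """Income multiplier after `sds` standard deviations, assuming gains compound."""
    return (1.0 + per_sd_gain) ** sds

per_sd_gain = 0.10  # the 10%-per-SD example from the comment

for sds in (1, 2, 3):
    factor = income_factor(per_sd_gain, sds)
    # The log of the multiplier is sds * ln(1.10), i.e. linear in the number of SDs.
    print(sds, round(factor, 4), round(math.log(factor), 4))
```

Under this assumption 2 SDs give a factor of (110%)^2 rather than 120%, and it is the log of the factor, not the added percentage, that scales linearly with SDs.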
I did the wrong calculation earlier, taking the geometric mean of the added percentages, which doesn’t make sense, as you say. Correcting this, I got a 16.4% increase.
[Note that for small enough effect size this would be very similar, as $\prod_i (1+\varepsilon_i)^{\lambda_i} = 1 + \sum_i \lambda_i \varepsilon_i + O(\varepsilon^2)$.]
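A rough numerical check of the note above: the sketch compares a weighted arithmetic mean of the gains with a weighted geometric mean of the growth factors (1 + gain). The gains and weights are hypothetical, the weights are assumed to be normalized to sum to 1, and both function names are invented for this example.

```python
import math

def weighted_arithmetic_gain(gains, weights):
    """Weighted arithmetic mean of the gains, with weights normalized to sum to 1."""
    total = sum(weights)
    return sum(w * g for g, w in zip(gains, weights)) / total

def weighted_geometric_gain(gains, weights):
    """Weighted geometric mean of the factors (1 + gain), returned as a gain."""
    total = sum(weights)
    # Average on the log scale, then convert back: exp(mean of ln(1 + gain)) - 1.
    log_mean = sum(w * math.log1p(g) for g, w in zip(gains, weights)) / total
    return math.expm1(log_mean)

# Hypothetical per-study gains per SD and study weights (assumed for illustration).
gains = [0.05, 0.10, 0.30]
weights = [1.0, 2.0, 1.0]

arith = weighted_arithmetic_gain(gains, weights)
geo = weighted_geometric_gain(gains, weights)
print("arithmetic:", arith, "geometric:", geo, "gap:", arith - geo)

# Shrinking every gain by 100x shrinks the gap far faster than the gains themselves,
# which is what the O(eps^2) remainder predicts.
small = [g / 100 for g in gains]
print("gap for small gains:",
      weighted_arithmetic_gain(small, weights) - weighted_geometric_gain(small, weights))
```

Because every factor 1 + gain stays positive for any gain above -100%, this version does not hit the zero problem that arises when averaging the added percentages directly.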
Whoops, submitted too early... I’ve started to draft a formal proof that, under reasonable assumptions, we would indeed get a linear relationship between the additive increase in test results and the log of the effect on income, but accidentally submitting too soon got me thinking that I’m spending too much time on this 🤓 If anyone is interested, I will continue with this proof.