Also, notice that the top countries are pretty small. That may be because random factors/shocks are more likely to push the average up or down for small countries. Cf:
Kahneman begins the chapter with an example of data interpretation using cases of kidney cancer. The lowest rates of kidney cancer are in counties that are rural and vote Republican. All sorts of theories jump to mind based on that data. However, a few paragraphs later Kahneman notes that the data also shows that the counties with the highest rates of kidney cancer are rural and vote Republican. The problem is that rural counties have small sample sizes and therefore are prone to extremes.
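To make the small-sample point concrete, here is a minimal simulation sketch. It assumes every unit has exactly the same true rate and differs only in population. The unit sizes, the rate, and the function names are all made up for illustration and are not taken from any data in this thread.

```python
import random


def simulate_rates(populations, true_rate, seed=0):
    """Return (population, observed_rate) pairs, one binomial draw per unit."""
    rng = random.Random(seed)
    results = []
    for n in populations:
        # Each person is independently a "case" with probability true_rate.
        cases = sum(1 for _ in range(n) if rng.random() < true_rate)
        results.append((n, cases / n))
    return results


def extremes(results, k=3):
    """Return the k lowest and k highest units by observed rate."""
    ranked = sorted(results, key=lambda pair: pair[1])
    return ranked[:k], ranked[-k:]


# Hypothetical setup: 50 small units and 50 large units, identical true rate.
populations = [100] * 50 + [5000] * 50
results = simulate_rates(populations, true_rate=0.05)
lowest, highest = extremes(results)
print("lowest observed rates:", lowest)
print("highest observed rates:", highest)
```

Under these assumptions both ends of the ranking should be dominated by the small units, even though nothing about them is actually different.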
The classic fact about variance in small populations, from the start of “Thinking, Fast and Slow”. Love it!
True! Of course, if we had all the data we could run a fancier statistical test. I suppose my observation is limited to the fact that the English-speaking and European ranges seem similar, rather than, e.g., all the Anglosphere countries being distinctly higher than all the European countries.
You could do a funnel-type plot where your y-axis is EAs per capita and your x-axis is 1/sqrt(population), which is roughly how you’d expect the standard deviation to scale.
OK, this is what we get, using the 25-EA cutoff for the red line.
Hmm... That cutoff is really making it hard to assess what’s going on here, IMO. Everything is kinda clustered close to the line, which makes me suspect the selection effect is important.
I agree with that.
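For anyone who wants to try the funnel plot suggested above, here is a rough sketch of the coordinates involved. It assumes counts are roughly binomial around one pooled rate, so the standard deviation of a country's rate is sqrt(p(1-p)/n) and the control limits become straight lines in x = 1/sqrt(population). It also assumes the cutoff is on absolute count, in which case the inclusion boundary is the curve y = 25 * x^2. The country names, numbers, and function names are hypothetical.

```python
import math


def funnel_points(counts, populations, z=1.96):
    """Compute funnel-plot coordinates and approximate control limits per unit."""
    pooled = sum(counts.values()) / sum(populations.values())
    scale = math.sqrt(pooled * (1 - pooled))  # binomial SD for a population of 1
    points = {}
    for name, n in populations.items():
        x = 1 / math.sqrt(n)  # proposed x-axis
        rate = counts[name] / n  # proposed y-axis (per capita)
        lower = max(0.0, pooled - z * scale * x)
        upper = pooled + z * scale * x
        points[name] = {
            "x": x,
            "rate": rate,
            "lower": lower,
            "upper": upper,
            "outside": not (lower <= rate <= upper),
        }
    return pooled, points


def cutoff_rate(x, min_count=25):
    """Lowest per-capita rate a unit at position x can have and still pass a count cutoff."""
    # count >= min_count  <=>  rate >= min_count / n  =  min_count * x**2
    return min_count * x ** 2


# Made-up example data, purely for illustration.
counts = {"A": 30, "B": 300, "C": 3000}
populations = {"A": 1_000_000, "B": 10_000_000, "C": 50_000_000}
pooled, points = funnel_points(counts, populations)
for name, p in points.items():
    print(name, p, "cutoff rate:", cutoff_rate(p["x"]))
```

Because small countries can only appear if their rate clears min_count * x^2, a cutoff like this removes the lower-right part of the funnel, which is one way the selection effect could show up.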