Eyeballing this data makes me think I was wrong: Northern/Western Europe seems to have quite comparable rates of EAs.
I think this method of presenting the data makes the countries not present much less salient (e.g. Portugal, Denmark, Ireland), so the mere fact that there are a lot of countries in Europe means they will tend to fill up the chart. To test this hypothesis I think you’d want to compute a single % for continental Europe to compare to a single Anglo number.
Also, notice that the top countries are pretty small. That may be because random factors/shocks are more likely to push the average up or down for small countries. Cf:
Kahneman begins the chapter with an example of data interpretation using cases of kidney cancer. The lowest rates of kidney cancer are in counties that are rural and vote Republican. All sorts of theories jump to mind based on that data. However, a few paragraphs later Kahneman notes that the data also shows that the counties with the highest rates of kidney cancer are rural and vote Republican. The problem is that rural counties have small sample sizes and therefore are prone to extremes.
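To make the small-sample point concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration (the function name `simulate_rates`, the shared true rate of 1%, the unit sizes); none of it comes from the EA Survey or from Kahneman's data. Every unit has the same underlying rate, yet the small units end up at both extremes.

```python
import random
import statistics


def simulate_rates(population, true_rate, n_units, rng):
    """Observed per-capita rates for n_units units that all share true_rate."""
    rates = []
    for _ in range(n_units):
        # Each person is independently a "case" with probability true_rate.
        cases = sum(1 for _ in range(population) if rng.random() < true_rate)
        rates.append(cases / population)
    return rates


rng = random.Random(0)  # fixed seed so the run is repeatable
true_rate = 0.01        # hypothetical shared rate

small = simulate_rates(population=200, true_rate=true_rate, n_units=100, rng=rng)
large = simulate_rates(population=10_000, true_rate=true_rate, n_units=100, rng=rng)

print("small units: min", min(small), "max", max(small))
print("large units: min", min(large), "max", max(large))
print("spread (sd): small", statistics.pstdev(small), "large", statistics.pstdev(large))
```

Both the lowest and the highest observed rates should come from the small units, even though nothing distinguishes them except size.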
Estonia clearly a massive standout though!
I agree with that.
The classic fact about variance in small populations, from the start of “Thinking, Fast and Slow”. Love it!
True!
Of course if we had all the data we could run a fancier statistical test. I suppose my observation is limited to the fact that the English-speaking vs European ranges seem similar rather than e.g. all the Anglosphere countries being distinctly higher than all the European countries.
You could do a funnel type plot where your y-axis is EAs/capita and your x-axis is 1/sqrt(population), which is sort of what you’d expect the standard deviation to look like.
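A rough sketch of the funnel idea, in case it helps. It assumes each country's EA count is roughly Poisson around a common pooled rate, in which case the standard deviation of the per-capita rate is sqrt(rate) × 1/sqrt(population), so the band is a straight-sided funnel in these axes. The helper names and the three example rows are made up for illustration and are not survey data. A fixed minimum-count cutoff of k EAs shows up as the curve y = k·x² on the same plot.

```python
import math


def funnel_point(ea_count, population):
    """Return (x, y) = (1/sqrt(population), EAs per capita)."""
    return 1 / math.sqrt(population), ea_count / population


def funnel_limits(pooled_rate, x, z=1.96):
    """Approximate band around pooled_rate at x = 1/sqrt(population).

    Assumes Poisson counts, so the sd of the rate is sqrt(pooled_rate) * x.
    """
    half_width = z * math.sqrt(pooled_rate) * x
    return max(0.0, pooled_rate - half_width), pooled_rate + half_width


def cutoff_curve(min_count, x):
    """Rate implied by a fixed minimum count: min_count / population = min_count * x**2."""
    return min_count * x ** 2


# Made-up illustrative rows, NOT survey data: (label, EA count, population)
countries = [("A", 40, 1_300_000), ("B", 300, 67_000_000), ("C", 900, 330_000_000)]
pooled_rate = sum(c for _, c, _ in countries) / sum(n for _, _, n in countries)

for label, count, pop in countries:
    x, y = funnel_point(count, pop)
    lo, hi = funnel_limits(pooled_rate, x)
    status = "inside band" if lo <= y <= hi else "outside band"
    print(label, f"x={x:.2e}", f"rate={y:.2e}", f"cutoff={cutoff_curve(25, x):.2e}", status)
```

Points that sit only just above the cutoff curve are the ones where the selection effect is most likely to matter.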
OK this is what we get, using the 25 EAs cutoff for the red line.
Hmm… That cutoff is really making it hard to assess what’s going on here IMO. Everything is kinda clustered close to the line making me suspect the selection effect is important.
Yeah, I made a quick chart comparing Anglosphere vs non-Anglosphere.
Nice, yes very different in this framing! Possibly the more interesting comparison to me would be not Anglosphere vs rest-of-world but rather Anglosphere vs Western Europe, or OECD countries or something. Also, depending on how we compute the averages (each country counted equally, or population-weighted), the results could be quite different.
I had the same thought! If I haven’t messed something up:
Anglosphere OECD: 4.03 EAs per million
Non-Anglosphere OECD: 2.96 per million
Matches my impression, thanks for making the chart!
True, I don’t have access to the raw data sadly, only to the data from the EA Survey forum post which has a minimum number of EAs cutoff.
You could produce conservative and aggressive versions by assuming either 0 or (cutoff - 1) EAs for the missing values and then plotting the range; maybe it will be tight.
Sure, for e.g. China or any large-ish country this works, but as soon as the population is below 1 million or so the range would be very wide. E.g. Iceland presumably has some EA presence, but it would be strange to put it at the top without knowing the actual data. If I wanted to spend more time on this I think just asking for the raw data would be best (I am unsure if they would give it to me, and I haven’t tried).
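A quick sketch of those bounds, assuming the cutoff of 25 EAs mentioned earlier in the thread. The function name is hypothetical and the populations are rough round figures used only for illustration.

```python
def censored_rate_bounds(population, cutoff=25):
    """(low, high) EAs per million for a country missing from the published data.

    low assumes 0 EAs; high assumes cutoff - 1 EAs.
    """
    per_million = 1_000_000 / population
    return 0.0, (cutoff - 1) * per_million


# Rough round populations, assumptions for illustration only.
examples = {"China": 1_400_000_000, "Iceland": 370_000}

for name, population in examples.items():
    low, high = censored_rate_bounds(population)
    print(f"{name}: between {low:.2f} and {high:.2f} EAs per million")
```

For the large country the upper bound stays far below the few-per-million OECD figures quoted above, so the bounds are tight; for the small one the upper bound is many times larger than those figures, which is why the range says very little there.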