I think the main takeaway here is that you find that section confusing. That’s not something one can “argue away”, and it does point to room for improvement in my writing. :)
With that being said, note that we in fact don’t say anywhere that anything ‘is thin-tailed’. We just say that some paper ‘reports’ a thin-tailed distribution, which seems uncontroversially true. (OTOH I can totally see that the “by contrast” is confusing on some readings. And I also agree that it basically doesn’t matter what we say literally—if people read what we say as claiming that something is thin-tailed, then that’s a problem.)
FWIW, from my perspective the key observations (which I apparently failed to convey in a clear way at least for you) here are:
The top 1% share of ex-post “performance” [though see elsewhere that maybe that’s not the ideal term] data reported in the literature varies a lot, at least between 3% and 80%. So usually you’ll want to know roughly where on the spectrum you are for the job/task/situation relevant to you rather than just whether or not some binary property holds.
The range of top 1% shares is almost as large for data for which the sources used a mathematically ‘heavy-tailed’ type of distribution as a model. In particular, there are some cases where a source reports a mathematically ‘heavy-tailed’ distribution but where the top 1% share is barely larger than for other data based on a mathematically ‘thin-tailed’ distribution.
(As discussed elsewhere, it’s of course mathematically possible to have a mathematically ‘thin-tailed’ distribution with a larger top 1% share than a mathematically ‘heavy-tailed’ distribution. But the above observation is about what we in fact find in the literature rather than about what’s mathematically possible. I think the key point here is not so much that we haven’t found a ‘thin-tailed’ distribution with a larger top 1% share than some ‘heavy-tailed’ distribution, but that the mathematical ‘heavy-tailed’ property doesn’t cleanly distinguish data/distributions by their top 1% share even in practice.)
So don’t look at whether the type of distribution used is ‘thin-tailed’ or ‘heavy-tailed’ in the mathematical sense. Instead, ask how heavy-tailed in the everyday sense (as operationalized by top 1% share or whatever you care about) your data/distribution is.
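To make the “mathematically possible” point from the parenthetical concrete, here is a minimal sketch (not something from the paper). It assumes the common definition under which the exponential distribution counts as ‘thin-tailed’ and the log-normal as ‘heavy-tailed’. The function names and the sigma values are mine, picked purely for illustration, and the closed-form expressions are the standard ones for these two distributions.

```python
import math
from statistics import NormalDist


def lognormal_top_share(sigma: float, p: float = 0.01) -> float:
    """Share of the total held by the top fraction p of a log-normal.

    Uses the log-normal Lorenz curve L(F) = Phi(Phi^{-1}(F) - sigma),
    so the top share is 1 - L(1 - p) = Phi(sigma - z) with z = Phi^{-1}(1 - p).
    The result does not depend on the log-normal's mu parameter.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - p)
    return std_normal.cdf(sigma - z)


def exponential_top_share(p: float = 0.01) -> float:
    """Share of the total held by the top fraction p of an exponential.

    Integrating x * rate * exp(-rate * x) above the (1 - p) quantile and
    dividing by the mean gives p * (1 - ln p), independent of the rate.
    """
    return p * (1 - math.log(p))


if __name__ == "__main__":
    # sigma values are illustrative assumptions, not estimates from any source.
    for sigma in (0.1, 1.0, 3.0):
        print(f"log-normal, sigma={sigma}: {lognormal_top_share(sigma):.3f}")
    print(f"exponential: {exponential_top_share():.3f}")
```

Under these assumptions, the exponential's top 1% share is about 5.6%, while a log-normal with sigma = 0.1 comes out at roughly 1.3%. So the ‘thin-tailed’ model has the larger top 1% share here, and the log-normal's share climbs to a large majority only as sigma grows.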
So basically what I tried to do is mention that we find both mathematically thin-tailed and mathematically heavy-tailed distributions reported in the literature in order to point out that this arguably isn’t the key thing to pay attention to. (But yeah I can totally see that this is not coming across in the summary as currently worded.)
As I tried to explain in my previous comment, I think the question of whether performance in some domain is actually ‘thin-tailed’ or ‘heavy-tailed’ in the mathematical sense is closer to ill-posed or meaningless than true or false. That’s why I set aside the issue of whether a normal distribution or a similar-looking log-normal distribution is the better model.