Briefly on this, I think my issue becomes clearer if you look at the full section.
If we agree that log-normal is more likely than normal, and log-normal distributions are heavy-tailed, then saying "By contrast, [performance in these jobs] is thin-tailed" is just incorrect? Assuming you meant the mathematical senses of heavy-tailed and thin-tailed here, which I guess I'm not sure if you did.
This uncertainty and resulting inability to assess whether this section is true or false obviously loops back to why I would prefer not to use the term "heavy-tailed" at all, which I will address in more detail in my reply to your other comment.
Ex-post performance appears "heavy-tailed" in many relevant domains, but with very large differences in how heavy-tailed: the top 1% account for between 4% to over 80% of the total. For instance, we find "heavy-tailed" distributions (e.g. log-normal, power law) of scientific citations, startup valuations, income, and media sales. By contrast, a large meta-analysis reports "thin-tailed" (Gaussian) distributions for ex-post performance in less complex jobs such as cook or mail carrier
I think the main takeaway here is that you find that section confusing, and that's not something one can "argue away", and does point to room for improvement in my writing. :)
With that being said, note that we in fact don't say anywhere that anything "is thin-tailed". We just say that some paper "reports" a thin-tailed distribution, which seems uncontroversially true. (OTOH I can totally see that the "by contrast" is confusing on some readings. And I also agree that it basically doesn't matter what we say literally: if people read what we say as claiming that something is thin-tailed, then that's a problem.)
FWIW, from my perspective the key observations (which I apparently failed to convey in a clear way at least for you) here are:
The top 1% share of ex-post "performance" [though see elsewhere that maybe that's not the ideal term] data reported in the literature varies a lot, at least between 3% and 80%. So usually you'll want to know roughly where on the spectrum you are for the job/task/situation relevant to you rather than just whether or not some binary property holds.
The range of top 1% shares is almost as large for data for which the sources used a mathematically "heavy-tailed" type of distribution as a model. In particular, there are some cases where a source reports a mathematically "heavy-tailed" distribution but where the top 1% share is barely larger than for other data based on a mathematically "thin-tailed" distribution.
(As discussed elsewhere, it's of course mathematically possible to have a mathematically "thin-tailed" distribution with a larger top 1% share than a mathematically "heavy-tailed" distribution. But the above observation is about what we in fact find in the literature rather than about what's mathematically possible. I think the key point here is not so much that we haven't found a "thin-tailed" distribution with a larger top 1% share than some "heavy-tailed" distribution, but that the mathematical "heavy-tailed" property doesn't cleanly distinguish data/distributions by their top 1% share even in practice.)
So don't look at whether the type of distribution used is "thin-tailed" or "heavy-tailed" in the mathematical sense; ask how heavy-tailed in the everyday sense (as operationalized by top 1% share or whatever you care about) your data/distribution is.
So basically what I tried to do is mention that we find both mathematically thin-tailed and mathematically heavy-tailed distributions reported in the literature, in order to point out that this arguably isn't the key thing to pay attention to. (But yeah I can totally see that this is not coming across in the summary as currently worded.)
As I tried to explain in my previous comment, I think the question whether performance in some domain is actually "thin-tailed" or "heavy-tailed" in the mathematical sense is closer to ill-posed or meaningless than true or false. Hence why I set aside the issue of whether a normal distribution or similar-looking log-normal distribution is the better model.
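To make the point about top 1% shares a bit more concrete, here is a minimal sketch rather than anything from the underlying analysis. The function names and the parameter values (sigma = 0.1, a coefficient of variation of 0.3) are purely illustrative assumptions, not numbers from the literature. It uses the standard closed forms for the share of the total held by the top fraction of a log-normal and of a normal distribution, and the normal case is only sensible when negative values are negligible.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def lognormal_top_share(sigma: float, top: float = 0.01) -> float:
    """Share of the total held by the top `top` fraction of a log-normal.

    Closed form: Phi(sigma - z), where z is the (1 - top) quantile of the
    standard normal. The result does not depend on the log-normal's mu.
    """
    z = STD_NORMAL.inv_cdf(1.0 - top)
    return STD_NORMAL.cdf(sigma - z)


def normal_top_share(cv: float, top: float = 0.01) -> float:
    """Share of the total held by the top `top` fraction of a normal.

    `cv` is the coefficient of variation (standard deviation / mean).
    Closed form: top + cv * phi(z). Assumes cv is small enough that
    negative values are negligible (illustrative assumption).
    """
    z = STD_NORMAL.inv_cdf(1.0 - top)
    return top + cv * STD_NORMAL.pdf(z)


if __name__ == "__main__":
    # Mathematically "heavy-tailed", but with a small sigma: about 0.013
    print(f"log-normal, sigma=0.1: {lognormal_top_share(0.1):.4f}")
    # Mathematically "thin-tailed", with moderate spread: about 0.018
    print(f"normal, cv=0.3:        {normal_top_share(0.3):.4f}")
    # Mathematically "heavy-tailed", with a large sigma: about 0.75
    print(f"log-normal, sigma=3.0: {lognormal_top_share(3.0):.4f}")
```

With these made-up parameters, the "thin-tailed" normal has a larger top 1% share than the "heavy-tailed" log-normal with sigma = 0.1, while a log-normal with sigma = 3 sits at the other end of the spectrum. That is the sense in which the binary mathematical property tells you little about where on the spectrum you are.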