To the second half of your comment, I agree that extreme suffering can be very extreme, and I think this is an important contribution. Maybe we have a misunderstanding about what ‘the bulk’ of suffering refers to: to me it means something like 75-99%, while to you it means something like 45%, as stated above? I should also clarify that by frequency I mean the product of ‘how many people have it’, ‘how often’ and ‘for how long’.
“the people in the top 10% of sufferers will have 10X the amount, and people in the 99% [I assume you mean top 1%?] will have 100X the amount”
I’m confused: you seem to be suggesting that every level of pain accounts for the _same_ amount of total suffering here.
To elaborate, you seem to be saying that at any level of pain, a pain 10x worse is also 10x less frequent. That’s a power law with exponent 1, i.e. the levels of pain have an extreme distribution, but the frequencies do too (mild pains are extremely common). I’m not saying you’re wrong; it’s just that what I’ve seen also seems consistent with extreme pain being less than 10% of the total. I’m excited to see more data :)
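As a tiny numeric illustration of that “power law with exponent 1” point (the intensity and frequency values here are made up purely for illustration, not taken from any dataset):

```python
import numpy as np

# Toy illustration of a power law with exponent 1: if a pain 10x as intense
# is also 10x as rare, then every decade of intensity contributes the same
# amount of total suffering (values are made up for illustration).
intensities = np.logspace(0, 4, 5)        # 1, 10, 100, 1000, 10000
frequencies = 1.0 / intensities           # 10x worse -> 10x rarer
contribution = intensities * frequencies  # total suffering per decade
print(contribution)                       # [1. 1. 1. 1. 1.] -- all equal
```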
Hi! Thank you for elaborating on what your question is :)
“Bulk” is indeed a very ambiguous term. Would you say 80% is “the bulk”? And 20% is “a small percentage”? If so we would be in agreement. If not, it is more of a wording issue than a matter of substance, I think.
Good catch that the numbers I provided would suggest a power law that just keeps going (e.g. similar to the St. Petersburg paradox?). If we use the Cluster Headache dataset, the numbers are:
The 50th percentile experiences 70 CH/year
The 80th percentile experiences 365 CH/year
The 90th percentile experiences 730 CH/year
The 98th percentile experiences 2190 CH/year
So at least in this case the 90th percentile does get 10X the amount of the 50th percentile. But the 98th and 99th percentiles are not as high as 100X; they are more like 20 to 50X. So not quite the numbers I used as an example, but also not too far off.
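As a quick sanity check of those ratios, using only the four percentile figures quoted above (nothing else from the dataset):

```python
# Ratios implied by the quoted percentile figures, relative to the median.
ch_per_year = {50: 70, 80: 365, 90: 730, 98: 2190}  # percentile -> CH/year
median = ch_per_year[50]
for pct, n in ch_per_year.items():
    print(f"{pct}th percentile: {n / median:.1f}x the median")
# 50th: 1.0x, 80th: 5.2x, 90th: 10.4x, 98th: 31.3x
```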
Here is the main idea: In Lognormal World, you would see a lognormal distribution for “amount of suffering per person”, “peak suffering per person”, “how long suffering above a certain threshold lasts for each person”, etc.
To illustrate this point, let’s say that each person’s hedonic tone for each second of their life is distributed along a lognormal whose exponent is a Gaussian with mean x and sd y. We would then also have that x, across different people, is distributed along a Gaussian with mean z and sd t. Now, if you want to get the global distribution of suffering per second across people, you need to convolve two Gaussians on the logarithmic pain scale (which represent the exponents of the lognormal distributions). Since convolving two Gaussians gives you another Gaussian, the global distribution of suffering per second is also a lognormal distribution! So the lognormal long tails will be present both at the individual and at the global scale. Now, to appreciate where the “bulk” of the suffering is, you need to look at the individuals who have the largest means for the normal distribution in the exponent (x in this case). This is why looking at one’s own individual % of time in extreme pain does not provide a good idea of how much of it there is in the wild across people (especially if one is close to the median, i.e. a pretty happy person).
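A minimal simulation sketch of this two-level picture (the parameter values z, t, y and the population size below are illustrative choices, not numbers from the post or any dataset):

```python
import numpy as np

# Minimal simulation sketch of the two-level "Lognormal World" picture above.
# The parameter values below (z, t, y, population size) are illustrative
# choices, not numbers taken from any dataset.
rng = np.random.default_rng(0)

z, t = 0.0, 1.0        # mean and sd of the personal means x across people
y = 1.0                # within-person sd of log-suffering per second
n_people, n_seconds = 2_000, 1_000

x = rng.normal(z, t, size=n_people)                      # each person's mean exponent
log_s = rng.normal(x[:, None], y, size=(n_people, n_seconds))
total_per_person = np.exp(log_s).sum(axis=1)             # total suffering per person

# Because the sum of the two Gaussian exponents is again Gaussian (variance
# t^2 + y^2), per-second suffering pooled across everyone stays lognormal,
# and the totals concentrate in the people with the largest x.
top_1pct = np.sort(total_per_person)[-n_people // 100:].sum()
print(f"share of all suffering borne by the top 1% of people: "
      f"{top_1pct / total_per_person.sum():.0%}")
```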
Your 4 cluster headache groups contribute about equally to the total number of cluster headaches if you multiply group size by # of CH’s. (The top 2% actually contribute a bit less). That’s my entire point. I’m not sure if you disagree?
Hey Soeren!
I would disagree, for the following reason. For a group to contribute equally, it needs to have both its average and its size be such that when you multiply them you get the same value. While it is true that people at the 50th percentile get 1/10 of what people at the 90th get (and ~1/50 of the 99th), these percentile values do not define groups. What we need to look at instead is the cumulative distribution function:
The bottom 50% accounts for 3.17% of incidents
The bottom 90% accounts for 30% of incidents
The bottom 95% accounts for 43% of incidents
What I am getting at is that for a given percentile, the contribution from the group “this percentile and lower” will be a lot smaller than the value at that percentile multiplied by the fraction of the participants below that level. This is because the distribution is very skewed, so for any given percentile the values below it quickly decrease.
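For what it’s worth, those cumulative figures are roughly what a lognormal fitted to the percentiles above would give. A hedged sketch (the fit is a rough reconstruction from the quoted median and 90th percentile, not the original dataset):

```python
import numpy as np
from scipy import stats

# Rough reconstruction: fit a lognormal to two of the quoted percentiles
# (median 70 CH/year, 90th percentile 730 CH/year).  These parameters are a
# back-of-the-envelope fit, not the original dataset.
mu = np.log(70)                                     # lognormal median = exp(mu)
sigma = (np.log(730) - mu) / stats.norm.ppf(0.90)   # ~1.83

# For a lognormal, the share of the total borne by the bottom fraction p of
# the population is Phi(Phi^-1(p) - sigma) (the lognormal Lorenz curve).
for p in (0.50, 0.90, 0.95):
    share = stats.norm.cdf(stats.norm.ppf(p) - sigma)
    print(f"bottom {p:.0%} accounts for ~{share:.0%} of all CHs")
# prints ~3%, ~29%, ~43%, close to the figures quoted above
```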
Another way of looking at this is by assuming that each percentile has a corresponding value (in the example, ‘number of CHs per year’) proportional to the rarity of that percentile or above. For simplicity, let’s say we have a step function where, each time we cut the remaining group in half, those right above the cut-off get twice the value:
0 to 50% have 1/year
50 to 75% have 2/year
75 to 87.5% have 4/year
and so on...
Here each group contributes equally (size * # of CHs is the same for each group). Counter-intuitively, this does not imply that extremes account for a small amount. On the contrary, it implies that the average is infinite (cf. the St. Petersburg paradox): even though, for any given percentile, the average below it is always finite (e.g. between 0 and 40% it’s 1/year), the average (and total contribution) above that percentile is always infinite. In this idealized case, it will always be true that “the bulk is concentrated on a tiny percentile” (and indeed you can make that percentile as small as you want and still get infinitely more above it than below it).
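A tiny sketch of this idealized step-function case, using just the toy numbers from the list above:

```python
# Toy version of the step-function example above: each time the group size
# halves, the value doubles, so every group contributes the same amount
# (0.5/year per group) and the overall mean grows without bound as more
# groups are added -- a St. Petersburg-style divergence.
running_mean = 0.0
for k in range(1, 21):
    group_size = 0.5 ** k        # 50%, 25%, 12.5%, ... of the population
    value = 2 ** (k - 1)         # 1/year, 2/year, 4/year, ... for that group
    running_mean += group_size * value   # each group adds exactly 0.5
print(running_mean)  # 10.0 after 20 groups; each further split adds another 0.5
```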
The empirical distribution is not so skewed that we need to worry about infinity. But we do need to worry about the 57% accounted for by the top 5%.
That’s fair, I made a mathematical error there. The cluster headache math convinces me that a large chunk of total suffering there goes to a few people due to lopsided frequencies. Do you have other examples? I particularly felt that the relative frequency of extreme compared to less extreme pain wasn’t well supported.