Now for normal distributions, or normal-ish distributions, this may not matter all that much in practice. As you say, "height roughly follows a normal distribution," so as long as a distribution is roughly normal, some small divergences don't get you too far away (maybe with a slightly differently shaped underlying distribution that fits the data it's possible to get a 242 cm human, maybe even 260 cm, but not 400 cm, and certainly not 4000 cm).
Since height roughly follows a normal distribution, the probability of huge heights is negligible.
Right, by "the probability of huge heights is negligible", I meant way more than 2.42 m, such that the details of the distribution would not matter. I would not get an astronomically low probability of at least such a height based on the methodology I used to get an astronomically low chance of a conflict causing human extinction. To arrive at this, I looked into the empirical tail distribution. I did not fit a distribution to the 25th to 75th percentile range, which is probably what would have suggested a normal distribution for height, and then extrapolate from there. I said I got an annual probability of conflict causing human extinction lower than 10^-9 using 33 or fewer of the rightmost points of the tail distribution. The 33rd tallest person whose height was recorded was actually 2.42 m, which illustrates I would not have gotten an astronomically low probability for at least 2.42 m.
This is why I think it's important to be able to think about a problem from multiple angles.
I agree. What do you think is the annualised probability of a nuclear war or volcanic eruption causing human extinction in the next 10 years? Do you see any concrete scenarios where the probability of a nuclear war or volcanic eruption causing human extinction is close to Toby's values?
I usually deploy this line ["any extremal distribution looks like a straight line when drawn on a log-log plot with a fat marker"] when arguing against people who claim they discovered a power law when I suspect something like ~log-normal might be a better fit. But obviously it works in the other direction as well; the main issue is model uncertainty.
I think power laws overestimate extinction risk. They imply the probability of going from 80 M annual deaths to extinction would be the same as going from extinction to 800 billion annual deaths, which very much overestimates the risk of large death tolls. So it makes sense the tail distribution eventually starts to decay much faster than implied by a power law, especially if this is fitted to the left tail.
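To make the scale-invariance point concrete, here is a minimal sketch assuming a pure Pareto (power-law) tail. The tail index `ALPHA` and the scale `X_MIN` are hypothetical values picked for illustration, not fitted to any conflict data, and the 8 billion figure for extinction is my reading of the numbers above (100 times 80 M). Under a power law, the chance of a 100-fold escalation is the same wherever you start from:

```python
import math

ALPHA = 1.5   # hypothetical tail index, not fitted to any data
X_MIN = 1e6   # hypothetical scale (annual deaths) where the power law starts


def pareto_survival(x, x_min=X_MIN, alpha=ALPHA):
    """P(X >= x) for a Pareto tail; equals 1 below the scale x_min."""
    if x < x_min:
        return 1.0
    return (x_min / x) ** alpha


def conditional_exceedance(x_from, x_to):
    """P(X >= x_to | X >= x_from) for x_to >= x_from."""
    return pareto_survival(x_to) / pareto_survival(x_from)


# 80 M -> 8 billion (roughly extinction) vs 8 billion -> 800 billion.
p_80m_to_8b = conditional_exceedance(80e6, 8e9)
p_8b_to_800b = conditional_exceedance(8e9, 800e9)

print(p_80m_to_8b, p_8b_to_800b)  # both equal 100 ** -ALPHA
```

Since deaths cannot exceed the population, the second probability has to be 0 in reality, which is why a pure power law cannot hold all the way out.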
On the other hand, I agree it is unclear whether the above tail distribution suggests an annual probability of a conflict causing human extinction above/below 10^-9. Still, even my inside view annual extinction risk from nuclear war of 5.53*10^-10 (which makes no use of the above tail distribution) is only 0.0111 % (= 5.53*10^-10/(5*10^-6)) of Toby's value.
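For anyone who wants to check the ratio, a quick sketch of the arithmetic. The variable names are mine, and the 5*10^-6 figure is simply the denominator used in the division above:

```python
inside_view_risk = 5.53e-10   # annual extinction risk from nuclear war (inside view)
toby_annual_value = 5e-6      # denominator used in the comparison above

ratio_percent = inside_view_risk / toby_annual_value * 100
print(f"{ratio_percent:.4f} %")  # 0.0111 %
```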
I did not fit a distribution to the 25th to 75th percentile range, which is probably what would have suggested a normal distribution for height, and then extrapolate from there. I said I got an annual probability of conflict causing human extinction lower than 10^-9 using 33 or fewer of the rightmost points of the tail distribution. The 33rd tallest person whose height was recorded was actually 2.42 m, which illustrates I would not have gotten an astronomically low probability for at least 2.42 m.
To be clear, I'm not accusing you of removing outliers from your data. I'm saying that you can't rule out medium-small probabilities of your model being badly off based on all the direct data you have access to, when you have so few data points to fit your model (through no fault of yours, but because reality only gave you so many data points to look at).
My guess is that randomly selecting 1000 data points of human height and fitting a distribution will more likely than not generate a ~normal distribution, but this is just speculation; I haven't done the data analysis myself.
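A rough sketch of what that exercise might look like, with loud caveats: I have no real height dataset here, so the 1000 points are simulated from an assumed normal distribution (mean 170 cm, standard deviation 10 cm, both hypothetical). The sketch only shows what a normal fit to the bulk implies for the tail; it says nothing about whether real heights are normal.

```python
import math
import random
import statistics


def normal_tail_probability(x, mean, sd):
    """P(X >= x) under a normal model; erfc keeps precision far in the tail."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))


random.seed(0)  # fixed seed so the run is repeatable

# Simulated stand-in for "1000 randomly selected heights" (cm); hypothetical.
sample = [random.gauss(170.0, 10.0) for _ in range(1000)]

fitted_mean = statistics.fmean(sample)
fitted_sd = statistics.stdev(sample)

# What the fitted normal says about a height of at least 2.42 m.
p_tail = normal_tail_probability(242.0, fitted_mean, fitted_sd)
print(f"fitted mean={fitted_mean:.1f} cm, sd={fitted_sd:.1f} cm")
print(f"P(height >= 242 cm) under the fit: {p_tail:.2e}")
```

Under these made-up parameters the fitted normal puts the probability of at least 2.42 m far below 10^-9, even though such heights have been recorded. That is the gap between extrapolating from the bulk and looking at the empirical tail.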
What do you think is the annualised probability of a nuclear war or volcanic eruption causing human extinction in the next 10 years? Do you see any concrete scenarios where the probability of a nuclear war or volcanic eruption causing human extinction is close to Toby's values?
I haven't been able to come up with a good toy model or bounds that I'm happy with, after thinking about it for a bit (I'm sure for less time than you or Toby or others like Luisa). If you or other commenters have models that you like, please let me know!
(In particular I'd be interested in a good generative argument for the prior.)
Thanks, Linch. Strongly upvoted.