Thanks for all the feedback, and early work on the topic, Stephen! I will reply to your points in different comments as you suggested.
First, you’re very likely right that my earlier estimates were too high. Although I still put some credence in a power law model, I think I should have incorporated more model uncertainty, and noted that other models would imply (much) lower chances of extinction-level wars.
To be fair, you and Rani had a section on breaking the [power] law where you say other distributions would fit the data well (although you did not discuss the implications for tail risk):
First, and most importantly, only two papers in the review also check whether other distributions might fit the same data. Clauset, Shalizi, and Newman (2009) consider four other distributions,[3] while Rafael González-Val (2015) also considers a lognormal fit. Both papers find that alternative distributions also fit the Correlates of War data well. In fact, when Clauset, Shalizi, and Newman compare the fit of the different distributions, they find no reason to prefer the power law.[4]
With respect to the below, I encourage readers to check the respective thread for context.
I think @Ryan Greenblatt has made good points in other comments, so I won’t belabour this point other than to add that some method of using the mean, or geometric mean, rather than the median seems reasonable to me when we face this degree of model uncertainty.
As I explained in the thread, I do not think a simple mean is appropriate. That being said, the mean could also lead to astronomically low extinction risk. With the methodology I followed, one has to look into at least 34 distributions for the mean not to be astronomically low. I have just obtained the following graph in this tab:
You suggested using the geometric mean, but it is always 0 given the null annual extinction risk for the top distribution, so it does not show up in the above graph. The median is only non-null for at least 84 distributions. I looked into all 111 types of distributions available in SciPy, since I wanted to minimise cherry-picking as much as possible, whereas typical analyses only study one or a few. So it would have been easy to miss that the mean could lead to a much higher extinction risk.
Incidentally, the steep increase in the red line of the graph above illustrates one worry about using the mean which I had alluded to in the thread. The simple mean is not resistant to outliers, in the sense that these are overweighted[1]. I have a strong intuition that, given 33 models outputting an annual extinction risk between 0 and 9.07*10^-14, with a mean of 1.15*10^-13 among them, one should not update upwards by 8 OOMs to an extinction risk of 6.45*10^-6 after integrating a 34th model outputting an annual extinction risk of 0.0219 % (similar to yours of 0.0124 %). Under these conditions, I think one should put way less weight on the 34th model (maybe roughly no weight?). As Holden Karnofsky discusses in the post Why we can’t take expected value estimates literally (even when they’re unbiased):
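As a rough illustration of this outlier sensitivity (the 33 low values below are stand-ins sharing the stated mean of 1.15*10^-13, not the actual model outputs):

```python
import statistics

# 33 models with annual extinction risk between 0 and 9.07e-14 (mean 1.15e-13),
# plus a 34th, power-law-like model outputting 0.0219 %.
low_risk_models = [1.15e-13] * 33  # stand-ins with the stated mean
outlier_model = 2.19e-4            # the 34th model

risks = low_risk_models + [outlier_model]

mean_risk = statistics.mean(risks)      # ~6.44e-6: dominated by the single outlier
median_risk = statistics.median(risks)  # ~1.15e-13: unmoved by the outlier

print(f"mean:   {mean_risk:.2e}")
print(f"median: {median_risk:.2e}")
```

The mean jumps by roughly 8 orders of magnitude because of one model out of 34, while the median is essentially unchanged.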
An EEV [explicit expected value] approach to this situation [analogous to using the simple mean in our case] might say, “Even if there’s a 99.99% chance that the estimate [of high extinction risk] is completely wrong and that the value of Action A is 0, there’s still an 0.01% probability that Action A has a value of X. Thus, overall Action A has an expected value of at least 0.0001X; the greater X is, the greater this value is, and if X is great enough [if there are a few models outputting a high enough extinction risk], then you should take Action A unless you’re willing to bet at enormous odds that the framework is wrong.”
However, the same formula discussed above indicates that Action A actually has an expected value – after the Bayesian adjustment – of X/(X^2+1), or just under 1/X. In this framework, the greater X is [the higher the extinction risk of a poorly calibrated model], the lower the expected value of Action A [the lower the product between the weight a poorly calibrated model should receive and its high extinction risk]. This syncs well with my intuitions: if someone threatened to harm one person unless you gave them $10, this ought to carry more weight (because it is more plausible in the face of the “prior” of life experience) than if they threatened to harm 100 people, which in turn ought to carry more weight than if they threatened to harm 3^^^3 people (I’m using 3^^^3 here as a representation of an unimaginably huge number).
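To make the quoted formula concrete, here is a quick numeric check (purely illustrative): the Bayesian-adjusted value X/(X^2 + 1) decreases as the claimed value X grows, so more extreme claims end up with less, not more, adjusted expected value:

```python
def adjusted_value(x: float) -> float:
    # Karnofsky's Bayesian-adjusted expected value for a claimed value x,
    # as quoted above: x / (x^2 + 1), roughly 1/x for large x.
    return x / (x**2 + 1)

for x in [1.0, 10.0, 100.0, 1000.0]:
    # The adjusted value shrinks as the claim grows more extreme.
    print(f"claimed {x:>7.1f} -> adjusted {adjusted_value(x):.4f}")
```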
Ideally, one would do more research to find how much weight each distribution should receive. In the absence of that, I think using the median is a simple way to adequately weight outliers.
One other minor point here: a reason I still like the power law fit is that there’s at least some theoretical support for this distribution (as Bear wrote about in Only the Dead).
The worry here is that the theoretical support for using a power law breaks at some point. According to a power law, the probability p1 of at least 8 billion deaths conditional on 800 million deaths is the same as the probability p2 of at least 80 billion deaths conditional on 8 billion deaths. However, p1 is low[2], whereas p2 is 0, since war deaths cannot exceed the current global population of around 8 billion.
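This scale invariance can be checked directly. A minimal sketch (the tail index alpha = 1.5 below is illustrative, not a fitted value): under a pure power law, the conditional probability of a 10-fold increase is the same at every scale, so the model cannot encode that 80 billion deaths is impossible.

```python
alpha = 1.5  # illustrative Pareto tail index, not a fitted value

def conditional_exceedance(x: float, factor: float = 10.0) -> float:
    # P(X >= factor * x | X >= x) for a Pareto tail P(X >= x) ~ x**(-alpha):
    # the x cancels, leaving a constant that does not depend on the scale x.
    return factor ** (-alpha)

p1 = conditional_exceedance(8e8)  # 8 billion deaths given 800 million
p2 = conditional_exceedance(8e9)  # 80 billion deaths given 8 billion
print(p1 == p2)  # the power law assigns them the same conditional probability
```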
Whereas I haven’t seen arguments that connect other potential fits to the theory of the underlying data generating process.
There is this argument I mentioned in the post:
In addition, according to extreme value theory (EVT), the right tail should follow a generalised Pareto[10], and the respective best fit distribution resulted in an extinction risk of exactly 0[11] (R^2 of 99.8 %). Like I anticipated, the best fit Pareto (power law) resulted in a higher risk, 0.0122 % (R^2 of 99.7 %), i.e. 98.4 % (= 1.22*10^-4/(1.24*10^-4)) of Stephen’s 0.0124 %. Such remarkable agreement means the extinction risk for the best fit Pareto is essentially the same regardless of whether it is fitted to the top 10 % logarithm of the annual war deaths of combatants as a fraction of the global population (as I did), or to the war deaths of combatants per war (as implied by Stephen using Bear’s estimates). I guess this qualitatively generalises to other types of distributions. In any case, I would rather follow my approach.
I should note I have just updated in the post the part of the sentence above after “i.e.”. Previously, I was comparing the annual war extinction risk of my best fit power law with your extinction risk per war under the “constant risk hypothesis”. Now I am making the correct comparison with your annual war extinction risk.
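For readers curious what this kind of comparison looks like in code, here is a minimal sketch using SciPy on synthetic heavy-tailed data (the data, tail index, and threshold are all made up for illustration; this is not the post’s actual fitting pipeline):

```python
import numpy as np
from scipy import stats

# Synthetic heavy-tailed data standing in for war sizes (illustrative only).
rng = np.random.default_rng(0)
data = stats.pareto.rvs(b=1.5, size=1000, random_state=rng)

threshold = 10.0  # hypothetical "extinction-level" size in the data's units

for dist in (stats.pareto, stats.genpareto):
    params = dist.fit(data)                  # maximum-likelihood fit
    tail_prob = dist.sf(threshold, *params)  # survival function: P(X > threshold)
    print(f"{dist.name}: P(X > {threshold}) = {tail_prob:.3e}")
```

The point is only methodological: two distributions can both fit the bulk of the data well while implying very different tail probabilities, which is why the choice of distribution drives the extinction risk estimate.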
This is pretty speculative and uncertain, but is another reason why I don’t want to throw away the power law entirely yet.
Just to clarify, I am still accounting for the results of the power law in my best guess. However, since I am using the median to aggregate the various estimates of the extinction risk, I get astronomically low extinction risk even accounting for distributions predicting super high values.
Thanks Vasco! I’ll come back to this to respond in a bit more depth next week (this is a busy week).
In the meantime, curious what you make of my point that setting a prior that gives only a 1 in 15 trillion chance of experiencing an extinction-level war in any given year seems wrong?
I’ll come back to this to respond in a bit more depth next week (this is a busy week).
No worries, and thanks for still managing to make an in-depth comment!
In the meantime, curious what you make of my point that setting a prior that gives only a 1 in 15 trillion chance of experiencing an extinction-level war in any given year seems wrong?
I only managed to reply to 3 of your points yesterday and during this evening, but I still plan to address the 4th one today.
I have added this point to the post.
I would argue not just low, but super low.
You are welcome, Stephen!