Thanks again for this post, Vasco, and for sharing it with me for discussion beforehand. I really appreciate your work on this question. It’s super valuable to have more people thinking deeply about these issues and this post is a significant contribution.
The headline of my response is: I think you’re pointing in the right direction, and the estimates I gave in my original post are too high. But I think you’re overshooting, and the probabilities you give here seem too low.
I have a couple of points to expand on; please do feel free to respond to each in individual comments to facilitate better discussion!
To summarize, my points are:
I think you’re right that my earlier estimates were too high; but I think this way overcorrects the other way.
There are some issues with using the historical war data.
I’m still a bit confused and uneasy about your choice to use proportion killed per year rather than proportion or total killed per war.
I think your preferred estimate is so infinitesimally small that something must be going wrong.
First, you’re very likely right that my earlier estimates were too high. Although I still put some credence in a power law model, I think I should have incorporated more model uncertainty, and noted that other models would imply (much) lower chances of extinction-level wars.
I think @Ryan Greenblatt has made good points in other comments so won’t belabour this point other than to add that I think some method of using the mean, or geometric mean, rather than median seems reasonable to me when we face this degree of model uncertainty.
One other minor point here: a reason I still like the power law fit is that there’s at least some theoretical support for this distribution (as Bear wrote about in Only the Dead). Whereas I haven’t seen arguments that connect other potential fits to the theory of the underlying data generating process. This is pretty speculative and uncertain, but is another reason why I don’t want to throw away the power law entirely yet.
Second, I’m still skeptical that the historical war data is the “right” prior to use. It may be “a” prior but your title might be overstating things. This is related to Aaron’s point you quote in footnote 9, about assuming wars are IID over time. I think maybe we can assume they’re I (independent), but not that they’re ID (identically distributed) over time.
I think we can be pretty confident that WWII was so much larger than other wars not just randomly, but in fact because globalization[1] and new technologies like machine guns and bombs shifted the distribution of potential war outcomes. And I think similarly that distribution has shifted again since. Cf. my discussion of war-making capacity here. Obviously past war size isn’t completely irrelevant to the potential size of current wars, but I do think not adjusting for this shift at all likely biases your estimate down.
Third, I’m still uneasy about your choice to use annual proportion of population killed rather than number of deaths per war. This is just very rare in the IR world. I don’t know enough about how the COW data is created to assess it properly. Maybe one problem here is that it just clearly breaks the IID assumption. If we’re modelling each year as a draw, then since major wars last more than a year the probabilities of subsequent draws are clearly dependent on previous draws. Whereas if we just model each war as a whole as a draw (either in terms of gross deaths or in terms of deaths as a proportion of world population), then we’re at least closer to an IID world. Not sure about this, but it feels like it also biases your estimate down.
Finally, I’m a bit suspicious of infinitesimal probabilities due to the strength they give the prior. They imply we’d need enormously strong evidence to update much at all in a way that seems unreasonable to me.
Let’s take your preferred estimate of an annual probability of “6.36*10^-14”. That’s a 1 in 15,723,270,440,252 chance. That is, 1 in 15 trillion years.
I look around at the world and I see a nuclear-armed state fighting against a NATO-backed ally in Ukraine; I see conflict once again spreading throughout the Middle East; I see the US arming and perhaps preparing to defend Taiwan against China, which is governed by a leader who claims to consider reunification both inevitable and an existential issue for his nation.
And I see nuclear arsenals that still top 12,000 warheads and growing; I see ongoing bioweapons research powered by ever-more-capable biotechnologies; and I see obvious military interest in developing AI systems and autonomous weapons.
This does not seem like a situation that only leads to total existential destruction once every 15 trillion years.
I know you’re only talking about the prior, but your preferred estimate implies we’d need a galactically-enormous update to get to a posterior probability of war x-risk that seems reasonable. So I think something might be going wrong. Cf. some of Joe’s discussion of settling on infinitesimal priors here.
All that said, let me reiterate that I really appreciate this work!
What I mean here is that we should adjust somewhat for the fact that world wars are even possible nowadays. WWII was fought across three or four continents; that just couldn’t have happened before the 1900s. But about 1⁄3 of the COW dataset is for pre-1900 wars.
Finally, I’m a bit suspicious of infinitesimal probabilities due to the strength they give the prior. They imply we’d need enormously strong evidence to update much at all in a way that seems unreasonable to me.
[...]
Cf. some of Joe’s discussion of settling on infinitesimal priors here.
I think there is a potential misunderstanding here. Joe Carlsmith’s[1] discussion of the constraints on future updating applies to one’s best guess. In contrast, my astronomically low best guess prior is supposed to be neither my current best guess nor a preliminary best guess from which one should formally update towards one’s best guess. That being said, historical war deaths seem to me like the most natural prior to assess future war deaths, so I see some merit in using my astronomically low best guess prior as a preliminary best guess.
I also agree with Joe that an astronomically low annual AI extinction risk (e.g. 6.36*10^-14) would not make sense (see this somewhat related thread). However, I would think about the possibility of AI killing all humans in the context of AI risk, not great power war.
I like to use the full name on the 1st occasion a name is mentioned, and then just the 1st name afterwards.
Let’s take your preferred estimate of an annual probability of “6.36*10^-14”. That’s a 1 in 15,723,270,440,252 chance. That is, 1 in 15 trillion years.
I look around at the world and I see a nuclear-armed state fighting against a NATO-backed ally in Ukraine; I see conflict once again spreading throughout the Middle East; I see the US arming and perhaps preparing to defend Taiwan against China, which is governed by a leader who claims to consider reunification both inevitable and an existential issue for his nation.
And I see nuclear arsenals that still top 12,000 warheads and growing; I see ongoing bioweapons research powered by ever-more-capable biotechnologies; and I see obvious military interest in developing AI systems and autonomous weapons.
This does not seem like a situation that only leads to total existential destruction once every 15 trillion years.
I feel like the sentiment you are expressing by describing current events and trends would also have applied in the past, and would apply today to risks which you might consider overly low. On the one hand, I appreciate that a probability like 6.36*10^-14 intuitively feels way too small. On the other, humans are not designed to intuitively/directly assess the probability of rare events in a reliable way. Such assessments involve many steps, and therefore give rise to scope neglect.
As a side note, I do not think there is an evolutionary incentive for an individual human to accurately distinguish between an extinction risk of 10^-14 and 0.01 %, because both are negligible in comparison with the annual risk of death of 1 % (for a life expectancy of 100 years). Relatedly, I mentioned in the post that:
In general, I suspect there is a tendency to give probabilities between 1 % and 99 % for events whose mechanics we do not understand well [e.g. extinction conditional on a war larger than World War 2], given this range encompasses the vast majority (98 %) of the available linear space (from 0 to 1), and events in everyday life one cares about are not that extreme. However, the available logarithmic space is infinitely vast, so there is margin for such guesses to be major overestimates. In the context of tail risk, subjective guesses can easily fail to adequately account for the faster decay of the tail distribution as severity approaches the maximum.
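To illustrate the gap between the linear and logarithmic views, here is a minimal sketch; the floor of 10^-14 is an assumption picked to match the scale of the estimates discussed here:

```python
import math

# Guesses between 1 % and 99 % cover 98 % of the linear space from 0 to 1.
linear_fraction = 0.99 - 0.01

# The same range covers a much smaller share of the logarithmic space
# between an assumed floor of 10^-14 and 1.
log_fraction = math.log10(0.99 / 0.01) / math.log10(1 / 1e-14)

print(f"{linear_fraction:.0%} of the linear space")    # 98 %
print(f"{log_fraction:.0%} of the logarithmic space")  # about 14 %
```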
In addition, I guess my astronomically low annual war extinction risk feels like an extreme value to many because they have in the back of their minds Toby’s guesses for the existential risk between 2021 and 2120 given in The Precipice. The guess was 0.1 % for nuclear war, which corresponds to an annual existential risk of around 10^-5, way larger than the estimates for the annual war extinction risk I present in my post. Toby does not mechanistically explain how he got his guesses, and I do not think he used quantitative models to derive them, so they may well be prone to scope neglect. In terms of Toby’s guesses, I also mentioned in the post that:
In general, I agree with David Thorstad that Toby Ord’s guesses for the existential risk between 2021 and 2120 given in The Precipice are very high (e.g. 0.1 % for nuclear war). In the realm of the more anthropogenic AI, bio and nuclear risk, I personally think underweighting the outside view is a major reason leading to overly high risk estimates. I encourage readers to check David’s series Exaggerating the risks, which includes subseries on climate, AI and bio risk.
To give an example that is not discussed by David, Salotti 2022 estimated the extinction risk per century from asteroids and comets to be 2.2*10^-12 (see Table 1), which is 6 (= log10(10^-6/(2.2*10^-12))) orders of magnitude lower than Toby Ord’s guess of 10^-6 for the existential risk from asteroids and comets. The concept of existential risk is quite vague, but I do not think one can say the existential risk from asteroids and comets is 6 orders of magnitude higher than the extinction risk from these:
There have been 5 mass extinctions, and the impact winter involved in the last one, which played a role in the extinction of the dinosaurs, may well have contributed to the emergence of mammals, and ultimately humans.
It is possible a species better than humans at steering the future would have evolved given fewer mass extinctions, or in the absence of the last one in particular, but this is unclear. So I would say the above is some evidence that existential risk may even be lower than extinction risk.
I know you’re only talking about the prior, but your preferred estimate implies we’d need a galactically-enormous update to get to a posterior probability of war x-risk that seems reasonable. So I think something might be going wrong.
The methodology I followed in my analysis is quite similar to yours. The major differences are that:
I fitted distributions to the top 10 % logarithm of the annual war deaths of combatants as a fraction of the global population, whereas you relied on an extinction risk per war from Bear obtained by fitting a power law to war deaths of combatants per war. As I commented, it is unclear to me whether this is a major issue, but I prefer my approach.
I dealt with 111 types of distributions, whereas you focussed on 1.
For the distribution you used (a Pareto), I got an annual probability of a war causing human extinction of 0.0122 %, which is very similar to the 0.0124 %/year corresponding to your estimate of 0.95 % over 77 years (see the sketch after this list).
Aggregating the results of the top 100 distributions, I got 6.36*10^-14.
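Here is a minimal sketch of the annualisation behind the 3rd point above, assuming a constant and independent extinction risk in each of the 77 years:

```python
# Stephen's estimate: 0.95 % probability of an extinction-level war over 77 years.
risk_over_77_years = 0.0095

# Assuming the same independent risk p in each year, 1 - (1 - p)^77 = 0.0095.
annual_risk = 1 - (1 - risk_over_77_years) ** (1 / 77)

print(f"{annual_risk:.4%}")  # about 0.0124 %, i.e. 1.24*10^-4 per year
```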
You might be thinking something along the lines of:
Given no fundamental flaw in my methodology, one should update towards an astronomically low war extinction risk.
Given a fundamental flaw in my methodology, one should update towards a war extinction risk e.g. 10 % as high as your 0.0124 %/year, i.e. 0.00124 %/year.
However, given the similarities between our methodologies, I think there is a high chance that any fundamental flaw in my methodology would affect yours too. So, given a fundamental flaw in mine, I would mostly believe that neither my best guess prior nor your best guess could be trusted, and, to the extent your best guess for the war extinction risk is informed by your methodology, I would not use it as a prior either. In this case, one would have to come up with a better methodology rather than multiplying your annual war extinction risk by e.g. 10 %.
I also feel like updating on your prior via multiplication by something like 10 % would be quite arbitrary because my estimates for the annual war extinction risk are all over the map. Across all 111 distributions, and 3 values for the deaths of combatants as a fraction of the total deaths (10 %, 50 % and 90 %) I studied, I got estimates for the annual probability of a war causing human extinction from 0 to 8.84 %. Considering just my best guess of war deaths of combatants equal to 50 % of the total deaths, the annual probability of a war causing human extinction still ranges from 0 to 2.95 %. Given such wide ranges, I would instead update towards a state of greater cluelessness or less resilience. In turn, these would imply a greater need for a better methodology, and more research on quantifying the risk of war in general.
Third, I’m still uneasy about your choice to use annual proportion of population killed rather than number of deaths per war. This is just very rare in the IR world.
Looking into annual war deaths as a fraction of the global population is relevant to estimating extinction risk, but the international relations world is not focussing on this. For reference, here is what I said about this matter in the post:
Stephen commented I had better follow the typical approach of modelling war deaths, instead of annual war deaths as a fraction of the global population, and then getting the probability of human extinction from the chance of war deaths being at least as large as the global population. I think my approach is more appropriate, especially to estimate tail risk. There is human extinction if and only if annual war deaths as a fraction of the global population are at least 1. In contrast, war deaths as a fraction of the global population in the year the war started being at least 1 does not imply human extinction. Consider a war lasting for the next 100 years totalling 8 billion deaths. The war deaths as a fraction of the global population in the year the war started would be 100 %, which means such a war would imply human extinction under the typical approach. Nevertheless, this would only be the case if no humans were born in the next 100 years, and new births are not negligible. In fact, the global population increased thanks to these during the years with the most annual war deaths of combatants in the data I used:
From 1914 to 1918 (years of World War 1), they were 9.28 M, 0.510 % (= 9.28/(1.82*10^3)) of the global population in 1914, but the global population increased 2.20 % (= 1.86/1.82 − 1) during this period.
From 1939 to 1945 (years of World War 2), they were 17.8 M, 0.784 % (= 17.8/(2.27*10^3)) of the global population in 1939, but the global population increased 4.85 % (= 2.38/2.27 − 1) during this period.
Do you have any thoughts on the above?
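To make the hypothetical in the quote concrete, here is a minimal sketch; the 2 % crude birth rate is an assumed, illustrative figure:

```python
# A war lasting 100 years with 8 billion total deaths, as in the quote above.
initial_population = 8e9
total_war_deaths = 8e9
war_duration_years = 100

# Typical approach: deaths per war as a fraction of the population in the
# year the war started. This equals 100 %, which would be read as extinction.
per_war_fraction = total_war_deaths / initial_population

# Annual approach: 80 M war deaths per year, against 160 M births per year
# (assuming a 2 % crude birth rate), so the population need not go extinct.
annual_war_deaths = total_war_deaths / war_duration_years
annual_births = 0.02 * initial_population

print(per_war_fraction)                   # 1.0, i.e. 100 %
print(annual_births - annual_war_deaths)  # +80 M people per year
```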
I don’t know enough about how the COW data is created to assess it properly. Maybe one problem here is that it just clearly breaks the IID assumption. If we’re modelling each year as a draw, then since major wars last more than a year the probabilities of subsequent draws are clearly dependent on previous draws. Whereas if we just model each war as a whole as a draw (either in terms of gross deaths or in terms of deaths as a proportion of world population), then we’re at least closer to an IID world. Not sure about this, but it feels like it also biases your estimate down.
It is unclear to me whether this is a major issue, because both methodologies lead to essentially the same annual war extinction risk for a power law:
Like I anticipated, the best fit Pareto (power law) resulted in a higher risk, 0.0122 % (R^2 of 99.7 %), i.e. 98.4 % (= 1.22*10^-4/(1.24*10^-4)) of Stephen’s 0.0124 %. Such remarkable agreement means the extinction risk for the best fit Pareto is essentially the same regardless of whether it is fitted to the top 10 % logarithm of the annual war deaths of combatants as a fraction of the global population (as I did), or to the war deaths of combatants per war (as implied by Stephen using Bear’s estimates). I guess this qualitatively generalises to other types of distributions. In any case, I would rather follow my approach.
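For concreteness, here is a minimal sketch of the tail-fitting approach; the data below are random placeholders standing in for the top 10 % of the annual war deaths of combatants as a fraction of the global population (the real analysis fits 111 SciPy distribution types to the actual data, not just the Pareto):

```python
import numpy as np
from scipy import stats

# Placeholder tail observations: 60 hypothetical annual death fractions drawn
# from a Pareto with tail index 1.5 and minimum 10^-6 (NOT the real data).
rng = np.random.default_rng(0)
tail_fractions = 1e-6 * (1 + rng.pareto(1.5, 60))

# Fit a Pareto (power law) to the tail observations.
b, loc, scale = stats.pareto.fit(tail_fractions, floc=0)

# Extinction requires an annual death fraction of at least 1. Multiply the
# tail probability by 10 %, the frequency of a year falling in the tail.
extinction_risk_given_tail = stats.pareto.sf(1, b, loc=loc, scale=scale)
annual_extinction_risk = 0.1 * extinction_risk_given_tail

print(annual_extinction_risk)
```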
Second, I’m still skeptical that the historical war data is the “right” prior to use. It may be “a” prior but your title might be overstating things. This is related to Aaron’s point you quote in footnote 9, about assuming wars are IID over time. I think maybe we can assume they’re I (independent), but not that they’re ID (identically distributed) over time.
Historical war deaths seem to me like the most natural prior to assess future war deaths. I guess you consider it a decent prior too, as you relied on historical war data to get your extinction risk, but maybe you have a better reference class in mind?
Aaron’s point about annual war deaths not being IID over time does not have a clear impact on my estimate for the annual extinction risk. If one thinks war deaths have been decreasing/increasing, then one should update towards a lower/higher extinction risk. However:
There is not an obvious trend in the past 600 years (see last graph in the post).
My impression is that there is lots of debate in the literature, and that the honest conclusion is that we do not have enough data to establish a clear trend.
I think Aaron’s paper (Clauset 2018) agrees with the above:
Since 1945, there have been relatively few large interstate wars, especially compared to the preceding 30 years, which included both World Wars. This pattern, sometimes called the long peace, is highly controversial. Does it represent an enduring trend caused by a genuine change in the underlying conflict-generating processes? Or is it consistent with a highly variable but otherwise stable system of conflict? Using the empirical distributions of interstate war sizes and onset times from 1823 to 2003, we parameterize stationary models of conflict generation that can distinguish trends from statistical fluctuations in the statistics of war. These models indicate that both the long peace and the period of great violence that preceded it are not statistically uncommon patterns in realistic but stationary conflict time series.
I think there is also another point Aaron was referring to in footnote 9 (emphasis mine):
you have a deeper assumption that is quite questionable, which is whether events are plausibly iid [independent and identically distributed] over such a long time scale. This is where the deep theoretical understanding from the literature on war is useful, and in my 2018 paper [Clauset 2018], my Discussion section delves into the implications of that understanding for making such long term and large-size extrapolations.
Relevant context for what I highlighted above:
Clauset 2018 did estimate a 50 % probability of a war causing 1 billion battle deaths[15] in the next 1,339 years (see “The long view”), which is close to my pessimistic scenario [see post for explanation]
I think Aaron had the above in mind, and therefore was worried about assuming wars are IID over a long time, because this affects how much time it would take in expectation for a war to cause extinction. However, in my post I am not estimating this time, but rather the nearterm annual probability of a war causing extinction, which does not rely on assumptions about whether wars will be IID over a long time horizon. I alluded to this in footnote 9:
Assuming wars are IID over a long time scale would be problematic if one wanted to estimate the time until a war caused human extinction, but I do not think it is an issue to estimate the nearterm annual extinction risk.
It is possible you missed this part, because it was not in the early versions of the draft.
I think we can be pretty confident that WWII was so much larger than other wars not just randomly, but in fact because globalization[1] and new technologies like machine guns and bombs shifted the distribution of potential war outcomes.
Some thoughts on the above:
What directly matters to assess the annual probability of a war causing human extinction is not war deaths, but annual war deaths as a fraction of the global population. For instance, one can have increasing war deaths with a constant annual probability of a war causing human extinction if wars become increasingly long and population increases. Hopefully not, but it is possible wars in the far future will routinely wipe out e.g. trillions of digital minds while not posing any meaningful risk of wiping out all digital minds, due to the existence of a huge population.
It is unclear to me whether globalisation makes wars larger. For example, globalisation is associated with an expansion of international trade, and this can explain the “durable peace hypothesis” (see Jackson 2015).
According to deterrence theory, greater potential to cause damage may result in less expected damage, although I am personally not convinced of this.
Even if globalisation makes wars larger, it could make them less frequent too, such that the expected annual damage decreases, and so does the annual probability of one causing extinction.
And I think similarly that distribution has shifted again since. Cf. my discussion of war-making capacity here. Obviously past war size isn’t completely irrelevant to the potential size of current wars, but I do think not adjusting for this shift at all likely biases your estimate down.
I assume increasing capability to cause damage is the main reason for people arguing that future wars would belong to a different category. Yet:
I think war capabilities have been decreasing or not changing much in the last few decades:
“Nuclear risk has been decreasing. The estimated destroyable area by nuclear weapons deliverable in a first strike has decreased 89.2 % (= 1 − 65.2/601) since its peak in 1962” (see 1st graph below).
Military expenditure as a fraction of global GDP has decreased from 1960 to 2000, and been fairly constant since then (see 2nd graph below).
Taking a broader view, war capabilities have indeed been increasing, but there is not a clear trend in the deaths in conflicts as a fraction of the global population since 1400 (see the last figure in the post).
Increases in the capability to cause damage are usually associated with increases in the capability to prevent damage, which I guess explains what I said just above, so one should not forecast future risk based on just one side alone.
Thanks for all the feedback, and early work on the topic, Stephen! I will reply to your points in different comments as you suggested.
First, you’re very likely right that my earlier estimates were too high. Although I still put some credence in a power law model, I think I should have incorporated more model uncertainty, and noted that other models would imply (much) lower chances of extinction-level wars.
To be fair, you and Rani had a section on breaking the [power] law where you say other distributions would fit the data well (although you did not discuss the implications for tail risk):
First, and most importantly, only two papers in the review also check whether other distributions might fit the same data. Clauset, Shalizi, and Newman (2009) consider four other distributions,[3] while Rafael González-Val (2015) also considers a lognormal fit. Both papers find that alternative distributions also fit the Correlates of War data well. In fact, when Clauset, Shalizi, and Newman compare the fit of the different distributions, they find no reason to prefer the power law.[4]
With respect to the below, I encourage readers to check the respective thread for context.
I think @Ryan Greenblatt has made good points in other comments so won’t belabour this point other than to add that I think some method of using the mean, or geometric mean, rather than median seems reasonable to me when we face this degree of model uncertainty.
As I explained in the thread, I do not think a simple mean is appropriate. That being said, the mean could also lead to an astronomically low extinction risk. With the methodology I followed, one has to look into at least 34 distributions for the mean not to be astronomically low. I have just obtained the following graph in this tab:
You suggested using the geometric mean, but it is always 0 given the null annual extinction risk for the top distribution, so it does not show up in the above graph. The median is only non-null when at least 84 distributions are included. I looked into all 111 types of distributions available in SciPy, since I wanted to minimise cherry-picking as much as possible, but typical analyses only study 1 or a few. So it would have been easy to miss that the mean could lead to a much higher extinction risk.
Incidentally, the steep increase in the red line of the graph above illustrates one worry about using the mean that I had alluded to in the thread. The simple mean is not resistant to outliers, in the sense that these are overweighted[1]. I have a strong intuition that, given 33 models outputting an annual extinction risk between 0 and 9.07*10^-14, with a mean of 1.15*10^-13 among them, one should not update upwards by 8 OOMs to an extinction risk of 6.45*10^-6 after integrating a 34th model outputting an annual extinction risk of 0.0219 % (similar to yours of 0.0124 %). Under these conditions, I think one should put way less weight on the 34th model (maybe roughly no weight?). As Holden Karnofsky discusses in the post Why we can’t take expected value estimates literally (even when they’re unbiased):
An EEV [explicit expected value] approach to this situation [analogous to using the simple mean in our case] might say, “Even if there’s a 99.99% chance that the estimate [of high extinction risk] is completely wrong and that the value of Action A is 0, there’s still an 0.01% probability that Action A has a value of X. Thus, overall Action A has an expected value of at least 0.0001X; the greater X is, the greater this value is, and if X is great enough [if there are a few models outputting a high enough extinction risk], then you should take Action A unless you’re willing to bet at enormous odds that the framework is wrong.”
However, the same formula discussed above indicates that Action A actually has an expected value – after the Bayesian adjustment – of X/(X^2+1), or just under 1/X. In this framework, the greater X is [the higher the extinction risk of a poorly calibrated model], the lower the expected value of Action A [the lower the product between the weight a poorly calibrated model should receive and its high extinction risk]. This syncs well with my intuitions: if someone threatened to harm one person unless you gave them $10, this ought to carry more weight (because it is more plausible in the face of the “prior” of life experience) than if they threatened to harm 100 people, which in turn ought to carry more weight than if they threatened to harm 3^^^3 people (I’m using 3^^^3 here as a representation of an unimaginably huge number).
Ideally, one would do more research to find how much weight each distribution should receive. In the absence of that, I think using the median is a simple way to adequately weight outliers.
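Here is a minimal sketch of that sensitivity, using the numbers from my comment above as stand-ins for the 34 model outputs:

```python
import numpy as np

# 33 models with an annual extinction risk of roughly 10^-13 each, plus a
# 34th model outputting 0.0219 % (2.19*10^-4), as in the comment above.
model_estimates = np.append(np.full(33, 1.15e-13), 2.19e-4)

print(np.mean(model_estimates))    # about 6.4*10^-6: driven by the outlier,
                                   # roughly 8 OOMs above the other 33 models
print(np.median(model_estimates))  # about 1.2*10^-13: barely moves
```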
One other minor point here: a reason I still like the power law fit is that there’s at least some theoretical support for this distribution (as Bear wrote about in Only the Dead).
The worry here is that the theoretical support for using a power law breaks at some point. According to a power law, the probability p1 of at least 8 billion deaths conditional on at least 800 M deaths is the same as the probability p2 of at least 80 billion deaths conditional on at least 8 billion deaths. However, p1 is low[2] whereas p2 is 0.
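A minimal sketch of this scale invariance, with an assumed tail index of 1.5 and a minimum of 1 M war deaths (both hypothetical):

```python
from scipy import stats

# Pareto tail: P(X >= x) = (x_m / x)^alpha for x >= x_m.
alpha, x_m = 1.5, 1e6  # hypothetical tail index and minimum war deaths
deaths = stats.pareto(alpha, scale=x_m)

p1 = deaths.sf(8e9) / deaths.sf(8e8)   # P(>= 8 B deaths | >= 800 M deaths)
p2 = deaths.sf(8e10) / deaths.sf(8e9)  # P(>= 80 B deaths | >= 8 B deaths)

print(p1, p2)  # both equal 0.1^1.5, about 3.2 %, whatever the scale
```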
Whereas I haven’t seen arguments that connect other potential fits to the theory of the underlying data generating process.
There is this argument I mentioned in the post:
In addition, according to extreme value theory (EVT), the right tail should follow a generalised Pareto[10], and the respective best fit distribution resulted in an extinction risk of exactly 0[11] (R^2 of 99.8 %). Like I anticipated, the best fit Pareto (power law) resulted in a higher risk, 0.0122 % (R^2 of 99.7 %), i.e. 98.4 % (= 1.22*10^-4/(1.24*10^-4)) of Stephen’s 0.0124 %. Such remarkable agreement means the extinction risk for the best fit Pareto is essentially the same regardless of whether it is fitted to the top 10 % logarithm of the annual war deaths of combatants as a fraction of the global population (as I did), or to the war deaths of combatants per war (as implied by Stephen using Bear’s estimates). I guess this qualitatively generalises to other types of distributions. In any case, I would rather follow my approach.
I should note I have just updated, in the post, the part of the sentence above after “i.e.”. Previously, I was comparing the annual war extinction risk of my best fit power law with your extinction risk per war under the “constant risk hypothesis”. Now I am making the correct comparison with your annual war extinction risk.
This is pretty speculative and uncertain, but is another reason why I don’t want to throw away the power law entirely yet.
Just to clarify, I am still accounting for the results of the power law in my best guess. However, since I am using the median to aggregate the various estimates of the extinction risk, I get an astronomically low extinction risk even accounting for distributions predicting super high values.
I have added this point to the post.
I would argue not just low, but super low.
Thanks Vasco! I’ll come back to this to respond in a bit more depth next week (this is a busy week).
In the meantime, curious what you make of my point that setting a prior that gives only a 1 in 15 trillion chance of experiencing an extinction-level war in any given year seems wrong?
You are welcome, Stephen!
I’ll come back to this to respond in a bit more depth next week (this is a busy week).
No worries, and thanks for still managing to make an in-depth comment!
In the meantime, curious what you make of my point that setting a prior that gives only a 1 in 15 trillion chance of experiencing an extinction-level war in any given year seems wrong?
I only managed to reply to 3 of your points yesterday and during this evening, but I plan to address the 4th one later today.