A toy model for technological existential risk

Preface: I’m quite uncertain about this model. It may be making too many simplifying assumptions to be useful, and I may not be applying Laplace’s Law of Succession correctly. I haven’t seen this approach anywhere, but I’m not confident it hasn’t been done before. I’m really interested in any comments or feedback anyone may have!

Acknowledgements: Thanks to Alex HT and Will Payne for interesting discussion, and thanks to Aimee Watts for proof-reading. All mistakes my own.

Summary

  • Humans seem likely to make much more technological progress in the next century than has been made previously.

  • Under a model based on Bostrom’s urn of creativity, having not discovered a devastating technology so far does not give much reason to think we won’t in the future.

  • This model serves as a way of deriving a prior for technological existential risk, and there might be good reasons to lower the expected risk.

  • This model may be too naïve an application of the urn of creativity, or of Laplace’s model, and may be flawed.

The Urn of Creativity Model

In the Vulnerable World Hypothesis, Bostrom describes technological progress as blindly drawing balls out of an urn. Some balls are white, representing technologies that are relatively harmless; some are black, representing technologies that, upon discovery, lead to human extinction.

So far humanity hasn’t drawn a black ball.

If we think existential risk in the next century mostly comes from dangerous new technologies, then we can think of estimating existential risk in the next century as estimating the chance we draw a black ball.

Laplace’s Law of Succession

Suppose we have no information about the ratio of black to white balls in the urn except our record so far. A mathematical rule called Laplace's Law of Succession then says that the best estimate of the chance of drawing a black ball on the next go is $\frac{s+1}{n+2}$, where n is the number of balls we've drawn so far, and s is the number of black balls we've drawn so far. So for us, s = 0.

How do we calculate the probability of drawing at least one black ball in the next m draws? Well, we can't just raise a fixed per-draw probability to the power m, because every time you don't draw a black ball, you should revise down your probability of drawing black on the next go. The probability of drawing at least one black ball in the next m draws, having drawn s black balls in the last n, is

$1 - \prod_{i=0}^{m-1} \frac{n-s+1+i}{n+2+i}$

So assuming no black draws so far (s = 0), the product telescopes and we get

$1 - \frac{n+1}{n+m+1} = \frac{m}{n+m+1}$
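
As a sanity check, here is a minimal Python sketch of these two formulas (the function names are my own):

    # Laplace's Law of Succession: probability that the next draw is a
    # black ball, having drawn s black balls in n draws, under a uniform
    # prior over the fraction of black balls in the urn.
    def p_next_black(n, s=0):
        return (s + 1) / (n + 2)

    # Probability of at least one black ball in the next m draws, given
    # no black balls in the last n. The product above telescopes to
    # (n + 1)/(n + m + 1), leaving m/(n + m + 1).
    def p_black_within(n, m):
        return m / (n + m + 1)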

Applying the model

How do we choose n? We haven't literally been drawing balls from an urn, but we can approximate a 'draw' as the equivalent time/resources needed to discover a new technology. This is of course hard to estimate and highly variable, but I don't think the final answer is too sensitive to how we break the time/resources up into units.

Time

First, we can approximate draws by time alone. Suppose humans have discovered new technology at a rate of roughly one draw per year since the agricultural revolution, ~10,000 years ago. Then we've drawn ~10,000 times, so n = 10,000 and m = 100.

We get $\frac{100}{10000 + 100 + 1} = \frac{100}{10101} \approx 0.0099$.

So a 1% chance of extinction next century.

But we haven't been discovering technology at a constant rate for the last 10,000 years. Suppose we think the vast majority of technology was discovered in the last ~250 years, so n = 250 instead. Then we get

$\frac{100}{250 + 100 + 1} = \frac{100}{351} \approx 0.284$

So a 28.4% chance of extinction in the next 100 years.
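
Using the sketch above, both time-based figures drop out directly:

    print(p_black_within(10_000, 100))  # ~0.0099, the ~1% figure
    print(p_black_within(250, 100))     # ~0.2849, the ~28.4% figure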

Person-years

However, even over the last 250 years the rate of progress has been increasing a lot, and so has the population! What if we used "person-years", i.e. one draw from the urn is one year of life lived by a single person? Then we can use historical population estimates and future projections. The total person-years lived since 1750 is ~6.2 x 10^11, and the total number of person-years we can expect in the next century is ~10^12 [1]. So n = 6.2 x 10^11 and m = 10^12.

Then we get

$\frac{10^{12}}{6.2 \times 10^{11} + 10^{12} + 1} \approx 0.62$

So a 62% chance of extinction.
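
The same one-liner reproduces this figure:

    print(p_black_within(6.2e11, 1e12))  # ~0.617, the ~62% figure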

GDP

We could also use GDP instead. Suppose one draw from the urn is equivalent to $1 billion of GDP (though, as discussed below, the exact unit doesn't matter too much for the final answer). Total GDP since 1750 has been ~$3.94 x 10^15, and if we assume 2% annual growth we can expect ~$3.76 x 10^16 over the next century. So n = 3.94 x 10^6 and m = 3.76 x 10^7.

Then we get

$\frac{3.76 \times 10^{7}}{3.94 \times 10^{6} + 3.76 \times 10^{7} + 1} \approx 0.90$

So a 90% chance of extinction in the next century.
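
And likewise for the GDP figure:

    print(p_black_within(3.94e6, 3.76e7))  # ~0.905, the ~90% figure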

Uncertainties

There are quite a few things I’m uncertain about in this approach.

  • How does the model change if we update on inside view information?

    • We don't just have the information that previous technology wasn't disastrous; we also know what that technology was, and we have some sense of how hard it is for a technology to cause devastation. This is extra information not included in the model. Laplace's Law of Succession assumes a uniform prior over the possible densities of black balls in the urn, and I'm not sure how the final answers would change under a different prior.

  • Am I correct in thinking the final answer is not too sensitive to the choice of a unit of a draw?

    • From experimenting with different "units", e.g. treating $1 billion of GDP as one draw, the final answer doesn't seem too sensitive to the choice. However, I haven't shown mathematically why this is the case (see the sketch after this list).

  • Is the urn of creativity an over-simplification to the extent that this model is irrelevant?

    • We might, for example, expect the chance of a new technology being a black ball to increase over time as technology becomes more powerful. This might be analogous to the black balls sitting further down the urn. I'm also unsure how to incorporate this into the model, and whether it would update us towards a higher or lower risk than calculated above. On the one hand, the risk seems clearly higher if black balls only become more likely to be drawn in the future. On the other hand, if we would naturally not expect many black balls early in human history, then we can't infer much from not having found any.
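
On the unit-sensitivity question above, here is a rough numerical check (a sketch rather than a proof), re-using p_black_within from the earlier code. Shrinking the unit of a draw by a factor of k scales both n and m by k, and km/(kn + km + 1) approaches m/(n + m) as k grows, so for fine enough units only the ratio of future to past draws matters:

    # Shrink the unit of a "draw" by a factor k: n and m scale together,
    # and km/(kn + km + 1) converges to m/(n + m) = 100/350 ~ 0.2857.
    for k in [1, 10, 1_000, 1_000_000]:
        print(k, p_black_within(250 * k, 100 * k))
    # 1 -> 0.2849..., 10 -> 0.2856..., converging towards 0.2857...

This suggests the choice of unit mostly washes out once n and m are large, though the ratio m/n itself still depends on which quantity (years, person-years, GDP) we treat as driving the draws.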

Conclusion

Overall, I think this model shouldn’t be taken too literally. But I think the following takeaway is interesting. Given how much technological advancement there is likely to be in the future compared to the past, we cannot be that confident that future technologies will not pose significant risks based only on the lack of harm done so far.

I’m really interested in any feedback or critiques of this approach, and whether it’s already been discussed somewhere else!

[1] Check this spreadsheet for my data calculations; data obtained from Our World in Data.