What is the risk level below which you’d be OK with unpausing AI?
I think approximately 1 in 10,000 chance of extinction for each new GPT would be acceptable given the benefits of AI. This is approximately my guess for GPT-5, so if we could release that model and then pause, I’d be okay with that.
A major consideration here is the use of AI to mitigate other x-risks. Some of Toby Ord’s x-risk estimates:
If there were a concrete plan under which AI could be used to mitigate pandemics and anthropogenic risks, then I would be OK with a higher probability of AI extinction, but it seems more likely that AI progress would increase these risks before it decreased them.
AI could be helpful for climate change and eventually nuclear war. So maybe I should be willing to go a little higher on the risk. But we might need a few more GPTs to fix these problems, and if each new GPT carries a 1 in 10,000 risk then it starts to even out.
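A rough way to see where it "starts to even out": the sketch below compounds a 1 in 10,000 per-release risk over several releases and compares the total with a 1 in 1,000 risk, the figure Ord gives for nuclear war and for climate change. It assumes each release is an independent draw, which is a simplification of mine and not something claimed in the answer; the function and constant names are illustrative only.

```python
def cumulative_risk(per_release_risk: float, releases: int) -> float:
    """Chance of at least one extinction event over `releases` independent releases."""
    return 1.0 - (1.0 - per_release_risk) ** releases


PER_RELEASE_RISK = 1e-4  # 1 in 10,000 per new GPT (figure from the answer)
COMPARISON_RISK = 1e-3   # 1 in 1,000 (Ord's estimate for nuclear war, and for climate change)

# Cumulative risk after 1, 2, 5 and 10 releases (independence is an assumption).
risk_by_release = {n: cumulative_risk(PER_RELEASE_RISK, n) for n in (1, 2, 5, 10)}

for n, risk in risk_by_release.items():
    share = risk / COMPARISON_RISK
    print(f"{n:>2} releases: {risk:.6f} ({share:.0%} of the comparison risk)")
```

Under that independence assumption, roughly ten releases add up to about the 1 in 1,000 level.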
What do you think about the potential benefits from AI?
I’m very bullish about the benefits of an aligned AGI. Besides mitigating x-risk, I think curing aging should be a top priority and is worth taking some risks to obtain.
I’ve read the post quickly, but I don’t have a background in economics, so it would take me a while to fully absorb. My first impression is that it is interesting but not that useful for making decisions right now. The simplifications required by the model offset the gains in rigor. What do you think? Is it something I should take the time to understand?
My guess would be that the discount rate is pretty cruxy. Intuitively I would expect almost any gains over the next 1000 years to be offset by reductions in x-risk since we could have zillions of years to reap the benefits. (On a meta-level I believe moral questions are not “truthy” so this is just according to my vaguely total utilitarian preferences, not some deeper truth).
I think approximately 1 in 10,000 chance of extinction for each new GPT would be acceptable given the benefits of AI. This is approximately my guess for GPT-5, so I think if we could release that model and then pause, I’d be okay with that.
To me, this is wild. 1⁄10,000 * 8 billion people = 800,000 current lives lost in expectation, not even counting future lives. If you think GPT-5 is worth 800k+ human lives, you must have high expectations. :)
When you’re weighing existential risks (or other things which steer human civilization on a large scale) against each other, effects are always going to be denominated in a very large number of lives. And this is what OP said they were doing: “a major consideration here is the use of AI to mitigate other x-risks”. So I don’t think the headline numbers are very useful here (especially because we could make them far far higher by counting future lives).
So I don’t think the headline numbers are very useful here (especially because we could make them far far higher by counting future lives).
I used to prefer focussing on tail risk, but I now think expected deaths are a better metric.
Interventions in the effective altruism community are usually assessed under 2 different frameworks: existential risk mitigation and nearterm welfare improvement. It looks like 2 distinct frameworks are needed given the difficulty of comparing nearterm and longterm effects. However, I do not think this is quite the right comparison under a longtermist perspective, where most of the expected value of one’s actions results from influencing the longterm future, and the indirect longterm effects of saving lives outside catastrophes cannot be neglected.
In this case, I believe it is better to use a single framework for assessing interventions saving human lives in catastrophes and normal times. One way of doing this, which I consider in this post, is supposing the benefits of saving one life are a function of the population size.
Assuming the benefits of saving a life are proportional to the ratio between the initial and final population, and that the cost to save a life does not depend on this ratio, it looks like saving lives in normal times is better for improving the longterm future than doing so in catastrophes.
1⁄10,000 * 8 billion people = 800,000 current lives lost in expectation
The expected death toll would be much greater than 800 k assuming a typical tail distribution. The 800 k figure is the expected death toll linked solely to the maximum severity (extinction), but lower levels of severity would add to it. Assuming deaths follow a Pareto distribution with a tail index of 1.60, which characterises war deaths, the minimum deaths would be 25.3 M (= 8*10^9*(10^-4)^(1/1.60)). Consequently, the expected death toll would be 67.6 M (= 1.60/(1.60 − 1)*25.3*10^6), i.e. 1.11 (= 67.6/61) times the number of deaths in 2023, or 111 (= 67.6/0.608) times the number of malaria deaths in 2022. I certainly agree undergoing this risk would be wild.
Side note. I think the tail distribution will eventually decay faster than that of a Pareto distribution, but this makes my point stronger. In this case, the product between the deaths and their probability density would be lower for higher levels of severity, which means the expected deaths linked to such levels would represent a smaller fraction of the overall expected death toll.
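For anyone who wants to rerun the Pareto arithmetic, here is a minimal sketch. It takes the standard Pareto survival function P(X >= x) = (x_min / x)^alpha, picks x_min so that the chance of deaths reaching the whole population equals the 1 in 10,000 extinction risk, and then uses the Pareto mean alpha / (alpha − 1) * x_min. The function and variable names are mine, and the inputs are the rounded ones quoted in the comment.

```python
def pareto_expected_deaths(population: float, extinction_prob: float, tail_index: float):
    """Return (minimum deaths, expected deaths) for a Pareto tail.

    x_min is chosen so that P(deaths >= population) == extinction_prob,
    using the survival function P(X >= x) = (x_min / x) ** tail_index.
    """
    if tail_index <= 1.0:
        raise ValueError("the Pareto mean is only finite for tail_index > 1")
    x_min = population * extinction_prob ** (1.0 / tail_index)
    mean = tail_index / (tail_index - 1.0) * x_min
    return x_min, mean


# Inputs taken from the comment: 8 billion people, 1 in 10,000 risk, tail index 1.60.
min_deaths, expected_deaths = pareto_expected_deaths(8e9, 1e-4, 1.60)

print(f"minimum deaths:  {min_deaths / 1e6:.1f} M")
print(f"expected deaths: {expected_deaths / 1e6:.1f} M")
```

With these rounded inputs the sketch gives a minimum of about 25.3 M and a mean of about 67.5 M, close to the figures in the comment. The mean of an untruncated Pareto also counts outcomes above the total population, so treat it as a rough upper-leaning estimate.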
A major consideration here is the use of AI to mitigate other x-risks. Some of Toby Ord’s x-risk estimates:
AI − 1 in 10
Engineered pandemics − 1 in 30
Unforeseen anthropogenic risks (e.g. dystopian regime, nanotech) − 1 in 30
Other anthropogenic risks − 1 in 50
Nuclear war − 1 in 1000
Climate change − 1 in 1000
Other environmental damage − 1 in 1000
Supervolcano − 1 in 10,000
Thanks for elaborating, Joseph!
I think Toby’s existential risk estimates are many orders of magnitude higher than warranted. I estimated an annual extinction risk of 5.93*10^-12 for nuclear wars, 2.20*10^-14 for asteroids and comets, 3.38*10^-14 for supervolcanoes, a prior of 6.36*10^-14 for wars, and a prior of 4.35*10^-15 for terrorist attacks. These values are already super low, but I believe existential risk would still be orders of magnitude lower. I think there would only be a 0.0513 % (= e^(-10^9/(132*10^6))) chance of a repetition of the last mass extinction 66 M years ago, the Cretaceous–Paleogene extinction event, being existential. I got my estimate assuming:
An exponential distribution with a mean of 132 M years (= 66*10^6*2) represents the time between i) human extinction in such a catastrophe and ii) the evolution of an intelligent sentient species after such a catastrophe. I supposed this on the basis that:
An exponential distribution with a mean of 66 M years describes the time between:
2 consecutive such catastrophes.
i) and ii) if there are no such catastrophes.
Given the above, i) and ii) are equally likely. So the probability of an intelligent sentient species evolving after human extinction in such a catastrophe is 50 % (= 1⁄2).
Consequently, one should expect the time between i) and ii) to be 2 times (= 1⁄0.50) as long as it would be if there were no such catastrophes.
An intelligent sentient species has 1 billion years to evolve before the Earth becomes uninhabitable.
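The 0.0513 % figure follows directly from these assumptions, and a short sketch makes the steps explicit. For an exponentially distributed waiting time T with a given mean, P(T > t) = e^(-t/mean), so the chance that no intelligent sentient species evolves within the 1 billion year window is e^(-10^9/(132*10^6)). The names below are illustrative and not from the original estimate.

```python
import math


def prob_no_successor(window_years: float, mean_years: float) -> float:
    """P(T > window_years) for an exponentially distributed T with the given mean."""
    return math.exp(-window_years / mean_years)


BASE_MEAN_YEARS = 66e6       # mean time between such catastrophes (from the comment)
PROB_EVOLVING_FIRST = 0.5    # evolution is as likely to come first as the next catastrophe
WINDOW_YEARS = 1e9           # time available to evolve (from the comment)

# Halving the success probability doubles the expected waiting time: 66 M / 0.5 = 132 M years.
effective_mean_years = BASE_MEAN_YEARS / PROB_EVOLVING_FIRST

p_existential = prob_no_successor(WINDOW_YEARS, effective_mean_years)
print(f"effective mean: {effective_mean_years / 1e6:.0f} M years")
print(f"chance the catastrophe is existential: {p_existential:.4%}")
```

The result is quite sensitive to the assumed mean: it sits in the exponent, so doubling it again would raise the probability by more than an order of magnitude.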