AGI x-risk timelines: 10% chance (by year X) estimates should be the headline, not 50%.
Artificial General Intelligence (AGI) poses an existential risk (x-risk) to all known sentient life. Given the stakes involved—the whole world/future light cone—we should, by default, be looking at 10% chance-of-AGI-by timelines as the deadline for adequate preparation (alignment), rather than 50% (median) chance-of-AGI-by timelines, which seem to be the current default.
We should regard timelines of ≥10% probability of AGI in ≤10 years as crunch time. Given that there is already an increasingly broad consensus around this[1], we should be treating AGI x-risk as an urgent immediate priority, not something to mull over leisurely as part of a longtermist agenda. Thinking that we have decades to prepare (median timelines) is gambling with a huge number of present-day human lives, let alone the cosmic endowment.
Of course it’s not just time to AGI that is important. It’s also the probability of doom given AGI happening at that time. A recent survey of people working on AI risk gives a median of 30% for “level of existential risk” from “AI systems not doing/optimizing what the people deploying them wanted/intended”.[2]
To borrow from Stuart Russell’s analogy: if there were a 10% chance of aliens landing in the next 10-15 years[3], humanity would be doing a lot more than we are currently doing[4]. AGI is akin to an alien species more intelligent than us that is unlikely to share our values.
[1] Note that Holden Karnofsky’s all-things-considered (and IMO conservative) estimate for the advent of AGI is >10% chance in (now) 14 years. Anecdotally, the majority of people I’ve spoken to on the current AGISF course have estimates of a 10% chance within 10 years or less. Yet most people in EA at large seem to put more emphasis on the 50% estimates, which are in the 2050-2060 range.
[2] I originally wrote “...probability of doom given AGI. I think most people in AI Alignment would regard this as >50% given our current state of alignment knowledge and implementation. Correct me if you think this is wrong; it would be interesting to see a recent survey on this”, and was linked to a recent survey!
Note that there is a mismatch with the framing in my post, in that the survey implicitly incorporates time to AGI, for which the median estimate amongst those surveyed is presumably significantly later than 10 years. This suggests that P(doom|AGI in 10 years) would be estimated to be higher. It would be good to have a survey asking the following questions:
1. Year with 10% chance of AGI.
2. P(doom|AGI in that year).
(We can operationalise “doom” as Ord’s definition: “the greater part of our potential is gone and very little remains”; although I pretty much think of it as being paperclipped or equivalent, so that ~0 value remains.)
[3] This is different to the original analogy, which was an email saying: “People of Earth: We will arrive on your planet in 50 years. Get ready.” Say astronomers spotted something that looked like a spacecraft heading in approximately our direction, and estimated there was a 10% chance that it was indeed a spacecraft heading to Earth.
[4] Although perhaps we wouldn’t. Maybe people would endlessly argue about whether the evidence is strong enough to declare a >10% probability. Or flatly deny it.
It’s wrong.
Thanks, have changed it to 30%, given the median answer to question 2 (level of existential risk from “AI systems not doing/optimizing what the people deploying them wanted/intended”).
I’ll note that I find this somewhat surprising. What are the main mechanisms whereby AGI ends up default aligned/safe? Or are most people surveyed thinking that alignment will be solved in time (/is already essentially solved)? Or are people putting significant weight on non-existential GCR-type scenarios?
Some relevant writing:
- AN #80: Why AI risk might be solved without additional intervention from longtermists
- Is power-seeking AI an existential risk? (AN #170)
- Late 2021 MIRI conversations
I know there can be a tendency for goalposts to shift with these kinds of technology forecasting questions (fusion being a famous example), but I note that at least some people have been consistently sticking to their AGI timelines over more than a decade, Shane Legg being a prime example: he gave an expected-value (50%) estimate of AGI by 2028 way back in 2009 (which he claims dates back a further decade), and he still maintains it(!)
My sense is that it would be good if you provided more data supporting the claim that there’s a consensus that there is a ≥10% chance of AGI in ≤10 years (I took that to be your claim).
Note that whilst “≥10% chance of AGI in ≤10 years” is my claim, the aim of this post is also to encourage people to state their 10%-chance-by-year-X timeline estimates front and centre, as that is what should be action relevant—i.e. regarded as the deadline for solving Alignment (not 50%-chance-by-year-Y estimates).
Not quite 10 years, but Holden’s estimate (of more than a 10% chance in 14 years) is based on a series of in-depth reports, with input from a wide variety of experts, looking at the estimation problem from a number of angles (see the table near the top of this post). And given that he says “more than”, presumably 10% exactly would be (at least slightly) sooner, so I think it’s not unreasonable to say that “≥10% chance of AGI in ≤10 years” has broad support (at least from the relevant experts).
I disagree, and think that you can’t take a claim that there’s more than 10% chance in 14 years as evidence of your claim that there’s a consensus of 10% or more in 10 years or less.
Also, the fact that Holden makes that estimate doesn’t show that there’s a broad consensus in support of that estimate. “Consensus” means “a generally accepted opinion; wide agreement”. And the fact that Holden has come to a particular conclusion based on having talked to many experts doesn’t show that there is wide agreement among those experts.
These kinds of estimates are imprecise by nature, so perhaps I should change “≥10% probability of AGI in ≤10 years” to “~10% within 10 years”? (To me at least, “more than 10% in 14 years” translates to something like 6 or 7% in 10 years, which I don’t think would alter my actions much. I don’t think crossing the “less than a decade to sort this out” line depends on the estimate a decade out being firmly in double-figure percentages by all accounts; rather, I think “likely at least close to 10%”, according to an increasing number of knowledgeable value-aligned people, is more than enough for action.)
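As a rough illustration of where that “6 or 7%” figure could come from (a minimal sketch, assuming a constant annual hazard rate for AGI arrival; this is just one way of interpolating and not something stated in the thread):

```python
import math

# Assumption: a constant annual "hazard rate" for AGI arrival, so that
# P(AGI within t years) = 1 - exp(-lam * t). Purely illustrative.

p_14 = 0.10                           # stated: >10% chance of AGI within 14 years (taken as exactly 10%)
lam = -math.log(1 - p_14) / 14        # implied annual hazard rate

p_10 = 1 - math.exp(-lam * 10)        # implied chance of AGI within 10 years
print(f"Implied P(AGI within 10 years) ≈ {p_10:.1%}")   # ≈ 7.3%

# A simple linear interpolation gives a similar figure: 0.10 * 10 / 14 ≈ 7.1%
```

Either way of interpolating lands in the 6-7% range mentioned above.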
Holden’s estimate is based on the series of in-depth reports (linked above), reviewed by multiple experts, each of which comes to similar conclusions. I’ll note that I said “increasingly broad consensus”, which is a somewhat ambiguous phrasing. Would it help if I changed it to “consensus forming”?
I think you should gather more data (e.g. via surveys) on what credences experts would assign to AGI within 10 years.
I’m not sure if I’m the best person to do this. Would be good to see AI Impacts do a follow-up to their survey of 6 years ago.
We will be posting a follow-up to Grace et al., based on a 2019 survey, soon. I can link it here once it is up. I’ll also make sure the figures for shorter numbers of years from now, or lower percent probabilities, are noted in the article somewhere.
Edit: For what it’s worth, it is around 8% by 2032 in this sample.
Cool, thanks!
The preprint is up and can be found here. Table S7 in the appendix may be particularly useful for answering some of the above. There will be two new surveys this year that gather new data on HLMI forecasts, and the results will be out a lot faster this time round.
Thanks! Here is Table S7 (I’ve highlighted the relevant years):
I’m thinking that it would be good to have a survey with the following 2 questions:
1. Year with 10% chance of AGI.
2. P(doom|AGI in that year).
What do you think EA as a movement should do differently if we took seriously the views that (1) “≥10% probability of AGI in ≤10 years is crunch time”, (2) “crunch time means we should be doing a lot more than we are currently doing”, and (3) “‘≥10% probability of AGI in ≤10 years’ is true”?
This. Most of the other AGI x-risk related ideas on the FTX Future Fund project ideas comment thread. And just generally be allocating about 100-1000x more resources to the problem. Maybe that’s too much for EA as it currently stands. But in an ideal world, more resources would be devoted to AI Alignment than to AI capabilities.
I realised that there was a missing step in the reasoning of the first paragraph (relating to the title), so I’ve edited it (and split it into two). Previously it read:
Interested to see answers to this question (given AGI arriving in the year that you think there is a 10% chance of it arriving by, what is the probability of an existential catastrophe ensuing?):
[Note that ideally I wanted there to be 2 questions:
1. Year with 10% chance of AGI.
2. P(doom|AGI in that year)
But Elicit embeds only allow for questions with a % answer, not a numerical (year) answer.]
A more general version would be for people to add counterparts to their probability distributions here, giving P(doom|AGI in year y) for each year (factoring in expected progress in alignment); the sketch below shows how the two would combine.
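For concreteness, here is how such a pair of distributions would combine into an overall risk figure (a minimal sketch; the numbers below are made up purely for illustration and are nobody’s actual estimates):

```python
# Sketch: combining a timeline distribution with conditional doom estimates.
# All numbers are invented for illustration only.

p_agi_in_year = {2025: 0.02, 2030: 0.08, 2035: 0.15, 2040: 0.20}   # P(AGI first arrives in year y)
p_doom_given_agi = {2025: 0.5, 2030: 0.4, 2035: 0.3, 2040: 0.2}    # P(doom | AGI in year y), falling as alignment progresses

# Unconditional P(doom via AGI by the final year) =
#   sum over years y of P(AGI in year y) * P(doom | AGI in year y)
p_doom = sum(p_agi_in_year[y] * p_doom_given_agi[y] for y in p_agi_in_year)
print(f"P(doom via AGI by 2040) ≈ {p_doom:.1%}")   # ≈ 12.7% with these made-up numbers
```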
Note: first posted as a shortform here.