Report on Semi-informative Priors for AI timelines (Open Philanthropy)
This is a linkpost for https://www.openphilanthropy.org/blog/report-semi-informative-priors
I’ve cross-posted the introduction so people can see what it’s about. Happy to respond to questions and comments here!
One of Open Phil’s major focus areas is technical research and policy work aimed at reducing potential risks from advanced AI.
To inform this work, I have written a report developing one approach to forecasting when artificial general intelligence (AGI) will be developed. By AGI, I mean computer program(s) that can perform virtually any cognitive task as well as any human, for no more money than it would cost for a human to do it. The field of AI is largely understood to have begun in Dartmouth in 1956, and since its inception one of its central aims has been to develop AGI.
How should we forecast when powerful AI systems will be developed? One approach is to construct a detailed estimate of the development requirements and when they will be met, drawing heavily on evidence from AI R&D. My colleague Ajeya Cotra has developed a framework along these lines.
We think it’s useful to approach the problem from multiple angles, and so my report takes a different perspective. It doesn’t take into account the achievements of AI R&D and instead makes a forecast based on analogous historical examples.
In brief:
My framework estimates pr(AGI by year X): the probability we should assign to AGI being developed by the end of year X.
I use the framework to make low-end and high-end estimates of pr(AGI by year X), as well as a central estimate.
pr(AGI by 2100) ranges from 5% to 35%, with my central estimate around 20%.
pr(AGI by 2036) ranges from 1% to 18%, with my central estimate around 8%.
The probabilities over the next few decades are heightened due to current fast growth of the number of AI researchers and of the computation used in AI R&D.
These probabilities should be treated with caution, for two reasons:
The framework ignores some of our evidence about when AGI will happen. It restricts itself to outside view considerations—those relating to how long analogous developments have taken in the past. It ignores evidence about how good current AI systems are compared to AGI, and how quickly the field of AI is progressing. It does not attempt to give all-things-considered probabilities.
The predictions of the framework depend on a number of highly subjective judgement calls. There aren’t clear historical analogies to the development of AGI, and interpreting the evidence we do have is difficult. Other authors would have made different judgements and arrived at somewhat different probabilities. Nonetheless, I believe thinking about these issues has made my probabilities more reasonable.
We have made an interactive tool where people can specify their own inputs to the framework and see the resultant pr(AGI by year X).
The structure of the rest of this post is as follows:
First I explain what kinds of evidence my framework does and does not take into account.
Then I explain where my results come from on a high level, without getting into the maths (more here).
I give some other high-level takeaways from the report (more here).
I describe my framework in greater detail, including the specific assumptions used to derive the results (more here).
Three academics reviewed the report. At the bottom I link to their reviews (more here).
Some notes on the Laplace prior:
On footnote 16, you write: “For example, the application of Laplace’s law described below implies that there was a 50% chance of AGI being developed in the first year of effort”. But historically, participants in the Dartmouth conference were gloriously optimistic.
When you write “I also find that pr(AGI by 2036) from Laplace’s law is too high,” what outside-view consideration are you basing that on? Also, is it really too high?
If you rule out AGI until 2028 (as you do in your report), the Laplace prior gives you 1 - (1 - 1/[(2028-1956)+1])^(2036-2028) ≈ 10.4% ≈ 10%, which is well within your range of 1% to 18%, and really near your estimate of 8%.
The point that Laplace’s prior depends on the unit of time chosen is really interesting, but it ends up not mattering once a bit of time has passed. For example, if we choose to use days instead of years, with (days since June 18 1956=23660, days until Mar 29 2028=2557, days until Jan 1 2036=5391), then Laplace’s rule would give for the probability of AGI until 2036: 1 - (1-[1/(23660+2557+1)])^(5391-2557) = 10.2% ≈ 10%, pretty much the same as above.
It’s fun to see that (1-(1/x))^x converges to 1/e pretty quickly, and that changing from years to days is equivalent to changing from ~(1-(1/x))^(x*r) to ~(1-(1/(365*x)))^(365*x*r), where x is the time passed in years and x*r is the time remaining in years. But both converge pretty quickly to (1/e)^r.
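As a rough check of the arithmetic above, the following Python sketch reproduces the year-based and day-based figures using the same constant per-trial approximation as this comment, and compares them with the (1/e)^r limit. The function name `pr_by_constant_rate` is made up for illustration; the day counts are the ones quoted above.

```python
import math

def pr_by_constant_rate(failed_trials, further_trials):
    """Chance of at least one success in `further_trials`, holding the
    per-trial probability fixed at 1 / (failed_trials + 1)."""
    p_per_trial = 1 / (failed_trials + 1)
    return 1 - (1 - p_per_trial) ** further_trials

# Trials are years: 72 failed years by 2028, then 8 more years to 2036.
pr_years = pr_by_constant_rate(2028 - 1956, 2036 - 2028)

# Trials are days, using the day counts quoted in the comment.
pr_days = pr_by_constant_rate(23660 + 2557, 5391 - 2557)

# Limiting form: (1 - 1/x)^(x*r) -> (1/e)^r, where r is the ratio of
# time remaining to time passed (here 8 years over 73).
r = (2036 - 2028) / (2028 - 1956 + 1)
pr_limit = 1 - math.exp(-r)

print(f"years: {pr_years:.1%}, days: {pr_days:.1%}, limit: {pr_limit:.1%}")
```

Both trial definitions land near 10%, and both sit close to the limiting value, which is the sense in which the choice of unit stops mattering once many trials have passed.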
It is not clear to me that by adjusting the Laplace prior down when you categorize AGI as a “highly ambitious but feasible technology” you are not updating twice: once on the actual passage of time and again on the fact that AGI seems “highly ambitious”. But one knows that AGI is “highly ambitious” because it hasn’t been solved in the first 65 years.
Given that, I’d still be tempted to go with the Laplace prior for this question, though I haven’t really digested the report yet.
Thanks for these thoughts! You raise many interesting points.
I’m not sure whether the participants at Dartmouth would have assigned 50% to creating AGI within a year and >90% within a decade, as implied by the Laplace prior. But either way I do think these probabilities would have been too high. It’s very rare, perhaps unprecedented, for such transformative tech progress to be made with so little effort. Even listing some of the best examples of quick and dramatic tech progress, I found the average time for a milestone to be achieved was >50 years, and the list omits the many failed projects.
That said, I agree that the optimism before Dartmouth is some reason to use a high first-trial probability (though I don’t think as high as 50%).
Agreed! (Interestingly, it only doesn’t matter once enough time has passed that Laplace strongly expects AGI to have already happened.) Still, Laplace’s predictions about the initial years of effort do depend on the trial definition: defining a ‘trial’ as 1 day, 1 year, or 30 years gives very different results. I think this shows something is wrong with the rule more generally. The root of the problem is that Laplace assigns 50% probability to the first trial succeeding no matter how we define a trial. I think my alternative rule, where you choose the trial definition and the first-trial probability in tandem, addresses this issue.
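To make the dependence on the trial definition concrete, here is a small Python sketch using the standard closed form of Laplace's rule of succession: with no earlier trials, the chance of at least one success in the first m trials is m/(m+1). The function name and the choice of a 30-year horizon, split into days, years, or a single 30-year block, are illustrative and not taken from the report.

```python
from fractions import Fraction

def laplace_pr_within(trials):
    """Laplace's rule from a standing start: the chance that all of the
    first `trials` trials fail is (1/2)(2/3)...(m/(m+1)) = 1/(m+1)."""
    return 1 - Fraction(1, trials + 1)

horizon_years = 30
trial_definitions = [
    ("1 day", horizon_years * 365),   # ignoring leap days
    ("1 year", horizon_years),
    ("30 years", 1),
]

for label, trials in trial_definitions:
    pr = laplace_pr_within(trials)
    print(f"trial = {label:>8}: pr(success within 30 years) = {float(pr):.4f}")
```

With one-year trials the same function gives 1/2 for the first year and 10/11 (about 91%) for the first decade, which matches the 50% and >90% figures mentioned earlier in this reply. With a single 30-year trial it gives 50% for the whole period instead.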
My estimate of 8% only rules out AGI by the end of 2020. If I rule out AGI by the end of 2028, it becomes ~4%. This is quite a lot smaller than the 10% from Laplace.
The top of my range would be 9%, which is close to Laplace. However, this high-end is driven by forecasting that the inputs to AI R&D will grow faster than their historical average, so more trials occur per year. I don’t think such high values would be reasonable without taking these forecasts into account.
I find it too high mostly because it follows from aggressive assumptions about the chance of success in the first few years of effort, but also because of the reference classes discussed in the report.
Another way to justify ruling out Laplace is that if you had a hyper-prior, putting some weight on Laplace and some on more conservative rules, you would put extremely little weight on Laplace by now. (Although I personally wouldn’t put much weight on Laplace even in an initial hyper-prior.)
There’s a counter-intuitive example that illustrates this hyper-prior behaviour nicely. Suppose you assigned 20% to “AGI impossible” and 80% to another prior. If the other prior is Laplace, then your weight on “AGI impossible” rises to 92% by 2020, and you only assign 8% to Laplace. Your pr(AGI by 2036) is 1.6%. By contrast, if you reduce the first-trial probability in Laplace down to 1⁄100 then your weight on “AGI impossible” only rises to 29% by 2020 and your pr(AGI by 2036) is 6.3%. So having a lower first-trial probability ends up increasing pr(AGI by 2036).
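The hyper-prior update can be sketched in Python as follows. This is a simplified reconstruction, not the report's own code. It assumes one trial per calendar year, with 64 failed trials by the end of 2020 and 16 further trials by the end of 2036, and it models each prior as a Beta(1, β) distribution whose first-trial probability is 1/(1+β). All function names are hypothetical. Under these assumptions the 1⁄100 case reproduces the 29% and 6.3% figures. The Laplace case comes out close to, but not identical with, the figures quoted above, so the original calculation presumably differs in some detail.

```python
def pr_no_success(first_trial_prob, trials):
    """Chance that `trials` trials all fail under a Beta(1, beta) prior
    with first-trial probability 1 / (1 + beta)."""
    beta = 1 / first_trial_prob - 1
    return beta / (beta + trials)

def hyper_prior_update(first_trial_prob, weight_impossible=0.2,
                       failed_trials=64, further_trials=16):
    """Return (posterior weight on 'AGI impossible', pr(AGI by 2036))."""
    # Likelihood of the observed failures: 1 if AGI is impossible,
    # pr_no_success(...) under the other prior.
    p_fail_so_far = pr_no_success(first_trial_prob, failed_trials)
    w_possible = (1 - weight_impossible) * p_fail_so_far
    post_impossible = weight_impossible / (weight_impossible + w_possible)

    # Chance of success in the further trials, given the failures so far
    # and given that AGI is possible.
    p_fail_to_end = pr_no_success(first_trial_prob,
                                  failed_trials + further_trials)
    pr_given_possible = 1 - p_fail_to_end / p_fail_so_far

    return post_impossible, (1 - post_impossible) * pr_given_possible

for label, ftp in [("Laplace (1/2)", 1 / 2), ("1/100", 1 / 100)]:
    w_imp, pr_2036 = hyper_prior_update(ftp)
    print(f"{label:>13}: weight on impossible = {w_imp:.1%}, "
          f"pr(AGI by 2036) = {pr_2036:.1%}")
```

The qualitative point survives the simplifications: the more aggressive prior loses almost all of its weight to “AGI impossible” after decades of failure, so the more conservative prior ends up with the higher pr(AGI by 2036).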
This is an interesting idea, thanks. I think the description “highly ambitious” would have been appropriate in 1956: AGI would allow automation of ~all labour. In addition, it did seem hard to me to find reference classes supporting first-trial probability values above 1⁄50, and some reference classes I looked into suggest lower values.
That said, it’s possible that my favoured range for the first-trial probability [1⁄100, 1/1000] was influenced by my knowledge that we failed to develop AGI. If so, this would have made the range too conservative.
This prior should also work for other technologies sharing these reference classes. Examples might include a tech suite amounting to ‘longevity escape velocity’, mind reading, fully-immersive VR, or highly accurate 10+ year forecasting.
Agreed—the framework can be applied to things other than AGI.
Random thought on anthropics:
If AGI had been developed early and been highly dangerous, one can’t update on not seeing it.
Anthropic reasoning might also apply to calculating the base rate of AGI; in the worlds where it existed and was beneficial, one might not be trying to calculate its a priori outside view.
Actually, I expected Gott’s equation to be mentioned here, as his Doomsday argument is a contemporary version of Laplace’s equation.
Also, qualified observers are not distributed linearly inside this period of time, from the idea of AI to the creation of AI. If we assume that qualified observers are those who are interested in AI timing, then it looks like such people are much more numerous closer to the end of the period. As a result, a random qualified observer should find themselves closer to the end of the period. If the number of qualified observers is growing exponentially, the median is just one doubling before the end. This moves the AI timing prior closer to current events.
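The “one doubling before the end” claim can be checked with a short Python sketch. It assumes the density of qualified observers grows as 2^(t/d) over a period of length T, where d is the doubling time. The 64-year period and 4-year doubling time are made-up inputs, and the function name is hypothetical.

```python
import math

def median_observer_time(period, doubling_time):
    """Time by which half of all observers have lived, when observer
    density grows as 2 ** (t / doubling_time) over [0, period]."""
    # Cumulative observers up to time t are proportional to
    # 2 ** (t / doubling_time) - 1; solve for the halfway point.
    total = 2 ** (period / doubling_time) - 1
    return doubling_time * math.log2(1 + total / 2)

period, doubling_time = 64, 4   # hypothetical inputs, in years
median = median_observer_time(period, doubling_time)
print(f"median observer sits {period - median:.2f} years before the end")
```

For any period spanning many doublings, the gap between the median observer and the end of the period is almost exactly one doubling time, whatever the length of the period.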