Why AGI Timeline Research/​Discourse Might Be Overrated

TL;DR: Research and discourse on AGI timelines aren’t as helpful as they may at first appear, and a lot of the low-hanging fruit (i.e. motivating AGI-this-century as a serious possibility) has already been plucked.

Introduction

A very common subject of discussion among EAs is “AGI timelines.” Roughly, AGI timelines, as a research or discussion topic, refer to how long it will take before very general AI systems meriting the moniker “AGI” are built, deployed, etc. (one could flesh this definition out and poke at it in various ways, but I don’t think the details matter much for my thesis here—see “What this post isn’t about” below). After giving some context and scoping, I argue below that while the topic is important in absolute terms, improving the quality of AGI timeline estimates isn’t as useful as it may first appear.

Just in the past few months, a lot of digital ink has been spilled, and countless in-person conversations have occurred, about whether recent developments in AI (e.g. DALL-E 2, Imagen, PaLM, Minerva) suggest a need for updating one’s AGI timelines to be shorter. Interest in timelines has motivated a lot of investment in surveys, in research on variables that may be correlated with timelines (such as compute), etc. At least dozens of smart-person-years have been spent on this question; possibly the number is more like hundreds or thousands.

AGI timelines are, at least a priori, very important to reduce uncertainty about, to the extent that’s possible. Whether one’s timelines are “long” or “short” could be relevant to how one makes career investments—e.g. “exploiting” by trying to maximize influence over AI outcomes in the near-term, or “exploring” by building up skills that can be leveraged later. Timelines could also be relevant to what kinds of alignment research directions are useful, and which policy levers to consider (e.g. whether a plan that may take decades to pan out is worth seriously thinking about, or whether the “ship will have sailed” before then).

I buy those arguments to an extent, and indeed I have spent some time myself working on this topic. I’ve written or co-authored various papers and blog posts related to AI progress and its conceptualization/measurement, I’ve contributed to papers and reports that explicitly made forecasts about what capabilities were plausible on a given time horizon, and I have participated in numerous surveys/scenario exercises/workshops/conferences etc. where timelines loomed large. And being confused/intrigued by people’s widely varying timelines is part of how I first got involved in AI, so the topic has a special place in my heart. I’ll certainly keep doing some things related to timelines myself, and think some others with special knowledge and skills should also continue to do so.

But I think that, as with many research and discussion topics, there are diminishing returns on trying to understand AGI timelines better and talking widely about them. A lot of the low-hanging fruit from researching timelines has already been plucked, and even much higher levels of certainty on this question (if that were possible) wouldn’t have all the benefits that might naively be suspected.

I’m not sure exactly how much is currently being invested in timeline research, so I am deliberately vague here as to how big a correction, if any, is actually needed compared to the current level of investment. As a result of feedback on this post, I may find out that there’s actually less work on this than I thought, that some of my arguments are weaker than I thought, etc., and update my views. But currently, while I think timelines should be valued very highly compared to a random research topic, I suspect that many reading this post may have overly optimistic views on how useful timelines work can be.

What this post isn’t about

Again, I’m not saying no one should work on timelines. Some valuable work has indeed been done and is happening right now. But you should have very good responses to the claims below if you think you should be betting your career on it, or spending big fractions of your time thinking and talking about it informally, given all the other things you could be working on.

I’m also not going to go into detail about what I or others mean by AGI, even though one could make a lot of “timelines are overrated”-type arguments by picking at this issue. For example, perhaps (some) timeline discourse reinforces a discontinuous model of AI progress that could be problematic, perhaps a lot of AGI timeline discourse just involves people talking past each other, and perhaps our definitions and metrics for progress aren’t as useful as they could be. Those all feel like plausible claims to me but I don’t need to take a position on them in order to argue for the “maybe overrated” thesis. Even for very precise definitions amenable to AGI skeptics, including ones that allow for the possibility of gradual development, I still think there may not be as much value there as many think. Conversely, I think more extreme versions of such criticisms (e.g. that AGI is a crazy/​incoherent thing to talk about) are also wrong, but won’t go into that here.

Lastly, while I work at OpenAI and my perspective has been influenced in part by my experience of doing a lot of practical AI policy work there, this blog post just represents my own views, not my org’s or anyone else’s.

Reason 1: A lot of the potential value of timeline research and discourse has already been realized

In retrospect and at a high level, there are several plausible reasons why the initial big investment in timeline research/discourse made sense (I would have to double-check exactly what specific people said about their motivations for working on it at the time). Two stand out to me:

  • To reduce uncertainty about the issue in order to inform decision-making

  • To build a credible evidence base with which to persuade people that AGI is a non-crazy thing to think/​talk about

I will say more later about why I think the first motivation is less compelling than it first sounds, but for now I will focus on the second bullet.

It probably made a lot of sense to do an initial round of surveys of AI researchers about their views on AGI when no such surveys had been done in decades and the old ones had big methodological issues. And likewise, encouraging people to express their individual views re: AGI’s non-craziness (e.g. in interviews, books, etc.) was useful when there wasn’t a long list of credible expert quotes to draw on.

But now:

  • We have credible surveys of AI/ML researchers showing clearly that AGI this century is considered plausible by “experts.”

  • There are numerous recent examples of ~all experts under-predicting AI progress to point to, which can easily motivate claims like “we are often surprised/could be surprised again, so let’s get prepared.”

  • There’s a whole book taking AGI seriously by someone with ~unimpeachable AI credentials (Stuart Russell, co-author of the leading AI textbook).

  • There are tons of quotes/talks/interviews etc. from many leaders in ML in which they take AGI in the next few decades seriously.

  • There are tons of compelling papers and reports carefully making the point that, even for extremely conservative assumptions around compute and other variables, AGI this century seems very plausible if not likely.

  • And AGI has now been mentioned in a non-dismissive way in various official government reports.

Given all of that, and again ignoring the first bullet for now, I think there’s much less to be accomplished on the timeline front than there used to be. The remaining value is primarily in increasing confidence, refining definitions, reconciling divergent predictions across different question framings, etc., which could be important—but perhaps not as much as one might think.

Reason 2: Many people wouldn’t update much on a stronger evidence base even if we had one (and that’s fine)

Despite the litany of existing reasons to take AGI “soonish” seriously that I mentioned above, some people still aren’t persuaded. Those people are unlikely, in my view, to be persuaded by (slightly) more numerous and better versions of the same stuff. However, that’s not a huge deal—complete (expert or global) consensus is neither necessary nor sufficient for policymaking in general. There is substantial disagreement even about how to explain and talk about current AI capabilities, let alone future ones, and nevertheless everyday people do many things to reduce current and future risks.

Reason 3: Even when timeline information is persuasive to relevant stakeholders, it isn’t necessarily that actionable

David Collingridge famously posed a dilemma for technology governance—in short, many interventions happen too early (when you lack sufficient information) or too late (when it’s harder to change things). Collingridge’s solution was essentially to take an iterative approach to governance, with reversible policy interventions. But, people in favor of more work on timelines might ask, why don’t we just frontload information gathering as much as possible, and/​or take precautionary measures, so that we can have the best of both worlds?

Again, as noted above, I think there’s some merit to this perspective, but it can easily be overstated. In particular, in the context of AI development and deployment, there is only so much value to knowing in advance that capabilities are coming at a certain time in the future (at least, assuming that there are some reasonable upper bounds on how good our forecasts can be, on which more below).

Even when my colleagues and I, for example, believed with a high degree of confidence that language understanding/​generation and image generation capabilities would improve a lot between 2020 and 2022 as a result of efforts that we were aware of at our org and others, this didn’t help us prepare *that* much. There was still a need for various stakeholders to be “in the room” at various points along the way, to perform analysis of particular systems’ capabilities and risks (some of which were not, IMO, possible to anticipate), to coordinate across organizations, to raise awareness of these issues among people who didn’t pay attention to those earlier bullish forecasts/​projections (e.g. from scaling laws), etc. Only some of this could or would have gone more smoothly if there had been more and better forecasting of various NLP and image generation benchmarks over the past few years.

I don’t see any reason why AGI will be radically different in this respect. We should frontload some of the information gathering via foresight, for sure, but there will still be tons of contingent details that won’t be possible to anticipate, as well as many cases where knowing that things are coming won’t help that much because having an impact requires actually “being there” (both in space and time).

Reason 4: Most actions that need to be taken are insensitive to timelines

One reason why timelines could be very important is if there were huge differences between what we’d do in a world where AGI is coming soon and a world where AGI is coming in the very distant future. On the extremes (e.g. 1 year vs. 100 years), I think there are in fact such differences, but for a more reasonable range of possibilities, I think the correct actions are mostly insensitive to timeline variations.

Regardless of timelines, there are many things we need to be making progress on as quickly as possible. These include:

  • Improving discourse and practice around publication norms in AI

  • Improving the level of rigor of risk assessment and management for developed and deployed AI systems

  • Improving dialogue and coordination among actors building powerful AI systems, to avoid reinvention of the wheel re: safety assessments and mitigations

  • Getting competent, well-intentioned people into companies and governments to work on these things

  • Getting serious AI regulation started in earnest

  • Doing basic safety and policy research

Many of the items on such a list of “reasonable things to do regardless of timelines” can be motivated on multiple levels—for example, doing a good job assessing and managing the risks of current AI systems can be important at an object level, and also important for building good norms in the AI community, or for gaining experience in applying/debugging certain methods, which will then influence how the next generation of systems is handled. It’s very easy to imagine cases where different timelines lead to widely varying conclusions, but, as I’ll elaborate on in the next section, I don’t find this very common in practice.

To take one example of a type of intervention where timelines might be considered to loom large, efforts to raise awareness of risks from AI (e.g. among grad students or policymakers) are not very sensitive to AGI timeline details compared to how things might have seemed, say, 5 years ago. There are plenty of obviously-impactful-and-scary AI capabilities right now that, if made visible to someone you’re trying to persuade, are more than sufficient to motivate taking the robust steps above. Sometimes it may be appropriate and useful to say, e.g., “imagine if this were X times better/cheaper/faster etc.”, but in a world where AI capabilities are as strong as they already are, it generally suffices to raise the alarm about “AI,” full stop, without any special need to get into the details of AGI. Most people, at least those who haven’t already made up their mind that AGI-oriented folks and people bullish on technology generally are all misguided, can plainly see that AI is a huge deal that merits a lot of effort to steer in the right direction.

Reason 5 (most hand-wavy reason): It hasn’t helped me much in practice

This is perhaps the least compelling of the reasons, and I can’t justify it super well since it’s an “absence of evidence” type claim. But for what it’s worth, after working in AI policy for around a decade, including ~4 years at OpenAI, I have not seen many cases where having a more confident sense of either AI or AGI timelines would have helped all that much, under realistic conditions,* above and beyond the “take it seriously” point discussed under Reason 1.

There are exceptions, but generally speaking, I have moved more every year towards the “just do reasonable stuff” perspective conveyed in Reason 4 above.

*By “realistic conditions,” I mean assuming that the basis of the increased confidence was something like expert surveys or trend projections, rather than e.g. a “message from the future” capable of persuading people who aren’t persuaded by current efforts; i.e., there would still be reasonable doubt about how seriously to take the conclusions.

Reason 6 (weakest reason): There are reputational risks to overinvesting in timelines research and discourse

Back in the day (5 years ago), there was a lot of skepticism in EA world about talking publicly about (short) AGI timelines due to fear of accelerating progress and/or competition over AGI. At some point the mood seems to have shifted, which is an interesting topic in its own right, but let’s assume for now that that shift is totally justified, at least re: acceleration risks.

Even so, there are still reputational risks to the EA community if it is seen as investing disproportionately in “speculation” about obviously-pretty-uncertain/maybe-unknowable things like AGI timelines, compared to object-level work to increase the likelihood of good outcomes from existing or near-term systems, or robust actions related to longer-term risks. And the further along we are in plucking the low-hanging fruit of timeline work, the more dubious the value of marginal new work will look to observers.

As suggested in the section header, I think this is probably the weakest argument: the EA community should be willing to do and say weird things, and there would have to be pretty compelling reputational risks to offset a strong case for doing more timeline work, if such a case existed. I also think there is good, non-wild-speculation-y timeline work, some of which could also plausibly boost EA’s reputation (though for what it’s worth, I haven’t seen that happen much yet). However, since I think the usual motivations for timeline work aren’t as strong as they first appear anyway, and because marginal new work (of the sort that might be influenced by this post) may be on the more reputationally risky end of the spectrum, this consideration felt worth mentioning as a potential tie-breaker in ambiguous cases.

Reputational considerations could be especially relevant for people who lack special knowledge/​skills relevant to forecasting and are thus more vulnerable to the “wild speculation” charge than others who have those things, particularly when work on timelines is being chosen over alternatives that might be more obviously beneficial.

Conclusion

While there is some merit to the case for working on and talking about AGI timelines, I don’t think the case is as strong as it might first appear, and I would not be surprised if there were a more-than-optimal degree of investment in the topic currently. On the extremes (e.g. very near-term and very long-term timelines), there in fact may be differences in actions we should take, but almost all of the time we should just be taking reasonable, robust actions and scaling up the number of people taking such actions.

Things that would update me here include: multiple historical cases of people updating their plans in reasonable ways in response to timeline work, in a way that couldn’t have been justified based on the evidence discussed in Reason 1, particularly if the timeline work in question was done by people without special skills/knowledge; compelling/realistic examples of substantially different policy conclusions stemming from timeline differences within a reasonable range (e.g. “AGI, under strict/strong definitions, will probably be built this century but probably not in the next few years—assuming no major disruptions”); or examples of timeline work being very strongly synergistic with, or a good stepping stone towards, other kinds of work I mentioned as being valuable above.