The 80,000 Hours video on AI 2027 is missing important caveats. These are the sort of caveats that deserve to be emphasized and foregrounded in any discussion of AI 2027.
The three major caveats that should be made about AI 2027 are:
1. It depends crucially on the subjective intuitions or guesses of the authors. If you don't personally share the authors' intuitions, or don't personally trust that the authors' intuitions are likely correct, then there is no particular reason to take AI 2027's conclusions seriously. (By the time they finish the 80,000 Hours video, are viewers aware this is the case?)
2. Credible critics claim that the headline results of the AI 2027 timelines model are largely baked in by the authors' modelling decisions, irrespective of what data the model uses. That means, to a large extent, AI 2027's conclusions are not actually determined by the data they use. We already saw with (1) that the conclusions of AI 2027 are largely a restatement of the authors' personal and contestable beliefs. This is another way in which AI 2027's conclusions are, effectively, a restatement of the pre-existing beliefs or assumptions that the authors chose to embed in their timelines model.
3. AI 2027 is largely based on extrapolating the METR time horizons graph, which has serious problems and limitations, some of which are sometimes (but not always) clearly disclosed by METR employees. Gary Marcus, a cognitive scientist and AI researcher, and Ernest Davis, a computer scientist and AAAI fellow, co-authored a blog post on the METR graph that looks at how the graph was made and concludes that "attempting to use the graph to make predictions about the capacities of future AI is misguided". Nathan Witkin, a research writer at NYU Stern's Tech and Society Lab, published a detailed breakdown of some of the problems with METR's methodology. He concludes that it's "impossible to draw meaningful conclusions from METR's Long Tasks benchmark" and that the METR graph "contains far too many compounding errors to excuse". Witkin calls out a specific tweet from METR, which presents the METR graph in the broad, uncaveated way that the AI 2027 authors interpret it. He calls the tweet "an uncontroversial example of misleading science communication". Since AI 2027 leans so heavily on this interpretation of the METR graph to make its forecast, it is hard to see how AI 2027 could be credible if its interpretation of the METR graph is not credible.
The 80,000 Hours video insinuates that most AI experts agree with the core assumptions or conclusions of AI 2027, but there is evidence to the contrary:
76% of AI experts think it is unlikely or very unlikely that existing approaches to AI, which include LLMs, will scale to AGI. (See page 66 of the AAAI 2025 survey. See also the preceding two pages about open research challenges in AI, such as continual learning, long-term planning, generalization, and causal reasoning. None of these challenges is about scaling more, or at least not uncontroversially so. If you want an example of a specific, prominent AI researcher who emphasizes the importance of fundamental AI research over scaling, Ilya Sutskever believes that further scaling will be inadequate to get to AGI.)
Expert surveys about AGI timelines are not necessarily reliable, but the AI Impacts survey in late 2023 found that AI researchers' median year for AGI is 20 to 90 years later than the AI 2027 scenario.
I can't find survey data on this, but I get the impression there is a diversity of opinions among AI experts on how existentially dangerous AGI would be. For example, the Turing Award-winning AI researchers Yann LeCun and Richard Sutton both have a very different perspective on this than the AI 2027 authors. Both have been outspoken in their belief that the AI safety/AI alignment community's perspective on this topic is misguided.
I'm not sure if 80,000 Hours is interested in making videos that try to explain complexities like these, but, personally, I would love to see that. I think there is immense importance and value in helping viewers understand how conclusions are reached, particularly when they are radical and contentious. Many viewers would surely disagree with the reasoning behind the conclusions that the 80,000 Hours video presents if it were clearly explained to them.