The Metaculus timeline is already highly unreasonable given the resolution criteria,[1] and even these people think Aschenbrenner is unmoored from reality.
[1] Remind me to write this up soon.
No reason to assume an individual Metaculus commentator agrees with the Metaculus timeline, so I don't think that's very fair.
I actually think the two Metaculus questions are just bad questions. The detailed resolution criteria don't necessarily match what we intuitively think of as AGI or transformative AI, or obviously capture anything that important, and it is just unclear whether people are forecasting on the actual resolution criteria or on their own idea of what "AGI" is.
All the tasks in both AGI questions are quite short, so it's easy to imagine an AI beating all of them and yet not being able to replace most human knowledge workers, because it can't handle long-running tasks. It's also just not clear how performance on benchmark questions and the Turing test translates to competence with even short-term tasks in the real world. So even if you think AGI in the sense of "AI that can automate all knowledge work" (let alone all work) is far away, it might make sense to think we are only a few years from a system that can resolve these questions "yes".
On the other hand, resolving the questions "yes" could conceivably lag the invention of some very powerful and significant systems, perhaps including some that a reasonable definition would count as AGI.
As someone points out in the comments of one of the questions, right now any mainstream LLM will fail the Turing test, however smart, because if you ask "how do I make chemical weapons" it'll read you a stiff lecture about why it can't do that, as it would violate its principles. In theory, that could remain true even if we reach AGI. (The questions only resolve "yes" if a system that can pass the Turing test is actually constructed; it's not enough for this to be easy to do if OpenAI or whoever wants to.) And the stronger of the two questions requires that a system can do a complex manual task. Fair enough, some reasonable definitions of "AGI" do require machines that can match humans at every manual-dexterity-based cognitive task. But a system that could automate all knowledge work, yet not handle piloting a robot body, would still be quite transformative.
Which particular resolution criteria do you think it's unreasonable to believe will be met by 2027/2032 (depending on whether it's the weak AGI question or the strong one)?
Two of the four in particular stand out. First, the Turing test one, exactly for the reason you mention: asking the model to violate the terms of service is surely an easy way to win. That's the resolution criterion, so unless the Metaculus users think that'll be solved in 3 years,[1] the estimates should be higher. Second, the SAT-passing one requires "having less than ten SAT exams as part of the training data", which is very unlikely to hold for current frontier models, and labs probably aren't keen to share what exactly they have trained on.
it is just unclear whether people are forecasting on the actual resolution criteria or on their own idea of what "AGI" is.
No reason to assume an individual Metaculus commentator agrees with the Metaculus timeline, so I don't think that's very fair.
I don't know if it is unfair. This is Metaculus! Premier forecasting website! These people should be reading the resolution criteria and judging their predictions according to them. Just going off personal vibes on how much they "feel the AGI" seems like a sign of epistemic rot to me. I know not every Metaculus user agrees with this, but it is shaped by the aggregate: 2027/2032 are very short timelines, and those are median community predictions. This is my main issue with the Metaculus timelines at the moment.
I actually think the two Metaculus questions are just bad questions.
I mean, I do agree with you in the sense that they don't fully match AGI, but that's partly because "AGI" covers a bunch of different ideas and concepts. It might well be possible for a system to satisfy these conditions but not replace knowledge workers; perhaps a new market focusing on automation and employment might be better, but that also has its issues with operationalisation.
[1] On top of everything else needed to successfully pass the imitation game.
What I meant to say was unfair was basing "even Metaculus users think Aschenbrenner's stuff is bad, and they have short timelines" on the reaction to Aschenbrenner of only one or two people.
Which particular resolution criteria do you think it's unreasonable to believe will be met by 2027/2032 (depending on whether it's the weak AGI question or the strong one)?