I think Chollet has shifted the goal posts a bit from when he first developed ARC [ARC-AGI 1]. In his original paper from 2019, Chollet says:
“We argue that ARC [ARC-AGI 1] can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans.”
And the original announcement (from June 2024) says:
“A solution to ARC-AGI [1], at a minimum, opens up a completely new programming paradigm where programs can perfectly and reliably generalize from an arbitrary set of priors. We also believe a solution is on the critical path towards AGI.”
(And ARC-AGI 1 has now basically been solved.) You say:
I understand the theory that AI will have a super fast takeoff, so that even though it isn’t very capable now, it will match and surpass human capabilities within 5 years. But this kind of theory is consistent with pretty much any level of AI performance in the present.
But we are seeing a continued rapid improvement in A(G)I capabilities, not least along the trajectory to automating AGI development, as per the METR report Ben West mentions.
In his interview with Dwarkesh Patel in June 2024 to talk about the launch of the ARC Prize, Chollet emphasized how easy the ARC-AGI tasks were for humans, saying that even children could do them. This is not something he’s saying only now in retrospect that the ARC-AGI tasks have been mostly solved.
That first quote, from the 2019 paper, is consistent with Chollet’s January 2025 Bluesky post. That second quote is not from Chollet, but from Mike Knoop. I don’t know what the first sentence is supposed to mean, but the second sentence is also consistent with the Bluesky post.
In response to the graph… Just showing a graph going up does not amount to a “trajectory to automating AGI development”. The kinds of tasks AI systems can do today are very limited in their applicability to AGI research and development. That has changed only modestly between ChatGPT’s release in November 2022 and today.
In 2018, you could have shown a graph of go performance increasing from 2015 to 2017, and that also would not have been evidence of a trajectory toward automating AGI development. Nor would AlphaZero’s tripling of the games a single AI system can master, from just go to go, chess, and shogi. Measuring improved performance on tasks only provides evidence for AGI progress if the tasks you are measuring test for general intelligence.