All deep learning systems since 2012 have had some extremely limited generalization ability. If you show AlexNet a picture of an object in a class it was trained on, with some novel differences, e.g. maybe it's black-and-white or upside-down or the dog in the photo is wearing a party hat, it will still do much better than chance at classifying the image. In an extremely limited sense, that is generalization.
I'm not sure I can agree with Chollet's "zero to one" characterization of o3. To be clear, he's saying it's zero to one for fluid intelligence, not generalization; the two concepts are related and similar, but Chollet defines fluid intelligence a bit differently. Still, I'm not sure I can agree it's zero to one with regard to either generalization or fluid intelligence. And I'm not sure I can agree it's zero to one even for LLMs. It depends on how strict you are about the definitions, how exact you are trying to be, and what, substantively, you're trying to say.
I think many results in AI are incredibly impressive considered from the perspective of science and technology: everything from AlexNet to AlphaGo to ChatGPT. But this is a separate question from whether they are closing the gap between AI and general intelligence in any meaningful way. My assessment is that they're not. I think o3's performance on ARC-AGI-1 and now ARC-AGI-2 is a really cool science project, but it doesn't feel like fundamental research progress on AGI (except in a really limited, narrow sense in which a lot of things would count as that).
AI systems can improve data efficiency, generalization, and other performance characteristics in incremental ways over previous systems, and this can still be true. The best image classifiers today get better top-1 accuracy on ImageNet than the best image classifiers of ten years ago, and since they squeeze more performance out of roughly the same training data, they are in that sense more data efficient. But it's still true that the image classifiers of 2025 are no closer than the image classifiers of 2015 to emulating the proficiency with which humans and other mammals see.
The old analogy (I think Douglas Hofstadter said this) is that if you climb a tree, you are closer to the Moon than your friend on the ground, but no closer to actually getting to the Moon than they are.
In some very technical sense, essentially any improvement to AI in any domain could be considered an improvement to data efficiency, generalization, and reliability. If AI is able to do a new kind of task it wasn't able to do before, its performance along all three characteristics has increased from zero to something. If it was already able but is now better at it, then its performance has increased from something to something more. But this is a technicality that misses the substantive point.