1) physical limits to scaling, 2) the inability to learn from video data, 3) the lack of abundant human examples for most human skills, 4) data inefficiency, and 5) poor generalization
All of those except 2) boil down to "foundation models have to learn once and for all through training on collected datasets instead of continually learning for each instantiation". See also AGI's Last Bottlenecks.
No, none of them boil down to that, and especially not (1).
I've already read the "A Definition of AGI" paper (which the blog post you linked to is based on) and it does not even mention the objections I made in this post, let alone offer a reply.
My main objection to the paper is that it makes a false inference: that tests used to assess human cognitive capabilities can also be used to test whether AI systems have those same capabilities. GPT-4 scored more than 100 on an IQ test in 2023; if passing a test showed that an AI has the same cognitive capabilities it is taken to show in a human, that would imply GPT-4 is already an AGI. The paper does not anticipate this objection or try to argue against it.
(Also, this is just a minor side point, but Andrej Karpathy did not actually say AGI is a decade away on Dwarkesh Patel's podcast. He said useful AI agents are a decade away. This is pretty clear from the interview or the transcript. Karpathy did not comment directly on the timeline for AGI, although it seems to be implied that AGI can come no sooner than AI agents.
Unfortunately, Dwarkesh or his editor or whoever titles his episodes, YouTube chapters, and clips has sometimes given inaccurate titles that badly misrepresent what the podcast guest actually said.)
How is "heterogeneous skills" based on private information and "adapting to changing situations in real time with very little data" not what continual learning means?
Here's a definition of continual learning from an IBM blog post:
Continual learning is an artificial intelligence (AI) learning approach that involves sequentially training a model for new tasks while preserving previously learned tasks. Models incrementally learn from a continuous stream of nonstationary data, and the total number of tasks to be learned is not known in advance.
Here's another definition, from an arXiv pre-print:
To cope with real-world dynamics, an intelligent system needs to incrementally acquire, update, accumulate, and exploit knowledge throughout its lifetime. This ability, known as continual learning, provides a foundation for AI systems to develop themselves adaptively. In a general sense, continual learning is explicitly limited by catastrophic forgetting, where learning a new task usually results in a dramatic performance degradation of the old tasks.
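To make the setup these two definitions describe concrete, here is a minimal sketch of sequential training and the catastrophic forgetting it runs into. This is my own illustration, not code from IBM or the pre-print; the two synthetic tasks, the logistic-regression model, and every function name in it are invented for the example. One weight vector is trained on task A, then trained further on task B alone, and its task-A accuracy collapses:

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task(center):
    # Synthetic binary classification: class 1 around +center, class 0 around -center.
    X = np.vstack([rng.normal(center, 0.5, size=(200, 2)),
                   rng.normal(-center, 0.5, size=(200, 2))])
    y = np.array([1] * 200 + [0] * 200)
    return X, y

def train(w, X, y, steps=500, lr=0.1):
    # Plain logistic-regression gradient descent on one task at a time.
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w = w - lr * (X.T @ (p - y)) / len(y)
    return w

def accuracy(w, X, y):
    return float(np.mean(((X @ w) > 0).astype(int) == y))

# The two tasks have roughly orthogonal decision boundaries,
# so fitting task B overwrites what was learned for task A.
task_a = make_task(np.array([2.0, 2.0]))
task_b = make_task(np.array([2.0, -2.0]))

w = np.zeros(2)
w = train(w, *task_a)
print("task A accuracy after training on A:", accuracy(w, *task_a))

w = train(w, *task_b)  # sequential training: task B only, no task A data kept
print("task A accuracy after training on B:", accuracy(w, *task_a))  # degrades
print("task B accuracy after training on B:", accuracy(w, *task_b))
```

Continual-learning research is about keeping that first printed accuracy high after the second round of training. Notice that nothing about succeeding at this would make the model learn from less data or generalize better, which is the distinction I draw below.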
The definition of continual learning is not related to generalization, data efficiency, the availability of training data, or the physical limits to LLM scaling.
You could have a continual learning system that is just as data inefficient as current AI systems and just as poor at generalization. Continual learning does not solve the problem of training data being unavailable. Continual learning does not help you scale up training compute or training data if compute and data are scarce or expensive, nor does the ability to continually learn mean an AI system will automatically get all the performance improvements it would have gotten from continuing scaling trends.
Yes, those quotes do refer to the need for a model to develop heterogeneous skills based on private information, and to adapt to changing situations in real time with very little data. I don't see your problem.
In case it's helpful, I prompted Claude Sonnet 4.5 with extended thinking to explain three of the key concepts we're discussing and I thought it gave a pretty good answer, which you can read here. (I archived that answer here, in case that link breaks.)
I gave GPT-5 Thinking almost the same prompt (I had to add some instructions because the first response it gave was way too technical) and it gave an okay answer, which you can read here. (Archive link here.)
I tried to Google for human-written explanations of the similarities and differences first, since that's obviously preferable. But I couldn't quickly find one, probably because there's no particular reason to compare these concepts directly to each other.
No, those definitions quite clearly don't say anything about data efficiency or generalization, or the other problems I raised.
I think you have misunderstood the concept of continual learning. It doesn't mean what you seem to think it means. You seem to be confusing the concept of continual learning with some much more expansive concept, such as generality.
If I'm wrong, you should be able to quite easily provide citations that clearly show otherwise.