Yep, that’s a fair argument, and I don’t have a knockdown case that predicting human-generated data will result in great abilities.
One bit of evidence is that people used to be really pessimistic that scaling up imitation would do anything interesting. This paper was a popular knockdown arguing language models could never understand the physical world, but most of the substantive predictions of that line of thinking have been wrong, and those people have largely retreated to semantic debates about the meaning of “understanding”. Scaling has gone further than many people expected, and could continue.
Another argument would be that pretraining on human data has a ceiling, but RL fine-tuning on downstream objectives will be much more efficient after pretraining and could allow AI to surpass the human level.
But again, there are plenty of people who think GPT will not scale to superintelligence (Eliezer, Gary Marcus, Yann LeCun), and it’s hard to predict these things in advance.