Update: I thought about it a bit more & asked this question & got some useful feedback, especially from tin482 and vladimir_nesov. I'm now confused about what people mean when they say current AI systems are much less sample-efficient than humans. On some interpretations, GPT-3 is already about as sample-efficient as humans. My guess is it's something like: "Sure, GPT-3 can see a name or fact once in its dataset and then remember it later & integrate it with the rest of its knowledge. But that's because it's part of the general skill/task of predicting text. For new skills/tasks, GPT-3 would need huge amounts of fine-tuning data to perform acceptably."
Surely a big part of the resolution is that GPT-3 is sample-inefficient in total (pretraining consumed hundreds of billions of tokens), but sample-efficient on the margin (once trained, it can pick up a new fact or a simple new task from one or two examples in its context window)?
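To make the "on the margin" half concrete, here's a minimal sketch of few-shot prompting, the kind of thing the GPT-3 paper demonstrated: the model picks up a task from a couple of in-context examples, with no fine-tuning data and no gradient updates. The translation examples are the ones from Brown et al. (2020); the legacy (pre-1.0) openai client call and engine name are my illustrative assumptions, not anything from the original discussion.

```python
# Few-shot prompting sketch: "learning" a task from two in-context examples.
# Assumes the legacy (pre-1.0) openai Python client and the "davinci" engine
# as exposed circa 2021; an API key is read from the OPENAI_API_KEY env var.
import openai

# Two demonstrations, then a query -- the format used in the GPT-3 paper.
prompt = (
    "Translate English to French.\n"
    "sea otter => loutre de mer\n"
    "cheese => fromage\n"
    "mint =>"
)

response = openai.Completion.create(
    engine="davinci",   # GPT-3 base model
    prompt=prompt,
    max_tokens=5,
    temperature=0.0,
)
print(response.choices[0].text.strip())  # hopefully "menthe"
```

Contrast this with the pretraining run that produced the model in the first place, which is where the "in total" sample-inefficiency lives.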