I typically don't agree with much that Dwarkesh Patel, a popular podcaster, says about AI,[1] but his recent Substack post makes several incisive points, such as:
Somehow this automated researcher is going to figure out the algorithm for AGI – a problem humans have been banging their head against for the better part of a century – while not having the basic learning capabilities that children have? I find this super implausible.
Yes, exactly. The idea of a non-AGI AI researcher inventing AGI is a skyhook. It's pulling yourself up by your bootstraps, a borderline supernatural idea. It's retrocausal. It just doesn't make sense.
There are more great points in the post besides that, such as:
Currently the labs are trying to bake in a bunch of skills into these models through "mid-training" – there's an entire supply chain of companies building RL environments which teach the model how to navigate a web browser or use Excel to write financial models.
Either these models will soon learn on the job in a self-directed way – making all this pre-baking pointless – or they won't – which means AGI is not imminent. Humans don't have to go through a special training phase where they need to rehearse every single piece of software they might ever need to use.
… You don't need to pre-bake the consultant's skills at crafting PowerPoint slides in order to automate Ilya [Sutskever, an AI researcher]. So clearly the labs' actions hint at a world view where these models will continue to fare poorly at generalizing and on-the-job learning, thus making it necessary to build in the skills that they hope will be economically valuable.
And:
It is not possible to automate even a single job by just baking in some predefined set of skills, let alone all the jobs.
We are in an AI bubble, and AGI hype is totally misguided.
There are some important things I disagree with in Dwarkesh's post, too. For example, he says that AI has solved "general understanding, few shot learning, [and] reasoning", but AI has absolutely not solved any of those things.
Models lack general understanding, and the clearest evidence is that they can't do much that is useful in complex, real-world contexts, which is one of the points Dwarkesh is making in the post. Few-shot learning only works well when a model has already been trained on an enormous number of similar examples. The "reasoning" in "reasoning models" is, in Melanie Mitchell's terminology, a wishful mnemonic. In other words, naming an AI system after a capability doesn't mean it actually has that capability. If Meta renamed Llama 5 to Superintelligence 1, that wouldn't make Llama 5 a superintelligence.
I also think Dwarkesh is astronomically too optimistic about how economically impactful AI will be by 2030. And he treats continual learning as though it were the only research problem that needs to be solved, to the neglect of others.