the fact that insane amounts of capital are going into 5+ competing companies providing commonly-used AI products should be strong evidence that the economics are looking good
Can you clarify what you mean by "the economics are looking good"? The economics of what are looking good for what?
I can think of a few different things this could mean, such as:
The amount of capital invested, the number of companies investing, and the number of users of AI products indicates there is no AI bubble
The amount of capital invested (and the competition) is making AGI more likely/making it come sooner, primarily because of scaling
The amount of capital invested (and the competition) is making AGI more likely/making it come sooner, primarily because it provides funding for research
Those aren't the only possible interpretations, but those are three I thought of.
if AGI is technically possible using something like current tech, then all the incentives and resources are in place to find the appropriate architectures.
You're talking about research rather than scaling here, right? Do you think there is more funding for fundamental AI research now than in 2020? What about for non-LLM fundamental AI research?
The impression I get is that the vast majority of the capital is going into infrastructure (i.e. data centres) and R&D for ideas that can quickly be productized. I recall that the AI researcher/engineer Andrej Karpathy rejoined OpenAI (his previous employer) after leaving Tesla, but ended up leaving OpenAI after not too long because the company wanted him to work on product rather than on fundamental research.
You're talking about research rather than scaling here, right? Do you think there is more funding for fundamental AI research now than in 2020? What about for non-LLM fundamental AI research?
Most of OpenAI's 2024 compute went to experiments
This is what Epoch AI says about its estimates:
Based on our compute and cost estimates for OpenAI's released models from Q2 2024 through Q1 2025, the majority of OpenAI's R&D compute in 2024 was likely allocated to research, experimental training runs, or training runs for unreleased models, rather than the final, primary training runs of released models like GPT-4.5, GPT-4o, and o3.
That's kind of interesting in its own right, but I wouldn't say that money allocated toward training compute for LLMs is the same idea as money allocated to fundamental AI research, if that's what you were intending to say.
It's uncontroversial that OpenAI spends a lot on research, but I'm trying to draw a distinction between fundamental research, which, to me, connotes things that are more risky, uncertain, speculative, explorative, and may take a long time to pay off, and research that can be quickly productized.
I don't understand the details of what Epoch AI is trying to say, but I would be curious to learn.
Do unreleased models include as-yet unreleased models such as GPT-5? (The timeframe is 2024 and OpenAI didn't release GPT-5 until 2025.) Would it also include o4? (Is there still going to be an o4?) Or is it specifically models that are never intended to be released? I'm guessing it's just everything that hasn't been released yet, since I don't know how Epoch AI would have any insight into what OpenAI intends to release or not.
I'm also curious how much trial and error goes into training for LLMs. Does OpenAI often abort training runs or find the results to be disappointing? How many partial or full training runs go into training one model? For example, what percentage of the overall cost is the $400 million estimated for the final training run of GPT-4.5? 100%? 90%? 50%? 10%?
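To make that last question concrete, here is a minimal back-of-the-envelope sketch. The $400 million figure is the estimate mentioned above; the totals it is compared against are purely hypothetical placeholders, since I don't know what OpenAI's overall R&D compute spending actually was.

```python
# Back-of-the-envelope sketch: what share of total R&D compute spending would a
# $400M final training run represent? The $400M is the GPT-4.5 estimate quoted
# above; the totals below are hypothetical placeholders, not real figures.

final_run_cost = 400e6  # estimated cost of GPT-4.5's final training run (USD)

for assumed_total in (1e9, 2e9, 5e9, 10e9):  # hypothetical annual R&D compute spend (USD)
    share = final_run_cost / assumed_total
    print(f"If total R&D compute spend were ${assumed_total / 1e9:.0f}B, "
          f"the final run would be about {share:.0%} of it")
```

The point is just that the answer swings from "a large chunk of the budget" to "a small slice of it" depending on the denominator, which is exactly the number these estimates don't pin down.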
Overall, this estimate from Epoch AI doesn't seem to tell us much about what amount of money or compute OpenAI is allocating to fundamental research vs. R&D that can quickly be productized.
When I say "the economics are looking good," I mean that the conditions for capital allocation towards AGI-relevant work are strong. Enormous investment inflows, a bunch of well-capitalised competitors, and mass adoption of AI products mean that, if someone has a good idea to build AGI within or around these labs, the money is there. It seems this is a trivial point: if there were significantly less capital, then labs couldn't afford extensive R&D, hardware, or large-scale training runs.
WRT scaling vs. fundamental research, obviously "fundamental research" is a bit fuzzy, but it's pretty clear that labs are doing a bit of everything. DeepMind is the most transparent about this: they're doing Gemini-related model research, fundamental science, AI theory and safety, etc., and have published thousands of papers. But I'm sure a significant proportion of OpenAI & Anthropic's work can also be classed as fundamental research.
The overall concept we're talking about here is to what extent the outlandish amount of capital that's being invested in AI has increased budgets for fundamental AI research. My sense of this is that it's an open question without a clear answer.
DeepMind has always been doing fundamental research, but I actually don't know if that has significantly increased in the last few years. For all I know, it may have even decreased after Google merged Google Brain and DeepMind and seemed to shift focus away from fundamental research and toward productization.
I don't really know, and these companies are opaque and secretive about what they're doing, but my vague impression is that ~99% of the capital invested in AI over the last three years is going toward productizing LLMs, and I'm not sure it's significantly easier to get funding for fundamental AI research now than it was three years ago. For all I know, it's harder.
My impression comes from anecdotes from AI researchers. I already mentioned Andrej Karpathy saying that he wanted to do fundamental AI research at OpenAI when he re-joined in early 2023, but the company wanted him to focus on product. I got the impression he was disappointed, and I think this is a reason he ultimately quit a year later. My understanding is that during his previous stint at OpenAI, he had more freedom to do exploratory research.
The Turing Award-winning researcher Richard Sutton said in an interview something to the effect that no one wants to fund basic research, or that it's hard to get money to do basic research. Sutton personally can get funding because of his renown, but I don't know about lesser-known researchers.
A similar sentiment was expressed by the AI researcher François Chollet here:
Now LLMs have sucked the oxygen out of the room. Everyone is just doing LLMs. I see LLMs as more of an off-ramp on the path to AGI actually. All these new resources are actually going to LLMs instead of everything else they could be going to.
If you look further into the past to like 2015 or 2016, there were like a thousand times fewer people doing AI back then. Yet the rate of progress was higher because people were exploring more directions. The world felt more open-ended. You could just go and try. You could have a cool idea, launch it, try it, and get some interesting results. There was this energy. Now everyone is very much doing some variation of the same thing.
Undoubtedly, there is an outrageous amount of money going toward LLM research that can be quickly productized, toward scaling LLM training, and toward LLM deployment. Initially, I thought this meant the AI labs would spend a lot more money on basic research. I was surprised each time I heard someone such as Karpathy, Sutton, or Chollet giving evidence in the opposite direction.
It's hard to know what's the God's honest truth and what's bluster from Anthropic, but if they honestly believe that they will create AGI in 2026 or 2027, as Dario Amodei has seemed to say, and if they believe they will achieve this mainly by scaling LLMs, then why would they invest much money in basic research that's not related to LLMs or scaling them and that, even if it succeeds, probably won't be productizable for at least 3 years? Investing in diverse basic research would be hedging their bets. Maybe they are, or maybe they're so confident that they feel they don't have to. I don't know.