I want to make my prediction about the short-term future of AI, partly sparked by this entertaining video about the nonsensical AI claims made by the Zoom CEO. I am not an expert on any of the following, of course; I am mostly writing for fun and for future vindication.
The AI space seems to be drowning in unjustified hype, with very few LLM projects having a path to consistent profitability, and applications that are severely limited by the problem of hallucinations and the general fact that LLMs are poor at general reasoning (compared to humans). It seems like LLM progress is slowing down as labs run out of public data and resource demands become too high. I predict that GPT-5, if it is released, will be impressive to people in the AI space, but it will still hallucinate, will still be limited in generalisation ability, will not be AGI, and the average Joe will not much notice the difference. Generative AI will be big business and play a role in society and people’s lives, but over the next decade it will be much less transformative than the introduction of the internet or social media.
I expect that sometime in the next decade it will be widely agreed that AI progress has stalled, that most of the current wave of AI bandwagon jumpers will be quietly ignored or shelved, and that the current wave of LLM hype might look like a financial bubble that burst (à la the dotcom bubble, but not as big).
Both AI doomers and accelerationists will come out looking silly, but will both argue that we are only an algorithmic improvement away from godlike AGI. Both movements will still be obscure Silicon Valley things that the average Joe only vaguely knows about.
I’m hearing this claim everywhere. I’m curious to know why you think so, given that OpenAI hasn’t released GPT-5.
Sam Altman has said multiple times that GPT-5 is going to be much better than GPT-4. It could just be hype, but that would hurt his reputation as soon as GPT-5 is released.
In any case, we’ll probably know soon.
I think you should update approximately not at all from Sam Altman saying GPT-5 is going to be much better. Every CEO says every new version of their product is much better—building hype is central to their job.
That’s true for many CEOs (like Elon Musk), but Sam Altman did not over-hype any of the big OpenAI launches (ChatGPT, GPT-3.5, GPT-4, GPT-4o, DALL-E, etc.).
It’s possible that he’s doing it for the first time now, but I think it’s unlikely.
But let’s ignore Sam’s claims. Why do you think LLM progress is slowing down?
Would you be interested in making quantitative predictions on the revenue of OpenAI/Anthropic in upcoming years, and/or when various benchmarks like these will be saturated (and OSWorld, released since that series was created), and/or when various Preparedness/ASL levels will be triggered?
A common view is a median for AGI arrival around 2035-2050, with substantial (e.g. 25%) probability mass in the next 6 years or so.
This view is consistent with thinking both of the following:
LLM progress is likely (>50%) to stall out.
LLMs are plausibly going to quickly scale into very powerful AI.
(This is pretty similar to my view.)
I don’t think many people think “we are only an algorithmic improvement away from godlike AGI”. In fact, I can’t think of anyone who thinks this. Some people think that one substantial algorithmic advance plus continued scaling/general algorithmic improvement might be enough, but the continuation of those other improvements is key.
I think you’re probably wrong, but I hope you’re right.
Upvoted for making your prediction; disagree-voted because I think it’s wrong.
Even if we expect AI progress to be “super fast”, it won’t always be “super fast”. Sometimes it’ll be “extra, super fast” and sometimes it’ll merely be “very fast”.
I think that some people are over-updating on AI progress currently being only “very fast”, thinking that this can only happen in a world where AI is about to cap out, whilst I don’t think this is the case at all.
Why I disagree that this video is insightful/entertaining: the YouTuber quite clearly has very little knowledge of the subject they are discussing. It’s actually quite reasonable for the Zoom CEO to simply say that fixing hallucinations will “occur down the stack”, given that Zoom is not the one developing AI models, and would instead be building the infrastructure and environments that the AI systems operate within.
From what I watched of the video, she also completely misses the real reason that the CEO’s claims are ridiculous: if you have an AI system with a level of capability that allows it to replicate a person’s actions in the workplace, then why would we go to the extra effort of having Zoom calls between these AI clones?
I.e. it would be much more efficient to build information systems that align with the strengths and comparative advantages of the AI systems; presumably this would not involve having “realistic clones of real human workers” talking to each other, but rather a network of AI systems that communicate using protocols and data formats designed to be as robust and efficient as possible.
FWIW, if I were the CEO of Zoom, I’d be pushing hard on the “human-in-the-loop” idea, e.g. building in features that allow you to send out AI agents to fetch information and complete tasks in real time as you’re having meetings with your colleagues. That would actually be a useful product that helps keep Zoom interesting and relevant.
With regard to AI progress stalling, I think it depends on what you mean by “stalling”, but I think this is basically impossible if you mean “literally will not meaningfully improve in a way that is economically useful”.
When I first learned how modern AI systems worked, I was astonished at how absurdly simple and inefficient they are. In the last ~2 years there has been a move towards things like MoE architectures & RNN hybrids, but this is really only scratching the surface of what is possible with more complex architectures. We should expect a steady stream of algorithmic improvements that will push down inference costs and make more real-world applications viable. There’s also Moore’s Law, but everyone already talks about that quite a lot.
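To make “MoE architecture” concrete for readers who haven’t seen one, here is a minimal, purely illustrative sketch of a top-k mixture-of-experts layer in PyTorch. All the names and sizes are made up, and real implementations add load balancing, capacity limits, and so on; the point is just that only k of the n experts run for each token, which is one of the ways this kind of architecture pushes down inference cost.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Illustrative top-k mixture-of-experts layer (not a production implementation)."""

    def __init__(self, d_model: int = 64, n_experts: int = 8, k: int = 2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model), nn.GELU(), nn.Linear(4 * d_model, d_model)
            )
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, d_model)
        scores = self.router(x)                     # (batch, n_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # keep only the k best experts per token
        weights = F.softmax(weights, dim=-1)        # renormalise over the selected experts
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e            # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out                                  # only k of n_experts ran for each token
```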
Also, if you buy the idea that “AI systems will learn tasks that they’re explicitly trained for”, then incremental progress is almost guaranteed. I think it’s hilarious that everyone in industry and Government is very excited about general-purpose AI and its capacity for automation, but there is basically no large-scale effort to create high-quality training data to expedite this process.
The fact that pre-training + chatbot RLHF is adequate to build a system with any economic value is dumb luck. I would predict that if we actually dedicated a not-insignificant chunk of society’s efforts towards training DL systems to perform important tasks, we would make quite a lot of progress very quickly. Perhaps a central actor like the CCP will do this at some stage, but until then we should expect incremental progress as small-scale efforts gradually build up datasets and training environments.
I think you’re mostly right, especially about LLMs and current hype (though I do think a couple of innovations beyond current technology could get us to AGI). But I want to point out that AI progress has not been entirely fruitless. The most salient example in my mind is AlphaFold, which is actually used for research, drug discovery, etc.
I want to say just “trust the market”, but unfortunately, if OpenAI has a high but not astronomical valuation, then even if the market is right, that could mean “almost certainly will be quite useful and profitable, chance of near-term AGI almost zero”, or it could mean “probably won’t be very useful or profitable at all, but a 1 in 1000 chance of near-term AGI supports a high valuation nonetheless”, or many things in between those two poles. So I guess we are sort of stuck with our own judgment?
For publicly traded US companies there are ways to figure out the variance of their future value, not just the mean, mostly by looking at option prices. Unfortunately, OpenAI isn’t publicly traded and (afaik) has no liquid options market, but maybe other players (Nvidia? Microsoft?) can be more helpful there.
If you know how to do this, maybe it’d be useful to do it. (Maybe not though; I’ve never actually seen anyone defend “the market assigns a non-negligible probability to an intelligence explosion”.)
It’s not really my specific area, but I had a quick look. (Frankly, this is mostly me just thinking out loud to see if I can come up with anything useful, and I don’t promise that I succeed.)
Yahoo Finance has Nvidia option prices with expirations in Dec 2026. We’re mostly interested in upside potential rather than downside, so we look at call options, for which we see data up to strike prices of 280.[fn 1]
In principle I think the next step is to do something like invert Black-Scholes (perhaps (?) adjusting for the difference between European- and American-style options, assuming that these options are the latter), but that sounds hard, so let’s see if I can figure out something simpler from first principles:
The 280 strike Dec 2026 call option is the right to buy Nvidia stock, on Dec 18th 2026, for a price of $280. Nvidia’s current price is ~$124, so these options only have value if the stock more than doubles by then. They’re currently trading at $14.50, while the 275 call trades at $15.
The value of a particular option is the integral of the option’s payoff profile multiplied by the stock price’s probability density. If we want something like “probability the stock is at least X on date Y”, the ideal option payoff profile would be an indicator function with a step at X, but we can’t exactly get that. Instead, by buying a call struck at A and selling a call struck at B, we get a payoff that is zero up to A, increases linearly from A to B, and is constant at B - A above B. Picking A and B close together seems like the best approximation. It means looking at prices for very low-volume options, but the nearby prices, including those for higher-volume options, look superficially in line, so I’ll go with it.
More intuitively, if the stock was definitely going to be above both A and B, then the strike-A option would be B - A more valuable than the strike-B option (that is, the right to buy a stock worth $10 for a price of $1 is worth exactly $3 more than the right to do so for $4). If the stock was definitely going to be below both A and B, then both options would be worthless.
So the values of the two options differ by (B - A) × P(the price is above B), plus some awkward term for when the price is between A and B, which you can hopefully make ignorable by keeping that interval small.
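Plugging the quoted prices into that approximation makes the arithmetic concrete. A quick sketch (this ignores discounting and the usual risk-neutral-versus-real-world caveats, so treat the number as rough):

```python
# Call-spread approximation: P(price above B at expiry) ≈ (C(A) - C(B)) / (B - A)
call_275 = 15.00   # C(A): the Dec 2026 Nvidia call struck at A = 275
call_280 = 14.50   # C(B): the Dec 2026 Nvidia call struck at B = 280

prob_above_280 = (call_275 - call_280) / (280 - 275)
print(f"P(NVDA above ~$280 in Dec 2026) ≈ {prob_above_280:.0%}")  # prints 10%
```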
From this I hesitantly conclude that the options markets suggest that P(NVDA >= 280-ish) = 10%-ish?
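(As an aside, the “invert Black-Scholes” step I waved off above isn’t actually that hard to sketch numerically. A minimal sketch, assuming a European-style option on a non-dividend-paying stock and plugging in a made-up risk-free rate and time to expiry, might look like the following; it backs out an implied volatility rather than a probability, but it’s another way to read the width of the market-implied distribution off a single option price.)

```python
from math import exp, log, sqrt

from scipy.optimize import brentq
from scipy.stats import norm

def bs_call_price(spot: float, strike: float, t: float, rate: float, sigma: float) -> float:
    """Black-Scholes price of a European call on a non-dividend-paying stock."""
    d1 = (log(spot / strike) + (rate + 0.5 * sigma ** 2) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return spot * norm.cdf(d1) - strike * exp(-rate * t) * norm.cdf(d2)

def implied_vol(price: float, spot: float, strike: float, t: float, rate: float) -> float:
    """Find the volatility at which the Black-Scholes price matches the market price."""
    return brentq(lambda s: bs_call_price(spot, strike, t, rate, s) - price, 1e-4, 5.0)

# Placeholder inputs in the spirit of the numbers above (not live quotes):
# spot ~$124, the $280 Dec 2026 call at $14.50, ~2.5 years out, risk-free rate assumed 4.5%.
iv = implied_vol(price=14.50, spot=124.0, strike=280.0, t=2.5, rate=0.045)
print(f"implied volatility ~ {iv:.1%}")
```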
[fn 1]: It looks like there are more strike prices than that, but I think all the ones above 280 aren’t applicable: there is a huge discontinuity in the prices from 280 to 290, and all the “last trade date” fields for the higher strikes are from before mid-June, so I think those options don’t exist anymore and date from before the 10-for-1 stock split.
Appreciate the concreteness in the predictions!
Which examples are you thinking of when you say this? (Not necessarily disagreeing, I’m just interested in the different interpretations of ‘LLMs are poor at general reasoning’.)
I also think that LLM reasoning can be significantly boosted with scaffolding, i.e. most hard reasoning problems can be split up into a handful of easier reasoning problems; this can be done recursively until your LLM can solve a subproblem, and the full solution can then be built back up. So whilst scale might not get us to a level of general reasoning that qualifies as AGI, perhaps GPT-5 (or 6) plus scaffolding can.
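As a rough sketch of what that kind of scaffolding could look like (the llm and can_solve_directly functions below are hypothetical placeholders for a model call and a difficulty check, not a real API):

```python
def llm(prompt: str) -> str:
    """Hypothetical placeholder: call whatever language model you have access to."""
    raise NotImplementedError("wire this up to a real model")

def can_solve_directly(problem: str) -> bool:
    """Hypothetical placeholder: e.g. ask the model to rate the problem's difficulty."""
    return len(problem) < 200  # crude stand-in heuristic

def solve(problem: str, depth: int = 0, max_depth: int = 3) -> str:
    """Recursively decompose a hard problem until each piece is easy enough to answer directly."""
    if depth >= max_depth or can_solve_directly(problem):
        return llm(f"Solve this problem and show your reasoning:\n{problem}")

    # Ask the model to break the problem into a handful of easier subproblems.
    plan = llm(f"Split this problem into 2-5 smaller subproblems, one per line:\n{problem}")
    subproblems = [line.strip() for line in plan.splitlines() if line.strip()]

    # Solve each subproblem (decomposing further if needed), then build the full answer back up.
    partial_solutions = [solve(sub, depth + 1, max_depth) for sub in subproblems]
    return llm(
        "Combine these partial solutions into a full answer to the original problem:\n"
        f"Problem: {problem}\n" + "\n".join(partial_solutions)
    )
```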
Thanks for writing this up!
FWIW, even if AGI arrives ~ 2050 I still think it* would be the thing I’d want to work on right now. I would need to be really confident it wasn’t arriving before then for me not to want to work on it.
*AI Safety.