Exponential AI takeoff is a myth
TL;DR
Everything that looks like exponential growth eventually runs into limits and slows down. AI will quite soon run into limits of compute, algorithms, data, scientific progress, and predictability of our world. This reduces the perceived risk posed by AI and gives us more time to adapt.
Disclaimer
Although I have a PhD in Computational Neuroscience, my experience with AI alignment is quite low. I haven't engaged in the field much except for reading Superintelligence and listening to the 80k Hours podcast. Therefore, I may duplicate or overlook arguments obvious to the field or use the wrong terminology.
Introduction
Many arguments I have heard around the risks from AI go a bit like this: We will build an AI that will be as smart as humans, then that AI will be able to improve itself. The slightly better AI will again improve itself in a dangerous feedback loop and exponential growth will ultimately create an AI superintelligence that has a high risk of killing us all.
While I do recognize the other possible dangers of AI, such as engineering pathogens, manipulating media, or replacing human relationships, I will focus on that dangerous feedback loop, or "exponential AI takeoff". There are, of course, also risks from human-level-or-slightly-smarter systems, but I believe that the much larger, much less controllable risk would come from "superintelligent" systems. I'm arguing here that the probability of creating such systems via an "exponential takeoff" is very low.
Nothing grows exponentially indefinitely
This might be obvious, but let's start here: nothing grows exponentially indefinitely. The textbook example of exponential growth is the growth of bacterial cultures. They grow exponentially until they hit the side of their petri dish, and then it's over. If they're not in a lab, they grow exponentially until they hit some other constraint, but in the end, all exponential growth is constrained. If you're lucky, actual growth will look logistic ("S-shaped"), where the growth rate approaches 0 as resources are eaten up. If you're unlucky, the population implodes.
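To make the contrast concrete, here is a minimal numerical sketch. The growth rate r and the carrying capacity K are arbitrary illustrative values, not a model of anything real; the point is only that the two curves are nearly indistinguishable at first and then diverge completely.

```python
# Illustrative comparison of exponential vs. logistic growth.
# All parameters (x0, r, K) are arbitrary and chosen only to show the shapes.

def exponential(x0, r, steps):
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] * (1 + r))
    return xs

def logistic(x0, r, K, steps):
    # The growth rate shrinks as x approaches the carrying capacity K.
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + r * x * (1 - x / K))
    return xs

exp_curve = exponential(1.0, 0.1, 200)
log_curve = logistic(1.0, 0.1, 1000.0, 200)

# Early on, the two curves track each other closely...
print(exp_curve[10], log_curve[10])
# ...but the logistic curve flattens out near K while the exponential explodes.
print(exp_curve[200], log_curve[200])
```

The trap is that from inside the early regime, the two processes look identical; you only learn which one you were in when the constraint kicks in.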
For the last decades, we have seen things growing and growing without limit, but we're slowly seeing a change. Human population is starting to follow an S-curve, the number of scientific papers has been growing fast but is starting to flatten out, and even Silicon Valley has learnt that Metcalfe's Law of superlinear network benefits doesn't work due to the limits imposed by network complexity.
I am assuming that everybody will agree with the general argument above, but the relevant question is: when will we see the "flattening" of the curve for AI? Yes, eventually growth is limited, but if that limit only kicks in once AI has used up all the resources of our universe, that's a bit too late for us. I believe that the limits will kick in as soon as AI reaches our level of knowledge, give or take an order of magnitude, and here is why:
We're reaching the limits of Moore's law
First and foremost, the growth of processing power is what enabled the growth of AI. I'm not going to guess when we reach parity with the processing power of the human brain, but even if we do, we won't grow fast beyond that, because Moore's law is slowing down.
Although I'm not a theoretical physicist, I believe that there is significant evidence, anecdotal and otherwise, that Moore's Law is reaching its limits. In 2015, the former Intel CEO stated that "our cadence today is closer to two and a half years than two", and Wikipedia states that "the physical limits to transistor scaling have been reached" and that "Most forecasters, including Gordon Moore, expect Moore's law will end by around 2025." If we look at the cost of computer memory and storage, we see that while it shrank exponentially for most of the last 50 years, we're already reaching the limits of that.
I think there are two ways to counter this:
Yes, humans are reaching the limits, but AI will be smarter than us and AI will figure it out.
It doesn't matter because we'll just work with better algorithms and more data.
I'll start with the second one:
We will probably reach the limits of algorithms
If compute is reaching limits, but we can be increasingly efficient with our compute, then we'll still scale exponentially. OpenAI published an article in 2020 showing that the compute needed is actually decreasing exponentially due to algorithmic improvements, and so far we're not seeing any leveling off.
I think these improvements are fair to expect, given that early machine learning researchers were usually not professional software engineers focused on efficiency, so there should be a lot of potential when trying to scale these methods. However, it is theoretically quite clear that there will be limits to this as well: you're unlikely to train a billion parameters with one subtraction operation. So the question is again: when will this taper off? We've seen similar developments, for example, with sorting algorithms: the largest efficiency gains were found initially, e.g. from BubbleSort (1956) to QuickSort (1959), while new, slightly better algorithms are still developed to this day (e.g. Timsort, 2002).
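As a toy illustration of how large those early gains were, here is a comparison of comparison counts between an O(n²) algorithm and an O(n log n) one (bubble sort vs. a simple merge sort). This is a sketch of the asymptotic gap, not a benchmark of real library implementations:

```python
import random

# Count comparisons in bubble sort (O(n^2)) vs. merge sort (O(n log n)).
def bubble_sort_comparisons(data):
    a, count = list(data), 0
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            count += 1  # every pair inspection is counted
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return count

def merge_sort_comparisons(data):
    if len(data) <= 1:
        return list(data), 0
    mid = len(data) // 2
    left, cl = merge_sort_comparisons(data[:mid])
    right, cr = merge_sort_comparisons(data[mid:])
    merged, count, i, j = [], cl + cr, 0, 0
    while i < len(left) and j < len(right):
        count += 1
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, count

data = [random.random() for _ in range(2000)]
slow = bubble_sort_comparisons(data)
_, fast = merge_sort_comparisons(data)
print(slow, fast)  # roughly n^2/2 vs. at most n*log2(n) comparisons
```

For n = 2000 the O(n²) algorithm needs on the order of 100x more comparisons; that one early jump dwarfs every refinement made since, which mirrors the diminishing-returns pattern the paragraph describes.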
Either way, both compute and algorithms, even if we make a magical breakthrough in quantum computing tomorrow, are in the end limited by data. DeepMind showed in 2022 (see also here) that more compute only makes sense if you have more data to feed it. So even if we get exponentially scaling compute and algorithms, that would only give us the current models faster, not better. So what are the limits of data?
We're reaching the limits of training data
Intuitively, I think it makes sense that data should be the limiting factor of AI growth. A human with an IQ of 150 growing up in the rainforest will be very good at identifying plants, but won't all of a sudden discover quantum physics. Similarly, an AI trained only on images of trees, even with 100 times more compute than we have now, will not be able to make progress in quantum physics.
(This is where we start to get less quantitative and more hand-wavy, but stay with me.) I think it's fair to assume that a large part of human knowledge is stored in books and on the internet. We are already using most of this to train AIs. OpenAI didn't publish what data they are using to train their models, but let's say it's 10% of all of the internet and books. Since AI models need exponentially growing training data to get linear performance improvements, that would mean that we can only expect relatively small improvements by feeding them the remaining 90% of the internet, which isn't exactly exponential takeoff.
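The "exponential data for linear improvement" intuition can be sketched with a toy power-law scaling relation of the kind reported in neural scaling-law papers. The exponent alpha = 0.1 here is purely illustrative; real measured exponents vary by task, model, and dataset:

```python
# Toy power-law data scaling: loss ~ (data_fraction) ** (-alpha).
# alpha = 0.1 is an illustrative value, not a measured one.
def loss(data_fraction, alpha=0.1):
    return data_fraction ** (-alpha)

l_10pct = loss(0.1)   # trained on 10% of the corpus
l_100pct = loss(1.0)  # trained on all of it
improvement = 1 - l_100pct / l_10pct
print(f"{improvement:.0%}")  # prints 21%
```

Under a power law, each 10x increase in data buys only a constant multiplicative loss reduction, so going from 10% to 100% of the corpus gives one such step, and there is no second 10x step left to take.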
So let's say we already use all of the internet and books as training data. What else could we do? One extreme option would be to strap a camera and a microphone (similar to Google Glass) to every human, record everything, and feed all of that data into a neural network. Even if we ignore the time it takes to record this data (more on this in the next paragraph), I would argue that the additional information in there is not of the same quality as books and the internet. Language is an information compression tool: we condensed everything we learnt as a human species over the last centuries into books. The additional knowledge gained from following us around would be marginal. Maybe the AI would get a bit better at gossiping, maybe it would get scientific discoveries a year earlier than they are published, or understand human emotions better, but in the end, there is not much to be seen there if the AI has already been exposed to all of our written knowledge.
But even if we reach the limits of training data, can't AI just generate more data?
There are natural limits to the growth of knowledge
"AI will improve itself", "AI will spiral out of control", "AI will enter a positive feedback loop of learning": these claims all assume that through reasoning alone, AI will be able to get better and better, sidestepping all the limitations we have looked at so far. We have already seen that even if AI could come up with a better training algorithm, that would help only marginally; what it would have to do is generate novel data and knowledge on a large scale.
I'd argue that if it were that easy, science wouldn't be that hard. There is a reason why we have separate fields of experimental and theoretical physics. A lot of things work in theory, until they are tested in the real world. And that testing is getting more and more cumbersome: while the number of scientific papers has been growing exponentially, in many fields the number of breakthrough discoveries has actually been shrinking exponentially. In pharma there is even the famous Eroom's law ("Moore" spelled backwards): drug discovery is getting exponentially more difficult. Since the Scientific Revolution, we have picked all the "low-hanging fruit", and it's getting increasingly difficult to "generate more data" in the sense of generating knowledge.
I'm sure AI will be able to generate a lot of very good hypotheses by taking in all the current human knowledge, combining it, and advancing science that way, but testing these hypotheses in the real world is a manual process that takes a lot of time. It's not something that can explode overnight, and judging by the recent struggles of science, we're reaching limits that AI will probably face sooner rather than later.
But AI doesn't have to act in the real world and collect real data. Can't it just improve in a simulation, just like the Go, chess, and StarCraft AIs played against themselves in simulations to improve?
We can't simulate knowledge acquisition
We can't simulate our world. If we could, we could generate infinite data, but the data we can simulate is only as good as the assumptions we put into the simulation, so it's again limited by current human knowledge.
Yes, we are using simulations right now to train self-driving cars, and they'll probably eventually get better than humans, but they are limited by the assumptions we put into the simulation. They won't be able to anticipate situations that we didn't think of.
The great thing about Go, chess, and StarCraft is that all of these can be easily simulated, which allows AIs to generate knowledge across millions of iterations. The world they are tested in is the same simulated world they are trained in, so this works. Anybody who has ever tried to make a robot trained in a simulation work in real life knows that, unfortunately, this doesn't easily translate. Simulations are inherently limited by the assumptions we put into them. There is a reason why AIs that live in a purely theoretical space (such as language models or video game AIs) have had amazing breakthroughs, while robots still struggle with grasping arbitrary objects. As an example, just compare the recent video of DeepMind's robot soccer players falling over and over again with their impressive advances in StarCraft.
Another way to get around the time it takes to generate novel data would be to massively parallelize it: an AI could make infinite copies of itself, and if every copy learns something and pools that knowledge, that would result in exponential scaling. A chatbot with access to the internet could learn exponentially just by making exponential numbers of copies of itself. However, this will need a lot of resources and will still be bound by the time it takes the AIs to perform individual actions or measurements in the real world. While this can speed up AI development, it will still be slow compared to the feedback loops most people imagine when thinking of AI progress. Google just closed down yet another of their robot experiments (Everyday Robots) that used this as part of their strategy for learning.
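That bound can be sketched with an Amdahl's-law-style model: copies of an AI can parallelize the "thinking" part of knowledge generation, but each real-world experiment takes fixed wall-clock time and only so many can physically run at once. All the numbers here (the share of work that is pure computation, the number of available "labs") are made up for illustration:

```python
# Amdahl-style sketch: the compute-bound part of the work parallelizes over
# n_copies, but the real-world part is capped by n_labs physical experiment
# slots. think_frac and n_labs are illustrative assumptions, not estimates.
def wallclock_speedup(n_copies, think_frac=0.2, n_labs=10):
    exp_frac = 1 - think_frac  # share of work that needs real-world experiments
    t = think_frac / n_copies + exp_frac / min(n_copies, n_labs)
    return 1 / t

for n in (1, 10, 100, 10000):
    print(n, round(wallclock_speedup(n), 1))
# The speedup saturates around 12.5x: going from 100 to 10,000 copies
# buys almost nothing once the physical experiment slots are full.
```

Under these assumptions, copying yourself a million times over changes the answer by a rounding error; the serial, physical part of knowledge generation sets the ceiling.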
There are natural limits to the predictability of the world
But what if we actually don't need more data? What if all the knowledge we already have as humans, combined in one artificial brain, and with a misguided value system, is enough to outwit our species?
Let me make the most hand-wavy argument so far: the world is a random, complex system. We can't predict the weather more than three days in advance, let alone what Trump will tweet tomorrow. There is no reason to believe that an AI 1000x smarter than us would be able to do this, because in complex systems, small changes in state can have a massive effect on the outcome. We don't know the full state, and a 1000x smarter AI also won't know the full state, due to the difficulty of acquiring knowledge from the real world discussed above. "No plan survives contact with the enemy"; that's because it is impossible, no matter how much compute you have, to predict the enemy. An AI can probably make better guesses than we can, and come up with more alternative plans than we can, but it cannot beat the combinatorial explosion. It has to work with best-guess estimates, and these will very quickly lose value, in the same way our best-guess estimate of the weather loses value a few days into the future.
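A classic toy illustration of this is the chaotic logistic map. This is a minimal sketch, not a model of the real world: even when the dynamics are known exactly, an unmeasurably small error in the initial state destroys long-range prediction, and no amount of extra compute recovers it.

```python
# Sensitive dependence on initial conditions in the chaotic logistic map
# x -> r * x * (1 - x) with r = 4. A 1e-10 error in the initial state grows
# until the two trajectories are completely uncorrelated.
def trajectory(x0, steps, r=4.0):
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = trajectory(0.3, 60)
b = trajectory(0.3 + 1e-10, 60)  # "perfect" knowledge, off by one part in 10^10

early = max(abs(x - y) for x, y in zip(a[:11], b[:11]))
late = max(abs(x - y) for x, y in zip(a[40:], b[40:]))
print(early)  # still tiny: short-range prediction works fine
print(late)   # order 1: long-range prediction has completely failed
```

This is the weather-forecast situation in miniature: the forecaster's model can be perfect, and the forecast still decays to a best guess after a fixed horizon set by the state uncertainty, not by the forecaster's intelligence.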
So even if, due to some flaw of the above arguments, AI would actually be able to scale exponentially in intelligence, I believe that the application of this intelligence would very quickly run into the limits imposed by the unpredictability of our world, leading again to a logistic growth of power of that AI, and not to an exponential growth.
AI will be very useful and maybe even smarter than us, but it won't overpower us overnight
I have argued that AI will grow logistically, not exponentially, and that we will see the move to logistic growth quite soon as we approach the current limits of human knowledge. Tricks like simulation won't get us much further, and even if they did, the power of that AI would still only grow logistically due to the limits imposed by the unpredictability of our world.
I have looked at five different constraints on the growth of AI: compute, algorithms, data, scientific progress, and the predictability of our world. There are probably other constraints that I didn't consider that could also limit the exponential growth of AI. Claiming that AI will grow exponentially is claiming that there will be NO constraints, which is a much stronger claim than saying that there will be A constraint, because one constraint is enough to stop exponential growth.
If we accept this line of reasoning, then AI probably has an upper limit of a very, very intelligent human being who somehow manages to keep all of human knowledge in their head. That's quite impressive, but it's not the same as an exponentially growing AI. It's something we should be very careful with, but not avoid at all costs. I think it's reasonable to assume that we'll approach this limit not exponentially but logistically, with the last steps taking much more time than the first ones, which is what we are witnessing now. We will need to change our laws, adapt our intuitions, regulate the use of AI, and maybe even treat AIs as citizens, but it's not something that can kill us within a day of reaching superhuman knowledge.
With this in mind, we can focus some of our attention on monitoring AI and working to integrate it into today's world, while also not losing sight of all the other issues we are facing.
Hey, great post. I mostly agree with your points here, and agree that an intelligence explosion is incredibly unlikely, especially anytime soon.
I'm not too sure about the limits-of-algorithms point: my impression is that current AI architectures are incredibly inefficient at using data when compared to humans. So it seems like even if we hit the limit with current architectures, there's room to invent new algorithms that are better.
I'm interested in your expertise as a computational neuroscientist: do you think there are any particular insights from that field that are applicable to these discussions?
Thanks for your thoughts! When writing this up I also felt that the algorithm one is the weakest one, so let me answer from two perspectives:
From the room-to-invent-new-algorithms perspective: convolutional neural networks have been around since the 80s, and we've been using GPUs to run them for about 10 years. If there really were huge potential left, I'd be a bit surprised that we didn't find it in the last 40 years; we certainly had incentives, because hardware was so slow and people had to optimize, but of course you never know. I tried to find a paper reviewing efficiency improvements of non-negative matrix factorization over time, which I think could be a fun guide, but couldn't find one.
From the brain perspective: yes, it's puzzling that the brain can do all this on 12 watts of power while OpenAI is using server farms that consume much, much more than that. So somewhere there must be huge efficiency gains. Note that that's mostly on the training side; "evaluating" a network is pretty efficient as far as I know. For training, there could be different reasons:
Transfer learning: Maybe the "computation of evolution" just "pre-programmed" our brain, similar to how we use transfer learning. It's already pretty close to where we want it and we just need to fine-tune. Transfer learning on neural networks is already pretty cheap today. One argument supporting this is that many animals are perfectly functional from day 1 of their life without much learning. Of course not at the same level of intelligence, but still.
Hardware: The brain doesn't run on silicon. We use a very, very abstracted version of our brain, and there is much more going on biologically. Some people argue that a lot of computation is already happening in the dendrites, maybe the morphology of neurons has effects on computation, maybe the specific nonlinearity applied by the neurons is more relevant than we think, etc. One way to try to address this would be to build chips that are more similar to the brain ("neuromorphic"), but I haven't seen much progress there.
Architecture: The brain isn't a CNN. This might be a good approximation for our sensory cortices, but even there it's not the same. The brain is very recurrent, not feed-forward, and it can't send signals back through its synapses and therefore can't implement backpropagation. Maybe we're just using the wrong architecture, and if we find the right one it's going to go much faster. I did my PhD on something related to this and I gave up, haha, but of course I'm sure there are lots of things to be discovered here.
AI scaling laws refer to a specific algorithm and so are not relevant for arguing against algorithmic progress. For example, humans are much more sample efficient than LLMs right now, and so are an existence proof for more sample-efficient algorithms. I also am pretty sure that humans are far from the limits of intelligence: neuron firing speeds are on the order of 1-100 Hz, while computers can run much faster than this. Moreover, the human brain has all sorts of bottlenecks, like needing to fit through a mother's birth canal, that an AI need not have, as well as all the biases that plague our reasoning.
Epoch estimates algorithmic improvements at 0.4 OOM/year currently, and I feel that it's hard to be confident either way about which direction this will go in the future. AI-assisted AI research could dramatically increase this, but on the other hand, as you say, scaling could hit a wall.
I agree in that I don't expect the exponential to hold forever; I expect the overall growth to look more like a sigmoid, as described here (though my best-guess parameters for this model are different from the default ones). Where I disagree is that I expect the sigmoid to top out at far stronger than human level.
Thanks for this, Thomas! See my answer to titotal addressing the algorithm efficiency question in general. Note that if we follow the hand-wavy "evolutionary transfer learning" argument, that would weaken the existence proof for the sample efficiency of the human brain: the brain isn't a general-purpose tabula rasa. But I do agree with you that we'll probably find a better algorithm that doesn't scale this badly with data and can extract knowledge more efficiently.
However, I'd argue that, as before, even if we find a much, much more efficient algorithm, we are in the end limited by the growth of knowledge and the predictability of our world. Epoch estimates that we'll run out of high-quality text data next year, which I would argue is the most knowledge-dense data we have. Even if we find more efficient algorithms, once AI has learnt all this text, it'll have to start generating new knowledge itself, which is much more cumbersome than "just" absorbing existing knowledge.
I've been thinking about this specific idea:
It seems to me that you're making the point that extreme out-of-distribution domains are unreachable by generalization (at least rapidly). Let's consider that humans actually went from only identifying plants to making progress in quantum physics. How did this happen?
Humans didn't do it all of a sudden. It was only possible in stepwise fashion spanning generations, and required building on past knowledge (the way to climb ten steps up the ladder is simply to climb one step at a time, ten times over).
Human population increases meant that more people were working on learning new knowledge.
Humans had to (as you point out) gather new information (not in our rainforest training set) in order to learn new insights.
Humans often had to test their insights to gain practical knowledge (which you also point out with respect to theoretical vs. experimental physics).
If we assume that generating high-quality synthetic data would not allow for new knowledge outside of the learned domain, you would necessarily have to gather new information that humans have not gathered yet to avoid hitting the data ceiling. As long as humans are required to gather new information, it's reasonable to assume that sustained exponential improvement is unlikely, since human information-gathering speed would not increase in tandem. Okay, let's remove the human bottleneck. In this case, an exponentially improving AI would have to find a way to gather information from the outside world at exponentially increasing speeds (as well as test insights/theories at those speeds). Can you think of any way this would be possible? Otherwise, I find it hard not to reach the same conclusion as you.
Thanks for taking the time to formalize this a bit more. I think you're capturing my ideas quite well, and indeed I can't think of ways this would scale exponentially. Your point on "let's remove the human bottleneck" goes a bit in the direction of the last simulation paragraph, where I suggest that you could parallelize knowledge acquisition. But as I argue there, I think that's unrealistic to scale exponentially.
In general, I think I focused too much on the robotics examples when trying to illustrate that generating new knowledge takes time and is difficult but the same applies of course also to performing any kind of other experiment that an AI would have to do such as generating knowledge on human psychology by doing experiments with us, testing new training algorithms, performing experiments on quantum physics for chip research, etc.
I think there is plenty of room for debate about what the curve of AI progress/capabilities will look like, and I mostly skimmed the article in about ~5 minutes, but I don't think your post's content justifies the title ("Exponential AI takeoff is a myth"). "Exponential AI takeoff is currently unsupported" or "the common narrative(s) for exponential AI takeoff are based on flawed premises" are plausible conclusions from this post (even if I don't necessarily agree with them), but I think the original title would require far more compelling arguments to be justified.
(I won't get too deep into this, but I think it's plausible that there is significant "methodological overhang": humans might just struggle to make progress in some fields of research, especially softer sciences and theory-heavy sciences, because principal-agent problems in research plague the accumulation of reliable knowledge through non-experimental methods.)
Hi Harrison, thanks for stating what I guess a few people are thinking: it's a bit of a clickbait title. I do think, though, that non-exponential growth is much more likely than exponential growth, just because exponential takeoff would require no constraints on growth, while it's enough for one constraint to kick in (maybe even one I didn't consider here) to stop exponential growth.
I'd be curious about the methodological overhang, though. Are you aware of any posts / articles discussing this further?
I haven't looked very hard, but the short answer is no, I'm not aware of any posts/articles that specifically address the idea of "methodological overhang" (a phrase I hastily made up and in hindsight realize may not be totally logical) as it relates to AI capabilities.
That being said, I have written about the possibility that our current methods of argumentation and communication could be really suboptimal, here: https://georgetownsecuritystudiesreview.org/2022/11/30/complexity-demands-adaptation-two-proposals-for-facilitating-better-debate-in-international-relations-and-conflict-research/
Hi, thanks for writing this up. I agree the macro trends of hardware, software, and algorithms are unlikely to hold true indefinitely. That said, I mostly disagree with this line of thinking. More precisely, I find it unconvincing because there just isn't a lot of empirical evidence for or against these macro trends (e.g. natural limits to the growth of knowledge), so I don't really understand how you can use them to rule out certain endpoints as possibilities. And when I see an industry exec make a statement about Moore's Law, I generally assume it is only to reassure investors that the company is on the right path this quarter, rather than a profound forward-looking statement about the future of computing. For example, since that 2015 quote, Intel lost the mobile market, fell far behind on GPUs, and is presently losing the datacenter market.
There are a number of well-funded AI hardware startups right now, and a lot of money and potential improvements on hardware roadmaps, including but not limited to: exotic materials, 3D stacking, high-bandwidth interconnects, new memory architectures, and dataflow architectures. On the AI side, techniques like distillation and dropout seem to be effective at allowing much smaller models to perform nearly as well. Altogether, I don't know if this will be enough to keep Moore's law (and whatever you'd call the superlinear trend of AI models) going for another few decades, but I don't think I'd bet against it, either.
Hey Steve, thanks for those thoughts! I think I'm not more qualified than the Wikipedia community to argue for or against Moore's law; that's why I just quoted them. So I can't give more thoughts on that, unfortunately.
But even if Moore's law were to continue forever, I think the data argument would kick in. If we have infinite compute but limited information to learn from, that's still a limited model. Applying infinite compute to the MNIST dataset will give you a model that won't be much better than the latest Kaggle competitor on that dataset.
So then we end up again at the more hand-wavy arguments for limits to the growth of knowledge and the predictability of our world in general. I'd be curious where I'm losing you there.