Very interesting article. Some forecasts of AI timelines (like BioAnchors) are premised on compute efficiency continuing to progress as it has for the last several decades. Perhaps your arguments are less forceful against 5-10 year timelines to AGI, but they’re still worth exploring.
I’m skeptical of some of the headwinds you’ve identified. Let me go through my understanding of the various drivers of performance, and I’d be curious to hear how you think of each of these.
Parallelization has driven much of the recent progress in effective compute budgets. Three factors enable parallelization:
Hardware
GPUs handle parallel workloads far better than CPUs, since they have many more cores and higher memory bandwidth. Will hardware continue its current pace of improvement?
You cite an interesting paper on Nvidia GPU progress over time; it seems that the greatest speedups in consumer hardware came in the most recent generation, but improvements in industry-grade hardware peaked earlier, with the P100 in 2016.
This doesn’t strike me as strong evidence in any direction. Industrial progress has slowed, consumer progress has accelerated, and there are wide error bars on both of those statements because they’re drawn from only four data points.
Stronger evidence seems to come from Epoch’s Trends in GPU Price Performance, showing that FLOP/s per dollar has doubled every two or three years for nearly two decades. Do you expect this trend to continue, and if not, why?
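To make that question concrete, here’s rough arithmetic on what the trend implies if it continues (a minimal sketch; the 2.5-year doubling time is my rounding of Epoch’s estimate, and the horizons are arbitrary):

```python
# Rough arithmetic on a ~2.5-year doubling time in FLOP/s per dollar.
# The doubling time is an approximation of Epoch's estimate, not their exact figure.
DOUBLING_TIME_YEARS = 2.5

def price_performance_multiplier(years: float) -> float:
    """Cumulative improvement in FLOP/s per dollar over a given horizon."""
    return 2 ** (years / DOUBLING_TIME_YEARS)

print(f"Past ~18 years: ~{price_performance_multiplier(18):.0f}x")               # ~147x
print(f"Next 10 years, if the trend holds: {price_performance_multiplier(10):.0f}x")  # 16x
```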
Kernels
Software like CUDA lets developers control how computations and memory transfers are scheduled, which cuts idle time and improves performance. You say that “CUDA optimization...generated significant improvements but has exhausted its low-hanging fruit,” but I’m not sure what the argument is for that.
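For concreteness, here’s a toy sketch of one of the main things kernel-level control buys you: fusing elementwise ops so data makes fewer round trips to GPU memory. I’m using torch.compile as a stand-in for a hand-written CUDA kernel (assuming PyTorch 2.x and a CUDA GPU); the actual speedup depends on hardware and tensor sizes.

```python
# Toy illustration of kernel fusion: the eager version launches one kernel per
# op (each reading from and writing to GPU memory), while a fused kernel does
# roughly one read and one write. torch.compile stands in for a hand-tuned CUDA kernel.
import torch

def unfused(x):
    y = x * 2             # kernel 1: read x, write y
    z = y + 1             # kernel 2: read y, write z
    return torch.relu(z)  # kernel 3: read z, write result

fused = torch.compile(unfused)  # should emit a single fused elementwise kernel

x = torch.randn(10_000_000, device="cuda")   # assumes a CUDA-capable GPU
out_eager, out_fused = unfused(x), fused(x)  # first fused call includes compile time
```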
You do argue that the need for kernel optimization discourages experimentation with new algorithms. I agree, but I see a different upshot. One of the biggest reasons to be bullish on ML performance is the rise of AI programming assistants. If AI programming assistants learn kernel optimization, they’ll reduce the cost and runtime of experiments. New algorithms will be on a level playing field with incumbents, and we’ll be more likely to see algorithmic progress that was previously bottlenecked by writing CUDA kernels.
Algorithms
Some algorithms are easy to parallelize; others, not so much. For example, a key benefit of transformers is that they’re more easily parallelized than RNNs, which is what allows them to scale.
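A minimal sketch of that difference, stripped down to the dependency structure (not real model code; the attention here has no separate Q/K/V projections):

```python
# Why attention parallelizes and recurrence doesn't: the RNN's hidden state at
# step t depends on step t-1, so the loop is inherently sequential; attention
# is one batched matmul over all positions at once.
import torch

T, d = 128, 64
x = torch.randn(T, d)
W = torch.randn(d, d)

# RNN-style: each step must wait for the previous one.
h = torch.zeros(d)
for t in range(T):
    h = torch.tanh(x[t] @ W + h)

# Attention-style (simplified): all T positions computed in parallel.
scores = (x @ W) @ x.T / d ** 0.5
out = torch.softmax(scores, dim=-1) @ x
```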
Neil Thompson has some interesting work on algorithmic progress, showing that many fundamental algorithms are provably optimal or close to it. I’m not sure if this is a relevant reference class for ML algorithms though, as runtime guarantees are far less important than measured performance.
So will future algorithms be easier to parallelize? It seems likely: we’ve managed it before, and I don’t see any particular reason to expect it won’t happen again.
Overall, I don’t see strong evidence that any of these factors is hitting a hard barrier. If anything, the most relevant trend I see in the next 5 years is the rise of AI programming assistants, which could significantly accelerate progress on kernel optimization and algorithms.
I’d highlight two other factors affecting effective compute budgets:
Spending. Maybe nobody will spend more than $10B on a training run, and the current trend will slow. But if we’re in a very short timelines world, then AI could be massively profitable in the next few years, and OpenAI might get the $100B investment they’ve been talking about.
Better ML models. Some models learn more efficiently than others. Right now, algorithmic progress halves the compute needed to reach a fixed level of performance every 16 months or every 9 months, depending on how you look at it. (That research focuses on efficiently reaching an existing level of performance; I’m not sure how well we should expect it to generalize to improvements in SOTA performance.) Again, AI coders could accelerate this.
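Rough arithmetic on what those halving times would compound to if they held for another five years (the figures are approximate, and the extrapolation is obviously speculative):

```python
# What 16-month vs. 9-month halving times in training compute (for fixed
# performance) would compound to over five years, if the trend continued.
for halving_months in (16, 9):
    reduction = 2 ** (60 / halving_months)
    print(f"{halving_months}-month halving time -> ~{reduction:.0f}x less compute in 5 years")
# 16 months -> ~13x; 9 months -> ~102x
```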
Stepping back: I used to argue that AI progress would soon slow, but I’ve lost a lot of Bayes points to folks like Elon, Sam Altman, and Daniel Kokotajlo. A slowdown is entirely possible, perhaps even likely. But it’s a live possibility that human-level AI transforms the world within the span of only a few years. Safety efforts should address the full range of possible outcomes, but short-timeline scenarios are the most dangerous and most neglected, so that’s where I’m focusing most of my attention right now.