(~Crossposting from Twitter.)

I sometimes see people say stuff like:

Those forecasts were misguided. If they ended up with good answers, that’s accidental; the trends they extrapolated from have hit limits… (Skeptics get Bayes points.)
But IMO it’s not a fluke that the “that curve is going up, who knows why” POV has done well.
A sketch of what I think happens:
There’s a general dynamic here that goes something like:
- Some people point to a curve going up (and maybe note some underlying driver)
- Others point out that the drivers have inherent constraints (this is an s-curve, not an exponential)
- And then in *some* sense the bottlenecks crowd turns out to be right (the specific driver/paradigm peters out, there’s literally no more space for more transistors, companies run low on easily accessible/high quality training data, etc.)…
- …but then a “surprise new thing” pops up and fills the gap, such that the “true” thing we cared about (whether or not it’s what we were originally measuring) *does* actually continue as people originally predicted, apparently naively
- (and it turns out that the curve consists of a stack of s-curves… see the rough numerical sketch below)
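To make that last bullet concrete, here’s a rough numerical sketch (all parameters are invented for illustration): a stack of staggered logistic S-curves, each of which saturates on its own, can collectively track a single exponential across many orders of magnitude.

```python
import numpy as np

# Illustration only: a "stack" of logistic S-curves, each with its own ceiling
# and its own takeoff time, whose sum keeps tracking a single exponential even
# though every individual component flattens out.

def logistic(t, ceiling, midpoint, rate=0.5):
    """One S-curve: ~exponential early on, then saturating at `ceiling`."""
    return ceiling / (1.0 + np.exp(-rate * (t - midpoint)))

t = np.linspace(0, 50, 501)

# Each successive "driver" has a ~10x higher ceiling and kicks in ~10 units later.
stack = sum(logistic(t, ceiling=10.0**k, midpoint=10.0 * k) for k in range(6))

# A plain exponential with the matching average rate (one decade per 10 units).
exponential = np.exp(np.log(10) / 10 * t)

ratio = stack / exponential
print(f"stack spans 10^{np.log10(stack[0]):.1f} .. 10^{np.log10(stack[-1]):.1f}")
print(f"stack / exponential stays within [{ratio.min():.2f}, {ratio.max():.2f}]")
# The ratio stays within a small constant factor (~0.5-0.7 here) across ~5
# orders of magnitude, even though no single S-curve grows for long.
```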
We can go too far with this kind of reasoning; some “true things we care about” (e.g. spread of a disease) *are* in fact s-curves, bounded, etc., and only locally look like ~exponentials. (So: no, we shouldn’t expect the baby to weigh trillions of pounds by age 10...)
But I think the more granular, gears-oriented view — which considers how long specific drivers of the progress we’re seeing could continue, etc. — often underrates the extent to which *other forces* can (and often do) jump in when earlier drivers lose momentum.
“The Bypass Principle: How AI flows around obstacles” from Eric Drexler is a very related (and IMO good) post. Quote (bold mine):

While shallow assessments focus on visible obstacles — the difficulties of matching human capabilities, of overcoming regulatory barriers, and of restructuring organizations — AI-enabled developments will often find paths that bypass rather than overcome apparent barriers. Existing obstacles are concrete and obvious in a way that alternatives are not. Skewed judgment follows.
(This stuff isn’t new; many people have pointed out these kinds of dynamics. But I feel like I’m still seeing them a fair bit — and this came up recently — so I wanted to write this note.)
I’m skeptical of an “exponentials generally continue” prior which is supposed to apply super-generally. For example, here’s a graph of world population since 1000 AD; it’s an exponential, but actually there are good mechanistic reasons to think it won’t continue along this trajectory. Do you think it’s very likely to?
I don’t personally have well-developed thoughts on population growth, but note that “population growth won’t continue to be exponential” is a prediction with a notoriously bad track record.
It has? The empirical track record has been one of slowing global population growth, with the growth rate peaking about 60 years ago:
The population itself also looks pretty linear for the last ~60 years, at a rate of roughly 1B every ~12 years:
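(As a quick sanity check of that “~1B every ~12 years” figure, here’s a rough sketch using the approximate, commonly cited milestone years at which world population reached each additional billion:)

```python
# Rough check of the "~1B every ~12 years" figure, using the approximate,
# commonly cited milestone years at which world population reached each
# additional billion.
milestone_years = {"4B": 1974, "5B": 1987, "6B": 1999, "7B": 2011, "8B": 2022}

years = list(milestone_years.values())
gaps = [later - earlier for earlier, later in zip(years, years[1:])]
print(f"years per extra billion: {gaps}, mean = {sum(gaps) / len(gaps):.1f}")
# -> years per extra billion: [13, 12, 12, 11], mean = 12.0
# i.e. roughly linear growth of ~1B per ~12 years over the period.
```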
I think the general story is something like this:
The (observable) universe is a surprisingly small place when plotted on a log scale. Thus anything growing exponentially will hit physical ceilings if projected forward long enough: a population of 1000 humans growing at 0.9% p.a. roughly equals the number of atoms in the universe after 20 000 years.
(Even uploading digital humans onto computronium only gets you a little further on the log scale: after 60 000 years our population has grown to 2E236, so ~E150 people per atom, and E50 per Planck volume. If that number is not sufficiently ridiculous, just let the growth run for a million more years and EE notation starts making sense (10^10^x).)
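(For what it’s worth, those orders of magnitude are easy to reproduce. A quick back-of-the-envelope check, taking ~1e80 atoms and ~1e185 Planck volumes in the observable universe as my own reference figures rather than numbers from the comment, roughly reproduces the values above:)

```python
import math

# Back-of-the-envelope check of the growth figures above.
# Reference magnitudes (rough standard estimates, not taken from the comment):
LOG10_ATOMS_IN_OBSERVABLE_UNIVERSE = 80             # ~1e80 atoms
LOG10_PLANCK_VOLUMES_IN_OBSERVABLE_UNIVERSE = 185   # ~1e185 Planck volumes

def log10_population(years, initial=1000, annual_growth=0.009):
    """log10 of a population growing at a fixed exponential rate for `years` years."""
    return math.log10(initial) + years * math.log10(1 + annual_growth)

print(f"after 20,000 years: ~1e{log10_population(20_000):.1f} people "
      f"(vs ~1e{LOG10_ATOMS_IN_OBSERVABLE_UNIVERSE} atoms in the universe)")

lp = log10_population(60_000)
print(f"after 60,000 years: ~1e{lp:.0f} people, "
      f"~1e{lp - LOG10_ATOMS_IN_OBSERVABLE_UNIVERSE:.0f} per atom, "
      f"~1e{lp - LOG10_PLANCK_VOLUMES_IN_OBSERVABLE_UNIVERSE:.0f} per Planck volume")
```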
There are usually also ‘practical’ ceilings which look implausible to reach long before you’ve exhausted the universe: “If this stock keeps doubling in value, this company would be 99% of the global market cap in X years”, “Even if the total addressable consumer market is the entire human population, people aren’t going to be buying multiple subscriptions to Netflix each.”, etc.
So ~everything is ultimately an S-curve. Yet although ‘this trend will start capping out somewhere’ is a very safe bet, ‘calling the inflection point’ before you’ve passed it is known to be extremely hard. Sigmoid curves in their early days are essentially indistinguishable from exponential ones, and the extra parameter which ~guarantees they can better (over?)fit the points on the graph than a simple exponential gives very unstable estimates of the putative ceiling the trend will ‘cap out’ at. (cf. 1, 2.)
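(A toy illustration of that indistinguishability point, my own sketch rather than anything from the comment: for any ceiling you like, you can place a logistic’s midpoint so that its early segment matches the same exponential, so early data barely constrains where the curve caps out.)

```python
import numpy as np

# Why the ceiling of a sigmoid is so hard to pin down from its early phase:
# a logistic L / (1 + exp(-r*(t - t0))) with midpoint t0 = ln(L/A)/r looks like
# the same exponential A*exp(r*t) while it is far below its ceiling, for *any* L.
def matched_logistic(t, ceiling, a=1.0, r=0.5):
    t0 = np.log(ceiling / a) / r          # midpoint chosen to match A*exp(r*t) early on
    return ceiling / (1.0 + np.exp(-r * (t - t0)))

t = np.linspace(0, 12, 121)               # the "observed" early window
exponential = 1.0 * np.exp(0.5 * t)

for ceiling in (1e5, 1e6, 1e7):
    curve = matched_logistic(t, ceiling)
    max_rel_gap = np.max(np.abs(curve - exponential) / exponential)
    print(f"ceiling {ceiling:.0e}: max deviation from the exponential "
          f"over the window = {max_rel_gap:.2%}")
# Ceilings spanning a factor of 100 all stay within ~0.5% of the same
# exponential on this window, i.e. well inside realistic measurement noise,
# so fitting early data gives essentially no information about where the
# curve will cap out.
```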
Many important things turn on (e.g.) ‘scaling is hitting the wall ~now’ vs. ‘scaling will hit the wall roughly at the point of the first Dyson sphere data center’. As the universe is a small place on a log scale, this range is easily spanned by different analysis choices in how you project forward.
Without strong priors on ‘inflecting soon’ vs. ‘inflecting late’, forecasts tend to be volatile: is this small blip above or below trend really a blip, or a sign we’re entering a faster/slower regime?
(My guess is the right ur-prior favours ‘inflecting soon’ weakly and in general, although exceptions and big misses abound. In most cases, you have mechanistic steers you can appeal to which give much more evidence. I’m not sure AI is one of them, as it seems a complete epistemic mess to me.)
I tried to clarify things a bit in this reply to titotal: https://forum.effectivealtruism.org/posts/iJSYZJJrLMigJsBeK/lizka-s-shortform?commentId=uewYatQz4dxJPXPiv

In particular, I’m not trying to make a strong claim about exponentials specifically, or that things will line up perfectly, etc.
(Fwiw, though, it does seem possible that if we zoom out, recent/near-term population growth slow-downs might be functionally a ~blip if humanity or something like it leaves the Earth. Although at some point you’d still hit physical limits.)
And why, exactly, would you expect every single new development to show up at exactly the right time so as to make the overall curve remain exponential? What your view actually predicts is that progress will be a series of S-curves… but it says nothing about how long the flat bits in between will be.
Even within the history of AI, we have seen S-curves flatten out: there have been AI winters that lasted literal decades.
Do you know if the AI winters actually broke a trend in the way OP is talking about? E.g. you can’t easily see winters in a graph of chess performance:
(Wikipedia says that the AI winters were 1974-80 and 1987-2000. Also minor note that neither of these winters lasted “literal decades”.)
Oh, apologies: I’m not actually trying to claim that things will be “exactly exponential”. We should expect some amount of ~variation in progress/growth (these are rough models, we shouldn’t be too confident about how things will go, etc.), and what’s actually going on is (probably a lot) more complicated than a simple/neat progression of new s-curves, etc.
The thing I’m trying to say is more like:
When we’ve observed some datapoints about a thing we care about, and they seem to fit some overall curve (e.g. exponential growth) reasonably well,
then pointing to specific drivers that we think are responsible for the changes — & focusing on how those drivers might progress or be fundamentally limited, etc. — often makes us (significantly) overestimate bottlenecks/obstacles standing in the way of progress on the thing that we actually care about.
And placing some weight on the prediction that the curve will simply continue[1] seems like a useful heuristic / counterbalance (and has performed well).
(Apologies if what I’d written earlier was unclear about what I believe — I’m not sure if we still notably disagree given the clarification?)
A different way to think about this might be something like:
The drivers that we can point to are generally only part of the picture, and they’re often downstream of some fuzzier higher-level/“meta” force (or a portfolio of forces) like “incentives+...”
It’s usually quite hard to draw a boundary around literally everything that’s causing some growth/progress
It’s also often hard to imagine, from a given point in time, very different ways of driving the thing forward
(e.g. because we’ve yet to discover other ways of making progress, because proxies we’re looking at locally implicitly bake in some unnecessary assumptions about how progress on the thing we care about will get made, etc.)
So our stories about what’s causing some development that we’re observing are often missing important stuff, and sometimes we should trust the extrapolation more than the stories / assume the stories are incomplete
Something like this seems to help explain why views like “the curve we’re observing will (basically) just continue” have seemed surprisingly successful, even when the people holding those “curve go up” views justified their conclusions via apparently incorrect reasoning about the specific drivers of progress. (And so IMO people should place non-trivial weight on stuff like “rough, somewhat naive-seeming extrapolation of the general trends we’re observing[2].”[3])
[See also a classic post on the general topic, and some related discussion here, IIRC: https://www.alignmentforum.org/posts/aNAFrGbzXddQBMDqh/moore-s-law-ai-and-the-pace-of-progress ]

Caveat: I’d add “...on a big range / the scale we care about”; at some point, ~any progress would start hitting ~physical limits. But if that point is after the curve reshapes ~everything we care about, then I’m basically ignoring that consideration for now.
Obviously there are caveats. E.g.:
- the metrics we use for such observations can lead us astray in some situations (in particular they might not ~linearly relate to “the true thing we care about”)
- we often have limited data, we shouldn’t be confident that we’re predicting/measuring the right thing, things can in fact change over time and we should also not forget that, etc.

(I think there were nice notes on this here, although I’ve only skimmed and didn’t re-read https://arxiv.org/pdf/2205.15011 )

Also, sometimes we do know what
And placing some weight on the prediction that the curve will simply continue[1] seems like a useful heuristic / counterbalance (and has performed well).
“and has performed well” seems like a good crux to zoom in on; for which reference class of empirical trends is this true, and how true is it?
It’s hard to disagree with “place some weight”; imo it always makes sense to have some prior that past trends will continue. The question is how much weight to place on this heuristic vs. more gears-level reasoning.
For a random example, observers in 2009 might have mispredicted Spanish GDP over the next ten years if they placed a lot of weight on this prior.
Ah, @Gregory Lewis🔸 says some of the above better. Quoting his comment:

So ~everything is ultimately an S-curve. Yet although ‘this trend will start capping out somewhere’ is a very safe bet, ‘calling the inflection point’ before you’ve passed it is known to be extremely hard. Sigmoid curves in their early days are essentially indistinguishable from exponential ones, and the extra parameter which ~guarantees they can better (over?)fit the points on the graph than a simple exponential gives very unstable estimates of the putative ceiling the trend will ‘cap out’ at. (cf. 1, 2.)

Many important things turn on (e.g.) ‘scaling is hitting the wall ~now’ vs. ‘scaling will hit the wall roughly at the point of the first Dyson sphere data center’. As the universe is a small place on a log scale, this range is easily spanned by different analysis choices in how you project forward.

Without strong priors on ‘inflecting soon’ vs. ‘inflecting late’, forecasts tend to be volatile: is this small blip above or below trend really a blip, or a sign we’re entering a faster/slower regime?