Hi David,

Thank you for this thoughtful response, and for all of your comments on the document! I agree with much of what you say here.
(No need to respond to the below thoughts, since they somehow ended up quite a bit longer than I intended.)
Kahneman and Tversky showed that incorporating perspectives that neglect inside information (in this case the historical specifics of growth accelerations) can reduce our ignorance about the future, or at least the immediate future. This practice can improve foresight both formally (leading experts to take weighted averages of predictions based on inside and outside views) and informally (through the productive friction that occurs when people are challenged to reexamine assumptions). So while I think the feeling expressed in the quote is understandable, it’s also useful to challenge it.
This is well put. I do agree with this point, and don’t want to downplay the value of taking outside view perspectives.
As I see it, there are a couple of different reasons to fit hyperbolic growth models — or, rather, models of form (dY/dt)/Y = aY^b + c — to historical growth data.
First, we might be trying to test a particular theory about the causes of the Industrial Revolution (Kremer’s “Two Heads” theory, which implies that pre-industrial growth ought to have followed a hyperbolic trajectory).[1] Second, rather than directly probing questions about the causes of growth, we can use the fitted models to explore outside view predictions — by seeing what the fitted models imply when extrapolated forward.
I read Kremer’s paper as mostly being about testing his growth theory, whereas I read the empirical section of your paper as mostly being about outside-view extrapolation. I’m interested in both, but probably more directly interested in probing Kremer’s growth theory.
I think that different aims lead to different emphases. For example: For the purposes of testing Kremer’s theory, the pre-industrial (or perhaps even pre-1500) data is nearly all that matters. We know that the growth rate has increased in the past few hundred years, but that’s the thing various theories are trying to explain. What distinguishes Kremer’s theory from the other main theories — which typically suggest that the IR represented a kind of ‘phase transition’ — is that Kremer’s predicts an upward trend in the growth rate throughout the pre-modern era.[2] So I think that’s the place to look.
On the other hand, if the purpose of model fitting is trend extrapolation, then there’s no particular reason to fit the model only to the pre-modern data points; this would mean pointlessly throwing out valuable information.
A lot of the reason I’m skeptical of Kremer’s model is that it doesn’t seem to fit very well with the accounts of economic historians and their descriptions of growth dynamics. His model seems to leave out too much and to treat the growth process as too homogeneous across time. “Growth was faster in 1950AD than in 10,000BC mainly because there were more total ideas for new technologies each year, mainly because there were more people alive” seems really insufficient as an explanation; it seems suspicious that the model leaves out all of the other salient differences that typically draw economic historians’ attention. Are changes in institutions, culture, modes of production, and energetic constraints really all secondary enough to be slipped into the error term?[3]
But one definitely doesn’t need to ‘believe’ the Kremer model — which offers one explanation for why long-run growth would follow a consistent hyperbolic trajectory — to find it useful to make growth extrapolations using simple hyperbolic models. The best case for giving significant weight to the outside view extrapolations, as I understand it, is something like (non-quote):
We know that growth rates permanently increased in the centuries around the Industrial Revolution. Constant exponential growth models therefore fit long-run growth data terribly. Models of form (dY/dt)/Y = aY^b can fit the data much better, since they allow the growth rate to increase. If we fit one of these models to the long-run growth data (with an error term to account for stochasticity) we find that b > 0, implying hyperbolically increasing growth rates. Extrapolated forward, this implies that infinite rates are nearly inevitable in the future. While we of course know that growth won’t actually become infinite, we should still update in the direction of believing that much faster growth is coming, because this is the simplest model that offers an acceptably good fit, and because we shouldn’t be too confident in any particular inside view model of how economic growth works.
I do think this line of thinking makes sense, but in practice don’t update that much. While I don’t believe any very specific ‘inside view’ story about long-run growth, I do find it easy to imagine that there was a phase change of one sort or another around the Industrial Revolution (as most economic historians seem to believe). The economy has also changed enough over the past ten thousand years to make it intuitively surprising to me that any simple unified model, without phase changes or piecewise components, could actually do a good job of capturing growth dynamics across the full period.
I think that a more general prior might also be doing some work for me here. If there’s some variable whose growth rate has recently increased substantially, then a hyperbolic model — (dY/dt)/Y = a*Y^b, with b > 0 — will often be the simplest model that offers an acceptable fit. But I’m suspicious that extrapolating out the hyperbolic model will typically give you good predictions. It will more often turn out to be the case that there was just a kind of phase change.
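To make the "infinite rates" point concrete, here is a worked solution of the deterministic version of the model. This is a standard separable-equation calculation rather than anything taken from either paper; Y_0 (the value at t = 0) and t^* (the blow-up time) are symbols introduced here purely for illustration, and the error term is ignored.

```latex
\begin{align}
\frac{1}{Y}\frac{dY}{dt} &= a Y^{b}, \qquad a > 0,\ b > 0 \\
Y^{-(1+b)}\,dY &= a\,dt \\
Y(t)^{-b} &= Y_0^{-b} - a b\, t \\
Y(t) &= \left(Y_0^{-b} - a b\, t\right)^{-1/b},
\qquad t^{*} = \frac{1}{a b\, Y_0^{b}}
\end{align}
```

Under these assumptions Y diverges as t approaches t^*, for any b > 0. With b = 0 the same equation reduces to ordinary exponential growth, Y_0 e^{at}, which has no finite blow-up time. That is why the sign of b, rather than its exact value, carries most of the weight in the extrapolation.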
To be clear, the paper seems to shift between two definitions of hyperbolic growth: usually it’s B = 1 (“proportional”), but in places it’s B > 0. I think the paper could easily be misunderstood to be rejecting B > 0 (superexponential growth/singularity in general) in places where it’s actually rejecting B = 1 (superexponential growth/singularity with a particular speed). This is the sense in which I’d prefer less specificity in the statement of the hyperbolic growth hypothesis.
I think this is a completely valid criticism.
I agree that B > 0 is the more important hypothesis to focus on (and it’s of course what you focus on in your report). I started out investigating B = 1, then updated parts of the document to be about B > 0, but didn’t ultimately fully switch it over. Part of the issue is that B = 0 and B = 1 are distinct enough to support at least weak/speculative inferences from the radiocarbon graphs. This led me to mostly focus on B > 0 when talking about the McEvedy data, but focus on B = 1 when talking about the radiocarbon data. I think, though, that this mixing-and-matching has resulted in the document being somewhat confusing and potentially misleading in places.
To be more concrete, look back at the qualifiers in the HGH statement: “tended to be roughly proportional.” Is the HGH, so stated, falsifiable? Or, more realistically, can it be assigned a p value? I think the answer is no, because there is no explicitly hypothesized, stochastic data generating process.
I think that this is also a valid criticism: I never really say outright what would count as confirmation, in my mind.
Supposing we had perfectly accurate data, I would say that a necessary condition for considering the data “consistent” with the hypothesis is something like: “If we fit a model of form (dP/dt)/P = a*P^b to population data from 5000BC to 1700AD, and use a noise term that models stochasticity in a plausible way, then the estimated value of b should not be significantly less than .5”
I only ran this regression using normal noise terms, rather than using the more theoretically well-grounded approach you’ve developed, so it’s possible the result would come out different if I reran it. But my concerns about data quality have also had a big influence on my sloppiness tolerance here: if a statistical result concerning (specifically) the pre-modern subset of the data is sufficiently sensitive to model specification, and isn’t showing up in bright neon letters, then I’m not inclined to give it much weight.
(These regression results ultimately don’t have a substantial impact on my views, in either direction.)
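As a rough illustration of the kind of check described above, here is a minimal sketch that recovers b from a population series by ordinary least squares in logs. It is not the regression run for the document, and it is not the more theoretically grounded stochastic approach mentioned above. The data are synthetic and noise-free, the function names and parameter values are made up for the example, and a log-log OLS fit implicitly assumes additive, roughly normal errors in the log growth rate, which may be a poor noise model for real historical data.

```python
import numpy as np


def simulate_population(p0, a, b, times):
    """Closed-form solution of (dP/dt)/P = a * P**b with P(0) = p0.

    Only valid for times before the blow-up time 1 / (a * b * p0**b).
    """
    return (p0 ** (-b) - a * b * times) ** (-1.0 / b)


def estimate_b(times, pop):
    """Estimate (b, a) by regressing log growth rate on log population."""
    dt = np.diff(times)
    growth = np.diff(np.log(pop)) / dt      # average growth rate per interval
    mid_pop = np.sqrt(pop[:-1] * pop[1:])   # geometric midpoint of each interval
    slope, intercept = np.polyfit(np.log(mid_pop), np.log(growth), 1)
    return slope, np.exp(intercept)


# Hypothetical parameters: 6,700 years observed every 100 years.
times = np.linspace(0.0, 6700.0, 68)
true_a, true_b, p0 = 1e-4, 0.5, 5.0
pop = simulate_population(p0, true_a, true_b, times)

b_hat, a_hat = estimate_b(times, pop)
print(f"estimated b = {b_hat:.3f}, estimated a = {a_hat:.2e}")
```

With real data the growth rate can be negative over some intervals, so the log transform fails and a different noise specification is needed. That is one reason a result like this can be sensitive to model specification.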
I believe this sort of fallacy is present in the current draft of Ben’s paper, where it says, “Kremer’s primary regression results don’t actually tell us anything that we didn’t already know: all they say is that the population growth rate has increased.”
I think this was an unclear statement on my part. I’m referring to the linear and non-linear regressions that Kremer runs on his population dataset (Tables II and IV), showing that population is significantly predictive of population growth rates for subsets that contain the Industrial Revolution. I didn’t mean to include his tests for heteroskedasticity or stability in that comment.
In my first attack on modeling long-term growth, I chose to put a lot of work into the simpler hyperbolic model because I saw an opportunity to improve its statistical expression, in particular by modeling how random growth shocks at each infinitesimal moment feed into the growth process to shape the probability distribution for growth over finite periods such as 10 years. This seemed potentially useful for two reasons. For one, since it was hard to do, it seemed better to do it in a simpler model first.
For another, it allowed a rigorous test of whether second-order effects (the apparently episodic character of growth accelerations) could be parsimoniously viewed as mere noise within a simpler pattern of long-term acceleration. Within the particular structure of my model, the answer was no. For example, after being fit to the GWP data for 10,000 BCE to 1700 CE, my model is surprised at how high GWP was in 1820, assigning that outcome a p value of ~0.1. Ben’s paper presents similar findings, graphically.
Just wanted to say that I believe this is useful too! Beyond the reasons you list here, I think that your modeling work also gives a really interesting insight into — and raises really interesting questions about — the potential for path-dependency in the human trajectory. I found it very surprising, for example, that re-rolling-out the fitted model from 10,000BC could give such a wide range of potential dates for the growth takeoff.
But, as noted, it’s not clear that stipulating an episodic character should in itself shift one’s priors on the possibility of singularity-like developments.
I think that it should make a difference, although you’re right to suggest that the difference may not be huge. If we were fully convinced that the episodic model was right, then one natural outside view perspective would be: “OK, the growth rate has jumped up twice over the course of human history. What are the odds it will happen at least once more?”
This particular outside view should spit out a greater than 50% probability, depending on the prior used. It will be lower than the probability that the hyperbolic trend extrapolation outside view spits out, but, by any conventional standard, it certainly won’t be low!
Whichever view of economic history we prefer, we should make sure to have our seatbelts buckled.
[1] I’m saying Kremer’s “theory” rather than Kremer’s “model” to avoid ambiguity: when I mention “models” in this comment I always mean statistical models, rather than growth models.
[2] I don’t know, of course, if Kremer would actually frame the empirical part of the paper quite this way. But if all the paper showed is that growth increased around the Industrial Revolution, this wouldn’t really be a very new/informative result. The fact that he’s also saying something about pre-modern growth dynamics (potentially back to 1 million BC) seems like the special thing about the paper, and the thing the paper emphasizes throughout.
[3] To stretch his growth theory in an unfair way: If there’s a slight low-hanging fruit effect, then the general theory suggests that, if you kept the world exactly as it was in 10000BC but bumped its population up to 2020AD levels (potentially by increasing the size of the Earth), these hunter-gatherer societies would soon start to experience much higher rates of economic growth/innovation than what we’re experiencing today.
I agree with much of this. A few responses.
As I see it, there are a couple of different reasons to fit hyperbolic growth models — or, rather, models of form (dY/dt)/Y = aY^b + c — to historical growth data.
...
I think the distinction between testing a theory and testing a mathematical model makes sense, but the two are intertwined. A theory will tend naturally to imply a mathematical model, but perhaps less so the other way around. So I would say Kremer is testing both a theory and a model, not confined to just one side of that dichotomy. Whereas as far as I can see the sum-of-exponentials model is, while intuitive, not so theoretically grounded. Taken literally, it says the seeds of every economic revolution that has occurred and will occur were present 12,000 years ago (or in Hanson (2000), 2 million years ago), and it’s just taking them a while to become measurable. I see no framework behind it that predicts how the system will evolve as a function of its current state rather than as a function of time. Ideally, the second would emerge from the first.
Note that what you call Kremer’s “Two Heads” model predates him. It’s in the endogenous growth theory of Romer (1986, 1990), which is an essential foundation for Kremer. And Romer is very much focused on the modern era, so it’s not clear to me that “For the purposes of testing Kremer’s theory, the pre-industrial (or perhaps even pre-1500) data is nearly all that matters.” Kuznets (1957) wrote about the contribution of “geniuses”—more people, more geniuses, faster progress. Julian Simon built on that idea in books and articles.
A lot of the reason I’m skeptical of Kremer’s model is that it doesn’t seem to fit very well with the accounts of economic historians and their descriptions of growth dynamics....it seems suspicious that the model leaves out all of the other salient differences that typically draw economic historians’ attention. Are changes in institutions, culture, modes of production, and energetic constraints really all secondary enough to be slipped into the error term?
Actually, I believe the standard understanding of “technology” in economics includes institutions, culture, etc.—whatever affects how much output a society wrings from a given amount of inputs. So all of those are by default in Kremer’s symbol for technology, A. And a lot of those things plausibly could improve faster, in the narrow sense of increasing productivity, if there are more people, if more people also means more societies (accidentally) experimenting with different arrangements and then setting examples for others; or if such institutional innovations are prodded along by innovations in technology in the narrower sense, such as the printing press.
Actually, I believe the standard understanding of “technology” in economics includes institutions, culture, etc.--whatever affects how much output a society wrings from a given input. So all of those are by default in Kremer’s symbol for technology, A. And a lot of those things plausibly could improve faster, in the narrow sense of increasing productivity, if there are more people, if more people also means more societies (accidentally) experimenting with different arrangements and then setting examples for others; or if such institutional innovations are prodded along by innovations in technology in the narrower sense, such as the printing press.
Just on this point:
For the general Kremer model, where the idea production function is dA/dt = a(P^b)(A^c), higher levels of technology do support faster technological progress if c > 0. So you’re right to note that, for Kremer’s chosen parameter values, the higher level of technology in the present day is part of the story for why growth is faster today.
Although it’s not an essential part of the story: If c = 0, then the growth is still hyperbolic, with the growth rate being proportional to P^(2/3) during the Malthusian period. I suppose I’m also skeptical that at least institutional and cultural change are well-modeled as resulting from the accumulation of new ideas: beneath the randomness, the forces shaping them typically strike me as much more structural.
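For readers who want to see where an exponent of this kind can come from, here is a sketch of the algebra. It rests on assumptions that are not spelled out in this thread: a production function of the form Y = A P^α T^(1-α) with fixed land T, and a Malthusian regime in which income per person is pinned at a constant level ȳ. The symbols α, T and ȳ are introduced here for illustration.

```latex
\begin{align}
Y = A P^{\alpha} T^{1-\alpha}, \quad \frac{Y}{P} = \bar{y}
  \;&\Rightarrow\; P = T \left(\frac{A}{\bar{y}}\right)^{1/(1-\alpha)} \\
\frac{\dot{P}}{P} = \frac{1}{1-\alpha}\,\frac{\dot{A}}{A}
  \;&=\; \frac{a}{1-\alpha}\, P^{b} A^{c-1}
  \qquad \text{using } \dot{A} = a P^{b} A^{c} \\
A = \bar{y} \left(\frac{P}{T}\right)^{1-\alpha}
  \;&\Rightarrow\; \frac{\dot{P}}{P} \;\propto\; P^{\,b - (1-c)(1-\alpha)}
\end{align}
```

Under these assumptions the population growth rate rises with P whenever b > (1-c)(1-α). In the special case b = 1 and c = 0 the exponent is simply α, so an exponent of 2/3 would correspond to α = 2/3. That mapping is an inference from this sketch rather than something stated above.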