I find this comment relatable, and agree with the sentiment.
I also think I’ve come across the sort of idea in this post somewhere in EA before. But I can’t recall where, so for one thing this post gives me something to link people to. Plus this particular presentation feels somewhat unfamiliar, so I think I’ve learned something new from it. And I definitely found it an interesting refresher in any case.
In general, I think reading up on what’s been written before is good (see also this), but that there’s no harm in multiple posts with similar messages, and often there are various benefits. And EA thoughts are so spread out that it’s often hard to thoroughly check what’s come before.
So I think it’s probably generally a good norm for people to be happy to post things without worrying too much about the possibility they’re rehashing existing ideas, at least when the posts are quick to write.
 Also, I sort of like the idea of often having posts that serve well as places to link people to for one specific concept or idea, with the concept or idea not being buried in a lot of other stuff. Sort of like posts that just pull out one piece of a larger set of ideas, so that that piece can be easily consumed and revisited by itself. I try to do this sort of thing sometimes. And I think this post could serve that function too—I’d guess that wherever I came across a similar idea, there was a lot of other stuff there too, so this post might serve better as a reference for this one idea than that source would’ve.
ETA: A partial counter-consideration is that some ideas may be best understood in a broader context, and may be misinterpreted without that context (I’m thinking along the lines of the fidelity model). So I think the “cover just one piece” approach won’t always be ideal, and is rarely easy to do well.
This also seems reminiscent of Bostrom’s Vulnerable World Hypothesis (published a year after this thread, so fair enough that it didn’t make an appearance here :D). The abstract:
Scientific and technological progress might change people’s capabilities or incentives in ways that would destabilize civilization. For example, advances in DIY biohacking tools might make it easy for anybody with basic training in biology to kill millions; novel military technologies could trigger arms races in which whoever strikes first has a decisive advantage; or some economically advantageous process may be invented that produces disastrous negative global externalities that are hard to regulate. This paper introduces the concept of a vulnerable world: roughly, one in which there is some level of technological development at which civilization almost certainly gets devastated by default, i.e. unless it has exited the ‘semi-anarchic default condition’. Several counterfactual historical and speculative future vulnerabilities are analyzed and arranged into a typology. A general ability to stabilize a vulnerable world would require greatly amplified capacities for preventive policing and global governance. The vulnerable world hypothesis thus offers a new perspective from which to evaluate the risk-benefit balance of developments towards ubiquitous surveillance or a unipolar world order.
The most relevant part is Bostrom’s “easy nukes” thought experiment.
tl;dr I think it’s “another million years”, or slightly longer, but I’m not sure.
In The Precipice, Toby Ord writes:
How much of this future might we live to see? The fossil record provides some useful guidance. Mammalian species typically survive for around one million years before they go extinct; our close relative, Homo erectus, survived for almost two million. If we think of one million years in terms of a single, eighty-year life, then today humanity would be in its adolescence—sixteen years old, just coming into our power; just old enough to get ourselves into serious trouble.
(There are various extra details and caveats about these estimates in the footnotes.)
Ord also makes similar statements on the FLI Podcast, including the following:
If you think about the expected lifespan of humanity, a typical species lives for about a million years [I think Ord meant “mammalian species”]. Humanity is about 200,000 years old. We have something like 800,000 or a million or more years ahead of us if we play our cards right and we don’t lead to our own destruction. The analogy would be 20% of the way through our life[...]
I think this is a strong analogy from a poetic perspective. And I think that highlighting the typical species’ lifespan is a good starting point for thinking about how long we might have left. (Although of course we could also draw on many other facts for that analysis, as Ord discusses in the book.)
But I also think that there’s a way in which the lifespan analogy might be a bit misleading. If a human is 70, we expect they have less time left to live than if a human is 20. But I’m not sure whether, if a species is 700,000 years old, we should expect that species to go extinct sooner than a species that is 200,000 years old will.
My guess would be that a ~1 million year lifespan for a typical mammalian species would translate into a roughly 1 in a million chance of extinction each year, which doesn’t rise or fall very much in a predictable way over most of the species’ lifespan. Specific events, like changes in climate or another species arriving/evolving, could easily change the annual extinction rate. But I’m not aware of anything analogous to the way ageing increases the annual risk of humans dying from various causes.
I would imagine that, even if a species has been around for almost or more than a million years, we should still perhaps expect a roughly 1 in a million chance of extinction each year. Or perhaps we should even expect it to have a somewhat lower annual chance of extinction, and thus a higher expected lifespan going forwards, based on how long it’s survived so far?
(But I’m also not an expert on the relevant fields—not even certain what they would be—and I didn’t do extra research to inform this shortform comment.)
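To make that guess concrete, here’s a minimal sketch, assuming (purely for illustration, and not something Ord claims) a constant annual extinction hazard of 1 in a million:

```latex
% Toy model (my assumption): each year, a species goes extinct with constant
% probability p = 10^{-6}, independent of its age. Then the time to extinction T
% is (approximately) exponentially distributed, with expected total lifespan
\mathbb{E}[T] = \frac{1}{p} = 10^{6} \text{ years}
% and, because the exponential distribution is memoryless,
\Pr(T > a + t \mid T > a) = \Pr(T > t)
% i.e. a 700,000-year-old species and a 200,000-year-old species would have the
% same expected remaining lifespan. Age alone would tell us nothing about how
% much time is left; only a hazard rate that rises with age (as for individual
% humans) would make "older" imply "closer to the end".
```

(And, as suggested above, updating on a long track record of survival could even push our estimate of the hazard rate down rather than up.)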
I don’t think that Ord actually intends to imply that species’ “lifespans” work like humans’ lifespans do. But the analogy does seem to imply it. And in the FLI interview, he does seem to briefly imply that, though of course there he was speaking off the cuff.
I’m also not sure how important this point is, given that humans are very atypical anyway. But I thought it was worth noting in a shortform comment, especially as I expect that, in the wake of The Precipice being great, statements along these lines may be quoted regularly over the coming months.
Ah, ok. The edits clear everything up for me, except whether the “even” is meant to be highlighting that this is even shorter than the timelines given in the above paragraph. (Not sure that matters much, though.)
Oh, also, on the more general question of what to actually do, given a particular belief about AGI timelines (or other existential risk timelines), this technical report by Owen Cotton-Barratt is interesting. One quote:
There are two major factors which seem to push towards preferring more work which focuses on scenarios where AI comes soon. The first is nearsightedness: we simply have a better idea of what will be useful in these scenarios. The second is diminishing marginal returns: the expected effect of an extra year of work on a problem tends to decline when it is being added to a larger total. And because there is a much larger time horizon in which to solve it (and in a wealthier world), the problem of AI safety when AI comes later may receive many times as much work as the problem of AI safety for AI that comes soon. On the other hand one more factor preferring work on scenarios where AI comes later is the ability to pursue more leveraged strategies which eschew object-level work today in favour of generating (hopefully) more object-level work later.
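To illustrate the diminishing-marginal-returns point with a toy model (the functional form and numbers here are my own assumptions, not the report’s):

```latex
% Toy model (my assumptions): suppose the value of AI safety work is logarithmic
% in the total amount of work N it receives, V(N) = \log N, so marginal value is
V'(N) = \frac{1}{N}
% If scenarios where AI comes later would receive roughly ten times as much
% total work, N_{\text{late}} \approx 10 \, N_{\text{soon}}, then an extra year
% of work aimed at those scenarios is worth only about a tenth as much at the margin:
\frac{V'(N_{\text{late}})}{V'(N_{\text{soon}})} = \frac{N_{\text{soon}}}{N_{\text{late}}} \approx \frac{1}{10}
% (This ignores the probabilities of the scenarios, the nearsightedness point,
% and the leverage consideration mentioned at the end of the quote.)
```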
Other answers have made what I think of as the key points. I’ll try to add by pointing in the direction of some resources I’ve found on this matter which weren’t mentioned already by others. Note that:
Some of these sources suggest AGI is on the horizon, some suggest it isn’t, and some just discuss the matter
The question of AGI timelines (things like “time until AGI”) is related to, but distinct from, the question of “discontinuity”/”takeoff speed”/”foom” (I mention the last of those terms only for historical reasons; I think it’s unnecessarily unprofessional). Both questions are relevant when determining strategies for handling AI risk. It would probably be good if the distinction was more often made explicit. The sources I’ll mention may sometimes be more about discontinuity-type-questions than about AGI timelines.
With those caveats in mind, here are some sources:
My current framework for thinking about AGI timelines (and the subsequent posts in the series) - zhukeepa, 2020
Double Cruxing the AI Foom debate—agilecaveman, 2018
Quick Nate/Eliezer comments on discontinuity − 2018
Arguments about fast takeoff—Paul Christiano, 2018
Likelihood of discontinuous progress around the development of AGI—AI Impacts, 2018
There’s No Fire Alarm for Artificial General Intelligence—Eliezer Yudkowsky, 2017 (I haven’t yet read this one)
The Hanson-Yudkowsky AI-Foom Debate—various works from 2008-2013 (I haven’t yet read most of this)
I’ve also made a collection of (so far) around 30 “works that highlight disagreements, cruxes, debates, assumptions, etc. about the importance of AI safety/alignment, about which risks are most likely, about which strategies to prioritise, etc.” Most aren’t primarily focused on timelines, but many relate to that matter.
People working specifically on AGI (eg, people at OpenAI, DeepMind) seem especially bullish about transformative AI, even relative to experts not working on AGI. Note that this is not uncontroversial, see eg, criticisms from Jessica Taylor, among others. Note also that there’s a strong selection effect for the people who’re the most bullish on AGI to work on it.
I have several uncertainties about what you meant by this:
Do you include in “People working specifically on AGI” people working on AI safety, or just capabilities?
“bullish” in the sense of “thinking transformative AI (TAI) is coming soon”, or in the sense of “thinking TAI will be great”, or in the sense of “thinking TAI will happen in some discontinuous way”, or something else?
What do you mean by “experts not working on AGI”?
Why say “even”? Wouldn’t the selection effect you mention mean we’d expect experts not working on AGI to be less bullish? (Though other factors could push in the opposite direction, such as people who are working on a problem realising just how massive and complicated it is.)
Also, for posterity, there’s some interesting discussion of that interview with Rohin here.
And some other takes on “Why AI risk might be solved without additional intervention from longtermists” are summarised, and then discussed in the comments, here.
But very much in line with technicalities’ comment, it’s of course totally possible to believe that AI risk will probably be solved without additional intervention from longtermists, and yet still think that serious effort should go into raising that probability further.
Great quote from The Precipice on that general idea, in the context of nuclear weapons:
In 1939, Enrico Fermi told Szilard the chain reaction was but a ‘remote possibility’ [...]
Fermi was asked to clarify the ‘remote possibility’ and ventured ‘ten percent’. Isidor Rabi, who was also present, replied, ‘Ten percent is not a remote possibility if it means that we may die of it. If I have pneumonia and the doctor tells me that there is a remote possibility that I might die, and it’s ten percent, I get excited about it’
Btw, there’s a section on “Comparing and combining risks” in Chapter 6 of The Precipice (pages 171-173 in my version), which is very relevant to this discussion. Appendix D expands on that further. I’d recommend interested people check that out.
I’m planning to do it via SoGive in the next few days. I’ll report back once I see how much it raised and if there were any hints of people coming to understand more about effective giving/major risks (seems very doubtful, but worth a shot!).
I got it on Google Play: https://play.google.com/store/books/details/Toby_Ord_The_Precipice?id=W7rEDwAAQBAJ
The Precipice—Toby Ord (Chapter 5 has a full section on Dystopian Scenarios)
The Totalitarian Threat—Bryan Caplan (a link to a Word doc version can be found on this page) (some related discussion on the 80k podcast here; use the “find” function)
The Centre for the Governance of AI’s research agenda—Allan Dafoe (this contains discussion of “robust totalitarianism”, and related matters)
A shift in arguments for AI risk—Tom Sittler (this has a brief but valuable section on robust totalitarianism) (discussion of the overall piece here)
Existential Risk Prevention as Global Priority—Nick Bostrom (this discusses the concepts of “permanent stagnation” and “flawed realisation”, and very briefly touches on their relevance to e.g. lasting totalitarianism)
I intend to add to this list over time. If you know of other relevant work, please mention it in a comment.
These seem to often be examples of hedge drift, and their potential consequences seem like examples of memetic downside risks.
Questions: Is a change in the offence-defence balance part of why interstate (and intrastate?) conflict appears to have become less common? Does this have implications for the likelihood and trajectories of conflict in future (and perhaps by extension x-risks)?
Epistemic status: This post is unpolished, un-researched, and quickly written. I haven’t looked into whether existing work has already explored questions like these; if you know of any such work, please comment to point me to it.
Background/elaboration: Pinker argues in The Better Angels of Our Nature that many types of violence have declined considerably over history. I’m pretty sure he notes that these trends are neither obviously ephemeral nor inevitable. But the book, and other research pointing in similar directions, seems to me (and I believe to others?) to at least weakly support the ideas that:
if we avoid an existential catastrophe, things will generally continue to get better
apart from the potential destabilising effects of technology, conflict seems to be trending downwards, somewhat reducing the risks of e.g. great power war, and by extension e.g. malicious use of AI (though of course a partial reduction in risks wouldn’t necessarily mean we should ignore the risks)
But How Does the Offense-Defense Balance Scale? (by Garfinkel and Dafoe, of the Center for the Governance of AI; summary here) says:
It is well-understood that technological progress can impact offense-defense balances. In fact, perhaps the primary motivation for developing the concept has been to understand the distinctions between different eras of military technology.
For instance, European powers’ failure to predict the grueling attrition warfare that would characterize much of the First World War is often attributed to their failure to recognize that new technologies, such as machine guns and barbed wire, had shifted the European offense-defense balance for conquest significantly toward defense.
holding force sizes fixed, the conventional wisdom holds that a conflict with mid-nineteenth century technology could be expected to produce a better outcome for the attacker than a conflict with early twentieth century technology. See, for instance, Van Evera, ‘Offense, Defense, and the Causes of War’.
The paper tries to use these sorts of ideas to explore how emerging technologies will affect trajectories, likelihood, etc. of conflict. E.g., the very first sentence is: “The offense-defense balance is a central concept for understanding the international security implications of new technologies.”
But it occurs to me that one could also do historical analysis of just how much these offence-defence effects have contributed to the sort of trends Pinker notes. From memory, I don’t think Pinker discusses this possible factor. If it played a major role, then perhaps those trends depend substantially on something “we” haven’t been thinking about as much: we’ve wondered whether the factors Pinker discusses will continue, but those factors may be less necessary and less sufficient than we thought for the overall trend (the decline in violence/interstate conflict) that we really care about.
And at a guess, that might mean the trend is more fragile or “conditional” than we might’ve thought. It might mean that we really, really can’t rely on that “background trend” continuing, or on it at least somewhat offsetting the potentially destabilising effects of new tech. Perhaps a lot of the trend, or the last century or two of it, was largely about how tech changed things, so if the way tech changes things changes, the trend could very easily reverse entirely.
I’m not at all sure about any of that, but it seems it would be important and interesting to explore. Hopefully someone already has, in which case I’d appreciate someone pointing me to that exploration.
(Also note that what the implications of a given offence-defence balance even are is apparently a somewhat complicated/debatable matter. E.g., Garfinkel and Dafoe write: “While some hold that shifts toward offense-dominance obviously favor conflict and arms racing, this position has been challenged on a number of grounds. It has even been suggested that shifts toward offense-dominance can increase stability in a number of cases.”)
I just finished reading the full post this links to. Interesting work, thanks for posting it.
I’m not sure if you’re still pursuing this sort of question or plan to return to it later, but if you are, or if other readers are, a book I imagine would be quite relevant and interesting is Tetlock and Belkin’s Counterfactual Thought Experiments in World Politics. The Amazon description reads:
Political scientists often ask themselves what might have been if history had unfolded differently: if Stalin had been ousted as General Party Secretary or if the United States had not dropped the bomb on Japan. Although scholars sometimes scoff at applying hypothetical reasoning to world politics, the contributors to this volume—including James Fearon, Richard Lebow, Margaret Levi, Bruce Russett, and Barry Weingast—find such counterfactual conjectures not only useful, but necessary for drawing causal inferences from historical data. Given the importance of counterfactuals, it is perhaps surprising that we lack standards for evaluating them. To fill this gap, Philip Tetlock and Aaron Belkin propose a set of criteria for distinguishing plausible from implausible counterfactual conjectures across a wide range of applications. The contributors to this volume make use of these and other criteria to evaluate counterfactuals that emerge in diverse methodological contexts including comparative case studies, game theory, and statistical analysis. Taken together, these essays go a long way toward establishing a more nuanced and rigorous framework for assessing counterfactual arguments about world politics in particular and about the social sciences more broadly.
Unfortunately I haven’t read this book, and doubt I’ll get to it anytime soon, partly because I don’t think there’s an audiobook version. But it sounds like it’d be quite useful for the topic of how tractable changing the course of history is, so I’d love it if someone were to read the book and summarise/apply its most relevant lessons.
Great! I’ve sent Sanjay an email—my thanks to both of you.
Here’s another example of a prior statement of something like the idea I’m proposing should be investigated. This is from Carrick Flynn talking about AI policy and strategy careers:
If you are in this group whose talents and expertise are outside of these narrow areas, and want to contribute to AI strategy, I recommend you build up your capacity and try to put yourself in an influential position. This will set you up well to guide high-value policy interventions as clearer policy directions emerge. [...]
Depending on how slow these “entangled” research questions are to unjam, and on the timelines of AI development, there might be a very narrow window of time in which it will be necessary to have a massive, sophisticated mobilization of altruistic talent. This makes being prepared to mobilize effectively and take impactful action on short notice extremely valuable in expectation. (emphasis in original)
And Richard Ngo discusses similar ideas, again in relation to AI policy.
Posts investigating/discussing any of the questions listed here. These are questions which would be “valuable for someone to research, or at least theorise about, that the current pandemic in some way ‘opens up’ or will provide new evidence about, and that could inform EAs’ future efforts and priorities”.
If anyone has thought of such questions, please add them as answers to that post.
An example of such a question which I added: “What lessons can be drawn from [events related to COVID-19] for how much to trust governments, mainstream experts, news sources, EAs, rationalists, mathematical modelling by people without domain-specific expertise, etc.? What lessons can be drawn for debates about inside vs outside views, epistemic modesty, etc.?”
Some people have suggested that one way to have a major, long-term influence on the world is for an intellectual movement to develop a body of ideas and have adherents to those ideas in respected positions (e.g., university professorships, high-level civil service or political staffer roles), with these ideas likely lying dormant for a while, but then potentially being taken up when there are major societal disruptions of some sort. I’ve heard these described as making sure there are good ideas “lying around” when an unexpected crisis occurs.
As an example, Kerry Vaughan describes how stagflation “helped to set the stage for alternatives to Keynesian theories to take center stage.” He also quotes Milton Friedman as saying: “the role of thinkers, I believe, is primarily to keep options open, to have available alternatives, so when the brute force of events make a change inevitable, there is an alternative available to change it.”
What evidence did COVID-19, reactions to it, and reactions that seem likely to occur in future, provide for or against these ideas? For example:
Was there a major appetite in governments for lasting changes that EA-aligned (or just very sensible and forward-thinking) civil servants were able to seize upon?
Were orgs like FHI, CSER, and GCRI, or other aligned academics, called upon by governments, media, etc., in a way that (a) seemed to depend on them having spent years developing rigorous versions of ideas about GCRs, x-risks, etc., and (b) seems likely to shift narratives, decisions, etc. in a lasting way?
And to more precisely inform future decisions, it’d be good to get some sense of:
How likely is it that similar benefits could’ve been seized by people “switching into” those pathways, roles, etc. during the crisis, without having built up the credibility, connections, research, etc. in advance?
If anyone did manage to influence substantial changes that seem likely to last, what precise factors, approaches, etc. seemed to help them do so?
Were there apparent instances where someone was almost able to influence such a change? If so, what seemed to block them? How could we position ourselves in future to avoid such blockages?