Note: This shortform is now superseded by a top-level post I adapted it into. There is no longer any reason to read the shortform version.
Here I list all the EA-relevant books I’ve read or listened to as audiobooks since learning about EA, in roughly descending order of how useful I perceive/remember them being to me.
I share this in case others might find it useful, as a supplement to other book recommendation lists. (I found Rob Wiblin, Nick Beckstead, and Luke Muehlhauser’s lists very useful.) That said, this isn’t exactly a recommendation list, because:
some of the factors making these books more/less useful to me won’t generalise to most other people
I’m including all relevant books I’ve read (not just the top picks)
Let me know if you want more info on why I found something useful or not so useful.
(See also this list of EA-related podcasts and this list of sources of EA-related videos.)
The Precipice, by Ord, 2020
See here for a list of things I’ve written that summarise, comment on, or take inspiration from parts of The Precipice.
I recommend reading the ebook or physical book rather than audiobook, because the footnotes contain a lot of good content and aren’t included in the audiobook
The book Superintelligence may have influenced me more, but that’s just because I read it very soon after getting into EA, whereas I read The Precipice after already learning a lot. I’d now recommend The Precipice first.
Superforecasting, by Tetlock & Gardner, 2015
How to Measure Anything, by Hubbard, 2011
Rationality: From AI to Zombies, by Yudkowsky, 2006-2009
I.e., “the sequences”
Superintelligence, by Bostrom, 2014
Maybe this would’ve been a little further down the list if I’d already read The Precipice
Expert Political Judgment, by Tetlock, 2005
I read this after having already read Superforecasting, yet still found it very useful
Normative Uncertainty, by MacAskill, 2014
This is actually a thesis, rather than a book
I assume it’s now a better idea to read MacAskill, Bykvist, and Ord’s book on the same subject, which is available as a free PDF
Though I haven’t read the book version myself
Secret of Our Success, by Henrich, 2015
See also this interesting Slate Star Codex review
The WEIRDest People in the World: How the West Became Psychologically Peculiar and Particularly Prosperous, by Henrich, 2020
See also the Wikipedia page on the book, this review on LessWrong, and my notes on the book.
I rank Secret of Our Success as more useful to me, but that may be partly because I read it first; if I only read either this book or Secret of Our Success, I’m not sure which I’d find more useful.
The Strategy of Conflict, by Schelling, 1960
See here for my notes on this book, and here for some more thoughts on this and other nuclear-risk-related books.
This and other nuclear-war-related books are more useful for me than they would be for most people, since I’m currently doing research related to nuclear war
This is available as an audiobook, but a few Audible reviewers suggest using the physical book due to the book’s use of equations and graphs. So I downloaded this free PDF into my iPad’s Kindle app.
Human-Compatible, by Russell, 2019
The Book of Why, by Pearl, 2018
I found an online PDF rather than listening to the audiobook version, as the book makes substantial use of diagrams
Blueprint, by Plomin, 2018
This is useful primarily in relation to some specific research I was doing, rather than more generically.
Moral Tribes, by Greene, 2013
Algorithms to Live By, by Christian & Griffiths, 2016
The Better Angels of Our Nature, by Pinker, 2011
See here for some thoughts on this and other nuclear-risk-related books.
Command and Control, by Schlosser, 2013
The Doomsday Machine, by Ellsberg, 2017
The Bomb: Presidents, Generals, and the Secret History of Nuclear War, by Kaplan, 2020
The Alignment Problem, by Christian, 2020
This might be better than Superintelligence and Human-Compatible as an introduction to the topic of AI risk. It also seemed to me to be a surprisingly good introduction to the history of AI, how AI works, etc.
But I’m not sure this’ll be very useful for people who’ve already read/listened to a decent amount (e.g., the equivalent of 4 books) about those topics.
That’s why it’s ranked as low as it is for me.
But maybe I’m underestimating how useful it’d be to many other people in a similar position.
Evidence for that is that someone told me that an AI safety researcher friend of theirs found the book helpful.
The Sense of Style, by Pinker, 2014
One thing to note is that I think a lot of chapter 6 (which accounts for roughly a third of the book) can be summed up as “Don’t worry too much about a bunch of alleged ‘rules’ about grammar, word choice, etc. that prescriptivist purists sometimes criticise people for breaking.”
And I already wasn’t worried about most of those alleged rules, and hadn’t even heard of some of them.
And I think one could get the basic point without seeing all the examples and discussion.
So a busy reader might want to skip or skim most of that chapter.
Though I think many people would benefit from the part on commas.
I read an ebook rather than listening to the audiobook, because I thought that might be a better way to absorb the lessons about writing style
The Dead Hand, by Hoffman, 2009
Thinking, Fast and Slow, by Kahneman, 2011
This might be the most useful of all these books for people who have little prior familiarity with the ideas, but I happened to already know a decent portion of what was covered.
Against the Grain, by Scott, 2017
I read this after Sapiens and thought the content would overlap a lot, but in the end I actually thought it provided a lot of independent value.
Sapiens, by Harari, 2015
Destined for War, by Allison, 2017
The Dictator’s Handbook, by de Mesquita & Smith, 2012
Age of Ambition, by Osnos, 2014
Moral Mazes, by Jackall, 1989
The Myth of the Rational Voter, by Caplan, 2007
The Hungry Brain, by Guyenet, 2017
If I recall correctly, I found this surprisingly useful for purposes unrelated to the topics of weight, hunger, etc.
E.g., it gave me a better understanding of the liking-wanting distinction
See also this Slate Star Codex review (which I can’t remember whether I read)
The Quest: Energy, Security, and the Remaking of the Modern World, by Yergin, 2011
Harry Potter and the Methods of Rationality, by Yudkowsky, 2010-2015
I found this both surprisingly useful and very surprisingly enjoyable
To be honest, I was somewhat amused and embarrassed to find what is ultimately Harry Potter fan fiction as enjoyable and thought-provoking as I found this
This overlaps in many ways with Rationality: From AI to Zombies, so it would be more valuable to someone who hadn’t already read those sequences
But I’d recommend such a person read those sequences before reading this; I think they’re more useful (though less enjoyable)
Within the 2 hours before I go to sleep, I try not to stimulate my brain too much—e.g., I try to avoid listening to most nonfiction audiobooks during that time. But I found that I could listen to this during that time without it keeping my brain too active. This is a perk, as that period of my day is less crowded with other things to do.
Same goes for the books Steve Jobs, Power Broker, Animal Farm, and Consider the Lobster.
Steve Jobs, by Walter Isaacson, 2011
Surprisingly useful, considering that I don’t plan to emulate Jobs’ life at all and that I don’t work in a relevant industry
Enlightenment Now, by Pinker, 2018
The Undercover Economist Strikes Back, by Harford, 2014
Against Empathy, by Bloom, 2016
Inadequate Equilibria, by Yudkowsky, 2017
Radical Markets, by Posner & Weyl, 2018
How to Be a Dictator: The Cult of Personality in the Twentieth Century, by Dikötter, 2019
On Tyranny: Twenty Lessons from the Twentieth Century, by Snyder, 2017
It seemed to me that most of what Snyder said was either stuff I already knew, stuff that seemed kind-of obvious or platitude-like, or stuff I was skeptical of
This might be partly due to the book being under 2 hours, and thus giving just a quick overview of the “basics” of certain things
So I do think it might be fairly useful per minute for someone who knew quite little about things like Hitler and the Soviet Union
Climate Matters: Ethics in a Warming World, by John Broome, 2012
The Power Broker, by Caro, 1975
Very interesting and engaging, but also very long and probably not super useful.
Science in the Twentieth Century: A Social-Intellectual Survey, by Goldman, 2004
This is actually a series of audio recordings of lectures, rather than a book
Animal Farm, by Orwell, 1945
Brave New World, by Huxley, 1932
Consider the Lobster, by Wallace, 2005
To be honest, I’m not sure why Wiblin recommended this. But I benefitted from many of Wiblin’s other recommendations. And I did find this book somewhat interesting.
Honorable mention: 1984, by Orwell, 1949. I haven’t included that in the above list because I read it before I learned about EA. But I think the book, despite being a novel, is actually the most detailed exploration I’ve seen of how a stable, global totalitarian system could arise and sustain itself. (I think this is a sign that there needs to be more actual research on that topic—a novel published more than 70 years ago shouldn’t be one of the best sources on an important topic!)
(Hat tip to Aaron Gertler for sort-of prompting me to post this list.)
I recommend making this a top-level post. I think it should be one of the most-upvoted posts on the “EA Books” tag, but I can’t tag it as a Shortform post.
I had actually been thinking I should probably do that sometime, so your message inspired me to pull the trigger and do it now. Thanks!
(I also made a few small improvements/additions while I was at it.)
Civilization Re-Emerging After a Catastrophe—Karim Jebari, 2019 (see also my commentary on that talk)
Civilizational Collapse: Scenarios, Prevention, Responses—Denkenberger & Ladish, 2019
Update on civilizational collapse research—Ladish, 2020 (personally, I found Ladish’s talk more useful; see the above link)
Modelling the odds of recovery from civilizational collapse—Michael Aird (i.e., me), 2020
The long-term significance of reducing global catastrophic risks—Nick Beckstead, 2015 (Beckstead never actually writes “collapse”, but has very relevant discussion of probability of “recovery” and trajectory changes following non-extinction catastrophes)
How much could refuges help us recover from a global catastrophe? - Nick Beckstead, 2015 (he also wrote a related EA Forum post)
Various EA Forum posts by Dave Denkenberger (see also ALLFED’s site)
Aftermath of Global Catastrophe—GCRI, no date (this page has links to other relevant articles)
A (Very) Short History of the Collapse of Civilizations, and Why it Matters—David Manheim, 2020
A grant application from Ladish, and Oliver Habryka’s thoughts on it − 2019
Civilisational collapse has a bright past – but a dark future—Luke Kemp, 2019
Are we on the road to civilisation collapse? - Luke Kemp, 2019
Civilization: Institutions, Knowledge and the Future—Samo Burja, 2018
Secret of Our Success—Henrich, 2015 (not about collapse, but it has many relevant insights, in my opinion) (see also the Slate Star Codex review)
Is there a subfield of economics devoted to “fragility vs resilience”? (and the answers there) - steve6320 and various commenters, 2020
I also have some as-yet unpublished work on collapse & recovery that I’m happy to share upon request.
Things about existential risk or GCRs more broadly, but with relevant parts
Toby Ord on the precipice and humanity’s potential futures − 2020 (the first directly relevant part is in the section on nuclear war)
The Precipice—Ord, 2020
Long-Term Trajectories of Human Civilization—Baum et al., 2019 (the authors never actually write “collapse”, but their section 4 is very relevant to the topic)
Towards Comprehensive Existential Risk Assessment: A Bayesian Network Model And Proposal For Assessment—Rozendal, 2019, working paper
Defence in Depth Against Human Extinction: Prevention, Response, Resilience, and Why They All Matter—Cotton-Barratt, Daniel, Sandberg, 2020
Existential Risk Strategy Conversation with Holden Karnofsky, Eliezer Yudkowsky, and Luke Muehlhauser − 2014
Causal diagrams of the paths to existential catastrophe—Michael Aird, 2020
Stuart Armstrong interview − 2014 (the relevant section is 7:45-14:30)
Existential Risk Prevention as Global Priority—Bostrom, 2012
The Future of Humanity—Bostrom, 2007 (covers similar points to the above paper)
How Would Catastrophic Risks Affect Prospects for Compromise? - Tomasik, 2013/2017
Crucial questions for longtermists—Michael Aird, 2020
Things that sound relevant, but which I haven’t read/watched/listened to yet
Catastrophe, Social Collapse, and Human Extinction—Robin Hanson, 2007
The Fragile World Hypothesis: Complexity, Fragility, and Systemic Existential Risk—David Manheim,
Existential Risks: Exploring a Robust Risk Reduction Strategy—Karim Jebari, 2015
Islands as refuges for surviving global catastrophes—Turchin & Green, 2018
Videos and slides from a Princeton Workshop on Historical Systemic Collapse − 2019
Feeding Everyone No Matter What - Denkenberger & Pearce, 2014
Why and how civilisations collapse—Kemp [CSER]
The Knowledge: How to Rebuild Our World from Scratch (https://en.wikipedia.org/wiki/The_Knowledge:_How_to_Rebuild_Our_World_from_Scratch) - Dartnell [book] (there’s also this TEDx Talk by the author, but I didn’t find that very useful from a civilizational collapse perspective)
The Collapse of Complex Societies—Joseph Tainter, 1988
1177 B.C.: The Year Civilization Collapsed—Eric Cline, 2014
On Collapse Risk (C-Risk) - Pawntoe4, 2020
I intend to add to this list over time. If you know of other relevant work, please mention it in a comment.
Guns, Germs, and Steel—I felt this provided a good perspective on the ultimate factors leading up to agriculture and industry.
Great, thanks for adding that to the collection!
Suggested by a member of the History and Effective Altruism Facebook group:
Disputers of the Tao, by A. C. Graham
See also the book recommendations here.
Book Review: Why We’re Polarized—Astral Codex Ten, 2021
EA considerations regarding increasing political polarization—Alfred Dreyfus, 2020
Adapting the ITN framework for political interventions & analysis of political polarisation—OlafvdVeen, 2020
Thoughts on electoral reform—Tobias Baumann, 2020
Risk factors for s-risks—Tobias Baumann, 2019
Other EA Forum posts tagged Political Polarization
(Perhaps some older Slate Star Codex posts? I can’t remember for sure.)
I intend to add to this list over time. If you know of other relevant work, please mention it in a comment.
Also, I’m aware that there has been a vast amount of non-EA analysis of this topic. The reasons I’m collecting only analyses by EAs/EA-adjacent people here are that:
their precise focuses or methodologies may be more relevant to other EAs than would be the case with non-EA analyses
links to non-EA work can be found in most of the things I list here
I’d guess that many collections of non-EA analyses of these topics already exist (e.g., in reference lists)
I’ve written some posts on related themes.
Great, thanks for adding these to the collection!
To provide us with more empirical data on value drift, would it be worthwhile for someone to work out how many EA Forum users each year have stopped being users the next year? E.g., how many users in 2015 haven’t used it since?
Would there be an easy way to do that? Could CEA do it easily? Has anyone already done it?
One obvious issue is that it’s not necessary to read the EA Forum in order to be “part of the EA movement”. And this applies more strongly for reading the EA Forum while logged in, for commenting, and for posting, which are presumably the things there’d be data on.
But it still seems like this could provide useful evidence. And it seems like this evidence would have a different pattern of limitations to some other evidence we have (e.g., from the EA Survey), such that combining these lines of evidence could help us get a clearer picture of the things we really care about.
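As a rough illustration of what such a count could look like, here is a minimal sketch. It assumes one could get a list of (user, year) pairs for logged-in activity; that data format, the function name `yearly_dropoff`, and the toy data are all hypothetical, and I don’t know what data CEA actually holds or in what form. It counts, for each year, how many active users were never seen again in any later year (i.e., the “haven’t used it since” version of the question).

```python
from collections import defaultdict


def yearly_dropoff(activity):
    """Count users who were active in a year and never active in a later year.

    activity: iterable of (user_id, year) pairs (a hypothetical export format).
    Returns {year: (active_users, users_never_seen_again, fraction_lost)}.
    """
    years_by_user = defaultdict(set)
    for user_id, year in activity:
        years_by_user[user_id].add(year)

    last_year_in_data = max(y for years in years_by_user.values() for y in years)

    active = defaultdict(int)
    lost = defaultdict(int)
    for years in years_by_user.values():
        for year in years:
            active[year] += 1
        final_year = max(years)
        # Users whose final year is the latest year in the data are not
        # counted as lost: they may simply not have come back yet.
        if final_year < last_year_in_data:
            lost[final_year] += 1

    return {
        year: (active[year], lost[year], lost[year] / active[year])
        for year in sorted(active)
        if year < last_year_in_data
    }


# Hypothetical toy data, not real Forum data.
toy_activity = [
    ("a", 2015), ("a", 2016), ("a", 2017),
    ("b", 2015),
    ("c", 2015), ("c", 2017),
    ("d", 2016),
]
result = yearly_dropoff(toy_activity)
```

In the toy data, user "c" skips 2016 but returns in 2017, so they are not counted as lost in 2015. A stricter "not active the very next year" measure would count them, and which definition is more informative about value drift is a judgement call. Either way, the limitations noted above (e.g., people reading while logged out) would still apply.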
Movement collapse scenarios—Rebecca Baron
Why do social movements fail: Two concrete examples. - NunoSempere
What the EA community can learn from the rise of the neoliberals—Kerry Vaughan
How valuable is movement growth? - Owen Cotton-Barratt (and I think this is sort-of a summary of that article)
Long-Term Influence and Movement Growth: Two Historical Case Studies—Aron Vallinder, 2018
Some of the Sentience Institute’s research, such as its “social movement case studies”* and the post How tractable is changing the course of history?
A Framework for Assessing the Potential of EA Development in Emerging Locations* - jahying
Hard-to-reverse decisions destroy option value—Schubert & Garfinkel, 2017
These aren’t quite “EA analyses”, but Slate Star Codex has several relevant book reviews and other posts, such as:
It appears Animal Charity Evaluators did relevant research, but I haven’t read it, they described it as having been “of variable quality”, and they’ve discontinued it.
In this comment, Pablo Stafforini refers to some relevant work that sounds like it’s non-public.
See also my collection of work on value drift, and my list of some history topics it might be very valuable to investigate.
*Asterisks indicate I haven’t read that source myself, and thus that the source might not actually be a good fit for this list.
Also, I’m aware that there are a lot of non-EA analyses of these topics. The reasons I’m collecting only EA analyses here are that:
I have a list here that has some overlap but also some new things: https://docs.google.com/document/d/1KyVgBuq_X95Hn6LrgCVj2DTiNHQXrPUJse-tlo8-CEM/edit#
That looks very helpful—thanks for sharing it here!
This is probably too broad but here’s Open Philanthropy’s list of case studies on the History of Philanthropy which includes ones they have commissioned, though most are not done by EAs with the exception of Some Case Studies in Early Field Growth by Luke Muehlhauser.
Edit: fixed links
Yeah, I think those are relevant, thanks for mentioning them!
It looks like the links lead back to your comment for some reason (I think I’ve done similar in the past). So, for other readers, here are the links I think you mean: 1, 2.
(Also, FWIW, I think if an analysis is by a non-EA but commissioned by an EA, I’d say that essentially counts as an “EA analysis” for my purposes. This is because I expect that such work’s “precise focuses or methodologies may be more relevant to other EAs than would be the case with [most] non-EA analyses”.)
I recently requested people take a survey on the quality/impact of things I’ve written. So far, 22 people have generously taken the survey. (Please add yourself to that tally!)
Here I’ll display summaries of the first 21 responses (I may update this later), and reflect on what I learned from this.
I had also made predictions about what the survey results would be, to give myself some sort of ramshackle baseline to compare results against. I was going to share these predictions, then felt no one would be interested; but let me know if you’d like me to add them in a comment.
For my thoughts on how worthwhile this was and whether other researchers/organisations should run similar surveys, see Should surveys about the quality/impact of research outputs be more common?
(Note that many of the things I’ve written were related to my work with Convergence Analysis, but my comments here reflect only my own opinions.)
Q5: “If you think anything I’ve written has affected your beliefs, please say what that thing was (either titles or roughly what the topic was), and/or say how it affected your beliefs.”
(I didn’t ask for permission to share people’s comments, so, for this and the other comment questions, I’ll just highlight some recurring themes or seemingly noteworthy specifics.)
9⁄21 respondents answered this
The writings people mentioned specifically were my collections and summaries of existing ideas/work (e.g., A central directory for open research questions), Database of existential risk estimates, Improving the future by influencing actors’ benevolence, intelligence, and power, and my comments on the Google doc of another person who wanted feedback.
Most responses seemed to indicate the shift in beliefs caused by my work was fairly small.
Q7: “If you think anything I’ve written has affected your decisions or plans, please say what that thing was (either titles or roughly what the topic was), and/or say how it affected your decisions or plans.”
5⁄21 respondents answered this
One respondent mentioned a way in which something I wrote contributed meaningfully to an output of theirs which I think is quite valuable
One respondent indicated Some history topics it might be very valuable to investigate influenced them somewhat
Another indicated Improving the future by influencing actors’ benevolence, intelligence, and power might inform an important decision
There was one other small influence
Q8, text box: “If you answered “Yes” to either of the above, could you say a bit about why?”
15⁄21 respondents filled in this text box
Some respondents indicated things “on their end” (e.g., busyness, attention span), or that they’d have said yes to one or both of those questions for most authors rather than just for me in particular
Some respondents mentioned topics just not seeming relevant to their interests
Some respondents mentioned my posts being long, being rambly, or failing to have a summary
Some respondents mentioned they were already well-versed in the areas I was writing about and didn’t feel my posts were necessary for them
Q9: “Do you have any other feedback on specific things I’ve written, my general writing style, my topic choices, or anything else?”
10⁄21 respondents answered this
Several non-specific positive comments/encouragements
Several positive or neutral comments on me having a lot of output
Several comments suggesting I should be more concise, use summaries more consistently, and/or be clearer about what the point of what I’m writing is
Some comments indicating appreciation of my summaries, collections, and efforts to make ideas accessible
Some comments on my writing style and clarity being good
Some comments that my original research wasn’t very impressive
One comment that I seem too hung up on defining things precisely/prescriptively
(I don’t actually endorse linguistic prescriptivism, and remember occasionally trying to make that explicit. But I’ll take this as useful data that I’ve sometimes accidentally given that impression, and try to adjust accordingly.)
Q10: “If you would like to share your name, please do so below. But this is 100% voluntary—you’re not at all obliged to do so :)”
6⁄21 respondents gave their name/username
2 gave their email in case I wanted to follow up
Some takeaways from all this
Responses were notably more positive than expected for some questions, and notably less positive for others
I don’t think this should notably change my bottom-line view of the overall quality and impact of my work to date
But it does make me a little less uncertain about that all-things-considered view, as I now have slightly more data that roughly supports it
In turn, this updates me towards being a little more confident that it makes sense for me to focus on pursuing an EA research career for now (rather than, e.g., switching to operations or civil service roles)
This is because I’m now slightly less worried that I’m being strongly influenced by overconfidence or motivated reasoning. (I already wanted to do research or writing before learning about EA.)
I should definitely more consistently include summaries, and/or in other ways signal early and clearly what the point of a post is
I was already aiming to move in this direction, and had predicted responses would often mention this, but this has still given me an extra push
I should look out for ways in which I might appear linguistically prescriptive or overly focused on definitions/precision
I should more seriously consider moving more towards concision, even at the cost of precision, clarity, or comprehensiveness
Though I’m still not totally sold on that
I’m also aware that this shortform comment is not a great first step!
I should consider moving more towards concision, even at the cost of quantity/speed of output
With extra time on a given post, I could perhaps find ways to be more concise without sacrificing other valuable things
I should feel less like I “have to” produce writings rapidly
This point is harder to explain briefly, so I’ll just scratch the surface here
I don’t actually expect this to substantially change my behaviours, as that feeling wasn’t the main reason for my large amount of output
But if my output slows for some other reason, I think I’ll now not feel (as) bad about that
People have found my summaries and collections very useful, and some people have found my original research not so useful/impressive
The “direction” of this effect is in line with my expectations, but the strength was surprising
I’ve updated towards more confidence that my summaries and (especially) my collections were valuable and worth making, and this may slightly increase the already-high chance that I’ll continue creating that sort of thing
But this is also slightly confusing, as my original research/ideas and/or aptitude for future original research seem to have stood me in good stead for various job and grant selection processes
And I don’t have indications that my summaries or collections helped there, though they may have
Much of my work to date may be less useful for more experienced/engaged EAs than less experienced/engaged EAs
This is in line with my sense that I was often trying to make ideas more accessible, make getting up to speed easier, etc.
There seemed to be a weak correlation between how recently something was posted and how often it was positively mentioned
This broadly aligns with trends from other data sources (e.g., researchers reaching out to me, upvotes)
This could suggest that:
my work is getting better
people are paying more attention to things written by me, regardless of their quality
people just remember the recent stuff more
I’d guess all three of those factors play some role
(I also have additional thoughts that are fuzzier or even less likely to be of interest to anyone other than me.)
There are of course myriad reasons to not read into this data too much, including that:
it’s from a sample of only 21 people
the sample was non-representative, and indeed self-selecting (so it may, for example, disproportionately represent people who like my work)
the responses may be biased towards not hurting my feelings
That said, I think I can still learn something from this data, especially given flaws in other data sources I have. (E.g., comments from people who choose to randomly and non-anonymously reach out to me may be even more positively biased.)
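To give a rough sense of how much noise a sample of 21 carries, here is a minimal sketch that computes a 95% Wilson score interval for a proportion, using the 9 of 21 respondents who answered Q5 as the example. The function name `wilson_interval` is just illustrative, and the choice of the Wilson interval is my assumption about a reasonable method rather than anything used in the survey itself. It also only captures sampling noise; it does nothing about the self-selection and politeness biases listed above.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Return the (low, high) Wilson score interval for a binomial proportion.

    z=1.96 corresponds to an approximately 95% interval.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# 9 of 21 respondents answered Q5.
low, high = wilson_interval(9, 21)
print(f"{low:.2f} to {high:.2f}")
```

For 9 out of 21, the interval spans roughly 0.24 to 0.63, which is wide enough that small differences between questions shouldn’t be read into much.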
If you’ve made it this far, you may also be interested in the above-mentioned Should surveys about the quality/impact of research outputs be more common?
“People have found my summaries and collections very useful, and some people have found my original research not so useful/impressive”
I haven’t read enough of your original research to know whether it applies in your case but just flagging that most original research has a much narrower target audience than the summaries/collections, so I’d expect fewer people to find it useful (and for a relatively broad summary to be biased against them).
That said, as you know, I think your summaries/collections are useful and underprovided.
Though I guess I suspect that, if the reason a person finds my original research not so useful is just because they aren’t the target audience, they’d be more likely to either not explicitly comment on it or to say something about it not seeming relevant to them. (Rather than making a generic comment about it not seeming useful.)
But I guess this seems less likely in cases where:
the person doesn’t realise that the key reason it wasn’t useful is that they weren’t the target audience, or
the person feels that what they’re focused on is substantially more important than anything else (because then they’ll perceive “useful to them” as meaning a very similar thing to “useful”)
In any case, I’m definitely just taking this survey as providing weak (though useful) evidence, and combining it with various other sources of evidence.
tl;dr: Toby Ord seems to imply that economic stagnation is clearly an existential risk factor. But I think that we should actually be more uncertain about that; I think it’s plausible that economic stagnation would actually decrease existential risk, at least given certain types of stagnation and certain starting conditions.
(This is basically a nitpick I wrote in May 2020, and then lightly edited recently.)
In The Precipice, Toby Ord discusses the concept of existential risk factors: factors which increase existential risk, whether or not they themselves could “directly” cause existential catastrophe. He writes:
An easy way to find existential risk factors is to consider stressors for humanity or for our ability to make good decisions. These include global economic stagnation… (emphasis added)
This seems to me to imply that global economic stagnation is clearly and almost certainly an existential risk factor.
He also discusses the inverse concept, existential security factors: factors which reduce existential risk. He writes:
Many of the things we commonly think of as social goods may turn out to also be existential security factors. Things such as education, peace or prosperity may help protect us. (emphasis added)
It does seem to me quite plausible—indeed, probably >50% likely—that global economic stagnation is an existential risk factor, and that prosperity is a security factor (or at least that they tend to be these things). And in the case of prosperity, Ord merely says that prosperity may help protect us, which seems an entirely fair statement. (In the case of global economic stagnation, he seems to be making a stronger claim.)
But it also seems like how economic growth affects existential risk is still a fairly open and important question. (This is related to the idea of differential progress.)
And it also seems plausible that increasing growth from unusually low levels could be protective, while increasing it further from already high levels could increase risk, or something like that.
In fact, Ord himself separately—not in the context of economic growth—provides an interesting discussion of “the question of variables that both increase and decrease existential risk over different parts of their domains (i.e. where existential risk is non-monotonic in that variable).” He says that, in certain cases, we will need to consider such variables not as simply risk or security factors, but “as a more complex kind of factor instead”.
Altogether, I think that, if I had been the person writing The Precipice:
The book would’ve been much less excellent
...But also, I would’ve tried to make it clearer that global economic stagnation is just plausibly or probably an existential risk factor, rather than definitely one.
I think I would’ve highlighted economic growth as a potential example of one of the “more complex kind[s] of factor[s]”, for which the relationship is non-monotonic.
(See also this paper, this summary of it, and posts tagged differential progress. Based on a skim, that paper seems to suggest that economic growth reduces total existential risk, but also that it might increase annual risk in the short-run. I think that that’d roughly support Ord’s statements. But given that that’s just one paper on a complex topic, I still think we shouldn’t be highly confident that economic growth is (always) an existential security factor.)
You can see a list of all the things I’ve written that summarise, comment on, or take inspiration from parts of The Precipice here.
Epistemic status: Unimportant hot take on a paper I’ve only skimmed.
Watson and Watson write:
Conditions capable of supporting multicellular life are predicted to continue for another billion years, but humans will inevitably become extinct within several million years. We explore the paradox of a habitable planet devoid of people, and consider how to prioritise our actions to maximise life after we are gone.
I react: Wait, inevitably? Wait, why don’t we just try to not go extinct? Wait, what about places other than Earth?
They go on to say:
Finally, we offer a personal challenge to everyone concerned about the Earth’s future: choose a lineage or a place that you care about and prioritise your actions to maximise the likelihood that it will outlive us. For us, the lineages we have dedicated our scientific and personal efforts towards are mistletoes (Santalales) and gulls and terns (Laridae), two widespread groups frequently regarded as pests that need to be controlled. The place we care most about is south-eastern Australia – a region where we raise a family, manage a property, restore habitats, and teach the next generations of conservation scientists. Playing favourites is just as much about maintaining wellbeing and connecting with the wider community via people with shared values as it is about maximising future biodiversity.
I react: Wait, seriously? Your recipe for wellbeing is declaring the only culture-creating life we know of (ourselves) irreversibly doomed, and focusing your efforts instead on ensuring that mistletoe survives the ravages of deep time?
Even if your focus is on maximising future biodiversity, I’d say it still makes sense to set your aim a little higher—try to keep us afloat to keep more biodiversity afloat. (And it seems very unclear to me why we’d value biodiversity intrinsically, rather than individual nonhuman animal wellbeing, even if we cared more about nature than humans, but that’s a separate story.)
This was a reminder to me of how wide the gulf can be between different people’s ways of looking at the world.
It also reminded me of this quote from Dave Denkenberger:
In 2011, I was reading this paper called Fungi and Sustainability, and the premise was that after the dinosaur killing asteroid, there would not have been sunlight and there were lots of dead trees and so mushrooms could grow really well. But its conclusion was that maybe when humans go extinct, the world will be ruled by mushrooms again. I thought, why don’t we just eat the mushrooms and not go extinct?
This is a lightly edited version of some quick thoughts I wrote in May 2020. These thoughts are just my reaction to some specific claims in The Precipice, intended in a spirit of updating incrementally. This is not a substantive post containing my full views on nuclear war or collapse & recovery.
In The Precipice, Ord writes:
[If a nuclear winter occurs,] Existential catastrophe via a global unrecoverable collapse of civilisation also seems unlikely, especially if we consider somewhere like New Zealand (or the south-east of Australia) which is unlikely to be directly targeted and will avoid the worst effects of nuclear winter by being coastal. It is hard to see why they wouldn’t make it through with most of their technology (and institutions) intact.
(See also the relevant section of Ord’s 80,000 Hours interview.)
I share the view that it’s unlikely that New Zealand would be directly targeted by nuclear war, or that nuclear winter would cause New Zealand to suffer extreme agricultural losses or lose its technology. (That said, I haven’t looked into that closely myself.) However, it seems to me relatively easy to see why New Zealand might suffer a collapse—whether immediately following the nuclear war or after months, years, or decades. For example, I think collapse in New Zealand could plausibly be caused by:
Some massive emotional, social, and political reactions within New Zealand to a global nuclear war and nuclear winter
Nuclear winter might kill billions and cause many countries to collapse, and it seems hard to predict how people elsewhere would react to that
Huge numbers of people (perhaps over a billion?) trying to get into New Zealand if agriculture and/or civilization in most other places collapses
Further military actions by panicking governments or starving populaces
Sudden collapse of global trade
But what particularly stood out to me in the above passage was Ord’s suggestion that it’s “hard to see” why New Zealand’s institutions wouldn’t remain intact. For the above reasons, I would see it as likely that there’d be major shifts in New Zealand’s institutions in a scenario where nuclear winter caused collapse in most of the rest of the world. And I’d see it as plausible that these shifts would be for the worse, and would cause NZ’s institutions to no longer be “intact”. (I’m not sure whether this is really a strong disagreement with Ord, as I’m not sure precisely what he meant by “hard to see”.)
The more generalised version of the ideas I’m expressing is that I’m quite concerned about what “recovery” from collapse might look like—I think in a lot of scenarios, recovery along technological and economic dimensions seems fairly likely, but it seems far harder to say what our morals, norms, social institutions, political systems, etc. would be like. It’s quite unclear to me how inevitable the apparent global trends towards something like capitalism (rather than something like feudalism), democracy, moral circle expansion, liberty for slaves, etc. were, and whether any inevitability there was would remain in place following the “scarring” and upheaval of a collapse.
This view is related to the following statements from Beckstead (2015):
If a global catastrophe occurs, I believe there is some (highly uncertain) probability that civilization would not fully recover (though I would also guess that recovery is significantly more likely than not). This seems possible to me for the general and non-specific reason that the mechanisms of civilizational progress are not understood and there is essentially no historical precedent for events severe enough to kill a substantial fraction of the world’s population. I also think that there are more specific reasons to believe that an extreme catastrophe could degrade the culture and institutions necessary for scientific and social progress, and/or upset a relatively favorable geopolitical situation. This could result in increased and extended exposure to other global catastrophic risks, an advanced civilization with a flawed realization of human values, failure to realize other “global upside possibilities,” and/or other issues.[...]In this way, our situation seems analogous to the situation of someone who is caring for a sapling, has very limited experience with saplings, has no mechanistic understanding of how saplings work, and wants to ensure that nothing stops the sapling from becoming a great redwood. It would be hard for them to be confident that the sapling’s eventual long-term growth would be unaffected by unprecedented shocks—such as cutting off 40% of its branches or letting it go without water for 20% longer than it ever had before—even taken as given that such shocks wouldn’t directly/immediately result in its death. For similar reasons, it seems hard to be confident that humanity’s eventual long-term progress would be unaffected by a catastrophe that resulted in hundreds of millions of deaths.
 I’m not sure precisely what any of those things would look like, how they could lead to collapse, how likely they are, or how likely recovery from such a collapse might be in any case. Perhaps Ord has looked into such possibilities in depth, and concluded they don’t pose a major concern. But to me it at least seems plausible that they could cause a major collapse even in places such as New Zealand. And if collapse does occur, I see recovery as not guaranteed (although probably >50% likely, at least for economic and technological recovery).
Differential progress / intellectual progress / technological development—Michael Aird (me), 2020
Differential technological development—summarised introduction—james_aung, 2020
Differential Intellectual Progress as a Positive-Sum Project—Tomasik, 2013/2015
Differential technological development: Some early thinking—Beckstead (for GiveWell), 2015/2016
Differential progress—EA Concepts
Differential technological development—Wikipedia
Existential Risk and Economic Growth—Aschenbrenner, 2019 (summary by Alex HT here)
On Progress and Prosperity—Christiano, 2014
How useful is “progress”? - Christiano, ~2013
Improving the future by influencing actors’ benevolence, intelligence, and power—Aird, 2020
Differential intellectual progress—LW Wiki
Existential Risks: Analyzing Human Extinction Scenarios—Bostrom, 2002 (section 9.4) (introduced the term differential technological development, I think)
Intelligence Explosion: Evidence and Import—Muehlhauser & Salamon (for MIRI) (section 4.2) (introduced the term differential intellectual development, I think)
The Precipice—Ord, 2020 (page 206)
Some sources that are quite relevant but that don’t explicitly use those terms
Strategic Implications of Openness in AI Development—Bostrom, 2017
The growth of our “power” (or “science and technology”) vs our “wisdom” (see, e.g., page 34 of The Precipice)
The “pacing problem” (see, e.g., footnote 57 in Chapter 1 of The Precipice)
tl;dr I think it’s “another million years”, or slightly longer, but I’m not sure.
In The Precipice, Toby Ord writes:
How much of this future might we live to see? The fossil record provides some useful guidance. Mammalian species typically survive for around one million years before they go extinct; our close relative, Homo erectus, survived for almost two million. If we think of one million years in terms of a single, eighty-year life, then today humanity would be in its adolescence—sixteen years old, just coming into our power; just old enough to get ourselves into serious trouble.
(There are various extra details and caveats about these estimates in the footnotes.)
Ord also makes similar statements on the FLI Podcast, including the following:
If you think about the expected lifespan of humanity, a typical species lives for about a million years [I think Ord meant “mammalian species”]. Humanity is about 200,000 years old. We have something like 800,000 or a million or more years ahead of us if we play our cards right and we don’t lead to our own destruction. The analogy would be 20% of the way through our life[...]
I think this is a strong analogy from a poetic perspective. And I think that highlighting the typical species’ lifespan is a good starting point for thinking about how long we might have left. (Although of course we could also draw on many other facts for that analysis, as Ord discusses in the book.)
But I also think that there’s a way in which the lifespan analogy might be a bit misleading. If a human is 70, we expect they have less time left to live than if a human is 20. But I’m not sure whether, if a species is 700,000 years old, we should expect that species to go extinct sooner than a species that is 200,000 years old will.
My guess would be that a ~1 million year lifespan for a typical mammalian species would translate into a roughly 1 in a million chance of extinction each year, which doesn’t rise or fall very much in a predictable way over most of the species’ lifespan. Specific events, like changes in the climate or another species arriving/evolving, could easily change the annual extinction rate. But I’m not aware of an analogy here to how ageing increases the annual risk of humans dying from various causes.
I would imagine that, even if a species has been around for almost or more than a million years, we should still perhaps expect a roughly 1 in a million chance of extinction each year. Or perhaps we should even expect them to have a somewhat lower annual chance of extinction, and thus a higher expected lifespan going forwards, based on how long they’ve survived so far?
(But I’m also not an expert on the relevant fields—not even certain what they would be—and I didn’t do extra research to inform this shortform comment.)
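To make that guess concrete, here is a minimal sketch of what a constant annual extinction risk would imply. The 1-in-a-million figure and the function names are illustrative assumptions of mine, not estimates taken from The Precipice or from the fossil record. Under this assumed model, the chance of surviving another 100,000 years comes out the same for a 200,000-year-old species as for a 700,000-year-old one, and the expected remaining lifespan stays at about a million years whatever the current age.

```python
# Hypothetical illustration: assumes a constant annual extinction hazard.
# The value below is an assumption for the sketch, not a sourced estimate.
ANNUAL_HAZARD = 1e-6  # roughly "1 in a million" per year


def survival_probability(years: int, hazard: float = ANNUAL_HAZARD) -> float:
    """Probability of surviving `years` years in a row under a constant hazard."""
    return (1.0 - hazard) ** years


def conditional_survival(age: int, extra_years: int, hazard: float = ANNUAL_HAZARD) -> float:
    """P(survive to age + extra_years | already survived to age)."""
    return survival_probability(age + extra_years, hazard) / survival_probability(age, hazard)


def expected_remaining_years(hazard: float = ANNUAL_HAZARD) -> float:
    """Mean of a geometric waiting time; it does not depend on current age."""
    return 1.0 / hazard


# Compare a "young" and an "old" species over the next 100,000 years.
for age in (200_000, 700_000):
    print(age, round(conditional_survival(age, 100_000), 4), expected_remaining_years())
```

If real extinction risk instead rose with a species’ age, the way human mortality does, the conditional survival figures would differ between the two ages; the sketch only shows what the no-ageing assumption implies.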
I don’t think that Ord actually intends to imply that species’ “lifespans” work like humans’ lifespans do. But the analogy does seem to imply it. And in the FLI interview, he does seem to briefly imply that, though of course there he was speaking off the cuff.
I’m also not sure how important this point is, given that humans are very atypical anyway. But I thought it was worth noting in a shortform comment, especially as I expect that, in the wake of The Precipice being great, statements along these lines may be quoted regularly over the coming months.
I thought The Precipice was a fantastic book; I’d highly recommend it. And I agree with a lot of Chivers’ review of it for The Spectator. I think Chivers captures a lot of the important points and nuances of the book, often with impressive brevity and accessibility for a general audience. (I’ve also heard good things about Chivers’ own book.)
But there are three parts of Chivers’ review that seem to me like they’re somewhat un-nuanced, or overstate/oversimplify the case for certain things, or could come across as overly alarmist.
I think Ord is very careful to avoid such pitfalls in The Precipice, and I’d guess that falling into such pitfalls is an easy and common way for existential risk related outreach efforts to have less positive impacts than they otherwise could, or perhaps even backfire. I understand that a review gives one far less space to work with than a book, so I don’t expect anywhere near the level of nuance and detail. But I think that overconfident or overdramatic statements of uncertain matters (for example) can still be avoided.
I’ll now quote and comment on the specific parts of Chivers’ review that led to that view of mine.
Firstly, in my view, there are three flaws with the opening passage of the review:
Humanity has come startlingly close to destroying itself in the 75 or so years in which it has had the technological power to do so. Some of the stories are less well known than others. One, buried in Appendix D of Toby Ord’s splendid The Precipice, I had not heard, despite having written a book on a similar topic myself. During the Cuban Missile Crisis, a USAF captain in Okinawa received orders to launch nuclear missiles; he refused to do so, reasoning that the move to DEFCON 1, a war state, would have arrived first.
Not only that: he sent two men down the corridor to the next launch control centre with orders to shoot the lieutenant in charge there if he moved to launch without confirmation. If he had not, I probably would not be writing this — unless with a charred stick on a rock.
First issue: Toby Ord makes it clear that “the incident I shall describe has been disputed, so we cannot yet be sure whether it occurred.” Ord notes that “others who claimed to have been present in the Okinawa missile bases at the time” have since challenged this account, although there is also “some circumstantial evidence” supporting the account. Ultimately, Ord concludes “In my view this alleged incident should be taken seriously, but until there is further confirmation, no one should rely on it in their thinking about close calls.” I therefore think Chivers should’ve made it clear that this is a disputed story.
Second issue: My impression from the book is that, even in the account of the person claiming this story is true, the two men sent down the corridor did not turn out to be necessary to avert the launch. (That said, the book isn’t explicit on the point, so I’m unsure.) Ord writes that Bassett “telephoned the Missile Operations Centre, asking the person who radioed the order to either give the DEFCON 1 order or issue a stand-down order. A stand-down order was quickly given and the danger was over.” That is the end of Ord’s retelling of the account itself (rather than discussion of the evidence for or against it).
Third issue: I think it’s true that, if a nuclear launch had occurred in that scenario, a large-scale nuclear war probably would’ve occurred (though it’s not guaranteed, and it’s hard to say). And if that happened, it seems technically true that Chivers probably wouldn’t have written this review. But I think that’s primarily because history would’ve just unfolded very, very differently. Chivers seems to imply this is because civilization probably would’ve collapsed, and done so so severely that even technologies such as pencils would be lost and that they’d still be lost all these decades on (such that, if he was writing this review, he’d do so with “a charred stick on a rock”).
This may seem like me taking a bit of throwaway rhetoric or hyperbole too seriously, and that may be so. But I think among the key takeaways of the book were vast uncertainties around whether certain events would actually lead to major catastrophes (e.g., would a launch lead to a full-scale nuclear war?), whether catastrophes would lead to civilizational collapse (e.g., how severe and long-lasting would the nuclear winter be, and how well would we adapt?), how severe collapses would be (e.g., to pre-industrial or pre-agricultural levels?), and how long-lasting collapses would be (from memory, Ord seems to think recovery is in fact fairly likely).
So I worry that a sentence like that one makes the book sound somewhat alarmist, doomsaying, and naive/simplistic, whereas in reality it seems to me quite nuanced and open about the arguments for why existential risk from certain sources may be “quite low”—and yet still extremely worth attending to, given the stakes.
To be fair, or to make things slightly stranger, Chivers does later say:
Perhaps surprisingly, [Ord] doesn’t think that nuclear war would have been an existential catastrophe. It might have been — a nuclear winter could have led to sufficiently dreadful collapse in agriculture to kill everyone — but it seems unlikely, given our understanding of physics and biology.
(Also, as an incredibly minor point, I think the relevant appendix was Appendix C rather than D. But maybe that was different in different editions or in an early version Chivers saw.)
Secondly, Chivers writes:
[Ord] points out that although the difference between a disaster that kills 99 per cent of us and one that kills 100 per cent would be numerically small, the outcome of the latter scenario would be vastly worse, because it shuts down humanity’s future.
I don’t recall Ord ever saying anything like the claim that the death of 1 percent of the population would be “numerically small”. Ord repeatedly emphasises and reminds the reader that something really can count as deeply or even unprecedentedly awful, and well worth expending resources to avoid, even if it’s not an existential catastrophe. This seems to me a valuable thing to do, as otherwise the x-risk community could easily be seen as coldly dismissive of any sub-existential catastrophes. (Plus, such catastrophes really are very bad and well worth expending resources to avoid; this is something I would’ve said anyway, but it seems especially pertinent in the current pandemic.)
I think saying “the difference between a disaster that kills 99 per cent of us and one that kills 100 per cent would be numerically small” cuts against that goal, and again could paint Ord as more simplistic or extremist than he really is.
Finally (for the purpose of my critiques), Chivers writes:
We could live for a billion years on this planet, or billions more on millions of other planets, if we manage to avoid blowing ourselves up in the next century or so.
To me, “avoid blowing ourselves up” again sounds quite informal or naive or something like that. It doesn’t leave me with the impression that the book will be a rigorous and nuanced treatment of the topic. Plus, Ord isn’t primarily concerned with us “blowing ourselves up”—the specific risks he sees as the largest are unaligned AI, engineered pandemics, and “unforeseen anthropogenic risk”.
And even in the case of nuclear war, Ord is quite clear that it’s the nuclear winter that’s the largest source of existential risk, rather than the explosions themselves (though of course the explosions are necessary for causing such a winter). In fact, Ord writes “While one often hears the claim that we have enough nuclear weapons to destroy the world many times over, this is loose talk.” (And he explains why this is loose talk.)
So again, this seems like a case where Ord actively separates his clear-headed analysis of the risks from various naive, simplistic, alarmist ideas that are somewhat common among some segments of the public, but where Chivers’ review makes it sound (at least to me) like the book will match those sorts of ideas.
All that said, I should again note that I thought the review did a lot right. In fact, I have no quibbles at all with anything from that last quote onwards.
This was an excellent meta-review! Thanks for sharing it.
I agree that these little slips of language are important; they can easily compound into very stubborn memes. (I don’t know whether the first person to propose a paperclip AI regrets it, but picking a different example seems like it could have had a meaningful impact on the field’s progress.)
These seem to often be examples of hedge drift, and their potential consequences seem like examples of memetic downside risks.
Information hazards: a very simple typology—Will Bradshaw, 2020
Information hazards and downside risks—Michael Aird (me), 2020
Information hazards—EA concepts
Information Hazards in Biotechnology - Lewis et al., 2019
Bioinfohazards—Crawford, Adamson, Ladish, 2019
Information Hazards—Bostrom, 2011 (I believe this is the paper that introduced the term)
Terrorism, Tylenol, and dangerous information—Davis_Kingsley, 2018
Lessons from the Cold War on Information Hazards: Why Internal Communication is Critical—Gentzel, 2018
Horsepox synthesis: A case of the unilateralist’s curse? - Lewis, 2018
Mitigating catastrophic biorisks—Esvelt, 2020
The Precipice (particularly pages 135-137) - Ord, 2020
Information hazard—LW Wiki
Thoughts on The Weapon of Openness—Will Bradshaw, 2020
Exploring the Streisand Effect—Will Bradshaw, 2020
Informational hazards and the cost-effectiveness of open discussion of catastrophic risks—Alexey Turchin, 2018
A point of clarification on infohazard terminology—eukaryote, 2020
Somewhat less directly relevant
The Offense-Defense Balance of Scientific Knowledge: Does Publishing AI Research Reduce Misuse? - Shevlane & Dafoe, 2020 (commentary here)
The Vulnerable World Hypothesis—Bostrom, 2019 (footnotes 39 and 41 in particular)
Managing risk in the EA policy space—weeatquince, 2019 (touches briefly on information hazards)
Strategic Implications of Openness in AI Development—Bostrom, 2017 (sort-of relevant, though not explicitly about information hazards)
[Review] On the Chatham House Rule (Ben Pace, Dec 2019) - Pace, 2019
Interesting example: Leo Szilard and cobalt bombs
In The Precipice, Toby Ord mentions the possibility of “a deliberate attempt to destroy humanity by maximising fallout (the hypothetical cobalt bomb)” (though he notes such a bomb may be beyond our current abilities). In a footnote, he writes that “Such a ‘doomsday device’ was first suggested by Leo Szilard in 1950”. Wikipedia similarly says:
The concept of a cobalt bomb was originally described in a radio program by physicist Leó Szilárd on February 26, 1950. His intent was not to propose that such a weapon be built, but to show that nuclear weapon technology would soon reach the point where it could end human life on Earth, a doomsday device. Such “salted” weapons were requested by the U.S. Air Force and seriously investigated, but not deployed. [...]
The Russian Federation has allegedly developed cobalt warheads for use with their Status-6 Oceanic Multipurpose System nuclear torpedoes. However many commentators doubt that this is a real project, and see it as more likely to be a staged leak to intimidate the United States.
That’s the extent of my knowledge of cobalt bombs, so I’m poorly placed to evaluate that action by Szilard. But this at least looks like it could be an unusually clear-cut case of one of Bostrom’s subtypes of information hazards:
Attention hazard: The mere drawing of attention to some particularly potent or relevant ideas or data increases risk, even when these ideas or data are already “known”.
Because there are countless avenues for doing harm, an adversary faces a vast search task in finding out which avenue is most likely to achieve his goals. Drawing the adversary’s attention to a subset of especially potent avenues can greatly facilitate the search. For example, if we focus our concern and our discourse on the challenge of defending against viral attacks, this may signal to an adversary that viral weapons—as distinct from, say, conventional explosives or chemical weapons—constitute an especially promising domain in which to search for destructive applications. The better we manage to focus our defensive deliberations on our greatest vulnerabilities, the more useful our conclusions may be to a potential adversary.
It seems that Szilard wanted to highlight how bad cobalt bombs would be, that no one had recognised—or at least not acted on—the possibility of such bombs until he tried to raise awareness of them, and that since he did so there may have been multiple government attempts to develop such bombs.
I was a little surprised that Ord didn’t discuss the potential information hazards angle of this example, especially as he discusses a similar example with regards to Japanese bioweapons in WWII elsewhere in the book.
I was also surprised by the fact that it was Szilard who took this action. This is because one of the main things I know Szilard for is being arguably one of the earliest (the earliest?) examples of a scientist bucking standard openness norms due to, basically, concerns of information hazards potentially severe enough to pose global catastrophic risks. E.g., a report by MIRI/Katja Grace states:
Leó Szilárd patented the nuclear chain reaction in 1934. He then asked the British War Office to hold the patent in secret, to prevent the Germans from creating nuclear weapons (Section 2.1). After the discovery of fission in 1938, Szilárd tried to convince other physicists to keep their discoveries secret, with limited success.
Cross-posted to LessWrong as a top-level post.
I recently finished reading Henrich’s 2020 book The WEIRDest People in the World. I would highly recommend it, along with Henrich’s 2015 book The Secret of Our Success; I’ve roughly ranked them the 8th and 9th most useful-to-me of the 47 EA-related books I’ve read since learning about EA.
In this shortform, I’ll:
Summarise my “four main updates” from this book
Share the Anki cards I made for myself when reading the book
I intend this as a lower-effort alternative to writing notes specifically for public consumption or writing a proper book review
If you want to download the cards themselves to import them into your own deck, follow this link.
My hope is that this will be a low-effort way for me to help some EAs to quickly:
Gain some key insights from the book
Work out whether reading/listening to the book is worth their time
(See also Should pretty much all content that’s EA-relevant and/or created by EAs be (link)posted to the Forum?)
You may find it also/more useful to read
This review of the book on LessWrong (which I haven’t read myself)
The Wikipedia page on the book
The Slate Star Codex review of Secret of Our Success
My four main updates
I wrote this quickly and only after finishing the book; take it all with a grain of salt.
Here are what I think are the four main ways in which WEIRDest People shifted my beliefs on relatively high-level points that seem potentially decision-relevant, as distinct from specific facts I learned:
The book made me a bit less concerned about unrecoverable collapse and unrecoverable dystopia (i.e., the two types of existential catastrophe other than extinction, in Toby Ord’s breakdown)
This is because a big part of my concern was based on the idea that the current state and trend for things like values, institutions, and political systems seems unusually good by historical standards, and we don’t fully understand how that state and trend came about, so we should worry that any “major disruption” could somehow throw us off course and that we wouldn’t be able to get back on course (see Beckstead, 2015).
E.g., perhaps a major war could knock us from a stable equilibrium with many liberal democracies to a stable equilibrium with many authoritarian regimes.
But WEIRDest People made me a bit more confident that our current values, institutions, and political systems would stick around or re-emerge even after a “major disruption”, because they or the things driving them are “fit” in a cultural evolutionary sense.
The book made me less confident that the Industrial Revolution involved a stark change in a number of key trends, and/or made me more open to the idea that the drivers of the changes in those trends began long before the Industrial Revolution
My previous belief was quite influenced by a post by Luke Muehlhauser
Henrich seems to provide strong evidence that some key trends started long before 1750 (some starting in the first millennium CE, most starting by 1200-1500)
But I’m not sure how much Henrich’s book and Muehlhauser’s post actually conflict with each other
E.g., perhaps Henrich would agree (a) that there were discontinuities in all the metrics Muehlhauser looked at, and (b) that those metrics are more directly important than the metrics Henrich looked at; perhaps Henrich would say that the earlier discontinuities in the metrics he looked at were just the things that laid the foundations, not what directly mattered
The book made me less confident that economic growth/prosperity is one of the main drivers of various ways in which the world seems to have gotten better over time (e.g., more democracy, more science, more concern for all of humanity rather than just one’s ingroup)
The book made me more open to the idea that other factors (WEIRD psychology and institutions) caused both economic growth/prosperity and those other positive trends
E.g., I felt that the book pushed somewhat against an attitude expressed in this GiveWell post on flow-through effects
This is related in some ways to my above-mentioned update about the industrial revolution
The book made me more inclined to think that it’s really hard to design institutions/systems based on explicit ideas about how they’ll succeed in achieving desired objectives, or at least that humans tend to be bad at that, and that success more often results from a process of random variation followed by competition.
In reality, this update was mainly caused by Henrich’s previous book, The Secret of Our Success, but WEIRDest People drummed it in a bit more, and it seemed worth mentioning here.
Each of those updates was more like a partial shift than a total reversal of my previous views
See also Update Yourself Incrementally
E.g., I still tentatively think longtermists should devote more resources/attention to risks of unrecoverable dystopia than they currently do, but I’m now a bit less confident about that.
I made this list only after finishing the book, and hadn’t been taking notes with this in mind along the way
So I might be distorting these updates or forgetting other important updates
My Anki cards
See the bottom of this shortform for caveats about my Anki cards.
The indented parts are the questions, the answers are in “spoiler blocks” (hover over them to reveal the text), and the parts in square brackets are my notes-to-self.
Henrich’s team found that people from more market-integrated societies made ___ offers in the ultimatum game (compared to people from less market-integrated societies)
Higher, more equal
Credence goods are…
those that buyers can’t easily assess for quality (e.g. a steel sword, whose carbon content is hard to determine)
Henrich discusses strategies to allow trade to happen in absence of market norms. Three I found interesting were…
Silent trade; divine oaths; and a single, widely scattered clan or ethnic group handling all aspects of moving goods through a vast trade network
Four things Henrich said KII and prevalence of cousin marriage were positively correlated with were…
High claims (dishonesty) in the Impersonal Honesty Game
Unpaid parking tickets per diplomat
Seven things Henrich said KII, prevalence of cousin marriage, and/or contemporary KII were negatively correlated with were…
Importance of intentionality in judging a “theft”
Contributions in the Public Good Game [there were two proxies for this]
Voluntary blood donations per 1,000 people
[Some of these things were measured by proxies I’m somewhat skeptical of the relevance/significance of.]
In India and China, analytic thinking (as measured using the triad task) is negatively correlated with…
Percentage of land under rice paddy cultivation
What are three effects Henrich suggests that exposure to war tends to have?
Tightening of interdependent network bonds
Strengthening of commitments to important social norms
Deepening of people’s religious devotion
What 2 things does Henrich suggest have some similar effects to exposure to war?
Exposure to natural disasters
Nonviolent intergroup competition (e.g. between firms) [though he suggests this’ll likely have smaller or no effects on religious devotion]
Henrich argues that at least 2 things (a) arose in part due to the emerging WEIRD psychology in the second millennium CE [and maybe the first as well?], and (b) then further contributed to the emergence of that WEIRD psychology. What are those 2 things?
Democracy and/or participatory governance
[He may have also mentioned other things. E.g., I think maybe he sees scientific thinking, universities, and more rational legal systems as also fitting that bill.]
What were the two key findings of Gurven et al. (2013)? [This has to do with personality.]
In the first test of the five-factor model of personality variation in a largely illiterate, indigenous society, Gurven et al. failed to find support for the model
That society’s personality variation seemed to display 2 principal factors that may reflect socioecological characteristics common to small-scale societies
[I learned of this study via Henrich’s WEIRDest People.]
What does Henrich say increases suicide rates?
Rates of Protestants relative to Catholics in an area
[He says historical Protestantism rates increased suicide rates at that time. I can’t remember if he also says historical P rates increase present suicide rates, or that present P rates increase present suicide rates. But I’m guessing he believes those things.]
Does Henrich seem to think Protestants tend to basically have more extreme versions of WEIRD tendencies than Catholics do?
Muthukrishna and Henrich argue that rates of innovation are heavily influenced by what 3 factors?
sociality (seemingly meaning both size and interconnectedness of a population)
cultural variance (analogous to genetic variance)
Henrich says that 4 voluntary associations (particularly) contributed to broadening the flow of knowledge and technology around Europe. These were:
Charter cities, monasteries, apprenticeships, universities
Henrich says that, historically, kings and other elites have tended to crack down on people with new ideas, inventions, or techniques that might shake up the existing power structure. He says this problem was mitigated in Europe [maybe just in the second millennium CE?] by 2 factors:
Political disunity (there were many competing states)
Relative cultural unity (due to transnational networks like the church, guilds, and the republic of letters)
[So people and groups could escape oppression by moving to other places.]
Henrich says it seems like banking deregulation increased ___, which in turn increased ___.
Interfirm competition; impersonal trust
What was the main way Henrich updated me away from the impression I’d gotten from Muehlhauser’s industrial revolution post?
Henrich seems to provide strong evidence that key trends started long before 1750 (some starting in the first millennium CE, most starting by 1200-1500)
[See caveats in the “My four main updates” section.]
The emergence of sedentary agriculture drove a(n) ____ in/of kin-based institutions.
[This led to norms related to things like cousin marriage, corporate ownership, patrilocal residence, segmentary lineages, and ancestor worship.]
Diamond argues that continents that are spread out in an ___ direction, such as ___, had a developmental advantage because of ___.
the ease with which crops, animals, ideas and technologies could spread between areas of similar latitude
[Quoting a PBS webpage on Guns, Germs and Steel.]
What does Henrich say is the basic relationship between his arguments and Diamond’s arguments in Guns, Germs and Steel?
Henrich’s arguments essentially pick up where Diamond’s arguments leave off
[I.e. Diamond’s arguments explain global inequality up to ~1000CE well, but don’t explain things like why the Industrial Revolution happened in Britain, whereas Henrich’s arguments can explain those later events.]
Henrich says that one reason why democracy hasn’t been taken up as effectively/thoroughly in Islamic countries is that Islam...
Says daughters should inherit half of what sons inherit (rather than nothing/very little), which likely drove the spread of and/or sustained a custom in which daughters marry their father’s brother’s sons, or more broadly a custom of marrying within clans. [This is to keep wealth within a family/clan.]
This encourages intensive forms of kinship, which favours certain ways of thinking and institutions that don’t mesh well with democracy.
[I may be slightly misrepresenting the ideas.]
Japan, South Korea, and China have been able to adapt relatively rapidly to the economic configurations and global opportunities created by WEIRD societies. Henrich says that one factor that was likely important in that was that these societies had experienced long histories of ___, which had ___.
agriculture and state-level governance;
fostered the evolution of cultural values, customs, and norms encouraging formal education, industriousness, and a willingness to defer gratification.
[These can be seen as pre-existing cultural institutions that happened to dovetail nicely with the new institutions acquired from WEIRD societies.]
Japan, South Korea, and China have been able to adapt relatively rapidly to the economic configurations and global opportunities created by WEIRD societies. Henrich says that one factor that was likely important in that was that these societies had powerful ___, which ___.
helped them rapidly adopt and implement key kin-based institutions acquired from WEIRD societies (e.g. abolishing polygamy, clans, arranged marriages).
Henrich says studies on the effects of evolution by natural selection (not cultural selection) on length of time people spend in school indicate that...
Evolution by natural selection reduced that time by about 8 months over the 20th century
[And by about 1.5 months per generation—maybe just more recently.
But this was very much offset by cultural evolution increasing the length of time in school by a larger amount.]
 See here for the article that inspired me to actually start using Anki properly. Hat tip to Michelle Hutchinson for linking to that article and thus prompting me to read it. Note that some of the Anki cards that I made and include in this post violate some of the advice in that article—in particular, the advice to try to ensure that questions and answers each express only one idea.
 Caveats about these Anki cards:
It’s possible that some of these cards include mistakes, or will be confusing or misleading out of context.
I haven’t fact-checked Henrich on any of these points.
I only started making the cards after I was more than halfway through the book
I of course only made cards for some of the interesting insights in the remaining chapters
Some of these cards include direct quotes without having quote marks.
Some other cards are just my own interpretations, rather than definitely 100% parroting what the book is saying, but don’t note that fact.
A lot of the value of the book is not for the specific facts it collects, but rather its overarching theories and ways of looking at things. I think Anki cards could directly focus on those things, but I was making the cards for myself, so I mostly made them about specific facts that I thought would keep my memory of the theories and frameworks fresh.
Oh, please, do post this type of stuff, especially in shortform… but, unfortunately, you can’t expect a lot of karma—attention is a scarce resource, right? I’d totally like to see you blog or send a newsletter with this.
Meta: I recently made two similar posts as top-level posts rather than as shortforms. Both got relatively little karma, especially the second. So I feel unsure whether posts/shortforms like this are worth putting in the time to make, and are worth posting as top-level posts vs as shortforms. If any readers have thoughts on that, let me know.
(Though it’s worth noting that making these posts takes me far less time than making regular posts does—e.g., this shortform took me 45 minutes total. So even just being mildly useful to a few people might be sufficient to justify that time cost.)
[Edited to add: I added the “My four main updates” section to this shortform 4 days after I originally posted it and made this comment.]
I really like these types of posts. I have some vague sense that these both would get more engagement and excitement on LW than the EA Forum, so maybe worth also posting them to there.
Thanks for that info and that suggestion. Given that, I’ve tried cross-posting my Schelling notes, as an initial experiment.
The old debate over “giving now vs later” is now sometimes phrased as a debate about “patient philanthropy”. 80,000 Hours recently wrote a post using the term “patient longtermism”, which seems intended to:
focus only on how the debate over patient philanthropy applies to longtermists
generalise the debate to also include questions about work (e.g., should I do a directly useful job now, or build career capital and do directly useful work later?)
They contrast this against the term “urgent longtermism”, to describe the view that favours doing more donations and work sooner.
I think the terms “patient longtermism” and “urgent longtermism” are both useful. One reason I think “urgent longtermism” is useful is that it doesn’t sound pejorative, whereas “impatient longtermism” would.
I suggest we also use three additional terms:
Like “patient philanthropy” and unlike “patient longtermism”, this term is cause-neutral.
But like “patient longtermism” and unlike “patient philanthropy”, this term clearly relates to both work and donations, not merely to donations.
Discussions about “patient philanthropy” do often make some reference to optimal timing of work, but it’s not usually central. Also, the term “philanthropy” is typically used just for donations.
Again, this is partly to avoid negative connotations, as is my next suggestion.
I don’t think “patient” and “urgent” are opposites, in the way Phil Trammell originally defined patience. He used “patient” to mean a zero pure time preference, and “impatient” to mean a nonzero pure time preference. You can believe it is urgent that we spend resources now while still having a pure time preference. Trammell’s paper argued that patient actors should give later, irrespective of how much urgency you believe there is. (Although he carved out some exceptions to this.)
Yes, Trammell writes:
We will call someone “patient” if he has low (including zero) pure time preference with respect to the welfare he creates by providing a good.
And I agree that a person with a low or zero pure time preference may still want to use a large portion of their resources now, for example due to thinking now is a much “hingier”/”higher leverage” time than average, or thinking value drift will be high.
You highlighting this makes me doubt whether 80,000 Hours should’ve used “patient longtermism” as they did, whether they should’ve used “patient philanthropy” as they arguably did*, and whether I should’ve proposed the term “patient altruism” for the position that we should give/work later rather than now (roughly speaking).
On the other hand, if we ignore Trammell’s definition of the term, I think “patient X” does seem like a natural fit for the position that we should do X later, rather than now.
Do you have other ideas for terms to use in place of “patient”? Maybe “delayed”? (I’m definitely open to renaming the tag. Other people can as well.)
If the case for patient philanthropy is as strong as Phil believes, many of us should be trying to improve the world in a very different way than we are now.
He points out that on top of being able to dispense vastly more, whenever your trustees decide to use your gift to improve the world, they’ll also be able to rely on the much broader knowledge available to future generations. [...]
And there’s a third reason to wait as well. What are the odds that we today live at the most critical point in history, when resources happen to have the greatest ability to do good? It’s possible. But the future may be very long, so there has to be a good chance that some moment in the future will be both more pivotal and more malleable than our own.
Of course, there are many objections to this proposal. If you start a foundation you hope will wait around for centuries, might it not be destroyed in a war, revolution, or financial collapse?
Or might it not drift from its original goals, eventually just serving the interest of its distant future trustees, rather than the noble pursuits you originally intended?
Or perhaps it could fail for the reverse reason, by staying true to your original vision — if that vision turns out to be as deeply morally mistaken as the Rhodes’ Scholarships initial charter, which limited it to ‘white Christian men’.
Alternatively, maybe the world will change in the meantime, making your gift useless. At one end, humanity might destroy itself before your trust tries to do anything with the money. Or perhaps everyone in the future will be so fabulously wealthy, or the problems of the world already so overcome, that your philanthropy will no longer be able to do much good.
Are these concerns, all of them legitimate, enough to overcome the case in favour of patient philanthropy? [...]
Should we have a mixed strategy, where some altruists are patient and others impatient?
This suggests to me that 80k is, at least in that post, taking “patient philanthropy” to refer not just to a low or zero pure time preference, but instead to a low or zero rate of discounting overall, or to a favouring of giving/working later rather than now.
Appendix A of The Precipice—Ord, 2020 (see also the footnotes, and the sources referenced)
The Long-Term Future: An Attitude Survey—Vallinder, 2019
Older people may place less moral value on the far future—Sanjay, 2019
Making people happy or making happy people? Questionnaire-experimental studies of population ethics and policy—Spears, 2017
The Psychology of Existential Risk: Moral Judgments about Human Extinction—Schubert, Caviola & Faber, 2019
Psychology of Existential Risk and Long-Termism—Schubert, 2018 (space for discussion here)
Descriptive Ethics – Methodology and Literature Review—Althaus, ~2018 (this is something like an unpolished appendix to Descriptive Population Ethics and Its Relevance for Cause Prioritization, and it would make sense to read the latter post first)
A Small Mechanical Turk Survey on Ethics and Animal Welfare—Brian Tomasik, 2015
Work on “future self continuity” might be relevant (I haven’t looked into it)
Some evidence about the views of EA-aligned/EA-adjacent groups
Survey results: Suffering vs oblivion—Slate Star Codex, 2016
Survey about preferences for the future of AI—FLI, ~2017
Some evidence about the views of EAs
Facebook poll relevant to preferences for one’s own suffering vs bliss—Jay Quigley, 2016
See also my collection of sources relevant to moral circles, moral boundaries, or their expansion, and my collection of sources relevant to the idea of “moral weight”.
Works by the EA community or related communities
Moral circles: Degrees, dimensions, visuals—Michael Aird (i.e., me), 2020
Why I prioritize moral circle expansion over artificial intelligence alignment—Jacy Reese, 2018
The Moral Circle is not a Circle—Grue_Slinky, 2019
The Narrowing Circle—Gwern, 2019 (see here for Aaron Gertler’s summary and commentary)
Radical Empathy—Holden Karnofsky, 2017
Various works from the Sentience Institute, including:
a presentation by Jamie Harris
a presentation by Jacy Reese (the table shown at 10:15 is perhaps especially relevant)
another video by Reese
Extinction risk reduction and moral circle expansion: Speculating suspicious convergence—Aird, work in progress
-Less relevant, or with only a small section that’s directly relevant-
Why do effective altruists support the causes we do? - Michelle Hutchinson, 2015
Finding more effective causes—Michelle Hutchinson, 2015
Cosmopolitanism—Topher Hallquist, 2014
Three Heuristics for Finding Cause X—Kerry Vaughan, 2016
The Drowning Child and the Expanding Circle—Peter Singer, 1997
The expected value of extinction risk reduction is positive—Brauner and Grosse-Holz, 2018
Crucial questions for longtermists: Overview—Michael Aird (me), work in progress
Should animals, plants, and robots have the same rights as you? - Sigal Samuel (for Vox’s Future Perfect), 2019
(There appears to be a substantial and continuing amount of psychological work on this topic; the papers I list here are just a fairly random subset to get you started.)
Toward a Psychology of Moral Expansiveness—Crimston et al., 2018
Moral expansiveness: Examining variability in the extension of the moral world—Crimston et al., 2016 (my unpolished commentary on this is here) (brief summary here)
Centripetal and centrifugal forces in the moral circle: Competing constraints on moral learning - Graham et al., 2017
Expanding the moral circle: Inclusion and exclusion mindsets and the circle of moral regard—Laham, 2009
Ideological differences in the expanse of the moral circle—Waytz et al., 2019
The Expanding Circle—Peter Singer, 1981
The Better Angels of Our Nature—Steven Pinker, 2011
The moral standing of animals: Towards a psychology of speciesism—Caviola, Everett, & Faber, 2019
See also this comment, my collection of sources relevant to the idea of “moral weight”, and my collection of evidence about views on longtermism, time discounting, population ethics, etc. among non-EAs.
The only other very directly related resources I can think of are my own presentation on moral circle expansion and various other short content on Sentience Institute’s website, e.g. our FAQ and some of the talks or videos. But I think that the academic psychology literature you refer to is very relevant here. Good starting-point articles are the “moral expansiveness” article you link to above and “Toward a psychology of moral expansiveness.”
Of course, depending on definitions, a far wider literature could be relevant, e.g. almost anything related to animal advocacy, robot rights, consideration of future beings, consideration of people on the other side of the planet etc.
There’s some wider content on “moral advocacy” or “values spreading,” of which work on moral circle expansion is a part:
Arguments for and against moral advocacy—Tobias Baumann, 2017
Values Spreading is Often More Important than Extinction Risk—Brian Tomasik, 2013
Against moral advocacy—Paul Christiano, 2013
Also relevant: “Should Longtermists Mostly Think About Animals?”
Thanks for adding those links, Jamie!
I’ve now added the first few into my lists above.
I continue to appreciate all the collections you’ve been posting! I expect to find reasons to link to many of these in the years to come.
Good to hear!
Yeah, I hope they’ll be mildly useful to random people at random times over a long period :D
Although I also expect that most people they’d be mildly useful for would probably never be aware they exist, so there may be a better way to do this.
Also, if and when EA coordinates on one central wiki, these could hopefully be folded into or drawn on for that, in some way.
Have any EAs involved in GCR-, x-risk-, or longtermism-related work considered submitting writing for the Bulletin? Should more EAs consider that?
I imagine many such EAs would have valuable things to say on topics the Bulletin’s readers care about, and that they could say those things well and in a way that suits the Bulletin. It also seems plausible that this could be a good way of:
disseminating important ideas to key decision-makers and thereby improving their decisions
either through the Bulletin articles themselves or through them allowing one to then talk individually with such decision-makers
gaining good career capital for certain career paths
e.g., later working in security-related roles in think tanks, NGOs, or governments
That said, I haven’t thought about those claims much, and I’m definitely not sure that this is a better option than other options the relevant EAs have available.
I raise this in part because I might consider writing something to submit to the Bulletin myself at a later stage of my nuclear risk research.
Thanks for those links!
(I also realise now that I’d already seen and found useful Gregory Lewis’s piece for the Bulletin, and had just forgotten that that’s the publication it was in.)
Here’s the Bulletin’s page on writing for them. Some key excerpts:
Readers of the Bulletin of the Atomic Scientists are informed and intelligent; they include top policymakers, researchers, and opinion makers from more than 150 countries and a large contingent of smart non-experts who are interested in the Bulletin’s mission. The Bulletin publishes articles written by the world’s leading science and security experts, who explore the potential for terrible damage to societies from manmade technologies. We focus on ways to prevent catastrophe from the malign or accidental misuse of technology. Our primary coverage areas are nuclear risk, climate change, and other disruptive technologies that could pose an existential threat to humanity.
[...] The Bulletin is committed to serving our readers with a diverse array of perspectives from writers of all sorts of backgrounds. We especially welcome submissions from writers of historically underrepresented groups, including those who are Black, Latinx, Indigenous, people of color, and women. We also encourage the work of younger authors through the Voices of Tomorrow program.
[...] Magazine. The bimonthly magazine features long form articles that generally run from 2,000 to 4,000 words; it is not the word count but the voice and the angle of the pieces that make the magazine distinctive. Read it to understand what the distinction is—we want you to tackle tough topics, make strong arguments, and offer strong takeaways.
[...] Website. We accept opinion (800-1,300 words) and analysis pieces (1,000-3,000 words). Please do use the navigation on our home page to read a few of each of these types of pieces. They will be your best guide to Bulletin style and tone. Have a multimedia idea? Contact the editors directly and pitch them.
[...] Include your bio. The Bulletin is known for publishing the top experts in their respective fields. Please submit your professional biography so that we understand your expertise and what makes you the perfect author to write the piece you are pitching.
Peer review. The Bulletin is not a peer-reviewed journal; however, we do send unsolicited articles to colleagues for outside review. Be prepared to answer questions and to document your points—by way of hyperlinks for web pieces or in the form of footnotes for journal pieces.
[...] Do not submit a research paper. The Bulletin publishes high-concept, high-quality journalism, which is a different form than the research paper. One is not a better form than the other; a research paper is perfectly appropriate to a research journal. It just won’t work with the Bulletin’s format or audience. The Bulletin is its own publication, with long-established parameters, and the best way to gauge what will work for the Bulletin is to read the Bulletin. [Though I’ve been reading the Nuclear Notebook articles, and I’d say they’re closer to research papers or white papers than to journalism. Maybe Nuclear Notebook is unusual in that respect?]
And here’s the page on the Voices of Tomorrow feature:
In its Voices of Tomorrow feature, the Bulletin of the Atomic Scientists invites emerging scholars to submit essays, opinion pieces, and multimedia presentations addressing at least one of the Bulletin‘s core issues: nuclear risk, climate change, and threats from emerging technologies.Beginning in 2015, editors will select one Voices of Tomorrow feature as winner of the Leonard M. Rieser Award; the author of that article will receive a $1,000 check plus a one-year subscription to the Bulletin’s journal, in addition to the publication of their submissions.[...] Submission process. Current students as well as recent graduates are encouraged to submit work. Essays and opinion pieces should not be longer than 2,000 words; video presentations should not exceed 5 minutes in playing time. Each entry must contain: the author’s email address, phone number, short biography, and school affiliation. Submissions should not have been previously published.Submissions should be sent to Bulletin Contributing Editor Dawn Stover at firstname.lastname@example.org; only one contribution at a time will be accepted per author.
In its Voices of Tomorrow feature, the Bulletin of the Atomic Scientists invites emerging scholars to submit essays, opinion pieces, and multimedia presentations addressing at least one of the Bulletin’s core issues: nuclear risk, climate change, and threats from emerging technologies.
Beginning in 2015, editors will select one Voices of Tomorrow feature as winner of the Leonard M. Rieser Award; the author of that article will receive a $1,000 check plus a one-year subscription to the Bulletin’s journal, in addition to the publication of their submissions.
[...] Submission process. Current students as well as recent graduates are encouraged to submit work. Essays and opinion pieces should not be longer than 2,000 words; video presentations should not exceed 5 minutes in playing time. Each entry must contain: the author’s email address, phone number, short biography, and school affiliation. Submissions should not have been previously published.
Submissions should be sent to Bulletin Contributing Editor Dawn Stover at email@example.com; only one contribution at a time will be accepted per author.
See also Venn diagrams of existential, global, and suffering catastrophes
Bostrom & Ćirković (pages 1 and 2):
The term ‘global catastrophic risk’ lacks a sharp definition. We use it to refer, loosely, to a risk that might have the potential to inflict serious damage to human well-being on a global scale.
[...] a catastrophe that caused 10,000 fatalities or 10 billion dollars worth of economic damage (e.g., a major earthquake) would not qualify as a global catastrophe. A catastrophe that caused 10 million fatalities or 10 trillion dollars worth of economic loss (e.g., an influenza pandemic) would count as a global catastrophe, even if some region of the world escaped unscathed. As for disasters falling between these points, the definition is vague. The stipulation of a precise cut-off does not appear needful at this stage. [emphasis added]
Open Philanthropy Project/GiveWell:
risks that could be bad enough to change the very long-term trajectory of humanity in a less favorable direction (e.g. ranging from a dramatic slowdown in the improvement of global standards of living to the end of industrial civilization or human extinction).
Global Challenges Foundation:
threats that can eliminate at least 10% of the global population.
Wikipedia (drawing on Bostrom’s works):
a hypothetical future event which could damage human well-being on a global scale, even endangering or destroying modern civilization. [...]
any risk that is at least “global” in scope, and is not subjectively “imperceptible” in intensity.
Yassif (appearing to be writing for the Open Philanthropy Project):
By our working definition, a GCR is something that could permanently alter the trajectory of human civilization in a way that would undermine its long-term potential or, in the most extreme case, threaten its survival. This prompts the question: How severe would a pandemic need to be to create such a catastrophic outcome? [This is followed by interesting discussion of that question.]
Beckstead (writing for Open Philanthropy Project/GiveWell):
the Open Philanthropy Project’s work on global catastrophic risks focuses on both potential outright extinction events and global catastrophes that, while not threatening direct extinction, could have deaths amounting to a significant fraction of the world’s population or cause global disruptions far outside the range of historical experience.
(Note that Beckstead might not be saying that global catastrophes are defined as those that “could have deaths amounting to a significant fraction of the world’s population or cause global disruptions far outside the range of historical experience”. He might instead mean that Open Phil is focused on the relatively extreme subset of global catastrophes which fit that description. It may be worth noting that he later quotes Open Phil’s other, earlier definition of GCRs, which I listed above.)
My impression is that, at least in EA-type circles, the term “global catastrophic risk” is typically used for events substantially larger than things which cause “10 million fatalities or 10 trillion dollars worth of economic loss (e.g., an influenza pandemic)”.
E.g., the Global Challenges Foundation’s definition implies that the catastrophe would have to be able to eliminate at least ~750 million people, which is 75 times higher than the number Bostrom & Ćirković give. And I’m aware of at least some existential-risk-focused EAs whose impression is that the rough cutoff would be 100 million fatalities.
With that in mind, I also find it interesting to note that Bostrom & Ćirković gave the “10 million fatalities” figure as indicating something clearly is a GCR, rather than as the lower threshold that a risk must clear in order to be a GCR. From their loose definition, it seems entirely plausible that, for example, a risk with 1 million fatalities might be a GCR.
That said, I do agree that “The stipulation of a precise cut-off does not appear needful at this stage.” Personally, I plan to continue to use the term in a quite loose way, but probably primarily for risks that could cause much more than 10 million fatalities.
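To make the comparison between these thresholds concrete, here is a minimal sketch of the arithmetic. It assumes a round world population of ~7.5 billion (my assumption, not a figure from the quoted sources), looks only at fatalities rather than economic damage, and treats anything at or below Bostrom & Ćirković's lower anchor point as "not a global catastrophe", which is a simplification of their deliberately loose wording. The function and variable names are just illustrative.

```python
# Illustrative only. Thresholds are taken from the definitions quoted above;
# the world population figure is an assumed round number.
WORLD_POPULATION = 7_500_000_000  # assumption, not from the quoted sources

# Bostrom & Ćirković's two anchor points (fatalities only, for simplicity).
CLEARLY_NOT_GLOBAL = 10_000      # e.g., a major earthquake
CLEARLY_GLOBAL = 10_000_000      # e.g., an influenza pandemic


def bostrom_cirkovic_label(fatalities: int) -> str:
    """Label a fatality count using the two anchor points; the middle is left vague."""
    if fatalities <= CLEARLY_NOT_GLOBAL:
        return "not a global catastrophe"
    if fatalities >= CLEARLY_GLOBAL:
        return "global catastrophe"
    return "vague"


# Global Challenges Foundation: at least 10% of the global population.
gcf_threshold = WORLD_POPULATION // 10
ratio = gcf_threshold / CLEARLY_GLOBAL

print(gcf_threshold)                        # 750000000
print(ratio)                                # 75.0
print(bostrom_cirkovic_label(1_000_000))    # vague
```

Note that the 1 million fatalities case comes out as "vague" rather than as excluded, which matches the point that the 10 million figure is not a lower bound.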
There is now a Stanford Existential Risk Initiative, which (confusingly) describes itself as:
a collaboration between Stanford faculty and students dedicated to mitigating global catastrophic risks (GCRs). Our goal is to foster engagement from students and professors to produce meaningful work aiming to preserve the future of humanity by providing skill, knowledge development, networking, and professional pathways for Stanford community members interested in pursuing GCR reduction. [emphasis added]
And they write:
What is a Global Catastrophic Risk?
We think of global catastrophic risks (GCRs) as risks that could cause the collapse of human civilization or even the extinction of the human species.
That is much closer to a definition of an existential risk (as long as we assume that the collapse is not recovered from) than of a global catastrophic risk. Given that fact and the clash between the term the initiative uses in its name and the term it uses when describing what it will focus on, it appears this initiative is conflating these two terms/concepts.
This is unfortunate, and could lead to confusion, given that there are many events that would be global catastrophes without being existential catastrophes. An example would be a pandemic that kills hundreds of millions but that doesn’t cause civilizational collapse, or that causes a collapse humanity later fully recovers from. (Furthermore, there may be existential catastrophes that aren’t “global catastrophes” in the standard sense, such as “plateauing — progress flattens out at a level perhaps somewhat higher than the present level but far below technological maturity” (Bostrom).)
For further discussion, see Clarifying existential risks and existential catastrophes.
(I should note that I have positive impressions of the Center for International Security and Cooperation (which this initiative is a part of), that I’m very glad to see that this initiative has been set up, and that I expect they’ll do very valuable work. I’m merely critiquing their use of terms.)
Some more definitions, from or quoted in 80k’s profile on reducing global catastrophic biological risks
Gregory Lewis, in that profile itself:
Global catastrophic risks (GCRs) are roughly defined as risks that threaten great worldwide damage to human welfare, and place the long-term trajectory of humankind in jeopardy. Existential risks are the most extreme members of this class.
Open Philanthropy Project:
[W]e use the term “global catastrophic risks” to refer to risks that could be globally destabilising enough to permanently worsen humanity’s future or lead to human extinction.
Schoch-Spana et al. (2017), on GCBRs, rather than GCRs as a whole:
The Johns Hopkins Center for Health Security’s working definition of global catastrophic biological risks (GCBRs): those events in which biological agents—whether naturally emerging or reemerging, deliberately created and released, or laboratory engineered and escaped—could lead to sudden, extraordinary, widespread disaster beyond the collective capability of national and international governments and the private sector to control. If unchecked, GCBRs would lead to great suffering, loss of life, and sustained damage to national governments, international relationships, economies, societal stability, or global security.
Metaculus features a series of questions on global catastrophic risks. The author of these questions operationalises a global catastrophe as an event in which “the human population decrease[s] by at least 10% during any period of 5 years or less”.
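That operationalisation is precise enough to sketch as a check over a population time series. The following is a rough illustration only: the function name, the use of yearly snapshots keyed by calendar year, and the example figures are all my own assumptions, and the actual Metaculus resolution criteria may differ in their details.

```python
def is_global_catastrophe(population_by_year: dict[int, float],
                          max_window_years: int = 5,
                          min_drop: float = 0.10) -> bool:
    """Return True if population falls by at least `min_drop` (as a fraction)
    between any two snapshots that are at most `max_window_years` apart."""
    years = sorted(population_by_year)
    for i, start in enumerate(years):
        for end in years[i + 1:]:
            if end - start > max_window_years:
                break  # later snapshots are even further away
            start_pop = population_by_year[start]
            end_pop = population_by_year[end]
            if start_pop > 0 and (start_pop - end_pop) / start_pop >= min_drop:
                return True
    return False


# Hypothetical figures, purely for illustration.
sudden_drop = {2030: 8.0e9, 2031: 7.9e9, 2033: 7.1e9}   # ~11% fall within 3 years
slow_decline = {2030: 8.0e9, 2040: 7.0e9}               # 12.5% fall, but over 10 years

print(is_global_catastrophe(sudden_drop))    # True
print(is_global_catastrophe(slow_decline))   # False
```

One thing this makes visible is that the definition depends on speed as well as scale: the same total decline counts or doesn't count depending on whether it fits inside a 5-year window.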
Baum and Barrett (2018) gesture at some additional definitions/conceptualisations of global catastrophic risk that have apparently been used by other authors:
In general terms, a global catastrophe is generally understood to be a major harm to global human civilization. Some studies have focused on catastrophes resulting in human extinction, including early discussions of nuclear winter (Sagan 1983). Several studies posit minimum damage thresholds such as the death of 10% of the human population (Cotton-Barratt et al. 2016), the death of 25% of the human population (Atkinson 1999), or 10^4 to 10^7 deaths or $10^9 to $10^12 in damages (Bostrom and Ćirković 2008). Other studies define global catastrophe as an event that exceeds the resilience of global human civilization, resulting in its collapse (Maher and Baum 2013; Baum and Handoh 2014).
From an FLI podcast interview with two researchers from CSER:
“Ariel Conn: [...] I was hoping you could quickly go over a reminder of what an existential threat is and how that differs from a catastrophic threat and if there’s any other terminology that you think is useful for people to understand before we start looking at the extreme threats of climate change.”
Simon Beard: So, we use these various terms as kind of terms of art within the field of existential risk studies, in a sense. We know what we mean by them, but all of them, in a way, are different ways of pointing to the same kind of outcome — which is something unexpectedly, unprecedentedly bad. And, actually, once you’ve got your head around that, different groups have slightly different understandings of what the differences between these three terms are. So, for some groups, it’s all about just the scale of badness. So, an extreme risk is one that does a sort of an extreme level of harm; A catastrophic risk does more harm, a catastrophic level of harm. And an existential risk is something where either everyone dies, human extinction occurs, or you have an outcome which is an equivalent amount of harm: Maybe some people survive, but their lives are terrible. Actually, at the Center for the Study of Existential Risk, we are concerned about this classification in terms of the cost involved, but we also have coupled that with a slightly different sort of terminology, which is really about systems and the operation of the global systems that surround us.
Most of the systems — be this physiological systems, the world’s ecological system, the social, economic, technological, cultural systems that surround those institutions that we build on — they have a kind of normal space of operation where they do the things that you expect them to do.
And this is what human life, human flourishing, and human survival are built on: that we can get food from the biosphere, that our bodies will continue to operate in a way that’s consistent with and supporting our health and our continued survival, and that the institutions that we’ve developed will still work, will still deliver food to our tables, will still suppress interpersonal and international violence, and that we’ll basically, we’ll be able to get on with our lives.
If you look at it that way, then an extreme risk, or an extreme threat, is one that pushes at least one of these systems outside of its normal boundaries of operation and creates an abnormal behavior that we then have to work really hard to respond to. A catastrophic risk is one where that happens, but then that also cascades. Particularly in global catastrophe, you have a whole system that encompasses everyone all around the world, or maybe a set of systems that encompass everyone all around the world, that are all operating in this abnormal state that’s really hard for us to respond to.
And then an existential catastrophe is one where the systems have been pushed into such an abnormal state that either you can’t get them back or it’s going to be really hard. And life as we know it cannot be resumed; We’re going to have to live in a very different and very inferior world, at least from our current way of thinking.” (emphasis added)
The term ‘global catastrophic risk’ (GCR) is increasingly used in the scholarly community to refer to a category of threats that are global in scope, catastrophic in intensity, and non-zero in probability (Bostrom and Cirkovic, 2008). [...] The GCR framework is concerned with low-probability, high-consequence scenarios that threaten humankind as a whole (Avin et al., 2018; Beck, 2009; Kuhlemann, 2018; Liu, 2018)
(Personally, I don’t think I like that second sentence. I’m not sure what “threaten humankind” is meant to mean, but I’m not sure I’d count something that e.g. causes huge casualties on just one continent, or 20% casualties spread globally, as threatening humankind. Or if I did, I’d be meaning something like “threatens some humans”, in which case I’d also count risks much smaller than GCRs. So this sentence sounds to me like it’s sort-of conflating GCRs with existential risks.)
In January, I spent ~1 hour trying to brainstorm relatively concrete ideas for projects that might help improve the long-term future. I later spent another ~1 hour editing what I came up with for this shortform. This shortform includes basically everything I came up with, not just a top selection, so not all of these ideas will be great. I’m also sure that my commentary misses some important points. But I thought it was worth sharing this list anyway.
The ideas vary in the extent to which the bottleneck(s) to executing them are the right person/people, buy-in from the right existing organisation, or funding.
I’m not expecting to execute these ideas in the near-term future myself, so if you think one of these ideas sounds promising and relevant to your skills, interests, etc., please feel very free to explore the idea further, to comment here, and/or to reach out to me to discuss it!
Something along the lines of compiling a large set of potentially promising cause areas and interventions; doing rough Fermi estimates, cost-effectiveness analyses, and/or forecasts; thereby narrowing the list down; and then maybe gradually doing more extensive Fermi estimates, cost-effectiveness analyses, and/or forecasts
This is somewhat similar to things that Ozzie Gooen, Nuño Sempere, and Charity Entrepreneurship have done or are doing
Ozzie also discusses some similar ideas here
So it’d probably be worth talking to them about this
Something like a team of part-time paid forecasters, both to forecast on various important questions and to be “on-call” when it looks like a catastrophe or window of opportunity might be looming
I think I got this idea from Linch Zhang, and it might be worth talking to him about it
80,000 Hours-style career reviews on things like diplomacy, arms control, international organisations, becoming a Russia/India/etc specialist
Some discussion here
Could see if 80k would be happy to supervise someone else to do this
Could seek out EAs or EA-aligned people who are working full-time in related areas
Organisations like HIPE, CSET, and EA Russia might have useful connections
I might be open to collaborating with someone on this
Research or writing assistance for researchers (especially senior ones) at orgs like FHI, Forethought, MIRI, CHAI
This might allow them to complete additional valuable projects
This also might help the research or writing assistants build career capital and test fit for valuable roles
Maybe BERI can already provide this?
It’s possible it’s not worth being proactive about this, and instead waiting for people to decide they want an assistant and create a job ad for one. But I’d guess that some proactiveness would be useful (i.e., that there are cases where someone would benefit from such an assistant but hasn’t thought of it, or doesn’t think the overhead of a long search for one is worthwhile)
See also this comment from someone who did this sort of role for Toby Ord
Research or writing assistance for certain independent researchers?
Ops assistance for orgs like FHI?
But I think orgs like BERI and the Future of Humanity Foundation are already in this space
Additional “Research Training Programs” like summer research fellowships, “Early Career Conference Programmes”, internships, or similar
Probably best if this is at existing orgs
Could perhaps find an org that isn’t doing this yet but has researchers who would be capable of providing valuable mentorship, suggest the idea to them, and be or find someone who can handle the organisational aspects
Something like the Open Phil AI fellowship, but for another topic
In particular, something that captures the good effects a “fellowship” can have, beyond the provision of funding (since there are already some sources of funding alone, such as the Long-Term Future Fund)
A hub for longtermism-relevant research (or a narrower area, e.g. AI) outside of US and UK
Perhaps ideally a non-Anglophone country? Perhaps ideally in Asia?
Could be a new organisation or a branch/affiliate of an existing one
There’s some relevant discussion here, here, here, and I think here (though I haven’t properly read that post)
Found an organization/community similar to HIPE and/or APPGFG, but in countries other than the UK
I’d guess it’d probably be easiest in countries where there is a substantial EA presence, and perhaps easier in smaller countries like Switzerland rather than in the US
Why this might/might not be good:
I don’t know a huge amount about HIPE or APPGFG, but from my limited info on those orgs they seem valuable
I’d guess that there’s no major reason something similar to HIPE couldn’t be successfully replicated in other countries, if we could find the right person/people
In contrast, I’d guess that there might be more barriers to successfully replicating something like APPGFG
E.g., most countries probably don’t have an institution very similar to APPGs
But I imagine something broadly similar could be replicated elsewhere
Potential next steps:
Talk to people involved in HIPE and APPGFG about whether they think these things could be replicated, how valuable they think that’d be, how they’d suggest it be done, what countries they’d suggest, and who they’d suggest talking to
Talk to other EAs, especially outside of the UK, who are involved in politics, policy, and improving institutional decision-making
Ask them for their thoughts, who they’d suggest reaching out to, and (in some cases) whether they might be interested in collaborating on this
I also had some ideas for specific research or writing projects, but I’m not including them in this list
That’s partly because I might publish something more polished on that later
It’s mostly because people can check out A central directory for open research questions for a broader set of research project ideas
See also Why you (yes, you) should post on the EA Forum
Possible gaps in the EA community—EA Forum
Get Involved—EA Forum
The views I expressed here are my own, and do not necessarily reflect the views of my employers.
“Research or writing assistance for researchers (especially senior ones) at orgs like FHI, Forethought, MIRI, CHAI”
As a senior research scholar at FHI, I would find this valuable if the assistant was competent and the arrangement was low cost to me (in terms of time, effort, and money). I haven’t tried to set up anything like this since I expect finding someone competent, working out the details, and managing them would not be low cost, but I could imagine that if someone else (such as BERI) took care of details, it very well may be low cost. I support efforts to try to set something like this up, and I’d like to throw my hat into the ring of “researchers who would plausibly be interested in assistants” if anyone does set this up.
tl;dr: In The Precipice, Toby Ord argues that some disagreements about population ethics don’t substantially affect the case for prioritising existential risk reduction. I essentially agree with his conclusion, but I think one part of his argument is shaky/overstated.
This is a lightly edited version of some notes I wrote in early 2020. It’s less polished, substantive, and important than most top-level posts I write. This does not capture my full views on population ethics or The Precipice. (I really liked the book overall.)
Some of the more extreme approaches to this relatively new field of ‘population ethics’ imply that there is no reason to avoid extinction stemming from consideration of future generations—it just doesn’t matter whether these future people come into being or not.
[But] all but the most implausible of these views agree with the immense importance of saving future generations from other kinds of existential catastrophe, such as the irrevocable collapse of civilization. Since most things that threaten extinction threaten such a collapse too, there is not much practical difference.
I agree that even many views on population ethics which would say it doesn’t matter whether future people get to come into being would agree that it’s at least somewhat important to save future generations from at least some kinds of non-extinction existential catastrophe. (It’s also the case that my preferred views on population ethics very strongly support prioritising existential risk reduction.)
But I think Ord overstates things here, perhaps considerably. There are three reasons I say this.
Reason 1: The size of the stakes matters. And even in person-affecting views where avoiding irrevocable collapse matters, it matters far less than in some non-person-affecting views.
People like Ord and me believe that existential risk reduction is not just important, but rather extremely important, and thus worth prioritising despite reasonable concerns about predictability and tractability. These beliefs are substantially influenced by the future’s potential scale, duration, and quality, if we manage to avoid catastrophe (see, e.g., Ord’s note 37 in chapter 8).
Ord deliberately moves away from relatively extreme / contrarian / counterintuitive versions of that sort of argument. For example, he argues that the probability of existential catastrophe in the coming century is not minuscule, and that there are a variety of reasons to believe particular interventions could reduce the risks.
But it would seem hard to argue that it’s just as easy to predictably cause a significant reduction in existential risks as to predictably cause a substantial improvement in near-term global health and development or animal welfare. And I don’t believe Ord tries to make that argument. So the potentially extreme stakes involved in existential risks still seem like an important part of his claims.
Let’s say we accept some view on population ethics in which we don’t care about the loss of value from things like extinction or not colonising the stars, but do care about the reduced quality of life of people who would exist in an irrevocable collapse scenario. Thus, as Ord suggests, we still acknowledge that there are some future-people-related reasons to reduce existential risks (rather than just other types of reasons, such as preventing death and suffering in the present generation or fulfilling duties to the past).
But those reasons would be about something like “the difference between the total/average quality of life that those people would have given irrevocable collapse and the total/average quality of life that the same people—or the same number of people, or something like that—would’ve had if not for the irrevocable collapse”. That will entail far smaller stakes than “the difference in the total amount of value (e.g., aggregate wellbeing, or achievement, or whatever) given irrevocable collapse and the total amount of value given no existential catastrophe (so we colonise the stars, or fulfil our potential in some other way)”.
So I think that adopting that sort of view on population ethics would make a major practical difference. It wouldn’t render existential risk reduction valueless, but would substantially reduce its value, perhaps making it a lower priority than seemingly more predictable and tractable priorities such as near-term animal welfare.
Reason 2: In views which include the asymmetry principle, avoiding irrevocable collapse may not matter, as people in collapse scenarios may have net-positive lives.
In Ord’s appendix on population ethics, he notes that some people have argued for:
an asymmetry principle: that adding new lives of positive wellbeing doesn’t make an outcome better, but adding new lives with negative wellbeing does make it worse.
Views which include that sort of asymmetry principle would hold that it matters to prevent futures with large numbers of lives of negative wellbeing. Such views may thus indeed support existential risk reduction, but with a focus on dystopian futures and/or s-risks rather than extinction risk. (I think that that’s what I’d support if my views on population ethics included that sort of asymmetry principle.)
But recall that Ord focuses on collapse rather than dystopia:
all but the most implausible of these views agree with the immense importance of saving future generations from other kinds of existential catastrophe, such as the irrevocable collapse of civilization. Since most things that threaten extinction threaten such a collapse too, there is not much practical difference.
I’d guess most of the lives in an irrevocable collapse scenario would be somewhere around neutral or somewhat positive wellbeing. (It does seem plausible that they’d tend to be of negative wellbeing, but also plausible that they’d be of similar or greater wellbeing levels than we currently have.)
Maybe Ord considers views which include the asymmetry principle to be among “the most implausible” of views on population ethics. But if so, that seems fairly contestable. And if not, then these views might actually not see preventing a sizeable portion of the possible irrevocable collapse scenarios as mattering at all. That would further reduce the extent to which those views would, overall, be inclined to prioritise existential risk reduction.
One could respond by saying “But couldn’t many things that threaten extinction also threaten the sort of scenarios these views would care about preventing, such as s-risks?” I think that that’s plausible, but the matter is a lot more complicated than in the case of irrevocable collapse. Here are a couple of somewhat relevant posts:
How Would Catastrophic Risks Affect Prospects for Compromise?
The long-term significance of reducing global catastrophic risks
Reason 3: I’m very unsure whether most things which threaten extinction pose a similar risk of irrevocable collapse.
Irrevocable collapse would involve a very long period of neither going extinct nor fully recovering. But it seems plausible to me that, given a collapse, it’s extremely likely that we’d relatively quickly—e.g., within thousands of years—either go extinct or fully recover. (My views on this are fuzzy and confused. See also Bostrom, 2013, section 2.2.)
If that is the case, that would substantially reduce the harm the collapse represented from the perspective of views on population ethics which don’t care about extinction but would care about some collapse scenarios.
Ord does surround the passage quoted above with caveats, and he dedicates an appendix to the topic.
But I don’t think the caveats or appendix really address this specific point I’m making.
I’m merely critiquing this specific argument for why population ethics may not cast doubt on whether to prioritise existential risk reduction. I personally prioritise existential risk reduction, and think there are other strong arguments for doing so despite population ethics concerns.
E.g., I see something like a “total view” as very plausible, and I see greater issues with person-affecting views than with a “total view”.
E.g., certain approaches to moral uncertainty will suggest the total view should be pretty dominant if it’s at least seen as plausible (although some see this as problematic fanaticism).
Things I’ve written
Some thoughts on Toby Ord’s existential risk estimates
Database of existential risk estimates
Clarifying existential risks and existential catastrophes
Existential risks are not just about humanity
Failures in technology forecasting? A reply to Ord and Yudkowsky
What is existential security?
Why I’m less optimistic than Toby Ord about New Zealand in nuclear winter, and maybe about collapse more generally
Thoughts on Toby Ord’s policy & research recommendations
“Toby Ord seems to imply that economic stagnation is clearly an existential risk factor. But I think that we should actually be more uncertain about that”
Why I think The Precipice might understate the significance of population ethics
My Google Play review
My review of Tom Chivers’ review of Toby Ord’s The Precipice
If a typical mammalian species survives for ~1 million years, should a 200,000 year old species expect another 800,000 years, or another million years?
What would it mean for humanity to protect its potential, but use it poorly?
Arguments for and against Toby Ord’s “grand strategy for humanity”
Does protecting humanity’s potential guarantee its fulfilment?
A typology of strategies for influencing the future
Working titles of things I plan/vaguely hope to write
Note: If you might be interested in writing about similar ideas, feel very free to reach out to me. It’s very unlikely I’ll be able to write all of these posts by myself, so potentially we could collaborate, or I could just share my thoughts and notes with you and let you take it from there.
Update: It’s now very unlikely that I’ll get around to writing any of these things.
The Terrible Funnel: Estimating odds of each step on the x-risk causal path (working title)
The idea here would be to adapt something like the “Great Filter” or “Drake Equation” reasoning to estimating the probability of existential catastrophe, using how humanity has fared in prior events that passed or could’ve passed certain “steps” on certain causal chains to catastrophe.
E.g., even though we’ve never faced a pandemic involving a bioengineered pathogen, perhaps our experience with how many natural pathogens have moved from each “step” to the next one can inform what would likely happen if we did face a bioengineered pathogen, or if it did get to a pandemic level.
This idea seems sort of implicit in The Precipice, but isn’t really spelled out there. Also, as is probably obvious, I need to do more to organise my thoughts on it myself.
This may include discussion of how Ord distinguishes natural and anthropogenic risks, and why the standard arguments for an upper bound for natural extinction risks don’t apply to natural pandemics. Or that might be a separate post.
Developing—but not deploying—drastic backup plans (see my comment here)
“Macrostrategy”: Attempted definitions and related concepts
This would relate in part to Ord’s concept of “grand strategy for humanity”
Collection of notes
A post summarising the ideas of existential risk factors and existential security factors?
I suspect I won’t end up writing this, but I think someone should. For one thing, it’d be good to have something people can reference/link to that explains that idea (sort of like the role EA Concepts serves).
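To make the “Terrible Funnel” idea above slightly more concrete, here is a minimal sketch of the Drake-Equation-style calculation it gestures at. The step names and every probability below are made-up placeholders (they are not estimates from The Precipice or from anywhere else). The only point is the structure: the chance of getting all the way down the funnel is the product of the conditional probabilities of passing each step.

```python
from math import prod

# Hypothetical steps on one causal path to catastrophe. Each value is the
# assumed probability of passing that step, conditional on having passed
# all earlier steps. These numbers are placeholders, not real estimates.
funnel = [
    ("pathogen emerges", 0.5),
    ("local outbreak", 0.2),
    ("global pandemic", 0.1),
    ("existential catastrophe", 0.01),
]


def funnel_probability(steps):
    """Multiply the conditional step probabilities along the path."""
    return prod(p for _, p in steps)


def cumulative(steps):
    """Return (step name, probability of having reached that step)."""
    reached, running = [], 1.0
    for name, p in steps:
        running *= p
        reached.append((name, running))
    return reached


if __name__ == "__main__":
    for name, p in cumulative(funnel):
        print(f"{name}: {p:.6f}")
    print("overall:", funnel_probability(funnel))
```

In a real version, the early-step probabilities might be informed by historical base rates (e.g., how often natural pathogens have moved from one step to the next), while the later steps would need other kinds of evidence.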
Some selected Precipice-related works by others
80,000 Hours’ interview with Toby Ord
Slate Star Codex’s review of the book
FLI Podcast interview with Toby Ord
If anyone reading this has read anything I’ve written on the EA Forum or LessWrong, I’d really appreciate you taking this brief, anonymous survey. Your feedback is useful whether your opinion of my work is positive, mixed, lukewarm, meh, or negative.
And remember what mama always said: If you’ve got nothing nice to say, self-selecting out of the sample for that reason will just totally bias Michael’s impact survey.
(If you’re interested in more info on why I’m running this survey and some thoughts on whether other people should do similar, I give that here.)
Note: This is a slightly edited excerpt from my 2019 application to the FHI Research Scholars Program. I’m unsure how useful this idea is. But twice this week I felt it’d be slightly useful to share this idea with a particular person, so I figured I may as well make a shortform of it.
Efforts to benefit the long-term future would likely gain from better understanding what we should steer towards, not merely what we should steer away from. This could allow more targeted actions with better chances of securing highly positive futures (not just avoiding existential catastrophes). It could also help us avoid negative futures that may not appear negative when superficially considered in advance. Finally, such positive visions of the future could facilitate cooperation and mitigate potential risks from competition (see Dafoe, 2018 on “AI Ideal Governance”). Researchers have begun outlining particular possible futures, arguing for or against them, and surveying people’s preferences for them. It’d be valuable to conduct similar projects (via online surveys) that address several limitations of prior efforts.
First, these projects should provide relatively detailed portrayals of the potential futures under consideration. This could be done using summaries of scenarios richly imagined in existing sources (e.g., Tegmark’s Life 3.0, Hanson’s Age of Em) or generated during the “world-building” efforts to be conducted at the Augmented Intelligence Summit. This could address people’s apparent tendency to be repelled by descriptions of futures that simplistically maximise things they claim to intrinsically value while stripping away things they don’t. It could also allow for quantitative and qualitative feedback on these scenarios and various elements of them. People may find it easier to critique and build upon presented scenarios than to imagine ideal scenarios from scratch.
Second, these projects should include large, representative, cross-national samples. Existing research has typically included only small samples which often differ greatly from the general population. This doesn’t fully capture the three above-mentioned benefits of efforts to understand what futures we actually want.
Third, experimental manipulations could be embedded within the surveys to explore the impact of different framings, different information, and different arguments, partly to reveal how fragile people’s preferences are.
It would be useful to also similarly survey medium-term-relevant preferences (e.g., regarding institutions for managing adaptations to increasing AI capabilities; Dafoe, 2018).
One concern with this idea is that the long-term future may be so radically unfamiliar and unpredictable that any information regarding people’s present preferences for it would be irrelevant to scenarios that are actually plausible. Another concern is that present preferences may not be worth following anyway, as they may reflect intuitions that make sense in our current environment but wouldn’t in radically different future environments. They may also not be worth following if issues like framing effects and scope neglect become particularly impactful when evaluating such unfamiliar and astronomical options.
 I wrote this application when I was very new to EA and I was somewhat grasping at straws to come up with longtermism-relevant research ideas that would make use of my psychology degree.
Certificates of impact—Paul Christiano, 2014
The impact purchase—Paul Christiano and Katja Grace, ~2015 (the whole site is relevant, not just the home page)
The Case for Impact Purchase | Part 1 - Linda Linsefors, 2020
Making Impact Purchases Viable—casebash, 2020
Plan for Impact Certificate MVP—lifelonglearner, 2020
Impact Prizes as an alternative to Certificates of Impact—Ozzie Gooen, 2019
Altruistic equity allocation—Paul Christiano, 2019
Social impact bond—Wikipedia (highlighted as relevant by Toby Ord)
Health Impact Fund—Wikipedia (highlighted as relevant by Toby Ord)
I intend to add to this list over time. If you know of other relevant work, please mention it in a comment. I also may create a tag for relevant posts.
The Health Impact Fund (cited above by MichaelA) is an implementation of a broader idea outlined by Dr. Aidan Hollis here: An Efficient Reward System for Pharmaceutical Innovation. Hollis’ paper, as I understand it, proposes reforming the patent system such that innovations would be rewarded by government payouts (based on impact metrics, e.g. QALYs) rather than monopoly profit/rent. The Health Impact Fund, an NGO, is meant to work alongside patents (for now) and is intended to prove that the broader concept outlined in the paper can work. A friend and I are working on further broadening this proposal outlined by Dr. Hollis. Essentially, I believe this type of innovation incentive could be applied to other areas with easily measurable impact (e.g. energy, clean protein and agricultural innovations via a “carbon emissions saved” metric). We’d love to collaborate with anyone else interested (feel free to message me).
Questions: Is a change in the offence-defence balance part of why interstate (and intrastate?) conflict appears to have become less common? Does this have implications for the likelihood and trajectories of conflict in future (and perhaps by extension x-risks)?
Epistemic status: This post is unpolished, un-researched, and quickly written. I haven’t looked into whether existing work has already explored questions like these; if you know of any such work, please comment to point me to it.
Background/elaboration: Pinker argues in The Better Angels of Our Nature that many types of violence have declined considerably over history. I’m pretty sure he notes that these trends are neither obviously ephemeral nor inevitable. But the book, and other research pointing in similar directions, seem to me (and I believe others?) to at least weakly support the ideas that:
if we avoid an existential catastrophe, things will generally continue to get better
apart from the potential destabilising effects of technology, conflict seems to be trending downwards, somewhat reducing the risks of e.g. great power war, and by extension e.g. malicious use of AI (though of course a partial reduction in risks wouldn’t necessarily mean we should ignore the risks)
But How Does the Offense-Defense Balance Scale? (by Garfinkel and Dafoe, of the Center for the Governance of AI; summary here) says:
It is well-understood that technological progress can impact offense-defense balances. In fact, perhaps the primary motivation for developing the concept has been to understand the distinctions between different eras of military technology.
For instance, European powers’ failure to predict the grueling attrition warfare that would characterize much of the First World War is often attributed to their failure to recognize that new technologies, such as machine guns and barbed wire, had shifted the European offense-defense balance for conquest significantly toward defense.
holding force sizes fixed, the conventional wisdom holds that a conflict with mid-nineteenth century technology could be expected to produce a better outcome for the attacker than a conflict with early twentieth century technology. See, for instance, Van Evera, ‘Offense, Defense, and the Causes of War’.
The paper tries to use these sorts of ideas to explore how emerging technologies will affect trajectories, likelihood, etc. of conflict. E.g., the very first sentence is: “The offense-defense balance is a central concept for understanding the international security implications of new technologies.”
But it occurs to me that one could also do historical analysis of just how much these effects have played a role in the sort of trends Pinker notes. From memory, I don’t think Pinker discusses this possible factor in those trends. If this factor played a major role, then perhaps those trends are substantially dependent on something “we” haven’t been thinking about as much—perhaps we’ve wondered about whether the factors Pinker discusses will continue, whereas they’re less necessary and less sufficient than we thought for the overall trend (decline in violence/interstate conflict) that we really care about.
And at a guess, that might mean that that trend is more fragile or “conditional” than we might’ve thought. It might mean that we really really can’t rely on that “background trend” continuing, or at least somewhat offsetting the potentially destabilising effects of new tech—perhaps a lot of the trend, or the last century or two of it, was largely about how tech changed things, so if the way tech changes things changes, the trend could very easily reverse entirely.
I’m not at all sure about any of that, but it seems it would be important and interesting to explore. Hopefully someone already has, in which case I’d appreciate someone pointing me to that exploration.
(Also note that what the implications of a given offence-defence balance even are is apparently a somewhat complicated/debatable matter. E.g., Garfinkel and Dafoe write: “While some hold that shifts toward offense-dominance obviously favor conflict and arms racing, this position has been challenged on a number of grounds. It has even been suggested that shifts toward offense-dominance can increase stability in a number of cases.”)
Ways people trying to do good accidentally make things worse, and how to avoid them—Rob Wiblin and Howie Lempel (for 80,000 Hours), 2018
How to Avoid Accidentally Having a Negative Impact with your Project—Max Dalton and Jonas Vollmer, 2018
Sources that seem somewhat relevant
https://en.wikipedia.org/wiki/Unintended_consequences (in particular, “Unexpected drawbacks” and “Perverse results”, not “Unintended benefits”)
(See also my lists of sources related to information hazards, differential progress, and the unilateralist’s curse.)
Unilateralist’s curse [EA Concepts]
Horsepox synthesis: A case of the unilateralist’s curse? [Lewis] (usefully connects the curse to other factors)
The Unilateralist’s Curse and the Case for a Principle of Conformity [Bostrom et al.’s original paper]
Hard-to-reverse decisions destroy option value [CEA]
Framing issues with the unilateralist’s curse—Linch, 2020
Managing risk in the EA policy space [EA Forum] (touches briefly on the curse)
Ways people trying to do good accidentally make things worse, and how to avoid them [80k] (only one section on the curse)
This is adapted from this comment, and I may develop it into a proper post later. I welcome feedback on whether it’d be worth doing so, as well as feedback more generally.
Epistemic status: During my psychology undergrad, I did a decent amount of reading on topics related to the “continued influence effect” (CIE) of misinformation. My Honours thesis (adapted into this paper) also partially related to these topics. But I’m a bit rusty (my Honours was in 2017, and I haven’t reviewed the literature since then).
This is a quick attempt to summarise some insights from psychological findings on the continued influence effect of misinformation (and related areas) that (speculatively) might suggest downsides to some of EA’s epistemic norms (e.g., just honestly contributing your views/data points to the general pool and trusting people will update on them only to the appropriate degree, or clearly acknowledging counterarguments even when you believe your position is strong).
From memory, this paper reviews research on CIE, and I perceived it to be high-quality and a good intro to the topic.
From this paper’s abstract:
Information that initially is presumed to be correct, but that is later retracted or corrected, often continues to influence memory and reasoning. This occurs even if the retraction itself is well remembered. The present study investigated whether the continued influence of misinformation can be reduced by explicitly warning people at the outset that they may be misled. A specific warning—giving detailed information about the continued influence effect (CIE)--succeeded in reducing the continued reliance on outdated information but did not eliminate it. A more general warning—reminding people that facts are not always properly checked before information is disseminated—was even less effective. In an additional experiment, a specific warning was combined with the provision of a plausible alternative explanation for the retracted information. This combined manipulation further reduced the CIE but still failed to eliminate it altogether. (emphasis added)
This seems to me to suggest some value in including “epistemic status” messages up front, but that this doesn’t make it totally “safe” to make posts before having familiarised oneself with the literature and checked one’s claims.
Here are a couple of other seemingly relevant quotes from papers I read back then:
“retractions [of misinformation] are less effective if the misinformation is congruent with a person’s relevant attitudes, in which case the retractions can even backfire [i.e., increase belief in the misinformation].” (source) (see also this source)
“we randomly assigned 320 undergraduate participants to read a news article presenting either claims both for/against an autism-vaccine link [a “false balance”], link claims only, no-link claims only or non-health-related information. Participants who read the balanced article were less certain that vaccines are safe, more likely to believe experts were less certain that vaccines are safe and less likely to have their future children vaccinated. Results suggest that balancing conflicting views of the autism-vaccine controversy may lead readers to erroneously infer the state of expert knowledge regarding vaccine safety and negatively impact vaccine intentions.” (emphasis added) (source)
This seems relevant to norms around “steelmanning” and explaining reasons why one’s own view may be inaccurate. Those overall seem like very good norms to me, especially given EAs typically write about issues where there truly is far less consensus than there is around things like the autism-vaccine “controversy” or climate change. But it does seem those norms could perhaps lead to overweighting of the counterarguments when they’re actually very weak, perhaps especially when communicating to wider publics who might read and consider posts less carefully than self-identifying EAs/rationalists would. But those are all my own speculative generalisations of the findings on “falsely balanced” coverage.
Review of ‘value drift’ estimates, and several new estimates—Ben Todd, 2020
EA Survey 2018 Series: How Long Do EAs Stay in EA? - Peter Hurford, 2019
Empirical data on value drift—Joey Savoie, 2018
Concrete Ways to Reduce Risks of Value Drift and Lifestyle Drift—Darius Meissner, 2018
A Qualitative Analysis of Value Drift in EA—Marisa Jurczyk, 2020
Value Drift & How to Not Be Evil Part I & Part II—Daniel Gambacorta, 2019
Keeping everyone motivated: a case for effective careers outside of the highest impact EA organizations—FJehn, 2019
EA Survey 2018 Series: Do EA Survey Takers Keep Their GWWC Pledge? - Peter Hurford, 2019
Value drift in effective altruism—Effective Thesis, no date
Will Future Civilization Eventually Achieve Goal Preservation? - Brian Tomasik, 2017/2020
Let Values Drift—G Gordon Worley III, 2019 (note: I haven’t read this)
On Value Drift—Robin Hanson, 2018 (note: I haven’t read this)
Somewhat relevant, but less so
Value uncertainty—Michael Aird (me), 2020
An idea for getting evidence on value drift in EA—Michael Aird, 2020
Estimating the Philanthropic Discount Rate—Michael Dickens, 2020
The case for investing to give later—Sjir Hoeijmakers, 2020
I intend to add to this list over time. If you know of other relevant work, please mention it in a comment. One place to check is the relevant EA Forum tag. (As of July 2020, this list contains everything with that tag, but that might change in future.)
See also my collection of EA analyses of how social movements rise, fall, can be influential, etc.
(See also Books on authoritarianism, Russia, China, NK, democratic backsliding, etc.?)
The Precipice—Toby Ord (Chapter 5 has a section on Dystopian Scenarios)
The Totalitarian Threat—Bryan Caplan (if that link stops working, a link to a Word doc version can be found on this page) (some related discussion on the 80k podcast here; use the “find” function)
Reducing long-term risks from malevolent actors—David Althaus and Tobias Baumann, 2020
The Centre for the Governance of AI’s research agenda—Allan Dafoe (this contains discussion of “robust totalitarianism”, and related matters)
A shift in arguments for AI risk—Tom Sittler (this has a brief but valuable section on robust totalitarianism) (discussion of the overall piece here)
Existential Risk Prevention as Global Priority—Nick Bostrom (this discusses the concepts of “permanent stagnation” and “flawed realisation”, and very briefly touches on their relevance to e.g. lasting totalitarianism)
The Future of Human Evolution—Bostrom, 2009 (I think some scenarios covered there might count as dystopias, depending on definitions)
The Vulnerable World Hypothesis—Bostrom, 2019
80,000 Hours interview with Tyler Cowen, 2018
Various works of fiction, most notably Orwell’s 1984
Some sources on dictatorships/totalitarianism in general (without a focus on long-term future consequences)
Dikötter, F. (2019). How to Be a Dictator: The Cult of Personality in the Twentieth Century. Bloomsbury Publishing.
Glad, B. (2002). Why tyrants go too far: Malignant narcissism and absolute power. Political Psychology, 23(1), 1-2.*
Chang, J., & Halliday, J. (2007). Mao: The unknown story. Vintage.*
In Appendix F of The Precipice, Ord provides a list of policy and research recommendations related to existential risk (reproduced here). This post contains lightly edited versions of some quick, tentative thoughts I wrote regarding those recommendations in April 2020 (but which I didn’t post at the time).
Overall, I very much like Ord’s list, and none of his recommendations seem bad to me. So most of my commentary is on things I feel are arguably missing.
Ord’s list includes no recommendations specifically related to any of what he calls “other anthropogenic risks”, meaning:
“back contamination” from microbes from planets we explore
“our most radical scientific experiments”
(Some of his “General” recommendations would be useful for those risks, but there are no recommendations specifically targeted at those risks.)
This is despite the fact that Ord estimates a ~1 in 50 chance that “other anthropogenic risks” will cause existential catastrophe in the next 100 years. That’s ~20 times as high as his estimate for each of nuclear war and climate change (~1 in 1000), and ~200 times as high as his estimate for all “natural risks” put together (~1 in 10,000). (Note that Ord’s “natural risks” includes supervolcanic eruption, asteroid or comet impact, and stellar explosion, but does not include “‘naturally’ arising pandemics”. See here for Ord’s estimates and some commentary on them.)
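As a quick check of the ratios just mentioned, here is a minimal sketch that uses exact fractions. The probabilities are Ord’s rough estimates as quoted above; the dictionary keys and the `ratio` helper are just illustrative names.

```python
from fractions import Fraction

# Ord's rough estimates of existential catastrophe in the next 100 years,
# as quoted in the text above.
estimates = {
    "other anthropogenic risks": Fraction(1, 50),
    "nuclear war": Fraction(1, 1000),
    "climate change": Fraction(1, 1000),
    "all natural risks": Fraction(1, 10_000),
}


def ratio(a, b):
    """How many times larger estimate `a` is than estimate `b`."""
    return estimates[a] / estimates[b]


print(ratio("other anthropogenic risks", "nuclear war"))        # 20
print(ratio("other anthropogenic risks", "climate change"))     # 20
print(ratio("other anthropogenic risks", "all natural risks"))  # 200
```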
Meanwhile, Ord includes 10 recommendations specifically related to “natural risks”, 7 related to nuclear war, and 8 related to climate change. Those recommendations do all look to me like good recommendations, and like things “someone” should do. But it seems odd to me that there are that many recommendations for those risks, yet none specifically related to a category Ord seems to think poses many times more existential risk.
Perhaps it’s just far less clear to Ord what, concretely, should be done about “other anthropogenic risks”. And perhaps he wanted his list to only include relatively concrete, currently actionable recommendations. But I expect that, if we tried, we could find or generate such recommendations related to dystopian scenarios and nanotechnology (the two risks from this category I’m most concerned about).
So one thing I’d recommend is someone indeed having a go at finding or generating such recommendations! (I might have a go at that myself for dystopias, but probably not for at least 6 months.)
(See also posts tagged global dystopia, atomically precise manufacturing, or space.)
Similarly, Ord has no recommendations specifically related to what he called “‘naturally’ arising pandemics” (as opposed to “engineered pandemics”), which he estimates as posing as much existential risk over the next 100 years as all “natural risks” put together (~1 in 10,000). (Again, note that he doesn’t include “‘naturally’ arising pandemics” as a “natural risk”.)
This is despite the fact that, as noted above, he has 10 recommendations related to “natural risks”. This also seems somewhat strange to me.
That said, one of Ord’s recommendations for “Engineered Pandemics” would also help with “‘naturally’ arising pandemics”. (This is the recommendation to “Strengthen the WHO’s ability to respond to emerging pandemics through rapid disease surveillance, diagnosis and control. This involves increasing its funding and powers, as well as R&D on the requisite technologies.”) But the other five recommendations for “Engineered Pandemics” do seem fairly specific to engineered rather than “naturally” arising pandemics.
Ord recommends “Increas[ing] transparency around accidents in BSL-3 and BSL-4 laboratories.” “BSL” refers to “biosafety level”, and 4 is the highest it gets.
In Chapter 5, Ord provides some jawdropping/hilarious/horrifying tales of accidents even among labs following the BSL-4 standards (including two accidents in a row for one lab). So I’m very much on board with the recommendation to increase transparency around those accidents.
But I was a little surprised to see that Ord didn’t also call for things like:
introducing more stringent standards (to prevent rather than be transparent about accidents),
introducing more monitoring and enforcement of compliance with those standards, and/or
restricting some kinds of research as too dangerous for even labs following the highest standards
Some possible reasons why he may not have called for such things:
He may have worried there’d be too much pushback, e.g. from the bioengineering community
He may have thought those things just actually would be net-negative, even if not for pushback
He may have felt that his other recommendations would effectively accomplish similar results
But I’d guess (with low confidence) that at least something along the lines of the three “missing recommendations” mentioned above—and beyond what Ord already recommends—would probably help reduce biorisk, if done as collaboratively with the relevant communities as is practical.
One of Ord’s recommendations is to:
Develop better theoretical and practical tools for assessing risks with extremely high stakes that are either unprecedented or thought to have extremely low probability.
I think this is a great recommendation. (See also Database of existential risk estimates.) That recommendation also made me think that another strong recommendation might be something like:
Develop better approaches, incentives, and norms for communicating about risks with extremely high stakes that are either unprecedented or thought to have extremely low probability.
That sounds a bit vague, and I’m not sure exactly what form such approaches, incentives, or norms should take or how one would implement them. (Though I think that the same is true of the recommendation of Ord’s which inspired this one.)
That proposed recommendation of mine was in part inspired by the COVID-19 situation, and more specifically by the following part of an 80,000 Hours Podcast episode (which also gestures in the direction of concrete implications of my proposed recommendation):
Rob Wiblin: The alarm [about COVID-19] could have been sounded a lot sooner and we could have had five extra weeks to prepare. Five extra weeks to stockpile food. Five extra weeks to manufacture more hand sanitizer. Five extra weeks to make more ventilators. Five extra weeks to train people to use the ventilators. Five extra weeks to figure out what the policy should be if things got to where they are now.
Work was done in that time, but I think a lot less than could have been done if we had had just the forecasting ability to think a month or two ahead, and to think about probabilities and expected value. And this is another area where I think we could improve a great deal.
I suppose we probably won’t fall for this exact mistake again. Probably the next time this happens, the world will completely freak out everywhere simultaneously. But we need better ability to sound the alarm, potentially greater willingness actually on the part of experts to say, ‘I’m very concerned about this and people should start taking action, not panic, but measured action now to prepare,’ because otherwise it’ll be a different disaster next time and we’ll have sat on our hands for weeks wasting time that could have saved lives. Do you have anything to add to that?
Howie Lempel: I think one thing that we need as a society, although I don’t know how to get there, is an ability to see an expert say that they are really concerned about some risk. They think it likely won’t materialize, but it is absolutely worth putting a whole bunch of resources into preparing, and seeing that happen and then seeing the risk not materialize and not just cracking down on and shaming that expert, because that’s just going to be what happens most of the time if you want to prepare for things that don’t occur that often.
Here are Ord’s four policy and research recommendations under the heading “Unaligned Artificial Intelligence”:
Foster international collaboration on safety and risk management.
Explore options for the governance of advanced AI.
Perform technical research on aligning advanced artificial intelligence with human values.
Perform technical research on other aspects of AGI safety, such as secure containment or tripwires.
These all seem to me like excellent suggestions, and I’m glad Ord has lent additional credibility and force to such recommendations by including them in such a compelling and not-wacky-seeming book. (I think Human Compatible and The Alignment Problem were also useful in a similar way.)
But I was also slightly surprised to not see explicit mention of, for example:
Work to understand what human values actually are, how they’re structured, which aspects of them we do/should care about, etc.
E.g., much of Stuart Armstrong’s research, or some work that’s more towards the philosophical rather than technical end
“Agent foundations”/“deconfusion”/MIRI-style research
Further formalisation and critique of the various arguments and models about AI risk
Along the lines of this, this, this, or the sources listed here
But this isn’t really a criticism, because:
Perhaps the first two of the “missing recommendations” I mentioned were actually meant to be implicit in Ord’s third and fourth recommendations
Perhaps Ord has good reasons to not see these recommendations as especially worth mentioning
Perhaps Ord thought he’d be unable to concisely state such recommendations (or just the MIRI-style research one) in a way that would sound concrete and clearly actionable to policymakers
Any shortlist of a person’s top recommendations will inevitably fail to 100% please all readers
Each of the following works shows, or can be read as showing, a different model/classification scheme/taxonomy:
Defence in Depth Against Human Extinction: Prevention, Response, Resilience, and Why They All Matter—Cotton-Barratt, Daniel, and Sandberg, 2020
The same model is also discussed in Toby Ord’s The Precipice.
Cotton-Barratt also discusses this model, and rationales for building such models, on the 80,000 Hours podcast.
Classifying global catastrophic risks—Avin et al., 2018
Conflict of interest statement: I am the aforementioned human.
This might not quite “belong” in this list. But one could classify risks by which of the different “paths” they might follow (e.g., those that would vs wouldn’t “pass through” a distinct collapse stage).
Typology of human extinction risks—Alexey Turchin, ~2015
Related LessWrong post
Personally, I think the model/classification scheme in Defence in Depth is probably the most useful. But I think at least a quick skim of the above sources is useful; I think they each provide an additional useful angle or tool for thought.
Wait, exactly what are you actually collecting here?
The scope of this collection is probably best revealed by checking out the above sources.
But to further clarify, here are two things I don’t mean, which aren’t included in the scope:
Classifications into things like “AI risk vs biorisk”, or “natural vs anthropogenic”
Such categorisation schemes are clearly very important, but they’re also well-established and you probably don’t need a list of sources that show them.
Classifications into different “types of catastrophe”, such as Ord’s distinction between extinction, unrecoverable collapse, and unrecoverable dystopia
This is also very important, and maybe I should make such a collection at some point, but it’s a separate matter to this.
(Will likely be expanded as I find and remember more)
On a 2018 episode of the FLI podcast about the probability of nuclear war and the history of incidents that could’ve escalated to nuclear war, Seth Baum said:
a lot of the incidents were earlier within, say, the ’40s, ’50s, ’60s, and less within the recent decades. That gave me some hope that maybe things are moving in the right direction.
I think we could flesh out this idea as the following argument:
Premise 1. We know of fewer incidents that could’ve escalated to nuclear war from the 70s onwards than from the 40s-60s.
Premise 2. If we know of fewer such incidents from the 70s onwards than from the 40s-60s, this is evidence that there really were fewer incidents from the 70s onwards than from the 40s-60s.
Premise 3. If there were fewer such incidents from the 70s onwards than from the 40s-60s, the odds of nuclear war are lower than they were in the 40s-60s.
Conclusion. The odds of nuclear war are (probably) lower than they were in the 40s-60s.
I don’t really have much independent knowledge regarding the first premise, but I’ll take Baum’s word for it. And the third premise seems to make sense.
But I wonder about the second premise, which Baum’s statements seem to sort of take for granted (which is fair enough, as this was just one quick, verbal statement from him). In particular, I wonder whether the observation “I know about fewer recent than older incidents” is actually what we’d expect to see even if the rate hadn’t changed, just because security-relevant secrets only gradually get released or filter into the public record. If so, should we avoid updating our beliefs about the rate based on that observation?
These are genuine rather than rhetorical questions. I don’t know much about how we come to know about these sorts of incidents; if someone knows more, I’d appreciate their views on what we can make of knowing about fewer recent incidents.
This also seems relevant to some points made earlier on that podcast. In particular, Robert de Neufville said:
We don’t have incidents from China’s nuclear program, but that doesn’t mean there weren’t any, it just means it’s hard to figure out, and that scenario would be really interesting to do more research on.
(Note: This was just one of many things Baum said, and was a quick, verbal comment. He may in reality already have thought in depth about the questions I raised. And in any case, he definitely seems to think the risk of nuclear war is significant enough to warrant a lot of attention.)
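A toy model can make the question about Premise 2 concrete. The sketch below assumes a constant true rate of incidents and an exponentially distributed delay before each incident becomes public. Every number in it (the rate, the mean delay, the decades covered) is a made-up assumption for illustration, not an estimate of anything real.

```python
import math

# All parameter values below are hypothetical, chosen only for illustration.
TRUE_RATE_PER_YEAR = 1.0   # assumed constant true rate of incidents
MEAN_DELAY_YEARS = 30.0    # assumed mean delay before an incident becomes public
OBSERVED_IN = 2018         # year of the podcast episode


def fraction_disclosed(years_elapsed, mean_delay):
    """Chance an incident is public after years_elapsed, under an exponential delay."""
    if years_elapsed <= 0:
        return 0.0
    return 1.0 - math.exp(-years_elapsed / mean_delay)


def expected_known(decade_start, observed_in=OBSERVED_IN,
                   rate=TRUE_RATE_PER_YEAR, mean_delay=MEAN_DELAY_YEARS):
    """Expected number of publicly known incidents from one decade."""
    return sum(
        rate * fraction_disclosed(observed_in - year, mean_delay)
        for year in range(decade_start, decade_start + 10)
    )


# The true count is identical in every decade; only the known count varies.
known_by_decade = {d: expected_known(d) for d in range(1950, 2010, 10)}

for decade, known in known_by_decade.items():
    true_count = 10 * TRUE_RATE_PER_YEAR
    print(f"{decade}s: true incidents = {true_count:.0f}, expected known = {known:.1f}")
```

Under these assumptions, the expected number of known incidents falls for more recent decades even though the true rate never changes. So the observation "fewer known recent incidents" is compatible with both a falling rate and a constant rate plus slow disclosure. How much to update then depends on how long disclosure delays really are, which this sketch does not tell us.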
Comparisons of Capacity for Welfare and Moral Status Across Species—Jason Schukraft, 2020
Preliminary thoughts on moral weight—Luke Muehlhauser, 2018
Should Longtermists Mostly Think About Animals? - Abraham Rowe, 2020
2017 Report on Consciousness and Moral Patienthood—Luke Muehlhauser, 2017 (the idea of “moral weights” is addressed briefly in a few places)
As I’m sure you’ve noticed, this is a very small collection. I intend to add to it over time. If you know of other relevant work, please mention it in a comment.
(ETA: The following speculation appears false; see comments below.) It also appears possible this term was coined, for this particular usage, by Muehlhauser, and that in other communities other labels are used to discuss similar concepts. Please let me know if you have any information about either of those speculations of mine.
See also my collection of sources relevant to moral circles, moral boundaries, or their expansion and my collection of evidence about views on longtermism, time discounting, population ethics, etc. among non-EAs.
A few months ago I compiled a bibliography of academic publications about comparative moral status. It’s not exhaustive and I don’t plan to update it, but it might be a good place for folks to start if they’re interested in the topic.
Ah great, thanks!
Do you happen to recall if you encountered the term “moral weight” outside of EA/rationality circles? The term isn’t in the titles in the bibliography (though it may be in the full papers), and I see one that says “Moral status as a matter of degree?”, which would seem to refer to a similar idea. So this seems like it might be additional weak evidence that “moral weight” might be an idiosyncratic term in the EA/rationality community (whereas when I first saw Muehlhauser use it, I assumed he took it from the philosophical literature).
The term ‘moral weight’ is occasionally used in philosophy (David DeGrazia uses it from time to time, for instance) but not super often. There are a number of closely related but conceptually distinct issues that often get lumped together under the heading ‘moral weight’:
Capacity for welfare, which is how well or poorly a given animal’s life can go
Average realized welfare, which is how well or poorly the life of a typical member of a given species actually goes
Moral status, which is how much the welfare of a given animal matters morally
Differences in any of those three things might generate differences in how we prioritize interventions that target different species.
Rethink Priorities is going to release a report on this subject in a couple of weeks. Stay tuned for more details!
Thanks, that’s really helpful! I’d been thinking there’s an important distinction between that “capacity for welfare” idea and that “moral status” idea, so it’s handy to know the standard terms for that.
Looking forward to reading that!