Hey Scott—thanks for writing this, and sorry for being so slow to the party on this one!
I think you’ve raised an important question, and it’s certainly something that keeps me up at night. That said, I want to push back on the thrust of the post. Here are some responses and comments! :)
The main view I’m putting forward in this comment is “we should promote a diversity of memes that we believe, see which ones catch on, and mould the ones that are catching on so that they are vibrant and compelling (in ways we endorse).” These memes include both “existential risk” and “longtermism”.
What is longtermism?
The quote of mine you give above comes from Spring 2020. Since then, I’ve distinguished between longtermism and strong longtermism.
My current preferred slogan definitions of each:
Longtermism is the view that we should do much more to protect the interests of future generations. (Alt: that protecting the interests of future generations should be a key moral priority of our time.)
Strong longtermism is the view that protecting the interests of future generations should be the key moral priority of our time. (That’s similar to the quote of mine you give.)
In WWOTF, I promote the weaker claim. In recent podcasts, I’ve described it something like the following (depending on how flowery I’m feeling at the time):
Longtermism is about taking seriously just how much is at stake when we look to humanity’s future. It’s about trying to figure out which challenges we face in our lifetime could be pivotal for our long-run trajectory. And it’s about ensuring that we act responsibly and carefully to navigate those challenges, steering that trajectory in a better direction, making the world better not just in the present, but also for our grandchildren, and for their grandchildren in turn.
I prefer to promote longtermism rather than strong longtermism. It’s a weaker claim, so I have a higher credence in it and feel much more robustly confident in it; at the same time, promoting it captures almost all the value, because in the actual world, on the current margin, weak longtermism recommends the same actions as strong longtermism most of the time.
Is existential risk a more compelling intro meme than longtermism?
My main take is: What meme is good for which people is highly dependent on the person and the context (e.g., the best framing to use in a back-and-forth conversation may be different from one in a viral tweet). This favours diversity: having a toolkit of memes that we can use depending on what’s best in context.
I think it’s very hard to reason about which memes to promote, and easy to get it wrong from the armchair, for a bunch of reasons:
It’s inherently unpredictable which memes do well.
It’s incredibly context-dependent. To figure this out, the main thing is just gathering lots of (qualitative and quantitative) data from the demographic you’re interacting with. The memes that resonate most with Ezra Klein podcast listeners are very different from those that resonate most with Tyler Cowen podcast listeners, even though their listeners are very similar people compared to the wider world. And even with respect to one idea, subtly different framings can have radically different audience reactions. (cf. “We care about future generations” vs “We care about the unborn.”)
People vary a lot. Even within very similar demographics, some people can love one message while other people hate it.
“Curse of knowledge”—when you’re really deep down the rabbit hole in a set of ideas, it’s really hard to imagine what it’s like being first exposed to those ideas.
Then, at least when we’re comparing (weak) longtermism with existential risk, it’s not obvious which resonates better in general. (If anything, it seems to me that (weak) longtermism does better.) A few reasons:
First, message testing from Rethink suggests that longtermism and existential risk have similarly-good reactions from the educated general public, and AI risk doesn’t do great. The three best-performing messages they tested were:
“The current pandemic has shown that unforeseen events can have a devastating effect. It is imperative that we prepare both for pandemics and other risks which could threaten humanity’s long-term future.”
“In any year, the risk from any given threat might be small—but the odds that your children or grandchildren will face one of them is uncomfortably high.”
“It is important to ensure a good future not only for our children’s children, but also the children of their children.”
So people actually quite like messages about unspecified, and not necessarily high-probability, threats to the (albeit nearer-term) future. (The second message above trades on simple compounding arithmetic; a quick sketch follows below.)
As terms to describe risk, “global catastrophic risk” and “long-term risk” did the best, coming out a fair amount better than “existential risk”.
They didn’t test a message about AI risk specifically. The closest thing was a question about how much the government should prepare for different risks (pandemics, nuclear, etc.), and AI came out worst of the roughly ten options (but I don’t think that tells us very much).
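To make the second message concrete: here is a minimal sketch of the compounding arithmetic it relies on. The 0.5% annual risk and 80-year horizon are hypothetical figures chosen purely for illustration; the tested message didn’t specify numbers.

```python
# Hypothetical figures, for illustration only: a 0.5% chance per year
# that a given threat materialises, over an 80-year horizon.
annual_risk = 0.005
years = 80

# Chance the threat occurs at least once: 1 - (1 - p)^n.
lifetime_risk = 1 - (1 - annual_risk) ** years
print(f"Lifetime risk: {lifetime_risk:.0%}")  # ~33%
```

A risk that looks negligible in any one year becomes roughly a one-in-three chance over a lifetime, which is exactly the intuition that message trades on.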
Second, most media reception of WWOTF has been pretty positive so far. This is based mainly on early reviews (esp. trade reviews), podcast and journalistic interviews, and the recent profiles (although the New Yorker profile was mixed). Though there definitely has been some pushback (especially on Twitter), I think it’s overall been dwarfed by positive articles. And the pushback I have gotten is on the Elon endorsement, the association between EA and billionaires, and on standard objections to utilitarianism — less so on the idea of longtermism itself.
Third, anecdotally at least, a lot of people just hate the idea of AI risk (cf. Twitter), thinking of it as a tech-bro issue or doomsday cultism. This has been coming up in the Twitter response to WWOTF, too, even though existential risk from AI takeover is only a small part of the book. And this is important, because I’d think that the median view among people working on x-risk (including me) is that the large majority of the risk comes from AI rather than bio or other sources. So “holy shit, x-risk” is mainly “holy shit, AI risk”.
Do neartermists and longtermists agree on what’s best to do?
Here I want to say: maybe. (I personally don’t think so, but YMMV.) But even if you do believe that, I think that’s a very fragile state of affairs, which could easily change as more money and attention flows into x-risk work, or if our evidence changes, and I don’t want to place a lot of weight on it. (I do strongly believe that global catastrophic risk is enormously important even in the near term, and a sane world would be doing far, far better on it, even if everyone only cared about the next 20 years.)
More generally, I get nervous about any plan that isn’t about promoting what we fundamentally believe or care about (or a weaker version of what we fundamentally believe or care about, which is “on track” to the things we do fundamentally believe or care about).
What I mean by “promoting what we fundamentally believe or care about”:
Promoting goals rather than means. This means that (i) if the environment changes (e.g. some new transformative tech comes along, or the political environment changes dramatically, like war breaks out) or (ii) if our knowledge changes (e.g. about the time until transformative AIs, or about what actions to take), then we’ll take different means to pursue our goals. I think this is particularly important for something like AI, but also true more generally.
Promoting the ideas that you believe most robustly—i.e. that you think you are least likely to change in the coming 10 years. Ideally these things aren’t highly conjunctive or relying on speculative premises. This makes it less likely that you will realise that you’ve been wasting your time or done active harm by promoting wrong ideas in ten years’ time. (Of course, this will vary from person to person. I think that (weak) longtermism is really robustly true and neglected, and I feel bullish about promoting it. For others, the thing that might feel really robustly true is “TAI is a BFD and we’re not thinking about it enough”—I suspect that many people feel they more robustly believe this than longtermism.)
Examples of people promoting means rather than goals, and this going wrong:
“Eat less meat because it’s good for your health” → people (potentially) eat less beef and more chicken.
“Stop nuclear power” (in the 70s) → environmentalists hate nuclear power, even though it’s one of the best bits of clean tech we have.
Examples of how this could go wrong by promoting “holy shit x-risk”:
We miss out on non-x-risk ways of promoting a good long-run future:
E.g. the risk that we solve the alignment problem but AI is used to lock in highly suboptimal values. (Personally, I think a large % of future expected value is lost in this way.)
We highlight the importance of AI to people who are not longtermist. They realise how transformatively good it could be for them and for the present generation (a digital immortality of bliss!) if AI is aligned, and they think the risk of misalignment is small compared to the benefits. They become AI-accelerationists (a common view among Silicon Valley types).
AI progress slows considerably in the next 10 years, and actually near-term x-risk doesn’t seem so high. Rather than doing whatever the next-best longtermist thing is, the people who came in via “holy shit, x-risk” just do whatever instead, and the people who promoted the “holy shit, x-risk” meme get a bad reputation.
So, overall my take is:
“Existential risk” and “longtermism” are both important ideas that deserve greater recognition in the world.
My inclination is to prefer promoting “longtermism” because that’s closer to what I fundamentally believe (in the sense I explain above), and it’s nonobvious to me which plays better PR-wise, and it’s probably highly context-dependent.
Let’s try promoting them both, and see how they each catch on.
Thanks for writing this! That overall seems pretty reasonable, and from a marketing perspective I am much more excited about promoting “weak” longtermism than strong longtermism.
A few points of pushback:
I think that to work on AI Risk, you need to buy into AI Risk arguments. I’m unconvinced that buying longtermism first really shifts the difficulty of figuring this point out. And I think that if you buy AI Risk, longtermism isn’t really that cruxy. So if our goal is to get people working on AI Risk, marketing longtermism first is strictly harder, even if longtermism on its own may be an easier sell (see the one-line formalisation after these points).
I think that very few people say “I buy the standard AI X-Risk arguments and that this is a pressing thing, but I don’t care about future people so I’m going to rationally work on a more pressing problem”—if someone genuinely goes through that reasoning then more power to them!
I also expect that people have done much more message testing + refinement on longtermism than AI Risk, and that good framings could do much better—I basically buy the claim that it’s a harder sell, though.
Caveat: This reasoning applies more to “can we get people working on AI X-Risk with their careers” than to things like broad societal value shifting.
Caveat: Plausibly there’s enough social proof that people who care about longtermism start hanging out with EAs and are exposed to a lot of AI Safety memes and get there eventually? And it’s a good gateway thing?
I want AI Risk to be a broad tent where people who don’t buy longtermism feel welcome. I’m concerned about a mood-affiliation problem where people who don’t buy longtermism, but hear it phrased as an abstract philosophical problem that requires you to care about the 10^30 future people, won’t want to work on it, even though they buy the object level. This kind of thing shouldn’t hinge on your conclusions about contentious questions in moral philosophy!
More speculatively: It’s much less clear to me that pushing on things like general awareness of longtermism or long-term value change matters in a world with <20-year AI timelines. I expect the world to get super weird after that, where more diffuse forms of longtermism don’t matter much. Are you arguing that this kind of value change over the next 20 years makes it more likely that the correct values are loaded into the AGI, and that’s how it affects the future?
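A minimal formalisation of the “strictly harder” point in the first bullet above, treating buy-in probabilistically (the event names are mine, purely illustrative): getting someone to work on AI Risk via longtermism requires them to accept a conjunction, and a conjunction can never be more probable than either conjunct:

$$P(\text{accepts longtermism} \land \text{accepts AI risk}) \le P(\text{accepts AI risk}),$$

with equality only if everyone who accepts AI risk also accepts longtermism.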
On this particular point:
message testing from Rethink suggests that longtermism and existential risk have similarly-good reactions from the educated general public
I can’t find info on Rethink’s site; is there anything you can link to?
Of the three best-performing messages you’ve listed, I think the first two emphasise risk much more heavily than longtermism. The third does sound more longtermist, but I still suspect the risk-ish phrase ‘ensure a good future’ is a large part of what resonates.
All that said, more info on the tests they ran would obviously update my position.
So people actually quite like messages about unspecified, and not necessarily high-probability, threats to the (albeit nearer-term) future.
This seems correct to me, and I would be excited to see more of them. However, I wouldn’t interpret this as meaning ‘longtermism and existential risk have similarly-good reactions from the educated general public’; I would read this as risk messaging performing better.
Also, messages about ‘unspecified, and not necessarily high-probability’ threats are not how I would characterize most of the EA-related press I’ve seen recently (NYTimes, BBC, Time, Vox).
(More generally, I mostly see journalists trying to convince their readers that an issue is important by emphasising the negative. Questioning existing practices is important: they might be ineffective; they might be unsuited to EA aims (e.g. manipulative, insufficiently truth-seeking, geared to persuade as many people as possible, which isn’t EA’s objective). But I think the amount of buy-in this strategy has in high-stakes, high-interest situations (e.g. US presidential elections) is enough that it would be valuable to be clear on when EA deviates from it and why.)
tl;dr: I suspect risk-ish messaging works better. Journalists seem to have a strong preference for it. Most of the EA messaging I’ve seen recently departs from this. I think it would be great to be very clear on why. I’m aware I’m missing a lot of data; it would be great to see the data from Rethink that you referenced. Thanks!
Thanks for explaining; really interesting, and I’m glad so much careful thinking is going into communication issues!
FWIW I find the “meme” framing you use here off-putting. The framing feels kind of uncooperative, as if we’re trying to trick people into believing something, rather than making arguments to convince people who want to understand the merits of an idea. I associate memes with ideas that are selected for being easy and fun to spread, that likely affirm our biases, and that spread mostly without the constraint of whether they are convincing upon reflection, true, or helpful for the brain that gets “infected” by the meme.
Some support for this interpretation from the Wikipedia introduction:
Proponents theorize that memes are a viral phenomenon that may evolve by natural selection in a manner analogous to that of biological evolution.[8] Memes do this through the processes of variation, mutation, competition, and inheritance, each of which influences a meme’s reproductive success. Memes spread through the behavior that they generate in their hosts. Memes that propagate less prolifically may become extinct, while others may survive, spread, and (for better or for worse) mutate. Memes that replicate most effectively enjoy more success, and some may replicate effectively even when they prove to be detrimental to the welfare of their hosts.[9]