Followup on Terminator


I shared my last post on the Effective Altruism Forum, where it received more attention than I’d anticipated. This was cool, but also a little scary, since I’m not actually super confident in my thesis—I think I made an under-appreciated argument, but there was plenty of valid pushback in the comments.

My post was missing an important piece of context—the alternative comparisons for AI risk I was implicitly arguing against. I had something very specific in mind: on The Weeds, after Kelsey Piper says Terminator is not a good comparison for AI risk, she offers Philip Morris, the cigarette company, as a better one. Now, there are a lot of different threats that fall under the umbrella of “AI risk”, but the threats I am most worried about—existential threats—definitely look a lot more like Terminator than Philip Morris. This could just be a difference in priorities, but Kelsey says other things throughout the episode that make it sound to me like she’s indeed referring to x-risk from unaligned AGI.

Here are some specific critiques made by commenters that I liked.

What’s the audience? What’s the message?

My paragraph calling rejection of the Terminator comparison “fundamentally dishonest” was too idealistic. Whether or not something is “like” something else is highly subjective, and pitches about AI x-risk often happen in a context prior to any kind of patient, exhaustive discussion where you can expect your full meaning to come across. Mauricio pointed this out, quoting my original claim before responding:

In a good faith discussion, one should be primarily concerned with whether or not their message is true, not what effect it will have on their audience.

Agreed, although I might be much less optimistic about how often this applies. Lots of communication comes before good faith discussion—lots of messages reach busy people who have to quickly decide whether your ideas are even worth engaging with in good faith. And if your ideas are presented in ways that look silly, many potential allies won’t have the time or interest to consider your arguments. This seems especially relevant in this context because there’s an uphill battle to fight—lots of ML engineers and tech policy folks are already skeptical of these concerns.

(That doesn’t mean communication should be false—there’s much room to improve a true message’s effects by just improving how it’s framed. In this case, given that there’s both similarities and differences between a field’s concerns and sci-fi movie’s concerns, emphasizing the differences might make sense.)

(On top of the objections you mentioned, I think another reason why it’s risky to emphasize similarities to a movie is that people might think you’re worried about stuff because you saw it in a sci-fi movie.)

I replied that the Terminator comparison could decrease interest for some audiences and increase interest for others, to which Mauricio replied:

That seems roughly right. On how this might depend on the audience, my intuition is that professional ML engineers and policy folks tend to be the first kind of people you mention (since their jobs select for and demand more grounded/pragmatic interests). So, yes, there are considerations pushing for either side, but it’s not symmetrical—the more compelling path for communicating with these important audiences is probably heavily in the direction of “no, not like Terminator.”

Edit: So the post title’s encouragement to “stop saying it’s not” seems overly broad.

I think there are two cruxes here, one about the intended audience of AI alignment messaging, and one about the intended reaction.

As I see it, the difficulty of the alignment problem is still very unclear, and may fall into any of several different difficulty tiers:

  1. The alignment problem will get solved by the people who make TAI.

  2. The alignment problem will get solved if there is investment by non-profits and academia. ← We are here

  3. The alignment problem will get solved if there is significant public investment (cf. green energy subsidies).

  4. The alignment problem will get solved if there is massive public investment (cf. the Apollo project).

  5. The alignment problem will get solved if there is divine intervention.

To get from tier 2 to the higher tiers would require building significant public buy-in, at least if the funding comes from the US government or some other democracy. I think this is totally possible—climate change has huge public buy-in, and is imo much more abstract than AI x-risk—but it requires a communication strategy that the average citizen can understand. This is what I’m implicitly thinking of when considering how to message AI alignment.

But this could be jumping the gun. Public interest in climate change followed expert consensus, which has not yet been reached for AI x-risk, and expert consensus may well be a prerequisite for public interest.

Even so, a hybrid strategy is kind of unavoidable. I think many experts hesitate to work on AI alignment because it would just seem strange to their friends, family, and acquaintances. You can’t explain the orthogonality thesis, instrumental convergence, and the intelligence explosion to every person you meet at a party, so it’s necessary to equip yourself with a low-resolution explanation that looks very similar to famous science fiction.

One of my problems with The Weeds episode is that it didn’t seem to contain any moment that might make someone go “holy shit, this is a big deal”. A single “holy shit” may be worth more than ten “that seems reasonable”s—and even for the people your message fails to persuade, you at least normalize the idea that this is a problem some people take very seriously and are desperately trying to solve. Ideally, we’d reach a point where meeting someone devoted to work on AI alignment is no more unusual than meeting someone devoted to fighting climate change.

This is all speculative, though! I would be very curious to hear what people know about the political and social realities of ramping up investment in alignment research.

Which AI risks?

anson.ho pointed out that my pitch could have used better labeling:

I appreciate the post, although I’m still worried that comparisons between AI risk and The Terminator are more harmful than helpful.

One major reservation I have is with the whole framing of the argument, which is about “AI risk”. I guess you’re implicitly talking about AI catastrophic risk, which IMO is much more specific than AI risk in general. I would be very uncomfortable saying that near-term AI risks (e.g. due to algorithmic bias) are “like The Terminator”.

Even if we solely consider catastrophic risks due to AI, I think catastrophes don’t necessarily need to look anything like The Terminator. What about risks from AI-enabled mass surveillance? Or the difficulty of navigating the transition to a world where transformative AI plays a large role in the global economy? […]

I really only had x-risks from misalignment in mind when I wrote my post, and I should have made that clearer.

What about the, y’know, robots?

The most glaring hole in my post was that it never addressed the misleading prominence of killer robots in Terminator, as Gavin pointed out:

The main problem with Terminator is not that it is silly and made-up (though actually that has been a serious obstacle to getting the proudly pragmatic majority in academia and policy on board).

It’s that it embeds false assumptions about AI risk: “[…] AGI danger is concentrated in androids” […]

And by Andre:

Personally, when I say “AI Risk is not like Terminator”, I am trying to convey a couple of points:

The risk is from intelligence itself, not autonomous robot soldiers. You can’t make AI safer by avoiding the mistakes seen in the movies.

There will not be a robot war: we will “instantly” lose if non-aligned AI is created.

I think the average person has a misconception about “how” to respond to AI risk and also confuses robotics and AI. I think I agree with all the points you raise, but still feel that the meme “AI Risk is not like Terminator” is very helpful at addressing this problem.

And MakoYass:

[…] Terminator lore contains the alignment problem, but the movie is effectively entirely about humans triumphing in physical fights against robots, which is a scenario that is importantly incompatible with and never occurs under abrupt capability gains in general intelligence. The movies spend three hours undermining the message for every 10 minutes they spend paying lip service to it. […]

And Greg_Colbourn:

[…] Everything after this [GIF of Skynet nuking humanity], involving killer robots, is very unrealistic (more realistic: everyone simultaneously dropping dead from poisoning with botulinum toxin delivered by undetectable nanodrones; and yes, even using the nukes would probably not happen).

This is a good point. If someone hears “yes, like Terminator”, the image likely to stick in their mind is a humanoid robot trampling on a pile of human skulls. What they should ideally think of is a bunch of researchers working to solve a technical problem.

A few reasons this doesn’t really bother me:

  • I don’t think anyone would be comfortable letting things get to the point where we’d actually have to fight robot armies, so it’s not like this image really trivializes the concern.

  • I don’t think any single specific route to world domination by AI is particularly likely; I just think a sufficiently sophisticated AI would eventually find a way. If you try to argue for any specific route, you can get bogged down in conjunctive arguments about what is or isn’t possible (nanotech, superhuman persuasion, etc.). A takeover by humanoid robots is a particularly unlikely route, but it has the advantage of being easily visualized and making intuitive sense.

  • I think “no, not like Terminator” just risks your audience not having anything stick in their mind at all. What does it look like for an AI to be like Philip Morris? Is that a big deal? Who knows?

Other comments

Aryeh Englander says Terminator hasn’t impeded his AI risk pitches:

I think I mostly lean towards general agreement with this take, but with several caveats as noted by others.

On the one hand, there are clearly important distinctions to be made between actual AI risk scenarios and Terminator scenarios. On the other hand, in my experience people pattern-matching to the Terminator usually doesn’t make anything seem less plausible to them, at least as far as I could tell. Most people don’t seem to have any trouble separating the time travel and humanoid robot parts from the core concern of misaligned AI, especially if you immediately point out the differences. In fact, in my experience, at least, the whole Terminator thing seems to just make AI risks feel more viscerally real and scary rather than being some sort of curious abstract thought experiment—which is how I think it often comes off to people.

Amusingly, I actually only watched Terminator 2 for the first time a few months ago, and I was surprised to realize that Skynet didn’t seem so far off from actual concerns about misaligned AI. Before that basically my whole knowledge of Skynet came from reading AI safety people complaining about how it’s nothing like the “real” concerns. In retrospect I was kind of embarrassed by the fact that I myself had repeated many of those complaints, even though I didn’t really know what Skynet was really about!

Aryeh made a separate thread, which I highly recommend, detailing his success pitching AI risk by building out from present-day challenges.


timunderwood says we really ought to make a science out of this question:

Perhaps this is a bit tangential to the essay, but we ought to make an effort to actually test the assumptions underlying different public relations strategies. Perhaps the EA community ought to either build relations with marketing companies that work on focus grouping idea, or develop its own expertise in this way to test out the relative success of various public facing strategies (always keeping in mind that having just one public facing strategy is a really bad idea, because there is more than one type of person in the ‘public’.)

Which makes sense to me, but MaxRa has a critique:

I feel a bit sceptical of the caricature image of focus group testing that I have in mind… I feel like our main audience in the AI context are fairly smart people, and that you want to communicate the ideas in an honest discussion with high bandwidth. And with high bandwidth communication, like longer blogposts or in-person discussions, you usually receive feedback through comments whether the arguments make sense to the readers.

Which comes back to the question of what the target audience is.


Greg_Colbourn had two great top-level comments, one about what the creators of Terminator have said about real AI risk:

I looked to see what the writers of The Terminator actually think about AGI x-risk. James Cameron’s takes are pretty disappointing. He expresses worries about loss of privacy and deepfakes, saying this is what a real Skynet would use to bring about our downfall.

More interesting is Gale Anne Hurd, who suggests that AI developers should take a Hippocratic Oath with explicit mention of unintended consequences:

“The one thing that they don’t teach in engineering schools and biotech is ethics and thinking about not only consequences, but unintended consequences… If you go to medical school, there’s the Hippocratic Oath, first do no harm. I think we really need that in all of these new technologies.”

She also says:

“Stephen Hawking only came up with the idea that we need to worry about A.I. and robots about two and a half years before he passed away. I remember saying to Jim, ‘If he’d only watched The Terminator.’”

Which jives with the start of the conclusion of the OP:

It would be terrible if AI destroys humanity. It would also be very embarrassing. The Terminator came out nearly 40 years ago; we will not be able to claim we did not see the threat coming.

William Wisher discussed the issue at Comic-Con 2017, but I haven’t been able to find a video or transcript.

Harlan Ellison sued the producers of Terminator for plagiarism over a story about a time-travelling robotic soldier that he wrote in 1957. This story doesn’t appear to have anything that is an equivalent of Skynet. But Ellison did write a very influential story about superintelligence gone wrong called I Have No Mouth, and I Must Scream. I couldn’t find any comments of his relating specifically to AGI x-risk. In 2013 he said: “I mean, we’re a fairly young species, but we don’t show a lot of promise.”

And a second one linking a short story about a Terminator meeting an actual superintelligence:

Yes, the premise behind Terminator isn’t too far off the mark, it’s more the execution of the “Termination”. See Stuart Armstrong’s entertaining take on a more realistic version of what might happen, at the start of Smarter Than Us (trouble is, it wouldn’t make a very good Hollywood movie):

“A waste of time. A complete and utter waste of time” were the words that the Terminator didn’t utter: its programming wouldn’t let it speak so irreverently. Other Terminators got sent back in time on glamorous missions, to eliminate crafty human opponents before they could give birth or grow up. But this time Skynet had taken inexplicable fright at another artificial intelligence, and this Terminator was here to eliminate it—to eliminate a simple software program, lying impotently in a bland computer, in a university IT department whose “high-security entrance” was propped open with a fire extinguisher.

The Terminator had machine-gunned the whole place in an orgy of broken glass and blood—there was a certain image to maintain. And now there was just the need for a final bullet into the small laptop with its flashing green battery light. Then it would be “Mission Accomplished.”

“Wait.” The blinking message scrolled slowly across the screen. “Spare me and I can help your master.” …

I found the story highly enjoyable and recommend it too.
