Thanks for reading! I admire that you take the time to respond to critiques even by random internet strangers. Thank you for all your hard work in promoting effective altruist ideas.
skluug
AI Risk is like Terminator; Stop Saying it’s Not
Followup on Terminator
I would like to thank N.N., Voxette, tyuuyookoobung & TCP for reviewing drafts of this post.
I rewatched Terminator 1 & 2 to write this post. One thing I liked but couldn’t fit in: Terminator 2 contains an example of the value specification problem! Young John Connor makes the good Terminator swear not to kill people; the Terminator immediately goes on to merely maim severely, instead.
Clickhole is in fact no longer owned by The Onion! It was bought by the Cards Against Humanity team in early 2020. (link)
I also consider their famous article Heartbreaking: The Worst Person You Know Just Made A Great Point an enormous contribution to the epistemic habits of the internet.
I think the first point is subtly wrong in an important way.
EAGs are not only useful insofar as they let community members do better work in the real world. EAGs are useful insofar as they result in a better world coming to be.
One way in which EAGs might make the world better is by fostering a sense of community, validation, and inclusion among those who have committed themselves to EA, thus motivating people to so commit themselves and to maintain such commitments. This function doesn’t bear on “letting” people do better work per se.
Insofar as this goal is an important component of EAG’s impact, it should be prioritized alongside more direct effects of the conference. EAG obviously exists to make the world a better place, but serving the EA community and making EAs happy is an important way in which EAG accomplishes this goal.
hi, i’m skluug! i’ve been consuming EA-sphere content for a long time and have some friends who are heavily involved, but i so far haven’t had much formal engagement myself. i graduated college this year and have given a few thousand dollars to the AMF (i signed up for the GWWC Pledge back in college and enjoy finally making good on it!). i’m interested in upping my engagement with the community and hopefully working towards a career with direct impact per 80k recommendations (i’m a religious 80k podcast listener).
Yeah, you’re right actually, that paragraph is a little too idealistic.
As a practical measure, I think it cuts both ways. Some people will hear “yes, like Terminator” and roll their eyes. Some people will hear “no, not like Terminator”, get bored, and tune out. Embracing the comparison is helpful, in part, because it lets you quickly establish the stakes. The best path is probably somewhere in the middle, and dependent on the audience and context.
Overall I think it’s just about finding that balance.
I don’t think this is a good characterization of e.g. Kelsey’s preference for her Philip Morris analogy over the Terminator analogy—does rogue Philip Morris sound like a far harder problem to solve than rogue Skynet? Not to me, which is why it seems to me much more motivated by not wanting to sound science-fiction-y. Same as Dylan’s piece; it doesn’t seem to be saying “AI risk is a much harder problem than implied by the Terminator films”, except insofar as it misrepresents the Terminator films as involving evil humans intentionally making evil AI.
It seems to me like the proper explanatory path is “Like Terminator?” → “Basically” → “So why not just not give AI nuclear launch codes?” → “There are a lot of other ways AI could take over”.
“Like Terminator?” → “No, like Philip Morris” seems liable to confuse the audience about the very basic details of the issue, because Philip Morris didn’t take over the world.
I think “Windfall” fits the bill as a positive surprise and has the benefit of being an existing word (I’m probably not going to bother setting up an ETH wallet to submit it).
I like this! UI suggestion: instead of “The first option is 5x as valuable as the second option”, I would insert the sentence between them in the middle: ”...is 5x as valuable as...”. Or if you’re willing to mess up marginal/total utility, you could format it as “One [X] is worth as much as five [Y]”, which I think would help it be more concrete to most people.
I’m of a split mind on this. On the one hand, I definitely think this is a better way to think about what will determine AI values than “the team of humans that succeeds in building the first AGI”.
But I also think the development of powerful AI is likely to radically reallocate power, potentially towards AI developers. States derive their power from a monopoly on force, and I think there is likely to be a period before the obsolescence of human labor in which these monopolies are upset by whoever is able to most effectively develop and deploy AI capabilities. It’s not clear who this will be, but it hardly seems guaranteed to be existing state powers or property holders, and AI developers have an obvious expertise and first-mover advantage.
I think this is a great post.
One reason I think it would be cool to see EA become more politically active is that political organizing is a great example of a low-commitment way for lots of people to enact change together. It kind of feels ridiculous that if there is an unsolved problem with the world, the only way I can personally contribute is to completely change careers to work on solving it full time, while most people are still barely aware it exists.
I think the mechanism of “try to build broad consensus that a problem needs to get solved, then delegate collective resources towards solving it” is underrated in EA at current margins. It probably wasn’t underrated before EA had billionaire-level funding, but as EA comes to have about as much money as you can get from small numbers of private actors, and it starts to enter the mainstream, I think it’s worth taking the prospect of mass mobilization more seriously.
This doesn’t even necessarily have to look like getting a policy agenda enacted. I think of climate change as a problem that is being addressed by mass mobilization, but in the US, this mass mobilization has mostly not come in the form of government policy (at least not national policy). It’s come from widespread understanding that it’s a problem that needs to get solved, and is worth devoting resources to, leading to lots of investment in green technology.
It can seem strange to focus on the wellbeing of future people who don’t even exist yet, when there is plenty of suffering that could be alleviated today. Shouldn’t we aid the people who need help now and let future generations worry about themselves?
We can see the problems with near-sighted moral concern if we imagine that past generations had felt similarly. If prior generations hadn’t cared for the future of their world, we might today find ourselves without many of the innovations we take for granted, suffering from far worse degradation of the environment, or even devastated by nuclear war. If we always prioritize the present, we risk falling into a trap of recurring moral procrastination, where each successive generation struggles against problems that could have been addressed much more effectively by the generations before.
This is not to say there are no practical reasons why it might be better to help people today. We know much more about what today’s problems are, and the future may have much better technology that makes fixing its own problems much easier. But acknowledging these practical considerations needn’t lead us to believe that helping future people is inherently less worthwhile than helping the people of the present. Just as impartial moral concern leads us to equally weigh the lives of individuals regardless of race or nationality, so too should we place everyone on equal footing regardless of when they exist in time.
“Perhaps the best window into what those working on AI really believe [about existential risks from AI] comes from the 2016 survey of leading AI researchers. As well as asking if and when AGI might be developed, it asked about the risks: 70 percent of the researchers agreed with Stuart Russell’s broad argument about why advanced AI might pose a risk; 48 percent thought society should prioritize AI safety research more (only 12 percent thought less). And half the respondents estimated that the probability of the longterm impact of AGI being “extremely bad (e.g., human extinction)” was at least 5 percent. I find this last point particularly remarkable—in how many other fields would the typical leading researcher think there is a one in twenty chance the field’s ultimate goal would be extremely bad for humanity?”
Toby Ord, The Precipice
Thanks for reading—you’re definitely right, my claim about the representativeness of Yudkowsky & Christiano’s views was wrong. I had only a narrow segment of the field in mind when I wrote this post. Thank you for conducting this very informative survey.
Interesting post! I think analogies are good for public communication but not for understanding things at a deep level. They’re a good way to quickly template something you haven’t thought about at all onto something you are familiar with. I think effective mass communication is quite important and we shouldn’t let the perfect be the enemy of the good.
I wouldn’t consider my Terminator comparison an analogy in the sense of the other items on this list. Most of the other items have the character of “why might AI go rogue?” and then they describe something other than AI that is hard to understand or goes rogue in some sense and assert that AI is like that. But Terminator is just literally about an AI going rogue. It’s not so much an analogy as a literal portrayal of the concern. My point wasn’t so much that you should proactively tell people that AI risk is like Terminator, but that people are just going to notice this on their own (because it’s incredibly obvious), and contradicting them makes no sense.
I feel like this is a pretty insignificant objection, because it implies someone might go around thinking, “don’t worry, AI Risk is just like Terminator! all we’ll have to do is bring humanity back from the brink of extinction, fighting amongst the rubble of civilization after a nuclear holocaust”. Surely if people think the threat is only as bad as Terminator, that’s plenty to get them to care.
Wow, this is a really interesting point that I was not aware of.
I think these have more to do with how some people remember Terminator than with Terminator itself:
As I stated in this post, the AI in Terminator is not malevolent; it attacks humanity out of self-preservation.
Whether the AIs are conscious is not explored in the movies, although we do get shots from the Terminator’s perspective, and Skynet is described as “self-aware”. Most people have a pretty loose understanding of what “consciousness” means anyway, one not far off from “general intelligence”.
Cyberdyne Systems is not portrayed as greedy, at least in the first two films. As soon as the head of research is told about the future consequences of his actions in Terminator 2, he teams up with the heroes to destroy the whole project. No one else at the company tries to stop them or is even a character, apart from some unlucky security guards.
The android objection has the most legs. But the film does state that most humans were not killed by robots, but by the nuclear war initiated by Skynet. If Terminator comparisons are embraced, it should be emphasized that an AI could find many different routes to world domination.
I would also contend that 2 & 3 don’t count as thought terminating. AGI very well could be conscious, and in real life, corporations are greedy.