I’m sympathetic to AI risk being a top cause area, but I find it hard to fully grasp the urgency because I struggle to picture what an AI catastrophe could actually look like.
From conversations I’ve had with colleagues, friends, and family (including other EAs), I think the lack of vivid, concrete examples is a major barrier to many people caring more about this. I’d love to hear examples of how an AI-related catastrophe might unfold (~500 words each). Each example should describe a powerful but somewhat limited AI, so that it feels realistic to an audience already skeptical of AGI. Specifically:
Nominal use case: What is the AI supposed to be used for? Why did people judge the benefits of releasing it to outweigh the risks?
Misalignment: What was the misalignment, and what harmful behavior resulted?
Logistics: How exactly did the AI interact with the physical world to harm or kill people? Why were humans unable to stop it once they realized what was happening?
Difficulty foreseeing: What factors could’ve made it difficult for even thoughtful designers to anticipate the failure?
These examples should feel vivid enough that someone could easily share them with a less technical, less EA-aligned audience. Ideally, a journalist could describe them in a mainstream outlet like The New York Times, Vox (@Kelsey Piper, you’re an awesome writer!), or Fox News.
Maybe examples like these already exist and I’ve just missed them. If so, please point me in the right direction! If not, I think developing and sharing such scenarios could make the risks feel much more tangible and help motivate more people to care about the issue.
For what it’s worth, I think the following are great resources, but don’t provide sufficiently succinct, tangible, or compelling examples of AI-related catastrophes for a general audience: Future of Life, 2024; Center for AI Safety, 2023; 80,000 Hours, 2022; Vox, 2020.
This is the wrong thing to try to figure out; most of the probability of existential catastrophe likely comes from scenarios that don’t make for a clear or intelligible story. Quoting Nick Bostrom: