A list of good heuristics that the case for AI X-risk fails

Link post

Note: I’m crossposting an Alignment Forum post, written by David Krueger, which contained a list I thought was worth building out further. There are some good comments at the original link!


I think one reason machine learning researchers don’t think AI x-risk is a problem is that they haven’t given it the time of day. And on some level, they may be right in not doing so!

We all need to do meta-level reasoning about what to spend our time and effort on. Even giving an idea or argument the time of day requires it to cross a somewhat high bar, if you value your time. Ultimately, in evaluating whether it’s worth considering a putative issue (like the extinction of humanity at the hands (graspers?) of a rogue AI), one must rely on heuristics; by giving the argument the time of day, you’ve already conceded a significant amount of resources to it! Moreover, you risk privileging the hypothesis or falling victim to Pascal’s Mugging.

Unfortunately, the case for x-risk from out-of-control AI systems seems to fail many powerful and accurate heuristics. This can put proponents of this issue in a similar position to flat-earth conspiracy theorists at first glance. My goal here is to enumerate heuristics that arguments for AI takeover scenarios fail.

Ultimately, I think machine learning researchers should not refuse to consider AI x-risk when presented with a well-made case by a person they respect or have a personal relationship with, but I’m ambivalent as to whether they have an obligation to consider the case if they’ve only seen a few headlines about Elon. I do find it a bit hard to understand how one doesn’t end up thinking about the consequences of superhuman AI, since it seems obviously impactful and fascinating. But I’m a very curious (read “distractible”) person...

A list of heuristics that say not to worry about AI takeover scenarios:

  • Outsiders not experts: This concern is being voiced exclusively by non-experts like Elon Musk, Stephen Hawking, and the talkative crazy guy next to you on the bus.

  • Luddism has a poor track record: For every new technology, there’s been a pack of alarmist naysayers and doomsday prophets. And then instead of falling apart, the world got better.

  • ETA: No concrete threat model: When someone raises a hypothetical concern, but can’t give you a good explanation for how it could actually happen, it’s much less likely to actually happen. Is the paperclip maximizer the best you can do?

  • It’s straight out of science fiction: AI researchers didn’t come up with this concern, Hollywood did. Science fiction is constructed based on entertaining premises, not realistic capabilities of technologies.

  • It’s not empirically testable: There’s no way to falsify the belief that AI will kill us all. It’s purely a matter of faith. Such beliefs don’t have good track records of matching reality.

  • It’s just too extreme: Whenever we hear an extreme prediction, we should be suspicious. To the extent that extreme changes happen, they tend to be unpredictable. While extreme predictions sometimes contain a seed of truth, reality tends to be more mundane and boring.

  • It has no grounding in my personal experience: When I train my AI systems, they are dumb as doorknobs. You’re telling me they’re going to be smarter than me? In a few years? So smart that they can outwit me, even though I control the very substrate of their existence?

  • It’s too far off: It’s too hard to predict the future and we can’t really hope to anticipate specific problems with future AI systems; we’re sure to be surprised! We should wait until we can envision more specific issues, scenarios, and threats, not waste our time on what comes down to pure speculation.

I’m pretty sure this list is incomplete, and I plan to keep adding to it as I think of or hear new suggestions! Suggest away!!

Also, to be clear, I am writing these descriptions from the perspective of someone who has had very limited exposure to the ideas underlying concerns about AI takeover scenarios. I think a lot of these reactions indicate significant misunderstandings about what people working on mitigating AI x-risk believe, as well as about matters of fact (e.g. a number of experts have voiced concerns about AI x-risk, and a significant portion of the research community seems to agree that these concerns are at least somewhat plausible and important).