The Basic Argument for AI Safety

High-stakes uncertainty warrants caution and research
When I see confident dismissals of AI risk from other philosophers, it’s usually not clear whether our disagreement is ultimately empirical or decision-theoretic in nature. (Are they confident that there’s no non-negligible risk here, or do they think we should ignore the risk even though it’s non-negligible?) Either option seems pretty unreasonable to me, for the general reasons I previously outlined in X-Risk Agnosticism. But let me now take a stab at spelling out an ultra-minimal argument for worrying about AI safety in particular:
1. It’s just a matter of time until humanity develops artificial superintelligence (ASI). There’s no in-principle barrier to such technology, nor should we by default expect sociopolitical barriers to automatically prevent the innovation.
   - Indeed, we can’t even be confident that it’s more than a decade away.
   - Reasonable uncertainty should allow at least a 1% chance that it occurs within 5 years (let alone 10).
2. The stakes surrounding ASI are extremely high, to the point that we can’t be confident that humanity would long survive this development.
   - Even on tamer timelines (with no “acute jumps in capabilities”), gradual disempowerment of humanity is a highly credible concern.
3. We should not neglect credible near-term risks of human disempowerment or even extinction. Such risks warrant urgent further investigation and investment in precautionary measures.
   - If there’s even a 1% chance that, within a decade, we’ll develop technology that we can’t be confident humanity would survive—that easily qualifies as a “credible near-term risk” for purposes of applying this principle.
4. Conclusion: AI risk warrants urgent further investigation and precautionary measures.[1]
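To make the decision-theoretic step in premise 3 explicit, one can cast it as a standard expected-value comparison. The sketch below is purely illustrative: the 1% figure is the floor from premise 1, while the loss L and precaution cost C are unspecified placeholders rather than estimates from the post.

$$
0.01 \times L \;>\; C \quad\text{whenever}\quad L > 100\,C,
$$

where $L$ is the value lost if the catastrophic outcome occurs and $C$ is the cost of further investigation and precautionary measures. On any view where the loss from extinction or permanent disempowerment exceeds a hundred times the cost of precaution, the precautionary conclusion goes through even at the 1% floor.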
[Image: “Sufficient probability density in the danger zone?”]
My question for those who disagree with the conclusion: which premise(s) do you reject?
See also:
- Helen Toner’s “Long” timelines to advanced AI have gotten crazy short
- Kelsey Piper’s If someone builds it, will everyone die?, and
- Vox’s How to kill a rogue AI—tagline: “none of the options are very appealing”.
[1] Of course, there’s a lot of room for disagreement about what precise form this response should take. But resolving that requires further discussion. For now, I’m just focused on addressing those who claim not to view AI safety as worth discussing at all.