A 5% probability of disaster isn’t any more or less confident/extreme/radical than a 95% probability of disaster; in both cases you’re sticking your neck out to make a very confident prediction.
“X happens” and “X doesn’t happen” are not symmetrical once I know that X is a specific event. Most things at the level of specificity of “humans build an AI that outmaneuvers humans to permanently disempower them” just don’t happen.
The reason we are even entertaining this scenario is a particular argument that makes it seem very plausible. If that's all you've got, if there is no source of evidence other than the argument itself, then you've just got to start talking about the probability that the argument is right.
And the argument actually is a brittle and conjunctive thing. (Humans do need to be able to build such an AI by the relevant date; they do need to decide to do so; the AI they build does need to decide to disempower humans, notwithstanding a prima facie incentive for humans to avoid that outcome.)
That doesn't mean this decomposition is the argument, or that the argument has to be brittle in this way; there might be a different argument that explains in one stroke why several of these things will happen. In that case, it's going to be more productive to talk about that argument instead.
(For example, in the context of the multi-stage argument undershooting success probabilities, the common cause is that people will be competently trying to achieve X, and most of the uncertainty is in estimating how hard and how effectively people are trying, which is correlated across steps. So you would do better to go for the throat and reason about the common cause behind each success, and you will always lose if you don't see that structure.)
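To make the common-cause point concrete, here is a minimal sketch with made-up illustrative numbers (none of them are estimates from this discussion): five steps that each look 80% likely in isolation multiply out to roughly 33%, but if a single latent factor, such as whether people are competently trying, drives all five, the conjunction comes out much higher than the naive product.

```python
# Toy numbers, purely illustrative: compare the naive "multiply each step"
# estimate against a model where one common cause drives every step.

STEPS = 5
P_CAUSE = 0.75              # P(the common cause holds, e.g. people are competently trying)
P_STEP_GIVEN_CAUSE = 0.95   # per-step success probability if the common cause holds
P_STEP_GIVEN_NOT = 0.35     # per-step success probability if it does not

# Marginal probability of any single step (the same in both models): 0.80
p_step = P_CAUSE * P_STEP_GIVEN_CAUSE + (1 - P_CAUSE) * P_STEP_GIVEN_NOT

# Naive estimate: treat the steps as independent and multiply the marginals.
naive = p_step ** STEPS                                      # ~0.33

# Common-cause estimate: condition on the cause, then multiply within each branch.
correlated = (P_CAUSE * P_STEP_GIVEN_CAUSE ** STEPS
              + (1 - P_CAUSE) * P_STEP_GIVEN_NOT ** STEPS)   # ~0.58

print(f"per-step marginal:      {p_step:.2f}")
print(f"naive conjunction:      {naive:.2f}")
print(f"correlated conjunction: {correlated:.2f}")
```

The per-step marginal is 0.80 in both models; the only difference is whether the steps are treated as independent or as sharing a common cause, which is exactly the structure the naive multiplication misses.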
And of course some of those steps may really just be quite likely, and one shouldn't be deterred from putting high probabilities on highly probable things. E.g. it does seem like people have a very strong incentive to build powerful AI systems (and moreover the extrapolation suggesting that we will be able to build powerful AI systems is grounded in the systems we actually observe in practice, and already goes much of the way toward suggesting that we will do so). Though I do think the median MIRI staff-member's view is overconfident on many of these points.