At any rate, merely uncertain catastrophic risks do not have rerun risk, while chancy ones do.
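To make the distinction concrete (a minimal sketch with stand-in symbols, not from the original thread): suppose a catastrophe is objectively chancy with per-period chance p, versus merely uncertain, where we have credence q that it is essentially inevitable each period and 1 − q that it is essentially impossible. Over N periods the survival probabilities come apart:

```latex
% Objectively chancy: the hazard compounds with every "rerun" of the period
P(\text{survive } N \mid \text{chance } p) = (1 - p)^N \to 0 \quad \text{as } N \to \infty

% Merely uncertain: credence q that the chance is ~1, credence 1 - q that it is ~0
P(\text{survive } N) \approx q \cdot 0 + (1 - q) \cdot 1 = 1 - q \quad \text{for every } N \ge 1
```

So the chancy risk keeps eating expected survival every period it is rerun, whereas the merely uncertain risk imposes a one-off discount of q that does not grow with N.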
This is a key point. For many existential risks, the risk is mainly epistemic (i.e. we should assign some probability p to it happening in the next time period), rather than it being objectively chancy. For one-shot decision-making sometimes this distinction doesn’t matter, but here it does.
Complicating matters, what is really going on is not just that the probability is one of two types, but that we have a credence distribution over the different levels of objective chance. A purely epistemic case is one where all our credence is divided between chance levels of 0% and 100%, but in many cases we have credence spread over multiple intermediate risk levels; these cases are neither purely epistemic nor purely objective chance.
I agree—this is a great point. Thanks, Simon!
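As a quick illustration of that mixed case (a toy sketch; the chance levels and credences are made-up numbers, not estimates): with credence spread over several levels of objective chance, surviving each period is evidence for the lower-chance worlds, so the effective per-period hazard declines over time instead of staying fixed.

```python
# Toy model: credence distribution over levels of objective per-period chance.
# All numbers are illustrative, not estimates.

chance_levels = [0.0, 0.05, 0.5, 1.0]   # candidate objective per-period risks
credences     = [0.4, 0.3, 0.2, 0.1]    # our credence in each level (sums to 1)

weights = list(credences)
cumulative_survival = 1.0
for period in range(1, 6):
    # Predictive probability of surviving this period, given survival so far
    p_survive = sum(w * (1 - c) for w, c in zip(weights, chance_levels)) / sum(weights)
    cumulative_survival *= p_survive
    # Bayesian update: surviving shifts weight toward the low-chance worlds
    weights = [w * (1 - c) for w, c in zip(weights, chance_levels)]
    print(f"period {period}: hazard {1 - p_survive:.3f}, "
          f"cumulative survival {cumulative_survival:.3f}")
```

The pure cases fall out as special cases: all credence on a single intermediate level recovers the chancy (1 − p)^N decay, while credence only on 0% and 100% gives a hazard that drops to zero after the first period survived.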
You are right that the magnitude of rerun risk from alignment should be lower than the probability of misaligned AI doom. However, worlds in which AI takeover is very likely but we can't change it, or very unlikely and we can't change it, aren't the interesting worlds from the perspective of taking action. (Owen and Fin have a post on this topic that should be coming out fairly soon.) So, if we're taking this consideration into account, it should also discount the value of work to reduce misalignment risk today.
(Another upshot: bio-risk seems more like chance than uncertainty, so bio-risk becomes comparatively more important than you'd think before taking this consideration into account.)
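One rough way to formalise that discounting argument (notation introduced here for illustration, not from the original comment): the expected value of marginal alignment work sums over worlds w, weighted by how much the work actually shifts the risk in each world, and that shift is roughly zero in the "can't change it" worlds at both extremes:

```latex
\mathrm{EV}(\text{marginal work}) \;\propto\; \sum_{w} \Pr(w)\, \Delta p_{\text{doom}}(w),
\quad \text{with } \Delta p_{\text{doom}}(w) \approx 0 \text{ in worlds where the risk is near 0 or near 1 and not malleable}
```

Both rerun risk and the value of present-day work end up weighted by the same middle, malleable worlds, which is why the discount applies to both.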
I would strongly push back on the idea that a world where it’s unlikely and we can’t change that is uninteresting. In that world, all the other possible global catastrophic risks become far more salient as potential flourishing-defeaters.
Agree, and this relates to my point about distinguishing the likelihood of retaining alignment knowledge from the likelihood of rediscovering it.