Great post! A few reactions:
1. With space colonization, we can hopefully create causally isolated civilizations. Once this happens, the risk that civilization collapses everywhere falls dramatically, because the colonies' fates are independent.
2. There are two different kinds of catastrophic risk: chancy, and merely uncertain. Compare flipping a fair coin (chancy) to flipping a coin that is either double-headed or double-tailed, but you don’t know which (merely uncertain). If alignment is merely uncertain, then conditional on solving it once, we are in the double-headed case, and we will solve it again. Alignment might be like this: for example, one picture is that alignment might be brute-forceable with enough data, but we just don’t know whether this is so. At any rate, merely uncertain catastrophic risks do not have rerun risk, while chancy ones do (see the sketch after this list).
3. I’m a bit skeptical of demographic decline as a catastrophic risk, because of evolutionary pressure. If some groups stop reproducing, groups with high reproduction rates will tend to replace them.
4. Regarding unipolar outcomes, you’re suggesting a picture where unipolar outcomes have less catastrophic risk but more lock-in risk. I’m unsure of this. First, a unipolar world government might have a higher risk of civil unrest. In particular, you might think that elites tend to treat residents better out of fear of external threats; without such threats, they may exploit residents more, leading to more civil unrest. Second, unipolar AI outcomes may carry a higher risk of AI going rogue than multipolar ones, because in multipolar outcomes, humans may have extra value to AIs as partners in competition against other AIs.
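To make the rerun-risk contrast in point 2 concrete, here is a minimal sketch in Python. The coin framing is from the point above; the 50/50 prior and the exact way I model “safe” vs “doom” flips are illustrative assumptions of mine:

```python
# Rerun risk after surviving one run, under the two readings of the same headline risk.

def rerun_risk_merely_uncertain(prior_doom: float) -> float:
    """'Merely uncertain': the coin is either double-headed (always safe) or
    double-tailed (always doom); prior_doom is our credence in the doom coin."""
    p_survive_given_doom = 0.0   # a double-tailed coin never comes up safe
    p_survive_given_safe = 1.0
    p_survive = prior_doom * p_survive_given_doom + (1 - prior_doom) * p_survive_given_safe
    # Bayes: credence in the doom coin after surviving one flip.
    posterior_doom = prior_doom * p_survive_given_doom / p_survive
    return posterior_doom  # 0.0 whatever the prior: surviving rules the doom coin out

def rerun_risk_chancy(per_run_chance: float) -> float:
    """'Chancy': each run is an independent draw with the same objective chance of doom,
    so surviving once tells us nothing about the next run."""
    return per_run_chance

print(rerun_risk_merely_uncertain(prior_doom=0.5))  # 0.0 -> no rerun risk
print(rerun_risk_chancy(per_run_chance=0.5))        # 0.5 -> full rerun risk
```

In the merely uncertain case the whole of the apparent rerun risk is eaten by the Bayesian update on having survived; in the chancy case nothing updates.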
“At any rate, merely uncertain catastrophic risks do not have rerun risk, while chancy ones do.”

This is a key point. For many existential risks, the risk is mainly epistemic (i.e. we should assign some probability p to it happening in the next time period) rather than objectively chancy. For one-shot decision-making this distinction sometimes doesn’t matter, but here it does.
Complicating matters, what is really going on is not just that the probability is one of two types, but that we have a credence distribution over different levels of objective chance. A purely subjective case is one where all our credence is split between 0% and 100%, but in many cases we have credence spread over several intermediate risk levels; these cases are neither purely epistemic nor purely objective chance.
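A minimal sketch of that mixed case, with a made-up credence distribution over possible chance levels (all numbers are purely illustrative):

```python
# Credence spread over possible levels of objective chance, and how it updates
# when we survive one run. The distribution below is made up for illustration.

chance_levels = [0.0, 0.1, 0.5, 1.0]   # candidate per-run chances of catastrophe
credence      = [0.4, 0.3, 0.2, 0.1]   # our credence in each candidate (sums to 1)

# Risk of catastrophe in the next run, before observing anything:
prior_risk = sum(c * p for c, p in zip(chance_levels, credence))

# Update on having survived one run: P(level | survived) is proportional to (1 - level) * P(level).
posterior = [(1 - c) * p for c, p in zip(chance_levels, credence)]
total = sum(posterior)
posterior = [w / total for w in posterior]

# Rerun risk: chance of catastrophe in the next run, given that we survived the last one.
rerun_risk = sum(c * p for c, p in zip(chance_levels, posterior))

print(f"per-run risk before any evidence: {prior_risk:.3f}")  # 0.230
print(f"rerun risk after surviving once:  {rerun_risk:.3f}")  # 0.100
```

Surviving one run shifts credence toward the lower chance levels, so the rerun risk drops below the naive per-run risk but, unlike the pure 0%/100% case, it doesn’t vanish.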
I agree—this is a great point. Thanks, Simon!
You are right that the magnitude of rerun risk from alignment should be lower than the probability of misaligned AI doom. However, worlds in which AI takeover is very likely but we can’t change that, or in which it’s very unlikely and we can’t change that, aren’t the interesting worlds from the perspective of taking action. (Owen and Fin have a post on this topic that should be coming out fairly soon.) So if we take this consideration into account, it should also discount the value of work to reduce misalignment risk today (see the rough sketch below).
(Another upshot: bio-risk seems more like chance than uncertainty, so bio-risk becomes comparatively more important than you’d have thought before taking this consideration into account.)
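A rough sketch of the symmetry being claimed here; the three-world split and every number in it are my own illustrative assumptions, not anything from the post or the forthcoming piece:

```python
# Credence split across three world-types: takeover near-certain and unchangeable,
# takeover near-impossible and unchangeable, and a contingent middle where effort
# actually moves the probability. All numbers are made up for illustration.

worlds = {
    # name: (credence, P(takeover) without effort, P(takeover) with effort)
    "doomed either way": (0.2, 0.99, 0.99),
    "fine either way":   (0.5, 0.01, 0.01),
    "contingent":        (0.3, 0.50, 0.30),
}

def p_takeover(with_effort: bool) -> float:
    return sum(cred * (p_with if with_effort else p_without)
               for cred, p_without, p_with in worlds.values())

print(f"P(takeover) without effort: {p_takeover(False):.3f}")                      # 0.353
print(f"risk reduction from effort: {p_takeover(False) - p_takeover(True):.3f}")   # 0.060
# All of the reduction (0.3 * 0.20 = 0.06) comes from the contingent worlds. The
# unchangeable worlds add to the headline risk but not to the value of work today,
# and, by the same reasoning, not to action-relevant rerun risk either.
```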
Agree, and this relates to my point about distinguishing the likelihood of retaining alignment knowledge from the likelihood of rediscovering it.
On point 1 (space colonization), I think it’s hard and slow! So the same issue as with bio-risk might apply: AGI doesn’t get you this robustness quickly for free. See my other comment on this post.
I like your point 2 about chancy vs merely uncertain. I guess a related point is that when the ‘runs’ of the risks are in some way correlated, having survived once is evidence that survivability is higher. (Up to and including the fully correlated ‘merely uncertain’ extreme?)