My paraphrase of the SDO argument is:
With our best-guess parameters in the Drake equation, we should be surprised that there are no aliens. But for all we know, maybe one or more of the parameters in the Drake equation is many, many orders of magnitude lower than our best guess. And if that’s in fact the case, then we should not be surprised that there are no aliens!
…which seems pretty obvious, right?
So back to the context of AI risk. We have:
a framework in which risk is a conjunctive combination of factors…
…in which, at several of the steps, a subset of survey respondents give rather low probabilities for that factor being present
So at each step in the conjunctive argument, we wind up with some weight on “maybe this factor is really low”. And those add up.
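Here’s a toy Monte Carlo sketch of that point (my own illustration, not anything from the essay; the number of factors and the uncertainty ranges are made up). When every factor in a conjunctive chain carries order-of-magnitude uncertainty, most sampled worlds end up with a very small product, so the mean risk sits well above the median:

```python
# Toy sketch (made-up factor count and ranges): order-of-magnitude uncertainty
# in each factor of a conjunctive chain skews the product toward low values.
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_factors = 100_000, 5

# Assume each factor is log-uniform between 1% and 90%, i.e. uncertain to
# within roughly two orders of magnitude.
log_lo, log_hi = np.log10(0.01), np.log10(0.9)
factors = 10 ** rng.uniform(log_lo, log_hi, size=(n_samples, n_factors))
product = factors.prod(axis=1)

print(f"mean risk:           {product.mean():.4f}")
print(f"median risk:         {np.median(product):.4f}")
print(f"share of worlds <1%: {(product < 0.01).mean():.1%}")
```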
I don’t find the correlation table (of your other comment) convincing. When I look at the review table, there seem to be obvious optimistic outliers—two of the three lowest numbers on the whole table came from the same person. And your method has those optimistic outliers punching above their weight.
(At least, you should be calculating correlations between log(probability) values, right? Because the combination is multiplicative.)
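To make that concrete, here’s a minimal sketch of the calculation I have in mind, with invented numbers standing in for the survey responses:

```python
# Minimal sketch: correlate log-probabilities rather than raw probabilities,
# since the factors combine multiplicatively. The `responses` array is
# invented; the real calculation would use the survey data.
import numpy as np

responses = np.array([
    [0.80, 0.60, 0.50, 0.40],
    [0.90, 0.70, 0.30, 0.50],
    [0.70, 0.50, 0.60, 0.30],
    [0.05, 0.02, 0.10, 0.01],   # an "optimistic outlier" respondent
])  # rows = respondents, columns = steps in the conjunctive argument

raw_corr = np.corrcoef(responses, rowvar=False)
log_corr = np.corrcoef(np.log(responses), rowvar=False)

print("correlations of raw probabilities:\n", raw_corr.round(2))
print("correlations of log(probability):\n", log_corr.round(2))
```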
Anyway, I think that AI risk is more disjunctive than conjunctive, so I really disagree with the whole setup. Recall that Joe’s conjunctive setup is:
1. It will become possible and financially feasible to build APS systems.
2. There will be strong incentives to build APS systems | (1).
3. It will be much harder to develop APS systems that would be practically PS-aligned if deployed, than to develop APS systems that would be practically PS-misaligned if deployed (even if relevant decision-makers don’t know this), but which are at least superficially attractive to deploy anyway | (1)-(2).
4. Some deployed APS systems will be exposed to inputs where they seek power in misaligned and high-impact ways (say, collectively causing >$1 trillion 2021-dollars of damage) | (1)-(3).
5. Some of this misaligned power-seeking will scale (in aggregate) to the point of permanently disempowering ~all of humanity | (1)-(4).
6. This will constitute an existential catastrophe | (1)-(5).
Of these:
1 is legitimately a conjunctive factor: If there’s no AGI, then there’s no AGI risk. (Though I understand that 1 is out of scope for this post?)
I don’t think 2 is a conjunctive factor. If there are not strong incentives to build APS systems, I expect people to do so anyway, sooner or later, because it’s scientifically interesting, it’s cool, it helps us better understand the human brain, etc. For example, I would argue that there are not strong incentives to do recklessly dangerous gain-of-function research, but that doesn’t seem to be stopping people. (Or if “doing this thing will marginally help somebody somewhere to get grants and tenure” counts as “strong incentives”, then that’s a very low bar!)
I don’t think 3 is a conjunctive factor, because even if alignment is easy in principle, there are bound to be people who want to try something different just because they’re curious what would happen, and people who have weird bad ideas, etc. etc. It’s a big world!
4-5 does constitute a conjunctive factor, I think, but I would argue that avoiding 4-5 requires a conjunction of different factors, factors that get us to a very different world involving something like a singleton AI or extreme societal resilience against destructive actors, of a type that seems unlikely to me. (More on this topic in my post here.)
6 is also a conjunctive factor, I think, but again avoiding 6 requires (I think) a conjunction of other factors. Like, to avoid 6 being true, we’d probably need a unipolar outcome (…I would argue…), and the AI would need to have properties that are “good” in our judgment, and the AI would probably need to be able to successfully align its successors and avoid undesired value drift over vast times and distances. (I sketch the conjunctive-vs-disjunctive arithmetic below.)
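To spell out that arithmetic with a toy example (made-up numbers, nobody’s actual estimates): in a conjunctive model the factor probabilities multiply, while in a disjunctive model each factor is a separate route to the outcome, so it’s the routes’ failure probabilities that multiply.

```python
# Toy illustration with invented numbers: how conjunctive vs. disjunctive
# structure changes the bottom line, holding the individual estimates fixed.
probs = [0.8, 0.5, 0.3, 0.4]

# Conjunctive: every factor must hold, so the probabilities multiply.
conjunctive_risk = 1.0
for p in probs:
    conjunctive_risk *= p

# Disjunctive: each factor is an independent route to catastrophe,
# so risk is 1 minus the chance that every route fails.
no_route_succeeds = 1.0
for p in probs:
    no_route_succeeds *= (1.0 - p)
disjunctive_risk = 1.0 - no_route_succeeds

print(f"conjunctive: {conjunctive_risk:.3f}")   # 0.048
print(f"disjunctive: {disjunctive_risk:.3f}")   # 0.958
```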
I think you’re using a philosophical framework I just don’t recognise here: ‘conjunctive’ and ‘disjunctive’ are not ordinary vocabulary in the sort of statistical modelling I do. One possible description of statistical modelling is that you are aiming to capture relevant insights about the world in a mathematical format so you can test hypotheses about those insights. In that respect, a model is good or bad based on how well its key features reflect the real world, rather than because it takes some particular position on the conjunctive-vs-disjunctive dispute. For example, I am very excited to see the results of the MTAIR project, which will use a model a little bit like the below. This isn’t really ‘conjunctive’ or ‘disjunctive’ in any meaningful sense: it tries to multiply probabilities when they should be multiplied and add probabilities when they should be added. This is more like the philosophical framework I would expect modelling to be undertaken in.
I’d add that one of the novel findings of this essay is that if there are ‘conjunctive’ steps between ‘disjunctive’ steps, it is likely that the distribution effect I find will still apply (that is, given order-of-magnitude uncertainty). Insofar as you agree that 4-ish steps in AI Risk are legitimately conjunctive as per your comment above, we probably materially agree on the important finding of this essay (that the distribution of risk is asymmetrically weighted towards low-risk worlds), even if we disagree about the exact point estimate around which that distribution skews.
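A rough way to check that kind of claim numerically (a toy sketch of my own, not the MTAIR model and not the essay’s actual code; the structure, step counts and uncertainty ranges are all invented) is to mix conjunctive and disjunctive steps, give every factor order-of-magnitude uncertainty, and look at the shape of the resulting risk distribution:

```python
# Toy check (invented structure and numbers): two conjunctive steps feeding a
# disjunctive step, with order-of-magnitude uncertainty on every factor.
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

def uncertain_prob(median, sigma=1.0):
    """Sample probabilities with roughly order-of-magnitude (log-normal)
    uncertainty around `median`, capped at 1."""
    return np.minimum(median * 10 ** rng.normal(0.0, sigma, n), 1.0)

# Conjunctive steps: both must hold, so they multiply.
step_a = uncertain_prob(0.5)
step_b = uncertain_prob(0.3)

# Disjunctive step: either of two independent routes suffices.
route_1 = uncertain_prob(0.2)
route_2 = uncertain_prob(0.1)
either_route = 1.0 - (1.0 - route_1) * (1.0 - route_2)

risk = step_a * step_b * either_route

print(f"mean risk:           {risk.mean():.3f}")
print(f"median risk:         {np.median(risk):.3f}")
print(f"share of worlds <1%: {(risk < 0.01).mean():.1%}")
```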
Small point of clarification: you’re looking at the review table for Carlsmith (2021), which corresponds to Section 4.3.1. The correlation table I produce is for the Full Survey dataset, which corresponds to Section 4.1.1. Perhaps to highlight the difference: in the Full Survey dataset of 42 people, 5 people give exactly one probability <10%, 2 people give exactly two probabilities <10%, 2 people give exactly three probabilities <10%, and 1 mega-outlier gives exactly four probabilities <10%. To me this does seem like evidence of ‘optimism bias’ / correlation relative to what we might expect to see (which would be closer to 1 person giving exactly two probabilities <10%, I suppose), but not enough to fundamentally alter the conclusion that low-risk worlds are more likely than high-risk worlds based on community consensus (e.g. see Section 4.3.3).
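For what it’s worth, one quick way to sanity-check that intuition (a sketch only: the question count per respondent is an assumption and answers are treated as independent, so the real survey figures should be substituted) is to compare the observed counts against a simple binomial model:

```python
# Sketch: compare the observed counts of sub-10% answers per respondent with
# what independence would predict. The per-respondent question count is an
# assumption; swap in the real survey length.
from math import comb

n_respondents = 42
n_questions = 6                                      # assumed
total_low = 5 * 1 + 2 * 2 + 2 * 3 + 1 * 4            # sub-10% answers quoted above
p_low = total_low / (n_respondents * n_questions)    # per-answer rate of <10%

def expected_exactly(k):
    """Expected number of respondents giving exactly k answers below 10%,
    if every answer were an independent draw with probability p_low."""
    return n_respondents * comb(n_questions, k) * p_low**k * (1 - p_low)**(n_questions - k)

for k in range(1, 5):
    print(f"respondents with exactly {k} answers <10%: expected ~{expected_exactly(k):.1f}")
```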