So the predictions the experts made are all purely “subjective” predictions? I would expect some logical argument, or maybe a Fermi estimate, to explain how he arrives at the number, unless it’s mostly intuition.
[This article](https://slatestarcodex.com/2013/05/02/if-its-worth-doing-its-worth-doing-with-made-up-statistics/) explores why it is useful to work with subjective, “made-up” statistics.
My own view hinges on the following:
- instrumental convergence: agents will tend to try to accumulate certain kinds of resources, like money, regardless of what their goals are;
- value-capabilities orthogonality (often known as just “the orthogonality thesis”): regardless of their capabilities, agents might have pretty much any kind of goal;
- the fact that most possible goals are incompatible with human thriving (we need a very specific set of conditions to survive, let alone thrive);
- the fact that current AI capabilities are growing, that the growth rate seems to be increasing, and that there are strong economic incentives to keep pushing them forward.
These factors lead me to think we have significantly worse than even odds (that is, <50%) of surviving this century.
I’m also quite interested in how these estimates are being made, so can I ask you for more detail about how you got your estimate?
In particular, I’m interested in the “chain of events” involved. AI extinction involves several consecutive speculative events. What are your estimates for the following, conditional on the previous steps occurring?
1. At least one AGI is built this century.
2. At least one of these AGIs is motivated to conquer and wipe out humanity.
3. At least one of these rebellious AGIs successfully conquers and destroys humanity.
Did your >50% estimate come from reasoning like this about each step?
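For concreteness, here’s a rough sketch of the kind of multiplication I have in mind; the probabilities below are made-up placeholders of mine, purely to show how the conditional steps combine into an overall figure, not numbers I’m attributing to you:

```python
# Rough Fermi-style sketch: multiply conditional probabilities along the chain.
# All numbers here are made-up placeholders, not anyone's actual estimates.
p1 = 0.9  # P(1): at least one AGI is built this century
p2 = 0.5  # P(2 | 1): at least one of these AGIs wants to wipe out humanity
p3 = 0.5  # P(3 | 2): at least one such AGI actually succeeds

p_doom = p1 * p2 * p3
print(f"P(extinction) = {p_doom:.3f}")  # 0.225 with these placeholder numbers
```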
I think 1 is >95% likely. We’re in an arms-race dynamic for at least some of the components of AGI. This is conditional on us not having been otherwise wiped out (by war, pandemic, asteroid, etc.).
I think 2 and 3 are the wrong way to think about the question. Was humankind “motivated to conquer” the dodo? Or did we just have a better use for its habitat, and its extinction was just a whoopsie in the process?
When I say “motivated to”, I don’t mean that it would be its primary motivation. I mean that it has motivations that, at some point, would lead to it having “perform actions that would kill all of humanity” as a sub-goal. And in order to get to the point where we were dodos to it, it would have to disempower humanity somehow.
Would you prefer the following restatement, each conditional on the previous step:
1. At least one AGI is built in our lifetimes.
2. At least one of these AGIs has motivations that include “disempower humanity” as a sub-goal.
3. At least one of these disempowerment attempts is successful.
And then either:
4a: The process of disempowering humanity involves wiping out all of humanity
Or
4b: After successfully disempowering humanity with some of humanity still intact, the AI ends up wiping out the rest of humanity anyway
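To be explicit about how I’d combine these, here’s the same arithmetic for this four-step version, with the 4a/4b branch folded in; the numbers are again arbitrary placeholders, only there to make the structure of the estimate visible:

```python
# Same idea for the four-step restatement; all numbers are placeholders.
p1 = 0.8    # P(1): at least one AGI is built in our lifetimes
p2 = 0.6    # P(2 | 1): some AGI ends up with "disempower humanity" as a sub-goal
p3 = 0.5    # P(3 | 2): at least one disempowerment attempt succeeds
p4a = 0.25  # P(4a | 3): the disempowerment itself wipes out all of humanity
p4b = 0.5   # P(4b | 3, not 4a): the AI later wipes out the surviving remainder

p_doom = p1 * p2 * p3 * (p4a + (1 - p4a) * p4b)
print(f"P(extinction) = {p_doom:.3f}")  # 0.150 with these placeholder numbers
```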