There are yet other views about what exactly AI catastrophe will look like, but I think it is fair to say that the combined views of Yudkowsky and Christiano provide a fairly good representation of the field as a whole.
Thanks for writing this! I disagree with this.
We ran a survey of prominent AI safety and governance researchers, asking them to estimate the probability of five different AI x-risk scenarios.
Arguably, the “Terminator-like” scenarios are the “Superintelligence” scenario and part 2 of “What failure looks like” (as you suggest in your post).[1]
Conditional on an x-catastrophe due to AI occurring, the median respondent gave those scenarios 10% and 12% probability (mean 16% each). The other three scenarios[2] got medians of 12.5%, 10%, and 10% (means 18%, 17%, and 15%).
So I don’t think that the “field as a whole” considers Terminator-like x-risk scenarios the most likely. Accordingly, I’d prefer if the central claim of this post were “AI risk could actually be like Terminator; stop saying it’s not”.
Part 1 of “What failure looks like” probably doesn’t look that much like Terminator (the disaster unfolds more slowly and is caused by AI systems just doing their jobs really well).
That is, the following three scenarios: part 1 of “What failure looks like”, existentially catastrophic AI misuse, and existentially catastrophic war between humans exacerbated by AI. See the post for full scenario descriptions.
Thanks for reading—you’re definitely right, my claim about the representativeness of Yudkowsky & Christiano’s views was wrong. I had only a narrow segment of the field in mind when I wrote this post. Thank you for conducting this very informative survey.