I’d be really interested in reading an updated post that makes the case for there being an especially high (e.g. >10%) probability that AI alignment problems will lead to existentially bad outcomes.
There still isn’t a lot of writing explaining the case for existential misalignment risk. And a significant fraction of what’s been produced since Superintelligence is either: (a) roughly summarizing arguments in Superintelligence, (b) pretty cursory, or (c) written by people who are relative optimists and are in large part trying to explain their relative optimism.
Since I have the (possibly mistaken) impression that a decent number of people in the EA community are quite pessimistic regarding existential misalignment risk, on the basis of reasoning that goes significantly beyond what’s in Superintelligence, I’d really like to understand this position a lot better and be in a position to evaluate the arguments for it.
(My ideal version of this post would probably assume some degree of familiarity with contemporary machine learning, and contemporary safety/robustness issues, but no previous familiarity with arguments that AI poses an existential risk.)
My understanding is that Toby Ord does just this in his new book The Precipice (his new AI x-risk estimate is also discussed in his recent 80K podcast interview about the book), though it would still be good to have others weigh in.
I think that chapter in The Precipice is really good, but it’s not exactly the sort of thing I have in mind.
Although Toby’s less optimistic than I am, he’s still only arguing for a 10% probability of existentially bad outcomes from misalignment.* The argument in the chapter is also, by necessity, relatively cursory. It’s aiming to introduce the field of artificial intelligence and the concept of AGI to readers who might be unfamiliar with them, explain what misalignment risk is, make the idea vivid to readers, clarify misconceptions, describe the state of expert opinion, and add in various other nuances, all within the span of about fifteen pages. I think that it succeeds very well in what it’s aiming to do, but I would say that it’s aiming for something fairly different.
*Technically, if I remember correctly, it’s a 10% probability within the next century. So the implied overall probability is at least somewhat higher.
I see, thanks for the explanation!