There’s a very wide range of views on this question, from “misalignment risk is essentially made up and incoherent” to “humanity will almost certainly go extinct due to misaligned AI.” Most people’s arguments rely heavily on hard-to-articulate intuitions and assumptions.
My sense is that the disagreements are mostly driven “top-down” by general psychological biases/inclinations toward optimism vs. pessimism, rather than “bottom-up” as the result of independent lower-level disagreements over specific intuitions and assumptions. The reason I think this is that there seems to be a strong correlation between concern about misalignment risk and concern about other kinds of AI risk (i.e., AI-related x-risk). In other words, if the disagreement were “bottom-up”, you’d expect at least some people who are optimistic about misalignment risk to be pessimistic about other kinds of AI risk, such as what I call “human safety problems” (see examples here and here). But in fact I don’t seem to see anyone whose position is something like, “AI alignment will be easy or likely solved by default, therefore we should focus our efforts on these other kinds of AI-related x-risks that are much more worrying.”
(From my limited observation, optimism/pessimism on AI risk also seems correlated with optimism/pessimism on other topics. It might be interesting to verify this through some systematic method like a survey of researchers.)
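To make the “systematic method” suggestion concrete, here is a minimal sketch of the kind of check such a survey could support, assuming each respondent gives a pessimism estimate for misalignment risk and for other AI-related x-risks. The numbers below are made-up placeholders for illustration, not real survey data.

```python
# Hypothetical sketch: does pessimism about misalignment risk correlate with
# pessimism about other AI-related x-risks across survey respondents?
# All numbers are illustrative placeholders, not real survey results.
from scipy.stats import spearmanr

# Each pair: (pessimism about misalignment risk, pessimism about other AI x-risks)
# for one hypothetical respondent, on a 0-1 scale.
responses = [
    (0.05, 0.02),
    (0.10, 0.10),
    (0.30, 0.20),
    (0.50, 0.40),
    (0.80, 0.60),
]

misalignment = [r[0] for r in responses]
other_risks = [r[1] for r in responses]

# Rank correlation: a high rho would be consistent with the "top-down" story above.
rho, p_value = spearmanr(misalignment, other_risks)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
```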
FWIW I know some people who explicitly think this. And I think there are also a bunch of people who think something like “the alignment problem will probably be pretty technically easy, so we should be focusing on the problems arising from humanity sometimes being really bad at technically easy problems”.
Sounds like their positions are not public, since you don’t cite anyone by name? Is there any reason for that?
FWIW, I think my median future includes humanity solving AI alignment but messing up reflection/coordination in some way that makes us lose out on most possible value. I think this means that longtermists should think more about reflection/coordination issues than we currently do. But technical AI alignment seems more tractable than reflection/coordination, so I think it’s probably correct for more total effort to go towards alignment (which is the status quo).
I’m undecided about whether these reflection/coordination issues are best framed as “AI risk” or not. They’ll certainly interact a lot with AI, but we would face similar problems without AI.
FWIW, I haven’t had this impression.
Single data point: In the most recent survey on community opinion on AI risk, I was in at least the 75th percentile for pessimism (for roughly the same reasons Lukas suggests below). But I also seem to be unusually optimistic about alignment.
I haven’t found that this is a really unusual combo: I think I know at least a few other people who are unusually pessimistic about ‘AI going well,’ but also at least moderately optimistic about alignment.
(Caveat that my apparently higher level of pessimism could also be explained by me having a more inclusive conception of “existential risk” than other survey participants.)