I like this, and think it’s healthy. I recommend talking to Quintin Pope as a smart person who has thought a lot about alignment and came to the informed, inside-view conclusion that we have a 5% chance of doom (or just reading his posts or comments). He has updated me downwards on doom a lot.
Hopefully it gets you into a position where you’re able to update more on evidence that I think is evidence, by giving you a better picture of what the best arguments against doom would be.
Is 5% low? 5% still strikes me as a “preventing this outcome should plausibly be civilization’s #1 priority” level of risk.
Yes to (paraphrased) “5% should plausibly still be civilization’s top priority.”
However, in another sense, 5% is indeed low!
I think that’s a significant implicit source of disagreement over AI doom likelihoods – what sort of priors people start with.
The following will be a bit simplistic (in reality proponents of each side will probably state their position in more sophisticated ways).
On one side, optimists may use a prior of “It’s rare that humans build important new technology and it doesn’t function the way it’s intended.”
On the other side, pessimists can say that it has almost never happened that people who developed a revolutionary new technology displayed a lot of foresight about its long-term consequences when they started using it. For instance, there were comparatively few efforts at major social media companies to address ways in which social media might change society for the worse. The same reasoning applies to the food industry and the obesity epidemic, or to online dating and its effects on single-parenthood rates.
I’m not saying revolutions in these sectors were overall negative for human happiness – just that there seem to be costly negative side effects that no one competent has ever been “in charge” of proactively addressing (nor do we have good plans to address them anytime soon). So, it’s not easily apparent how we’ll suddenly get rid of all these issues and fix the underlying dynamics, apart from “AI will give us god-like power to fix everything.” The pessimists can argue that humans have never seemed particularly “in control” of technological progress. There’s this accelerating force that improves things on some metrics but makes other things worse elsewhere. (Pinker-style arguments for the world getting better seem one-sided to me – he mostly looks at trends that were already relevant hundreds of years ago, but doesn’t talk about “newer problems” that only arose as Molochian side-effects of technological progress.)
AI will be the culmination of all that (of the accelerating forces that have positive effects on immediately legible metrics, but negative effects on some other variables due to Molochian dynamics). Unless we use it to attain a degree of control that we never had, it won’t go well.
To conclude, there’s a sense in which believing “AI doom risk is only 5%” is like believing that there’s a 95% chance that AI will solve all the world’s major problems. Expressed that way, it seems like a pretty strong claim.
(The above holds especially for definitions of “AI doom” where humanity would lose most of its long-term “potential.” That said, even if by “AI doom” one means something like “people all die,” one can argue that a likely endpoint/attractor state of not being able to fix all the world’s major problems is people’s extinction, eventually.)
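To spell out the implied arithmetic (a minimal sketch, assuming the binary framing above, in which futures that leave the world’s major problems permanently unfixed count as doom in the long run):

$$P(\text{AI eventually fixes the major problems}) \approx 1 - P(\text{doom}) = 1 - 0.05 = 0.95$$

Written as the complement, the optimism packed into “only 5%” is easier to see.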
I’ve been meaning to write a longer post on these topics at some point, but may not get to it anytime soon.
Eh, I don’t think this is a priors game. Quintin has lots of information, I have lots of information, so if we were both acting optimally according to differing priors, our opinions likely would have converged.
In general, I’m skeptical of explanations of disagreement that reduce things to differing priors. It’s just not physically or predictively correct, and it feels nice because now you no longer have an epistemological duty to go and see why relevant people have differing opinions.
That would be a valid reply if I had said it’s all about priors. All I said was that I think priors make up a significant implicit source of the disagreement – as suggested by some people thinking 5% risk of doom seems “high” and me thinking/reacting with “you wouldn’t be saying that if you had anything close to my priors.”
Or maybe what I mean is stronger than “priors.” “Differences in underlying worldviews” seems like the better description. Specifically, the worldview I identify more with, which I think many EAs don’t share, is something like “The Yudkowskian worldview where the world is insane, most institutions are incompetent, Inadequate Equilibria is a big deal, etc.” And that probably affects things like whether we anchor well below or above 50% on the risk that the culmination of accelerating technological progress won’t go well.
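As a toy illustration of how this interacts with the “not a priors game” point above (a minimal sketch with made-up numbers, not anyone’s actual credences or evidence): two people can accept the same likelihood ratios and still land far apart if their prior odds differ enough.

```python
# Toy Bayesian-updating sketch: two observers with different prior odds on
# "the transition to transformative AI goes badly" update on the same evidence.
# All numbers are illustrative assumptions, not anyone's stated credences.

def posterior(prior_odds: float, likelihood_ratios: list[float]) -> float:
    """Multiply prior odds by each likelihood ratio, then convert odds to probability."""
    odds = prior_odds
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1 + odds)

# Hypothetical likelihood ratios both observers accept (evidence mildly favoring "goes well").
shared_evidence = [0.5, 0.8, 1.5]

inadequate_world_prior = 4.0   # ~80% prior that it goes badly ("world is insane" worldview)
tech_works_out_prior = 0.05    # ~5% prior that it goes badly ("technology usually works out" worldview)

print(round(posterior(inadequate_world_prior, shared_evidence), 2))  # ~0.71
print(round(posterior(tech_works_out_prior, shared_evidence), 2))    # ~0.03
```

With a lot more shared, agreed-upon evidence the posteriors would converge, which is the force of the objection above; with only modest shared evidence, the prior/worldview still dominates the bottom line.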
That’s misdescribing the scope of my point and drawing inappropriate inferences. The last time I made an object-level argument about AI misalignment risk was just 3h before your comment. (Not sure it’s particularly intelligible, but the point is, I’m trying! :) )
So, evidently, I agree that a lot of the discussion should be held at a deeper level than the one of priors/general worldviews.
I’m a fan of Shard theory and some of the considerations behind it have already updated me towards a lower chance of doom than I had before starting to incorporate it more into my thinking. (Which I’m still in the process of doing.)
Yeah, he’s working on it, but it’s not his no. 1 priority. He developed shard theory.
Agreed, but 5% is much lower than “certain or close to certain”, which is the starting point Nuno Sempere said he was sceptical of.
I don’t know that anyone thinks doom is “certain or close to certain”, though the April 1 post could be read that way. 5% is also much lower than, say, 50%, which seems to be a somewhat more common belief.
Thanks!
What outcome does he specifically predict 5% probability of?