How entrenched do you think old ideas about AI risk are in the AI safety community? Do you think it’s possible to establish a new paradigm quickly, given relevant arguments?
I’d guess that, like most scientific endeavours, this one has many social aspects that bias people toward their own old ways of thinking. Research agendas and institutions are built on some basic assumptions—and changing those assumptions could be disruptive to the people and organisations involved. However, there seems to be a lot of engagement with the underlying questions about the paths to superintelligence and their consequences, and the research community today overlaps heavily with the rationality community—both of these make me hopeful that more minds can be changed given appropriate argumentation.
I actually don’t think they’re very entrenched!
I think that, today, most established AI researchers have visions of the risks from AI—and of the problems they need to solve—that differ substantially from the primary vision discussed in Superintelligence and in classic Yudkowsky essays. When I’ve spoken to AI safety researchers about issues with the “classic” arguments, I’ve encountered relatively low levels of disagreement. Arguments that heavily emphasize mesa-optimization, or arguments more in line with this post, seem to be more influential now. (The safety researchers I know aren’t a random sample, though, so I’d be interested to hear whether this sounds off to anyone in the community.)
I think that “classic” ways of thinking about AI risk are now more prominent outside the core AI safety community than within it. They still have an important impact on community beliefs about prioritization, on individual career decisions, and so on, but I don’t think they’re heavily guiding most of the research the safety community does today.
(Unfortunately, I probably don’t make this clear in the podcast.)