How entrenched do you think are old ideas about AI risk in the AI safety community? Do you think that it’s possible to have a new paradigm quickly given relevant arguments?
I actually don’t think they’re very entrenched!
I think that, today, most established AI researchers have fairly different visions of the risks from AI—and of the problems that they need to solve—than the primary vision discussed in Superintelligence and in classic Yudkowsky essays. When I’ve spoken to AI safety researchers about issues with the “classic” arguments, I’ve encountered relatively low levels of disagreement. Arguments that heavily emphasize mesa-optimization or arguments that are more in line with this post seem to be more influential now. (The safety researchers I know aren’t a random sample, though, so I’d be interested in whether this sounds off to anyone in the community.)
I think that “classic” ways of thinking about AI risk are now more prominent outside the core AI safety community than they are within it. I think that they have an important impact on community beliefs about prioritization, on individual career decisions, etc., but I don’t think they’re heavily guiding most of the research that the safety community does today.
(Unfortunately, I probably don’t make this clear in the podcast.)