I’m much more excited by scenarios like: ‘a new podcast comes out that has top-tier-excellent discussion of AI alignment stuff, it becomes super popular among ML researchers, and the culture, norms, and expectations of ML thereby shift such that water-cooler conversations about AGI catastrophe are more serious, substantive, informed, candid, and frequent’.
It’s rare for a big positive cultural shift like that to happen; but it does happen sometimes, and it can result in very fast changes to the Overton window. And since it’s a podcast containing many hours of content, there’s the potential to seed subsequent conversations with a lot of high-quality background thoughts.
To my eye, that seems more like the kind of change that might shift us from a current trajectory of “~definitely going to kill ourselves” to a new trajectory of “viable chance of an existential win”.
Whereas warning shots feel more unpredictable to me, and even if they're helpful, I expect the helpfulness to at best look like "we were almost on track to win, and then the warning shot nudged us just enough to secure a win".
That feels to me like the kind of event that (if we get lucky and a lot of things go well) could shift us onto a winning trajectory. Obviously, another event would be some sort of technical breakthrough that makes alignment a lot easier.
What are they?
(I don’t think anyone has written a ‘scenarios that make the world go a lot better’ post/doc; it might be useful.)