Thanks, I appreciate this post a lot!
Playing the devil’s advocate for a minute, I think one main challenge to this way of presenting the case is something like “yeah, and this is exactly what you’d expect to see for a field in its early stages. Can you tell a story for how these kinds of failures end up killing literally everyone, rather than getting fixed along the way, well before they’re deployed widely enough to do so?”
And there, it seems you do need to start talking about agents with misaligned goals, and the reasons to expect misalignment that we don’t manage to fix?
What I do (assuming I get to that point in the conversation) is deliberately raise points like this myself, before trying to argue against them. In my experience (which again is just my experience), a good portion of the time the people I’m talking to debunk those counterarguments themselves. And if they don’t, then I can start discussing it. By that point, though, it feels to me like I’ve already established credibility and non-craziness by (a) starting off with noncontroversial topics, (b) introducing the more controversial topics with the arguments against taking them seriously, and (c) drawing mostly obvious lines of reasoning from (a) to (b) to whatever conclusions they do end up reaching. So long as I don’t signal science-fiction geekiness too much during the conversation, it feels to me like any particular arguments I end up having to make become a pretty easy sell.