It may be the case that we don’t even need to make general audiences consider paperclip maximizers at all, since the mechanisms needed to prevent them are the same as those needed to prevent the less severe and more plausible-sounding scenarios.
I’m somewhat unsure what exactly you meant by this, but if your point is “solutions to near-term AI concerns like bias and unexpected failures will also provide solutions to long-term concerns about AI alignment,” that viewpoint is commonly disputed by AI safety experts.
No, that’s not what I mean. I mean we should use other examples of the form “you ask an AI to do X, and the AI accomplishes X by doing Y, but Y is bad and not what you intended” where Y is not as bad as an extinction event.
I understand—and agree with—the overall point being made about “don’t just talk about the extreme things like paperclip maximizers”, but I’m still thrown off by the statement that “the mechanisms needed to prevent [paperclip maximizers] are the same as those needed to prevent the less severe and more plausible-sounding scenarios”.
Hm, yeah, I see where you’re coming from. Changed the phrasing.