I’d guess not. From my perspective, humanity’s bottleneck is almost entirely that we’re clueless about alignment. If a meme adds muddle and misunderstanding, then it will be harder to get a critical mass of researchers who are extremely reasonable about alignment, and therefore harder to solve the problem.
It’s hard for muddle and misinformation to spread in exactly the right way to offset those costs; and attempting to strategically sow misinformation will tend to erode our ability to think well and to trust each other.
I’m not sure I get your point here. Surely the terms “AI Safety” and “AI Alignment” are already causing muddle and misunderstanding? I’m saying we should be more specific in our naming of the problem.
“ASI x-safety” might be a better term for other reasons (though Nate objects to it here), but by default, I don’t think we should be influenced in our terminology decisions by ‘term T will cause some alignment researchers to have falser beliefs and pursue dumb-but-harmless strategies, and maybe this will be good’. (Or rather, by default that should count as a reason not to adopt a term.)
Whether current terms cause muddle and misunderstanding doesn’t change my view on this. In that case, IMO we should consider changing to a new term in order to reduce muddle and misunderstanding. We shouldn’t strategically confuse and mislead people in a new direction, just because we accidentally confused or misled people in the past.
What are some better options? Or, what are your current favourites?
“AGI existential safety” seems like the most popular relatively-unambiguous term for “making the AGI transition go well”, so I’m fine with using it until we find a better term.
I think “AI alignment” is a good term for the technical side of differentially producing good outcomes from AI, though it’s an imperfect term insofar as it collides with Stuart Russell’s “value alignment” and Paul Christiano’s “intent alignment”. (The latter, at least, better subsumes a lot of the core challenges in making AI go well.)
Perhaps using “doom” more could work (doom encompasses extinction, permanent curtailment of future potential, and fates worse than extinction).