[Question] What harm could AI safety do?
(Note: I’m so far very in favor of work on AI safety. This question isn’t intended to oppose work on AI safety, but to better understand it and its implications.)
(Edit: The point of this question is also to brainstorm some possible harms of AI safety and see if any of these can produce practical considerations to keep in mind for the development of AI safety.)
Is there any content that investigates the harms that could come from AI safety? So far I’ve only found the scattered comments listed below. All types of harm are relevant, but I mostly had in mind harm caused by AI safety work going as intended, rather than the opposite (an example of the opposite: the work being misrepresented, de-legitimized as a result, and then neglected in a way that causes harm). In a sense, the latter is much less surprising, because the final mechanism of harm is still what proponents of AI safety are concerned about (chiefly, unaligned AI). Here, I’m more interested in “surprising” ways the work could cause harm.
- “AI safety work advancing AGI more than it aligns it”
- “influencing [major] international regulatory organisation in a way leading to creating some sort of ‘AI safety certification’ in a situation where we don’t have the basic research yet, creating false sense of security/fake sense of understanding”
- “influencing important players in AI or AI safety in a harmful leveraged way, e.g. by bad strategic advice”