I think the AI Notkilleveryoneism Memes ⏸️ (@AISafetyMemes) Twitter account reasonably often says things that feel at least close to crying wolf. (E.g., in response to our recent paper “Alignment Faking in Large Language Models”, they posted a tweet which implied that we caught the model trying to escape in the wild. I tried to correct possible misunderstandings here.)
I wish they would stop doing this.
They are on the fringe IMO and often get called out for this.