AI Safety is Sometimes a Model Property

Arvind Narayanan and Sayash Kapoor at the AI Snake Oil newsletter have a recent post called “AI safety is not a model property.” I think the post makes a number of important points, along with some incorrect ones.

Boring bottom line up front: AI safety is sometimes a model property, in the sense that it sometimes makes sense to perform safety interventions at the model level. Other safety interventions belong at the system level or the deployment-context level. It takes skill and careful consideration to determine where in the AI deployment cycle the best interventions lie, and blanket statements excluding one level or the other are not helpful.