I think that’s a valid worry, and I also don’t expect the standards to end up specifying how to solve the alignment problem. :P I’d still be pretty happy about the proposed standard-setting efforts, because I expect standards to have massive effects that can be more or less useful for:
a) steering research in directions that reduce long-term risks (e.g. pushing for more mechanistic interpretability),
b) limiting how quickly an agentic AI could escape our control (e.g. by regulating internet access or making manipulation harder),
c) enabling strong(er) international agreements (e.g. shared standards could become the basis for international monitoring of AI development and deployment).