This seems really important and under-discussed:
Interactions with AI safety: There was little convergence among participants on whether AI safety and digital mind welfare efforts will align or conflict. Potential synergies mentioned included alignment reducing the need for coercive control, and shared technical tools (e.g., interpretability methods). Potential conflicts included safety measures, such as monitoring and shutdown protocols, that could harm digital minds; welfare protections limiting the ability to control AI behavior; and competition for scarce funding, talent, and regulatory attention.
The brand-new episode with Kyle Fish from Anthropic (released since you wrote this comment) discusses some reasons why AI safety and AI welfare efforts might conflict or be mutually beneficial, if you’re interested!
Thanks, I’ll have a listen!