Recent advances in LLMs have led me to update toward believing that we live in a world where alignment is easy (i.e. CEV naturally emerges from LLMs, and future AI agents will be based on understanding and following natural-language commands by default), but governance is hard (i.e. AI agents might be co-opted by governments or corporations to lock humanity into a dystopian future, and the current geopolitical environment, characterized by democratic backsliding, cold-war mongering, and an increase in military conflicts including wars of aggression, isn't conducive to robust multilateral governance).
This is roughly my take, with the caveat that I'd replace CEV with instruction following, and I wouldn't be so sure that alignment is easy (though I do think we can replace that assumption with the weaker one that solving the AI alignment problem is highly incentivized and that the problem is actually solvable).