My own take is that while I don’t want to defend the “find a correct utility function” approach to alignment as sufficient at this time, I do think it is actually necessary, and that the modern era is an anomaly in how much misalignment we can get away with, because it is kept in check by institutions that go beyond any individual.
The basic reason we can get away with not solving the alignment problem is that humans depend on other humans, and in particular you cannot replace humans with much cheaper workers whose preferences can be controlled arbitrarily.
AI threatens to remove the need to depend on other humans, and that dependence is a critical part of how we get away with not having the correct utility function.
I like the Intelligence Curse series because it points out that when an elite doesn’t need the commoners for anything, and the commoners have no selfish value to the elite, then by default the elites starve the commoners to death unless the elites are value aligned with them.
The Intelligence Curse series is below:
https://intelligence-curse.ai/defining/
In this analogy, the AIs are the elites and the rest of humanity are the commoners.