Thanks for this, I think it is an important and under-discussed point. In their AI alignment work, EAs seem to be aiming for intent-alignment rather than social welfare production, which I think is plausibly a very large mistake, or at least one that hasn’t received very much scrutiny.
Incidentally, I also don’t know what it means to say that we have aligned AIs with ‘our values’. Since there is substantial disagreement about values, ‘our’ has no clear referent here.