Could you possibly elaborate on what you mean here: “perhaps it adopts an explicit position of metaethical uncertainty… but avoids applying this to its own values”? I’m very much still learning about alignment, but my instinct is that AI value alignment would result in an AI that acknowledges unavoidable moral uncertainty. Do you think that alignment efforts push for certainty when there really isn’t any? I’m just trying to get a sense of your perspective.