I feel like the word “values” makes this sound more complex than it is, and I’d say we instead want the agent to understand and act in line with what the human wants / intends.
Doesn’t “wants / intends” make this sound less complex than it is? To me this phrasing connotes (not to say you actually believe this) that the goal is for AIs to understand short-term human desires, without accounting for ways in which our wants contradict what we would value in the long term, or ways that individuals’ wants can conflict. Once we add caveats like “what we would want / intend after sufficient rational reflection,” my sense is that “values” just captures that more intuitively. I haven’t surveyed people on this, though, so this definitely isn’t a confident claim on my part.
Once we add caveats like “what we would want / intend after sufficient rational reflection,” my sense is that “values” just captures that more intuitively.
I in fact don’t want to add in those caveats here: I’m suggesting that we tell our AI system to do what we short-term want. (Of course, we can then “short-term want” to do more rational reflection, or to be informed of true and useful things that help us make moral progress, etc.)
I agree that “values” more intuitively captures the thing with all the caveats added in.