zdgroff—that link re: specific preferences to the 80k Hours interview with Stuart Russell is a fascinating example of what I’m concerned about. Russell seems to be arguing that we either align an AI system with one person’s individual stated preferences at a time, or we discover the ultimate moral truth of the universe and align the AI to that.
But where’s the middle ground of trying to align with multiple people who hold diverse values? That’s where most of the near-term X-risk lurks, IMHO—e.g. in runaway geopolitical or religious wars, or other human conflicts amplified by AI capabilities, even if we’re talking about fairly narrow AI rather than AGI.