Having outsized influence could be enough, given the (dis)value at stake in the far future, which is still much larger than the disvalue from the deaths of everyone killed by AI. What ratio of probabilities would you assign between influencing values in a better direction and preventing extinction? Is that ratio small enough that the expected far-future impact deserves less overall weight than the reduction in the risk of everyone being killed by AI?
(FWIW, I don’t think it’s strictly necessary to explicitly “load” a value system to influence the kinds of values an AI system might have.)