Look into suffering-focused AI safety (and s-risks more broadly), which I think is extremely important and neglected.
More specifically, I think there’s a good case to be made* that most of the expected disvalue of suffering risks comes from cooperation failures, so I’d especially encourage people who are interested in suffering risks and AI to look into cooperative AI and cooperation on AI. (These are areas mentioned in the paper you cite and in related writing.)
(*Large-scale efforts to create disvalue seem like they would be much more harmful than smaller-scale or unintentional actions, especially as technology advances. And the most plausible reason I’ve heard for why such efforts might happen is that various actors might commit to creating disvalue under certain conditions as a way to coerce other agents, and would then carry out these threats if those conditions come about. This would leave everyone worse off than they could have been, so it is a sort of cooperation failure. Sadism seems like less of a big deal in expectation, because many agents have incentives to engage in coercion, while relatively few agents are sadists.)
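(To make the footnote's reasoning slightly more concrete, here is a minimal toy sketch. The game structure and every payoff number are illustrative assumptions of mine, not anything from the paper or from this comment; the only point is that an executed threat leaves both the threatener and the target worse off than the no-threat baseline, which is the sense in which carried-out threats are a cooperation failure.)

```python
# Toy threat game (illustrative only; all payoffs are made-up assumptions).
# A "threatener" can commit to carrying out a costly threat unless a "target"
# concedes. If the threat is executed, both parties end up worse off than if
# no threat had been made -- a cooperation failure.

# Payoffs as (threatener, target); higher is better.
NO_THREAT = (0, 0)        # baseline: no commitment is made
CONCEDE = (5, -5)         # target gives in, threatener gains
EXECUTE = (-10, -100)     # threat carried out: costly for both, terrible for the target

def outcome(threatener_commits: bool, target_concedes: bool):
    """Payoff pair for one play of the toy game."""
    if not threatener_commits:
        return NO_THREAT
    return CONCEDE if target_concedes else EXECUTE

if __name__ == "__main__":
    for commits in (False, True):
        for concedes in (False, True):
            t, v = outcome(commits, concedes)
            print(f"commit={commits!s:5}  concede={concedes!s:5}  "
                  f"threatener={t:4}  target={v:4}")
    # The executed-threat outcome (-10, -100) is worse for *both* parties than
    # the no-threat baseline (0, 0): the incentive to threaten comes entirely
    # from the hope that the target will concede.
```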
(Closer to my own interest in them, cooperation failures also seem like one of the main things that could prevent humanity from creating thriving futures, so this seems like an area that people with a wide range of perspectives on the value of the future can work together on :)