I think that’s an important question. Here are some thoughts (though this topic deserves a much more rigorous treatment):
Creating an AGI with an arbitrary goal system (one that is potentially much less satiable than humans’) and arbitrary game-theoretic mechanisms, via an ML process that can itself involve an arbitrary amount of ~suffering/disutility, generally seems very dangerous. Some of the relevant considerations are weird and non-obvious. For example, creating such an arbitrary AGI may constitute wronging some set of agents across the multiverse (due to that AGI’s goal system and game-theoretic mechanisms).
I think there’s also the general argument that, due to cluelessness, trying to achieve some form of a vigilant Long Reflection process is the best option on the table, including by the lights of suffering-focused ethics (e.g. due to weird ways in which resources could be used to reduce suffering across the multiverse via acausal trading). Interventions that mitigate x-risks (including AI-related x-risks) seem to increase the probability that humanity will achieve such a Long Reflection process.
Finally, a meta point that seems important: People in EA who have spent a lot of time on AI safety (including myself), or have even made it their career, probably have a motivated-reasoning bias toward the belief that working on AI safety tends to be net positive.