Can you give some examples where individual humans have a clear strategic decisive advantage (i.e. very low risk of punishment), where the low-power individual isn’t at a high risk of serious harm?
Why are we assuming a low risk of punishment? Risk of punishment depends largely on social norms and laws, and I’m saying that AIs will likely adhere to a set of social norms.
I think the central question is whether these social norms will include the norm “don’t murder humans”. I think such a norm will probably exist, unless almost all AIs are severely misaligned. I think severe misalignment is possible; one can certainly imagine it happening. But I don’t find it likely, since people will care a lot about making AIs ethical, and I’m not yet aware of any strong reasons to think alignment will be super-hard.
Why are we assuming a low risk of punishment? Risk of punishment depends largely on social norms and laws, and I’m saying that AIs will likely adhere to a set of social norms.
I think the central question is whether these social norms will include the norm “don’t murder humans”. I think such a norm will probably exist, unless almost all AIs are severely misaligned. I think severe misalignment is possible; one can certainly imagine it happening. But I don’t find it likely, since people will care a lot about making AIs ethical, and I’m not yet aware of any strong reasons to think alignment will be super-hard.