Could you expound on this or maybe point me in the right direction to learn why this might be?
I tend to agree with the intuition that s-risks are unlikely because they occupy a small part of possibility space and nobody is really aiming for them. I can see a risk that systems trained to produce eudaimonia will instead produce −1 × eudaimonia, but I can't see how that justifies thinking that an astronomically bad outcome is more likely than an astronomically good one. Surely a random sign flip is less likely than not.
Yeah, that seems likely. Astronomically bad seems much more likely than astronomically good to me though.