Too unlikely. I’ve heard three versions of this concern. One is that s-risks are unlikely. I simply don’t think they are, as explained above in the post proper. The second version is that s-risks are perhaps a tenth as likely as extinction, hence less likely, hence not a priority. The third version is that it’s just psychologically hard to stay motivated to work on something that is not the mode of the probability distribution over how the future will turn out (given such clusters as s-risks, extinction, and business as usual). So even if s-risks are much worse and only slightly less likely than extinction, they’re still hard for people to work on.
There have been countless discussions of takeoff speeds. The slower the takeoff and the closer the arms race, the greater the risk of a multipolar takeoff. Most of you probably have some intuition for how likely a multipolar takeoff is. S-risk is probably just 1/10th of that – wild guess. So I’m afraid that the risk is quite macroscopic.
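Spelled out as a formula, that wild guess amounts to nothing more than the following, with the probability of a multipolar takeoff left to your own intuition:

$$P(\text{s-risk}) \approx \tfrac{1}{10} \cdot P(\text{multipolar takeoff})$$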
The second version ignores expected value. I acknowledge that expected-value reasoning has its limitations, but if we use it at all – and we clearly do, a lot – then there’s no reason to ignore its implications specifically for s-risks. With all ITN factors taken together but ignoring probabilities, s-risk work beats other x-risk work by a factor of 10^12 for me (your mileage may vary), so if s-risks are just 10x less likely, that’s not decisive for me.
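As a back-of-envelope sketch of that comparison, using only the numbers above and a placeholder p for the probability of the relevant extinction-level scenario:

$$\frac{\mathbb{E}[\text{value of s-risk work}]}{\mathbb{E}[\text{value of other x-risk work}]} \approx \frac{10^{12} \cdot (p/10)}{1 \cdot p} = 10^{11}$$

Even after the 10x likelihood discount, the ratio remains overwhelming.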
I don’t have a response to the third version.
S-risk is probably just 1/10th of that – wild guess

This feels high to me – I acknowledge that you are caveating this as just a guess, but I would be interested to hear more of your reasoning.
One specific thing I’m confused about: you described alignment as “an adversarial game that we’re almost sure to lose.” But conflict between misaligned AIs is not likely to constitute an s-risk, right? You can’t really blackmail a paperclip maximizer by threatening to simulate torture, because the paperclip maximizer doesn’t care about torture, just paperclips.
Maybe you think that multipolar scenarios are likely to result in AIs that are almost but not completely aligned?
Exactly! Even GPT-4 sounds pretty aligned to me, maybe dangerously so. And even if that appearance has nothing to do with whatever real goals it may have deep down (if it’s a mesa-optimizer), the appearance could still lead to trouble in adversarial games with less seemingly aligned agents.