Can you spell your argument out in more detail? I get the sense that you think AI doom is obvious given misalignment, and I’m trying to get you to see that there seem to be many implicit steps in the argument that you’re leaving out.
For example, one such step in the argument seems to be: “If an entity is powerful and misaligned, then it will be cost-efficient for that entity to kill everyone else.” If that were true, you’d probably expect some precedent, like powerful entities in our current world murdering everyone to get what they want. To some extent that may be true. Yet, while I admit wars and murder have happened a lot, overall the world seems fairly peaceful, despite vast differences in wealth and military power.
Plausibly you think that, OK, sure, in the human world, entities like the US government don’t kill everyone else to get what they want, but that’s because humans are benevolent and selfless. And my point is: no, I don’t think humans are. Most humans are basically selfish. You can verify this by measuring how much of their disposable income people spend on themselves and their family, as opposed to strangers. Sure, there’s some altruism present in the world. I don’t deny that. But some non-zero degree of altruism seems plausible in an AI misalignment scenario too.
So I’m asking: what exactly about AIs makes it cost-efficient for them to kill all humans? Perhaps AIs will lead to a breakdown of the legal system and they won’t use it to resolve their disputes? Maybe AIs will all gang up together as a unified group and launch a massive revolution, ending with a genocide of humans? Make these assumptions explicit, because I don’t find them obvious. I see them mostly as speculative assertions about what might happen, rather than what is likely to happen.
Maybe the AIs all team up together. Maybe some ally with us at the start and backstab us down the line. I don’t think it makes a difference. When tangling with entities much smarter than us, I’m sure we get screwed somewhere along the line.
The AI needs to marginalise us/limit our power so we’re not a threat. At that point, even if it’s not worth the effort to wipe us out then and there, slowly strangling us should only take marginally more resources than keeping us marginalised. My expectation is that it should almost always be worth the small bit of extra effort to cause a slow decline.
This may even occur naturally with an AI gradually claiming more and more land. At the start, it may be focused on developing its own capacity and not be bothered to chase down humans in remote parts of the globe. But over time, an AI would likely spread out to claim more resources, at which point it’s more likely to decide to mop up any humans lest we get in its way. That said, it may have no reason to mop us up if we’re just going to die out anyway.
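To make that cost comparison concrete, here’s a toy sketch (every number is invented purely for illustration; “containment” and “elimination” are stand-ins for whatever the real options look like). The shape of the argument is that containment is a recurring cost while elimination is a one-time cost, so for a patient agent the one-time option tends to win over a long horizon:

```python
# Toy model: perpetual containment cost vs. one-time elimination cost.
# All quantities are made-up illustrative numbers, not estimates.

containment_cost_per_year = 1.0  # recurring cost of keeping humans marginalised (arbitrary units)
extra_elimination_cost = 10.0    # hypothetical one-time extra cost of wiping humans out
discount_rate = 0.01             # how steeply the agent discounts future costs

# Present value of paying the containment cost forever (a perpetuity): c / r.
pv_containment_forever = containment_cost_per_year / discount_rate

print(f"PV of perpetual containment: {pv_containment_forever:.1f}")  # 100.0
print(f"One-time elimination cost:   {extra_elimination_cost:.1f}")  # 10.0
print("Elimination cheaper?", extra_elimination_cost < pv_containment_forever)  # True
```

Of course, the conclusion is only as good as the inputs: with a short horizon, a high discount rate, or containment costs that fall toward zero (say, because humans stop being any threat at all), the comparison can flip.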
“When tangling with entities much smarter than us, I’m sure we get screwed somewhere along the line.”
This is probably the key point of disagreement. You seem to be “sure” that catastrophic outcomes happen when individual AIs are misaligned, whereas I’m saying “It could happen, but I don’t think the case for that is strong”. I don’t see how a high level of confidence can be justified given the evidence you’re appealing to. This seems like a highly speculative thesis.
Also, note that my argument here is meant as a final comment in my section about AI optimism. I think the more compelling argument is that AIs will probably care for humans to a large degree. Alignment might be imperfect, but it sounds like the outcomes you’re talking about require both uniformity and extreme misalignment among AIs, and I don’t see why we should think that’s particularly likely given the default incentives of AI companies.
“When tangling with entities much smarter than us, I’m sure we get screwed somewhere along the line.”
“This seems like a highly speculative thesis.”
I think it’s more of an anti-prediction tbh.