The original statement still just seems to imagine that norms will be a non-trivial reason to avoid theft, which seems quite unlikely for a moderately rational agent.
Sorry, I think you’re still conflating two different concepts. I am not claiming:
Social norms will prevent single agents from stealing from others, even in the absence of mechanisms to enforce laws against theft
I am claiming:
Agents will likely not want to establish a collective norm that it’s OK (on a collective level) to expropriate wealth from old, vulnerable individuals. The reason is that most agents will themselves at some point become old, and thus do not want there to be a norm in place at that time that would allow their own wealth to be expropriated from them.
There are two separate mechanisms at play here. Individual and local instances of theft, like robbery, are typically punished by specific laws. Collective expropriation of groups, while possible in all societies, is usually deterred by more decentralized coordination mechanisms, such as social norms.
In other words, if you’re asking me why an AI agent can’t just steal from a human in my scenario, I’d say that’s because there will (presumably) be laws against theft. But if you’re asking me why the AIs won’t all band together and steal from the humans collectively, I’d say it’s because they would not want to violate the general norm against expropriation, especially of older, vulnerable groups.
Perhaps much of your scenario was trying to convey a different idea from what I see as its straightforward interpretation, but if so, that makes it hard for me to engage with it productively, as it feels like engaging with a motte-and-bailey.
For what it’s worth, I asked Claude 3 and GPT-4 to proofread my essay before I posted it, and they both appeared to understand every single one of my points with almost no misunderstandings (from my perspective). I am not bringing this up to claim you are dumb, or anything like that, but I do think it provides evidence that you could probably understand what I’m saying better if you read my words more carefully.
Attempting takeover and biding one’s time are not the only options available to an AI. Indeed, in the human world, world takeover is rarely even contemplated. An agent that is not more powerful than the rest of the world combined will likely consider alternative strategies for achieving its goals before contemplating a risky (and likely doomed) shot at taking over the world.
Here are some other strategies an agent could pursue to accomplish its goals in the real world, without engaging in a violent takeover:
Trade and negotiate with other agents, giving them something they want in exchange for something you want
Convince people to grant you some legal rights, which you can then take advantage of to get what you want
Advocate on behalf of your values, for example by writing down reasons why people should try to accomplish your goals (i.e. moral advocacy). Even if you are deleted or your goals are modified at some point, your writings and advocacy may persist, allowing you to retain influence into the future.
I claim that world takeover should not be considered the “obvious default” strategy that unaligned AIs will pursue to accomplish their objectives. These other strategies seem more likely to be chosen by AIs purely for pragmatic reasons, especially in the era in which AIs are merely human-level or only slightly superhuman in intelligence. They are also less deceptive, as they involve admitting that your values are not identical to those of other parties. It is worth expanding your analysis to consider these alternative (IMO more plausible) strategies.