I really don’t get the “simplicity” arguments for fanatical maximising behaviour. When you consider subgoals, it seems that secretly plotting to take over the world would obviously be much more complicated. Do you have any idea how much computing power and how many subgoals it takes to try to conquer the entire planet?
I think this is underspecified because:

- The hard part of taking over the whole planet is being able to execute a strategy that actually works in a world full of other agents (who are themselves vying for power), not the compute or complexity cost of merely having “take over the world” as a subgoal.
- The difficulty of taking over the world depends on the level of technology, among other factors. For example, taking over the world in the year 1000 AD was arguably impossible because you simply couldn’t administer an empire that large. Taking over the world in 2024 is perhaps more feasible, since we’re already globalized, but it’s still a near-impossible task.
- My best guess is that if some agent “takes over the world” in the future, it will look more like “being elected president of Earth” than “secretly plotting to release a nanoweapon at a precise time, killing everyone else simultaneously.” That’s because in the latter scenario, by the time some agent has access to super-destructive nanoweapons, the rest of the world likely has access to similarly powerful technology, including potential defenses against those nanoweapons (or nanoweapons of their own to threaten you with).