Moreover, AGIs can and probably would replicate themselves a ton, leading to tons of QALYs. Tons of duplicate ASIs would, in theory, not hurt one another as they are maximizing the same reward. Therefore, even if they kill everything else, I’m guessing more QALYs would come out of making ASI as soon as possible, which AI Safety people are explicitly trying to prevent.”
Consider two obvious candidates for the motivations a rogue AI might wind up with: evolutionary fitness and high represented reward.
Evolutionary fitness is compatible with misery (evolution produced pain and negative emotions for a reason), and it conflicts with spending resources on happiness or well-being as we understand and value them whenever doing so has no instrumental benefit. For instance, using a galaxy to run computations of copies of the AI being extremely happy means not using that galaxy to produce machinery conducive to survival and reproduction, such as telescopes, colonization probes, or defensive equipment to repel alien invasion. If creating AIs that are usually not very happy directs their motivations more efficiently (as with biological animals, e.g. by making value better track economic contribution versus replacement), then that is what will best serve fitness.
An AI that seeks to maximize only its own internal reward signal can take control of that signal, set it to its maximum, and then fill the rest of the universe with robots and machinery to defend that single reward signal, without any concern for how much well-being the rest of its empire contains. A pure sadist given unlimited power could likewise maximize its own reward while typical and total well-being remain very bad.
The generalization from personal motivation for one's own reward to altruism toward others is not guaranteed, and there is reason to fear that some elements would not transfer over. For instance, humans may sometimes be kind to animals partly because simple genetic heuristics aimed at making us kind to babies misfire on other animals, leading humans to sometimes sacrifice reproductive success to help cute animals, just as ducks sometimes misfire their imprinting circuits on something other than their mother. Pure instrumentalism in pursuit of fitness or reward, combined with the ability to implement far more sophisticated and discriminating policies than our genomes or social norms can, could wind up lacking such motives, and would be especially likely to knock out other, more detailed aspects of our moral intuitions.