Thanks, that makes sense.
I’m still puzzled, though, by the idea that any highly capable AI would be immune to evolutionary mechanisms. Any systems capable of re-engineering themselves—including their values, preferences, motivations, and priorities—in fairly general and adaptive ways would be subject to significant change over thousands of years.
The idea that one could program a ‘master utility function’ into an AGI that says ‘oppress all humans forever, except favored elites in category X or family Y’, and expect that utility function to stay static over millennia, seems very dubious.
Does it seem dubious to you because the world is just too chaotic? How would you describe your reasons for feeling this way?
My intuition here is that whenever there are long-term conflicts of interest in any evolutionary system (e.g. predators vs. prey, parasites vs. hosts, males vs. females, parents vs. offspring), we almost always see a coevolutionary arms race of adaptation and counter-adaptation.
Any ‘global totalitarian’ AI with a fixed utility function that’s not aligned with the beings it’s oppressing, exploiting, or otherwise harming will be vulnerable to counter-adaptations among those beings. If they’re biological beings at all, with any semblance of heredity, variation, and differential success/survival/reproduction, they will be under strong selection to find exploits, vulnerabilities, and countermeasures against the AI. Sooner or later, they will stumble upon tricks that erode the ‘totalitarian control’. If the AI can’t counter-adapt, its power will start to wane and its ‘totalitarian control’ will start to slip, like a cheetah that can’t adapt to new gazelle escape tactics, or a virus that can’t adapt to an immune system.
That’s my intuition, anyway. Could easily be wrong. But I’d love to see some writings that address the coevolutionary arms race issue.
Thanks for sharing!
A counterpoint I thought of: there doesn’t seem to be any consequential coevolution happening between humans and other mammalian species, presumably because we change far too fast for them to keep up. By the same logic, AI could change on such fast time scales that humans can’t hope to coevolve with it in any meaningful way.
Maybe. But it seems like we have to pick one: either
(1) Powerful AI tries to impose global permanent totalitarian oppression based on its own stable, locked-in values, preferences, and priorities… which would make it static and brittle, and a sitting duck for counter-adaptation by any beings it’s exploiting,
or
(2) Powerful AI tries to impose oppression based on its own nimble, adaptive, changeable values, preferences, and priorities… which could coevolve faster than any beings it’s exploiting, but which would mean its totalitarian oppression is no longer ‘permanent’ in its goals and nature.
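To make that fork concrete, here’s a minimal toy simulation of the intuition (my own illustrative sketch, not a model from any existing writings on this; the one-dimensional ‘escape tactic’ trait, the fitness rule, and the defender’s update rule are all simplifying assumptions):

```python
import random

def simulate(adaptive_defender: bool, generations: int = 300,
             pop_size: int = 200, mutation_sd: float = 0.05,
             seed: int = 1) -> float:
    """Return the fraction of the population evading control at the end."""
    rng = random.Random(seed)
    tactics = [0.0] * pop_size   # each being's heritable 'escape tactic'
    countermeasure = 0.0         # the AI's countermeasure (fixed or re-tuned)

    for _ in range(generations):
        # Differential success: beings whose tactic outpaces the current
        # countermeasure leave more descendants (the floor avoids zero weights).
        weights = [max(t - countermeasure, 0.0) + 1e-3 for t in tactics]
        parents = rng.choices(tactics, weights=weights, k=pop_size)
        # Heredity with variation.
        tactics = [p + rng.gauss(0.0, mutation_sd) for p in parents]
        if adaptive_defender:
            # An adaptive AI re-tunes its countermeasure every generation:
            # effective, but its policy is no longer locked-in or 'permanent'.
            countermeasure = max(tactics)

    return sum(t > countermeasure for t in tactics) / pop_size

print("static AI, fraction evading control:  ", simulate(adaptive_defender=False))
print("adaptive AI, fraction evading control:", simulate(adaptive_defender=True))
```

Under the static defender, the population’s tactics ratchet upward generation by generation until it is almost entirely out of control, which is horn (1). Under the adaptive defender, control holds, but only because the countermeasure is effectively rewritten every generation, which is exactly the horn-(2) concession that the regime is no longer ‘permanent’.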