I don't think you're wrong exactly, but AI takeover doesn't have to happen through a single violent event, or through a treacherous turn or whatever. All of your arguments also apply to the situation with H. sapiens and H. neanderthalensis, but those factors did not prevent the latter from going extinct largely due to the activities of the former:
There was a cost to the violence that humans committed against Neanderthals
The cost of using violence was not obviously smaller than the benefits of using violence: there was a strong motive for the Neanderthals to fight back, and using violence risked escalation, whereas peaceful trade might have avoided those risks
There was no one human that controlled everything; in fact, humans likely often fought against one another
You allow for Neanderthals to be less capable or coordinated than humans in this analogy, which they likely were in many ways
The fact that those considerations were not enough to prevent Neanderthal extinction is one reason to think they are not enough to prevent AI takeover, although of course the analogy is not perfect or conclusive, and it's just one reason among several. A couple of relevant parallels include:
If alignment is very hard, that could mean AIs compete with us over resources that we need to survive or flourish (e.g., land, energy, other natural resources), similar to how humans competed over resources with Neanderthals
The population of AIs may be far larger, and grow more rapidly, than the population of humans, similar to how human populations were likely larger and growing at a faster rate than those of Neanderthals
I want to distinguish between two potential claims:
(1) When two distinct populations live alongside each other, sometimes the less intelligent population dies out as a result of competition and violence with the more intelligent population.
(2) When two distinct populations live alongside each other, by default, the more intelligent population generally develops convergent instrumental goals that lead to the extinction of the other population, unless the more intelligent population is value-aligned with the other population.
I think claim (1) is clearly true and is supported by your observation that Neanderthals went extinct, but I intended to argue against claim (2) instead. (Although, separately, I think the evidence that Neanderthals were less intelligent than Homo sapiens is rather weak.)
Despite my comment above, I do not actually have much sympathy towards the claim that humans can't possibly go extinct, or that our species is definitely going to survive in a relatively unmodified form over the very long run, i.e., the next billion years. (Indeed, perhaps like the Neanderthals, our best hope of surviving in the long run may come from merging with the AIs.)
It's possible you think claim (1) is sufficient in some sense to establish some important argument. For example, perhaps all you're intending to argue here is that AI is risky, which, to be clear, I agree with.
On the other hand, I think that claim (2) accurately describes a popular view among EAs, albeit with some dispute over what counts as a "population" for the purpose of this argument, and what counts as "value-aligned". While important, claim (1) is simply much weaker than claim (2), and consequently implies fewer concrete policy prescriptions.
I think it is important to critically examine (2) even if we both concede that (1) is true.