Imitation learning could easily become an extinction risk if the individuals or groups being imitated actively desire human extinction, or even just death to a high proportion of humans. Many do.
Radical eco-activists (e.g. Earth First) have often called for voluntary human extinction, or at least massive population reduction.
Religious extremists (e.g. Jihadist terrorists) have often called for death to all non-believers (e.g. the 6 billion people who aren’t Muslim.)
Antinatalists and negative utilitarians are usually careful not to call for extinction or genocide as a solution to ‘suffering’, but calls for human extinction seem like a logical outgrowth of their world-view.
Many kinds of racists actively want the elimination, or at least reduction, of other races.
I fear that any approach to AI safety that assumes the whole world shares the same values as Bay Area liberals will utterly fail when advanced AI systems become available to a much wider range of people with much more misanthropic agendas.
Thanks for the comment, Geoffrey! I strongly upvoted it because I think it points to a discussion which is important to have.
Imitation learning could easily become an extinction risk if the individuals or groups being imitated actively desire human extinction, or even just death to a high proportion of humans.
I think such individuals or groups will not be the ones training the most powerful models. Gemini cost around $630M to train, and the development cost of the leading models is expected to continue to increase. I appreciate that the cost of a model of a given capability will decrease over time due to improvements in hardware and software. However, by the time terrorist individuals or groups have the resources to train a model as capable as e.g. Gemini, the leading models will be much more powerful. As long as the leading models are imitating most humans (as they seem to be now), who are not in favour of unilaterally causing human extinction, I think extinction caused this way would remain extremely unlikely.
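To make this capability-lag point concrete, here is a minimal sketch in Python. The only figure taken from the comment above is the roughly $630M Gemini training cost; the small-actor budget, the yearly cost decline for a fixed capability level, and the growth rate of frontier training spend are purely illustrative assumptions, not estimates.

```python
# Toy illustration of the "capability lag" argument: if the cost of reaching a
# fixed capability level falls each year while frontier training budgets keep
# growing, a small actor reaching today's frontier still trails the
# then-current frontier by several years.
# All rates below are illustrative assumptions; only the ~$630M Gemini training
# cost comes from the comment above.

FRONTIER_COST_2024 = 630e6      # $; reported Gemini training cost
SMALL_ACTOR_BUDGET = 10e6       # $; assumed budget of a small group
COST_DECLINE_PER_YEAR = 0.5     # assumed: cost of a fixed capability halves yearly
FRONTIER_GROWTH_PER_YEAR = 2.5  # assumed: frontier training spend grows 2.5x per year

years = 0
cost_of_2024_capability = FRONTIER_COST_2024
while cost_of_2024_capability > SMALL_ACTOR_BUDGET:
    cost_of_2024_capability *= COST_DECLINE_PER_YEAR
    years += 1

frontier_spend_then = FRONTIER_COST_2024 * FRONTIER_GROWTH_PER_YEAR ** years
print(f"Years until a ${SMALL_ACTOR_BUDGET/1e6:.0f}M budget buys 2024-frontier capability: {years}")
print(f"Frontier training spend by then (under the same assumptions): ${frontier_spend_then/1e9:.1f}B")
```

Under these (made-up) rates, a $10M actor reaches 2024-frontier capability only after about 6 years, by which time frontier training runs would be spending two to three orders of magnitude more.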
Radical eco-activists (e.g. Earth First) have often called for voluntary human extinction, or at least massive population reduction.
In my mind, there is still a big difference between calling for human extinction and being willing to unilaterally cause human extinction. To illustrate, the vast majority of people arguing for a smaller population would not be willing to kill people even if there were no consequences to themselves.
Religious extremists (e.g. Jihadist terrorists) have often called for death to all non-believers (e.g. the 6 billion people who aren’t Muslim.)
6 billion deaths would be terrible, but still quite far from human extinction. The global population reached 2 billion in 1927, i.e. only 97 years ago.
Moreover, I assume religious extremists want to increase the long-term number of Muslims, and killing the 6 billion people who are not Muslim seems to be a very suboptimal strategy for doing that. If Jihadist terrorists had an AI model capable of doing this (which is much harder than just killing 6 billion random people), they could also use such a model to convert people who are not Muslim, or to gain greater influence in the world via other means (e.g. coming up with new technological inventions, and sustainably increasing their offspring).
In addition, I wonder whether there would still be Jihadist terrorists if they had the ability to become much richer with their own model. I suspect a key reason they are willing to sacrifice themselves is that their current lives are not great, but a model capable of causing human extinction could much more easily be used to increase their wealth and quality of life.
Antinatalists and negative utilitarians are usually careful not to call for extinction or genocide as a solution to ‘suffering’, but calls for human extinction seem like a logical outgrowth of their world-view.
Many kinds of racists actively want the elimination, or at least reduction, of other races.
I believe the points I mentioned above apply to these groups too.
I fear that any approach to AI safety that assumes the whole world shares the same values as Bay Area liberals will utterly fail when advanced AI systems become available to a much wider range of people with much more misanthropic agendas.
I guess you are assuming the amount of resources needed to cause human extinction will dramatically go down with advanced AI, and therefore worry that increasingly many individuals and groups will have the ability to cause human extinction. However, I do not think the absolute amount of resources controlled by terrorist groups is the key metric. I would say what matters is the offense-defense balance, such that the risk of human extinction depends on the fraction of global resources controlled by terrorist groups. Historical trends suggest people with Bay Area values will control an increasingly large fraction of global resources, and terrorist groups an increasingly small fraction, which makes it harder for terrorist groups to cause human extinction. Historical terrorist attack deaths also seem to suggest an astronomically low probability of a terrorist attack causing human extinction.
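As a rough illustration of how far historical terrorist attacks are from extinction-level casualties, here is a back-of-the-envelope calculation in Python. The only inputs are the roughly 3,000 deaths of the deadliest attack to date (9/11) and a world population of about 8.1 billion; this is an order-of-magnitude gap check, not a probability estimate.

```python
# Rough orders-of-magnitude check behind the last point: how far historical
# terrorist attack deaths are from extinction-level casualties.
# Illustrative gap calculation only, not a probability estimate.
import math

deadliest_attack_deaths = 3_000      # approx. 9/11 death toll
world_population = 8_100_000_000     # approx. 2024 world population

gap_factor = world_population / deadliest_attack_deaths
doublings_needed = math.log2(gap_factor)

print(f"Extinction-level casualties exceed the deadliest attack by ~{gap_factor:,.0f}x,")
print(f"i.e. roughly {doublings_needed:.0f} successive doublings of the worst historical attack.")
```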
This argument seems extremely naive.