(The following is long, sorry about that. Maybe I should have written it up already as a normal post. A one sentence abstract could be: “Social media algorithms could be dangerous as a part of the overall process of leading people to ‘consent’ to being lesser forms of themselves to further elite/AI/state goals, perhaps threatening the destruction of humanity’s longterm potential.”)
It seems plausible to me that something like algorithmic behavior modification (social media algorithms are, to some extent, designed to modify human behavior, and could be early examples of the phenomenon) could bend human preferences so that future humans freely (or “freely”?) choose things that we (the readers of this comment? reflective humans of 2020?) would consider non-optimal. If you combine that with the possibility of algorithms recommending changes in human genes, it becomes possible to rewrite human nature (with the consent of humans) into a form that AI (or the elite who control AI) find more convenient. For instance, humans could be simplified so that they consume fewer resources or present less of a political threat. The simplest humans are blobs of pleasure (easily satisfied hedonists) and/or “yes machines” (people who prefer cheap and easy things, and whose preferences are thus trivial to satisfy). Whether this technically counts as existential risk, I’m not sure. It might be considered a “destruction of humanity’s longterm potential”. Part of human potential is the potential of humans to be something.
I suggest that “freely” ought to be in quotes for two reasons. One is the “scam phenomenon”. A scammer can get a mark into a mindset in which they do things they wouldn’t ordinarily do (withdraw a large sum of money from their bank account and give it to the scammer, just because the scammer asks for it). The scammer never puts a gun to the mark’s head. They just give them a plausible-enough story, and perhaps build a simple relationship, skillfully but not forcefully suggesting that the mark has something to gain from giving, or some obligation compelling it. If, after “giving” the money, the mark wises up and feels regret, they might appeal to the police: surely they were psychologically manipulated. And they were: they were in a kind of dream world woven by the scammer, who never forced anything but who drew the mark into an alternate reality. In some sense what happened was criminal, a form of theft. But the police will say, “It was of your own free will.” The police are somewhat correct. The mark was “free” in one sense, but in another sense, the mark was not. We might fear that an algorithm (or AI) could be like a sophisticated scammer and scam the human race, much as some humans have scammed large numbers of other humans before.
The second reason is that the adoption of changes (notably technological, but also social), of which changing human genes would be one example and accepting algorithmic behavior modification another, is only in a limited sense a satisfaction of the preferences of humans, or the result of their conscious decision. In the S-shaped curve of adoption, there are early adopters, late/non-adopters, and people in the middle. Early adopters probably really do affirm the innovations they adopt. Late or non-adopters probably really do have some kind of aversion to them. These people have true opinions about innovations. But most people, in the middle of the curve, are incentivized to a large extent by “doing whatever looks popular, is becoming popular, or seems clearly destined to be popular”. So technological adoption, or the adoption of any other innovation, is not necessarily something we as a whole species truly prefer or decide on; there is simply enough momentum that we find ourselves falling in line.
I think more likely than the extreme of “blobs of pleasure / yes machines” are people who lack depth, are useless, and live in a VR dream world. On some deeper level they would be analogous to blobs/yes machines, but their subjective experience, on the surface, would be more recognizably human. Their lives would be positive on some level, such that an altruistic/paternalistic AI, or the AI-controlling elite, could feel they were doing right by them. But their lives would lack dimensions that perhaps the AI or elite wouldn’t think to include in their experience (the people’s, or even the elite’s/AI’s own). The people might never have to pay a significant price for anything and thus never value things (or other people) in a deeper way. They might be incapable of desiring anything other than “this life”, such as a “spiritual world” (or something like one, a place of greater meaning), something the author of Brave New World, Christians, and Nietzscheans would all object to. In some objective sense, perhaps capability (capability to secure your own well-being, capability in general, the ability to act in a way that really matters) is itself part of human well-being (and so civilization is both progress and regress, as we make people less and less capable of, say, growing their own food, because of all the conveniences and safety we build up). We could open the thought further: perhaps there is some objective state of affairs, something other than human perceptions of well-being or preference-satisfaction, which constitutes part of human well-being. Perhaps it is to be rightly related to reality (properly believing in God, or properly not believing in God, as the case may be).
So we might need to figure out exactly what human well-being is, or, if we can’t figure it out in advance for the whole human species (after all, each person has a claim to knowing what human well-being is), then try to keep technology and policy from hampering each person’s ability to discover and pursue true human well-being. One could see in hedonism and preferentialism a kind of attempt at value agnosticism: we no longer say that God (a particular understanding of God), or the state, or some sacred site is the Real Value; we instead say, “we as the state will support you, or at least not hinder you, in your preference for God, the state, or the sacred site, whatever you want, as long as it doesn’t get in the way of someone else’s preference; whatever makes you happy”. But preferentialism and hedonism aren’t value-agnostic if, through their shaping of a person’s experience, they start to imply: “none of your sacred things are worth anything; we’re just going to make you into a blob of pleasure who says yes on most levels, with a veneer of human experience on the surface of your consciousness.” I think a truly value-agnostic state/elite/AI ought perhaps to try to maximize “the ability of each person to secure their own decision-making ability and basic physical movement”, which could be taken as a proxy for maximizing each person’s agency, and thus their ability to discover and pursue true human well-being. And it should make fewer and fewer decisions for the populace, trying to make itself less and less necessary from a paternalistic point of view. Rather than paternalism, adopt a parental view: parents tend to want their children to be capable, and to become, in a sense, their equals. All these are things that altruists who might influence the AI-controlling elite in the coming decades or centuries, or those who might want to align AI, could take into account.
We might be concerned with AI alignment, but we should also be concerned with the alignment of human civilization, or its non-alignment, its drift. Fast-takeoff AI gives us stark stories in which someone accidentally misaligns an AI to a fake utility function and it damages human experience and/or existence suddenly and irrevocably, a fate we worry about and try to avoid. But slow-takeoff AI, I think, would involve the emergence of many powerful Tool AIs, each of which (I would expect) would be designed to be basically controllable by some human and to not obviously kill anyone or cause comparably clear harm (analogous to the design of airplanes, bridges, etc.), which is what “alignment” means in that context [correct me if I’m wrong]; none of which is explicitly designed to take care of human well-being as a whole (something a fast-takeoff aligner might consciously worry about and decide on); no one of which rules decisively; all of which would sit in some kind of equilibrium reminiscent of democracy, capitalism, and the geopolitical world. They would be more a continuation of human civilization than a break with it. Because the imposition of a fake utility function in a slow-takeoff civilizational evolution is slow and “consensual”, it is not stark, and we can “sleep through it”. The fact that Nietzsche and Huxley raised their complaints against this drift long ago shows that it is a slow and relatively steady one, a gradual iteration on the status quo, easy for us to discount or adapt to. Social media algorithms are just its most recent expression.