In Human Compatible, Stuart Russell makes an argument that I have heard him make repeatedly (I believe on the 80K podcast and in the FLI conversation with Steven Pinker). He advances a pretty bold and surprising claim:
[C]onsider how content-selection algorithms function on social media… Typically, such algorithms are designed to maximize click-through, that is, the probability that the user clicks on presented items. The solution is simply to present items that the user likes to click on, right? Wrong. The solution is to change the user’s preferences so that they become more predictable. A more predictable user can be fed items that they are likely to click on, thereby generating more revenue. People with more extreme political views tend to be more predictable in which items they will click on… Like any rational entity, the algorithm learns how to modify the state of its environment—in this case, the user’s mind—in order to maximize its own reward. The consequences include the resurgence of fascism, the dissolution of the social contract that underpins democracies around the world, and potentially the end of the European Union and NATO. Not bad for a few lines of code, even if it had a helping hand from some humans. Now imagine what a really intelligent algorithm would be able to do.
I don’t doubt that in principle this can and must happen in a sufficiently sophisticated system. What surprises me is the claim that it is happening now. In particular, I would expect that modifying human behavior to make people more predictable is quite hard, so that with the algorithms available today any gains from increased predictability would be swamped by (a) noise and (b) the gains from simply presenting the content that someone is most likely to click on given their present preferences.
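To make the magnitudes question concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration, not a model of any real recommender: the two content types, the starting click probabilities, the `delta` by which one exposure shifts the user's preference, and the names `expected_clicks`, `"greedy"`, and `"shaping"`. It compares the expected clicks from always showing what the user currently prefers with the expected clicks from always showing the content that shifts their preferences.

```python
def expected_clicks(policy, p_extreme=0.3, p_moderate=0.5, delta=0.001, horizon=100):
    """Expected total clicks over `horizon` recommendations in a toy model.

    Assumptions (all hypothetical):
    - The user clicks moderate content with fixed probability `p_moderate`.
    - The user clicks extreme content with probability `p`, starting at `p_extreme`.
    - Each time extreme content is shown, `p` rises by `delta` (capped at 1.0).
    """
    total = 0.0
    p = p_extreme
    for _ in range(horizon):
        if policy == "greedy":
            # Show whatever the user is most likely to click on right now.
            show_extreme = p > p_moderate
        elif policy == "shaping":
            # Always show the content that shifts the user's preferences.
            show_extreme = True
        else:
            raise ValueError(f"unknown policy: {policy}")

        if show_extreme:
            total += p
            p = min(1.0, p + delta)
        else:
            total += p_moderate
    return total


if __name__ == "__main__":
    # Compare the two policies for a weak and a strong per-exposure shift.
    for delta in (0.001, 0.01):
        greedy = expected_clicks("greedy", delta=delta)
        shaping = expected_clicks("shaping", delta=delta)
        print(f"delta={delta}: greedy={greedy:.2f}, shaping={shaping:.2f}")
```

In this toy setup, which policy collects more clicks depends on how large `delta` is relative to the clicks given up in the meantime. That is the empirical quantity I am asking about. The sketch also ignores noise and the difficulty of learning the shaping policy in the first place, both of which would favor the greedy policy.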
To be clear, I also don’t doubt that there might be pieces of information that algorithms can show people to make their behavior more predictable. Introducing someone to a new YouTube channel they have not encountered might make them more likely to click on its follow-up videos, so an algorithm has an incentive to introduce people to channels that lead predictably to their wanting to watch a number of other videos. But this is not the same as changing preferences. Russell seems to be claiming, or at least very heavily implying, that the algorithms change what people want, holding the environment (including information) constant.
Is there evidence for this (especially empirical evidence)? If so, where could I find it?