Facebook has at least experimented with using deep reinforcement learning to adjust its notifications according to https://arxiv.org/pdf/1811.00260.pdf . Depending on which exact features they used for the state space (i.e. if they are causally connected to preferences), the trained agent would at least theoretically have an incentive to change user’s preferences.
The fact that they use DQN rather than a bandit algorithm seems to suggest that what they are doing involves at least some short term planning, but the paper does not seem to analyze the experiments in much detail, so it is unclear whether they could have used a myopic bandit algorithm instead. Either way, seeing this made me update quite a bit towards being more concerned about the effect of recommender systems on preferences.