Great post! The section “Truth-seeking requires grounding in reality” describes some points I’ve previously wanted to make but didn’t have good examples for.
I discuss a few similar issues in my post The Moral Uncertainty Rabbit Hole, Fully Excavated. Instead of discussing “the Long Reflection” as MacAskill described it, my post there discusses the more general class of “reflection procedures” (could be society-wide or just for a given individual) where we hit pause and think about values for a long time. The post points out how reflection procedures change the way we reflect and how this requires us to make judgment calls about which of these changes are intended or okay. I also discuss “pitfalls” of reflection procedures (things that are unwanted and avoidable at least in theory, but might make reflection somewhat risky in practice).
One consideration I discovered seems particularly underappreciated among EAs in the sense that I haven’t seen it discussed anywhere. I’ve called it “lack of morally urgent causes.” In short, I think high levels of altruistic dedication and people forming self-identities as altruists dedicated to a particular cause often come from a kind of desperation about the state of the world (see Nate Soares’ “On Caring”). During the Long Reflection (or other “reflection procedures” more generally), the state of the world is assumed to be okay/good/taken care of, so any serious problems are assumed to be mostly solved or put on hold. What results is a “lack of morally urgent causes” – which will likely affect the values and self-identities that people who are reflecting might form. That is, compared to someone who forms their values prior to the moral reflection, people in the moral reflection may be less likely to adopt identities that were strongly shaped by ongoing “morally urgent causes.” This is neither good nor bad per se – it just seems like something to be aware of.
Here’s a longer excerpt from the post where I provide a non-exhaustive list of factors to consider for setting up reflection environments and choosing reflection strategies:
Reflection strategies require judgment calls
In this section, I’ll elaborate on how specifying reflection strategies requires many judgment calls. The following are some dimensions along which judgment calls are required (many of these categories are interrelated/overlapping):
Social distortions: Spending years alone in the reflection environment could induce loneliness and boredom, which may have undesired effects on the reflection outcome. You could add other people to the reflection environment, but who you add is likely to influence your reflection (e.g., because of social signaling or via the added sympathy you may experience for the values of loved ones).
Transformative changes: Faced with questions like whether to augment your reasoning or capacity to experience things, there’s always the question “Would I still trust the judgment of this newly created version of myself?”
Distortions from (lack of) competition: As Wei Dai points out in this LessWrong comment: “Current human deliberation and discourse are strongly tied up with a kind of resource gathering and competition.” By competition, he means things like “the need to signal intelligence, loyalty, wealth, or other ‘positive’ attributes.” Within some reflection procedures (and possibly depending on your reflection strategy), you may not have much of an incentive to compete. On the one hand, a lack of competition or status considerations could lead to “purer” or more careful reflection. On the other hand, perhaps competition functions as a safeguard, preventing people from adopting values for which they cannot summon sufficient motivation under everyday circumstances. Without competition, people’s values could become decoupled from what ordinarily motivates them and more susceptible to idiosyncratic influences, perhaps becoming more extreme.
Lack of morally urgent causes: In the blog post On Caring, Nate Soares writes: “It’s not enough to think you should change the world — you also need the sort of desperation that comes from realizing that you would dedicate your entire life to solving the world’s 100th biggest problem if you could, but you can’t, because there are 99 bigger problems you have to address first.” In that passage, Soares points out that desperation can strongly motivate some people to develop an identity around effective altruism. Interestingly enough, in some reflection environments (including “My favorite thinking environment”), the outside world is on pause. As a result, the phenomenology of “desperation” that Soares described would be out of place. If you suffered from poverty, illnesses, or abuse, these hardships are no longer an issue. Also, there are no other people to lift out of poverty and no factory farms to shut down. You’re no longer in a race against time to prevent bad things from happening, seeking friends and allies while trying to defend your cause against corrosion from influence seekers. This constitutes a massive change in your “situation in the world.” Without morally urgent causes, you arguably become less likely to go all-out by adopting an identity around solving a class of problems you’d deem urgent in the real world but which don’t appear pressing inside the reflection procedure. Reflection inside the reflection procedure may feel more like writing that novel you’ve always wanted to write – it feels less like a “mission” and more like “doing justice to your long-term dream.”[11]
Ordering effects: The order in which you learn new considerations can influence your reflection outcome. (See page 7 in this paper. Consider a model of internal deliberation where your attachment to moral principles strengthens whenever you reach reflective equilibrium given everything you already know/endorse.)
Persuasion and framing effects: Even with an AI assistant designed to give you “value-neutral” advice, there will be free parameters in the AI’s reasoning that affect its guidance and how it words things. Framing effects may also play a role when interacting with other humans (e.g., epistemic peers, expert philosophers, friends, and loved ones).
Pitfalls of reflection procedures
There are also pitfalls to avoid when picking a reflection strategy. The failure modes I list below are avoidable in theory,[12] but they could be difficult to avoid in practice:
Going off the rails: Moral reflection environments could be unintentionally alienating (enormous option space; time spent reflecting could be unusually long). Failure modes related to the strangeness of the moral reflection environment include existential breakdown and impulsively deciding to lock in specific values to be done with it.
Issues with motivation and compliance: When you set up experiments in virtual reality, the people in them (including copies of you) may not always want to play along.
Value attacks: Attackers could simulate people’s reflection environments in the hope of influencing their reflection outcomes.
Addiction traps: Superstimuli in the reflection environment could cause you to lose track of your goals. For instance, imagine you started asking your AI assistant for an experiment in virtual reality to learn about pleasure-pain tradeoffs or different types of pleasures. Then, next thing you know, you’ve spent centuries in pleasure simulations and have forgotten many of your lofty ideals.
Unfairly persuasive arguments: Some arguments may appeal to people because they exploit design features of our minds rather than because they tell us “What humans truly want.” Reflection procedures with argument search (e.g., asking the AI assistant for arguments that are persuasive to lots of people) could run into these unfairly compelling arguments. For illustration, imagine a story like “Atlas Shrugged” but highly persuasive to most people. We can also think of “arguments” as sequences of experiences: Inspired by the Narnia story, perhaps there exists a sensation of eating a piece of candy so delicious that many people become willing to sell out all their other values for eating more of it. Internally, this may feel like becoming convinced of some candy-focused morality, but looking at it from the outside, we’ll feel like there’s something problematic about how the moral update came about.
Subtle pressures exerted by AI assistants: AI assistants trained to be “maximally helpful in a value-neutral fashion” may not be fully neutral, after all. (Complete) value-neutrality may be an illusory notion, and if the AI assistants mistakenly think they know our values better than we do, their advice could lead us astray. (See Wei Dai’s comments in this thread for more discussion and analysis.)