Conflicts between emotional schemas often involve internal coercion
In my previous post, I wrote that self-reinforcing feedback loops in our brains remind me of the common political strategy of blaming all failures of your policies on the fact that you didn’t have enough power to implement them comprehensively enough.
I don’t think this similarity is a coincidence at all. Instead, it’s a consequence of the fact that one of the most useful models of the mind is the multi-agent model. According to this model, we should think of ourselves as being composed of multiple different parts, each with its own beliefs and goals, and with their attempts to achieve those goals sometimes bringing them into conflict with other parts. In this post my central example of parts will be the emotional schemas described in my previous post. Each schema can be seen as having the goal of avoiding the types of trauma which gave rise to it, and as acting on its beliefs about which strategies best achieve that goal.
The multi-agent model of the mind can be seen as an extension of Kahneman and Tversky’s dual-process theory, which has been described using colorful metaphors like Haidt’s “elephant and rider” and Christiano’s “monkey and machine”. Dual-process theories help us describe internal conflicts between our rational and emotional sides (sometimes known as akrasia): for example, the conflict between working and procrastinating, or between dieting and having another scoop of ice cream.
I think dual-process theories are an important advance, but don’t capture the full complexity of internal conflict. That’s because most internal conflicts aren’t just between emotion and reason, but rather between multiple emotional schemas, with reason being wielded by one as a weapon against the others. In other words, Christiano is right in thinking of explicit reasoning as a “lever” that gets pulled by an emotional monkey. But he’s wrong in thinking that there’s just one monkey. Instead, we contain many different emotion-driven monkeys which pull the “intellectual argument lever” whenever they think it’ll give their preferred strategy an advantage over the strategies of the other monkeys.
Once you’re familiar with this dynamic, you can see it all over the place. Even the examples above showcase it: when we procrastinate or break a diet, we often get frustrated and angry with ourselves. But is it really our “rational” side that’s berating us for being useless? That seems suspiciously…emotional. Getting angry at ourselves might seem like such an obvious response that it’s hard to even imagine alternatives—but they do exist. After seeing their child throw their food on the floor day after day, some parents will get frustrated and angry, but others will calmly pick the food up and try to figure out how to do a little bit better the next day. Why can’t we treat ourselves in the latter way? In short: because these are examples of “rational” arguments being wielded by an underlying emotional schema (perhaps the fear of failure, or the fear of being seen as unattractive) in order to coerce other parts into giving it control.
To be clear, the problem here isn’t reason itself, but rather the use of it to support conclusions of the form “X is bad”; in other words, the use of it to support one-place judgments. In post #2 I talked about how harmful these judgments can be, but I didn’t explain why our brains use them in the first place, or why they’re so hard to dislodge. Now we have an explanation: the more articulate parts of ourselves are using them to castigate other parts, in order to control their behavior. Negative self-talk and critical one-place judgments are very powerful coercive tactics: when they’re directed at other people, we call them “verbal abuse”. When you tell yourself that you’re worthless unless your career succeeds, or that you’re selfish unless you do as much good as you can, you can think of that as another salvo in an ongoing battle between two different parts of yourself.
The more complex our internal conflicts, the more we should favor multi-agent models over dual-process models—and the examples above are just the tip of the iceberg. Some more strategies which we often apply in internal conflicts:
Our parts learn to adjust their defense mechanisms over time. For example, suppose I discover that I get much more work done whenever I write down a concrete schedule. This will work for a while, but the part that’s scared of working usually figures out the trick eventually, and applies its efforts to making schedule-writing itself aversive.
Our parts strategically hide information from each other. Many classic examples involve telling ourselves that our motivations are much more altruistic than they actually are. Self-deception is also behind a lot of the defensiveness and evasiveness that people display when talking about sensitive topics—e.g. via making jokes, getting angry, leaving the conversation, or (especially in my circles) pulling the “intellectual argument” lever.
Our parts use threats to keep other parts in line. For example, someone’s career-driven parts could threaten to ruin a vacation by constantly thinking about work, to prevent their stressed parts from even pushing for that vacation in the first place.
Like external conflicts, internal conflicts are typically wasteful and destructive. Because of this, people who face serious internal conflict can’t be viewed as coherently pursuing any set of goals. As in the examples above, they often carry out self-destructive behavior which is detrimental to all their goals, and which undermines their ability to trust themselves in the future. When you notice yourself flinching away from something that seems like an obviously good idea, that’s often because it was associated with coercion in the past. Burnout, for example, can result from piling on more and more internal coercion to do work, until any type of work starts to trigger a flinch reflex.
Note also that coercion in a context where parts disagree can “contaminate” even other contexts where they agree: if one part thinks that it would be coerced if it had different preferences, that itself often feels like coercion, and builds up a level of resentment disproportionate to whatever actual disagreements exist. If that seems counterintuitive, remember how much people value having their autonomy respected: the same people who would happily do you a favor if you asked nicely would often resent you for simply assuming that they’ll do it. So while the optimal amount of self-coercion isn’t zero, applying coercion often has downstream effects which we should be very careful about. Yet we typically apply far more coercion to ourselves than we ever would to other people, because it’s so much easier to get away with in the short term, and because the quickest way of dealing with the harmful consequences of self-coercion is often more self-coercion. How can we move away from this trap? That’s the focus of the next part of this sequence.