Consequentialists (in society) should self-modify to have side constraints
Tl;dr In a society of peers where one’s internal motivations are somewhat transparent to others, having a deontological aversion to antisocial actions can make you more trustworthy than just abstaining from those actions for consequentialist reasons. Since being highly trusted is instrumentally valuable for many purposes, consequentialists should often prefer to self-modify[1] to have deontological constraints on their actions[2].
The utilitarian and deontological surgeons
You live in a city well known for its shortage of organs for transplant. Typically, when a healthy young person dies, their organs can be used to save the lives of several other people.
You’re choosing between two surgeons:
Andrew is a deontologist about murder. He would never murder a patient even if their organs could help more people, because he thinks murder is abhorrent, and approximately unthinkable.
Bob is a consequentialist about murder. He would never murder a patient even if their organs could help more people, because he thinks the risks of being discovered are too large: the costs, both to him personally and to the reputation of consequentialism, are too great to be worth countenancing.
Otherwise they’re both rated as excellent. Who would you feel more comfortable having surgery from?
We think for most people the answer is “Andrew”. While both surgeons have strong ethical reasons not to murder you, Andrew’s reasons feel more robust. In order to murder a patient Andrew would have to throw out the entire basis of his morality. Bob would merely have to make a large error in reasoning — perhaps mistakenly thinking that in this particular case the risks of discovery are so low as to justify the murder. Or worse, perhaps correctly noticing that in this particular case the chances of discovery are essentially zero, so the murder is mandated.
Since both surgeons would prefer to have more patients choose them (so that they can make more money that they can use for ends that seem good to them), it’s likely that Bob should self-modify (if he can) to adopt a deontological constraint on murder. This needn’t stop him remaining consequentialist in other regards.
It seems possible that people who are utilitarian with side-constraints have intuitions that track this argument.
Other examples
It’s easy to feel higher trust in the word of a person who abhors lying than the word of someone that you know doesn’t want to lie because they don’t want a reputation as a liar
We can feel deeper trust that our friends will be there for us even if something goes very wrong in our lives if their kindness is based on something like the virtue of being there for one’s friends rather than a consequentialist belief that it’s worth it for the sake of what you’ll go on to do
We feel more trust in organizations that avoid behaviours we feel are immoral if we think they are avoiding them because they’re wrong rather than because of PR reasons
Doesn’t integrity for consequentialists solve this?
Maybe. But the same arguments about murder or lying might apply — we may be more suspicious of someone who seems to hold onto integrity purely as an instrumental virtue for consequentialist ends (since they might later discover arguments that they believe mean they shouldn’t have integrity in this particular case).
On the other hand it could make total sense for consequentialists to adopt integrity as an intrinsic virtue, and let most of the deontological stuff follow from that.
What are the limits on this recommendation?
From a consequentialist perspective, it won't always be worth self-modifying to be a deontologist or intrinsically virtuous about an issue. It depends on how large the benefits are (in terms of increased trust), and how large the costs are (in terms of the times one might be forced to depart from the consequentially-optimal behaviour).
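To make this trade-off concrete, here is a minimal toy model in Python. It is only an illustrative sketch: the function, parameter names, and numbers are all assumptions of ours, not anything the argument itself specifies.

```python
# Toy expected-value model of whether to adopt a side constraint.
# Illustrative only: every number below is a made-up assumption.

def net_value_of_side_constraint(
    trust_gain: float,          # extra value per period from being more trusted
    p_constraint_binds: float,  # chance per period that the constraint forces a
                                # departure from the consequentially-optimal act
    cost_when_binding: float,   # value lost on each such occasion
    periods: int,               # horizon over which the modification persists
) -> float:
    """Expected benefit of increased trust, minus the expected cost of the
    occasions on which the constraint forbids the optimal action."""
    benefit = trust_gain * periods
    cost = p_constraint_binds * cost_when_binding * periods
    return benefit - cost

# With these hypothetical numbers, self-modifying comes out positive:
print(net_value_of_side_constraint(
    trust_gain=1.0, p_constraint_binds=0.01,
    cost_when_binding=20.0, periods=50,
))  # 40.0 -> worth it; with smaller trust gains it can flip negative
```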
It’s important to work fully that the self-modification is ~irrevocable — that one feels a deep sense of rightness or wrongness about the relevant actions. Otherwise it may seem to others like the self-modification could be undone given strong enough consequentialist reasons. (Of course that would reduce the costs, so it might sometimes be correct to do such a reversible self-modification, so long as the circumstances in which one would self-modify are not trivial.)
Our argument here doesn’t specify how much self-modification consequentialists should do. We’re just pointing out the basic mechanism for why the correct amount is probably nonzero for most agents (like humans) whose inner workings are somewhat transparent to their peers, and who benefit from being trusted. (It’s an open question which AI systems this would apply to.)
- ^
In practice, this probably looks like leaning into and/or refraining from over-riding common-sense moral intuitions.
- ^
Though self-identified consequentialists are probably wrong if they think they don’t have some of these deontological constraints already—e.g. wrong if they think their intuitive morality wouldn’t prevent them from directly and deliberately harming someone even if there were a consequentialist argument for it.
This point is covered quite well by Derek Parfit in his seminal book Reasons and Persons, Chapter 1, Part 17. In my view the entire chapter is excellent and worth reading, but here is an excerpt from Part 17:
This paragraph, I think, is especially relevant for EA:
Edit: I also recommend the related When Utilitarians Should Be Virtue Theorists.
Yep, this is one of several reasons why I think that Part I is perhaps the best and certainly the most underrated part of the book. :)
This is good to know—thank you for making this connection!
I don’t understand why the consequentialist would not simply falsely represent herself as the deontologist, as she could retain the reputational benefits and would almost always act identically.
It seems like self-modifying would, from such a perspective, require one to act irrationally where a departure from a rule is warranted by the circumstances.
The key here is transparency. Partly because people openly discuss their moral views, and partly because even when people don't state their views explicitly, others are good enough at reading them to get at least weak evidence about whether they are trustworthy. So consequentialists may be unable to seem like perfect deontologists without actually being deontologists.
Agreed. The first big barrier to putting self-modification into practice is “how do you do it”; the second big barrier is “how do you prove to others that you’ve done it.” I’m not sure why the authors don’t discuss these two issues more.
On how to actually self-modify/self-deceive, all they say is that it might involve “leaning into and/or refraining from over-riding common-sense moral intuitions”. But that doesn’t explain how to make the change irrevocably (which is the crucial step).
On how to demonstrate self-modification to others, they mention a "society of peers where one's internal motivations are somewhat transparent to others." I agree that our motivations are in general somewhat transparent—but are they transparent in this particular case, the case of differentiating between a deontologist and a consequentialist-leaning-into-common-sense-morality-in-order-to-be-more-trustworthy?
Maybe so. For instance, maybe the deontologist naturally reacts to side-constraint violations with strong emotion, believing that they are intrinsically bad—but the consequentialist naturally reacts with less emotion, believing that the violation is neither good nor bad intrinsically, but instrumentally bad through [long chain of reasoning]. And maybe the emotional response is hard to fake.
So when someone lies to you, if you get angry—rather than exhibiting calculated disapproval—maybe that’s weak evidence that you have an intrinsic aversion to lying.
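To illustrate how weak that evidence might be, here is a small Bayesian calculation in Python. The prior and the likelihoods are purely hypothetical numbers chosen for illustration.

```python
# Toy Bayesian update: how much does observing anger (rather than
# calculated disapproval) raise our credence that someone has an
# intrinsic aversion to lying? All probabilities are assumptions.

prior_intrinsic = 0.5          # P(intrinsic aversion) before observing anything
p_anger_if_intrinsic = 0.8     # intrinsic aversion usually produces anger
p_anger_if_instrumental = 0.5  # instrumental objectors sometimes get angry too

p_anger = (prior_intrinsic * p_anger_if_intrinsic
           + (1 - prior_intrinsic) * p_anger_if_instrumental)
posterior = prior_intrinsic * p_anger_if_intrinsic / p_anger

print(f"P(intrinsic aversion | anger) = {posterior:.2f}")
# ~0.62: a modest shift from 0.5, i.e. weak evidence, as suggested above
```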
Actual self-modification: it's similar to the problem with Pascal's wager. Even if you can persuade yourself of the utility of believing proposition X, it is at best extremely difficult, and at worst impossible, to make yourself believe it if your epistemological system leads you to a contrary belief.
Counterfeiting a deontological position: if the consequentialist basis for rejecting murder-for-organ-harvest is clear, you may nonetheless be able to convey suitable outrage. Many of the naively repugnant utilitarian conclusions would actually be extraordinarily corrosive to our social fabric and could inspire similar emotional states. Consequentialists are no less emotional, caring beings than deontologists (in fact we care more, because we don't subordinate well-being to other principles). Thus the consequentialist surgeon could be just as perturbed by such repugnant schemes because of the actual harm they would entail!
I’m noticing two ways of interpreting/reacting to this argument:
“This is incredibly off-putting; these consequentialists aren’t unlike charismatic sociopaths who will try to match my behavior to achieve hidden goals that I find abhorrent” (see e.g. Andy Bernard from The Office; currently, this is the interpretation that feels most salient to me)
“This is like a value handshake between consequentialists and the rest of society: consequentialists may have different values than many other people (perhaps really only at the tail ends of morality), but it’s worth putting aside our differences and working together to solve the problems we all care about rather than fighting battles that result in predictable loss”
Another thought in the genre "consequentialism+": capabilitarianism à la Sen and Nussbaum (e.g. here (h/t TJ) for an intro, or the SEP) seems attractive to me (among other reasons) because I believe it makes a practically useful abstraction from "what we believe ultimately matters" to "what are the best levers to affect that which we believe ultimately matters". In this case, the suggestion would be: while we might still think that some broad notion of utility is what ultimately matters morally, given the specific world we live in and its causal structure, focusing on improving people's central capabilities (as listed, for example, in the post linked earlier) is an effective and robust way of promoting that good.
And importantly, consequentialism-viewed-through-the-lens-of-capabilitarianism will equip you with some different intuitions in e.g. political philosophy than a more "straightforward" notion of consequentialism will (at least before you reach what I am suggesting here to be a new reflective equilibrium).
I’m very pleased to see this line of reasoning being promoted. Mutual transparency of agents (information permeability of boundaries) is a really important feature of & input to real-world ethics; thanks for expounding on it!
Similar reasoning was also expressed here
And here