Good Judgment with Numbers

Richard Y Chappell🔸 23 Sep 2024 15:10 UTC

63 points

Decision theory Philosophy Moral philosophy

Summary: I critique the view that the role of quantitative tools in practical decision-making must be “all or nothing”.

On the critiqued view, either you’re committed to blindly following a simple algorithm come what may (a la naive instrumentalism), or you dismiss “soulless number-crunching” as entirely irrelevant to ethics. I think both options are bad, and moral agents should instead use good judgment informed by quantitative considerations. Getting this balance right is crucial to the project of effective altruism.

Introduction

A general theme of my writing is that people are often too quick to “read off” practical differences from theoretical ones. A classic example: whereas pluralistic views tend to leave unspecified the relative weights of their varied moral reasons, utilitarian reasons have precise weights (in theory), determinately fixing what ought to be done in any given situation. But people often infer from this that we can easily determine what ought to be done, if utilitarianism is true. (“Just add up and compare the numbers!”) This inference is fallacious: positing precise and determinate truth-makers doesn’t imply that we have easy epistemic access to them. The world’s a complicated place, which leaves plenty of room for reasonable disagreement over the theoretically “simple” question of what interventions will actually do the most good.

In this post, I especially want to target the view (rarely explicitly defended, but often implicitly assumed) that the role of quantitative tools in practical decision-making must be all or nothing. On this view, either you’re committed to blindly following a simple algorithm come what may (a la naive instrumentalism), or you dismiss “soulless number-crunching” as entirely irrelevant to ethics. I think both options are bad, and moral agents should instead use good judgment informed by quantitative considerations.^[1]

The ‘All or Nothing’ Assumption

In my experience, the ‘All or Nothing’ view of quantitative tools is especially common amongst critics of effective altruism. They are opposed to quantitative methods,^[2] so to resist EA calls to quantify their impact, they (i) implicitly assume the ‘All or Nothing’ view, (ii) attribute the ‘All’ view to effective altruists, (iii) suggest that the ‘All’ view is bad, and so (iv) conclude that their ‘Nothing’ view is good and any pressures towards moral optimization can rightly be dismissed.

Consider, for example, Leif Wenar’s old WIRED diatribe against GiveWell.^[3] He gets very outraged by GiveWell’s “hedging” of their quantitative results:

In the fine print, the calculations are hedged with phrases like “very rough guess,” “very limited data,” “we don’t feel confident,” “we are highly uncertain,” “subjective and uncertain inputs.” These pages also say that “we consider our cost-effectiveness numbers to be extremely rough,” and that these numbers “should not be taken literally.” What’s going on?

GiveWell transparently explains that their quantitative modelling is uncertain; it’s just one input that goes into determining their overall verdicts.^[4] (An excellent 2011 blog post explains why weakly-supported explicit expected-value estimates should be adjusted back towards the mean, the same way you would adjust down your estimate of the quality of a restaurant sporting a single five-star review and no other info.) This all seems very reasonable and epistemically responsible to me. But it makes critics like Wenar feel cheated. What’s going on?

My best guess is that the All or Nothing theorist associates numbers with mathematical certainty. So, to use numbers to present one’s best estimate inherently “projects absolute confidence” (an accusation Wenar levels against GiveWell, for no other reason but that they present numbers). But this just seems a mistake on the part of the All or Nothing theorist. There’s no necessary connection between numerical representation and higher-order confidence in the represented number. If you want to know how confident GiveWell is in their estimates, you can read their full reports to find out. If you make assumptions based on nothing but the presented estimate, that’s on you.
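The adjustment toward the mean described above (the single five-star review) can be made concrete with a small sketch. It assumes a normal prior over restaurant quality and normally distributed noise in the review-based estimate. The function name `shrink_toward_prior` and every number below are illustrative assumptions, not GiveWell's actual model.

```python
def shrink_toward_prior(prior_mean, prior_var, estimate, estimate_var):
    """Posterior mean under a normal prior and normal measurement noise.

    Each source is weighted by its precision (1 / variance), so a noisy
    estimate is pulled most of the way back toward the prior mean.
    """
    prior_precision = 1.0 / prior_var
    estimate_precision = 1.0 / estimate_var
    total_precision = prior_precision + estimate_precision
    weighted_sum = prior_mean * prior_precision + estimate * estimate_precision
    return weighted_sum / total_precision


# Hypothetical numbers: restaurants average 3.5 stars (variance 0.5),
# and a single review is a noisy signal of true quality (variance 2.0).
one_review = shrink_toward_prior(3.5, 0.5, 5.0, 2.0)

# The average of 100 such reviews is far less noisy (variance 2.0 / 100).
many_reviews = shrink_toward_prior(3.5, 0.5, 5.0, 2.0 / 100)

print(f"one five-star review:  {one_review:.2f}")
print(f"100 five-star reviews: {many_reviews:.2f}")
```

Under these assumed inputs, one five-star review moves the estimate only to about 3.8 stars, while a hundred of them move it to about 4.94. The same point estimate warrants very different degrees of trust depending on how well it is supported.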

Don’t Blindly Trust Numbers

Sometimes we have higher-order evidence that certain kinds of first-order estimates require adjustment, or even outright dismissal. I’ve already mentioned GiveWell’s argument for discounting weakly-supported “extreme” verdicts. And I often write about the standard two-level utilitarian case for trusting reliable heuristics over error-prone individual calculations. I don’t have much patience for galaxy-brained proposals about how murdering your rival (or “stealing to give”) would make the world a better place. Unreliably calculating those things out demonstrates bad practical judgment, in my view. We can better approximate ideal guidance (i.e., maximizing expected value, by the lights of the objectively warranted expectations) by not doing that.

If someone presents you with a “proof” that 1 = 0, you can (rightly) be extremely confident that it contains a mistake, even if you can’t immediately identify the error. Their “evidence” isn’t strong enough to overcome your (rational) prior resistance to the conclusion. The same holds, in more moderate form, when someone presents you with an expected value “calculation” favoring a disreputable act like theft or murder (in a context where it seems intuitively bad): even if you can’t immediately see what they’re leaving out, you can reasonably expect that they’re missing something important. Your prior resistance to their conclusion should be weaker than in the pure math case, of course; you’d expect there to be some cases where disreputable-seeming acts really are for the best. But a handwavy EV estimate is very weak evidence indeed, and shouldn’t much move you away from expectations that seem reasonable based purely on priors. (Even if there are some cases where disreputable acts turn out for the best, your expectation in any particular case should not be so sanguine; compare the expected value of buying an overpriced lottery ticket.)

Similarly, as I wrote in It’s Not Wise to be Clueless, if someone pulls out some numbers (and an associated narrative) suggesting that global nuclear war would actually be best for the long-term future, you should not believe them. Practical rationality requires good priors; if you don’t have them, you’ll be led badly astray.^[5] (Compare Pascal’s Mugging.) It really matters what we should expect to have better consequences. And, as J.S. Mill rightly stressed, we ought to have some pretty robust expectations about some of these things.^[6]
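The role of priors in the last two paragraphs can be illustrated with Bayes' rule in odds form. This is only a sketch: the function name `posterior_probability`, the 1% prior, and both likelihood ratios are made-up values chosen to show the shape of the reasoning, not estimates of anything real.

```python
def posterior_probability(prior, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical prior that a disreputable-seeming act really is for the best.
prior_act_is_best = 0.01

# Assumed likelihood ratios: a handwavy EV argument is only slightly more
# likely to be offered when the act is good than when it is bad, whereas
# careful, well-checked evidence discriminates much more strongly.
handwavy_argument_lr = 2.0
careful_evidence_lr = 50.0

weak = posterior_probability(prior_act_is_best, handwavy_argument_lr)
strong = posterior_probability(prior_act_is_best, careful_evidence_lr)

print(f"after a handwavy EV argument: {weak:.3f}")
print(f"after careful evidence:       {strong:.3f}")
```

With these assumed inputs, the handwavy argument lifts a 1% prior to only about 2%, whereas much stronger evidence would be needed to reach even about 34%. A conclusion that starts out very improbable stays improbable after weak evidence.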

Look where you’re going! Don’t blindly follow big numbers off a cliff.

Don’t Blindly Trust Formalisms

It’s not just “input” numbers that need to be filtered through good judgment. It’s also quite possible to start with accurate or reasonable numbers, and be led to utter insanity due to doing the wrong things with them.

For example, many philosophers seem to take seriously a formalization of risk-aversion which implies pro-extinctionism. As far as I can tell, there’s no principled basis for those particular formalisms (unlike the arguments for orthodox decision theory); they just happened to yield desired results in the narrow range of (low-stakes) cases usually considered. As a result, I don’t see any reason to take these formalisms seriously when they have crazy implications in high-stakes cases. Especially when ordinary risk-aversion seems better explained as a simple humility heuristic, trying to build it into the fundamental formalism itself just seems a mistake.

Another common mistake with formalisms is the failure to consider model uncertainty, relying on a single “most-likely” scenario or model, when you should be distributing your credence across a wider range. (“Respectable” academic objections to longtermism often take this form.)
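A minimal sketch of the difference between relying on the single most-likely model and spreading credence across several. The helper names and the (credence, value) pairs are hypothetical, chosen only to show how far the two approaches can diverge.

```python
def most_likely_model_value(models):
    """Value predicted by the single highest-credence model."""
    credence, value = max(models, key=lambda pair: pair[0])
    return value


def credence_weighted_value(models):
    """Expected value across all models, weighted by (normalized) credence."""
    total_credence = sum(credence for credence, _ in models)
    weighted_sum = sum(credence * value for credence, value in models)
    return weighted_sum / total_credence


# Hypothetical (credence, estimated value) pairs for three rival models.
models = [(0.6, 1.0), (0.3, 5.0), (0.1, 100.0)]

single_model = most_likely_model_value(models)
across_models = credence_weighted_value(models)

print(f"most-likely model only: {single_model:.1f}")
print(f"credence-weighted:      {across_models:.1f}")
```

Here the most-likely model alone gives a value of 1, while weighting all three models by credence gives about 12.1, because the low-credence model carries most of the expected value. Whether that tail deserves its assumed 10% credence is exactly the kind of question that calls for judgment.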

Don’t Blindly Dismiss All Numbers

So, it’s easy to go wrong when dealing with numbers in practical decision-making. Still, the numbers do count, in principle. So it’s even easier to go wrong if you refuse to consider numbers at all. That route leads to all the familiar sorts of inefficiencies that motivated effective altruism to begin with. (For example: donating to a charity that trains seeing-eye dogs for $50,000, when another could restore sight to someone suffering from trachoma for just $50.) It’s worth trying seriously to do more good rather than less, all else equal. There’s plenty of low-hanging fruit here. Favoring GiveWell-recommended anti-malaria charities over arbitrary alternatives, for example, doesn’t risk any of the problems discussed in the previous section.^[7] It’s just a no-brainer. Yet most people still fail to do this. (You even get crazy people like Crary and Wenar who are positively hostile to GiveWell. It’s the weirdest thing.)

If we ask what decision procedure (or broader moral mindset) we should expect to have the best consequences, I think it’s clear that the answer is:

Not blindly following naive expected value calculations; but also…
Not completely ignoring numbers.
Rather, take scale (and tractability) into account when trying to do good, in the ways that effective altruists recommend, considering a wide range of potential opportunities and ambitiously pursuing potential “upside”, while…
Taking care to mitigate “downside” risks, e.g. by acting with integrity, respecting moral “guard rails” (e.g. legitimate laws and rights), and avoiding Machiavellian manipulation or deceit (except when validated by common sense).
And generally exercising good judgment throughout!

If folks want to argue that EA-style quantification is bad, I’d like to see them go beyond “All or Nothing” reasoning and seriously engage with this far more reasonable intermediate position. In particular, I’d love to see them spell out an alternative decision procedure that they expect would have better consequences for the world, and offer some argument or reasoning in support of that expectation.

In general, criticism is much more valuable when it goes beyond merely noting that something is flawed or imperfect (as if anything isn’t), and positively establishes a better alternative. If your criticism is just that EA is imperfect, that’s compatible with every alternative being (as I believe) much worse. And it’s hardly reasonable to try to discourage people from pursuing the best moral project currently on offer. (Indeed, that seems transparently vicious.)

Fallacies to Watch Out For

Be very careful about what you infer from the fact that some tradeoffs are unclear or difficult to make. Sometimes people will try to infer sweeping conclusions from this: things like, “therefore, you should sometimes prioritize homeless Americans over the global poor”; or “therefore, we should not even try to optimize”; or “therefore, effective altruism should be entirely abandoned.”

These inferences are bonkers,^[8] and you should severely downgrade your estimate of the reasonableness of anyone you see making them. They are rationalizing, not reasoning. (This becomes especially clear if they further infer, “therefore, you should prioritize my pet cause over things that seem more effective.”)

The correct inference to draw is just that it will sometimes be difficult to discern which option is most worth prioritizing. That’s all.

^[1] There’s a risk here of thinking, “Anyone more quantitatively-inclined than I am is a blind number-cruncher; anyone less inclined is stupid.” But even without getting too specific, I think it’s helpful to note that some (possibly broad) “middle ground” between the two extremes is plausibly going to be the most reasonable stance. Both extremes are very much worth avoiding!
^[2] Often, I think, because it’s really clear that there’s simply no basis for prioritizing their preferred interventions, but they don’t want to admit this.
^[3] I previously criticized the central mistake of his article: trying to raise the salience of rare side-effects, in a way that exploits people’s status quo bias, with the predictable effect of deterring life-saving interventions. Bad stuff. The mistake I want to discuss today is less egregious, but still worth remedying.
^[4] Relatedly: it’s totally fine for them to say, “Here are some potential negative and offsetting effects of the program that we believe to be too small or unlikely to have been included in our quantitative model.” Wenar claimed to find this outrageous, which is (again) simply unreasonable on his part.
^[5] Sometimes people imagine there’s an epistemic norm of open-mindedness, that you should never reject a view without argument, or even that you should always assign non-trivial credence to any view that you cannot absolutely disprove using non-question-begging premises. Something along these lines may (?) often be a good heuristic norm for general discourse. But as a strict universal generalization, it is completely insane, as again demonstrated by Pascal’s Mugging. (Middling credences are not automatically more reasonable than “extreme” ones: it depends on the proposition!)
^[6] J.S. Mill, chp 2 of Utilitarianism: “People talk as if… at the moment when some man feels tempted to meddle with the property or life of another, he had to begin considering for the first time whether murder and theft are injurious to human happiness.” Or, as Kamala Harris’s mom put it, “Do you think you just fell out of a coconut tree?”
^[7] It would be different if you really thought we should expect anti-malarial interventions to do more harm than good (due to second-order effects or whatever). But that seems a crazy expectation: one can of course describe a scenario vindicating how it is possible, but it would seem hard to justify having this as your dominant expectation. Even Wenar makes no such positive claims. To muddy the waters, he just raises the possibility of bad outcomes, without doing the epistemic work of establishing that we should actually expect them to predominate. Possibilities are cheap.
^[8] You might think that the first claim is independently defensible. What I’m criticizing here is the specific inference, not the conclusion being drawn. (I’d say it’s misguided to prioritize homeless Americans over the global poor, but I wouldn’t call the view “bonkers”.)



Richard Y Chappell🔸 23 Sep 2024 15:15 UTC
7 points
1 ∶ 0
Not sure why this got tagged as ‘Community’. It’s not about the community, but about applying EA principles, substantive issues in applied decision theory, and associated mistakes in the reasoning of many critics of effective altruism. (Maybe an overzealous bot didn’t like the joking footnote reference to Kamala Harris’s “coconut tree” line, and it got mischaracterized as political?)

Edit—fixed now, thanks mods!
Gemma 🔸 23 Sep 2024 15:39 UTC
3 points
0 ∶ 0
Hell yeah! I’ve got a draft with something similar about the importance of judgement in the application of EA principles. I think this is underrated within EA.
- shepardriley 4 Oct 2024 13:22 UTC
  3 points
  0 ∶ 0
  Totally agree. Very underrated, and this post makes a great point.
titotal 23 Sep 2024 17:20 UTC
2 points
2 ∶ 4
My best guess is that the All or Nothing theorist associates numbers with mathematical certainty. So, to use numbers to present one’s best estimate inherently “projects absolute confidence”
I think a version of this critique is still entirely fair. My problem here is that the numbers are often presented or spread without uncertainty qualifications.
For example, the EA page on the Against Malaria Foundation states:
As of July 2022, GiveWell estimates that AMF can deliver a LLIN at a cost of about $5, and that a donation to AMF has an average cost-effectiveness of $5,500 per life saved.^[7][8][9]
This statement gives no information about how sure they are about the $5 or $5500 figure. Is GiveWell virtually certain the cost-effectiveness is in the range of $5000 to $6000? Or do they think it could be between $2000 and $9000? GiveWell explains its methodology in detail, but their uncertainty ranges are dropped when this claim is spread (do you know off the top of your head what their uncertainty is?). Absent these ranges, I see these claims repeated all over the place as if $5000 really is an objectively correct answer and not a rough estimate.
- Richard Y Chappell🔸 23 Sep 2024 17:26 UTC
  14 points
  7 ∶ 2
  I actually think that’s fine. You can always look it up if you’re interested in the details, but for the casual consumer of charity-evaluation information, the bottom-line best estimate is the info that’s decision-relevant, not the uncertainty range. I think it’s completely fine for people to share core info like this without simultaneously sharing all the fine print. Just like it’s OK for public health experts to promote simple pro-vax messaging that doesn’t include all the fine print.
  (See moral misdirection for my principled account of when it is or isn’t OK to leave out information.)
  Absent these ranges, I see these claims repeated all over the place as if $5000 really is an objectively correct answer and not a rough estimate.
  Here you just seem to be repeating the mistake of assuming that presenting a best estimate without also presenting the uncertainty range is thereby to present it as certain. I disagree with that interpretative norm. There is no “as if” being presented. That’s on you.
  - Jason 24 Sep 2024 6:48 UTC
    10 points
    5 ∶ 0
    I’ll take an intermediate position: most readers will at least unconsciously infer an uncertainty range when presented with a point estimate only. If my mechanic tells me their best estimate for fixing my car is $1000 without saying more, I should understand from this that $1200 is a reasonable possibility, but I would legitimately be upset if presented with a $2000 bill, even if $1000 were provably the mean, median, mode, and likely outcome.
    
    Here, I think the reader is on notice that estimating cost to save a life is likely to involve some imprecision, plus it is presented as an estimate, it is linked to a more detailed explanation, and it is rounded off.
    
    There would be cases in which more should be said about the uncertainty range, for instance if it were between $500 and $50K! In that kind of scenario, you would need to say more to clue the reader into the degree of imprecision.
    - Richard Y Chappell🔸 24 Sep 2024 13:03 UTC
      2 points
      0 ∶ 0
      Agreed!
  - Guive 23 Sep 2024 22:51 UTC
    7 points
    1 ∶ 0
    Yeah. The words “estimates” and “about” are right there in the quote. There is no pretension of certainty here, unless you think mere use of numbers amounts to pretended certainty.
    But what is decision relevant is the expected value. So by best estimate do they mean expected value, or maximum likelihood estimate, or something else? To my ear, “best estimate” sounds like it means the estimate most likely to be right, and not the mean of the probability distribution. For instance, take the (B) option in “Why it can be OK to predictably lose”, where you have a 1% chance of saving 1000 people, and a 99% chance of saving no one, and the choice is non-repeatable. I would think the “best estimate” of the effectiveness of option (B) is that you will save 0 lives. But what matters for decision making is the expected value which is 10 lives.
    Sorry if this is a stupid question, I’m not very familiar with GiveWell.
    - Richard Y Chappell🔸 23 Sep 2024 23:48 UTC
      3 points
      0 ∶ 0
      Fair question! I don’t know the answer. But I’d be surprised if the two came apart too sharply in this case (even though, as you rightly note, they can drastically diverge in principle). My sense is that GiveWell aims to recommend relatively “safe” bets, rather than a “hits-based” EV-maximizing approach. (I think it’s important to be transparent when recommending the latter, just because I take it many people are not in fact so comfortable with pursuing that strategy, even if I think they ought to be.)
NickLaing 24 Sep 2024 13:45 UTC
1 point
1 ∶ 0
Great article. A minor point: you might be straw-manning/pigeonholing this article a little
https://www.vox.com/future-perfect/372519/charity-giving-effective-altruism-mutual-aid-homeless
They don’t just argue that we should help homeless people because some tradeoffs are difficult to make. They make a number of reasonable points including citing Yud’s “fuzzies and utilons” as a potential reason to help homeless people as well as appealing to a reasonable philosophical argument about integrity.
- Richard Y Chappell🔸 24 Sep 2024 14:09 UTC
  12 points
  3 ∶ 1
  I’m happy for folks to read the article and judge for themselves. The author briefly references some reasonable ideas in the course of building up a fundamentally unreasonable thesis: that “The problem [with effective altruism] is that we’ve stretched optimization beyond its optimal limits,” and that sometimes donating to the local homeless over EA charities will better serve “the real value you hold dear [that is, helping people].”
  They most clearly exhibit the fallacy I warn against (“some tradeoffs are unclear, therefore you might as well be an ineffective altruist”) in this passage criticizing attempted optimization:
  In your case, you’re trying to optimize how much you help others, and you believe that means focusing on the neediest. But “neediest” according to what definition of needy? You could assume that financial need is the only type that counts, so you should focus first on lifting everyone out of extreme poverty, and only then help people in less dire straits. But are you sure that only the brute poverty level matters?
  … if you want to optimize, you need to be able to run an apples-to-apples comparison — to calculate how much good different things do in a single currency, so you can pick the best option. But because helping people isn’t reducible to one thing — it’s lots of incommensurable things, and how to rank them depends on each person’s subjective philosophical assumptions — trying to optimize in this domain will mean you have to artificially simplify the problem. You have to pretend there’s no such thing as oranges, only apples.
  I also think their discussion of integrity is fundamentally confused:
  It sounds like that’s what you’re feeling when you pass a person experiencing homelessness and ignore them. Ignoring them makes you feel bad because it alienates you from the part of you that is moved by this person’s suffering — that sees the orange but is being told there are only apples. That core part of you is no less valuable than the optimizing part, which you liken to your “brain.” It’s not dumber or more irrational. It’s the part that cares deeply about helping people, and without it, the optimizing part would have nothing to optimize!
  It’s not apples and oranges. It’s just helping people you can see vs helping people who are out of sight, and so less emotionally engaging. Those shouldn’t be different values—as the author themselves says at the start, there’s just the one value of helping people, and different strategies for how to achieve that. What they don’t acknowledge is that the strategy of prioritizing more salient / emotionally-engaging people is less effective at helping, even if it’s more effective at indulging your emotional needs. Calling the emotional bias “integrity” is not philosophically helpful or illuminating. It’s muddled thinking, running cover for blatant bias.
  - NickLaing 26 Sep 2024 9:10 UTC
    3 points
    0 ∶ 0
    I don’t entirely disagree with this argument of theirs that you quoted.
    
    “The problem [with effective altruism] is that we’ve stretched optimization beyond its optimal limits,” and that sometimes donating to the local homeless over EA charities will better serve “the real value you hold dear [that is, helping people].”
    
    I think sometimes helping local homeless over EA charities can be a good idea to connect us emotionally with suffering, maintain strong social cohesion and set good examples to those around us who might be interested in EA. This argument may be weak, but isn’t as terrible as you seem to make out.
    
    Also, I think that when their second argument is steelmanned it’s not that bad either. It seems to me they are arguing that helping those directly around us can help us care more, giving us more energy and capacity to do more good while we optimise abroad. I agree with them that directly helping people who suffer can help us care enough to actually optimise with the rest of our lives (altruism begets altruism). In that sense it can be “Apples and Oranges” in a way: not in that the people have different value, but in that the Orange protects the part of us that cares deeply, which helps us care more for the Apple (the people far away whom we could help more by optimising).
    
    I also think there’s a reasonable argument that helping those we are in proximity to, at least to some extent (perhaps due to emotional bias), can show integrity to our wider goal of doing the most good. I don’t think this is necessarily “muddled thinking”.
    
    It sounds like that’s what you’re feeling when you pass a person experiencing homelessness and ignore them. Ignoring them makes you feel bad because it alienates you from the part of you that is moved by this person’s suffering — that sees the orange but is being told there are only apples. That core part of you is no less valuable than the optimizing part, which you liken to your “brain.” It’s not dumber or more irrational. It’s the part that cares deeply about helping people, and without it, the optimizing part would have nothing to optimize!
    My overall point is that I think you are unreasonably harsh on some of the reasoning here, even if you disagree with it. Many articles I have read which criticise Effective Altruism are so bad I don’t take them seriously, whereas I feel like this one makes some reasonable arguments even if we might disagree with them.