I would even (tentatively) support the recommendation for ‘at least 1 of the above traits’. And slightly less tentatively for ‘at least 1 of (1) or (2)’. (In the absence of (1) and (2), on its own, (3) doesn’t seem so risky.)
HaydenW
Summary: In Defence of Fanaticism (Hayden Wilkinson)
How to neglect the long term (Hayden Wilkinson)
Egyptology and Fanaticism (Hayden Wilkinson)
Can an evidentialist be risk-averse? (Hayden Wilkinson)
The overgeneralisation is extremely easy to make. Just search “effective altruism” on Twitter right now. :’( (n.b., not recommended if you care about your own emotional well-being.)
I was one of the people who commented, on what was likely version 26 or 27. (This was in November, 2021.) And Torres certainly wasn’t listed as an author by that stage. I don’t think I saw any comments from them on that version either, but there were a lot of comments in total so I’m not sure.
I doubt that any effective altruists would say that our wellbeing (as benefactors) doesn’t matter. Nor is there any incompatibility between the basic ideas (or practice) of effective altruism, on the one hand, and there being limits on our duties to help others, on the other.
Ah, I think we’ve got different notions of probability in mind: the subjective credence of the agent (OpenPhil grantmakers) versus something like the objective chances of the thing actually happening, irrespective of anyone’s beliefs.
Open Philanthropy’s “hits-based giving” approach seems like it doesn’t fall prey to your argument, because they are willing to ignore the “Don’t Prevent Impossible Harms” constraint.
For what it’s worth, I don’t think this is true (unless I’m misinterpreting!). Preferring low-probability, high-expected value gambles doesn’t require preferring gambles with probability 0 of success.
Thanks for a brilliant post! I really enjoyed it. And in particular, as someone unfamiliar with the computational complexity stuff, your explanation of that part was great!
I have a few thoughts/questions, most of them minor. I’ll try to order them from most to least important.
1. The recommendation for Good-Old-Fashioned-EA
If I’m understanding the argument correctly, it seems to imply that real-world agents can’t assign fully coherent probability distributions over Σ in general. So, if we want to compare actions by their prospects of outcomes, we just can’t do so. (By any plausible decision theory, not just expected value theory.) The same goes for the action of saving a drowning child—we can’t give the full prospect of how that’s going to turn out. And, at least on moral theories that say we should sometimes promote the good (impartially wrt time, etc), consequentialist theories especially, it seems that it’s going to be NP-hard to say whether it’s better to save the child or not save the child. (cf Greaves’ suggestion wrt cluelessness that we’re more clueless about the effects of near-term interventions than those of long-term interventions) So, why is it that the argument doesn’t undermine those near-term interventions too, at least if we do them on ‘promoting-the-good’ grounds?
2. Broader applications
On a similar note, I wonder if there are much broader applications of this argument than just longtermism (or even for promoting the good in general). Non-consequentialist views (both those that sometimes recommend against promoting the good and those that place no weight on promoting the good) are affected by uncertainty too. Some rule-absolutist theories in particular can have their verdicts swayed by extremely low-probability propositions—some versions say that if an action has any non-zero probability of killing someone, you ought not do it. (Interesting discussion here and here) And plausible versions of views that recognise a harm-benefit asymmetry run into similar problems of many low-probability risks of harm (see this paper). Given that, just how much of conventional moral reasoning do you think your argument undermines?
(FWIW, I think this is a really neat line of objection against moral theories!)
3. Characterising longtermism
The definition of Prevent Possible Harms seemed a bit unusual. In fact, it sounds like it might violate Ought Implies Can just by itself. I can imagine there being some event e that might occur in the future, for which there’s no possible way we could make e less likely or mitigate its impacts.
On a similar note, I think most longtermist EAs probably wouldn’t sign up to that version of PPH. Even when e can be made less likely or less harmful, they wouldn’t want to say we should take costly steps to prevent such an e regardless of how costly those steps are, and regardless of how much they’d affect e’s probability/harms.
Also, how much more complicated would it be to run the argument with the more standard definition of “deontic strong longtermism” from p26 of Greaves & MacAskill? (Or even just their definition of “axiological strong longtermism” on p3?)
Related: the line “a worldview that seeks to tell us what we ought to do, and which insists that extreme measures may need to be taken to prevent low-probability events with potentially catastrophic effects” seems like a bit of a mischaracterization. A purely consequentialist longtermist might endorse taking extreme measures, but G&M’s definition is compatible with having absolute rules against doing awful things—it allows that we should only do what’s best for the long term in decision situations where we don’t need to do awful things to achieve it, or even just in decisions of which charity to donate to. (And in What We Owe The Future, Will explicitly advocates against doing things that commonsense morality says are wrong.)
4. Longtermism the idea vs. what longtermists do in practice
On your response to the first counterargument (”...the imperative to avoid and mitigate the possibility of catastrophic climate change is not uniquely highlighted by longtermist effective altruists...Indeed, we have good evidence that we are already experiencing significant negative impacts of climate change (Letchner 2021), such that there is nothing especially longtermist about taking steps now to reduce climate change...” etc), this doesn’t seem like an objection to longtermism actually being true (at least as Greaves & MacAskill define it). It sounds like potentially a great objection to working on AI risk or causes with even more speculative evidence bases (some wild suggestions here). But for it to be ex ante better for the far future to work on climate change seems perfectly consistent with the basic principle of longtermism; it just means that a lot of self-proclaimed longtermists aren’t actually doing what longtermism recommends.
5. What sort of probabilities?
One thing I wasn’t clear on was what sort of probabilities you had in mind.
If they’re objective chances: The probabilities of lots of things will just be 0 or 1, perhaps including the proposition about AI risk. And objective chances already don’t seem action-guiding—there are plenty of decision situations where agents just won’t have any clue what the objective chances are (unless they’re running all sorts of quantum measurements).
If they’re subjective credences: It seems pretty easy for agents to figure out the probability of, say, AI catastrophe. They just need to introspect about how confident they are that it will/won’t happen. But then I think (but am unsure) that the basic problem you identify is that it would take way too much computation (more than any human could ever do) to figure out if those credences are actually coherent with all of the agent’s other credences. And, if they’re not, you might think that all possible decision theories just break down. Which is worrying! But it seems like, if we can put together a decision theory for incoherent probability distributions / bounded agents, then the problem could be overcome, maybe?
If they’re evidential probabilities (of the Williamson sort, relative to the agent’s evidence): These seem like the best candidate for being the normatively relevant sort of probabilities. And, if that’s what you have in mind, then it makes sense that agents can’t do all the computation necessary to work out what all the evidential probabilities are (which maybe isn’t a new point—it seems pretty widely recognised that doing Bayesian updating on everything would be way too hard for human agents).
6. “for all”
I think you’ve mostly answered this with the first counterargument, but I’ll ask anyway.
In the definitions of No Efficient Algorithm, PIBNETD-Harms, Independence of Bad Outcomes, and the statement of Dagum & Luby’s result, I was confused about the quantifiers. Why are we interested in the computational difficulty of this for any value of δ, for any belief network, for any proposition/variable V, and (for estimation) for any assignment of w to variables? Not just the actual value of δ, the agent’s actual belief network, and the actual propositions we’re trying to determine have non-zero probability? I don’t quite understand how general this needs to be to say something very specific like “There’s a non-zero probability that a pandemic will wipe out humanity”.
Here’s my more general confusion, I think: I don’t quite intuitively understand why it’s computationally hard to look up the probability of something if you’ve already got the full probability distribution over possible outcomes. Is it basically that, to do so, we have to evaluate Δu(V) across lots and lots of different possible states? Or is it the difficulty of thinking up every possible way the proposition could be true and every possible way it could be false and checking the probability of each of those? (Apologies for the dumb question!)
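To make my confusion more concrete, here’s the naive picture I have in mind, as a toy sketch (the chain network, the numbers, and the names in it are all just made up for illustration, and I may well be missing the real source of the hardness):

```python
import itertools

# A made-up chain-structured belief network over n binary variables
# X1 -> X2 -> ... -> Xn, fully specified by P(X1 = 1) and P(Xi = 1 | X_{i-1}).
n = 20
p_x1 = 0.5
p_cond = {0: 0.2, 1: 0.9}  # P(Xi = 1 | X_{i-1} = 0 or 1)

def joint_prob(assignment):
    """Probability of one complete assignment (x1, ..., xn) under the network."""
    prob = p_x1 if assignment[0] == 1 else 1 - p_x1
    for i in range(1, n):
        p1 = p_cond[assignment[i - 1]]
        prob *= p1 if assignment[i] == 1 else 1 - p1
    return prob

# "Looking up" P(Xn = 1) naively means summing the joint probability over
# every possible assignment to the other variables: 2^(n-1) terms.
p_v1 = sum(joint_prob(a)
           for a in itertools.product([0, 1], repeat=n)
           if a[-1] == 1)
print(p_v1)  # the loop above runs ~2^20 times for n = 20; hopeless for n in the hundreds
```

So even though the distribution is “fully given” by the network, reading off a marginal probability still seems to involve summing over exponentially many states (or being cleverer than that). Of course, for this particular chain you can be much cleverer; I take it Dagum & Luby’s point is that no algorithm manages to be efficiently clever for arbitrary networks. Is that roughly the right picture?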
7. Biting the fanaticism bullet
(Getting into the fairly minor comments now)
I don’t think you need to bite the fanaticism bullet for your argument. At least if I’m roughly understanding the argument, it doesn’t require that we care about all propositions with non-zero probability, no matter how low their probability. Your response to the 3rd counterargument seems to get at this: we can just worry about propositions with absolute harms/benefits below some bound (and, I’m guessing, with probabilities above some bound) and we still have an NP-hard problem to solve. Is this right?
This is mainly a dialectical thing. I agree that fanaticism has good arguments behind it, but still many decision theorists would reject it and so would most longtermist EAs. It’d be a shame to give them the impression that, because of that, they don’t need to worry about this result!
8. Measuring computation time
I was confused by this: “In general, this means we can make efficient gains in the accuracy of our inferences. Setting δ=10^−4, if it takes approximately 1 minute to generate an estimate with a margin for error of ϵ=.05, then achieving a margin for error of ϵ=.025 will take four minutes.”
To be able to give computation times like 1 minute, do you have a particular machine in mind? And can you make the general point that “the time it takes goes up by a factor of 4 if we reduce the margin of error from x to y”?
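For what it’s worth, here’s the scaling I assume lies behind that number, as a sketch (I’m assuming the estimate comes from sampling governed by a standard Chernoff/Hoeffding-style bound; the post may have a different bound in mind):

$$N \;\propto\; \frac{\ln(2/\delta)}{\epsilon^{2}}$$

where N is the number of samples needed. With δ held fixed at 10^−4, halving ϵ from .05 to .025 multiplies N, and hence the running time on whatever machine you like, by (.05/.025)^2 = 4. So the machine-independent way to put it would be: the time goes up by a factor of 4 whenever the margin for error is halved.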
Typos/phrasing
In the definition of Don’t Prevent Impossible Harms, I initially misread “For any event e that will not occur in the future” as being about what will actually happen, as against what it’s possible/impossible will happen. Maybe change the phrasing?
On the Ought Implies Can point, specifically “Moreover, Don’t Prevent Impossible Harms follows from the idea that “ought implies can” (Kant, 1781); if e won’t occur, then it’s not possible for us to make it any less likely, or to mitigate negative outcomes that occur because e occurs, and so we cannot be compelled to attempt to do so. To illustrate, if Venus were to suddenly deviate from its orbit tomorrow and collide with Earth, this would presumably lead to a very large aggregate reduction in utility on Earth. But Venus won’t do that...”: Ought Implies Can implies the version of Don’t Prevent Impossible Harms that you give (put in terms of reducing the probability), but it doesn’t imply that we shouldn’t prevent such harms. After all, if Venus is definitely not going to do that, then any action we take might (arguably) be said to ‘prevent’ it!
When you say “If P(V=1)>0, then there is real number δ such that, if Δu(V)<δ, then those agents ought to take costly steps now to make it less likely that V=1” (and mention Δu(V) elsewhere), shouldn’t it be “Δu(V)>δ”, since Δu(V) is a measure of the difference in value, and only if that difference is great enough should agents take costly steps?
Typo: “Dagum and Luby’s result shows that this cannot be done in efficiently in the general case.”
Thanks again for the post!
Yep, we’ve got pretty good evidence that our spacetime will have infinite 4D volume and, if you arranged happy lives uniformly across that volume, we’d have to say that the outcome is better than any outcome with merely finite total value. Nothing logically impossible there (even if it were practically impossible).
That said, assigning value “∞” to such an outcome is pretty crude and unhelpful. And what it means will depend entirely on how we’ve defined ∞ in our number system. So, what I think we should do in such a case is not say that V equals such and such, but instead ditch the value function once you’ve left the domain where it works: just deal with your set of possible outcomes, your lotteries (probability measures over that set), and a betterness relation which might sometimes follow a value function but might also extend to outcomes beyond the function’s domain. That’s what people tend to do in the infinite aggregation literature (including the social choice papers that consider infinite time horizons), and for good reason.
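Schematically, and just as one way of setting it up (this is a sketch of the standard move, not anything from the paper): work with

$$\langle O,\ \Delta(O),\ \succeq \rangle$$

where O is the set of possible outcomes, Δ(O) is the set of lotteries (probability measures) over O, and ≽ is a betterness relation on lotteries that agrees with expected V wherever V is defined, but gets extended directly (by whatever principles you favour from the infinite aggregation literature) to lotteries over outcomes where V is undefined or only takes the crude value “∞”.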
That’d be fine for the paper, but I do think we face at least some decisions in which EV theory gets fanatical. The example in the paper—Dyson’s Wager—is intended as a mostly realistic such example. Another one would be a Pascal’s Mugging case in which the threat was a moral one. I know I put P>0 on that sort of thing being possible, so I’d face cases like that if anyone really wanted to exploit me. (That said, I think we can probably overcome Pascal’s Muggings using other principles.)
Thanks!
Good point about Minimal Tradeoffs. But there is a worry that if you don’t make it a fixed r then you could have an infinite sequence of decreasing rs but they don’t go arbitrarily low. (e.g., 1, 3⁄4, 5⁄8, 9⁄16, 17⁄32, 33⁄64, …)
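To make the worry concrete, that example sequence is

$$r_n = \frac{1}{2} + 2^{-n}, \qquad n = 1, 2, 3, \dots$$

which is strictly decreasing but bounded below by 1/2, so the rs keep shrinking forever without ever getting arbitrarily low.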
I agree that Scale-Consistency isn’t as compelling as some of the other key principles in there. And, with totalism, it could be replaced with the principle you suggest in which multiplication is just duplicating the world k. Assuming totalism, that’d be a weaker claim, which is good. I guess one minor worry is that, if we reject totalism, duplicating a world k times wouldn’t scale its value by k. So Scale-Consistency is maybe the better principle for arguing in greater generality. But yeah, not needed for totalism.
>Nor can they say that L_safe plus an additional payoff b is better than L_risky plus the same b.
They can’t say this for all b, but they can for some b, right? Aren’t they saying exactly this when they deny Fanaticism (“If you deny Fanaticism, you know that no matter how your background uncertainty is resolved, you will deny that L_risky plus b is better than L_safe plus b.”)? Is this meant to follow from L_risky + B ≻ L_safe + B? I think that’s what you’re trying to argue afterwards, though.
Nope, wasn’t meaning for the statement involving little b to follow from the one about big B. b is a certain payoff, while B is a lottery. When we add b to either lottery, we’re just adding a constant to all of the payoffs. Then, if lotteries can be evaluated by their cardinal payoffs, we’ve got to say that L_1 +b > L_2 +b iff L_1 > L_2.
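For expected value theory in particular, that last step is just linearity of expectation:

$$\mathbb{E}[L + b] = \mathbb{E}[L] + b, \qquad \text{so} \qquad \mathbb{E}[L_1 + b] > \mathbb{E}[L_2 + b] \iff \mathbb{E}[L_1] > \mathbb{E}[L_2].$$

(That’s just the EV case as an illustration; the same goes for any theory that evaluates lotteries by their cardinal payoffs.)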
Aren’t we comparing lotteries, not definite outcomes? Your vNM utility function could be arctan(∑_i u_i), where the function inside the arctan is just the total utilitarian sum. Let L_safe = π/2, and L_risky = ∞ with probability 0.5 (which is not small, but this is just to illustrate) and 0 otherwise. Then these have the same expected value without a background payoff (or b=0), but with b>0, the safe option has higher EV, while with b<0, the risky option has higher EV.
Yep, that utility function is bounded, so using it and EU theory will avoid Fanaticism and bring on this problem. So much the worse for that utility function, I reckon.
And, in a sense, we’re not just comparing lotteries here. L_risky + B is two independent lotteries summed together, and we know in advance that you’re not going to affect B at all. In fact, it seems like B is the sort of thing you shouldn’t have to worry about at all in your decision-making. (After all, it’s a bunch of events off in ancient India or in far distant space, outside your lightcone.) In the moral setting we’re dealing with, it seems entirely appropriate to cancel B from both sides of the comparison and just look at L_risky and L_safe, or to conditionalise the comparison on whatever B will actually turn out as: some b. That’s roughly what’s going on there.
Just a note on the Pascal’s Mugging case: I do think the case can probably be overcome by appealing to some aspect of the strategic interaction between different agents. But I don’t think it comes out of the worry that they’ll continue mugging you over and over. Suppose you (morally) value losing $5 to the mugger at −5 and losing nothing at 0 (on some cardinal scale). And you value losing every dollar you ever earn in your life at −5,000,000. And suppose you have credence (or, alternatively, evidential probability) of p that the mugger can and will generate any amount of moral value or disvalue they claim they will. Then, as long as they claim they’ll bring about an outcome worse than −5,000,000/p if you don’t give them $5, or they claim they’ll bring about an outcome better than +5,000,000/p if you do, then EV theory says you should hand it over. And likewise for any other fanatical theory, if the payoff is just scaled far enough up or down.
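To spell out the arithmetic (v_threat below is just my label for the value of whatever outcome the mugger threatens):

$$\mathrm{EV}(\text{never pay}) = p \cdot v_{\text{threat}} + (1-p)\cdot 0 = p \cdot v_{\text{threat}}, \qquad \mathrm{EV}(\text{pay every time}) \geq -5{,}000{,}000,$$

since the worst that repeatedly paying can cost you is every dollar you ever earn. So if v_threat < −5,000,000/p, then p·v_threat < −5,000,000, and EV theory says to hand over the $5 even granting that the mugger will keep coming back.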
Yes, in practice that’ll be problematic. But I think we’re obligated to take both possible payoffs into account. If we do suspect the large negative payoffs, it seems pretty awful to ignore them in our decision-making. And then there’s a weird asymmetry if we pay attention to the negative payoffs but not the positive.
More generally, Fanaticism isn’t a claim about epistemology. A good epistemic and moral agent should first do their research, consider all of the possible scenarios in which their actions backfire, and put appropriate probabilities on them. If they do the epistemic side right, it seems fine for them to act according to Fanaticism when it comes to decision-making. But in practice, yeah, that’s going to be an enormous ‘if’.
Both cases are traditionally described in terms of payoffs and costs just for yourself, and I’m not sure we have quite as strong a justification for being risk-neutral or fanatical in that case. In particular, I find it at least a little plausible that individuals should effectively have bounded utility functions, whereas it’s not at all plausible that we’re allowed to do that in the moral case—it’d lead to something a lot like the old Egyptology objection.
That said, I’d accept Pascal’s wager in the moral case. It comes out of Fanaticism fairly straightforwardly, with some minor provisos. But Pascal’s Mugging seems avoidable—for it to arise, we need another agent interacting with you strategically to get what they want. I think it’s probably possible for an EV maximiser to avoid the mugging as long as we make their decision-making rule a bit richer in strategic interactions. But that’s just speculation—I don’t have a concrete proposal for that!
Even without precisely quantifying the harms each way, I think we can be pretty confident that the harms on one side are greater than on the other. It seems pretty clear that the harms of letting a non-trivial number of people experience sexual harassment and assault (or even the portion of those harms prevented by implementing a strong norm about this) are greater than the harms of preventing (even 100x as many) people from sleeping around within the community. The latter is just a far, far smaller harm per person—far less than 1⁄100 as great. And I think the same verdict holds even if the latter harm is concentrated mainly on neurodivergent people. And it holds even more clearly if we add on (to the first type of harm) the further harms of making the community less welcoming or uncomfortable for many more people than just those who directly experience harassment or assault.
(But, if there are at-least-as-effective ways to prevent the former harms, without imposing the latter harms, then this isn’t very relevant.)