Anthony DiGiovanni 🔸 comments on 2. Why intuitive comparisons of large-scale impact are unjustified

Anthony DiGiovanni 🔸 7 Jan 2026 13:15 UTC
5 points
0 ∶ 0
I don’t know exactly what you mean by “feels very hard to compare”. I’d appreciate more direct responses to the arguments in this post, namely, about how the comparison seems arbitrary.
- Vasco Grilo🔸 7 Jan 2026 17:29 UTC
  3 points
  1 ∶ 2
  Parent
  I don’t know exactly what you mean by “feels very hard to compare”.
  It looks like you are inferring incomparability between the value of 2 futures (non-discrete overlap between their UEVs) from the subjective feeling (in your mind) that their EVs feel very hard to compare (given all the evidence you considered), as any comparisons involve decisive arbitrary assumptions. I mean “arbitrary” as used in common language.
  I’d appreciate more direct responses to the arguments in this post, namely, about how the comparison seems arbitrary.
  Comparisons among the expected cost-effectiveness of the vast majority of interventions seem arbitrary to me too due to effects on soil animals and microorganisms. However, the same goes for comparisons among the expected mass of seemingly identical objects with a similar mass if I can only assess their mass using my hands, but this does not mean their mass is incomparable. To assess this, we have to empirically determine which fraction of the uncertainty in their mass is irreducible. 10 k years ago, it would not have been possible to determine which of 2 rocks of around 1 kg was the heavier if their mass only differed by 10^-6 kg. Yet, this is possible today. Some semi-micro balances have a resolution of 0.01 mg, 10^-8 kg. So I would say the expected mass of the rocks was comparable 10 k years ago. Do you agree? There could be some irreducible uncertainty in the mass of the rocks, but much less than suggested by the evidence available 10 k years ago.
  - Anthony DiGiovanni 🔸 29 Mar 2026 4:10 UTC
    4 points
    0 ∶ 0
    Parent
    However, the same goes for comparisons among the expected mass of seemingly identical objects with a similar mass if I can only assess their mass using my hands, but this does not mean their mass is incomparable.
    I don’t exactly understand what argument you’re making here.
    My core argument in the post is: Take any intervention X. We want to weigh up its impact for all sentient beings across the cosmos, where this “weighing up” is aggregation over all hypotheses. Now suppose we want to force ourselves to compare X with inaction, i.e., say either UEV(do X) > UEV(don’t do X) or vice versa. We have such an extremely coarse-grained understanding (if any) of these hypotheses^[1] that, when we do the weighing-up, whether we say UEV(do X) > UEV(don’t do X) or vice versa seems to depend on an arbitrary choice.
    Can you say how your argument relates to mine?
    [1] Relative to the amount of fine-grained detail necessary to evaluate the hypothesis, when what we value is “well-being of all sentient beings across the cosmos”.
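A minimal sketch of the comparison structure in the comment above. It assumes (this is an assumption, not something the comment states) that "UEV(do X) > UEV(don’t do X)" holds only when the expected-value difference is positive under every distribution in some admissible set. The function name, the labels, and all numbers are hypothetical.

```python
from typing import Iterable


def compare_under_imprecision(ev_differences: Iterable[float]) -> str:
    """Compare "do X" with "don't do X" across a set of distributions.

    Each entry is EV(do X) - EV(don't do X) under one admissible
    distribution. The rule used here is an assumed, strict one: an option
    wins only if it is better under every distribution in the set.
    """
    diffs = list(ev_differences)
    if not diffs:
        raise ValueError("need at least one admissible distribution")
    if all(d > 0 for d in diffs):
        return "do X"
    if all(d < 0 for d in diffs):
        return "don't do X"
    if all(d == 0 for d in diffs):
        return "indifferent"
    # The sign of the difference depends on which distribution is picked.
    return "incomparable"


# Hypothetical EV differences under three admissible distributions.
verdict = compare_under_imprecision([-3.0, 0.5, 4.0])
```

Under this rule, forcing a verdict in the last case amounts to selecting one distribution from the set, which is where the choice described as arbitrary would enter.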
    - Vasco Grilo🔸 29 Mar 2026 9:54 UTC
      2 points
      0 ∶ 0
      Parent
      Thanks for following up, Anthony.
      My best guess about which of 2 identical objects has a larger mass in expectation will be arbitrary if their mass only differs by 10^-6 kg, and I have no way of assessing this small difference. However, this does not mean the expected mass of the 2 objects is fundamentally incomparable. Likewise, my best guess about which of 2 actions increases welfare more in expectation may be arbitrary without this implying that their expected change in welfare is incomparable.
      I am not sure it matters whether one endorses precise expected values (EVs) or not. In practice, I still like to test different EVs when the underlying probability density function (PDF) is very arbitrary and uncertain, as is the case for PDFs of welfare ranges. In such cases, I suspect decreasing uncertainty to find the best options has higher EV than the supposedly imprecise EVs of going with the current best option.
      - Anthony DiGiovanni 🔸 30 Mar 2026 2:26 UTC
        5 points
        0 ∶ 0
        Parent
        My best guess about which of 2 identical objects has a larger mass in expectation will be arbitrary if their mass only differs by 10^-6 kg, and I have no way of assessing this small difference. However, this does not mean the expected mass of the 2 objects is fundamentally incomparable
        I worry you’re reifying “expectations” as something objective here. The relative actual masses of the objects are clearly comparable. But if you subjectively can’t compare them, then they’re indeed incomparable “in expectation” in the relevant sense.
        Vasco Grilo🔸 30 Mar 2026 9:22 UTC
        2 points
        0 ∶ 0
        Parent
        I would be able to subjectively compare the mass of the 2 objects with more evidence. Some comparisons may not be feasible with currently available evidence, but the degree of imprecision should be set by what is physically possible?
        Anthony DiGiovanni 🔸 30 Mar 2026 12:21 UTC
        4 points
        0 ∶ 0
        Parent
        If you had more evidence, you could make the comparison. But you currently have no clue which direction the comparison would go, in expectation over the evidence you might receive. So how are you supposed to compare them right now?
        Vasco Grilo🔸 30 Mar 2026 12:40 UTC
        2 points
        0 ∶ 0
        Parent
        I would simply say the expected mass is practically (not exactly) the same given the evidence available to me, and consider gathering additional evidence depending on how much I expected this to change future decisions. Likewise for altruistic interventions among which comparisons of the expected change in welfare feel very arbitrary.
        Anthony DiGiovanni 🔸 5 Apr 2026 21:57 UTC
        4 points
        1 ∶ 0
        Parent
        I don’t know what you mean by “practically the same”, can you say more?
        Regardless, the problem is that “gathering evidence” vs “doing something else” is itself a decision, whose consequences you’ll be clueless about. I discuss this more here.
        Vasco Grilo🔸 6 Apr 2026 6:15 UTC
        4 points
        0 ∶ 0
        Parent
        I meant my future decisions would be the same in reality if I could not gather additional evidence regardless of whether the mass of the 2 identical objects was exactly the same or differed by 10^-6 kg.
        Do you think annual human welfare per human-year has increased since 1900? Child mortality decreased 37.3 pp (= 0.41 − 0.037) between then and 2023. If you agree annual human welfare per human-year has increased since 1900, are you confident that similar progress cannot be extended to non-humans? Would you have argued 200 years ago that we are all clueless about how to increase human welfare? I agree research can backfire. However, at least historically, doing research on the sentience of animals, and on how to increase their welfare has mostly been beneficial for the target animals?
        Anthony DiGiovanni 🔸 18 Apr 2026 11:04 UTC
        4 points
        0 ∶ 0
        Parent
        I meant my future decisions would be the same in reality if I could not gather additional evidence
        Perhaps, but that’s consistent with incomparability. Given the independent motivations we’ve discussed (/given in my post) for calling the two options incomparable, I’d say you should call them incomparable.
        I think I address your questions in the second paragraph in “Why we’re especially unaware of large-scale consequences” (this post) and “Meta-extrapolation” (post #4). See also my discussion with Richard here.
  - Anthony DiGiovanni 🔸 8 Jan 2026 9:12 UTC
    4 points
    0 ∶ 0
    Parent
    (Sorry, due to lack of time I don’t expect I’ll reply further. But thank you for the discussion! A quick note:)
    from the subjective feeling (in your mind) that their EVs feel very hard to compare
    EV is subjective. I’d recommend this post for more on this.
    - Vasco Grilo🔸 8 Jan 2026 10:11 UTC
      1 point
      0 ∶ 1
      Parent
      I don’t expect I’ll reply further
      You are welcome to return to this later. I would be curious to know your thoughts.
      EV is subjective. I’d recommend this post for more on this.
      I liked the post. I agree EV is subjective to some extent. The same goes for the concept of mass, which depends on our imperfect understanding of physics. However, the expected mass of objects is still comparable, unless there is only an infinitesimal difference between their mass.
      - Michael St Jules 🔸 29 Mar 2026 1:34 UTC
        5 points
        0 ∶ 0
        Parent
        Do you think it’s reasonable for two people with all of the same evidence to disagree on precise probabilities and expected values? If so, how would you justify picking your own precise probabilities over someone else’s, if you think theirs are just as defensible?
        Or would you just average yours and theirs in some way to get a new distribution? How?
        And how far would you go, if you consider all the defensible precise probability distributions anyone could assign (whether or not anyone actually does so)? How do you weigh them all if there are infinitely many of them and no uniform distribution over them?
        Here’s another example I like.
        Vasco Grilo🔸 29 Mar 2026 9:41 UTC
        1 point
        0 ∶ 1
        Parent
        Hi Michael.
        Do you think it’s reasonable for two people with all of the same evidence to disagree on precise probabilities and expected values?
        It depends on what is included in “all of the same evidence”. If 2 people had exactly the same evidence about everything, including internal states about the plausibility of the probabilities, they would be the same people, and therefore would agree on everything. In practice, different people share some evidence, but start with different priors, and therefore do not have to agree on precise probabilities and expected values. The stronger the evidence they share relative to their priors, the more they will agree.
        Or would you just average yours and theirs in some way to get a new distribution? How?
        If 2 probability density functions (PDFs) feel exactly as plausible, I would simply use the mean between them.
        Side note. I often link to concepts in my comments that I am sure the person I am replying to is familiar with, but I do it anyway in case others find it relevant.
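A minimal sketch of what "using the mean between" two PDFs could look like, read here as a weighted mixture of the two densities. That reading, the normal distributions, and every number are illustrative assumptions, not taken from the comment.

```python
import math
from statistics import NormalDist
from typing import Sequence


def check_weights(weights: Sequence[float]) -> None:
    """Mixture weights must be non-negative and sum to 1."""
    if any(w < 0 for w in weights) or not math.isclose(sum(weights), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")


def mixture_pdf(x: float, components: Sequence[NormalDist],
                weights: Sequence[float]) -> float:
    """Density of the weighted mixture at x."""
    check_weights(weights)
    return sum(w * c.pdf(x) for c, w in zip(components, weights))


def mixture_mean(components: Sequence[NormalDist],
                 weights: Sequence[float]) -> float:
    """Expected value of the mixture: the weighted average of the means."""
    check_weights(weights)
    return sum(w * c.mean for c, w in zip(components, weights))


# Two hypothetical PDFs for the same quantity.
pdf_a = NormalDist(mu=-1.0, sigma=1.0)
pdf_b = NormalDist(mu=3.0, sigma=2.0)

# Equal weights correspond to the two PDFs feeling exactly as plausible.
mean_equal = mixture_mean([pdf_a, pdf_b], [0.5, 0.5])
# Unequal weights give a different expected value.
mean_skewed = mixture_mean([pdf_a, pdf_b], [0.2, 0.8])
```

With equal weights the expected value is the simple average of the two component means. With any other weights it moves toward the more heavily weighted PDF, so the result is only as well determined as the weights are.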
        Michael St Jules 🔸 29 Mar 2026 16:36 UTC
        4 points
        1 ∶ 0
        Parent
        I think you’ve simplified the problem too much. There can be special cases where we can use symmetry and just take simple averages, but many practical cases are not like that. Indeed, that’s the point of the distinction between complex and simple cluelessness in the first place.
        I think, ideally, we should look for and exploit as much evidential symmetry as possible, but I don’t think we’ll always find enough of it to land on a unique precise distribution. I’d guess this is impossible in principle in many cases (probably almost all cases of intervention and cause area research) without further evidence.
        It’s true that direct impressions (e.g. internal states about the plausibility of the probabilities) could be considered evidence, but to the extent that for the same objective external evidence, these direct impressions can vary between people or depending on how or when you present the evidence, they seem arbitrary.
        Would you take the fact that a direct impression came from your brain (from an inscrutable process, prone to cognitive biases of various kinds, and whose reliability you can at best verify by track records in limited domains where feedback is practical, and where track records may not generalize across tasks and domains well) as better evidence than a direct impression from another person’s brain (with similar problems), with access to the same objective external evidence?
        Or, what if there are multiple people with different distributions and different track records in relevant domains? How do you weigh them? How much should track record be worth? EDIT: What if their track records are measured in different ways, e.g. you have forecasters with Brier scores, investors or bettors with measures of their gains and losses, researchers and grantmakers of various seniorities at different organizations?
        And what’s the range of direct impressions humans or other semi-rational agents could have, and how would you weigh them all?
        I’d also be keen to get your response to this (and also this, if you have the time.)
        Vasco Grilo🔸 29 Mar 2026 17:50 UTC
        2 points
        0 ∶ 0
        Parent
        I agree with the points you make in the 1st 3 paragraphs of your comment.
        Would you take the fact that a direct impression came from your brain (from an inscrutable process, prone to cognitive biases of various kinds, and whose reliability you can at best verify by track records in limited domains where feedback is practical, and where track records may not generalize across tasks and domains well) as better evidence than a direct impression from another person’s brain, with access to the same objective external evidence?
        Not necessarily. It depends on which evidence is being assessed. I am certainly not the best person to assess all kinds of evidence.
        Or, what if there are multiple people with different distributions and different track records in relevant domains? How do you weigh them? How much should track record be worth?
        I think track record weighted by the relevance of the domain to X is one of the most important sources of evidence to decide on how much to weigh the views of different people with respect to X. However, I believe it is often tricky to know how much a good track record in a given domain generalises to another domain.
        And what’s the range of direct impressions humans or other semi-rational agents could have, and how would you weigh them all?
        It depends on the case. Do you think my answer to the above should influence which interventions I prioritise? My current top recommendations are research on i) the welfare of soil animals and microorganisms, and ii) comparisons of (expected hedonistic) welfare across species and digital systems. Could you see these changing if I thought EVs were imprecise instead of precise at a fundamental level?
        I’d also be keen to get your response to this (and also this, if you have the time.)
        I have added the comments to my reading list.
        Anthony DiGiovanni 🔸 29 Mar 2026 22:23 UTC
        6 points
        1 ∶ 0
        Parent
        Besides the links Michael shared, I highly recommend this really short post.
        Vasco Grilo🔸 30 Mar 2026 9:15 UTC
        2 points
        0 ∶ 0
        Parent
        Thanks for sharing, Anthony. I just commented there.
        Michael St Jules 🔸 29 Mar 2026 21:40 UTC
        6 points
        1 ∶ 0
        Parent
        It depends on the case. Do you think my answer to the above should influence which interventions I prioritise? My current top recommendations are research on i) the welfare of soil animals and microorganisms, and ii) comparisons of (expected hedonistic) welfare across species and digital systems. Could you see these changing if I thought EVs were imprecise instead of precise at a fundamental level?
        I think there’s a lot that could change if you very seriously weighed others’ actual or possible direct impressions/intuitions without heavily privileging your own, before we even get into the question of precise vs imprecise credences. Epistemic modesty is going to do a lot of work first.
        Holding your current normative views ~constant, with precise credences, epistemic modesty would make infinite expected values (and possibly cardinally larger infinities) your focus, as long as there are well-defined consistent ways to handle them without always getting infinity minus infinity errors in practice. With imprecise credences, you could plausibly justify ignoring them on some versions of bracketing (also see here), say because they’re so speculative and you’re clueless about the direction of your impacts on infinities, including possibly even the effects of research into infinite effects (because the research could be used in ways you’d judge to be very bad).
        (Independently of precise vs imprecise) If you’re a moral realist, then you wouldn’t privilege your own direct normative intuitions just for being yours either, and this would plausibly mean not privileging consequentialism, utilitarianism, hedonism, risk neutrality, etc. This could have important implications. Your current priorities might still be among your top priorities, but your list of priorities could expand a lot.
        It might be impossible to compare these priorities; there’s no universal common standard/unit across all normative stances. You might go for a portfolio of interventions.
        If you’re not a moral realist, or for the part of you that isn’t, you can just not care about views that conflict too much with your most important intuitions.
        If you’re doing some version of bracketing with imprecise credences, some vertebrate welfare work could be worth prioritizing. I’m clueless about whether crops or nature is better for wild animals, even though I’m suffering-focused, so I ignore conversions between nature and crops. Far future effects and acausal influence could guide some priorities unless you’re clueless about them and bracket them away.
        Again, potentially impossible comparisons + portfolio.
        With imprecise credences, I think you would also be more pessimistic about the marginal value of research to compare welfare ranges and sentience across types of possible moral patients. You should also be more pessimistic about the value of further research into the sign of the welfare of moral patients. That doesn’t mean no such research is worth doing, but I think it would focus on scoping out possibilities and their implications and gathering evidence that could basically rule out the more extreme hypotheses (e.g. for (near-)constant welfare ranges and for welfare ranges with the most extreme ratios between potential moral patients). Arguments like the two envelopes problem, conscious subsystems, how moral weights could scale with neuron counts, gradations/vagueness, looking for more ways to assign welfare ranges with very different implications from the ones we have now. If you’re gathering empirical evidence, you would aim it at shifting or ruling out extremes.
        Personally, I’ve decided to draw some lines in practice, and basically leave out nematodes and simpler systems as priorities. This depends largely on my normative views (and I’m not a moral realist, so I’m more willing to make some judgement calls about this). I think what counts as consciousness is largely normative and subjective, I have some objections to aggregation (e.g. torture vs dust specks) and I’m not entirely risk neutral or ambiguity neutral. The capacities I’ve observed in them don’t seem so compelling. Maybe some of it is motivated reasoning, though. And maybe some sentience research on nematodes would be worth doing. If they met some of the standards here or here or we found evidence for some of the most sophisticated cognitive capacities we observe in fruit flies, I might take them pretty seriously.
        Vasco Grilo🔸 30 Mar 2026 12:29 UTC
        4 points
        0 ∶ 0
        Parent
        I’d also be keen to get your response to this (and also this, if you have the time.)
        I have replied to both comments.
        I think there’s a lot that could change if you very seriously weighed others’ actual or possible direct impressions/intuitions without heavily privileging your own, before we even get into the question of precise vs imprecise credences. Epistemic modesty is going to do a lot of work first.
        Thanks for elaborating on this. I imagine I could arrive at different (practical) priorities if I changed my mind about the topics you listed. At the same time, my more foundational philosophical views have historically changed very little. Investigations about empirical matters have updated my priorities a lot more. So I would be curious to know if you think there are areas which are more amenable to empirical investigation, and where I am not giving enough consideration to the views of others.
        I’m clueless about whether crops or nature is better for wild animals, even though I’m suffering-focused
        I agree it is very unclear whether increasing cropland is good or bad, even for suffering-focussed people.