Doctor from NZ, independent researcher (grand futures / macrostrategy) collaborating with FHI / Anders Sandberg. Previously: Global Health & Development research @ Rethink Priorities.
Feel free to reach out if you think there’s anything I can do to help you or your work, or if you have any Qs about Rethink Priorities! If you’re a medical student / junior doctor reconsidering your clinical future, or if you’re quite new to EA / feel uncertain about how you fit in the EA space, have an especially low bar for reaching out.
Outside of EA, I do a bit of end of life care research and climate change advocacy, and outside of work I enjoy some casual basketball, board games and good indie films. (Very) washed up classical violinist and Oly-lifter.
All comments in personal capacity unless otherwise stated.
bruce
Note also that you can accept outweighability and still believe that extreme suffering is really bad. You could, e.g., think that 1 second of a cluster headache can only be outweighed by trillions upon trillions of years of bliss. That would give you all the same practical implications without the theoretical trouble.
+1 to this; it echoes some earlier discussion we’ve had privately, and I think it would be interesting to see it fleshed out more if your current view is to reject outweighability in theory.
More importantly, I think this points to a potential drawback RE: “IHE thought experiment, I claim, is an especially epistemically productive way of exploring that territory, and indeed for doing moral philosophy more broadly”.[1]
For example, if your intuition is that 70 years of the worst possible suffering is worse than 1E10 and 1E100 and 10^10^10 years of bliss, and these all feel like ~equally clear tradeoffs to you, there doesn’t seem (to me) to be a clear way of knowing whether your conclusion should be that 70 years of the worst possible suffering is “not offsetable in theory” or “offsetable in theory but not in practice, + scope insensitivity”,[2] or some other option.
I’m much more confident about the (positive wellbeing + suffering) vs neither trade than intra-suffering trades. It sounds right that something like the tradeoff you describe follows from the most intuitive version of my model, but I’m not actually certain of this; like maybe there is a system that fits within the bounds of the thing I’m arguing for that chooses A instead of B (with no money pumps/very implausible conclusions following)
Ok interesting! I’d be interested in seeing this mapped out a bit more, because it does sound weird to have BOS be offsettable with positive wellbeing, positive wellbeing not be offsettable with NOS, but BOS and NOS be offsetable with each other? Or maybe this isn’t your claim and I’m misunderstanding.
2) Well the question again is “what would the IHE under experiential totalization do?” Insofar as the answer is “A”, I endorse that. I want to lean on this type of thinking much more strongly than hyper-systematic quasi-formal inferences about what indirectly follows from my thesis.
Right, but if the IHE does prefer A over B in my case while also preferring the “neither” side of the [positive wellbeing + NOS] vs neither trade, then there’s something pretty inconsistent, right? Or there’s a missing explanation for the perceived inconsistency that isn’t explained by a lexical threshold.
I think it’s possible that the answer is just B because BOS is just radically qualitatively different from NOS.
I think this is plausible but where does the radical qualitative difference come from? (see comments RE: formalising the threshold).
Maybe most importantly I (tentatively?) object to the term “barely” here because under the asymptotic model I suggest, the value of subtracting arbitrarily small amount of suffering instrument from the NOS state results in no change in moral value at all because (to quote myself again) “Working in the extended reals, this is left-continuous: ”
Sorry this is too much maths for my smooth brain but I think I’d be interested in understanding why I should accept the asymptotic model before trying to engage with the maths! (More on this below, under “On the asymptotic compensation schedule”)
So in order to get BOS, we need to remove something larger than , and now it’s a quasi-empirical question of how different that actually feels from the inside. Plausibly the answer is that “BOS” (scare quotes) doesn’t actually feel “barely” different—it feels extremely and categorically different
Can you think of one generalisable real world scenario here? Like “I think this is clearly non-offsetable and now I’ve removed X, I think it is clearly offsetable”
And I’ll add that insofar as the answer is (2) and NOT 3, I’m pretty inclined to update towards “I just haven’t developed an explicit formalization that handles both the happiness trade case and the intra-suffering trade case yet” more strongly than towards “the whole thing is wrong, suffering is offsetable by positive wellbeing”—after all, I don’t think it directly follows from “IHE chooses A” that “IHE would choose the 70 years of torture.” But I could be wrong about this! I 100% genuinely think I’m literally not smart enough to intuit super confidently whether or not a formalization that chooses both A and no torture exists. I will think about this more!
Cool! Yeah, I’d be excited to see the formalisation; I’m not making a claim that the whole thing is wrong, more that I’m not currently sufficiently convinced to hold the view that some suffering cannot be offset. I think that while the intuitions and the hypotheticals are valuable, like you say later, there are a bunch of things about this that we aren’t well placed to simulate or think about well. And I suspect that if you find yourself in a bunch of hypotheticals where you feel like your intuitions differ and you can’t find a way to resolve the inconsistencies, then it is worth considering the possibility that you’re not adequately modelling what it is like to be the IHE in at least one of the hypotheticals.
I more strongly want to push back on (2) and (3) in the sense that I think parallel experience, while probably conceptually fine in principle, really greatly degrades the epistemic virtue of the thought experiment because this literally isn’t something human brains were/are designed to do or simulate.
Yeah reasonable, but presumably this applies to answers for your main question[1] too?
Suppose the true value of exchange is at 10 years of happiness afterwards; this seems easier for our brains to simulate than if the true exchange rate is at 100,000 years of happiness, especially if you insist on parallel experiences. Perhaps it is just very difficult to be scope sensitive about exactly how much bliss 1E12 years of bliss is!
And likewise with (3), the self interest bit seems pretty epistemically important.
Can you clarify what you mean here? Isn’t the IHE someone who is “maximally rational/makes no logical errors, have unlimited information processing capacity, complete information about experiences with perfect introspective access, and full understanding of what any hedonic state would actually feel like”?
On formalising where the lexical threshold is you say:
I agree it is important! Someone should figure out the right answer! Also in terms of practical implementation, probably better to model as a probability distribution than a single certain line.
This is reasonable, and I agree with a probability distribution given uncertainty, but I guess it feels hard to engage with the metaphysical claim “some suffering in fact cannot be morally justified (“offset”) by any amount of happiness” and its implications if you are so deeply uncertain about what counts as NOS. I guess my view is that, conditional on physicalism, whatever combination of nociceptor/neuron firing and neurotransmitter release you can think of, this is a measurable amount. Some of these combinations will cross the threshold of NOS under your view, but you can decrease all of those in continuous ways that shouldn’t lead to a discontinuity in the tradeoffs you’re willing to make. It does NOT mean that the relationship is linear, but it seems like there’s some reason to believe it’s continuous rather than discontinuous / has an asymptote here. And contra your later point:
“I literally don’t know what the threshold is. I agree it would be nice to formalize it! My uncertainty isn’t much evidence against the view as a whole”
I think if we don’t know where a reasonable threshold is it’s fine to remain uncertain about it, but I think that’s much weaker than accepting the metaphysical claim! It’s currently based just on the 70 years of worst-possible suffering vs ~infinite bliss hypothetical, and your uncertainty about the threshold means I can conjure arbitrarily many hypotheticals that would count as evidence against your view in the same way that your hypothetical is considered evidence for it.
On the asymptotic compensation schedule
I disagree that it isn’t well-justified in principle, but maybe I should have argued this more thoroughly. It just makes a ton of intuitive sense to me but possibly I am typical-minding.
As far as I can tell, you just claim that it creates an asymptote and label it the correct view right? But why should it grow without bound? Sorry if I’ve missed something!
And I’m pretty sure you’re wrong about the second thing—see point 3 a few bullets up. It seems radically less plausible to me that the true nature of ethics involves discontinuous i_s vs i_h compensation schedules.
I was unclear about the “doesn’t seem to meaningfully change the unintuitive nature of the tradeoffs your view is willing to endorse” part you’re referring to here, and I agree RE: discontinuity. What I’m trying to communicate is that if someone isn’t convinced by the perceived discontinuity of NOS being non-offsettable and BOS being offsettable, a large subset of them also won’t be very convinced by the response “the radical part is in the approach to infinity (in your words: the compensation schedule growing without bound (i.e., asymptotically) means that some sub-threshold suffering would require 10^(10^10) happy lives to offset, or 1000^(1000^1000); emphasis added)”.
Because they could just reject the idea that an extremely bad headache (but not a cluster headache), or a short cluster headache episode, or a cluster headache managed by some amount of painkiller, etc, requires 1000^(1000^1000) happy lives to offset.
I guess this is just another way of saying “it seems like you’re assuming people are buying into the asymptotic model but you haven’t justified this”.- ^
“Would you accept 70 years of the worst conceivable torture in exchange for any amount of happiness afterward?”
Thanks for writing this up publicly! I think it’s a very thought-provoking piece and I’m glad you’ve written it. Engaging with it has definitely helped me consider some of my own views in this space more deeply. As you know, this is basically just a compilation of comments I’ve left in previous drafts, and I’m deferring to your preference to have these discussions in public. Some caveats for other readers: I don’t have any formal philosophical background, so this is largely first-principles reasoning rather than anything philosophically grounded.[1]
All of this is focussed on (to me) the more interesting metaphysical claim that “some suffering in fact cannot be morally justified (“offset”) by any amount of happiness.”
TL;DR
The positive argument for the metaphysical claim and the title of this piece relies (IMO) too heavily on a single thought experiment, which I don’t think supports the topline claim as written.
The post illustrates an unintuitive finding about utilitarianism, but doesn’t seem to provide a substantive case for why utilitarianism that includes lexicality is the least unintuitive option compared to other unintuitive utilitarian conclusions. For example, my understanding of your view is that given a choice of the following options:
A) 70 years of non-offsettable suffering, followed by 1 trillion happy human lives and 1 trillion happy pig lives, or
B) [70 years minus 1 hour of non-offsettable suffering (NOS)], followed by 1 trillion unhappy humans who are living at barely offsettable suffering (BOS), followed by 1 trillion pig lives that are living at the BOS,
You would prefer option B here. And it’s not at all obvious to me that we should find this deal more acceptable or intuitive than what I understand is basically an extreme form of the Very Repugnant Conclusion, and I’m not sure you’ve made a compelling case for this, or that world B contains less relevant suffering.
Thought experiment variations:
People’s intuitions about the suffering/bliss trade might reasonably change based on factors like:
Duration of suffering (70 minutes vs. 70 years vs. 70 billion years)
Whether experiences happen in series or parallel
Whether you can transfer the bliss to others
Threshold problem:
Formalizing where the lexical threshold sits is IMO pretty important, because there are reasonable pushbacks to both, but they feel like meaningfully different views:
High threshold (e.g., “worst torture”) means the view is still susceptible to unintuitive package deals that endorse arbitrarily large amounts of barely-offsettable suffering (BOS) to avoid small amounts of suffering that does cross the threshold
Low threshold (e.g., “broken hip” or “shrimp suffering”) seems like it functionally becomes negative utilitarianism
Asymptotic compensation schedule:
The claim that compensation requirements grow asymptotically and approach infinity (rather than linearly, or some other way) isn’t well-justified, and doesn’t seem to meaningfully change the unintuitive nature of the tradeoffs your view is willing to endorse.
============
Longer
As far as I can tell, the main positive argument you have for it is the thought experiment where you reject the offer of “70 years of worst conceivable suffering in exchange for any amount of happiness afterwards”. But I do think it would be rational for an IHE as defined to accept this trade.
I agree that package deals that permit or endorse the creation of extreme suffering are an unintuitive / uncomfortable thing to want to accept. But AFAICT most if not all utilitarian views have some plausibly unintuitive thought experiment like this, and my current view is that you have still not made a substantive positive case for non-offsettability / negative lexical utilitarianism beyond broadly “here is this unintuitive result about total utilitarianism”. I think an additional claim of “why is this the least unintuitive result / the one that we should accept out of all unintuitive options” would be helpful for readers; otherwise I agree more with your section “not a proof” than with your topline metaphysical claim (and indeed your title “Utilitarians Should Accept that Some Suffering Cannot be “Offset””).
The thought experiment:
I do actually think that the IHE should take this trade. But I think a lot of my pushbacks apply even if you are uncertain about whether the IHE should or not.
For example, I think whether the thought experiment stipulates 70 minutes or 70 years or 70 billion years of the worst possible suffering meaningfully changes how the thought experiment feels, but if lexicality were true we should not take the trade regardless of the duration. I know you’ve weakened your position on this, but it does open up more uncertainties about the kinds of tradeoffs you should be willing to make, since the time aspect is continuous. And if this alone is sufficient to turn something from offsettable to not-offsettable then it could imply some weird things: for example, it seems a little weird to prioritise averting 1 case of a 1 hour cluster headache over 1 million cases of 5 minute cluster headaches.[2]
As Liv pointed out in a previous version of the draft, there are also versions of the thought experiment where I think people’s intuitive answers may reasonably change, but shouldn’t if you think lexicality is true:
- is the suffering / bliss happening in parallel or in series?
- is there the option of taking the suffering on behalf of others? (e.g. some might be more willing to take the trade if, after you take the suffering, the arbitrary amounts of bliss can be transferred to other people as well, and not just yourself)
On the view more generally:
I’m not sure you explicitly make this claim so if this isn’t your view let me know! But I think your version of lexicality doesn’t just say “one instance of NOS is so bad that we should avert this no matter how much happiness we might lose / give up as a result”, but it also says “one instance of NOS is so bad that we should prioritise averting this over any amount of BOS”[3]
Why I think formalising the threshold is helpful in understanding the view you are arguing for:
If the threshold is very high, e.g. “worst torture imaginable”, then you are (like total utilitarianism) in a situation where you also face uncomfortable/unintuitive package deals in which you have to endorse high amounts of suffering. For example, you would prefer to avert 1 hour of the worst torture imaginable in exchange for never having any more happiness and positive value, and also for actively producing arbitrarily high amounts of BOS.
My understanding of your view is that given a choice of living in series:
A) 70 years of NOS, followed by 1 trillion happy human lives and 1 trillion happy pig lives, or
B) [70 years minus 1 hour of NOS], followed by 1 trillion unhappy humans who are living at BOS, followed by 1 trillion pig lives that are living at the BOS,
you would prefer the latter. It’s not at all obvious to me that we should find this deal more acceptable or intuitive than what I understand is basically an extreme form of the Very Repugnant Conclusion. It’s also not clear to me that you have actually argued that a world like world B would have to have “less relevant suffering” than world A (e.g. your footnote 24).
If the threshold is lower, e.g. “broken hip”, or much lower, e.g. “suffering of shrimp that has not been electrically stunned”, then while you might face less unintuitive suffering package deals, you end up functionally very similar to negative utilitarianism, where averting one broken leg, or saving 1 shrimp, outweighs all other benefits.
Formalising the threshold:
Using your example of a specific, concrete case of extreme suffering: “a cluster headache [for one human] lasting for one hour”.
If this uncontroversially crosses the non-offsetable threshold for you, consider how you’d view the headache if you hypothetically decrease the amount of time, the number of nociceptors that are exposed to the stimuli, how often they fire, etc. until you get to 0 on some or all variables. This feels pretty continuous! And if you think there should be a discontinuity that isn’t explained by this, then it’d be helpful to map out categorically what it entails. For example, if someone is boiled alive[4] this is extreme suffering, because it involves extreme heat, confronting your perceived impending doom, loss of autonomy, or some combination of the above. But you probably still need more than this, because not all suffering involving extreme heat or loss of autonomy is necessarily extreme, and it’s not obvious how this maps onto e.g. cluster headaches. Or you might bite the bullet on “confronting your impending doom”, but this might be a pretty different view with different implications, etc.
On “Continuity and the Location of the Threshold”
The radical implications (insofar as you think any of this is radical) aren’t at the threshold but in the approach to it. The compensation schedule growing without bound (i.e., asymptotically) means that some sub-threshold suffering would require 10^(10^10) happy lives to offset, or 1000^(1000^1000). (emphasis added)
============
This arbitrariness diminishes somewhat (though, again, not entirely) when viewed through the asymptotic structure. Once we accept that compensation requirements grow without bound as suffering intensifies, some threshold becomes inevitable. The asymptote must diverge somewhere; debates about exactly where are secondary to recognizing the underlying pattern.
It’s not clear that we have to accept the compensation schedule as growing asymptotically? Like if your response to “the discontinuity of tradeoffs caused by the lexical threshold does not seem to be well justified” is “actually the radical part isn’t the threshold, it’s because of the asymptotic compensation schedule”, then it would be helpful to explain why you think the asymptotic compensation schedule is the best model, or preferable to e.g. a linear one.
For example, suppose a standard utilitarian values converting 10 factory farmed pig lives to 1 happy pig life to 1 human life similarly, and they also value 1E4 happy pig lives to 1E3 human lives.
Suppose you are deeply uncertain about whether a factory farmed pig experiences NOS because it’s very close to the threshold of what you think constitutes extreme / NOS suffering.
If the answer is yes, then converting 1 factory farmed pig to a happy pig life should trade off against arbitrarily high numbers of human lives. But according to the asymptotic compensation schedule, if the answer is no, then you might still need something like 10^(10^10) human lives to offset the factory farmed pig’s experience. Either way, it’s not obvious to the standard utilitarian why they should value 1 case of factory farmed pig experience this much!
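To make the contrast concrete, here is a toy sketch of what I mean by linear vs asymptotic compensation schedules (the hyperbolic functional form, the threshold value, and the numbers are my own illustrative assumptions, not anything from the post):

```python
# Toy comparison of compensation schedules (illustrative only; the hyperbolic
# form and the threshold are made-up assumptions, not the post's formal model).

def linear_compensation(i_s, k=1.0):
    """Happiness required to offset suffering of intensity i_s under a linear schedule."""
    return k * i_s

def asymptotic_compensation(i_s, threshold=10.0, k=1.0):
    """Happiness required under a schedule that diverges as i_s approaches the threshold."""
    if i_s >= threshold:
        return float("inf")  # at or past the lexical threshold: non-offsetable
    return k * i_s / (threshold - i_s)

# Near the threshold, tiny changes in i_s produce enormous changes in required
# compensation under the asymptotic schedule, but not under the linear one.
for i_s in [1, 5, 9, 9.9, 9.99, 9.999]:
    print(f"i_s={i_s:<6} linear={linear_compensation(i_s):<8} asymptotic={asymptotic_compensation(i_s):,.1f}")
```

The point being that under the asymptotic form, being unsure whether something sits just below or just above the threshold translates into wildly different compensation requirements, which is exactly the situation the uncertain standard utilitarian above finds themselves in.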
Other comments:
In other words, let us consider a specific, concrete case of extreme suffering: say a cluster headache lasting for one hour.
Here, the lexical suffering-oriented utilitarian who claims that this crosses the threshold of in-principle compensability has much more in common with the standard utilitarian who thinks that in principle creating such an event would be morally justified by TREE(3) flourishing human life-years than the latter utilitarian has with the standard utilitarian who claims that the required compensation is merely a single flourishing human life-month.
I suspect this is intended to be illustrative, but I would be surprised if there were many, if any, standard utilitarians who would actually say that you need TREE(3)[5] flourishing human life years to offset a cluster headache lasting 1 hour, so this seems like a strawman?
Like it does seem like the more useful Q to ask is something more like:
Does the lexical suffering-oriented utilitarian who claims that this crosses the threshold of in-principle compensability have more in common with the standard utilitarian who thinks the event would be morally justified by 50 flourishing human life years (which is already a lot!), than that latter utilitarian has with another standard utilitarian who claims the required compensation is a single flourishing life month?
Like 1 month : TREE(3) vs. TREE(3) : infinity seems less likely to map to the standard utilitarian view than something like 1 month : 50 years vs. 50 years : infinity.
Thanks again for the post, and all the discussions!- ^
I’m also friends with Aaron and have already had these discussions with him and other mutual friends in other contexts, so I have possibly put less effort into making sure the disagreements land as gently as possible than I otherwise would. I’ve also spent a long time on the comment already, so I have focussed on the disagreements rather than the parts of the post that are praiseworthy.
- ^
To be clear, I find the time granularity issue very confusing personally, and I think it does have important implications for e.g. how we value extreme suffering (for example, if you define extreme suffering as “not tolerable even for a few seconds + would mark the threshold of pain under which many people choose to take their lives rather than endure the pain”, then much of human suffering is not extreme by definition, and the best way of reaching huge quantities of extreme suffering is by having many small creatures with a few seconds of pain (fish, shrimp, flies, nematodes)). However, depending on how you discount for these small quantities of pain, it could change how you trade off between e.g. shrimp and human welfare, even without disagreements on likelihood of sentience or the non-time elements that contribute to suffering.
- ^
Here I use extreme suffering and non-offsetable suffering interchangeably, to mean anything worse than the lexical threshold, and thus not offsetable, and barely offsetable suffering to mean some suffering that is as close to the lexical threshold as possible but considered offsetable. Credit to Max’s blog post for helping me with wording some of this, though I prefer non-offsetable over extreme as this is more robust to different lexical thresholds.
- ^
to use your example
- ^
I don’t even have the maths ability to process how big this is, I’m just deferring to Wikipedia saying it’s larger than g64
Speaking just for myself (RE: malaria), the topline figures include adjustments for various estimates around how much USAID funding might be reinstated, as well as discounts for redistribution / compensation by other actors, rather than forecasting a 100% cut over the entire time periods (which was the initial brief, and a reasonable starting point at the time, but became less likely to be a good assumption by the time of publication).
My 1 year / 5 year estimates without these discounts are approx. 130k to 270k and 720k to 1.5m respectively.
You can, of course, hold that insects don’t matter at all or that they matter infinitely less than other things so that we can, for all practical purposes, ignore their welfare. Certainly this would be very convenient. But the world does not owe us convenience and rarely provides us with it. If insects can suffer—and probably experience in a week more suffering than humans have for our entire history—this is certainly worth caring about. Plausibly insects can suffer rather intensely. When hundreds of billions of beings die a second, most experiencing quite intense pain before their deaths, that is quite morally serious, unless there’s some overwhelmingly powerful argument against taking their interests seriously.
If you replace insects here with mites doesn’t your argument basically still apply? A 10 sec search suggests that mites are plausibly significantly more numerous than insects. When you say “they’re not conscious”, is this coming from evidence that they aren’t, or lack of evidence that they are, and would you consider this an “overwhelmingly powerful argument”?
RE: inflation adjusted this just means that we’re using the value of USD in 2025 rather than at the time the conditional gift is due; $1000 in 2000 is worth like $1200 now for example.
Thanks for the catch! It was meant to be the same link as above; fixed
Appreciate this! There are a decent amount happening; can you DM me with a bit more info about yourself / what you’d be willing to help with?
The claim isn’t that your answers don’t fit your definitions/methodologies, but that given highly unintuitive conclusions, one should more strongly consider questioning the methodology / definitions you use.
For example, the worst death imaginable for a human is, to a first approximation, capped at a couple of minutes of excruciating pain (or a couple of factors of this), since you value excruciating pain at 10,000 times as bad as the next category, and say that by definition excruciating pain can’t exist for more than a few minutes. But this methodology will be unlikely to accurately capture a lot of extremely bad states of suffering that humans can have. On the other hand, it is much easier to scale even short periods of excruciating suffering with high numbers of animals, especially when you’re happy to consider ~8 million mosquitos killed per human life saved by a bednet—I don’t have empirical evidence to the contrary, but this seems rather high.
Here’s another sense check to illustrate this (please check if I’ve got the maths right here!):
- GiveWell estimate “5.53 deaths averted per 1000 children protected per year”, or 0.00553 lives saved per year of protection for a child, or 1 life saved per 180.8 children protected per year.
- They model 1.8 children under each bednet, on average.
This means it requires approximately 100 bednets over the course of 1 year to save 1 life/~50 DALYs.
At your preferred rate of 1 mosquito death per hour per net[1] this comes to approximately 880,000 mosquito deaths per life saved,[2] which is ~1 OOM lower than the ~8 million you would reach if you do the “excruciating pain” calculation, assuming your 763x claim is correct.[3]
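For transparency, here’s the same sense check as a quick script (same inputs as above; the 1 mosquito death per net-hour is your assumed rate, not a figure I’m endorsing):

```python
# Quick reproduction of the bednet sense check above (inputs as quoted in the text).

deaths_averted_per_1000_child_years = 5.53   # GiveWell estimate
children_per_net = 1.8                        # GiveWell's modelled average
mosquito_deaths_per_net_hour = 1.0            # assumed rate from the parent comment

child_years_per_life_saved = 1000 / deaths_averted_per_1000_child_years   # ~180.8
net_years_per_life_saved = child_years_per_life_saved / children_per_net  # ~100
mosquito_deaths_per_life_saved = net_years_per_life_saved * 24 * 365 * mosquito_deaths_per_net_hour

print(round(mosquito_deaths_per_life_saved))  # ~880,000
```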
(I may not continue engaging on this thread due to capacity constraints, but appreciate the responses!)- ^
Here I make no claims about the reasonableness of 1 mosquito per hour killed by the net, as I don’t have any empirical data on this. I’m more uncertain than Nick is, but also note that he has more relevant experience than I do here.
- ^
180.8/1.8 * 24* 365 = 879,893
- ^
Assuming 763x GiveWell is correct, a tradeoff of 14.3 days of mosquito excruciating pain (MEP) for 1 happy human life, 2 minutes of MEP per mosquito, this requires a tradeoff of 7.9 million mosquitos killed for one human life saved.
763*(14.3*24*60)/2 = 7,855,848
Don’t have a lot of details to share right now but there are a bunch of folks coordinating on things to this effect—though if you have ideas or suggestions or people to put forward feel free to DM!
The values I provide are not my personal best guesses for point estimates, but conservative estimates that are sufficient to meaningfully weaken your topline conclusions. In practice, even the assumptions I just listed would be unintuitive to most if used as the bar!
I agree “what fits intuition” is often a bad way of evaluating claims, but this is in context of me saying “I don’t know where exactly to draw the line here, but 14.3 mosquito days of excruciating suffering for one happy human life seems clearly beyond it.”
It seems entirely plausible that a human might take a tradeoff of 100x less duration (3.5 hours * 100 is ~14.5 days), and also value the human:mosquito tradeoff at >100x. It wouldn’t be difficult to suggest another OOM in both directions for the same conclusion.
The main thing I’m gesturing at is that for a conclusion as unintuitive as “2 mosquito weeks of excruciating suffering cancels out 1 happy human life”, I think it’s reasonable to consider that there might be other explanations, including e.g. underlying methodological flaws (and in retrospect perhaps ‘inconsistent’ isn’t the right word, maybe ‘inaccurate’ is better).
For example, by your preferred working definition of excruciating pain, it definitionally can’t exist for more than a few minutes at a time before neurological shutdown. I think this isn’t necessarily unreasonable, but there might be failure modes in your approach when basically all of your BOTECs come down to “which organisms have more aggregate seconds of species-adjusted excruciating pain”.
I estimate 14.3 mosquito-days of excruciating pain neutralise the benefits of the additional human welfare from saving 1 life under GW’s moral weights.
Makes sense—just to clarify:
My previous (mis)interpretation of you suggesting 11 minutes of MEP trading off 1 day of fully healthy human life would indicate a tradeoff of 11 / (24*60) = 0.0076.
Your clarification is that 14.3 mosquito-days trades off against 1 life: assuming 1 life is ~50 DALYs, this is 14.3 / (50*365.25) = 0.00078.
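Spelling that comparison out (a quick script using the figures above; the ~50 DALYs per life is my simplifying assumption):

```python
# Comparing my earlier misreading with the clarified view (figures from this thread).

human_days_per_life = 50 * 365.25                 # assuming ~50 DALYs per life saved

misread_ratio = 11 / (24 * 60)                    # 11 mosquito-min of MEP per human day -> ~0.0076
clarified_ratio = 14.3 / human_days_per_life      # 14.3 mosquito-days of MEP per human life -> ~0.00078

print(misread_ratio, clarified_ratio, misread_ratio / clarified_ratio)  # last value ~9.8, i.e. ~10x
```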
So it seems like my misinterpretation was ~10x overvaluing the human side compared to your true view?
I understand that may seem very little time, but I do not think it can be dismissed just on the basis of seeming surprising. I would say one should focus on checking whether the results mechanistically follow from the inputs, and criticising these:
My view is probably something like:
”I think on the margin most people should be more willing to entertain radical-seeming ideas rather than intuitions, given unknown unknowns about moral catastrophes we might be contributing to. But I also think the implicit claim[1] I’m happy to back here is that if your BOTEC spits out a result of “14.3 mosquito days of excruciating pain trades off with 50 human years of fully healthy life”, then I do expect on priors that some combination of your inputs / assumptions / interpretation of the evidence etc. has led to a result that is likely many factors (if not OOMs) off the true value (if we magically found out what it was), and I think such a surprising result should also prompt similar kinds of thoughts on your end! I’ll admit I don’t have a strong sense of how to draw a hard line here, but I can imagine for this specific case that I might expect the tradeoff for humans is closer to 3.5 hours of excruciating pain vs a life, and that I value / expect the human capacity for welfare to be >100x that of a mosquito. If you believe both of those to be true then you’d reject your conclusion.”
Another thing to consider might be something like “the way you count/value excruciating pain in humans vs in animals is inconsistent in a way that systematically gives results in favour of animals”
I don’t have too much to offer here in terms of this—I just wanted to know what the implied trade-off actually was and have it spelled out.
Gotcha RE: 23.9secs / 11mins, thanks for the clarification!
Looking at this figure you are trading off 7,910,000 * 2 minutes of MEP for a human death averted, which is 15,820,000 minutes, which is ~30 mosquito-years[1] of excruciating pain trading off against 50 human years of a practically maximally happy life.
Is this a correct representation of your views?
(Btw just flagging that I think I edited my comment as you were responding to it RE: 1.3~37 trillion figures, I realised I divided by 2 instead of by 120 (minutes instead of seconds).)
- ^
7910000 * 2 / 60 / 24 / 365.25 = 30.08
TL;DR
I think you are probably at least a few OOMs off with these figures, even granting most of your assumptions, as this implies (iiuc) ~8 million mosquito deaths per human death averted.
At 763x GiveWell, a tradeoff of 14.3 days of mosquito excruciating pain (MEP) for 1 happy human life, 2 minutes of MEP per mosquito, a $41 million grant in DRC, and $5100 per life saved, this implies 7.9 million mosquitos killed per human life saved, and that the grant will kill ~63 billion mosquitos.[1]
EDIT: my initial estimates were based on 11 mosquito-minutes of excruciating pain neutralising 1 day of human life, as stated in the text. This was incorrect because I misinterpreted the text. The true value that this post endorses is approximately a factor of 10 further in the direction of the mosquito side of the tradeoff (i.e. the equivalent of ~68 mosquito-seconds of excruciating pain neutralising 1 day of human life, and ~8 million mosquito deaths per human death averted by bednets). I have edited the topline claim accordingly.
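For reference, a quick script reproducing the corrected figures (it just mirrors the footnote arithmetic below, under the stated assumptions):

```python
# Corrected headline figures (assumptions as stated: 763x GiveWell, 14.3 mosquito-days
# of excruciating pain per human life, 2 min of excruciating pain per mosquito killed,
# a $41m grant at $5,100 per life saved, ~50 DALYs per life).

mep_minutes_per_life = 763 * (14.3 * 24 * 60)          # mosquito-minutes of excruciating pain
mosquitos_per_life = mep_minutes_per_life / 2          # ~7.9 million mosquito deaths per life saved
lives_saved_by_grant = 41_000_000 / 5_100              # ~8,000 lives
mosquitos_for_grant = mosquitos_per_life * lives_saved_by_grant   # ~63 billion

mep_seconds_per_human_day = 14.3 * 86_400 / (50 * 365.25)         # ~68 mosquito-seconds per human day

print(round(mosquitos_per_life), round(mosquitos_for_grant), round(mep_seconds_per_human_day))
```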
============
[text below is no longer accurate / worth reading; see above]
A quick sense check using your assumptions and numbers (I typed this up quickly so might have screwed up the maths somewhere!)
When you say:
“1 day of [a practically maximally happy life] would be neutralised with 23.9s of excruciating pain.”
and
“As a result, 11.0 min of excruciating pain would neutralise 1 day of a practically maximally happy life”
I’m assuming you mean “23.9 mosquito seconds of excruciating pain” and “11.0 mosquito minutes of excruciating pain” trading off against 1 human day of a practically maximally happy life (please correct me if I’m misunderstanding you!)
At 763 times as much harm to mosquitos as to humans, ~50 DALYs per life saved, and 11 min (or 23.9 seconds) of MEP, this implies you are suggesting bednets are causing something like 333 million ~ 9 billion seconds of MEP per human death averted.[2]
Using your figure of 2 minutes of excruciating pain per mosquito killed, this gives a range of 3 million ~ 77 million mosquito deaths per human death averted in order for your 763x claim to be correct.[3]
Using your stated figures of $41 million and $5100 per life for the GW grant, this implies you think the grant will lead to somewhere between 22~616 billion mosquito deaths in DRC alone.[4]
For context, this source estimates the global mosquito population as between 110 trillion and ‘in the quadrillions’.
- ^
763*(14.3*24*60)/2 = 7,855,848
41 million / 5100 * 763*(14.3*24*60)/2 = 63,154,856,470.6 - ^
365.25*50*11*60*763 = 9,196,629,750
365.25*50*23.9*763 = 333,029,471.25 - ^
333029471 / 120 = 2,775,245.59
9196629750 / 120 = 76,638,581.25 - ^
41 million / 5100 * 2,775,245.59 = 22,310,797,880.4
41 million / 5100 * 76,638,581.25 = 616,114,084,559
I didn’t catch this post until I saw this comment, and it prompted a response. I’m not well calibrated on how many upvotes different posts should get,[1] but personally I didn’t feel disappointed that this post wasn’t on the front page of the EA Forum, and I don’t expect this is a post I’d share with e.g. non-vegans who I’d discuss the meat eater problem with.[2]
- ^
I’m assuming you’re talking about the downvotes, rather than the comments? I may be mistaken though.
- ^
This isn’t something I’d usually comment because I do think the EA Forum should be more welcoming on the margin and I think there are a lot of barriers to people posting. But just providing one data point given your disappointment/surprise.
The ethical vegan must therefore decide whether their objection is to animals dying or to animals living.
One might object to animal suffering, rather than living/dying. So a utilitarian might say factory farming is bad because of the significantly net-negative states that animals endure while alive, while being OK with eating meat from a cow that is raised in a way such that it is living a robustly net positive life, for example.[1]
If you’re really worried about reducing the number of animal life years, focus on habitat destruction—it obviously kills wildlife on net, while farming is about increasing lives.
This isn’t an obvious comparison to me: there are clear potential downsides of habitat destruction (loss of ecosystem services) that don’t apply to reducing factory farming. There are also a lot of uncertainties around the impacts of destroying habitats—it is much harder to recreate an ecosystem and its benefits than to re-introduce factory farming if we are wrong in either case. One might also argue that we have a special obligation to reduce the harms we cause (via factory farming) rather than to attempt habitat destruction, which reduces suffering that exists ~independently of humans.
...the instrumentalization of animals as things to eat is morally repugnant, so we should make sure it’s not perpetuated. This seems to reflect a profound lack of empathy with the perspective of a domesticate that might want to go on existing. Declaring a group’s existence repugnant and acting to end it is unambiguously a form of intergroup aggression.
I’m not sure I’m understanding this correctly. Are you saying animals in factory farms have to be able to indicate to you that they don’t want to go on existing in order for you to consider taking action on factory farming? What bar do you think is appropriate here?
If factory farming seems like a bad thing, you should do something about the version happening to you first.
If there were 100 billion humans being killed for meat / other products every year and living in the conditions of modern factory farms, I would most definitely prioritise and advocate for that over factory farming.
The domestication of humans is particularly urgent precisely because, unlike selectively bred farm animals, humans are increasingly expressing their discontent with these conditions, and—more like wild animals in captivity than like proper domesticates—increasingly failing even to reproduce at replacement rates.
Can you say more about what you mean by “the domestication of humans”? It seems like you’re trying to draw a parallel between domesticated animals and domesticated humans, or modern humans and wild animals in captivity, but I’m not sure what the parallel you are trying to draw is. Could you make this more explicit?
This suggests our priorities have become oddly inverted—we focus intense moral concern on animals successfully bred to tolerate their conditions, while ignoring similar dynamics affecting creatures capable of articulating their objections...
This seems like a confusing argument. Most vegans I know aren’t against factory farming because it affects animal replacement rates. It also seems unlikely to me that reduced fertility rates in humans are a good proxy/correlate for the amount of suffering that exists (it’s possible that the relationship isn’t entirely linear, but if anything, historically the opposite is more true—countries have reduced fertility rates as they develop and standards of living improve). It’s weird that you use fertility rates as evidence for human suffering but seem to have an extremely high bar for animal suffering! Most of the evidence I’m aware of would strongly point to factory farmed animals in fact not tolerating their conditions well.
...who are moreover the only ones known to have the capacity and willingness to try to solve problems faced by other species.
This is a good argument to work on things that might end humanity or severely diminish its ability to meaningfully + positively affect the world. Of all the options that might do this, where would you rank reduced fertility rates?
- ^
Though (as you note) one might also object to farming animals for food for rights-based rather than welfare-based reasons.
Reposting from LessWrong, for people who might be less active there:[1]
TL;DR
FrontierMath was funded by OpenAI[2]
This was not publicly disclosed until December 20th, the date of OpenAI’s o3 announcement, including in earlier versions of the arXiv paper where this was eventually made public.
There was allegedly no active communication about this funding to the mathematicians contributing to the project before December 20th, due to the NDAs Epoch signed, but also no communication after the 20th, once the NDAs had expired.
OP claims that “I have heard second-hand that OpenAI does have access to exercises and answers and that they use them for validation. I am not aware of an agreement between Epoch AI and OpenAI that prohibits using this dataset for training if they wanted to, and have slight evidence against such an agreement existing.”
Seems to have confirmed the OpenAI funding + NDA restrictions
Claims OpenAI has “access to a large fraction of FrontierMath problems and solutions, with the exception of a unseen-by-OpenAI hold-out set that enables us to independently verify model capabilities.”
They also have “a verbal agreement that these materials will not be used in model training.”
Edit (19/01): Elliot (the project lead) points out that the holdout set does not yet exist (emphasis added):
As for where the o3 score on FM stands: yes I believe OAI has been accurate with their reporting on it, but Epoch can’t vouch for it until we independently evaluate the model using the holdout set we are developing.[3]
Edit (24/01):
Tamay tweets an apology (possibly including the timeline drafted by Elliot). It’s pretty succinct so I won’t summarise it here! Blog post version for people without twitter. Perhaps the most relevant point:
OpenAI commissioned Epoch AI to produce 300 advanced math problems for AI evaluation that form the core of the FrontierMath benchmark. As is typical of commissioned work, OpenAI retains ownership of these questions and has access to the problems and solutions.
Nat from OpenAI with an update from their side:
We did not use FrontierMath data to guide the development of o1 or o3, at all.
We didn’t train on any FM derived data, any inspired data, or any data targeting FrontierMath in particular
I’m extremely confident, because we only downloaded frontiermath for our evals *long* after the training data was frozen, and only looked at o3 FrontierMath results after the final announcement checkpoint was already picked.
============
Some quick uncertainties I had:
What does this mean for OpenAI’s 25% score on the benchmark?
What steps did Epoch take or consider taking to improve transparency between the time they were offered the NDA and the time of signing the NDA?
What is Epoch’s level of confidence that OpenAI will keep to their verbal agreement to not use these materials in model training, both in some technically true sense, and in a broader interpretation of an agreement? (see e.g. bottom paragraph of Ozzi’s comment).
In light of the confirmation that OpenAI not only has access to the problems and solutions but has ownership of them, what steps did Epoch consider before signing the relevant agreement to get something stronger than a verbal agreement that this won’t be used in training, now or in the future?
- ^
Epistemic status: quickly summarised + liberally copy pasted with ~0 additional fact checking given Tamay’s replies in the comment section
- ^
arXiv v5 (Dec 20th version) “We gratefully acknowledge OpenAI for their support in creating the benchmark.”
- ^
See clarification in case you interpreted Tamay’s comments (e.g. that OpenAI “do not have access to a separate holdout set that serves as an additional safeguard for independent verification”) to mean that the holdout set already exists
I’ll say up front that I definitely agree that we should look into the impacts on worms a nonzero amount! The main reason for the comment is that I don’t think the appropriate bar for whether or not the project should warrant more investigation is whether or not it passes a BOTEC under your set of assumptions (which I am grateful for you sharing—I respect your willingness to share this and your consistency).
Again, not speaking on behalf of the team—but I’m happy to bite the bullet and say that I’m much more willing to defer to some deontological constraints in the face of uncertainty, rather than follow impartiality and maximising expected value all the way to its conclusion, whatever those conclusions are. This isn’t an argument against the end goal that you are aiming for, but more my best guess in terms of how to get there in practice.
Impartiality and hedonism often recommend actions widely considered bad in super remote thought experiments, but, as far as I am aware, none in real life.
I suspect this might be driven by it not being considered bad under your own worldview? Like, it’s unsurprising that your preferred worldview doesn’t recommend actions that you consider bad, but my guess is that not working on global poverty and development because of the meat eater problem is in fact an action that might be widely considered bad in real life for many reasonable operationalisations (though I don’t have empirical evidence to support this).[1]
I do agree with you on the word choices under this technical conception of excruciating pain / extreme torture,[2] though I think the idea that it ‘definitionally’ can’t be sustained beyond minutes does have some potential failure modes.
That being said, I wasn’t actually using torture as a descriptor for the screwworm situation, more just illustrating what I might consider a point of difference between our views, i.e. that I would not be in favour of allowing humans to be tortured by AIs even if you created a BOTEC that showed this caused net positive utils in expectation; and I would not be in favour of an intervention to spread the New World screwworm around the world, even if you created a BOTEC that showed it was the best way of creating utils—I would reject these at least on deontological grounds in the current state of the world.- ^
This is not to suggest that I think “widely considered bad” is a good bar here! A lot of moral progress came from ideas that initially were “widely considered bad”. I’m just suggesting that this particular defence of impartiality + hedonism, namely that it “does not recommend actions widely considered bad in real life”, seems unlikely to be correct, simply because most people are not impartial hedonists to the extent you are.
- ^
Neither of which were my wording!
Speaking for myself / not for anyone else here:
My (highly uncertain + subjective) guess is that each lethal infection is probably worse than 0.5 host-years equivalents, but the number of worms per host animal probably could vary significantly.
That being said, personally I am fine with the assumption of modelling ~0 additional counterfactual suffering for screwworms that are never brought into existence, rather than e.g. an eradication campaign that involves killing existing animals.
I’m unsure how to think about the possibility that the screwworm species might be living significantly net positive lives, such that this trumps the benefit of reduced suffering from screwworm deaths, but I’d personally prefer stronger evidence for wellbeing or harms on the worm’s end to justify inaction here (i.e. not looking into the possibility/feasibility of this).
Again, speaking only for myself—I’m not personally fixated on either gene drives or sterile insect approaches! I am also very interested in finding out reasons to not proceed with the project or to find alternative approaches, which doesn’t preclude the possibility that the net welfare of screwworms should be more heavily weighed as a consideration. That being said, I would be surprised if something like “we should do nothing to alleviate host animal suffering because their suffering can provide more utils for the screwworm” was a sufficiently convincing reason to not do more work / investigation in this area (for nonutilitarian reasons), though I understand there are a set of assumptions / views one might hold that could drive disagreement here.[1]
- ^
If a highly uncertain BOTEC showed you that torturing humans would bring more utility to digital beings than the suffering incurred on the humans, would you endorse allowing this? At what ratio would you change your mind, and how many OOMs of uncertainty on the BOTEC would you be OK with?
Or—would you be in favour of taking this further and spreading the screwworm globally simply because it provides more utils, rather than just not eradicating the screwworm?
Going along with ‘subjective suffering’, which I think is subject to the risks you mention here: to make the claim that the compensation schedule is asymptotic (which is pretty important to your topline claim RE: offsetability), I think you can’t only be uncertain about Ben’s claim or “not bite the bullet”; you have to make a positive case for your claim. For example:
Like, is it correct that absent some categorical lexical property that you can identify, “the way out” is very dependent on you being able to support the claim “near the threshold a small change in i_s --> large change in subjective experience”?
So I suspect your view is something like: “as i_s increases linearly, subjective experience increases in a non-linear way that approaches infinity at some point, earlier than 70 years of torture”?[1] If so, what’s the reason you think this is the correct view / am I missing something here?
RE: the shape of the asymptote and potential risks of conflating empirical uncertainties
I think this is an interesting graph, and you might feel like you can make some rough progress on this conceptually with your methodology. For example, how many years of bliss would the IHE need to be offered to be indifferent between the equivalent experience of:
1 person boiled alive for an hour at 100degC
Change the time variable to 30 mins / 10 min / 5 min / 1 minute / 10 seconds / 1 second of the above experience[2]
Change the exposure variable to different % of the body (e.g. just hand / entire arm / abdomen / chest / back, etc)
(I would be separately interested in how the IHE would make tradeoffs if making a decision for others and the choice was about: 10/10000/1E6 people having ^all the above time/exposure variations, rather than experiencing it themselves, but this is further away from your preferred methodology so I’ll leave it for another time)
And then plot the graph of the suffering instrument with different combinations of the time/exposure/temperature variables. This could help you either elucidate the shape of your graph, or the location of uncertainties around your time granularity.
The reason I chose this > cluster headaches is partly because you can get more variables here, but if you wanted just a time comparison then cluster headaches might be easier.
But I actually think temperature is an interesting one to consider for multiple additional reasons. For example, it’s interesting as a real life example where you have perceived discontinuities in responses to continuous changes in some variable. You might be willing to tolerate 35 degree water for a very long time, but as soon as it gets to 40+, how tolerable it is decreases very rapidly, in a way that feels like a discontinuity.
But what’s happening here is that heat nociceptors activate at a specific temperature (say e.g. 40degC). So you basically just aren’t moving up the suffering instrument below that temperature ~at all, and so the variable you’d change is “how many nociceptors do you activate” or “how frequently do they fire” (all of which are modulated by temperature and amount of skin exposed), and that rapidly goes up as you reach / exceed 40degC.[3]
And so if you naively plot “degrees” or “person-hours” at the bottom, you might think subjective suffering is going up exponentially compared to a linear increase in i_s, but you are not accounting for thresholds in i_s activation, or increased sensitisation or recruitment of nociceptors over time, which might make the relationship look much less asymptotic.[4]
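To illustrate what I mean by thresholds in i_s activation, here’s a toy model (entirely made-up parameters, not empirical data) where a continuous increase in temperature produces an apparently discontinuous jump in nociceptor recruitment:

```python
import math

# Toy model: nociceptor recruitment as a steep (but continuous) sigmoid around ~40degC.
# A linear increase in temperature then produces an apparently discontinuous jump in the
# suffering instrument, even though nothing in the underlying physiology is discontinuous.

def nociceptor_recruitment(temp_c, activation_temp=40.0, steepness=2.0):
    """Fraction of heat nociceptors firing at a given skin temperature (made-up curve)."""
    return 1.0 / (1.0 + math.exp(-steepness * (temp_c - activation_temp)))

for temp in [35, 38, 39, 40, 41, 42, 45]:
    print(f"{temp} degC -> recruitment {nociceptor_recruitment(temp):.3f}")
```

So a naive plot of suffering against temperature would look sharply non-linear even if the mapping from nociceptor activity to subjective suffering were roughly linear.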
And I think empirical uncertainties about exactly how these kinds of signals work and are processed are a potentially large limiting factor for being able to strongly support “as i_s increases linearly, subjective experience increases in a non-linear way that approaches infinity at some point”[5]
I obviously don’t think it’s possible to have all the empirical Qs worked out for the post, but I wanted to illustrate these empirical uncertainties because I think even if I felt it would be correct for the IHE to reject some weaker version of the torture-bliss trade package[6], it would still be unclear that this reflected an asymptotic relationship, rather than just e.g. a large asymmetry between sensitivity to i_s and i_h, or maximum amount of i_s and i_h possible. I think these possibilities could satisfy the (weaker) IHE thought experiment while potentially satisfying lexicality in practice, but not in theory. It might also explain why you feel much more confident about lexicality WRT happiness but not intra-suffering tradeoffs, and if you put the difference of things like 1E10 vs 1E50 vs 10^10^10 down to scope insensitivity I do think this explains a decent portion of your views.
And indeed 1 hour of cluster headache
I’m aware that approaching 1 second is getting towards your uncertainty for the time granularity problem, but I think if you do think 1 hour of cluster headache is NOS then these are the kinds of tradeoffs you’d want to be able to make (and back)
There are other heat receptors at higher temperatures but to a first approximation it’s probably fine to ignore
Because of uncertainty around how much i_s there actually is
Also worth flagging that RE: footnote 26, where you say:
You should also expect this to apply to the suffering instrument; there is also some upper bound for all of these variables
e.g. 1E10 years, rather than infinity, since I find that pretty implausible and hard to reason about