I haven’t watched the talk, but I have just left a long comment on the original article, Logarithmic Scales of Pleasure and Pain.
Here’s the TL;DR of my comment:
I don’t think this post provides an argument that we should interpret pleasure/pain scales as logarithmic. What’s more, whether or not this is true doesn’t matter for the post’s practical claim - which is roughly that “the best/worst things are much better/worse than most people think”.
Here’s the link to my comment. I meant to write up my thoughts 3 months ago when the original article was posted, but never got around to it.
TL;DR I don’t think this post provides an argument that we should interpret pleasure/pain scales as logarithmic. What’s more, whether or not this is true doesn’t matter for the post’s practical claim - which is roughly that “the best/worst things are much better/worse than most people think”.
Thanks for writing this up; sorry not to have got around to it sooner.
I think there are two claims that need to be carefully distinguished.
(A) that the relationship between actual and reported pleasure(/pain) is not linear but instead follows some other relationship, e.g. a logarithmic function where a 1-unit increase in self-reported pleasure represents a ten-fold increase in actual pleasure.
(B) that the best/worst experiences that some people have are many times more intense than other people (who haven’t had those experiences) assume they are.
I point this out because you say
the best way to interpret pleasure and pain scales is by thinking of them as logarithmic compressions of what is truly a long-tail. The most intense pains are orders of magnitude more awful than mild pains (and symmetrically for pleasure). [...]
Since the bulk of suffering is concentrated in a small percentage of experiences, focusing our efforts on preventing cases of intense suffering likely dominates most utilitarian calculations.
The idea, I take it, is that if we thought the relationship between self-reported and actual pleasure(/pain) was linear, but it turns out it was logarithmic, then the best(/worst) experiences are much better(/worse) than we expected, because we’d been using the wrong scale.
However, I don’t think you’ve provided (any?) evidence that (A) is true (or that it’s true but we thought it was false). What’s more, (B) is actually quite plausible by itself and you can claim (B) is true without needing (A) to be true.
Let me unpack this a bit.
(A) is a claim about how people choose to use self-reported scales. The idea is that people have experiences of a certain intensity which they can distinguish for themselves in cardinal units, e.g. you can tell (roughly) how many perceivable increments of pleasure one experience gives you vs the next. A further question is how people choose to report these intensities when given a scale, say a 0-10 scale.
This reporting could be linear, logarithmic, etc. Indeed, people could choose to report any way they want to. It seems most likely that people use a linear reporting function, because that’s the most helpful way to use language to convey how you feel to the person asking. I won’t get stuck into this here, but I say more about it in my PhD thesis at chapter 4, section 4.
Hence, when you contrast ‘intuitive’ with ‘long-tailed’ pleasure/pain scales, what I think you mean is that the intuitive scale is really ‘reported’ pleasure and the ‘long-tailed’ scale is ‘actual’ pleasure, i.e. your claim is that there is a logarithmic relationship between reported and actual pleasure. I note you don’t provide evidence that people generally use scales this way.

Regarding the stings scale, that just is a logarithmic scale by construction, where going from a 1 to a 2 on the scale represents a ten-fold increase in actual pain. That doesn’t show we have to report pain using log scales, or that we do, just that the guy who constructed that scale chose to build it that way. In fact, we can only use log pleasure/pain scales if we can somehow measure pleasure/pain on an arithmetic scale in the first place, and then convert from those numbers to a log scale - which requires that people are able to construct arithmetic pleasure/pain scales anyway.
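To make the distinction concrete, here is a minimal sketch (Python, with entirely invented numbers) of two possible reporting conventions. The point is just that a report only tells you about actual intensity once you know which convention the reporter is using - and that is precisely what (A) is a claim about:

```python
import math

# Toy illustration (all numbers invented): suppose the worst possible
# experience involves WORST "perceivable increments" of pain, and people
# report their pain on a 0-10 scale.
WORST = 10 ** 10

def report_linear(actual):
    """Linear convention: the report is proportional to actual intensity."""
    return 10 * actual / WORST

def report_log(actual):
    """Log convention: each +1 on the scale is a 10x increase in actual intensity."""
    return math.log10(actual)  # assumes actual >= 1

# The same actual experience gets very different reports under the two conventions:
for actual in (10 ** 2, 10 ** 5, 10 ** 9):
    print(f"actual intensity {actual:>13,}: "
          f"linear report = {report_linear(actual):.1e}, "
          f"log report = {report_log(actual):.1f}")
```

Reading a log report as if it were linear would make you underestimate the worst experiences by orders of magnitude; but nothing in the sketch - or, as far as I can see, in the post - tells us which convention people actually use.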
(You might wonder if people can know, on an arithmetic scale, how much pleasure/pain they feel. However, if people really have no idea about this, then it follows they can’t intelligibly report their pleasure/pain at all, whatever scale they are using.)
Regarding (B), note that claims such as “the worst stings are 1000x worse than the average person expects” can be true without it being the case that people have misunderstood how other people tend to use pleasure/pain scales. For instance, I could alternatively claim that the relationship between reported and actual pain is linear, but that people’s predictions are just misinformed, e.g. torture is actually much worse than they thought. For comparison, if I claim “the heaviest building in the world weighs 1000x more than most people think it weighs”, I don’t need to say anything about the relationship between reports of perceived weight and actual weight.
Hence, if you want to claim “experiences X and Y are much better/worse than we thought”, just claim that without getting into distracting stuff about reported vs actual scale use!
(P.S. The Fechner-Weber stuff is a red herring: that’s about the relationship between increases in an objective quantity and subjective perceptions of those increases. That’s different from the relationship between a reported subjective quantity and the actually experienced subjective quantity. Plausibly the former relationship is logarithmic, but one shouldn’t infer from that that the latter relationship is logarithmic too.)
Thanks for writing this up—I found it helpful. I’m just trying to summarise this in my head and have some questions.
To get the claim that the best interventions are much better than the rest, don’t you need to claim that interventions follow a (very) fat-tailed distribution, rather than merely that there are lots of interventions? If they were normally distributed, then (say) bednets would be a massive outlier in terms of effectiveness, right? Do you (or does someone else) have an argument that interventions should be heavy-tailed?
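To illustrate the difference I have in mind, here’s a toy simulation (Python, with made-up parameters) comparing a roughly normal world of interventions with a heavy-tailed (lognormal) one:

```python
import random

random.seed(0)
N = 10_000  # hypothetical number of interventions; all parameters invented

# Normal world: effectiveness clusters around the mean.
normal = sorted(random.gauss(1.0, 0.5) for _ in range(N))

# Heavy-tailed world: a lognormal with a long right tail.
heavy = sorted(random.lognormvariate(0.0, 1.5) for _ in range(N))

for name, xs in (("normal", normal), ("lognormal", heavy)):
    mean = sum(xs) / N
    print(f"{name:>9}: mean = {mean:5.2f}, median = {xs[N // 2]:5.2f}, "
          f"99th pct = {xs[int(N * 0.99)]:6.2f}, best = {xs[-1]:7.2f}")
```

In the normal world the best intervention is only a couple of times better than the mean; in the lognormal world it is tens of times better, which is what “the best interventions are much better than the rest” seems to require.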
About predicting effectiveness, it seems your conclusion should be one of epistemic modesty about hard-to-quantify interventions, not that we should never think they are better. The thought seems to be that people are bad at predicting interventions in general, but we can at least check the easy-to-quantify predictions to overcome our bias, whereas we cannot do this for the hard ones. The implication seems to be that we should discount the naive cost-effectiveness of systemic interventions to account for this bias. But ‘sophisticated’ estimates of cost-effectiveness for hard-to-quantify interventions might still turn out to be better than those for simple interventions. Hence it’s a note of caution about estimation, not a claim that hard-to-quantify interventions are, in fact, (always or generally) less cost-effective.
Okay, that makes more sense. You could have a systematic review which unambiguously pointed to one conclusion, but perhaps you should add something like what you’ve already said, i.e. that you’re just trying to report the findings without drawing an overall conclusion (although I don’t know why someone would avoid drawing an overall conclusion if they thought there was one). And again, it would be helpful to add that there doesn’t seem to be a consensus on this point (and possibly that it ‘falls between the gaps’ of various disciplines).
A couple of very general suggestions to aid the reader (I’ve only read the summary). Given the length of the post, could you add a line or two to your summary to say what conclusion you’re arguing for? Reading the summary, I get what the topic is, but not what your take is. It would also be good if you could orientate the reader as to where this fits in the literature, e.g. what the consensus in the field is and whether you are agreeing with it.
I also thought the World Happiness Survey looked flat, but it has gone up. 0.25/10 is not to be sniffed at.
The WHS has a much smaller sample size (around 1,000 per year), whereas the Office for National Statistics asks around 300,000 people a year. ONS data also show a rise of about 0.3/10 between 2011 and 2019 (https://www.ons.gov.uk/peoplepopulationandcommunity/wellbeing/datasets/headlineestimatesofpersonalwellbeing).
I should just flag that I’ve put a post on this topic on the forum too, albeit one that doesn’t directly reply to John but addresses many of the points raised in the OP and in the comments.
I will make a direct reply to John on one issue. He suggests:

“We quantify importance to neglectedness ratios for different problems.”
I don’t think this is a useful heuristic: I don’t see why problems with higher scale-to-neglectedness ratios should be higher priority. There are two issues with it. One is that problems with no resources going towards them will score infinitely highly on this schema. Another is that delineating one ‘problem’ from another is arbitrary anyway.
Let’s illustrate what happens when we put these together. Suppose we’re talking about the cause of reducing poverty and, suppose further, it happens to be the case that it’s just as cost-effective to help one poor person as another. As a cause, quite a lot of money goes to poverty, so let’s assume poverty scores badly (relative to our other causes) on this scale-to-neglectedness rating. I pick out person P, who is currently not receiving any aid, and declare that ‘cause P’ - helping person P - is entirely neglected. Cause P now has an infinite score on scale-to-neglectedness and suddenly looks very promising via this heuristic. This is perverse as, by stipulation, helping P is just as cost-effective as helping any other person in poverty.
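The divide-by-zero problem is easy to see if you write the heuristic down (a minimal sketch; all numbers invented):

```python
def scale_to_neglectedness(scale, current_resources):
    """The heuristic's score: problem scale divided by resources already devoted to it."""
    if current_resources == 0:
        return float("inf")  # any completely neglected "problem" scores infinitely well
    return scale / current_resources

# Poverty as a whole: big problem, but lots of resources already going to it.
print(scale_to_neglectedness(scale=1_000_000, current_resources=500_000))  # 2.0

# The gerrymandered "cause P": tiny scale, zero resources - and an infinite score.
print(scale_to_neglectedness(scale=1, current_resources=0))  # inf
```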
Take technical AI safety research as an example. I’d have trouble directly estimating “How much good would we do by spending $1000 in this area”, or sanity checking the result. I’d also have trouble with “What % of this problem would we solve by spending another $100?”
Hmmm. I don’t really see how this is any harder than, or different from, your proposed method, which is to figure out how much of the problem would be solved by increasing spending by 10%. In both cases you’ve got to do something like working out how much money it would take to ‘solve’ AI safety, and then play with that number.
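A quick sketch of why the two framings collapse into the same estimate (Python; every number is invented, and it crudely assumes linear returns to spending):

```python
# Both heuristics bottom out in the same unknown: the total cost to "solve" the problem.
TOTAL_COST_TO_SOLVE = 10e9   # invented: say $10B to 'solve' AI safety
CURRENT_ANNUAL_SPEND = 50e6  # invented: say $50M/year currently spent

# "What % of the problem does an extra $1,000 solve?"
pct_per_1000 = 1_000 / TOTAL_COST_TO_SOLVE * 100

# "What % of the problem does a 10% increase in spending solve?"
pct_per_10pct_increase = (0.10 * CURRENT_ANNUAL_SPEND) / TOTAL_COST_TO_SOLVE * 100

print(f"{pct_per_1000:.7f}% per extra $1,000")
print(f"{pct_per_10pct_increase:.3f}% per 10% spending increase")
```

Either way, the estimate you can’t avoid making is the total cost to solve.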
Thanks for this. That’s a good question. I think it partially depends on whether you agree with the above analysis. If you think it’s correct that, when we drill down into it, evaluating problems (aka ‘causes’) by S, N, and T is just equivalent to evaluating the cost-effectiveness of particular solutions (aka ‘interventions’) to those problems, then that settles the mystery of what the difference really is between ‘cause prioritisation’ and ‘intervention evaluation’ - in short, they are the same thing and we were confused if we thought otherwise. However, if someone thought there was a difference, it would be useful to hear what it is.
The further question, if cause prioritisation is just the business of assessing particular solutions to problems, is: what are the best ways to go about picking which particular solutions to assess first? Do we just pick them at random? Is there some systematic approach we can use instead? If so, what is it? Previously, we thought we had a two-step method: 1) do cause prioritisation, 2) do intervention evaluation. If they are the same, then we don’t seem to have much of a method to use, which feels pretty dissatisfying.
FWIW, I feel inclined towards what I call the ‘no shortcuts’ approach to cause prioritisation: if you want to know how to do the most good, there isn’t a ‘quick and dirty’ way to tell what the best problems are, as it were, from 30,000 ft. You’ve just got to get stuck in and (intuitively) estimate the particular different things you could do. I’m not confident that we can really assess things at the problem level without looking at solutions, or that we can appeal to e.g. scale or neglectedness by themselves and expect that to work very well. A problem can be large and neglected because it’s intractable, so you can only make progress on cost-effectiveness by getting ‘into the weeds’, looking at particular things you can do and evaluating them.
Thanks for putting this up here. One major and three minor comments.
First, and probably most importantly, I don’t see how this line of reasoning gets you an asymmetry. If I understand it correctly, the idea is that people need to actually exist to have interests, so if people do or will exist, we can say existence will be good/bad for them. But that gets you to actualism, it seems, not an asymmetry. If X would have a bad life, were X to exist, I take it we shouldn’t create X. But then why, if X would have a good life, were X to exist, do we not have reason to create X? You say you’re ‘agnostic’ about whether those who would have good lives have an interest in existing, but I don’t think you give a reason for this agnosticism, which would be the crucial thing to do.
Second, I didn’t really understand the explication of Meacham’s view: you said it ‘solves’ a cavalcade of issues in population ethics but didn’t spell out how it actually solves them. I’m also not sure if your view is different from Meacham’s and, if so, how.
Third, it would be useful if you could spell out what you take (some of) the practical implications of your view to be.
Fourth, because you dive into the deep end quite quickly, I wonder if you should add a note that this is a relatively ‘advanced’ forum post.
To illustrate, suppose we have two (finite or infinite) sequences representing the amount of suffering in our sphere of influence at each point in time, but we make earlier progress on moral circle expansion in one so the amount of suffering in our sphere of influence is reduced by 1 at each step in that sequence compared to the other;
Just to say I really liked this point, which I think applies equally to focusing on the correct account of value (as opposed to who the value-bearers are, which is what this point concerns).
Is putting some non-trivial budget into cash prizes for arguments against what you do the only way to show you’re self-critical? Your statement suggests you believe something like that. But that doesn’t seem to be the only way. I can’t think of any other organisation that has ever done that, so if it is the only way to show you’re self-critical, that suggests no organisation (I’ve heard of) is self-critical, which seems false. I wonder if you’re holding CEA to a peculiarly high standard; would you expect MIRI, 80k, the Gates Foundation, Google, etc. to do the same?
Despite your reservations, I think it would actually be very useful for you to input your best-guess inputs (and it’s likely to be more useful for you to do it than an average EA, given you’ve thought about this more). My thinking is this. I’m not sure I entirely followed the argument, but I took the thrust of it to be “we should do uncertainty analysis (use Monte Carlo simulations instead of point estimates) as our cost-effectiveness estimates might be sensitive to it”. But you haven’t shown that GiveWell’s estimates are sensitive to their reliance on point estimates (have you?), so you haven’t (yet) demonstrated that it’s worth doing the uncertainty analysis you propose after all. :)
More generally, if someone says “here’s a new, really complicated methodology we *could* use”, I think it’s incumbent on them to show that we *should* use it, given the extra effort involved.
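For what it’s worth, here is the kind of minimal check I have in mind (Python, with a toy cost-effectiveness model whose distributions I’ve invented): compare the point estimate with a Monte Carlo run and see whether they actually come apart:

```python
import math
import random

random.seed(0)

# Toy model (all distributions invented): value per dollar = reach * benefit / cost.
def sample_once():
    reach = random.lognormvariate(5.0, 0.5)      # people reached
    benefit = max(random.gauss(0.1, 0.05), 0.0)  # benefit per person reached
    cost = random.lognormvariate(4.0, 0.3)       # total cost in dollars
    return reach * benefit / cost

# Point estimate: plug the central value of each input into the formula.
point = (math.exp(5.0) * 0.1) / math.exp(4.0)

# Monte Carlo: propagate the input uncertainty through the model.
draws = sorted(sample_once() for _ in range(100_000))
mc_mean = sum(draws) / len(draws)

print(f"point estimate:   {point:.3f}")
print(f"Monte Carlo mean: {mc_mean:.3f} "
      f"(5th-95th pct: {draws[5_000]:.3f} to {draws[95_000]:.3f})")
```

If the two numbers agree and the interval is tight, the simpler method was fine; if they come apart, that would itself be the demonstration I’m asking for.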
Well, how about starting a “Tinder for spare rooms”?
I note your main project is writing a book on longtermism. Would you like to see the EA movement going in a direction where it focuses exclusively, or almost exclusively, on longtermist issues? If not, why not?
To explain the second question, it would seem answering ‘no’ to the first question would be in tension with advocating (strong) longtermism.
shows a major problem
You mean, shows a major finding, no? :)
suggesting a violation of transitivity
The (normal) person-affecting response here is to say that options 1 and 3 are incomparable in value to 2 - existence is neither better than, worse than, nor equally good as non-existence for someone. However, if Sam exists necessarily, then 2 isn’t an option, so we can say 3 is better than 1. Hence, no issues with transitivity.
(ii) Society currently privileges those who live today above those who will live in the future; and
(iii) We should take action to rectify that, and help ensure the long-run future goes well.
Do you mean Necessitarians wouldn’t accept (iii) above? Necessitarians will agree with (ii) but deny (iii). (Not sure if this is what you were referring to.)
I’m sympathetic to Necessitarianism, but I don’t know how fringe it is. It strikes me as the most philosophically defensible population axiology that rejects longtermism, which inclines me towards thinking the definition shouldn’t fall foul of it. (I think Hilary’s suggestion would fall foul of it, but yours would not.)
An alternative minimal definition, suggested by Hilary Greaves (though the precise wording is my own), is that we could define longtermism as the view that the (intrinsic) value of an outcome is the same no matter what time it occurs.
Just to make a brief, technical (pedantic?) comment: I don’t think this definition would give you what you want. (Strict) Necessitarianism holds that the only persons who matter are those who exist whatever we do. On such a view, the practical implication is, in effect, that only present people matter. The view is thus not longtermist on your chosen definition. However, Necessitarianism doesn’t discount for time per se (the discounting is only contingently related to time) and hence counts as longtermist on the quoted definition.