Say how much, not more or less versus someone else
Or: "Underrated/overrated" discourse is itself overrated.
BLUF: "X is overrated", "Y is neglected", and "Z is a weaker argument than people think" are all species of second-order evaluations: we are not directly offering an assessment of X, Y, or Z, but do so indirectly by suggesting another assessment, offered by someone else, needs correcting up or down.
I recommend everyone cut this habit down ~90% in aggregate for topics they deem important, replacing the great majority of second-order evaluations with first-order evaluations. Rather than saying whether you think X is over/underrated (etc.), just try to say how good you think X is.
The perils of second-order evaluation
Suppose I say "I think forecasting is underrated". Presumably I mean something like:
I think forecasting should be rated this highly (e.g. 8/10 or whatever)
I think others rate forecasting lower than this (e.g. 5/10 on average or whatever)
So I think others are not rating forecasting highly enough.
Yet whether "forecasting is underrated" is true or not depends on more than just "how good is forecasting?" It is confounded by questions of which "others" I have in mind, and what their views actually are. E.g.:
Maybe you disagree with me (you think forecasting is overrated), but it turns out we basically agree on how good forecasting is. Our apparent disagreement arises because you happen to hang out in more pro-forecasting environments than I do.
Or maybe we hang out in similar circles, but disagree on how to assess the prevailing vibes. We basically agree on how good forecasting is, but differ on what our mutual friends really tend to think about it.
(Obviously, you could also get specious agreement of the two-wrongs-make-a-right variety: you agree with me that forecasting is underrated despite having a much lower opinion of it than I do, because you assess third parties as having an even lower opinion still.)
These are confounders because they confuse the issue we (usually) care about: how good or bad forecasting is, not how inaccurate others are, nor in which direction they err re. how good they think forecasting is.
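To make the confounding concrete, here is a minimal sketch in Python (all names and numbers are hypothetical, purely for illustration): Alice and Bob give forecasting the very same first-order mark, yet issue opposite second-order verdicts, because "underrated" also depends on each person's model of the crowd.

```python
# Hypothetical illustration: identical first-order ratings, opposite
# second-order verdicts, driven entirely by different models of the crowd.

def verdict(my_rating: float, crowd_rating_i_perceive: float) -> str:
    """Second-order evaluation: compare my rating to my model of the crowd."""
    if my_rating > crowd_rating_i_perceive:
        return "underrated"
    if my_rating < crowd_rating_i_perceive:
        return "overrated"
    return "rated about right"

# Alice and Bob give forecasting the *same* mark out of 10...
alice_rating = bob_rating = 7.0

# ...but hang out in different circles, so they model the crowd differently.
alice_perceived_crowd = 5.0  # anti-forecasting environments
bob_perceived_crowd = 8.0    # pro-forecasting environments

print("Alice:", verdict(alice_rating, alice_perceived_crowd))  # underrated
print("Bob:  ", verdict(bob_rating, bob_perceived_crowd))      # overrated
```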
One can cut through this murk by just assessing the substantive issue directly. I offer my take on how good forecasting is: if folks agree with me, it seems people generally weren't over- or underrating forecasting after all. If folks disagree, we can figure out, in the course of figuring out how good forecasting is, whether one of us is over/underrating it versus the balance of reason, not versus some poorly circumscribed subset of prevailing opinion. No phantom third parties to the conversation are needed for, or helpful to, this exercise.
In praise of (kind-of) objectivity, precision, and concreteness
This is easier said than done. In the forecasting illustration above, I stipulated "marks out of ten" as an assessment of the "true value". This is still vague: if I say forecasting is "8/10", that could mean a wide variety of things, including basically agreeing with you despite you giving a different number to me. What makes something 8/10 versus 7/10 here?
It is still a step in the right direction. Although my "8/10" might be essentially the same as your "7/10", there is probably some substantive difference between 8/10 and 5/10, or 4/10 and 6/10. It is still better than second-order evaluation, which adds another source of vagueness: although saying for myself that forecasting is X/10 is tricky, it is still harder to do this exercise on someone else's (or everyone else's) behalf.
And we need not stop there. Rather than some singular measure like "marks out of 10" for "forecasting" as a whole, maybe we have some specific evaluation or recommendation in mind. Perhaps: "Most members of the EA community should have a Metaculus or Good Judgement account they forecast on regularly", or "Forecasting interventions are the best opportunities in the improving institutional decision-making cause area", or "Forecasting should pay well enough that skilled practitioners can realistically 'go pro', vs. it remaining universally an amateur sport". Or whatever else.
We thus approach substantive propositions (or proposals), and can avoid the mire of purely verbal disagreement, or of vaguely adversarial vibing.
Caveats
(Tl;dr: I'm right.)
Sometimes things aren't that ambiguous
The risk I highlight, of "Alice thinks X is overrated, Bob thinks it is underrated, but they basically agree on X and disagree only on what other people think about it", can sometimes be remote. One example is if someone has taken the trouble to clearly and precisely spell out where they stand themselves. Just saying "I'd take the over/under on what they think" could be poor epistemic sportsmanship (all too easy to criticise something specific whilst sheltering in generalities yourself), and could do to be more precise (how much over? etc.), but at least there is an actual difference, and you can be reliably placed in a region of the number line.
Another example is where you are really sure you are an outlier vs. ~everyone else: you rate something so highly or lowly that ~everyone else, whoever they are, is under/overrating it by your lights. This will typically be reserved for one's hottest, most extreme, and most iconoclastic takes. In principle, this should be rare. In practice, it can be the prelude to verbal clickbait: "looking after your kids is overrated" had better be elaborated with something at least as spicy as Caplan's views on parenting, rather than some milquetoast climbdown along the lines of "parents should take care of themselves too" or whatever.
Even here, trying to say how much can be clearer if your view really is "a hell of a lot". "Buffy the Vampire Slayer is criminally underrated" could merely mean I place it a cut above other ~noughties TV serials. Yet if I really think things like "Season 5 of Buffy alone places it on the highest summits of artistic achievement, and the work as a whole makes a similar contribution to television as Beethoven's Grosse Fuge does to classical music", I should say so, such that listeners are clear which ballpark I am in, and how far I am departing from common sense.
Updates and pricing in
Overrated/underrated can have a different goal than offering an overall assessment. It could instead be a means of introducing a new argument for or against X. E.g. perhaps what I could mean by "forecasting is underrated" is something like "I have found a new consideration in favour of forecasting, so folks who are not aware of it yet need to update upwards from wherever they were beforehand."
This is better, but still not great. (E.g.) "X is underrated because R" at least gives a locus for discussion (R? ¬R?), but second-order considerations can still confound. Although R may be novel to the speaker, others may be at least dimly aware of it, or of some nearby R*, so perhaps they have already somewhat "priced in" R to their all-things-considered assessment. "I think the strength of R pro/con X is under/overestimated by others" has the familiar problems outlined above.
Saying how much, the now-familiar remedy, remains effective. (E.g.) "I think R drops the value of X by 5%/50%/99%" or whatever clearly signals the strength of the consideration you are assigning to R, and sidesteps the issue of trying to assess whether someone else (in the conversation or not) is aware of R, or is appropriately incorporating it into their deliberations.
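As a minimal sketch of why this helps (the numbers are made up for illustration): an explicit multiplier both conveys the weight being put on R and lets a listener who has already priced R in see that applying the discount again would double-count it.

```python
# Hypothetical numbers: an explicit multiplier makes the update checkable.

prior_value_of_x = 8.0  # my all-things-considered value for X before R
r_multiplier = 0.5      # my claim: "I think R drops the value of X by 50%"

posterior_value_of_x = prior_value_of_x * r_multiplier
print(f"My value for X after R: {posterior_value_of_x}")  # 4.0

# A listener who had already priced in R (their value is already 4.0) can
# see from the explicit multiplier that discounting again would
# double-count R:
listener_value = 4.0
print(f"Double-counted value: {listener_value * r_multiplier}")  # 2.0, too low
```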
Cadenza
As before, this greater precision is not a free lunch: it takes both more space on the page to write and more time in the brain to think through. Also as before, there are times when this extra effort is a waste. If I assert "Taylor Swift is overrated" to my sister, and she asserts "Bach is overrated [sic][1]" in turn, neither does the subject matter warrant, nor is the conversational purpose well served by, a careful pseudo-quantitative, quasi-objective disquisition into the musical merit of each. Low-res "less/more than someone thinks" remarks are also fine for a bunch of other circumstances. Usually unimportant ones.
Yet also as before, sometimes there is a real matter which really matters, sometimes we want our words to amount to substantial work rather than idle talk, and sometimes we at least aspire to be serious people striving to say something serious about something serious. For such Xs, it is rare for there to be disagreement about whether a given issue is relevant to X, ditto whether its direction is "pro" or "con" X; the disagreement is rather about its magnitude: how much it counts "pro" or "con" X, and so where the overall balance of reason lies re. X all things considered, where all the things to be considered are various degrees of "kinda, but...", which need to be weighed together.[2]
In these cases that count, something like counting needs to be attempted in natural language, despite its inadequacy for the task. Yet although (e.g.) "8/10" or "maybe this cuts 20% off the overall value of X" (etc.) remain imperfect, more/less statements versus some usually vague comparator are even worse. Simply put: underrated/overrated is a peregrination, not a prolegomenon, to the project of proper precisification.[3]
Reality is concrete; its machinations, exact. When it is important to talk about it, our words should try to be the same.
It's somewhat striking that you frame your top-level advice (cut the habit down ~90%) as a comparative.
People surely differ in their current behaviour, and need different adjustments. So why not simply specify what you think the optimal ratio of first- to second-order evaluations is?
My take: not infrequently, as here, comparatives are more precise than first-order evaluations.
You're calling attention to a dimension that people may not have thought about much, and certainly don't have established metrics for. If you said "people should be 9/10 on the use of first-order evaluations and 3/10 on the use of second-order evaluations", you don't know how people will interpret that. It's well within the realm of possibility that some readers will nod along and say "yes, that's how I do things already", even when you would assess their actions quite differently.
By using a comparative, you get the benefit of a common reference point: how much things are already being done. People will have a sense of this even if they don't know how to measure it. You get to specify that people should cut it down by 90%, which is concrete and can surface disagreements.
I do happen to think you're quite wrong in suggesting cutting it down by ~90%, although I agree with the directional nudge versus current practice. I guess that at the moment second-order comparisons comprise the large majority of communication, and it would be better if they comprised a slightly smaller majority: perhaps tripling the amount of use first-order evaluations get.
Hi Owen,
My interpretation is that Gregory is arguing for greater precision in comparative statements, rather than arguing against comparisons in general.
I feel that saying X is overrated/underrated is often a lazy way for people (including me, sometimes) to increase or decrease X's status without making the effort to state their position on X concretely (which opens them up to more criticism, and might require introspection and more careful reasoning rather than purely evaluating vibes).
As an example, could you give X/10 ratings to the idea of relative and absolute ratings?
I am glad somebody wrote this post. I often have the inclination to write posts like these, but I feel like advice like this is sometimes good and sometimes bad, and it would be disingenuous for me to stake out a claim in any direction. Nonetheless, I think it's a good mental exercise to explicitly state the downsides of comparative claims and the upsides of absolute claims, and then people in the comments will (and have) assuredly explain the opposite.
Interesting take. I don't like it.
Perhaps because I like saying overrated/underrated.
But also because overrated/underrated is a quick way to provide information. "Forecasting is underrated by the population at large" is much easier to think of than "forecasting is probably rated 4/10 by the population at large and should be rated 6/10".
Over/underrated requires about 3 mental queries: "Is it better or worse than my ingroup thinks?", "Is it better or worse than the population at large thinks?", "Am I gonna have to be clear about what I mean?"
Scoring the current and desired status of something requires about 20 queries: "Is 4 fair?", "Is 5 fair?", "What axis am I rating on?", "Popularity?", "If I score it a 4, will people think I'm crazy?"...
Like in some sense you're right that % forecasts are more useful than "more likely/less likely" and sizes are better than "bigger/smaller", but when dealing with intangibles like status I think it's pretty costly to calculate some status number, so I do the cheaper thing.
Also, would you prefer people used over/underrated less, or would you prefer the people who use over/underrated spoke less? Because I would guess that some chunk of those 50-ish karma are from people who don't like the vibe rather than some epistemic thing. And if that's the case, I think we should have a different discussion.
I guess I think that might come from a frustration around jargon, or rationalists in general. And I'm pretty happy to try and broaden my answer from over/underrated, just as I would if someone asked me how big a star was and I said "bigger than an elephant". But it's worth noting it's a bandwidth thing, often used because giving exact sizes in status is hard. Perhaps we should have numbers and words for it, but we don't.
I agree that "underrated/overrated" or similar directional commentary is often a better way to convey information. Not least because the directional comment sometimes is information (e.g. there's a source of systematic error which biases the results), whereas an attempt to estimate the magnitude of the adjustment necessary is just a guess. And using vague verbal qualifiers ("x is very large", "the error is minimal") instead of a made-up figure much more accurately conveys that something is opinion or methodological critique rather than new data.
Using an actual figure where it exists is obviously good epistemics, but the use of guesstimates risks anchoring truth-seekers to your guesses. Setting the expectation that anyone who participates must supply numbers is worse, as it sets a high bar to commentary (really, I should be able to say a field is "neglected" without specifying how much funding it deserves and how it should be spent!) and can be used to insulate from criticism. "If you think I've inflated my outlying estimate, you should tell me exactly how much you think each figure should be, so I can attack your lack of evidence instead" seems like a more problematic rhetorical technique than understating just how extreme your enthusiasm for something is in order to help reach consensus.
As an outsider (other outside perspectives exist!), I'd say there's probably more frustration with rationalists/EAs often appearing to like the vibe of artificially precise numerical claims about things which are weakly evidenced or completely subjective...
Reality is concrete, but the artistic merit of Buffy or the moral weight of livestock isn't (even if it is an occasionally useful concept for modelling/ranking priorities), and I'm not sure "people should rate forecasting at 8/10" actually conveys any information at all. The illusion of precision is overrated ;-)
(I am pretty unsure I understood this correctly, so this comment might be a mistake; posting anyway, as it might be clarifying for others as well if so.)
It seems to me that there are two dimensions here:
(a) whether or not a statement is comparative
(b) whether or not a statement is confounded by an unobservable
Comparative statements can be confounded when the comparison standard is not made explicit, which seems to be your main critique. If I understand you correctly, you see the main response in non-comparative first-order evaluations.
But shouldn't, in many cases, the solution be better-explicated and more precise comparative statements (e.g. "I think forecasting is X times better than commonly assumed, where my model of what is commonly assumed is based on Y"), rather than a non-comparative first-order evaluation of how good forecasting is by objective standards?
It seems to me that a big advantage of comparative statements is that (i) decisions usually require comparative statements, and if those are not available, non-comparative estimates will often be compared anyway (introducing confounding in terms of whether different estimates were made with roughly comparable methods and standards), and (ii) many situations only allow for comparative statements, and permit more robustness on comparative grounds than trying to get to accurate first-order evaluations.
E.g. it seems to me that almost all credible knowledge in longtermism comes from comparative statements: there are vast uncertainties about the absolute first-order goodness of many things, but, relatively speaking, much more certainty about relative priority, and, luckily, that is also what matters most when making decisions. E.g. it seems pretty impossible to estimate the absolute goodness of reducing existential risk from source X or source Y, but we can say relatively meaningful things about the priority of working on X or Y. Would getting to more precise comparisons at the level of comparative statements also be part of your suggested project here?
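A minimal simulation of point (ii), with made-up numbers: if absolute estimates of X and Y share a large common error (say, deep uncertainty that hits both alike), that error cancels in the comparison, so the comparative estimate is far more reliable than either absolute estimate on its own.

```python
# Hypothetical simulation: shared errors cancel in comparative estimates.
import random

random.seed(0)
true_x, true_y = 10.0, 7.0

abs_errors, comp_errors = [], []
for _ in range(10_000):
    shared_bias = random.gauss(0, 5)  # large common uncertainty hitting both
    est_x = true_x + shared_bias + random.gauss(0, 1)  # small own noise
    est_y = true_y + shared_bias + random.gauss(0, 1)
    abs_errors.append(abs(est_x - true_x))
    comp_errors.append(abs((est_x - est_y) - (true_x - true_y)))

print("Mean error, absolute estimate of X:  ", sum(abs_errors) / len(abs_errors))
print("Mean error, comparative estimate X-Y:", sum(comp_errors) / len(comp_errors))
# The shared bias cancels in X - Y, so the comparative error is much smaller.
```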
I think overrated-underrated is useful because it's trying to say whether we should be doing more or less of X on the margin. Often it's much more useful to know whether something is good on the current margin rather than on average.