Overall great post, and I broadly agree with the thesis. (I’m not sure the evidence you present is all that strong though, since it too is subject to a lot of selection bias.) One nitpick:
Most of the post’s comments were critical, but they didn’t positively argue against EV calculations being bad for longtermism. Instead they completely disputed that EV calculations were used in longtermism at all!
I think you’re (unintentionally) running a motte-and-bailey here.
Motte: Longtermists don’t think you should build explicit quantitative models, take their best guess at the inputs, chug through the math, and do whatever the model says, irrespective of common sense, verbal arguments, model uncertainty, etc.
Bailey: Longtermists don’t think you should use numbers or models (and as a corollary don’t consider effectiveness).

(My critical comment on that post claimed the motte; later I explicitly denied the bailey.)
I’m not sure the evidence you present is all that strong though, since it too is subject to a lot of selection bias
Oh I absolutely agree. I generally think the more theoretical sections of my post are stronger than the empirical sections. I think the correct update from my post is something like “there is strong evidence of nonzero motivated reasoning in effective altruism, and some probability that motivated reasoning + selection bias-mediated issues are common in our community” but not enough evidence to say more than that.
I think a principled follow-up (maybe by CEA’s new epistemics project manager?) would look like combing through all (or a statistically representative sample of) impact assessments and/or arguments made in EA, and trying to catalogue them for motivated reasoning and other biases.
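As a toy illustration of the selection-bias worry (all numbers below are invented for illustration, not drawn from any real assessment), here is a sketch of what happens if only impressive-looking impact estimates get written up:

```python
import random

random.seed(0)

# Hypothetical setup (made-up numbers): every project has the same
# true impact, and each impact assessment is just a noisy estimate of it.
TRUE_IMPACT = 1.0
NOISE_SD = 1.0
N = 10_000

estimates = [random.gauss(TRUE_IMPACT, NOISE_SD) for _ in range(N)]

# Selection effect: only assessments that look impressive get written up.
published = [e for e in estimates if e > 2.0]

mean_all = sum(estimates) / len(estimates)
mean_published = sum(published) / len(published)

print(f"mean of all estimates:       {mean_all:.2f}")       # close to the true 1.0
print(f"mean of published estimates: {mean_published:.2f}")  # well above it
```

The published subset looks far more effective than the truth even though every underlying project is identical, which is why cataloguing a representative sample (rather than whatever got written up) matters.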
I think you’re (unintentionally) running a motte-and-bailey here.
I think this is complicated. It’s certainly possible I’m fighting against strawmen!
But I will just say what I think/believe right now, and others are free to correct me. I think among committed longtermists, there is a spectrum of trust in explicit modeling, going from my stereotype of weeatquince(2020)’s views to maybe 50% (30%?) of the converse of what you call the “motte” (maybe Michael Dickens(2016) is closest?). My guess is that longtermist EAs (like almost all humans) have never been that close to purely quantitative models guiding decisions, and that we’ve moved closer in the last 5 years to the reference classes of fields like the ones that weeatquince’s post pulls from.
I also think I agree with MichaelStJules’ point about the amount of explicit modeling that actually happens relative to effort given to other considerations. “Real” values are determined not by what you talk about, but by what tradeoffs you actually make.
My guess is that longtermist EAs (like almost all humans) have never been that close to purely quantitative models guiding decisions
I agree with the literal meaning of that, because it is generally a terrible idea to just do what a purely quantitative model tells you (and I’ll note that even GiveWell isn’t doing this). But imagining the spirit of what you meant, I suspect I disagree.
I don’t think you should collapse it into the single dimension of “how much do you use quantitative models in your decisions”. It also matters how amenable the decisions are to quantitative modeling. I’m not sure how you’re distinguishing between the two hypotheses:
Longtermists don’t like quantitative modeling in general.
Longtermist questions are not amenable to quantitative modeling, and so longtermists don’t do much quantitative modeling, but they would if they tackled questions that were amenable to quantitative modeling.
(Unless you want to defend the position that longtermist questions are just as easy to model as, say, those in global poverty? That would be… an interesting position.)
Also, just for the sake of actual evidence, here are some attempts at modeling, biased towards AI since that’s the space I know. Not all are quantitative, and none of them are cost effectiveness analyses.

Open Phil’s reports on AI timelines: Biological anchors, Modeling the Human Trajectory, Semi-informative priors, brain computation, probability of power-seeking x-risk

Races: racing to the precipice, followup

Mapping out arguments: MTAIR and its inspiration
going from my stereotype of weeatquince(2020)’s views
Fwiw, my understanding is that weeatquince(2020) is very pro modeling, and is only against the negation of the motte. The first piece of advice in that post is to use techniques like assumption based planning, exploratory modeling, and scenario planning, all of which sound to me like “explicit modeling”. I think I personally am a little more against modeling than weeatquince(2020).
Thanks so much for the response! Upvoted.

(I’m exaggerating my views here to highlight the differences; I think my all-things-considered opinion on these positions is much closer to yours than the rest of this comment will make it sound.)
I think my strongest disagreement with your comment is the framing here:
I’m not sure how you’re distinguishing between the two hypotheses:
Longtermists don’t like quantitative modeling in general.
Longtermist questions are not amenable to quantitative modeling, and so longtermists don’t do much quantitative modeling, but they would if they tackled questions that were amenable to quantitative modeling.
(Unless you want to defend the position that longtermist questions are just as easy to model as, say, those in global poverty? That would be… an interesting position.)
If we peel away the sarcasm, I think the implicit argument is:

X is less amenable than Y to method A of obtaining truth.

Thus, we should use method A less in X than in Y.

Unless I’m missing something, this is logically invalid. The valid form needs a further premise: if X is less amenable than Y to method A of obtaining truth, and X is equally or more amenable to methods B, C, and D relative to Y, then we should do less of method A to obtain truth in X (relative to Y), and more of methods B, C, and D. The obvious response here is that I don’t think longtermist questions are more amenable to explicit quantitative modeling than global poverty, but I’m even more suspicious of the other methodologies here.
Medicine is less amenable to empirical testing than physics, but that doesn’t mean that clinical intuition is a better source of truth for the outcomes of drugs than RCTs. (By contrast, medicine is much less amenable to theorems than physics, so it’s correct to use fewer proofs in medicine than in physics.)
More minor gripes:
(and I’ll note that even GiveWell isn’t doing this).
I think I’m willing to bite the bullet and say that GiveWell (or at least my impression of them from a few years back) should be more rigorous in their modeling. E.g., it’s weird to use the median staff member’s views as a proxy for truth, weird to have so few well-specified forecasts, and so forth.
The first piece of advice in that post is to use techniques like assumption based planning, exploratory modeling, and scenario planning, all of which sound to me like “explicit modeling
I think we might just be arguing about different things here? Like to me, these seem more like verbal arguments of questionable veracity than something that has a truth-value like cost-effectiveness analyses or forecasting. (In contrast, Open Phil’s reports on AI, or at least the ones I’ve read, would count as modeling).
because it is generally a terrible idea to just do what a purely quantitative model tells you
What’s the actual evidence for this? I feel like this type of reasoning (and other general claims like it, in the rough cluster of “injunctions against naive consequentialism”) is pretty common in our community and tends to be strongly held, but when I ask people to defend it, I just see weird thought experiments and handwaved intuitions, rather than a model or a track record.
This type of view also maps, in my head, onto the type of view that’s a) high-status and b) diplomatic/“plays nicely” with high-prestige non-EA Western intellectuals, which makes me doubly suspicious that views of this general shape are arrived at through impartial truth-seeking means.
I also think in practice if you have a model telling you to do one thing but your intuitions tell you to do something else, it’s often worth making enough updates to form a reflective equilibrium. There are at least two ways to go about this:
1) Use the model to maybe update your intuitions, and go with your intuitions in the final decision, being explicit about how your final decisions may have gone against the naive model.
2) Use your intuitions to probe which mistakes your model is making, update your model accordingly, and then go with your (updated) model in the final decision, being explicit about how your final model may have been updated for unprincipled reasons.
I think you (and, by revealed preferences, the EA community, including myself) usually go with 1) as the correct form of intuition-vs-model reflective equilibrium. But I don’t think this is backed by much evidence, and I think we haven’t really given 2) a fair shake.
Now I think in practice 1) and 2) might end up getting the same result much of the time anyway. But a) probably not all the time and b) this is an empirical question.
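One way to make option 2) concrete (a toy sketch, under the strong and debatable assumption that both the model’s output and your intuition can be treated as independent noisy Gaussian estimates of the same quantity) is a precision-weighted average, where the final answer leans toward whichever source has tighter error bars:

```python
def precision_weighted_update(model_mean, model_sd, intuition_mean, intuition_sd):
    """Combine a model's estimate with an intuitive estimate.

    Treats both as independent Gaussian estimates of the same quantity;
    the combined estimate weights each by its precision (1 / variance).
    """
    w_model = 1.0 / model_sd ** 2
    w_intuition = 1.0 / intuition_sd ** 2
    mean = (w_model * model_mean + w_intuition * intuition_mean) / (w_model + w_intuition)
    sd = (w_model + w_intuition) ** -0.5
    return mean, sd

# Hypothetical numbers: the model says 100 (but is built on shaky inputs,
# so very wide error bars), intuition says 20 (narrower, but maybe biased).
mean, sd = precision_weighted_update(100, 80, 20, 15)
print(f"combined estimate: {mean:.1f} ± {sd:.1f}")  # lands near the intuition
```

This doesn’t settle which of 1) or 2) is right; it just shows that the disagreement can itself be modeled, rather than resolved by always deferring to gut feel.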
The obvious response here is that I don’t think longtermist questions are more amenable to explicit quantitative modeling than global poverty, but I’m even more suspicious of other methodologies here.
Yeah, I’m just way, way more suspicious of quantitative modeling relative to other methodologies for most longtermist questions.
I think we might just be arguing about different things here?
Makes sense, I’m happy to ignore those sorts of methods for the purposes of this discussion.
Medicine is less amenable to empirical testing than physics, but that doesn’t mean that clinical intuition is a better source of truth for the outcomes of drugs than RCTs.
You can’t run an RCT on arms races between countries, whether or not AGI leads to extinction, whether totalitarian dictatorships are stable, whether civilizational collapse would be a permanent trajectory change vs. a temporary blip, etc.
What’s the actual evidence for this?
It just seems super obvious in almost every situation that comes up? I also don’t really know how you expect to get evidence; it seems like you can’t just “run an RCT” here, when a typical quantitative model for a longtermist question takes ~a year to develop (and that’s in situations that are selected for being amenable to quantitative modeling).
For example, here’s a subset of the impact-related factors I considered when I was considering where to work:
Lack of non-xrisk-related demands on my time
Freedom to work on what I want
Ability to speak publicly
Career flexibility
Salary
I think incorporating just these factors into a quantitative model is a hell of an ask (and there are others I haven’t listed here—I haven’t even included the factors for the academia vs industry question). A selection of challenges:
I need to make an impact calculation for the research I would do by default.
I need to make that impact calculation comparable with donations (so somehow putting them in the same units).
I need to predict the counterfactual research I would do at each of the possible organizations if I didn’t have the freedom to work on what I wanted, and quantify its impact, again in similar units.
I need to model the relative importance of technical research that tries to solve the problem vs. communication.
To model the benefits of communication, I need to model field-building benefits, legitimizing benefits, and the benefit of convincing key future decision-makers.
I need to quantify the probability of various kinds of “risks” (the org I work at shuts down, we realize AI risk isn’t actually a problem, a different AI lab reveals that they’re going to get to AGI in 2 years, unknown unknowns) in order to quantify the importance of career flexibility.
I think just getting a framework that incorporates all of these things is already a Herculean effort that really isn’t worth it, and even if you did make such a framework, I would be shocked if you could set the majority of the inputs based on actually good reference classes rather than just “what my gut says”. (And that’s all assuming I don’t notice a bunch more effects I failed to mention initially that my intuitions were taking into account but that I hadn’t explicitly verbalized.)
It seems blatantly obvious that the correct choice here is not to try to get to the point of “quantitative model that captures the large majority of the relevant considerations with inputs that have some basis in reference classes / other forms of legible evidence”, and I’d be happy to take a 100:1 bet that you wouldn’t be able to produce a model that meets that standard (as I evaluate it) in 1000 person-hours.
I have similar reactions for most other cost effectiveness analyses in longtermism. (For quantitative modeling in general, it depends on the question, but I expect I would still often have this reaction.)
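To put a rough shape on why I expect the inputs to dominate: even a drastically simplified three-factor version of the career calculation above, with every input set to a made-up gut-level range (none of these numbers are real estimates), produces an output distribution spanning orders of magnitude:

```python
import math
import random

random.seed(0)

def lognormal_from_90ci(low, high):
    """Sample from a lognormal whose central 90% interval is roughly (low, high).

    Standard trick for turning a gut-level range into a distribution;
    the inputs themselves are still just guesses.
    """
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * 1.645)
    return random.lognormvariate(mu, sigma)

def one_sample():
    # Placeholder inputs, each a wide subjective range (not real estimates):
    direct_research_value = lognormal_from_90ci(1, 100)   # arbitrary impact units
    communication_multiplier = lognormal_from_90ci(0.5, 5)
    p_org_survives = random.uniform(0.6, 0.95)            # career-flexibility risk
    return direct_research_value * communication_multiplier * p_org_survives

samples = sorted(one_sample() for _ in range(100_000))
p5, p50, p95 = samples[5_000], samples[50_000], samples[95_000]
print(f"5th / 50th / 95th percentile: {p5:.1f} / {p50:.1f} / {p95:.1f}")
print(f"95th-to-5th percentile ratio: {p95 / p5:.0f}x")
```

The point is not the particular percentiles but that the spread comes almost entirely from the subjective ranges, so the model’s conclusion is only as legible as the gut guesses feeding it.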
Eg, weird to use median staff member’s views as a proxy for truth
If you mean that the weighting on saving vs. improving lives comes from the median staff member, note that GiveWell has been funding research that aims to set these weights in a manner with more legible evidence, because the evidence didn’t exist. In some sense this is my point—that if you want to get legible evidence, you need to put in large amounts of time and money in order to generate that evidence; this problem is worse in the longtermist space and is rarely worth it.
I’m not defending what you think is a bailey, but as a practical matter, I would say that until recently (with Open Phil publishing a few models for AI), longtermists have not been using numbers or models much, and when they do, some of the most important parameters are extremely subjective personal guesses or averages of people’s guesses, not based on reference classes, and risks of backfire are not included.

This seems to me to not be the case. For a very specific counterexample, AI Impacts has existed since 2015.

Fair. I should revise my claim to be about the likelihood of a catastrophe and the risk reduction from working on these problems (especially or only in AI; I haven’t looked as much at what’s going on in other x-risk work). AI Impacts looks like they were focused on timelines.
Replied to Linch—TL;DR: I agree this is true compared to global poverty or animal welfare, and I would defend this as simply the correct way to respond to actual differences in the questions asked in longtermism vs. those asked in global poverty or animal welfare.
You could move me by building an explicit quantitative model for a popular question of interest in longtermism that (a) didn’t previously have models (so e.g. patient philanthropy or AI racing doesn’t count), (b) has an upshot that we didn’t previously know via verbal arguments, (c) doesn’t involve subjective personal guesses or averages thereof for important parameters, and (d) I couldn’t immediately tear a ton of holes in that would call the upshot into question.
You could move me by building an explicit quantitative model for a popular question of interest in longtermism that (a) didn’t previously have models (so e.g. patient philanthropy or AI racing doesn’t count), (b) has an upshot that we didn’t previously know via verbal arguments, (c) doesn’t involve subjective personal guesses or averages thereof for important parameters, and (d) I couldn’t immediately tear a ton of holes in that would call the upshot into question.
I feel that (b) identifying a new upshot shouldn’t be necessary; I think it should be enough to build a model with reasonably well-grounded parameters (or well-grounded ranges for them) in a way that substantially affects the beliefs of those most familiar with or working in the area (and maybe enough to change minds about what to work on, within AI or to AI or away from AI). E.g., more explicitly weighing risks of accelerating AI through (some forms of) technical research vs actually making it safer, better grounded weights of catastrophe from AI, a better-grounded model for the marginal impact of work. Maybe this isn’t a realistic goal with currently available information.
Yeah, I agree that would also count (and as you might expect I also agree that it seems quite hard to do).
Basically with (b) I want to get at “the model does something above and beyond what we already had with verbal arguments”; if it substantially affects the beliefs of people most familiar with the field that seems like it meets that criterion.