Hm, I think I have different intuitions about several points.
you have a limited budget to convince people of weird ideas
I’m not sure this budget is all that fixed. Longtermism pretty straightforwardly implies that empirical claims about x-risk are worth thinking more about. So maybe this budget grows significantly (maybe differentially) if someone gets convinced of longtermism. (Anecdotally, this seems true—I don’t know any committed longtermist who doesn’t think empirical claims about x-risk are worth figuring out, although admittedly there are confounding factors.)
My guess is that most people who genuinely believe these empirical claims about x-risk will be on board with most of the action relevant EA recommendations.
Maybe some of our different intuitions are also coming from thinking about different target audiences. I agree that simplifying pitches to just empirical x-risk stuff would make sense when talking to most people. Still, the people who sign up for intro programs aren’t most people—they’re strongly (self-)selected for interest in prioritization, for interest in ethical reasoning, and for having ethically stronger competing demands on their careers.
And, IMO, the empirical claims are much more objective than the moral claims, and are an easier case to make. I just don’t think you can make moral philosophy arguments that are objectively convincing.
Sure, they’re more objective, but I don’t see why that’s relevant—to be convincing, an argument doesn’t need to be objectively convincing; it just needs to be convincing to its audience. (And if that weren’t the case, we might be in trouble, since the notion of “objectively convincing arguments” seems confused.)
(Tangentially, there’s also the question of whether arguments over subjective probabilities can be entirely objective/empirical.)
Theoretical points aside, the empirical arguments also don’t seem to me like an easier case to make. The minimum viable case you present for AI is over a page long, while the minimum viable case for longtermism is just a few sentences (i.e., a slightly more elaborate version of “Future people matter just as much as current people, and there could be a lot of future people.”)
Also, if the failure mode of this advice is a bunch of people trying to prevent biorisks that kill billions of people but don’t actually permanently derail civilisation, I’m pretty fine with that? That feels like a great outcome to me.
Whether this outcome involves a huge waste of those individuals’ potential for impact seems downstream of disagreement on longtermism. And of course we can conclude that longtermism should be excluded from the intro program if we’re confidently assuming that it’s wrong. I thought the more interesting question that your post was raising was whether it would make sense for the intro program to cover longtermism, under the assumption that it’s true (or under agnosticism).
One exception might be people who significantly prioritise animal welfare, and think that the current world is majorly net bad due to factory farming? But that the future world will likely contain far less factory farming and many more happy humans. But if your goal is to address that objection, IMO current intro materials still majorly miss the mark.
I agree that intro materials should include empirical stuff. If we’re talking specifically about intro materials that do include that as well as the philosophical stuff, then I don’t see why they majorly miss the mark for these people. I think both the empirical and philosophical stuff are logically necessary for convincing these people (and I suspect these people tend to be unusually good at figuring stuff out and therefore pretty valuable to convince, although I’m biased).
I tentatively agree with most of your other points.
Thanks, this is some great pushback. Strongly upvoted.
Re “long-termists will think hard about x-risk”: that’s a good point. Implicitly I think I’m following the intuition that people don’t really evaluate a moral claim in isolation. And that when someone considers how convinced they should be by long-termism, they’re asking questions like “does this moral system imply important things about my actions?” And that it’s much easier to convince them of the moral claim once you can point to tractable, action-relevant conclusions.
Re target audiences, I think we are imagining different settings. My read from running intro fellowships is that lots of people find long-termism weird, and I implicitly think that many people who ultimately end up identifying as long-termist still have a fair amount of doubt but are deferring to their perception of the EA consensus. Plus, even if your claim IS true, to me that would imply that we’re selecting intro fellows wrong!
Implicit model: people have two hidden variables, ‘capacity to be convinced of long-termism’ and ‘capacity to be convinced of x-risk’. These are not fully correlated, and I’d rather only condition on the second one, to maximise the set of reachable people (I say this as someone identifying with the second category much more than the first!)
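To make that concrete, here’s a minimal simulation sketch of the implicit model; the correlation, cutoff, and sample size are all made-up illustrative assumptions, not estimates.

```python
import numpy as np

# Toy version of the two-hidden-variable model above. All numbers
# (correlation, cutoff, sample size) are illustrative assumptions.
rng = np.random.default_rng(0)
n = 100_000
rho = 0.5  # assumed imperfect correlation between the two capacities
cov = [[1.0, rho], [rho, 1.0]]
longtermism_cap, xrisk_cap = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

threshold = 1.0  # 'convincible' = latent capacity above this cutoff
via_xrisk_alone = xrisk_cap > threshold
via_both = (xrisk_cap > threshold) & (longtermism_cap > threshold)

print(f"Convincible on x-risk alone:           {via_xrisk_alone.mean():.1%}")
print(f"Convincible if long-termism required:  {via_both.mean():.1%}")
```

On these made-up numbers, requiring people to also buy long-termism cuts the reachable pool by more than half (roughly 16% vs. 6% here); the weaker the correlation between the two capacities, the bigger the loss.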
This also addresses your third point—I expect the current framing is losing a bunch of people who buy x-risk but not long-termism, or who are e.g. suspicious of highly totalising arguments like Astronomical Waste that imply ‘it is practically worthless to do things that just help people alive today’.
Though it’s fair to say that there are people who CAN be reached by long-termism much more easily than x-risk. I’d be pro giving them the argument for long-termism and some intuition pumps and seeing if it grabs them, so long as we also ensure that the message doesn’t implicitly feel like “and if you don’t agree with long-termism you also shouldn’t prioritise x-risk”. The latter is the main thing I’m protecting here.
Re your fourth point, yeah that’s totally fair, point mostly conceded. By the lights of long-termism I guess I’d argue that the distinction between work to prevent major disasters and work to ruthlessly focus on x-risk isn’t that strong? It seems highly likely that work to prevent natural pandemics is somewhat helpful for preventing engineered pandemics, and that work to prevent mild engineered pandemics is useful for preventing major ones. I think that work to reduce near-term problems in AI systems is on average somewhat helpful for long-term safety. It is likely less efficient, but maybe only 3-30x? And I think we should often be confused and uncertain about our stories for how to just prevent the very worst disasters, and this kind of portfolio is more robust to mistakes re the magnitude of different disasters. Plus, I expect a GCBR (global catastrophic biological risk) to heavily destabilise the world and to increase x-risk overall, by making x-risks that could otherwise be averted with good coordination more likely.
people don’t really evaluate a moral claim in isolation. [...] And that it’s much easier to convince them of the moral claim once you can point to tractable, action-relevant conclusions.
This seems right—I’ve definitely seen people come across longtermism before coming across x-risks, and have a reaction like, “Well, sure, but can we do anything about it?” I wonder if this means intro programs should at least flip the order of materials—put x-risks before longtermism.
My read from running intro fellowships is that lots of people find long-termism weird, and I implicitly think that many people who ultimately end up identifying as long-termist still have a fair amount of doubt but are deferring to their perception of the EA consensus. Plus, even if your claim IS true, to me that would imply that we’re selecting intro fellows wrong!
Oh interesting, in my experience (from memory, which might be questionable) intro fellows tend to theoretically buy (at least weak?) longtermism pretty easily. And my vague impression is that a majority of professional self-identified longtermists are pretty comfortable with the idea—I haven’t met anyone who’s working on this stuff and says they’re deferring on the philosophy (while I feel like I’ve often heard that people feel iffy/confused about the empirical claims).
And interesting point about the self-selection effects being ones to try to avoid! I think those self-selection effects mostly come from the EA branding of the programs, so it’s not immediately clear to me how those self-selection effects can be eliminated without also losing out on some great self-selection effects (e.g., selection for analytical thinkers, or for people who are interested in spending their careers helping others).
I’d be pro giving them the argument for long-termism and some intuition pumps and seeing if it grabs them, so long as we also ensure that the message doesn’t implicitly feel like “and if you don’t agree with long-termism you also shouldn’t prioritise x-risk”. The latter is the main thing I’m protecting here.
Yeah, that’s fair.
It is likely less efficient, but maybe only 3-30x
I’m sympathetic to something along these lines. But I think that’s a great case (from longtermists’ lights) for keeping longtermism in the curriculum. If one week of readings has a decent chance of boosting already-impactful people’s impact by, say, 10x (by convincing them to switch to 10x more impactful interventions), that seems like an extremely strong reason for keeping that week in the curriculum.
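To gut-check that with made-up numbers: if one week of readings has, say, a 20% chance of moving a participant onto a 10x more impactful path (and otherwise changes nothing), the expected impact multiplier is 0.2 * 10 + 0.8 * 1 = 2.8, i.e. nearly 3x in expectation from a single week of content. Even at a 5% chance it’s ~1.45x, which still seems hard to beat for one week.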
I haven’t met anyone who’s working on this stuff and says they’re deferring on the philosophy (while I feel like I’ve often heard that people feel iffy/confused about the empirical claims).
Fair—maybe I feel that people mostly buy ‘future people have non-zero worth and extinction sure is bad’, but that they may be more uncertain about a totalising view like ‘almost all value is in the far future, stuff today doesn’t really matter, and total moral worth scales with the number of future people, which could easily be >=10^20’.
I’m sympathetic to something along these lines. But I think that’s a great case (from longtermists’ lights) for keeping longtermism in the curriculum. If one week of readings has a decent chance of boosting already-impactful people’s impact by, say, 10x (by convincing them to switch to 10x more impactful interventions), that seems like an extremely strong reason for keeping that week in the curriculum.
Agreed! (Well, by the lights of longtermism at least—I’m convinced that extinction is 10x worse than temporary civilisational collapse, but maybe not 10^10x worse.) At this point I feel like we mostly agree: keeping a fraction of the content on longtermism, after x-risks, and making it clear that it’s totally legit to work on x-risk without buying longtermism would make me happy.
Re your final point, I mostly just think they miss the mark by not really addressing the question of what the long-term distribution of animal welfare looks like. (I’m personally pretty surprised by the comparative lack of discussion about how likely our lightcone is to be net bad by the lights of people who put significant weight on animal welfare.)
Maybe I’m getting mixed up, but weren’t we talking about convincing people who believe that “the future world will likely contain far less factory farming and many more happy humans”? (I.e., the people for whom the long-term distribution of animal welfare is, by assumption, not that much of a worry)
Maybe you had in mind the people who (a) significantly prioritize animal welfare, and (b) think the long-term future will be bad due to animal welfare issues? Yeah, I’d also like to see more good content for these people. (My sense is there’s been a decent amount of discussion, but it’s been kind of scattered (which also makes it harder to feature in a curriculum). Maybe you’ve already seen all this, but I personally found section 1.2 of the GPI agenda helpful as a compilation of this discussion.)
Ah sorry, the original thing was badly phrased. I meant: a valid objection to x-risk work might be “I think that factory farming is really, really bad right now, and I prioritise this over dealing with x-risk”. And if you don’t care about the distant future, that argument seems pretty legit from some moral perspectives? Whereas if you do care about the distant future, you need to answer the question of what the future distribution of animal welfare looks like, and it’s not obviously positive. So to convince these people you’d need to convince them that the distribution is positive.