Mau comments on Simplify EA Pitches to “Holy Shit, X-Risk”

Mau 11 Feb 2022 5:43 UTC
30 points
0 ∶ 0
Thanks for this!

It’s reasonable to agree with these arguments, but consider something else an even bigger problem! While I’d personally disagree, any of the following seem like justifiable positions: climate change, progress studies, global poverty, factory farming.

This seems to me like more than a caveat—I think it reverses this post’s conclusions that “the common discussion of longtermism, future generations and other details of moral philosophy in intro materials is an unnecessary distraction,” and disagreement on longtermism has “basically no important implications for [your, and implicitly, others’] actions.”

After all, if (strong) longtermism has very big, unique implications about what cause areas people should focus on (not to mention implications about whether biosecurity folks should focus on preventing permanent catastrophes or more temporary ones)… aren’t those some pretty important implications for our actions?

That seems important for introductory programs; if longtermism is necessary to make the case that AI/bio are most important (as opposed to “just” being very important), then introducing longtermism will be helpful for recruiting EAs to work on these issues.
- Neel Nanda 11 Feb 2022 11:51 UTC
  25 points
  0 ∶ 0
  Parent
  TL;DR I think that in practice most of these disagreements boil down to empirical cruxes not moral ones. I’m not saying that moral cruxes are literally irrelevant, but that they’re second order, only relevant to some people, and only matter if people buy the empirical cruxes, and so should not be near the start of the outreach funnel but should be brought up eventually
  Hmm, I see your point, but want to push back against this. My core argument is essentially stemming from an intuition that you have a limited budget to convince people of weird ideas, and that if you can only convince them of one weird ideas it should be the empirical claims about the probability of x-risk, not the moral claims about future people. My guess is that most people who genuinely believe these empirical claims about x-risk will be on board with most of the action relevant EA recommendations. While people who buy the moral claims but NOT the empirical claims will massively disagree with most EA recommendations.
  And, IMO, the empirical claims are much more objective than the moral claims, and are an easier case to make. I just don’t think you can make moral philosophy arguments that are objectively convincing.
  I’m not arguing that it’s literally useless to make the moral arguments—once you’ve convinced someone of the first weird idea, they’re probably willing to listen to the second weird idea! But if you fail to convince them that the first weird idea is worth taking seriously they probably aren’t. And I agree that once you get into actually working on a field there may be subtle differences re trading off short term disasters against long-term disasters, which can really matter for the work you do. But IMO most intro material is just trying to convey an insight like “try to work on bio/AI”, and that subtle disagreements about which research agendas and subfields most matter are things that can be hashed out later. In the same way that I wouldn’t want intro fellowships to involve a detailed discussion of the worth of MIRI vs DeepMind Safety’s research agenda.
  Also, if the failure mode of this advice is a bunch of people trying to prevent biorisks that kill billions of people but doesn’t actually permanently derail civilisation, I’m pretty fine with that? That feels like a great outcome to me.
  Further, I think that prioritising AI or bio over these other problems is kinda obviously the right thing to do from just the perspective of ensuring the next 200 years go well, and probably from the perspective of ensuring the next 50 go well. To the degree that people disagree, IMO it tends to come from empirical disagreements, not moral ones. Eg people who think that climate change is definitely an x-risk—I think this is an incorrect belief, but that you resolve it by empirically discussing how bad climate change is, not by discussing future generations. This may just be my biased experience, but I often meet people who have different cause prio and think that eg AI Safety is delusional, but very rarely meet people with different cause prio who agree with me about the absolute importance of AI and bio.
  One exception might be people who significantly prioritise animal welfare, and think that the current world is majorly net bad due to factory farming? But that the future world will likely contain far less factory farming and many more happy humans. But if your goal is to address that objection, IMO current intro materials still majorly miss the mark.
  - Mau 11 Feb 2022 18:18 UTC
    13 points
    0 ∶ 0
    Parent
    Hm, I think I have different intuitions about several points.
    
    you have a limited budget to convince people of weird ideas
    
    I’m not sure this budget is all that fixed. Longtermism pretty straightforwardly implies that empirical claims about x-risk are worth thinking more about. So maybe this budget grows significantly (maybe differentially) if someone gets convinced of longtermism. (Anecdotally, this seems true—I don’t know any committed longtermist who doesn’t think empirical claims about x-risk are worth figuring out, although admittedly there’s confounding factors.)
    
    My guess is that most people who genuinely believe these empirical claims about x-risk will be on board with most of the action relevant EA recommendations.
    
    Maybe some of our different intuitions are also coming from thinking about different target audiences. I agree that simplifying pitches to just empirical x-risk stuff would make sense when talking to most people. Still, the people who sign up for intro programs aren’t most people—they’re strongly (self-)selected for interest in prioritization, interest in ethical reasoning, and for having ethically stronger competing demands on their careers.
    
    And, IMO, the empirical claims are much more objective than the moral claims, and are an easier case to make. I just don’t think you can make moral philosophy arguments that are objectively convincing.
    
    Sure, they’re more objective, but I don’t see why that’s relevant—to be convincing, an argument doesn’t need to be objectively convincing; it just needs to be convincing to its audience. (And if that weren’t the case, we might be in trouble, since the notion of “objectively convincing arguments” seems confused.)
    
    (Tangentially, there’s also the question about whether arguments over subjective probabilities can be entirely objective/empirical.)
    
    Theoretical points aside, the empirical arguments also don’t seem to me like an easier case to make. The minimum viable case you present for AI is over a page long, while the minimum viable case for longtermism is just a few sentences (i.e., a slightly more elaborate version of, “Future people matter just as much as current people, and there could be a lot of future people.”)
    
    Also, if the failure mode of this advice is a bunch of people trying to prevent biorisks that kill billions of people but doesn’t actually permanently derail civilisation, I’m pretty fine with that? That feels like a great outcome to me.
    
    Whether this outcome involves a huge waste of those individuals’ potential for impact seems downstream of disagreement on longtermism. And of course we can conclude that longtermism should be excluded from the intro program if we’re confidently assuming that it’s wrong. I thought the more interesting question that your post was raising was whether it would make sense for the intro program to cover longtermism, under the assumption that it’s true (or under agnosticism).
    
    One exception might be people who significantly prioritise animal welfare, and think that the current world is majorly net bad due to factory farming? But that the future world will likely contain far less factory farming and many more happy humans. But if your goal is to address that objection, IMO current intro materials still majorly miss the mark.
    
    I agree that intro materials should include empirical stuff. If we’re talking specifically about intro materials that do include that as well as the philosophical stuff, then I don’t see why they majorly miss the mark for these people. I think both the empirical and philosophical stuff are logically necessary for convincing these people (and I suspect these people tend to be unusually good at figuring stuff out and therefore pretty valuable to convince, although I’m biased).
    
    I tentatively agree with most of your other points.
    - Neel Nanda 12 Feb 2022 6:29 UTC
      4 points
      0 ∶ 0
      Parent
      Thanks, this is some great pushback. Strongly upvoted.
      Re long-termists will think hard about x-risk, that’s a good point. Implicitly I think I’m following the intuition that people don’t really evaluate a moral claim in isolation. And that when someone considers how convinced to be by long-termism, they’re asking questions like “does this moral system imply important things about my actions?” And that it’s much easier to convince them of the moral claim once you can point to tractable action relevant conclusions.
      Re target audiences, I think we are imagining different settings. My read from running intro fellowships is that lots of people find long-termism weird, and I implicitly think that many people who ultimately end up identifying as long-termist still have a fair amount of doubt but are deferring to their perception of the EA consensus. Plus, even if your claim IS true, to me that would imply that we’re selecting intro fellows wrong!
      Implicit model: People have two hidden variables - ‘capacity to be convinced of long-termism’ and ‘capacity to be convinced of x-risk’. These are not fully correlated, and I’d rather only condition on the second one, to maximise the set of reachable people (I say as someone identifying with the second category much more than the first!)
      This also addresses your third point—I expect the current framing is losing a bunch of people who buy x risk but not long-termism, or who are eg suspicious of highly totalising arguments like Astronomical Waste that imply ‘it is practically worthless to do things that just help people alive today’.
      Though it’s fair to say that there are people who CAN be reached by long-termism much more easily than x-risk. I’d be pro giving them the argument for long-termism and some intuition pumps and seeing if it grabs people, so long as we also ensure that the message doesn’t implicitly feel like “and if you don’t agree with long-termism you also shouldn’t prioritise x-risk”. The latter is the main thing I’m protecting here
      Re your fourth point, yeah that’s totally fair, point mostly conceded. By the lights of long-termism I guess I’d argue that the distinction between work to prevent major disasters and work to ruthlessly focus on x-risk isn’t that strong? It seems highly likely that work to prevent natural pandemics is somewhat helpful to prevent engineered pandemics, or work to prevent mild engineered pandemics is useful to help prevent major ones. I think that work to reduce near-term problems in AI systems is on average somewhat helpful for long-term safety. It is likely less efficient, but maybe only 3-30x? And I think we should often be confused and uncertain about our stories for how to just prevent the very worst disasters, and this kind of portfolio is more robust to mistakes re the magnitude of different disasters. Plus, I expect a GCBR to heavily destabilise the world and to be an x-risk increaser by making x risks that can be averted with good coordination more likely
      - Mau 12 Feb 2022 7:26 UTC
        4 points
        0 ∶ 0
        Parent
        Thanks! Great points.
        
        people don’t really evaluate a moral claim in isolation. [...] And that it’s much easier to convince them of the moral claim once you can point to tractable action relevant conclusions.
        
        This seems right—I’ve definitely seen people come across longtermism before coming across x-risks, and have a reaction like, “Well, sure, but can we do anything about it?” I wonder if this means intro programs should at least flip the order of materials—put x-risks before longtermism.
        
        My read from running intro fellowships is that lots of people find long-termism weird, and I implicitly think that many people who ultimately end up identifying as long-termist still have a fair amount of doubt but are deferring to their perception of the EA consensus. Plus, even if your claim IS true, to me that would imply that we’re selecting intro fellows wrong!
        
        Oh interesting, in my experience (from memory, which might be questionable) intro fellows tend to theoretically buy (at least weak?) longtermism pretty easily. And my vague impression is that a majority of professional self-identified longtermists are pretty comfortable with the idea—I haven’t met anyone who’s working on this stuff and says they’re deferring on the philosophy (while I feel like I’ve often heard that people feel iffy/confused about the empirical claims).
        
        And interesting point about the self-selection effects being ones to try to avoid! I think those self-selection effects mostly come from the EA branding of the programs, so it’s not immediately clear to me how those self-selection effects can be eliminated without also losing out on some great self-selection effects (e.g., selection for analytical thinkers, or for people who are interested in spending their careers helping others).
        
        I’d be pro giving them the argument for long-termism and some intuition pumps and seeing if it grabs people, so long as we also ensure that the message doesn’t implicitly feel like “and if you don’t agree with long-termism you also shouldn’t prioritise x-risk”. The latter is the main thing I’m protecting here
        
        Yeah, that’s fair.
        
        It is likely less efficient, but maybe only 3-30x
        
        I’m sympathetic to something along these lines. But I think that’s a great case (from longtermists’ lights) for keeping longtermism in the curriculum. If one week of readings has a decent chance of boosting already-impactful people’s impact by, say, 10x (by convincing them to switch to 10x more impactful interventions), that seems like an extremely strong reason for keeping that week in the curriculum.
        Neel Nanda 12 Feb 2022 9:18 UTC
        4 points
        0 ∶ 0
        Parent
        I haven’t met anyone who’s working on this stuff and says they’re deferring on the philosophy (while I feel like I’ve often heard that people feel iffy/confused about the empirical claims).
        Fair—maybe I feel that people mostly buy ‘future people have non-zero worth and extinction sure is bad’, but may be more uncertain on a totalising view like ‘almost all value is in the far future, stuff today doesn’t really matter, moral worth is the total number of future people and could easily get to >=10^20’.
        I’m sympathetic to something along these lines. But I think that’s a great case (from longtermists’ lights) for keeping longtermism in the curriculum. If one week of readings has a decent chance of boosting already-impactful people’s impact by, say, 10x (by convincing them to switch to 10x more impactful interventions), that seems like an extremely strong reason for keeping that week in the curriculum.
        Agreed! (Well, by the lights of longtermism at least—I’m at least convinced that extinction is 10x worse than civilisational collapse temporarily, but maybe not 10^10x worse). At this point I feel like we mostly agree—keeping a fraction of the content on longtermism, after x-risks, and making it clear that it’s totally legit to work on x-risk without buying longtermism would make me happy
    - Neel Nanda 12 Feb 2022 6:31 UTC
      2 points
      0 ∶ 0
      Parent
      Re your final point, I mostly just think they miss the mark by not really addressing the question of what the long-term distribution of animal welfare looks like (I’m personally pretty surprised by the comparative lack of discussion about how likely our Lightcone is to be net bad by the lights of people who put significant weight on animal welfare)
      - Mau 12 Feb 2022 7:37 UTC
        2 points
        0 ∶ 0
        Parent
        Maybe I’m getting mixed up, but weren’t we talking about convincing people who believe that “the future world will likely contain far less factory farming and many more happy humans”? (I.e., the people for whom the long-term distribution of animal welfare is, by assumption, not that much of a worry)
        
        Maybe you had in mind the people who (a) significantly prioritize animal welfare, and (b) think the long-term future will be bad due to animal welfare issues? Yeah, I’d also like to see more good content for these people. (My sense is there’s been a decent amount of discussion, but it’s been kind of scattered (which also makes it harder to feature in a curriculum). Maybe you’ve already seen all this, but I personally found section 1.2 of the GPI agenda helpful as a compilation of this discussion.)
        Neel Nanda 12 Feb 2022 9:20 UTC
        4 points
        0 ∶ 0
        Parent
        Ah sorry, the original thing was badly phrased. I meant, a valid objection to x-risk work might be “I think that factory farming is really really bad right now, and prioritise this over dealing with x-risk”. And if you don’t care about the distant future, that argument seems pretty legit from some moral perspectives? While if you do care about the distant future, you need to answer the question of what the future distribution of animal welfare looks like, and it’s not obviously positive. So to convince these people you’d need to convince them that the distribution is positive.
  - mic 11 Feb 2022 18:08 UTC
    11 points
    0 ∶ 0
    Parent
    Suppose it takes $100 billion to increase our chance of completely averting extinction (or the equivalent) by 0.1%. By this, I don’t mean averting an extinction event by having it be an event that only kills 98% of people, or preventing the disempowerment of humanity due to AI; I mean that we save the entire world’s population. For convenience, I’ll assume no diminishing marginal returns. If we only consider the 7 generations of lost wellbeing after the event, and compute $100 billion / (7 * 8 billion * 0.1%), then we get a cost-effectiveness of $1,780 to save a life. With the additional downside of being extremely uncertain, this estimate is only in the same ballpark as the Malaria Consortium’s seasonal chemoprevention program (which takes ~$4,500 to save a life). It’s also complicated by the fact that near-term animal charities, etc. are funding-constrained while longtermist orgs are not so much. Unlike a strong longtermist view, it’s not at all clear under this view that it would be worthwhile to pivot your career to AI safety or biorisk, instead of taking the more straightforward route of earning to give to standard near-term interventions.
    - Thomas Kwa🔹 11 Feb 2022 18:37 UTC
      8 points
      0 ∶ 0
      Parent
      My best estimate of price to decrease extinction risk by 0.1% is under $10B. Linch has only thought about this for a few hours, but he’s pretty well informed on the state of megaprojects, plus others have thought more than that. This is consistent with my own estimates too.
      - Benjamin_Todd 13 Feb 2022 22:47 UTC
        13 points
        0 ∶ 0
        Parent
        One thing I find really tricky about this is figuring out where the margin will end up in the future.
        It seems likely to me that $100bn will be spent on x-risk reduction over the next 100 years irrespective of what I do. My efforts mainly top up that pot.
        Personally I expect the next $10bn might well reduce x-risk by ~1% rather than 0.1%; but it’ll be far less once we get into the next $90bn and then $100bn after it. It might well be a lot less than 0.1% per $10bn billion.
        Linch 14 Feb 2022 19:01 UTC
        3 points
        0 ∶ 0
        Parent
        Yes this is a really good point. I meant to make it when I first read Thomas’ comment but then forgot about this as I was typing up my own comment.
        I think
        it’ll be far less once we get into the next $90bn and then $100bn after it. It might well be a lot less than 0.1% per $10bn billion.
        Might be a plausible position after the movement has a few more years of experience and researchers have put a few thousand hours of research and further thinking into this question, but right now we (or at least I) don’t have a strong enough understanding of the landscape to confidently believe in very low cost-effectiveness for the last dollar. In slightly more mathy terms, we might have a bunch of different cost-effectiveness distributions in the ensemble that forms our current prior, which means we can’t go very low (or high) if we do a weighted average across them.
        Benjamin_Todd 14 Feb 2022 20:00 UTC
        2 points
        0 ∶ 0
        Parent
        The point about averaging over several cost-effective distributions is interesting!
        Linch 14 Feb 2022 22:43 UTC
        2 points
        0 ∶ 0
        Parent
        If you find the analogy helpful, my comment here mirrors Toby’s on why having a mixed prior on the Hinge of History question is reasonable.
      - Linch 12 Feb 2022 18:19 UTC
        11 points
        0 ∶ 0
        Parent
        (I thought about it for a few more hours and haven’t changed my numbers much).
        I think it’s worth highlighting that our current empirical best guesses (with a bunch of uncertainty) is that catastrophic risk mitigation measures are probably better in expectation than near-term global health interventions, even if you only care about currently alive people.
        But on the other hand, it’s also worth highlighting that you only have 1-2 OOMs to work with, so if we only care about present people, the variance is high enough that we can easily change our minds in the future. Also, e.g. community building interventions or other “meta” interventions in global health (e.g. US foreign aid research and advocacy) may be better even on our current best guesses. Neartermist animal interventions may be more compelling as well.
        Finally, what axilogies you have would have implications for what you should focus on within GCR work. Because I’m personally more compelled by the longtermist arguments for existential risk reduction than neartermist ones, I’m personally comparatively more excited about disaster mitigation, robustness/resilience, and recovery, not just prevention. Whereas I expect the neartermist morals + empirical beliefs about GCRs + risk-neutrality should lead you to believe that prevention and mitigation is worthwhile, but comparatively little resources should be invested in disaster resilience and recovery for extreme disasters.
        Linch 14 Feb 2022 18:56 UTC
        2 points
        0 ∶ 0
        Parent
        Why was this comment downvoted a bunch?
        Charles He 14 Feb 2022 21:00 UTC
        4 points
        0 ∶ 0
        Parent
        Here you go:
        Why was this comment downvoted a bunch?
        I think your content and speculation in your comment was both principled and your right to say. My guess is that a comment that comes close to saying that an EA cause area has a different EV per dollar than others can get this sort of response.
        This is a complex topic. Here’s some rambling, verbose thoughts , that might be wrong, and that you and others have might have already thought about:
        This post exposes surface area for “disagreement of underlying values” in EA.
        
        Some people don’t like a lot of math or ornate theories. For someone who is worried that the cause area representing their values is being affected, it can be easy to perceive adding a lot of math or theories as overbearing.
        
        In certain situations, I believe “underlying values” drive a large amount of the karma of posts and comments, boosting messages whose content otherwise doesn’t warrant it. I think this is important to note, as it reduces communication, and can be hard to fix (or even observe) and one reason it is good to give this some attention or “work on this”^[1].
        
        I don’t think content or karma on the EA forum has a direct, simple relationship to all EA opinion or opinion of those that work in EA areas. However, I know someone who has information and models about related issues and opinions from EA’s “offline” and I think this suggests these disagreements are far from an artifact of the forum or “very online”.
        
        I see the underlying issues as tractable and fixable.
        
        There is a lot of writing in this comment, but this comes from a different perspective as a commenter. For a commenter, I think if they take the issues too seriously, I think it can be overbearing and make it unfairly hard to write things.
        
        As a commenter, if they wanted to address this, talking to a few specific people and listening can help.
        ^
        I think I have some insight because of this project, but it is not easy for me to immediately explain.
        Linch 14 Feb 2022 22:43 UTC
        6 points
        0 ∶ 0
        Parent
        I mentioned this before, but again I don’t think strong-upvoting comments asking why they received downvotes from others is appropriate!
    - Neel Nanda 11 Feb 2022 18:22 UTC
      3 points
      0 ∶ 0
      Parent
      It’s not at all clear under this view that it would be worthwhile to pivot your career to AI safety or biorisk, instead of taking the more straightforward route of earning to give to standard near-term interventions.
      I’d disagree with this. I think the conversion of money to labour is super inefficient on longtermism, and so this analogy breaks down. Sure, maybe I should donate to the Maximum Impact Fund rather than LTFF. But it’s really hard to usefully convert billions of dollars into useful labour on longtermist stuff. So, as someone who can work on AI Safety, there’s a major inefficiency factor if I pivot to ETG. I think the consensus basically already is that ETG for longtermism is rarely worth it, unless you’re incredibly good at ETG.
  - Linch 11 Feb 2022 15:24 UTC
    10 points
    0 ∶ 0
    Parent
    I’m not saying this consideration is overriding, but one reason you might want moral agreement and not just empirical agreement is that people who agree with you empirically but not morally may be more interested in trading x-risk points for ways to make themselves more powerful.
    I don’t think this worry is completely hypothetical, I think there’s a fairly compelling story where both DeepMind and OpenAI were started by people who agree with a number of premises in the AGI x-risk argument but not all of them.
    Fortunately this hasn’t happened in bio (yet), at least to my knowledge.
  - IanDavidMoss 11 Feb 2022 14:50 UTC
    7 points
    0 ∶ 0
    Parent
    if the failure mode of this advice is a bunch of people trying to prevent biorisks that kill billions of people but doesn’t actually permanently derail civilisation, I’m pretty fine with that? That feels like a great outcome to me.
    For me this is the key point. I feel that the emphasis on longtermism for longtermism’s sake in some influential corners of EA might have the effect of prolonging the neglectedness of catastrophic-but-not-existential risks, which IMHO are far more likely and worth worrying about. It’s not exactly a distraction since work on x-risks is generally pretty helpful for work on GCRs as well, but I do think Neel’s approach would bring more people into the fold.
  - Anthony DiGiovanni 🔸 13 Feb 2022 9:00 UTC
    4 points
    0 ∶ 0
    Parent
    Note that your “tl;dr” in the OP is a stronger claim than “these empirical claims are first order while the moral disagreements are second order.” You claimed that agreement on these empirical claims is “enough to justify the core action relevant points of EA.” Which seems unjustified, as others’ comments in this thread have suggested. (I think agreement on the empirical claims very much leaves it open whether one should prioritize, e.g., extinction risks or trajectory change.)