Devin Kalish comments on On how various plans miss the hard bits of the alignment challenge

Devin Kalish 12 Jul 2022 9:54 UTC
8 points
This is pretty unrelated to the substantial content here to the point where I’m unsure about writing about it, especially as it’s looking like it’s going to be the first comment on this post. Still, I wanted to offer some feedback on messaging in case it helps. Whenever I see you use the word “dignity” in this piece, I sort of recoil and feel more alienated from this post overall. In particular, there are two semi-related reasons for this:
1. It references a post that almost single-handedly gave me, and I think lots of others, pretty bad mental health issues for a few months, and to an extent still today. On its own I don’t know that this disqualifies the piece as worthwhile, but it does make me recoil from references to it, somewhat automatically, like a hand from a hot stove. On a more substantial level, I think the post itself was awkward and probably a misstep. The weird April Fools but not April Fools framing didn’t help, and it didn’t contribute much to the substantial discussion, though it had some valuable things to say about being a consequentialist rather than a cartoon supervillain (which I think he has said elsewhere less prominently, if I remember).
2. Dignity is the wrong thing to aim for. This is maybe the more substantial problem I have. First and foremost, I don’t want to aim for “dying with dignity”. I would rather just, exclusively, aim for not dying. It’s true there’s an awfully funny coincidence that Yudkowsky’s version of “dying with dignity” lines up so perfectly with aiming for not dying, but that still doesn’t justify aiming for it instead. Dignity, in this situation, is just not that motivating to me. If aiming for “dying with dignity” diverges one nano-degree from aiming for not dying, then it is the wrong thing to aim for. This is especially striking to find in writing from Yudkowsky, for those of us who have read his other stuff, because it is not how he ever advocates people think elsewhere. Whenever you parry, hit, spring, strike or touch the cutting sword of AGI safety, you must cut the actual solution in the same movement. This is almost enough to convince me that this peculiar dignity framing was the only part of the whole, not-very-April-Foolsy post that was at least sort of facetious. At the very least I can’t relate to ever getting to a point of being so hopeless that what I am aiming for is “dying with dignity” rather than just “not dying”.
I guess my steelman of this is that, if you aim for not dying, you will probably be disappointed, so to keep motivated, you should aim for something achievable, like dignity. I am not nearly as pessimistic as you or Yudkowsky about this matter, and maybe if I were this framing would seem better to me, but even then I find it unlikely. What differs between a world where a few people put in a good deal of effort, this effort proves mostly counterproductive or just misguided, and they all die, and a world where a few people put in a good deal of effort in the right direction, aren’t accidentally counterproductive, and they almost don’t die? I am just as sympathetic to both, and I find that they had similar dignity in my eyes, but the latter world, most crucially, almost survived. More dignified than both is a world in which everyone coordinates a great deal and society really buckles down and gets serious about the issue and they still die. Less dignified is a world where no one but one lonely weirdo cares at all, everyone else laughs about it, and the one weirdo gives up, and they all die. Both of those worlds are unrealistic at this point. I judge that if our option set for solving this problem is narrow enough that you can expect us to probably fail with any reliability, then it is narrow enough that we can’t change how much dignity we die with very much either.

All of this is a rambling way of saying that this “dignity” stuff has come up in a number of serious writings from MIRI now, and I’m worried that it is going to become a standard fixture of your messaging. I just want to register some concerns I have about this happening; I would rather you just say things increase our odds of succeeding than say that things increase our dignity.
- JakubK 12 Jul 2022 19:39 UTC
  3 points
  Maybe I’m missing something, but it seems that “dignity” only appears once in the OP? Namely, here:
  “On my model, solutions to how capabilities generalize further than alignment are necessary but not sufficient. There is dignity in attacking a variety of other real problems, and I endorse that practice.”
  This usage appears to have nothing to do with the April Fool’s Day post.
  
  Perhaps Soares made a subsequent edit to the OP?
  - RobBensinger 12 Jul 2022 21:46 UTC
    3 points
    “Dignity” indeed only occurs once, and I assume it’s calling back to the same “death with dignity” concept from the April Fool’s post (which I agree shouldn’t have been framed as an April Fool’s thing).
    I assume EY didn’t expect the post to have such a large impact, in part because he’d already said more or less the same thing, with the same terminology, in a widely-read post back in November 2021:
    Anonymous
    At a high level one thing I want to ask about is research directions and prioritization. For example, if you were dictator for what researchers here (or within our influence) were working on, how would you reallocate them?
    Eliezer Yudkowsky
    The first reply that came to mind is “I don’t know.” I consider the present gameboard to look incredibly grim, and I don’t actually see a way out through hard work alone. We can hope there’s a miracle that violates some aspect of my background model, and we can try to prepare for that unknown miracle; preparing for an unknown miracle probably looks like “Trying to die with more dignity on the mainline” (because if you can die with more dignity on the mainline, you are better positioned to take advantage of a miracle if it occurs).
    The term also shows up a ton in the Late 2021 MIRI Conversations, e.g., here and here.
    I appreciate the data point about the term being one you find upsetting to run into; thanks for sharing about that, Devin. And, for whatever it’s worth, I’m sorry. I don’t like sharing info (or framings) that cause people distress like that.
    I don’t know whether data points like this will update Nate and/or Eliezer all the way to thinking the term is net-negative to use. If not, and this is a competing access needs issue (‘one group finds it much more motivating to use the phrase X; another group finds that exact same phrase extremely demotivating’), then I think somebody should make a post walking folks through a browser text-replacement method that can swap out words like ‘dignity’ and ‘dignified’ (on LW, the EA Forum, the MIRI website, etc.) for something more innocuous/silly.
    - Devin Kalish 12 Jul 2022 22:34 UTC
      4 points
      The word dignity only appears once, but variations appear as well:
      
      “And it sure would be undignified for our world to die of antitrust law at the final extremity.”
      
      “It’s as dignified as any of the other attempts to walk around this hard problem”
      
      Some version of this reference appears mostly when Soares is endorsing efforts to solve a problem in a way that won’t work if the standard MIRI model of doom is correct, but which is still worthwhile in case it isn’t. To be clear, I respect you, Soares, and Yudkowsky a great deal; my impression is that MIRI is a great bunch of folks whose approach is worthwhile, even if I lean somewhat more Christiano/Critch on some of these issues. It is also possible that dignity is a good framing overall and I’m just weird, in which case I fully endorse using it. I just personally don’t like it for the reasons I mentioned, and I think there are many others with similar reactions.
      - RobBensinger 12 Jul 2022 22:47 UTC
        2 points
        Oops, thanks! I checked for those variants elsewhere but forgot to do so here. :)
        “It is also possible that dignity is a good framing overall and I’m just weird, in which case I fully endorse using it.”
        I think it’s a good framing for some people and not for others. I’m confident that many people shouldn’t use this framing regularly in their own thinking. I’m less sure about whether the people who do find it valuable should steer clear of mentioning it; that’s a bit more extreme.
        - Devin Kalish 12 Jul 2022 23:23 UTC
          1 point
          That’s fair; I think it depends on how it’s intended. If the point is to talk about how you think about or relate to the issue, talking about the framing that works best for you makes sense. If the purpose is outreach, there are framings that make more or less sense to use.