Thank you for writing this, Ben. I think the examples are helpful and I plan to read more about several of them.
With that in mind, I’m confused about how to interpret your post and how much to update on Eliezer. Specifically, I find it pretty hard to assess how much I should update (if at all) given the “cherry-picking” methodology:
Here, I’ve collected a number of examples of Yudkowsky making (in my view) dramatic and overconfident predictions concerning risks from technology.
Note that this isn’t an attempt to provide a balanced overview of Yudkowsky’s technological predictions over the years. I’m specifically highlighting a number of predictions that I think are underappreciated and suggest a particular kind of bias.
If you were to apply this to any EA thought leader (or non-EA thought leader, for that matter), I strongly suspect you'd find a lot of clear-cut and disputable examples of them being wrong on important things.
As a toy analogy, imagine that Alice is widely considered to be extremely moral. I hire an investigator to find as many examples of Alice doing Bad Things as possible. I then publish my list of Bad Things that Alice has done. And I tell people “look—Alice has done some Bad Things. You all think of her as a really moral person, and you defer to her a lot, but actually, she has done Bad Things!”
And I guess I’m left with a feeling of… OK, but I didn’t expect Alice to have never done Bad Things! In fact, maybe I expected Alice to do worse things than the things that were on this list, so I should actually update toward Alice being moral and defer to Alice more.
To make an informed update, I’d want to understand your balanced take. Or I’d want to know some of the following (I sketch a toy numerical version of this after the list):
How much effort did the investigator spend looking for examples of Bad Things?
Given my current impression of Alice, how many Bad Things (weighted by badness) would I have expected the investigator to find?
How many Good Things did Alice do (weighted by goodness)?
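To make the Alice analogy a bit more concrete, here is a minimal, purely illustrative sketch (not from Ben's post; every number in it is a made-up assumption) of the odds-form Bayes update I have in mind. The point is that the same cherry-picked list can move you in either direction, depending on what you expected the investigator to find.

```python
# Purely illustrative: an odds-form Bayes update on cherry-picked evidence.
# All numbers are made-up assumptions, not claims about anyone in particular.

prior_moral = 0.7  # assumed prior that Alice is "highly moral" rather than "ordinary"

# Suppose the investigator, after a serious search, reports a few mild Bad Things.
# What matters is how likely that exact report is under each hypothesis.
p_report_if_moral = 0.4     # assumption: even a highly moral Alice yields a few mild findings
p_report_if_ordinary = 0.8  # assumption: an ordinary Alice yields this report more readily

prior_odds = prior_moral / (1 - prior_moral)          # 0.7 : 0.3
likelihood_ratio = p_report_if_moral / p_report_if_ordinary
posterior_odds = prior_odds * likelihood_ratio
posterior_moral = posterior_odds / (1 + posterior_odds)

print(f"P(highly moral | report) = {posterior_moral:.2f}")  # ~0.54 with these numbers
```

With these particular made-up numbers the posterior drops from 0.70 to about 0.54; but if you had expected the search to turn up worse things than it did (so the likelihood ratio exceeds 1), the identical list would push the posterior up, which is the "maybe I should actually update toward Alice being moral" case above.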
Final comment: I think this comment might come across as ungrateful—just want to point out that I appreciate this post, find it useful, and will be more likely to challenge/question my deference as a result of it.
I think the effect should depend on your existing view. If you’ve always engaged directly with Yudkowsky’s arguments and chose the ones that convinced you, there’s nothing to learn. If you thought he was a unique genius, assumed that whenever you weren’t convinced it was because he understood things you didn’t, and believed him anyway, maybe it’s time to dial that back. If you’d always assumed he’s wrong about literally everything, it should be telling for you that the OP had to go 15 years back to find good examples.
Writing this comment actually helped me understand how to respond to the OP myself.
‘If you’d always assumed he’s wrong about literally everything, it should be telling for you that the OP had to go 15 years back to find good examples.’ How strong this evidence is also depends on whether he has made many resolvable predictions in the last 15 years, right? If he hasn’t, it’s not very telling. To be clear, I genuinely don’t know whether he has or hasn’t.
Sounds reasonable. Though predictions aren’t the only thing one can be demonstrably wrong about.