gwern comments on On Deference and Yudkowsky’s AI Risk Estimates

gwern Jun 19, 2022, 6:13 PM
33 points

The above seems voluminous and I believe this is the written output with the goal of defending a person.

Yes, much like the OP is voluminous and is the written output with the goal of criticizing a person. You’re familiar with such writings, as you’ve written enough criticizing me. Your point?

Yeah, no, it’s the exact opposite.

No, it’s just as I said, and your Karnofsky retrospective strongly supports what I said. (I strongly encourage people to go and read it, not just to see what’s before and after the part He screenshots, but because it is a good retrospective which is both informative about the history here and an interesting case study of how people change their minds and what Karnofsky has learned.)

Karnofsky started off disagreeing that there is any problem at all in 2007 when he was introduced to MIRI via EA, and merely thought there were some interesting points. Interesting, but certainly not worth sending any money to MIRI or looking for better alternative ways to invest in AI safety. These ideas kept developing, and Karnofsky kept having to engage, steadily moving from ‘there is no problem’ to intermediate points like ‘but we can make tool AIs and not agent AIs’ (a period in his evolution I remember well because I wrote criticisms of it), which he eventually abandoned. You forgot to screenshot the part where Karnofsky writes that he assumed ‘the experts’ had lots of great arguments against AI risk and the Yudkowsky paradigm and that was why they just didn’t bother talking about it, and then moved to SF and discovered ‘oh no’, that not only did those not exist, the experts hadn’t even begun to think about it. Karnofsky also agrees with many of the points I make about Bostrom’s book & intellectual pedigree (“When I’d skimmed Superintelligence (prior to its release), I’d felt that its message was very similar to, though more clearly and carefully stated than, the arguments MIRI had been making without much success.” just below where you cut off). And so here we are today, where Karnofsky has not just overseen donations of millions of dollars to MIRI and AI safety NGOs or the recruitment of MIRI staffers like ex-MIRI CEO Muehlhauser, but it remains a major area for OpenPhil (and philanthropies imitating it like FTX). It all leads back to Eliezer. As Karnofsky concludes:

One of the biggest changes is the one discussed above, regarding potential risks from advanced AI. I went from seeing this as a strange obsession of the community to a case of genuine early insight and impact. I felt the community had identified a potentially enormously important cause and played a major role in this cause’s coming to be taken more seriously. This development became—in my view—a genuine and major candidate for a “hit”, and an example of an idea initially seeming “wacky” and later coming to seem prescient.

Of course, it is far from a settled case: many questions remain about whether this cause is indeed important and whether today’s preparations will look worthwhile in retrospect. But my estimate of the cause’s likely importance—and, I believe, conventional wisdom among AI researchers in academia and industry—has changed noticeably.

That is, Karnofsky explicitly attributes the widespread changes I am describing to the causal impact of the AI risk community around MIRI & Yudkowsky. He doesn’t say it happened regardless or despite them, or that it was already fairly common and unoriginal, or that it was reinvented elsewhere, or that Yudkowsky delayed it on net.

I’m really sure even a median thought leader would have better convinced the person written this.

Hard to be convincing when you don’t exist.
- bgarfinkel Jun 19, 2022, 6:56 PM
  17 points
  
  No, it’s just as I said, and your Karnofsky retrospective strongly supports what I said.
  
  I also agree that Karnofsky’s retrospective supports Gwern’s analysis, rather than doing the opposite.
  
  (I just disagree about how strongly it counts in favor of deference to Yudkowsky. For example, I don’t think this case implies we should currently defer more to Yudkowsky’s risk estimates than we do to Karnofsky’s.)
  - Charles He Jun 19, 2022, 7:35 PM
    −1 points
    Ugh. Y’all just made me get into “EA rhetoric” mode:
    I also agree that Karnofsky’s retrospective supports Gwern’s analysis, rather than doing the opposite.
    What?
    No. Not only is this not true but this is indulging in a trivial rhetorical maneuver.
    My comment said that the counterfactual would be better without the involvement of the person mentioned in the OP. I used the retrospective as evidence.
    The retrospective includes at least two points for why the author changed their mind:
    1. The book Superintelligence, which they explicitly said was the biggest event.
    2. The author moved to SF and learned about DL, and was informed by speaking to non-rationalist AI researchers, and then decided that LessWrong and MIRI were right.
    In response to this, Gwern cites point #2, and asserts that this is causal evidence in favor of the person mentioned in the OP being useful.
    Why? How?
    Notice that #2 above doesn’t at all rule out that the founders or culture were repellent. In fact it seems like a lavish and unlikely level of involvement.
    - bgarfinkel Jun 19, 2022, 7:54 PM
      7 points
      
      What?
      
      I interpreted Gwern as mostly highlighting that people have updated toward Yudkowsky’s views, and using this as evidence in favor of the view that we should defer a decent amount to Yudkowsky. I think that was a reasonable move.
      
      There is also a causal question here (‘Has Yudkowsky on-net increased levels of concern about AI risk relative to where they would otherwise be?’), but I didn’t take the causal question to be central to the point Gwern was making. Although now I’m less sure.
      
      I don’t personally have strong views on the causal question—I haven’t thought through the counterfactual.
    - Charles He Jun 19, 2022, 7:37 PM
      −10 points
      (I strongly encourage people to go and read it, not just to see what’s before and after the part He screenshots, but because it is a good retrospective which is both informative about the history here and an interesting case study of how people change their minds and what Karnofsky has learned.)
      By the way, I didn’t screenshot the pieces that fit my narrative—Gwern’s assertion of bad faith is another device being used.
      Yes, much like the OP is voluminous and is the written output with the goal of criticizing a person. You’re familiar with such writings, as you’ve written enough criticizing me. Your point?
      Gwern also digs up a previous argument. Not only is that issue entirely unrelated, it’s sort of exactly the opposite of the evidence he wants to show: Gwern appeared to borderline or threaten to dox someone who spoke out against him.
      I commented. However, I did not know anyone involved, including who Gwern was; I was acting only on the content and behaviour I saw, which was outright abusive.
      
      There is no expected benefit to doing this. It’s literally the most principled thing to act in this way and I would do it again.
      The consequences of that incident, the fact that this person with this behavior and content had this much status, was a large update for me.
      More subtly and perniciously, Gwern’s adverse behavior in this comment chain and in the incident mentioned above is calibrated to the level of “EA rhetoric”. Digs like his above can sail through, with the tailwind of support of a subset of this community, a subset that values authority over content and Truth, to a degree much more than it understands.
      In contrast, an outsider, who already has to dance through all the rhetorical devices and elliptical references, has to make a high-effort, unemotional comment to try to make a point. Even or especially if they manage to do this, they can expect to be hit with a wall of text with various hostilities.
      Like, this is awful. This isn’t just bad but it’s borderline abusive.
      It’s wild that this is the level of discourse here.
      Because of the amount of reputation, money and ingroupness, this is probably one of the most extreme forms of tribalism that exists.
      Do you know how much has been lost?
      - technicalities Jun 19, 2022, 9:03 PM
        9 points
        Charles, consider going for that walk now if you’re able to. (Maybe I’m missing it, but the rhetorical moves in this thread seem equally bad, and not very bad at that.)
          - Charles He Jun 19, 2022, 10:21 PM
            2 points
            You are right, I don’t think my comments are helping.
- Charles He Jun 19, 2022, 6:23 PM
  −1 points
  Like, how can so many standard, stale patterns of internet forum authority, devices and rhetoric be rewarded and replicate in a community explicitly addressing topics like tribalism and “evaporative cooling”?