I think the fact that MIRI has not managed to get even close to solving the problems it set out to solve, combined with its own ideas about how the world will fare if those problems aren't solved, speaks very strongly against the secrecy. You termed the secrecy understandable, but I don't really think it is. It rests on the assumption that the risk of not telling anyone (and thus not having them collaborate with you) is smaller than the risk of telling them (and having someone somewhere misuse your ideas). That assumption doesn't hold up against reality.
I upvoted your post, but I would like to strongly object to your last point about writing and prose. Resorting to technical styles of communication would make things much easier for some people while, I suspect, making them much harder for a lot more people. It's hard for research that isn't communicated clearly to make a significant contribution to common knowledge.
cf. Mochizuki's "proof" of the ABC conjecture, which he almost entirely refused to explain; it ended up wasting years of the mathematical community's time and was eventually rejected as flawed.
I don't know anything about AI safety or machine learning, but I do think your comment and sentiments, as well as the post itself, are valuable.
However, I don’t see how this is right:
You termed the secrecy understandable, but I don’t really think it is.
Given the worldview, theory of change, and beliefs behind MIRI, secrecy seems justified. (There might be a massive, inexcusable defect that produced those beliefs, as well as far less generous explanations for the secrecy.) But taking the stated beliefs and claimed infohazards at face value, the secrecy seems valuable ex ante.
I can accept that it seemed to make sense at the start, but can you explain how it would still make sense now, given what has happened (or, rather, hasn't happened) in the meantime?
Basically, I don't know. I think it's good to start off by stating emphatically that I don't have any real knowledge of MIRI.
One consideration is that MIRI's beliefs still involve very short timelines. My guess is that, given the nature of some work relevant to short timelines, some projects could have bad consequences if made public (or simply never make sense to make public).
Again, this is presumptuous, but my instinct is not to presume to dictate org policy in a situation like this, because of dependencies we don't see. (Just so this doesn't read as a claim that nothing can ever change: I guess the change here would be a new org or new leadership, which is obviously hard.)
Also, to be clear, this is accepting MIRI's premise. IMO one should take the premise of shorter timelines seriously; it's a valid belief. Under that premise, the issue here is really bad execution, actively bad.
If your comment was alluding to beliefs shifting away from short timelines, that seems like a really different discussion.
No, I'm saying that the nearer and more probable you think doom-causing AGI is, and the longer you stagnate on solving the problem, the less it makes sense not to let the rest of the world in on the work. If you don't, you're very probably doomed. If you do, you're still very probably doomed, but at least you have orders of magnitude more people collaborating with you to prevent it, thus increasing the chance of success.
I think what you said makes sense. (As a presumptuous comment:) given the strong circumstantial evidence, I don't have a positive view of the work. However, playing devil's advocate for a bit:
There are very few good theories of change for very short timelines, and one of them is "build it yourself." So I don't see how that work would be good to share.
Alignment might be entangled with this to the degree that sharing even alignment work might amount to sharing capabilities research.
These might be awful beliefs, but I don't see how they're wrong.
By the way, just to calibrate, so people can judge whether I'm crazy:
It reads like MIRI, or closely related people, have tried to build AGI or find the requisite knowledge many times over the years. The negative results seem to have been an update to their beliefs.
Thanks. That kinda sorta makes sense. I still think that if they're trying to build an aligned AGI, it's arrogant and unrealistic to believe a small group that doesn't collaborate with others can achieve it faster than the entire AI capabilities community, which is essentially collaborating as a whole, can.
Good point about the secrecy; I hadn't heard of the ABC thing. The secrecy is "understandable" to the extent that AI safety is analogous to the Manhattan Project, but less useful to the extent that AIS is analogous to… well, the development of theoretical physics.