I want to clarify that I don’t think ideas like the Orthogonality Thesis or Instrumental Convergence are wrong. They’re strong predictive hypotheses that follow logically from very reasonable assumptions, and even the possibility that they’re correct is more than enough justification for treating AI safety work as critical.
I was more just pointing out some examples of ideas that are very strongly held by the community, that happen to have been named and popularized by people like Bostrom and Yudkowsky, both of whom might be considered elites among us.
P.S. I’m always a bit surprised that the Neel Nanda of Google DeepMind has the time and desire to post so much on the EA Forum (and also LessWrong). That probably says very good things about us, and it also gives me some more hope that the folks at Google are actually serious about alignment. I really like your work, so it’s an honour to be able to engage with you here (hope I’m not fanboying too much).