There is a Cambrian explosion of research groups, but basically no new agendas as far as I can tell? Of the agendas listed on that post, I think basically all are 5+ years old (some have morphed, like ELK is a different take on scalable oversight than Paul had 5 years ago, but I would classify it as the same agenda).
There is a giant pile of people working on the stuff, though the vast majority of new work can be characterized as "let's just try to solve some near-term alignment problems and hope that it somehow informs our models of long-term alignment problems", plus a large pile of different types of transparency research. I think there are good cases for that work, though I am not very optimistic about it helping with existential risk.
That’s really interesting and unexpected! Seems worth figuring out why that’s happening. What are your top hypotheses?
My first guess would be epistemic humility norms.
My second would be that the first people in a field are often disproportionately talented compared to people coming in later. (Although you could also tell a story about how at the beginning the field was too socially weird to attract a lot of top talent.)
My third is that since alignment is so hard, it’s easier for people to latch onto existing research agendas instead of creating new ones. At the beginning there were practically no agendas to latch onto, so people had to make new ones, but now there are a few, so most people just sort themselves into those.
Are there any promising directions for AGI x-risk reduction that you are aware of that aren’t being (significantly) explored?