What might be an example of a “much better weird, theory-motivated alignment research” project, as mentioned in your intro doc? (It might be hard to say at this point, but perhaps you could point to something in that direction?)
I think the best examples would be attempts to practically implement schemes that seem theoretically doable and potentially helpful, but are quite complicated to carry out in practice; for example, imitative generalization or the two-head proposal here. I can imagine it might be quite hard to get industry labs to put in the effort of getting imitative generalization working in practice, so doing that work (which labs could perhaps then adopt) might have a lot of impact.