ag4000 comments on VictorW’s Quick takes

ag4000 Dec 6, 2023, 9:37 PM
4 points
Late to the party here, but I’d check out Räuker et al. (2023), which provides one taxonomy of AI interpretability work.
- VictorW Dec 7, 2023, 12:05 AM
  1 point
  Brilliant, thank you. One of the very long lists of interp work on the forum seemed to classify everything as mech interp (or possibly I just don’t recognize the alternative keywords). Does the EA AI safety community feel particularly strongly about mech interp, or is my sample size just too small?
  - ag4000 Dec 7, 2023, 12:23 AM
    1 point
    Not an expert, but I think your impression is correct. See this post, for example (I recommend the whole sequence).