Wei Dai comments on How Well Does RL Scale?

Wei Dai 4 Nov 2025 21:54 UTC
5 points
I wrote a post that I think was partly inspired by this discussion. The implication of it here is that I don’t necessarily want philosophers to directly try to solve the many hard philosophical problems relevant to AI alignment/safety (especially given how few of them are in this space or concerned about x-safety), but initially just to try to make them “more legible” to others, including AI researchers, key decision makers, and the public. Hopefully you agree that this is a more sensible position.
- Elliott Thornley 7 Nov 2025 22:40 UTC
  4 points
  > try to make them “more legible” to others, including AI researchers, key decision makers, and the public
  Yes, I agree this is valuable, though I think it’s valuable mainly because it increases the probability that people use future AIs to solve these problems, rather than because it will make people slow down AI development or try very hard to solve them pre-TAI.
  - Elliott Thornley 7 Nov 2025 22:44 UTC
    4 points
    I’m not sure, but I think I may also have a different view than you on which problems are going to be bottlenecks to AI development. For example, I think there’s a big chance that the world would steam ahead even if we don’t solve any of the current (non-philosophical) problems in alignment (interpretability, shutdownability, reward hacking, etc.).