I wrote a post that I think was partly inspired by this discussion. The implication of it here is that I don’t necessarily want philosophers to directly try to solve the many hard philosophical problems relevant to AI alignment/safety (especially given how few of them are in this space or concerned about x-safety), but initially just to try to make them “more legible” to others, including AI researchers, key decision makers, and the public. Hopefully you agree that this is a more sensible position.
try to make them “more legible” to others, including AI researchers, key decision makers, and the public
Yes, I agree this is valuable, though I think it’s valuable mainly because it increases the probability that people use future AIs to solve these problems, rather than because it will make people slow down AI development or try very hard to solve them pre-TAI.
I’m not sure, but I think I may also have a different view from yours on which problems are going to be bottlenecks to AI development. For example, I think there’s a big chance that the world would steam ahead even if we don’t solve any of the current (non-philosophical) problems in alignment (interpretability, shutdownability, reward hacking, etc.).