I’m curious whether people (e.g., David, MIRI folk) think that LLMs, now or in the near future, could substantially speed up this kind of theoretical safety work?
I would prefer a pause on more capable LLMs, in part to give us time to figure out how to align these systems. As I argued, I think mathematical approaches are potentially critical there. But yes, general intelligences could help—I just don’t expect them to be differentially valuable for mathematical safety over capabilities, so if they become capable of this kind of work, I expect it to be a net loss.