1 is very true. 2 I agree with apart from the word "main": it seems hard to label any factor as "the main" thing, and there's a bunch of complex reasoning about counterfactuals. E.g. if GDM stopped work, that wouldn't stop Meta, so is GDM working on capabilities actually the main thing?
I'm pretty unconvinced that not sharing results with frontier labs is tenable. Leaving aside that these labs are often the best places to do certain kinds of safety work, if our work is to matter, we need the labs to use it! And you often get valuable feedback on the work by seeing it actually used in production. Having a bunch of safety people who work in secret and then unveil their safety plan at the last minute seems very unlikely to work to me.