"The emphasis on technical solutions only benefits them"
This is blatantly question-begging, right? It is only true if looking for technical solutions doesn't lead to safe models, which is one of the main points in dispute between you and people with a higher opinion of working on the inside as a safety strategy. Of course, it is true that if you don't have your own opinion already, you shouldn't trust people who work at leading labs (or want to) on the question of whether technical safety work will help, for the reasons you give. But "people have an incentive to say X" isn't actually evidence that X is false; it's just evidence that you shouldn't trust them. If all people outside labs thought technical safety work was useless, that would be one thing. But I don't think that is actually true; it seems people with relevant expertise are divided even outside the labs. Now of course, there are subtler ways in which even people outside the labs might be incentivized to play down the risks. (Though they might also have other reasons to play them up.) But even that won't get you to "therefore technical safety is definitely useless"; it's all meta, not object-level.
There's also a subtler point: even if "do technical safety work on the inside" is unlikely to work, it might still be the better strategy if confrontational lobbying from the outside is unlikely to work too (something that I think is more true now that Trump is in power, although Musk is a bit of a wildcard in that respect).
I didn't mean "there is no benefit to technical safety work"; I meant more like "there is only benefit to labs from emphasizing technical safety work to the exclusion of other things", as in it benefits them and doesn't cost them anything to do this.