In that case your strategy is just feeding the labs talent and poisoning the ability of their circles to oppose them.
It seems like your model only has such influence going one way. The lab worker will influence their friends, but not the other way around. I think two-way influence is a more accurate model.
Another option is to ask your friends to monitor you so you don’t get ideologically captured, and hold an intervention if it seems appropriate.
I think you, and this community, have no idea how difficult it is to resist value/mission drift in these situations. This is not a friend:friend exchange. It’s a small community of nonprofits and individuals:the most valuable companies in the world. They aren’t just gonna pick up the values of a few researchers by osmosis.
From your other comment it seems like you have already been affected by the labs’ influence via the technical research community. The emphasis on technical solutions only benefits them, and it just so happens that to work on the big models you have to work with them. This is not an open exchange where they have been just as influenced by us. Sam and Dario sure want you and the US government to think theirs is the right safety approach, though.
“The emphasis on technical solutions only benefits them”
This is blatantly question-begging, right? It is only true if looking for technical solutions doesn’t lead to safe models, which is one of the main points in dispute between you and people with a higher opinion of the “work on safety from the inside” strategy. Of course, it is true that if you don’t have your own opinion already, you shouldn’t trust people who work at leading labs (or want to) on the question of whether technical safety work will help, for the reasons you give. But “people have an incentive to say X” isn’t actually evidence that X is false; it’s just evidence you shouldn’t trust them. If all people outside labs thought technical safety work was useless, that would be one thing. But I don’t think that is actually true: it seems people with relevant expertise are divided even outside the labs. Now of course, there are subtler ways in which even people outside the labs might be incentivized to play down the risks. (Though they might also have other reasons to play them up.) But even that won’t get you to “therefore technical safety is definitely useless”; it’s all meta, not object-level.
There’s also a subtler point: even if “do technical safety work on the inside” is unlikely to work, it might still be the better strategy if confrontational lobbying from the outside is unlikely to work too (something that I think is more true now Trump is in power, although Musk is a bit of a wildcard in that respect).
I didn’t mean “there is no benefit to technical safety work”; I meant more like “there is only benefit to labs to emphasizing technical safety work to the exclusion of other things”, as in it benefits them and doesn’t cost them to do this.