This seems almost exactly like the repugnant conclusion. Taken to extremes, intuition disagrees with logic. When that happens, it’s usually the worse for intuition.
I’m not a utilitarian, but I find the repugnant conclusion impossible to reject if you are.
If you want to choose what is good for everyone, there's little argument about what that is in those cases.
And if we’re talking about what’s good for everyone, that’s got to be a linear sum of what’s good for each individual. If the sum is nonlinear, who exactly is worth less than the others? That leads to the repugnant conclusion, and to your conclusion here.
Other definitions of “good for everyone” seem to always mean “what I idiosyncratically prefer for everyone else but me”.
It seems like having genuinely safety-minded people within orgs is invaluable. Do you think that having them refuse to join is going to meaningfully slow things down?
It just takes one brave or terrified person in the know to say “these guys are internally deploying WHAT? I’ve got to stop this!”
I worry very much that we won’t have even one such person in the know at OpenAI. I’m very glad we have them at Anthropic.
Having said that, I agree that Anthropic should not be shielded from criticism.
Your assumption that influence flows only one way in organizations seems based on fear, not psychology. If someone believes AGI is a real risk, they should be motivated enough to resist some pressure from superiors who merely argue that they’re doing good stuff.
If you won’t actively resist having your beliefs shift once you join a culture with importantly different beliefs, then don’t join such an org.
While Anthropic’s plan is a terrible one, so is PauseAI’s. We have no good plans. And we mustn’t fight amongst ourselves.