I feel quite worried that Anthropic’s alignment plan currently boils down to “we are the good guys, and by doing a lot of capabilities research we will have a seat at the table when AI gets really dangerous, and then we will just be better/more-careful/more-reasonable than the existing people, and that will somehow make the difference between AI going well and going badly”. That plan isn’t inherently doomed, but man does it rely on trusting Anthropic’s leadership. I genuinely have only a marginally better ability to distinguish the moral character of Anthropic’s leadership from the moral character of FTX’s leadership, and in the absence of that trust, the only thing we are doing with Anthropic is adding another player to an AI arms race.
More broadly, I think AI Alignment ideas/the EA community/the rationality community played a pretty substantial role in the founding of the three leading AGI labs (DeepMind, OpenAI, Anthropic), and man, I sure would feel better about a world where none of them existed, though I also feel quite uncertain here. But it sure does feel like we had a quite large counterfactual effect on AI timelines.
Thank you so much for voicing these concerns. I share them too and they need to be said more loudly. I’m extremely worried the EA/LessWrong community has had a net negative impact on the world simply because of the increased AI risk.[1] I haven’t heard any good arguments against this.
If we exclude AI-related work, I do think EA has been net positive.