AI Safety Field Building vs. EA CB
As part of the EA Strategy fortnight, I am sharing a reflection on my experience doing AI safety movement building over the last year, and why I am now more excited about additional efforts in that space than about EA movement-building. This is mostly due to the relative success of AI safety groups compared to EA groups at universities with both (e.g. read about Harvard and MIT updates from this past year here). I expect many of the takeaways to extend beyond the university context. The main reasons AI safety field building seems more impactful are:
Experimental data from universities with substantial effort put into EA and AI safety groups: Higher engagement overall, and from individuals with relevant expertise, interests, and skills
Stronger object-level focus encourages skill and knowledge accumulation, offers better career capital, and lends itself to engagement from more knowledgeable and senior individuals (including graduate students and professors).
Impartial/future-focused altruism is not a crux for many people's decision to work on AI safety.
Recent developments increasing the salience of potential risks from transformative AI, and decreasing the appeal of the EA community/ideas.
I also discuss some hesitations and counterarguments, of which the large decrease in neglectedness of existential risk from AI is most salient (I have not yet reflected much on its implications, though I still agree with the high-level takes this post argues for).
Context/Why I am writing about this
I helped set up and run the Cambridge Boston Alignment Initiative (CBAI) and the MIT AI Alignment group this past year. I also helped out with Harvard’s AI Safety team programming, along with some broader university AI safety programming (e.g. a retreat, two MLAB-inspired bootcamps, and a 3-week research program on AI strategy). Before this, I ran the Stanford Existential Risks Initiative and effective altruism student group and have supported many other university student groups.
Why AI Safety Field Building over EA Community Building
From my experiences over the past few months, it seems that AI safety field building is generally more impactful than EA movement building for people able to do either well, especially at the university level (under the assumption that reducing AI x-risk is probably the most effective way to do good, which I assume in this article). Here are some reasons for this:
AI-alignment-branded outreach is empirically attracting many more students with relevant skill sets and expertise than EA-branded outreach at universities.
Anecdotal evidence: At MIT, we received ~5x the number of applications for AI safety programming compared to EA programming, despite similar levels of outreach last year. This ratio was even higher when just considering applicants with relevant backgrounds and accomplishments. Around two dozen winners and top performers of international competitions (math/CS/science olympiads, research competitions) and students with significant research experience engaged with AI alignment programming, but very few engaged with EA programming.
This phenomenon at MIT has also roughly been matched at Harvard, Stanford, Cambridge, and I’d guess several other universities (though I think the relevant ratios are slightly lower than at MIT).
It makes sense that things marketed with a specific cause area (e.g. AI rather than EA) are more likely to attract individuals highly skilled, experienced, and interested in topics relevant to the cause area.
Effective cause-area specific direct work and movement building still involves the learning, understanding, and application of many important principles and concepts in EA:
Prioritization/Optimization are relevant, to maximally reduce existential risk.
Relatedly, consequentialism/effectiveness/focusing on producing the best outcomes and what actually works, as well as willingness to pivot, seem important to emphasize as part of strong AI safety programming and discussions.
Intervention neutrality—Even within AI alignment, there are many ways to contribute: conceptual alignment research, applied technical research, lab governance, policy/government, strategy research, field-building/communications/advocacy, etc. Wisely determining which of these to focus on requires engagement with many principles core to EA.
(Low confidence) So far, I’ve gotten the impression that the students who have gotten most involved with AIS student groups are orienting to the problem with a “How can I maximally reduce x-risk?” frame, not “Which aspect of the problem seems most intellectually stimulating?”.
The existential vs. non-existential risks distinction remains relevant, to prioritize mitigating the former
This distinction also naturally leads to discussion about population ethics, moral philosophy, altruism (towards future generations), and other related ideas.
Truth-seeking and strong epistemics remain relevant.
Caveat: Empirically, strong epistemics and a culture of truth-seeking have not been emphasized as much in AIS groups from my experience, and it feels slightly unnatural to emphasize them (though I think the case for their importance can be made pretty straightforwardly given how confusing AI and alignment are, the paucity of feedback loops, and the importance of prioritization given limited time and resources).
When much of the cause-area specific field-building work is done by EAs, and much of the research/content engaged with is from EAs, people will naturally interact with EAs, and some will be sympathetic to the ideas.
Cause-area specific movement building incentivizes a strong understanding of cause area object-level content, which both acts as a selection filter (which standard EA community building lacks), and helps make movement-builders better suited to pivot to object-level work. This makes organizing especially appealing for students who might not want to commit to movement building work long-term.
I think it is useful for people running cause-area specific movement building projects (including student groups) to be pretty motivated to have their group maximally mitigate existential risk/improve the long-term future, since doing the aforementioned prioritization well and creating/maintaining strong culture (with e.g. high levels of truth-seeking, and a results-focused framework) is difficult and unlikely without these high-level goals.
A stronger object-level focus also makes engagement more appealing to individuals with subject matter expertise, like graduate students and professors. Empirically, grad student and professor engagement has been much stronger and more successful with AI safety groups than EA/existential risk focused groups so far.
The words “effective altruism” do not really elicit what I believe is most important and exciting about EA principles and the community, and what many of us currently think is most important to work on (e.g. global/universal impartial focus, prioritization/optimization, navigating and improving technological development and addressing its risks, etc).
AI risk, existential risk, and longtermism get at some items listed above, but maybe don’t get at prioritization/optimization well. Still, perhaps STEM-heavy cause area programming naturally attracts people interested in applying optimization to real life.
The reputation of the EA community and name has (justifiably) taken a big hit in light of the several recent scandals, making EA CB look worse. On the other hand, AI alignment has been getting a ton of positive attention and concern from the general public and relevant stakeholders.
That being said, the effects of the scandals on top university students’ perception of EA seem much smaller than I initially expected (e.g. most people think of the FTX crash as an example of crypto being crazy/fake). According to a Rethink Priorities survey, only 20% of people who have heard about EA have heard about FTX.
Not needing to externally justify expenditures on common-sense altruistic grounds: Many of the community building interventions that seem most exciting involve spending money in ways that seem unusual in a university or common-sense altruistic context (e.g. group organizing salaries and costs, organizing workshops at large venues, renting office spaces). I think that some of these are more socially acceptable when not done in the name of ‘altruism’ or charity even if the group has similar motivations to EA groups in its culture (or at the very least this helps to insulate EA from some negative reputational effects).
Anecdotally, impartial/future-focused altruism is not the primary motivation for a large portion of individuals working full-time on AI existential risk reduction (and maybe the majority). Impartial altruism does not seem like the most compelling way one would get people to seriously consider working on existential risk reduction, as is discussed here, here, and here.
Counterarguments and Hesitations
I have not been working on AI safety/cause-area specific movement building for long enough (and AIS groups in general have not been very active for long enough) to feel confident that exciting leading indicators will translate into long-term impact. EA community building has a longer track record. The small sample sizes also reduce my confidence in the above takeaways.
Perhaps strong philosophical/ethical commitments (as opposed to say visceral urgency/concern and amazement at the capabilities of AI, or its rate of improvement) end up being more important than I currently estimate for long-term changes to career plans and behavior more generally.
Maybe the non-altruistic case for existential risk mitigation isn’t sound, e.g. because someone’s likelihood of being able to contribute is too low to justify working on x-risk reduction, instead of achieving their goals another way. If so, maybe insufficiently altruistically motivated people will realize this and pivot to something else.
Figuring out what is true and helpful in the context of AI safety might be sufficiently difficult that the downsides of movement building and outreach (e.g. lower epistemic standards and lower-quality content on e.g. LessWrong/the alignment forum) might outweigh the upsides (e.g. more motivated/talented people working on AI alignment).
AI safety is getting more mainstream than EA. Many of the people I expect to be most impactful would not have initially gotten involved with an AI safety group, but got into EA first and eventually switched to AI (though others like Open Philanthropy would have a better sense of this). The huge increase in discourse and attention on advanced AI might make the usefulness of proactive outreach and education about AI safety much lower moving forward than it was half a year ago.
Historically, AI-alignment-driven writing and field-building seem to have significantly contributed to (speeding up) AI capabilities, potentially more than they have contributed to alignment/making the future better. AI alignment field-building might continue to (or start to) have this effect.
My current intuition is that AGI hype has gotten high enough that the ratio of median capabilities researchers to safety researchers that CB could produce while still being beneficial is pretty high (maybe >10:1, not sure), and definitely higher than the ratio that leading indicators suggest field-building is producing at the moment.
On the margin, I’d direct more resources towards AI safety movement building, though I still think EA movement-building can be very valuable and should continue to some extent. I’d be interested in hearing others’ experiences and thoughts on AI safety and other cause area field building compared to EA CB in the comments.