AI Safety Field Building vs. EA CB
Summary
As part of the EA Strategy fortnight, I am sharing a reflection on my experience doing AI safety movement building over the last year, and why I am more excited about further efforts in this space than about EA movement-building. This is mostly due to the relative success of AI safety groups compared to EA groups at universities with both (e.g. read about Harvard and MIT updates from this past year here). I expect many of the takeaways to extend beyond the university context. The main reasons AI safety field building seems more impactful are:
Experimental data from universities with substantial effort put into EA and AI safety groups: Higher engagement overall, and from individuals with relevant expertise, interests, and skills
Stronger object-level focus encourages skill and knowledge accumulation, offers better career capital, and lends itself to engagement from more knowledgeable and senior individuals (including graduate students and professors).
Impartial/future-focused altruism not being a crux for many people working on AI safety
Recent developments increasing the salience of potential risks from transformative AI, and decreasing the appeal of the EA community/ideas.
I also discuss some hesitations and counterarguments, of which the large decrease in neglectedness of existential risk from AI is most salient (and which I have not reflected too much on the implications of yet, though I still agree with the high-level takes this post argues for).
Context/Why I am writing about this
I helped set up and run the Cambridge Boston Alignment Initiative (CBAI) and the MIT AI Alignment group this past year. I also helped out with Harvard’s AI Safety team programming, along with some broader university AI safety programming (e.g. a retreat, two MLAB-inspired bootcamps, and a 3-week research program on AI strategy). Before this, I ran the Stanford Existential Risks Initiative and effective altruism student group and have supported many other university student groups.
Why AI Safety Field Building over EA Community Building
From my experiences over the past few months, it seems that AI safety field building is generally more impactful than EA movement building for people able to do either well, especially at the university level (under the assumption that reducing AI x-risk is probably the most effective way to do good, which I assume in this article). Here are some reasons for this:
AI-alignment-branded outreach is empirically attracting many more students with relevant skill sets and expertise than EA-branded outreach at universities.
Anecdotal evidence: At MIT, we received ~5x the number of applications for AI safety programming compared to EA programming, despite similar levels of outreach last year. This ratio was even higher when just considering applicants with relevant backgrounds and accomplishments. Around two dozen winners and top performers of international competitions (math/CS/science olympiads, research competitions) and students with significant research experience engaged with AI alignment programming, but very few engaged with EA programming.
This phenomenon at MIT has also roughly been matched at Harvard, Stanford, Cambridge, and I’d guess several other universities (though I think the relevant ratios are slightly lower than at MIT).
It makes sense that things marketed with a specific cause area (e.g. AI rather than EA) are more likely to attract individuals highly skilled, experienced, and interested in topics relevant to the cause area.
Effective cause-area specific direct work and movement building still involves the learning, understanding, and application of many important principles and concepts in EA:
Prioritization/Optimization are relevant, to maximally reduce existential risk.
Relatedly, consequentialism/effectiveness/focusing on producing the best outcomes and what actually works, as well as willingness to pivot, seem important to emphasize as part of strong AI safety programming and discussions.
Intervention neutrality—Even within AI alignment, there are many ways to contribute: conceptual alignment research, applied technical research, lab governance, policy/government, strategy research, field-building/communications/advocacy, etc. Wisely determining which of these to focus on requires engagement with many principles core to EA.
(Low confidence) So far, I’ve gotten the impression that the students who have gotten most involved with AIS student groups are orienting to the problem with a “How can I maximally reduce x-risk?” frame, not “Which aspect of the problem seems most intellectually stimulating?”.
The existential vs. non-existential risks distinction remains relevant, to prioritize mitigating the former.
This distinction also naturally leads to discussion about population ethics, moral philosophy, altruism (towards future generations), and other related ideas.
Truth-seeking and strong epistemics remain relevant.
Caveat: Empirically, maintaining strong epistemics and a culture of truth-seeking have not been emphasized as much in AIS groups from my experience, and it feels slightly unnatural to do so (though I think the case for their importance can be made pretty straightforwardly, given how confusing AI and alignment are, the paucity of feedback loops, and the importance of prioritization given limited time and resources).
When much of the cause-area specific field-building work is done by EAs, and much of the research/content engaged with is from EAs, people will naturally interact with EAs, and some will be sympathetic to the ideas.
Cause-area specific movement building incentivizes a strong understanding of cause area object-level content, which both acts as a selection filter (which standard EA community building lacks), and helps make movement-builders better suited to pivot to object-level work. This makes organizing especially appealing for students who might not want to commit to movement building work long-term.
I think it is useful for people running cause-area specific movement building projects (including student groups) to be pretty motivated to have their group maximally mitigate existential risk/improve the long-term future, since doing the aforementioned prioritization well and creating/maintaining strong culture (with e.g. high levels of truth-seeking, and a results-focused framework) is difficult and unlikely without these high-level goals.
A stronger object-level focus also makes engagement more appealing to individuals with subject matter expertise, like graduate students and professors. Empirically, grad student and professor engagement has been much stronger and more successful with AI safety groups than EA/existential risk focused groups so far.
The words “effective altruism” do not really elicit what I believe is most important and exciting about EA principles and the community, and what many of us currently think is most important to work on (e.g. global/universal impartial focus, prioritization/optimization, navigating and improving technological development and addressing its risks, etc).
AI risk, existential risk, and longtermism get at some items listed above, but maybe don’t get at prioritization/optimization well. Still, perhaps STEM-heavy cause area programming naturally attracts people interested in applying optimization to real life.
The reputation of the EA community and name has (justifiably) taken a big hit in light of the several recent scandals, making EA CB look worse. On the other hand, AI alignment has been getting a ton of positive attention and concern from the general public and relevant stakeholders.
That being said, the effects of the scandals on top university students’ perception of EA seem much smaller than I initially expected (e.g. most people think of the FTX crash as an example of crypto being crazy/fake). According to a Rethink Priorities survey, only 20% of people who have heard about EA have heard about FTX.
Not needing to externally justify expenditures on common-sense altruistic grounds: Many of the community building interventions that seem most exciting involve spending money in ways that seem unusual in a university or common-sense altruistic context (e.g. group organizing salaries and costs, organizing workshops at large venues, renting office spaces). I think that some of these are more socially acceptable when not done in the name of ‘altruism’ or charity even if the group has similar motivations to EA groups in its culture (or at the very least this helps to insulate EA from some negative reputational effects).
Anecdotally, impartial/future-focused altruism is not the primary motivation for a large portion of individuals working full-time on AI existential risk reduction (and maybe the majority). Impartial altruism does not seem like the most compelling way one would get people to seriously consider working on existential risk reduction, as is discussed here, here, and here.
Counterarguments and Hesitations
I have not been working on AI safety/cause-area specific movement building for long enough (and AIS groups in general have not been very active for long enough) to feel confident that exciting leading indicators will translate into long-term impact. EA community building has a longer track record. The small sample sizes also reduce my confidence in the above takeaways.
Perhaps strong philosophical/ethical commitments (as opposed to say visceral urgency/concern and amazement at the capabilities of AI, or its rate of improvement) end up being more important than I currently estimate for long-term changes to career plans and behavior more generally.
Maybe the non-altruistic case for existential risk mitigation isn’t sound, e.g. because someone’s likelihood of being able to contribute is too low to justify working on x-risk reduction, instead of achieving their goals another way. If so, maybe insufficiently altruistically motivated people will realize this and pivot to something else.
Figuring out what is true and helpful in the context of AI safety might be sufficiently difficult that the downsides of movement building and outreach (e.g. lower epistemic standards and lower-quality content on e.g. LessWrong/the alignment forum) might outweigh the upsides (e.g. more motivated/talented people working on AI alignment).
AI safety is getting more mainstream than EA. Many of the people I expect to be most impactful would not have initially gotten involved with an AI safety group, but got into EA first and eventually switched to AI (though others like Open Philanthropy would have a better sense of this). The huge increase in discourse and attention on advanced AI might make the usefulness of proactive outreach and education about AI safety much lower moving forward than it was half a year ago.
Historically, AI-alignment-driven writing and field-building seems to have significantly contributed to (speeding up) AI capabilities—potentially more than it has contributed to alignment/making the future better. AI alignment field-building might continue (or start) to have this effect.
My current intuition is: AGI hype has gotten high enough that the break-even ratio of median capabilities researchers to safety researchers produced by CB is pretty high (maybe >10:1, not sure), and definitely higher than what leading indicators suggest field-building is currently producing.
Conclusion
On the margin, I’d direct more resources towards AI safety movement building, though I still think EA movement-building can be very valuable and should continue to some extent. I’d be interested in hearing others’ experiences and thoughts on AI safety and other cause area field building compared to EA CB in the comments.
Explicitly switching to AI only seems like a case of putting all our eggs in one highly speculative basket. We don’t know how the case for AI safety will stack up in 10 years: if we commit too hard and it turns out to be overblown, will EA as a movement be over?
I think the premise that EA will be over because of AI safety community building is confused, given that this is a shift on the margins and EA movement building still exists. There’s also a companion piece to this by Jessica McCurdy making the case for EA community building alongside AI safety specific community building. And I don’t think this piece argues for every resource to go to AI safety community building.
In case anyone is interested, here is that piece.
Dunno what the exact ratio would look like (since the different groups run somewhat different kinds of events), but we’ve definitely seen a lot of interest in AIS at Carnegie Mellon as well. There’s also not very much overlap between the people who come to AIS things and those who come to EA things.
Thanks, you make a compelling argument for AI safety movement building. I especially like that you have a lot of community building experience to draw these conclusions from. However, I think you might be (perhaps unintentionally) setting up the impression of a false dichotomy here between general EA community building and AI safety community building.
I might be wrong, but perhaps you are saying that EA should intentionally support the budding AI alignment community more heavily than it does now, and in some cases this community should be prioritised for funding over other EA groups? That would seem reasonable to me at least. Your conclusion of “On the margin, I’d direct more resources towards AI safety movement building, though I still think EA movement-building can be very valuable and should continue to some extent.” seems to back up my take?
It makes sense to me that EA funds could experiment with investing a decent amount in communities built specifically around AI safety, then gather data for a couple of years and see if it produces both a consistent community and fruitful counterfactual AI safety efforts. It seems likely these communities could be intertwined and connected with current EA communities (to different extents in different places), but they could also be very separate. This might already be an explicit plan which is happening and I’ve missed it.
Also, initial recruitment numbers only tell part of the effectiveness story. One of the strengths of EA is that people, once they join the community, often:
1. Devote a decent part of their life/time/resources to the community and the work
2. Have a decent likelihood of being in it for the long term (This must be quantified somewhere too)
Whether these features would also be present in an AI safety community remains to be seen.
Like titotal said, I don’t think a drastic pivot pulling a huge amount of money away from EA community building and towards AI safety groups would be a great strategic move. Putting all our eggs in one basket and leaving established communities high and dry seems like a bad move - mind you I don’t think that will happen anyway.
Final question: “Anecdotally, impartial/future-focused altruism is not the primary motivation for a large portion of individuals working full-time on AI existential risk reduction (and maybe the majority).” If not this, then what is their motivation, outside of perhaps selfish fear for themselves or their family? I’m genuinely intrigued here.
Nice one!
What’s your current best-guess for what the leading indicators would suggest?
I would guess the ratio is pretty skewed in the safety direction (since uni AIS CB is generally not counterfactually getting people interested in AI when they previously weren’t; if anything, EA might have more of that effect), so maybe something in the 1:10 to 1:50 range (1:20-ish point estimate for the ratio of median capabilities research to median safety research contribution from AIS CB)?
I don’t really trust my numbers though. This ratio is also more favorable now than I would have estimated a few months/years ago, when contribution to AGI hype from AIS CB would have seemed much more counterfactual (but also AIS CB seems less counterfactual now that AI x-risk is getting a lot of mainstream coverage).
I would be surprised if the accurate number is as low as 1:20 or even 1:10. I wish there was more data on this, though it seems a bit difficult to collect since at least for university groups most of the impact (to both capabilities and safety) will occur a few+ years after the students start engaging with the group.
I also think it depends a lot on the best opportunities available to them: what opportunities exist in the near future to work on AI safety versus AI capabilities for people with their aptitudes.
I agree with this. E.g. I think I know specific people who went through AIS CB (though not the recent uni groups, because they are younger and there’s more lag) and either couldn’t or wouldn’t find AIS jobs, so ended up working in AI capabilities.
Yeah, same. I know of recent university graduates interested in AI safety who are applying for jobs in AI capabilities alongside AI safety jobs.
It makes me think that what matters more is changing the broader environment to care more about AI existential risk (via better arguments, more safety orgs focused on useful research/policy directions, better resources for existing ML engineers who want to learn about it etc.) rather than specifically convincing individual students to shift to caring about it.
I’ve also heard people doing SERI MATS, for example, explicitly talk/joke about this: how they’d have to work in AI capabilities now if they don’t get AI safety jobs.
I’m impressed the ratio is that favourable! One note of caution: just because people start off hyped about AI safety doesn’t mean they stay there. There’s a decent chance they will swing to the dark side of capabilities, as we saw with OpenAI and probably others as well. Just making the point that the starting ratio might look more favourable than the ratio after a few years.
Thanks, this is helpful!
Not worsening the current ratio would be a reasonable first guess, and although it depends a lot on how you define safety researchers, I’d say it’s effectively somewhere around 20:1.
Sorry, are you saying that the current ratio of capabilities researchers to safety researchers produced by AIS field-building is 20:1, or that the current ratio of the researchers overall is 20:1?
(If the latter, then I think my original question was insufficiently clear and I should probably edit it).
The second one—I was addressing what ratio would be beneficial, but maybe you wanted to understand what the ratio actually is?