Reflections on Anthropic and EA
This is crossposted from my blog.
These are personal reflections on feelings that I’ve been sitting with recently. I’m posting them because the last time I felt this way, I regretted not doing so. I don’t really know how calibrated they are, but I’ve been noticing them more and more.
***
Around 6 months before FTX collapsed, I wrote a draft EA Forum post called “Concerns about the Carrick Flynn campaign.” At the time, I was on the verge of leaving EA. During the frothiest FTX days it seemed like many folks were energized by the money and attention, but I felt kind of gross. The amount of money pouring into EA, and the resulting “lowering of the bar” that happened felt like a degradation of community norms that was too severe, almost not worth the benefits. Of course, some people voiced these concerns at the time, but they were drowned out by the excitement of the moment.
I never published the post, but I ended it by saying:
This feels incredibly poorly thought out, like it is one negative news article away from ruining EA’s reputation in the public’s eye, and that it also represents some degree of moral/value slippage that probably shouldn’t be tolerated by a community really focused on doing the most good. I admit that all of this has made me feel more disconnected from EA than I’ve ever felt before, despite also feeling very value-aligned with the people making these decisions.
I don’t really know what to do about all of this, but I guess I’m just scared?
I’ve noticed over the last two months that some of these same feelings are coming back for the first time since FTX. This time, I’m paying a lot more attention to them, because last time I think these feelings turned out to be very attuned to a risk the community was mostly ignoring.
For me, an essential feature of EA has always been its demandingness. I don’t mind that my commitment to utilitarianism pushes me to hold views that might be considered weird in mainstream circles. I don’t mind that EA can be all consuming. I don’t mind that EA can push me to make hard trade-offs for myself. We’re trying to make the best possible world. I would have been foolish to sign up while thinking it would be easy or pleasant. The EAs I like best are people who take the mandate of morality seriously. I disagree with many of these people’s views (I think we should pay more than the absolute minimum salary, I don’t think that our most fundamental goal should be to make the largest good future, etc.) but their beliefs are demanding, and I deeply respect people who follow that demandingness to the very end.
Before FTX, the EA I was in was more universally demanding. During the FTX era, it lost that for a while. Getting your project funded basically required that you had a pulse and sounded like an EA. Today, in response to a growing sense that major Anthropic money is coming, I am starting to have a sense of that same slippage happening again. As a result, when I reflect on how I respond to choices I’m given, I’m worried that much of what feels essential in me about being an EA is being corrupted again.
I hear daily about the coming Anthropic funds. Of course, Anthropic is very different from FTX. It is a legitimate company and has a reasonable chance at being the most important company in the world at a moment that may be one of the most important in history. Many of the people I’ve met who work there are thoughtful, generous, charitable people. I don’t think anyone is committing an enormous fraud, and I think the work they do is fundamentally much more important than FTX’s.
But at the same time, I’ve had a growing number of experiences that make me feel like I’m back in the FTX era. Immense, undemocratic political spending is ramping up (though spending by the enemies of AI safety is also growing) — I have no idea if this is good or bad, but it is, for my sense of morality, unsettling. I have been offered jobs that are billed as impactful and that pay vastly more money than I need while not providing any clear benefit for the world. I feel daily pressure to move to San Francisco, where I’m told that I can “increase [my] influence over the future by somewhere between 10x and 1000x” (an extraordinary claim without extraordinary evidence). It is hard to make the world better, and we should expect it is nearly impossible to do it effectively. It shouldn’t feel like an easy choice to try. To move to San Francisco, and take a cushy job where I am paid an enormous amount of money to write navel-gazing think pieces, and spend all day in a lovely office with the smartest people in the world while eating free lunches, feels like an easy choice.
Right now, I’m leaving my job to go start a new charity, working on a very narrow intervention. Despite every comparison in potential impact I and others make on paper suggesting this is where I will have the most impact, I instead get constant, subtle and unconscious signals from the community that I’m making a major mistake: I’m turning down economic security by turning down jobs in “EA” that pay significantly more; I’m not moving to the place “where I’ll have the most influence,” etc. But when I look at some of the other options I feel fortunate to have, and try to imagine taking them if they either didn’t bring me closer to power or paid only what I make now, they lose all appeal. That they appeal at all to me right now feels fundamentally like a corruption of my spirit. I want to be motivated by the impact of my actions, and I’m constantly receiving a subtle signal that I should put aside what is more obviously impactful, go make a lot of money, and get closer to a lot of power.
This is, of course, very tempting, and sounds very exciting. It would be a glorious coincidence if, of all possible things I could be doing, the thing that is most intellectually interesting and fun and increases my status the most also happens to be the most impactful thing I could do with my life. Similarly, why is it that taking a high-paying, high-power job where the promise of impact is a vague sense of “influence” at the cost of progress on a difficult, high-risk, and concrete intervention is so appealing?
Outside of EA, this pressure isn’t surprising — but that this pressure now feels like it is coming from many in EA, or from people affiliated with the communities I care about, is what confuses me. I think I used to find EA to be a bulwark that supported me through hard moral decisions — now, it feels more like a counterpressure to making the right choice. The most tempting opportunities to not do the most good I can are coming from the community itself.
In a sense, I feel a bit of my spirit being corrupted. I, and my community, are being given choices that I think will take moral courage to navigate, and I don’t know if we are always going to make the right choices. The community has money, and so it is growing. But I think a core lesson of FTX was that the growth that comes with being seen as “the people who will give money to everything that’s vaguely aligned” is growth that may not be worth having.
The most obvious reason that money and power would align with expected value is that, unfortunately, money does make it possible to do a lot of things in the world. I could be wrong to be worried. I generally do think this new, vast source of funding will probably be good, and I don’t have the same sense I had during the FTX era of there being genuinely massive corruption occurring. No one has tried to pay me $10k to fly to the Bahamas just to hang out (yet). Maybe the community becoming less pleasant for me is not really a sign that it’s becoming less rigorous, and just an indicator that I fit in less with the new trends. Many people in the Bay doing work with or against AI labs are doing incredibly difficult things that take a huge amount of moral courage — especially those directly challenging labs. But at the same time, I think there are also major costs to gaining power and influence — by and large, people seem to make worse decisions when they have them. I don’t think our community has figured out how to navigate these trade-offs as it steps closer to significant power.
As a result, I worry I’m going to lose the EA I loved. When EA had no power and low impact on the world outside the people helped by bednets and the chickens living better lives, I had absolute faith in the community — the people I met were all my allies in this strange corner of ethics, who I trusted as strangers more than I trusted anyone else. Now, I regularly meet new people in the community who feel only like strangers, or worse, strangers who want to step on my shoulders to meet my richer and more powerful friends and connections. Now, many people I know are spending more time trying to influence Anthropic staff than figuring out how to do good and acting on it. Maybe it’ll pay off. I hope it does. But there is a part of me that is skeptical.
I think EA’s failure to grapple with the corrupting influence of power is among its greatest failures. As EA, or at least, a company that is strongly influenced by EA values and approaches, is once again at the top of the world, my faith that we’ll keep making the right choices is slipping away. As I said the last time I typed a post like this, I don’t really know what to do about all of this, but I guess I’m just scared?
“I think EA’s failure to grapple with the corrupting influence of power is among its greatest failures.”
This has been the feature of forum discussions that has disturbed me possibly the most since joining. People don’t like to put any weight on conflicts of interest even when the person arguing a point has a huge amount to gain. “Just argue the object point” people say, don’t bring up the conflict of interest…
People seem surprised and bewildered when AI folks defect away from AI safety towards capabilities. People trust that as AI companies grow, those gaining power and money from shares will not be adversely influenced by that power and money.
Even as I have gained a teeny weeny bit of power in just a teeny weeny corner of the global health world, I have felt a little of the corrupting influence. Living far away from all this in Uganda, I’m not part of it at all, and like you it’s very unclear to me what can be done to help, but talking about it a bit could be better than nothing. I loved this post, thank you!
I suppose it’s a reaction to the tendency on the political left to not listen to a person at all because of some association they have with some group.
But I agree with you. We should be wary of these dynamics, without falling into black and white ways of thinking.
Also to second Nick, I really ‘felt’ and resonated with this post!
fwiw I don’t actually know many examples of this, and the ones I hear cited often seem uncompelling to me. E.g.:
Greg Brockman doesn’t seem like a true believer in OpenAI’s nonprofit mission who got corrupted but rather someone who went into it wanting to make a profit
Mechanize’s founders don’t seem like EAs who got corrupted by AI money but rather EAs with unusual moral and empirical views which result in them thinking that the best course of action is the exact opposite of what most EAs think
(Counterexamples appreciated, though!)
I think he would include a lot of people who work at Anthropic, for example, on pre-training, some of whom went through MATS or something.
Thanks! I only know a handful of people in this category, but for what it’s worth, it again feels like people who were predisposed to thinking that working on pretraining would be okay rather than them being “corrupted.”
E.g., I recently talked to someone who told me that their main takeaway from a safety fellowship was realizing that they didn’t fit in because they actually weren’t worried about existential risk in the same way that the other attendees were.
Hmm, I think if smart EA/Rat types get “corrupted” in general, they’ll present as thoughtful people with reasons that are hard to dismiss quickly when questioned by EAs. I get the vague sense that your evidence bar for “corruption” is going to be too high to be useful in most worlds where there’s a lot of corruption.
(that’s not to say that EAs/Rats/etc. who join labs/start wildly profitable companies speeding up AI progress have been “corrupted”—I just think if they were, it would present pretty similarly to how it has done and it’s hard to get lots of easy to share evidence)
Arguing the object point is useful, and I love to see it done when possible.
Sometimes it is also useful to call out who is making the argument.
I see the argument that AI folks go from safety to capabilities made constantly (i.e., every discussion of OpenAI’s origin). It seems correct but neither novel nor controversial in EA/rat spaces. E.g., Habryka’s last point on: https://www.lesswrong.com/posts/MqgwHJ93pJpaeHXs6/posts-i-don-t-have-time-to-write
Maybe we are reading different folks though. Do you have specific examples of you making conflict-of-interest arguments and folks on the forum pushing back on you to instead argue the object-level-point?
This is a beautiful piece Abraham, thanks for writing it. I feel very similarly to you. I thought this EA Forum post from the FTX-era hit a few of these concerns well, as well as this comment from Benjamin Todd:
Basically, it feels harder to know who is genuine and who to trust vs who is involved for the various status-based and financial incentives. Whilst not new, I feel like I’ve seen an increasing number of organisations/individuals who are functionally cosplaying being interested in EA, to increase their chances of getting funding. This makes me sad—I would love not to have to question people’s motives like that, but it feels necessary sometimes.
Also, the demanding part of EA is something I really value too (in fact, I wrote a relatively controversial post on some issues with paying high salaries in EA orgs shortly before the FTX crash). On the frugality aspects of demandingness: I feel torn on how to navigate this. As I say in the post above, I worry about losing some ideological commitment (and related impact-focused decision-making) by paying generous salaries and attracting new people. But at the same time, I am very happy that we can pay more as a movement, if it means attracting great people. Similarly, even though people can often fairly justify spending significant chunks of money to increase their productivity, this kind of thinking still makes me uneasy sometimes (the most obvious example being a $2k coffee table).
Thanks James! I liked the old piece. I have no idea how to handle the pay questions: I think my default answer is something like “pay people reasonably well such that they can save for retirement, have families, etc” but that view just collapses when you’re competing with the market in many ways. And I think the AI space feels it especially hard — they have to compete directly with labs for talent.
But yeah, I don’t really know how to sit with all of this. I think maybe it’s just a set of feelings I don’t want to leave unsaid. But I also worry that the things that have pushed the community to find really interesting, unusual opportunities have come from the community being narrow, high-trust, and high-truth-seeking, which might change with the growth.
Do you mean just in AI safety/meta? Because the FTXFF was only funding longtermist projects, so I know there were still huge funding gaps in global poverty and animal welfare. And even in bio and nuclear security, there were still very large funding gaps.
That’s great that you are so successful. But there are many EAs/rationalists who are well-qualified who still have not gotten an EA job (probably thousands). I think part of the disconnect between the people saying there is lots of money and the people who are saying that they can’t get a job is that EAs are so well-qualified that they are used to only having to apply for a few jobs per offer. The number of applicants per job is something like 30-300, so it makes sense that the average person needs to apply to that many to get a job. But when EAs apply to 10 to 30 EA jobs and don’t get one, they tend to get frustrated. Also there are just not that many EA jobs that would be relevant to an individual EA’s qualifications.
Furthermore, there are enormous potential effective uses of money that don’t require EA labor, such as Give Directly, stockpiling PPE, etc.
So overall, I do think we should be careful to limit grift, but there is enormous room for effective use of funding in EA, one estimate was ~$1 trillion.
I definitely agree to some extent about FTX, though money did flow into some other spaces as well. But agree that I’m painting with too broad a brush there.
And definitely — I think I meant this post partially as a lamentation of what feels inevitable with funding. Our ability to make the world better might massively grow, but at the same time, it feels like something essential to EA’s past success (or at least, the culture and community that pushed it to do weird, fringe-y things that I think hold much of EA’s greatest promise) might be lost.
You present the parenthetical as a mitigating factor, but I expect that these enemies exist due to previous undemocratic power-seeking actions by the AI safety community.
(This isn’t based on any private information, I just think there must be some reason these enemies single out EAs in particular. Bad faith actors don’t just randomly pick targets to attack. My best guess at the reason is the intense focus on gaining influence and power.)
While I admire your instinct to extend grace to your enemies, I think you’re bending too far backwards and attributing to them too much good faith. As one point of evidence that they’ll say anything they think will help them win, consider their use of Lonsdale money to accuse Bores of being too cozy with Palantir.
… I explicitly called them bad faith actors in my comment; I don’t think I’m extending much grace to them.
In any case, whether or not they are acting in bad faith doesn’t have much bearing (for me anyway) on whether EA actions have been good.
Fair! My point is that while I don’t think they randomly pick targets to attack, I don’t think their target-selection rubric is at all calibrated to who is actually bad or good. I think they attack EA because they think that’ll help them win, and they would do that even if EA were not seeking power and influence.
Many people will compromise their morals for money. That’s life. I try not to hold it against them.
For what it’s worth, EA’s donor core of pledgers/EtGers probably aren’t going anywhere or compromising anything much, being a group of people who constantly could just decide to have more money and don’t. So maybe they’d be a nicer group for you to hang out with if this kind of moral compromise really bugs you?
Personally I’d quite like to hear about your new charity.
Worth mentioning because the policy is so new: your disclosure was interesting but isn’t required. Disclosure is only required when your post contains significant LLM-generated text, so you’re all good, and can cut it unless you included it by choice.
“LLM disclosure: I wrote this post myself, then asked an LLM to copy-edit it before posting. I manually made any edits I liked and copy-pasted no text from the LLM (my current practice for using LLMs in writing that I care about).”
Thanks! I removed it!
What is it? I’m curious!
I’d like to know too.
Abraham has mentioned the new charity briefly on the Hive Slack.
Found it! So cool! 🦋💃
Beautiful and moving post.
It reminded me of a very good post by Ben Kuhn about the risk of grifters—https://forum.effectivealtruism.org/posts/nqt3kasPieQEKihCp/the-biggest-risk-of-free-spending-ea-is-not-optics-or
https://forum.effectivealtruism.org/posts/48mypEepqBqWibKtJ/?commentId=WQfiqqSpsjYBGqtKC
I wrote a related parable in response to a similar-ish post a while ago if people like fiction.