I’m very sorry to hear about your dad. I hope those who would have voted for PauseAI in the donation election will consider donating to you directly.
On the points you raise, one thing stands out to me: you mention how hard it is to convince EAs that your arguments are right. But the way you’ve written this post (generalising about all EAs, making broad claims about their career goals, saying you’re already beating them in arguments) suggests to me you’re not very open to being convinced by them either. I find this sad, because I think that PauseAI is sitting in an important space (grassroots AI activism), and I’d hope the EA community & the PauseAI community could productively exchange ideas.
In cases where there is an established science or academic field or mainstream expert community, the default stance of people in EA should be nearly complete deference to expert opinion, with deference moderately decreasing only when people become properly educated (i.e., via formal education or a process approximating formal education) or credentialed in a subject.
If you took this seriously, in 2011 you’d have had no basis to trust GiveWell (quite new to charity evaluation, not strongly connected to the field, no credentials) over Charity Navigator (10 years of existence, considered mainstream experts, CEO with 30 years of experience in charity sector).
But you could have just looked at their websites (GiveWell, Charity Navigator) and tried to figure out for yourself whether one of these organisations is better at evaluating charities.
I am extremely skeptical of any claim that an individual or a group is competent at assessing research in any and all extant fields of study, since this would seem to imply that individual or group possesses preternatural abilities that just aren’t realistic given what we know about human limitations.
This feels like a motte-and-bailey: the motte is “skeptical of any claim that an individual or a group is competent at assessing research in any and all extant fields of study”, while the bailey is near-complete deference, decreasing only with formal education or credentials. GiveWell obviously never claimed to be experts in much beyond global health & wellbeing (GHW) charity evaluation.
> early critiques of GiveWell were basically “Who are you, with no background in global development or in traditional philanthropy, to think you can provide good charity evaluations?”
That seems like a perfectly reasonable, fair challenge to put to GiveWell. That’s the right question for people to ask!
I agree with this if you read the challenge literally, but the challenges people actually made were usually closer to a reflexive dismissal that didn’t engage with GiveWell’s work.
Also, I disagree that the only way we were able to build trust in GiveWell was through this:
only when people become properly educated (i.e., via formal education or a process approximating formal education) or credentialed in a subject.
We can often just look at object-level work, study research & responses to the research, and make up our mind. Credentials are often useful to navigate this, but not always necessary.
Dustin Moskovitz’s net worth is $12 billion and he and Cari Tuna have pledged to give at least 50% of it away, so that’s at least $6 billion.
I think this pledge is over their lifetime, not over the next 2-6 years. Open Philanthropy / Coefficient Giving (OP/CG) seems to be spending in the realm of $1 billion per year (e.g. this, this), which would mean $2-6 billion over Austin’s time frame.
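To spell out that back-of-the-envelope arithmetic (the spending rate and time frame here are rough assumptions, not official figures):

```python
# Rough arithmetic: annual grantmaking rate x Austin's 2-6 year time frame.
annual_spend_billion = 1.0        # assumed ~$1B/year for OP/CG
years_low, years_high = 2, 6      # the time frame in question

total_low = annual_spend_billion * years_low    # ~$2B
total_high = annual_spend_billion * years_high  # ~$6B
print(f"Implied giving over the period: ${total_low:.0f}B-${total_high:.0f}B")
```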
lots of money will also be given to meta-EA, EA infrastructure, EA community building, EA funds, that sort of thing?
You’re probably doubting this because you don’t think it’s a good way to spend money. But that doesn’t mean that the Anthropic employees agree with you.
The not-so-serious answer would be: US universities are well funded in part because rich alumni like to fund them. There might be similar reasons why Anthropic employees would want to fund EA infrastructure/community building.
If there is an influx of money into ‘that sort of thing’ in 2026/2027, I’d expect it to look different to the 2018-2022 spending in these areas (e.g. less focused on general longtermism, more focused on AI, maybe more decentralised, etc.).
Given Karnofsky’s career history, he doesn’t seem like the kind of guy to want to just outsource his family’s philanthropy to EA funds or something like that.
He was leading the Open Philanthropy arm that was primarily responsible for funding many of the things you list here:
or do you think lots of money will also be given to meta-EA, EA infrastructure, EA community building, EA funds, that sort of thing
I’m somewhat surprised about the lack of information about Anthropic employees’ donation plans.
Potential reasons:
They are all working full-time (probably more) and it’s really hard to get clarity on your own donation plans in such a situation. And communicating about them is even harder.
They might have specific plans but talking about them publicly is tricky. It might imply information about Anthropic’s plans (e.g. regarding an IPO) or about internal sentiment about the prospect of Anthropic gaining/losing value in the future. Or just plain old ‘what happens to your inbox once you imply that you’re going to be donating >$10M soon?’.
They might not see a lot of benefit in communicating publicly about this. Maybe they are chatting with Coefficient Giving about their plans. Maybe they are planning their own foundation.
There might just not be that many people with significant wealth at Anthropic who are planning on donating effectively anytime soon. This could be because of value drift, because they expect their assets to increase in value and want to donate later, or because they don’t see great donation opportunities yet.
Interested to hear whether I’ve missed a major consideration and whether people have takes about which of these reasons is most likely/explanatory.
The Stop AI response posted here seems maybe fine in isolation. This might have largely happened due to the Stop AI co-founder having a mental breakdown. But I would hope for Stop AI to deeply consider their role in this as well. The response of Remmelt Ellen (who is a frequent EA Forum contributor and advisor to Stop AI) doesn’t make me hopeful, especially the bolded parts:
An early activist at Stop AI had a mental health crisis and went missing. He hit the leader and said stuff he’d never condone anyone in the group to say, and apologized for it after. Two takeaways: - Act with care. Find Sam. - Stop the ‘AGI may kill us by 2027’ shit please.
[...]
I advised Stop AI organisers to change up the statement before they put it out. But they didn’t. How to see this: it is a mental health crisis. Treat the person going through it with care, so they don’t go over the edge (meaning: don’t commit suicide). 2/
The organisers checked in with Sam everyday. They did everything they could. Then he went missing. From what I know about Sam, he must have felt guilt-stricken about lashing out as he did. He left both his laptop and phone behind and the door unlocked. I hope he’s alive. 3/
Sam panicked often in the months before. A few co-organisers had a stern chat with him, and after that people agreed he needed to move out of his early role of influence. Sam himself was adamant about being democratic at Stop AI, where people could be voted in or out. 4/
You may wonder whether that panic came from hooking onto some ungrounded thinking from Yudkowsky. Put roughly: that an ML model in the next few years could reach a threshold where it internally recursively improves itself and then plan to take over the world in one go. 5/
That’s a valid concern, because Sam really was worried about his sister dying out from AI in the next 1-3 years. We should be deeply concerned about corporate-AI scaling putting the sixth mass extinction into overdrive. But not in the way Yudkowsky speculates about it. 6/
Stop AI also had a “fuck-transhumanism” channel at some point. We really don’t like the grand utopian ideologies of people who think they can take over society with ‘aligned’ technology. I’ve been clear on my stance on Yudkowsky, and so have others. 7/
Transhumanist takeover ideology is convenient for wannabe system dictators like Elon Musk and Sam Altman. The way to look at this: They want to make people expendable. 8/
Thanks a lot for engaging!
One general point: My rough guess is that acceptance rates have stayed largely constant across AI safety programs over the last ~2 years because capacity has scaled with interest. For example, Pivotal grew from 15 spots in 2024 to 38 in 2025. While the ‘tail’ likely became more exceptional, my sense is that the bar for the marginal admitted fellow has stayed roughly the same.
They might (as I am) be making as many applications as they have energy for, such that the relevant counterfactual is another application, rather than free time.
The model does assume that most applicants aren’t spending 100% of their time/energy on applications. However, even if they were, I feel like a lot of this is captured by how much they value their time. I think that the counterfactual of how they spend their time during the fellowship period (which is >100x more hours than the application process) is the much more important variable to get right.
you also need to consider the intangible value of the counterfactual
This is correct. I assumed most people would take this into account (e.g. subtract their current job’s networking value from the fellowship’s value), but I might add a note to make this explicit.
you also ought to consider the information value of applying for whatever else you might have spent the time on
I’m less worried about this one. Since we set the fixed Value of Information quite conservatively already, and most people aren’t constantly working on applications, I suspect this is usually small enough to be noise in the final calculation.
there is a psychological cost to firing out many low-chance applications
I agree this is real, but I think it’s covered in the Value of Your Time. If you earn £50/hr but find applying on the weekend fun/interesting, you might set the Value of Your Time at £5/hr. If you are unemployed but find applying extremely aversive, you might price your time at, e.g., £200/hr. (A rough sketch of how these pieces fit together is below.)
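To make the preceding points a bit more concrete, here is a minimal sketch of the kind of EV calculation I have in mind. The function name, parameters, and numbers are illustrative assumptions, not the calculator’s actual defaults.

```python
# Minimal sketch of the application-EV logic discussed above.
# All names and numbers are illustrative assumptions, not the calculator's defaults.

def application_ev(
    p_accept: float,                    # e.g. 0.035 for a 3.5% acceptance rate
    fellowship_value: float,            # value of the fellowship, net of the
                                        # counterfactual use of those months
    hours_to_apply: float,              # time spent on the application
    value_of_time: float,               # your own price on an application hour
    value_of_information: float = 0.0,  # small fixed bonus for what you learn by applying
) -> float:
    """Expected value of submitting one application."""
    upside = p_accept * fellowship_value
    cost = hours_to_apply * value_of_time
    return upside + value_of_information - cost

# Toy example: 3.5% acceptance, fellowship worth £20k over the counterfactual,
# 10 hours to apply, time priced at £50/hr, a small information bonus.
print(application_ev(0.035, 20_000, 10, 50, value_of_information=100))
# 0.035 * 20,000 + 100 - 500 = 300, i.e. positive EV in this toy case
```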
the opportunity to make more direct contact with the reality of the dynamics presently shaping frontier AI development – dynamics about which I’ve been writing from a greater distance for many years.
You doing this well could be very valuable for the AI safety field imo. It’s hard to form accurate beliefs about these dynamics from the outside, and I see many people unsure how much to trust Anthropic. Helping clarify this could help others to make more confident and informed decisions in situations where their view of Anthropic matters.
because EAs are the primary culprits in EA’s recent reputational dip
I agree; EA was just unusually fertile ground for a self-inflicted reputational dip. But I don’t think “jumping ship” is very explanatory (outside maybe AI policy circles). EAs were self-critical before EAs did bad things, and many people (incl. me, guilty) have always felt uncomfortable identifying as EA. Many prominent figures also never seemed very committed to a single, persistent EA community. See for example this short exchange between Owen Cotton-Barratt and Will MacAskill from 2017 (~4:30-5:30):
Owen Cotton-Barratt: When science was still relatively small, everyone could be in touch with everybody else. But now science works as a global discipline where lots of people subscribe to a scientific mindset. But there isn’t a science community, there are lots of science communities. And I think in the long term we need something like this with effective altruism.
Will MacAskill: This sounds pretty plausible in the long run. The question is at what stage we are, analogously to scientific development.
Owen Cotton-Barratt: In the spirit of being bold, I think this is something we should be paying attention to within a decade.
Will MacAskill: Ok, that seems reasonable.
When I first encountered EA, the ethos was very much focused around earning to give and where to donate. There was a sense we were fans/supporters of these orgs rather than competing for jobs at them and that all of us were on equal footing no matter how much we earned, gave, or followed the news.
I’m curious what fraction of early earn-to-givers now donate to organisations their peers founded vs. still giving to ‘old’ charities (AMF, The Humane League). My loose impression is that it’s pretty low, which could be because (a) they don’t see EA startups reaching their impact bar, (b) those startups aren’t (perceived as) funding constrained, or (c) factors you describe here.
I’d also guess that eating more protein improves public health in countries where high body weight causes health problems, since protein makes it easier to eat fewer calories.
But the largest increases in animal protein consumption are likely coming from countries that aren’t (yet) facing issues with obesity?
The nonprofit will be compensated tens of billions by the for-profit entity for the removal of the caps.
False — The nonprofit is getting $130 billion, more than I expected, but only because OpenAI’s valuation skyrocketed.
Why is this false? The valuation in Oct. 2024 was $157B, which means it has grown ~3.1x since. So wouldn’t the compensation of $130B / 3.1 ≈ $42B still be “tens of billions” in May 2025 terms?
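Spelled out, with the ~3.1x growth factor being my own rough estimate from the figures above rather than an official number:

```python
# Rough conversion of the $130B figure back into earlier-valuation terms.
compensation_billion = 130      # nonprofit's compensation at today's valuation
growth_since_oct_2024 = 3.1     # rough estimate of valuation growth since the $157B round

compensation_in_earlier_terms = compensation_billion / growth_since_oct_2024
print(f"~${compensation_in_earlier_terms:.0f}B")  # ~$42B, still "tens of billions"
```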
My understanding of what happened is different:
Not that much of the FTX FF money was ever awarded (~$150-200 million, details).
A lot of the FTX Future Fund money could have been clawed back (I’m not sure how often this actually happened) – especially if it was unspent.
It was sometimes voluntarily returned by EA organisations (e.g. BERI) or paid back as part of a settlement (e.g. Effective Ventures).
@Daniel_Dewey, can you prove this song wrong?
Should I Apply to a 3.5% Acceptance-Rate Fellowship? A Simple EV Calculator
Expecting “cogito ergo multiply” merch now...
9+ weeks of mentored AI safety research in London – Pivotal Research Fellowship