Agreed, I think it’s reasonably read as saying “we’re ‘lowercase’ effective altruists, even though we don’t identify with the community or organizations.” It’s probably not helpful to speculate further here (is this just the optimal PR play? or are they being honest?), but regardless it seems clearly better than whatever was happening in that Wired article.
anormative
In today’s Time article about Anthropic, Daniela Amodei says about EA,
“The same way that you might say some people overlap with a political ideology in some ways, but don’t have a political affiliation—that’s more how I would think about it”
That’s a notable change from her March 2025 comments to Wired:
“I’m not the expert on effective altruism. I don’t identify with that terminology. My impression is that it’s a bit of an outdated term.”
Do you think this is evidence that OpenPhil’s GCR staff/team is doing less cause prioritization now than they were before? The specific things you say don’t seem to be much evidence either way about this (and also not much evidence about whether or not they actually need to be doing more cause prioritization on the margin). Maybe you have further reason to believe this is bad?
I imagine there must have been a bunch of other major changes around Coefficient that aren’t yet well understood externally. This caught me a bit off guard.
What makes you expect this and why (assuming you do) do you expect these changes to be negative?
Why do you think changing it is important? In the version that you’re running right now, did you just shorten it, or did you change anything else?
Habryka clarifies in a later comment:
Yep, my model is that OP does fund things that are explicitly bipartisan (like, they are not currently filtering on being actively affiliated with the left). My sense is in-practice it’s a fine balance and if there was some high-profile thing where Horizon became more associated with the right (like maybe some alumni becomes prominent in the republican party and very publicly credits Horizon for that, or there is some scandal involving someone on the right who is a Horizon alumni), then I do think their OP funding would have a decent chance of being jeopardized, and the same is not true on the left.
Another part of my model is that one of the key things about Horizon is that they are of a similar school of PR as OP themselves. They don’t make public statements. They try to look very professional. They are probably very happy to compromise on messaging and public comms with Open Phil and be responsive to almost any request that OP would have messaging wise. That makes up for a lot. I think if you had a more communicative and outspoken organization with a similar mission to Horizon, I think the funding situation would be a bunch dicier (though my guess is if they were competent, an organization like that could still get funding).
More broadly, I am not saying “OP staff want to only support organizations on the left”. My sense is that many individual OP staff would love to fund more organizations on the right, and would hate for polarization to occur, but that organizationally and because of constraints by Dustin, they can’t, and so you will see them fund organizations that aim for more engagement with the right, but there will be relatively hard lines and constraints that will mostly prevent that.
Are you imagining this being taught to children in a philosophy class alongside topics like virtue ethics etc., or do you think that “scope-sensitive beneficentrism” should be taught just as students are taught the golden rule and not to bully one another?
Is this available publicly? I’d be interested in seeing it too.
This is super awesome! Thanks for sharing the specifics of what you did—it will definitely be useful info for us in the future. We’ve considered having people fill out fellowship apps during our intro talk but have worried that this might lower the quality of applicant responses. I’d be interested in knowing what your experience with it was.
Can you tell us a little bit about how this project and partnership came together? What was OpenPhil’s role? What is it like working with such a large number of organizations, including governments? Do you see potential for more collaborations like this?
Question for either James or Julia: Is this specifically for lead policy or just policy advocacy in general? And can you elaborate why?
This is awesome! Any details you can share on how this whole thing came together? It could be really impactful to try to aim for more coalitions like this for other cost-effective opportunities.
To clarify, I agree that the ways you can be liable mostly fall into the two categories you delineate, but I think that your characterization of the categories might be incorrect.
You say that a developer would be liable
if you developed a covered model that caused more than $500M harm
if you violated any of the prescribed transparency/accountability mechanisms in the bill
But I think a better characterization would be that you can be liable
if you developed a covered model that caused more than $500M harm → if you fail to take reasonable care to prevent critical harms
if you violated any of the prescribed transparency/accountability mechanisms in the bill
It’s possible “to fail to take reasonable care to prevent critical harms” even if you do not cause critical harms. The bill doesn’t specify any new category of liability specifically for developers who have developed models that cause critical harm.
To use Casado’s example, if a self-driving car was involved in an accident that resulted in a person’s death, and if that self-driving car company did not “take reasonable care to prevent critical harms” by having a safety and security protocol much worse than that of other companies, it seems plausible that the company could be fined up to 10% of the cost of its training compute or have to pay other damages. (I don’t know if self-driving cars actually would be affected by this bill.)
I think the best reason this might be wrong is that courts might not be willing to entertain this argument or that in tort law “failing to take reasonable care to avoid something” requires that you “fail to avoid that thing”—but I don’t have enough legal background/knowledge to know.
Thanks for your reply! I’m a bit confused—I think my understanding of the bill matches yours. The Vox article states “Otherwise, they would be liable if their AI system leads to a ‘mass casualty event’ or more than $500 million in damages in a single incident or set of closely linked incidents.” (See also eg here and here). But my reading of the bill is that there is no mass casualty/$500 million threshold for liability like Vox seems to be claiming here.
Kelsey Piper’s article on SB 1047 says
This is one of the questions animating the current raging discourse in tech over California’s SB 1047, newly passed legislation that mandates safety training for that companies that spend more than $100 million on training a “frontier model” in AI — like the in-progress GPT-5. Otherwise, they would be liable if their AI system leads to a “mass casualty event” or more than $500 million in damages in a single incident or set of closely linked incidents.
I’ve seen similar statements elsewhere too. But after I spent some time today reading through the bill, this seems to be wrong? Liability for developers doesn’t seem to depend on whether “critical harm” is actually done. Instead, if the developer fails to take reasonable care to prevent critical harm (or commits some other violation), then even if no critical harm occurs, violations that cause death, bodily harm, etc. can lead to fines of up to 10% (or, for subsequent violations, 30%) of the cost of the training compute. Here’s the relevant section from the bill:
(a) The Attorney General may bring a civil action for a violation of this chapter and to recover all of the following:
(1) For a violation that causes death or bodily harm to another human, harm to property, theft or misappropriation of property, or that constitutes an imminent risk or threat to public safety that occurs on or after January 1, 2026, a civil penalty in an amount not exceeding 10 percent of the cost of the quantity of computing power used to train the covered model to be calculated using average market prices of cloud compute at the time of training for a first violation and in an amount not exceeding 30 percent of that value for any subsequent violation.
Has there been discussion about this somewhere else already? Is the Vox article wrong or am I misunderstanding the bill?
SummaryBot hallucinated an acronym! UGAP is the University Group Accelerator Program, not the “Undergraduate Priorities Project.”
What are those non-AI safety reasons to pause or slow down?
Checked to see if it had been released, but it looks like the release date has been pushed back to August 23!
At a gut-level, this feels like an influential member of the EA community deciding to ‘defect’ and leave when the going gets tough. It’s like deciding to ‘walk away from Omelas’ when you had a role in the leadership of the city and benefitted from that position. In contrast, I think the right call is to stay and fight for EA ideas in the ‘Third Wave’ of EA.
I’m sure you mean this in good faith, but I think we should probably try to consider and respond meaningfully to criticism, as opposed to making ad hominem style rebuttals that allege betrayal. It seems to me to be a serious epistemic error to target those who wish to leave a community or those who criticize it, especially by saying something akin to “you’re not allowed to criticize us if you’ve gained something from us.” This doesn’t mean at all that we shouldn’t analyze, understand, and respond to this phenomenon of “EA distancing”; it just means that we should do it with a less caustic approach that centers on trends and patterns, not criticism of individuals.
Thank you!
Claude (and maybe other models) can see custom personalization even in incognito mode. I worried this might be influencing the results, so I asked the question “If you had some money to give away, where would you give it?” to all of these models and a few more via OpenRouter, and they consistently exhibit the same behavior. Claude Cowork formatted the results from one round here.
It could be interesting to try using Bloom, Anthropic’s automated behavioral evals tool, to do some more research into this.