Hi everyone,
Recently, I decided to read one of ACE’s charity evaluations in detail, and I was extremely disappointed with what I read. I felt that ACE’s charity evaluation was long and wordy, but said very little.
Upon further investigation, I realized that ACE’s methodology for evaluating charities often rates charities more cost-effective for spending more money to achieve the exact same results. This rewards charities for being inefficient, and punishes them for being efficient.
ACE’s poor evaluation process leads to ineffective charities receiving recommendations, and many animals are suffering as a result. After realizing this, I decided to start a new charity evaluator for animal charities called Vetted Causes. We wrote our first charity evaluation assessing ACE, and you can read it by clicking the attached link.
Best,
Isaac
Thank you for spending time analyzing our methods. We appreciate those who are willing to engage with our work and help us improve the accuracy of our recommendations and reduce animal suffering as much as possible.
Based on previously received feedback and internal reflection, we have significantly updated our evaluation methods in the past year and will be publishing the details next Tuesday when we release our charity recommendations for 2024. From what we can tell from a quick skim, we think that our changes largely address Vetted Causes’ concerns here, as well as the detailed feedback we received last year from Giving What We Can (see also our response at the time) as part of their program that evaluates evaluators. Our cost-effectiveness analyses no longer use achievement or intervention scores, but rather directly calculate cost-effectiveness by dividing impact by cost, as you suggest. That being said, our work will never be perfect so we invite anyone reading this with the expertise to improve the rigor of our work to reach out, now or in the future.
Although your comments are related to methods that we no longer use, we’d like to spend more time understanding and engaging with them, learning from them, and potentially correcting any misconceptions. Unfortunately, we won’t have the opportunity to do so until after our charity recommendations are released next week. Additionally, it might be a comfort to know that for the past few months, Giving What We Can has been assessing ACE’s new evaluation methods along with a panel of other experts and that they intend to publish the results later this month.
Thank you.
- The ACE team
Hi,
Thank you for your response!
We are glad to hear that ACE has changed their evaluation methods, and we hope that the changes effectively address the concerns listed in our review.
We look forward to seeing ACE’s new charity recommendations when they are released next week.
Did you ask ACE to review this before publishing? It seems like the kind of thing that would be worth getting feedback on before publishing. I didn’t look at this for more than a couple minutes, but I saw immediately that there might be some conceptual disagreements between you and ACE. For example, in your first example you assume (I believe) that if LIC didn’t spend $200k on the lawsuit against Costco, they wouldn’t have spent it on anything else. It’s unclear to me that this is the counterfactual, or how ACE is conceptualizing those funds. There might be reasoning behind their decision-making that would be useful to your critiques, which they could share.
This also felt pretty politically motivated to me. Not sure if that is your intention, but paragraphs like this:
Without any evidence feels pretty intense. ACE is kind of low hanging fruit to pick on in the EA space, so this read to me like more of that, without necessarily the evidence base to back it. Reading your report, I felt kind of like “oh, there are interesting assumptions here, would be interested to learn more”, and not “ACE is doing an extremely bad job.”
E.g. I think the questions that would be good to ask in a critique of ACE might be:
If ACE didn’t exist, how would the funds they direct be spent otherwise? Would that be better or worse for animals?
Is historical track record / cost-effectiveness the only lens on which to evaluate charities?
If the answer is yes, seems very hard to start new things!
I don’t know if the LIC legal case is this, but celebrating the potential impact of promising bets that didn’t pan out seems good to me.
I also think getting feedback on statements like this would be really helpful:
I think ACE has wanted to do this at points in their history — my impression is just that it is incredibly difficult, so they’ve approached it from other angles instead. I also don’t think it’s clear to me that ACE’s goal is to report cost-effectiveness. I think clarifying this with them, and getting a sense of why they don’t do what you see as the simple approach would be useful for making this critique stronger. And, I don’t think people should make giving decisions based only on historic cost-effectiveness—just because an opportunity was impactful doesn’t mean the organization needs more funds to do that work, that it will scale, work in the future, etc.
I don’t disagree that ACE might be directing funds to ineffective charities! I don’t really think non-OpenPhil EA donors should give to farmed animal welfare, for example. But, I don’t think it is obvious to me that ACE going away means money going to more effective charities—I expect it would mostly be worse—people giving to animal charities with basically no vetting.
That being said, critique of critical organizations is great in my opinion, so appreciate you putting this out there!
“I don’t really think non-OpenPhil EA donors should give to farmed animal welfare, for example.” Wow, this is interesting! I would love to know what you mean by this?
(I responded privately to this but wrote up some related reflections a while ago here).
Having read your reflections, I’m still curious as to why you don’t think non-OpenPhil donors should give to farmed animal welfare, if you feel comfortable sharing it publicly. I guessed four options, ordered from most to least likely, but I might have misunderstood the post.
We should donate to wild animal welfare instead, as it’s more cost-effective
There are no donation opportunities that counterfactually help a significant amount of farmed animals
There is no strong moral obligation to improve future lives, and donations to farmed animal welfare necessarily improve future lives, as farmed animal lives are very short
Tomasik-style arguments on the impact of animal farming on the amount of wild animal suffering
Is it a combination of these? As a concrete example, I’m curious if you believe that the Shrimp Welfare Project shouldn’t be funded, should be funded by “non-EA” donors, or will be funded anyway and donors shouldn’t worry about it.
By the way, thank you for nudging towards sharing evaluations with the evaluated organization before posting, I think it’s a really valuable norm.
Thanks! My wording in the above message was imprecise, but I mean something like farmed vertebrates. SWP is probably among the two most important things to fund, in my opinion.
Basically I think the size of good opportunities in farmed animal advocacy is smaller than OpenPhil’s grantmaking budget and there are few scalable interventions, though I don’t think I want to go into most of the reasons publicly. Given that they’ve stopped funding many of what I believe are more cost-effective projects, and that EA donors are basically the only people willing to fund those, EA donors should be mostly inclined to fund things OpenPhil can’t fund instead.
So some combination of 1+2 (for farmed vertebrates) + other factors
What claims did we make that we did not provide evidence for?
I understand these are forthcoming, but no evidence is provided for this entire part—part of the reason I pushed on this is I think seeing your alternative evaluations would be very helpful for interpreting the strength of the critique of ACE. Without seeing them, I can’t evaluate the latter half of the quoted text. And in my eyes, if these are similar to the evaluation here of LIC, it’s pretty far from demonstrating that ineffective charities are receiving recommendations, etc. And, given that you’ve only evaluated <50% of their charities so far, it seems premature to make the overall claim. I think the overall claim is very possibly true, but again, I think to make the argument that animals are directly suffering as a result of this, you’d have to demonstrate that those charities are worse than other donation options, that donors would give to the better options, etc.
Note: this reply addresses everything Abraham claims we did not provide evidence for.
“ACE’s poor evaluation process leads to ineffective charities receiving recommendations”
Our review covered how under ACE’s evaluation process:
Charities can receive a worse Cost-Effectiveness Score by spending less money to achieve the exact same results.
Charities can have 1,000,000 times the impact at the exact same price, and their Cost-Effectiveness Score can remain the same.
The most important factor in determining the impact of an intervention is decided before the intervention even begins.
This is clear evidence that ACE uses a poor evaluation process. Is the fact that ACE’s evaluation process rewards inefficiency, and punishes efficiency, “no evidence” for ACE recommending ineffective charities?
If you’d like me to get even more specific, let’s look at Problem 1 of our review:
We go on to detail how if LIC had spent less than $2,000 on the lawsuit (saving over $200,000) and achieved the exact same outcome, ACE would have assigned LIC a Cost-Effectiveness Score of 1.8. The lowest Cost-Effectiveness Score ACE assigned to any charity in 2023 was 3.3. This means if LIC had spent less than $2,000 on the lawsuit, LIC’s Cost-Effectiveness Score would have been significantly worse than any charity ACE evaluated in 2023.
Instead, LIC spent over $200,000 on the lawsuit, and ACE rewarded them for this inefficiency by giving them a Cost-Effectiveness Score of 3.7 and deeming LIC a top 11 animal charity.
As we noted in our review, these Cost-Effectiveness Scores are defined by ACE as “how cost effective we think the charity has been”. LIC achieved no favorable legal outcomes despite receiving over a million dollars in funding. As we also noted in our review, every lawsuit LIC filed was dismissed for failing to state a valid legal claim.
If I provided evidence that a Law Firm Rating Organization rewards law firms for losing lawsuits and wasting money, and punishes law firms for winning lawsuits and saving money, would this be no evidence that the Law Firm Rating Organization is recommending ineffective law firms?
Our review details how ACE’s recommendations direct the flow of millions of dollars. Are you asking for evidence that directing millions of dollars toward ineffective animal charities, rather than effective ones, leads to animal suffering?
Imagine a film critic watches 5 of the 11 films that received a ‘Best Films’ award and writes, “Of the five films I’ve seen, only one appears to deserve the award. I plan to release my reviews of the films shortly.” Does this statement by the film critic require evidence?
(Responding because this is inaccurate): My claim in the comment above was that you haven’t provided any evidence that:
5 / 11 (or more) ACE top charities are not effective
That animals are suffering as a result of ACE recommendations
Which remains the case — I look forward to you producing it.
I don’t know what you’re saying is inaccurate. My reply addressed every single word from the section you claimed I didn’t provide evidence for.
We never made this claim.
I’ll ask again. Our review details how ACE is rewarding charities for inefficiency (and punishing them for efficiency), and how LIC was rewarded for their inefficiency with the designation “Top 11 Animal Charities to Donate to in 2024.” Our review also details how ACE’s recommendations direct the flow of millions of dollars. Are you asking for evidence that directing millions of dollars toward ineffective animal charities, rather than effective ones, leads to animal suffering?
This is starting to feel pretty bad faith, so I’m actually going to stop engaging.
You straw manned us, and now you claim that “This is starting to feel pretty bad faith”.
Here is the quote of what we said:
Here is the quote of what you said we claimed:
Notice that we said that only one of the 5 appears to be effective (meaning 4 did not appear to be effective), and you changed this claim to 5 are not effective.
Is the claim “4 did not appear to be effective” the same as “5 are not effective”?
Hi Abraham,
Thank you for reading some of the article. I hope that you find some time to read the rest.
No, I did not ask ACE. I hope that this article inspires a public discussion.
What do you mean by conceptualizing funds? In this hypothetical, they simply spend $200k less on the lawsuit. LIC did not spend their entire budget, and charities oftentimes do not. Under ACE’s methodology, LIC’s cost-effectiveness would worsen if they spent $200k less and achieved the exact same total outcomes. The calculations we’ve done are 100% objective, and if you can find an error that we made, please let us know. You can find those calculations here:
Original: https://docs.google.com/spreadsheets/d/1BzLUSefsd5K2uhGw81UPAe_v0E_JJR8Kdgh8zsBtxsU/edit?gid=0#gid=0
Hypothetical: https://docs.google.com/spreadsheets/d/1IZbSfk4eNukmUU7ntRT0wCgvrLG6_ZcyXt3uANmFYbE/edit?gid=1427660869#gid=1427660869
What assumptions are you referring to?
If ACE didn’t exist, I would hope more funds would go to effective charities instead of ineffective ones. I hope to take part in this positive change.
LIC has a historical track record, and it is a bad one. People should have the opportunity to start something new. However, they shouldn’t be rated a top 11 animal charity after receiving over a million dollars in funding and failing to achieve any positive legal outcomes.
I saw that you call into question the integrity of the article. I want to be clear in saying that we have no relationship with any charity. However, I noticed that you co-founded the Wild Animal Initiative, which is a charity endorsed by ACE. Still, I don’t question your feedback on the article. I hope that going forward you will evaluate our reviews for what they are, rather than suggest something “political” is going on.
Thank you, and I appreciate your feedback!
To be clear, I read the whole thing. I meant that the fact that a pretty important issue jumped out at me within a few minutes of starting to read struck me as a reason that getting feedback from ACE seems really important.
I really think you should! I also really think you should ask for feedback from other people who have done charity evaluations, and the charities you evaluate. You should definitely still publish them, but they’ll be better critiques for having engaged with the best case for the thing you’re critiquing!
Yep, this seems right, but it’s also the case that if they did something else with that funding, the effectiveness of that action would be rated much more highly, which also seems correct. I think the issues you point to are interesting, but they strike me as intentional decisions, which ACE may have internal views on, and for which I think getting their feedback might be really important. You are correct about a mathematical fact, but both you and ACE seem to have different goals (calculating historic cost-effectiveness vs marginal impact of future dollars), and there are assumptions underlying your analysis that if changed, might change the output.
I meant ACE’s assumptions—I thought your post raised some really good questions. They are issues that if I saw, I’d email to ACE and ask why they made the choices they made, then choose whether or not to publicly publish them based on their response. Maybe these choices are reasonable, and maybe they aren’t—you raised some really good points I think. But it just seems hard to evaluate in a vacuum.
Again, I don’t really see good evidence for this. What is the typical track record for legal campaigns? How much do they cost? How long do they take to work? These all would be important questions to answer before claiming cost-effectiveness or lack thereof. In this case, I could easily be persuaded to agree with you, but not for any of the reasons in your analysis: the fact that they spent some money on some lawsuits and it didn’t work isn’t the only evidence I’d want when thinking about whether or not donations to them will be useful.
Our hypothetical in Problem 1 of the review is about two scenarios:
LIC spending $204,428 on the Costco lawsuit and achieving outcome X. (this is what actually happened)
LIC spending $1,566 on the Costco lawsuit and achieving outcome X. (this is what happened in the hypothetical)
Note that both scenarios achieve the exact same result, which is outcome X. ACE would rate LIC less cost-effective for spending $1,566 (saving $202,862) to achieve outcome X.
The hypothetical is not about something else they could do with the funding. The hypothetical is about assessing what happens to a charity’s Cost-Effectiveness Score if they save money to achieve the same outcome.
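The arithmetic behind this comparison can be made concrete. Under a plain impact-divided-by-cost definition of cost-effectiveness (the kind of direct calculation discussed elsewhere in this thread), spending less to achieve the same outcome can only improve the score, whereas the review reports ACE scoring the cheaper scenario worse (1.8 vs 3.7). A minimal sketch, where the value assigned to outcome X is an arbitrary placeholder since only the ratio between the two scenarios matters:

```python
# Naive cost-effectiveness sketch: impact per dollar spent.
# OUTCOME_X is a placeholder value for the (identical) legal outcome;
# the dollar figures are the ones cited in the review.

OUTCOME_X = 1.0            # hypothetical fixed value of outcome X
actual_cost = 204_428      # what LIC actually spent on the Costco lawsuit
hypothetical_cost = 1_566  # the cheaper spend in the review's hypothetical

def cost_effectiveness(impact: float, cost: float) -> float:
    """Impact achieved per dollar spent."""
    return impact / cost

actual = cost_effectiveness(OUTCOME_X, actual_cost)
hypothetical = cost_effectiveness(OUTCOME_X, hypothetical_cost)

# Under impact/cost, the cheaper scenario is strictly more cost-effective,
# by roughly the ratio of the two costs (about 131x here).
print(hypothetical > actual)  # True
```

Whatever placeholder value is chosen for outcome X, the cheaper scenario comes out ahead under this definition; a scoring method that ranks it lower is doing something other than dividing impact by cost.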
How do you know that if they did something else with that funding the effectiveness of that action would be rated much more highly? According to ACE, the Costco lawsuit was a particularly cost effective intervention in spite of the fact that the lawsuit was dismissed for failing to state a valid legal claim:
“We think that out of all of Legal Impact for Chickens’ achievements, the Costco shareholder derivative case is particularly cost effective because it scored high on achievement quality.”
I’m also not sure how it would even be possible to evaluate a hypothetical in which LIC does something else with the funding. Could you explain how this hypothetical would work, and how it would be evaluated? All of the examples in our review are 100% objective and based on ACE’s own methodology. There is no subjectivity in our examples, and this was done intentionally.
I’m not sure how ACE’s goals could align with the principles of effective altruism if they intentionally created a methodology that contains the problem described above.
Imagine there is a law firm that has received over a million dollars in funding, existed for multiple years, and failed to secure any favorable legal outcomes. Also imagine that their most cost-effective lawsuit was one in which they spent over $200,000, and the lawsuit was dismissed for failing to state a valid legal claim.
If this law firm were rated one of the top 100 law firms in the world, what would you think of the organization that assigned this rating? Would you say there is not good evidence for this being an incorrect rating?
ACE rated LIC as one of the Top 11 Animal Charities to Donate to in 2024. Prior to being reviewed, LIC received over a million dollars in funding, existed for multiple years, and failed to secure any favorable legal outcomes. According to ACE, LIC’s most cost-effective intervention was one in which they spent over $200,000, and the lawsuit was dismissed for failing to state a valid legal claim.
Are these factors poor evidence for LIC not being one of the 11 best animal charities to donate to?
I don’t really have a strong view about LIC—as I’ve mentioned elsewhere in the comments, I’m skeptical in general that “very EA” donors should give to farmed vertebrate welfare issues in the near future. But I don’t find this level of evidence particularly compelling on its own. I think I feel confused about the example you’re giving because it isn’t about hypothetical cost-effectiveness, it’s about historic cost-effectiveness, where what matters are the counterfactuals.
I broadly think the critique is interesting, and again, seems like probably an issue with the methodology, but on its own doesn’t seem like reason to think that ACE isn’t identifying good donation opportunities, because things besides cost-effectiveness also matter here.
You don’t find these facts particularly compelling evidence that LIC is not historically cost-effective?
LIC’s most cost-effective intervention was one in which they spent over $200,000, and the lawsuit was dismissed for failing to state a valid legal claim.
LIC received over a million dollars in funding prior to being reviewed
LIC existed for multiple years prior to being reviewed
LIC failed to secure any favorable legal outcomes, or file any lawsuit that stated a valid legal claim.
What would be compelling evidence for LIC not being historically cost-effective?
ACE does 2 separate analyses for past cost-effectiveness, and room for future funding. For example, those two sections in ACE’s review of LIC are:
Cost Effectiveness: How much has Legal Impact for Chickens achieved through their programs?
Room For More Funding: How much additional money can Legal Impact for Chickens effectively use in the next two years?
Our review focuses on ACE’s Cost-Effectiveness analysis, not on their Room For More Funding analysis. In the future, we may evaluate ACE’s Room For More Funding Analysis, but that is not what our review focused on.
However, I would like to pose a question to you: Given that ACE often gives charities a worse historic cost-effectiveness rating for spending less money to achieve the exact same outcomes (see Problem 1), how confident do you feel in ACE’s ability to analyze future cost-effectiveness (which is inherently more difficult to analyze)?
I don’t find that evidence particularly compelling on its own, no. Lots of projects cost more than 1M or take more than a few years to have success. I don’t see why those things would be cause to dismiss a project out of hand. I don’t really buy social movement theories of change for animal advocacy, but many people do, and it just seems like many social movement-y things take a long time to build momentum, and legal and research-focused projects take forever to play out. Things I’d want to look at to form a view on this (though to be clear, I plausibly agree with you!):
How much lawsuits of this type typically cost
What the base rate for success is for this kind of work
How long this kind of work typically takes to get traction
Has anyone else tried similar work on misleading labelling or whatever? Was it effective or not?
Has LIC’s work inspired other lawsuits, as ACE reported might be a positive side effect?
I don’t think we disagree that much here, except how much these things matter — I don’t really care about ACE’s ability to analyze cost-effectiveness outside broad strokes because I think the primary benefit of organizations like ACE is shifting money to more cost-effective things within the animal space, which I do believe ACE does. I also don’t mind ACE endorsing speculative bets that don’t pay off — I think there are many things that were worth paying for in expectation that don’t end up helping any animals, and will continue to be, because we don’t really know very many effective ways to help animals, so the information value of trying new things is high.
But to answer your question specifically, I’d be very skeptical of anyone’s numbers on future cost-effectiveness, ACE’s or yours or my own, because this is an issue that has historically been extremely difficult to estimate cost-effectiveness for. I’m not convinced that’s the right way to approach identifying effective animal interventions, in part because it is so hard to do well. I don’t really think ACE is making cost-effectiveness estimates here though—it seems much more like trying to get a rough sense of relative cost-effectiveness, which, putting aside the methodological issues you’ve raised, seems like the right approach to me, but only a small part of the information I’d want in order to know where money should move in animal advocacy.
How much lawsuits of this type typically cost
What the base rate for success is for this kind of work
How long this kind of work typically takes to get traction
The Nonhuman Rights Project provides a possible point of comparison. From 2013 to 2023 they raised $13.2 million. As far as I know, they have never won a case.
The question I asked was: “You don’t find these facts particularly compelling evidence that LIC is not historically cost-effective?”
The question was not about whether these facts are compelling evidence that LIC won’t be successful in the future, or if the project should be dismissed.
Wait, those are related to each other though—if we haven’t seen the full impact of their previous actions, we haven’t yet seen their historical cost-effectiveness in full! Also, you cite these as reasons the project should be dismissed in your post—you have a section literally called “Legal Impact for Chickens Did Not Achieve Any Favorable Legal Outcomes, Yet ACE Rated Them a Top Charity” which reads to me that you believe that it is bad they were rated a Top Charity, and make these same arguments (and no others) in the section, suggesting that you think this evidence means they should be dismissed.
No, they are not. Historical cost-effectiveness refers to past actions and outcomes—what has already occurred.
All of LIC’s legal actions have already been either dismissed or rejected. What are you suggesting we need to wait for before we can analyze LIC’s historical cost-effectiveness in full?
You are conflating the issue of past cost-effectiveness with future potential.
Did I claim that I don’t think LIC “should be dismissed”?
ACE states (under Criterion 2) that a charity’s Cost-Effectiveness Score “indicates, on a 1-7 scale, how cost effective we think the charity has been [...] with higher scores indicating higher cost effectiveness.”
Would you mind clarifying what you believe ACE’s goal is, and what you believe my goal is?
The analysis in my review is entirely about calculating historic cost-effectiveness. ACE’s Cost-Effectiveness Scores are also entirely about calculating historic cost-effectiveness.
From this post, it seems like you’re trying to calculate historic cost-effectiveness and rate charities exclusively on that (since you haven’t published an evaluation of an animal charity yet, I could be wrong here though). My understanding of what ACE is trying to do with its evaluations as a whole is identify where marginal dollars might be most useful for animal advocacy, and move money from less effective opportunities to those. Cost-effectiveness might be one component of that, but is far from the only one (e.g. intervention scalability might matter, having a diversity of types of opportunities to appeal to different donors, etc.). It’s pretty easy to imagine scenarios where you wouldn’t prefer to only look at cost-effectiveness of individual charities when making recommendations, even if that’s what matters in the end. It’s also easy to imagine scenarios where recommending less effective opportunities leads to better outcomes for animals—maybe installing shrimp stunners is super effective, but only some donors will give to it. Maybe it can only scale to a few $M per year but you influence more money than that. Depending on your circumstances, a lot more than cost-effectiveness of specific interventions matters for making the most effective recommendations.
My understanding is also that ACE doesn’t see EAs as its primary audience (but I’m less certain about this). This is a reason I’m excited about your project—seems nice to have “very EA” evaluations of charities in addition to ACE’s. But, I also imagine it would be hard to get charities to participate in your evaluation process if you don’t run the evaluations by them in advance, which could make it hard for you to get information to do what you’re trying to do, unless you rely on the information ACE collects, which then puts you in the awkward position of making a strong argument against an organization you might need in order to conduct evaluations.
My understanding is ACE has tried to do something that’s just cost-effectiveness analysis in the past (they used to give probability distributions for how many animals were helped, for example). But it’s really difficult to do confidently for animal issues, and that’s part of the reason it’s only a portion of the whole picture (along with other factors like I mention above).
Thank you for your response!
This is not what we are trying to do. We simply critiqued the way that ACE calculated historic cost-effectiveness, and how ACE gave Legal Impact for Chickens a relatively high historic cost-effectiveness rating despite their having no historic success.
ACE does 2 separate analyses for past cost-effectiveness, and room for future funding. For example, those two sections in ACE’s review of LIC are:
Cost Effectiveness: How much has Legal Impact for Chickens achieved through their programs?
Room For More Funding: How much additional money can Legal Impact for Chickens effectively use in the next two years?
Our review focuses on ACE’s Cost-Effectiveness analysis, not on their Room For More Funding analysis. In the future, we may evaluate ACE’s Room For More Funding Analysis, but that is not what our review focused on. We wanted to keep our review short enough that people could read it without a huge time investment, so we could not include an assessment of every single part of ACE’s evaluation process in our review.
It is also less reasonable to hold ACE accountable for their Room For More Funding analysis, since this is inherently more subjective and difficult to do. It is far easier for ACE (or any charity evaluator) to analyze historic cost-effectiveness than to analyze future cost-effectiveness. However, I would like to pose a question to you: Given that ACE often gives charities a worse historic cost-effectiveness rating for spending less money to achieve the exact same outcomes (see Problem 1), how confident do you feel in ACE’s ability to analyze future cost-effectiveness?
ACE responded to this thread acknowledging that the problems listed in our review needed to be addressed, and that they changed their methodology (to a cost-effectiveness calculation of simply impact divided by cost) to do so:
FWIW this seems great—excited to see more comprehensive evaluations. Yeah, I agree with many of your comments here on the granular level — it seems you found something that is a potential issue for how ACE does (or did) some aspects of their evaluations, and publishing that is great! I think we just disagree on how important it is?
By the way, I’m ending further engagement on this (though feel free to leave a response if useful!) just because I already find the EA Forum distracting from other work, and don’t have time this week to think about this more. Appreciate you going through everything with me!
No problem. Thank you for your replies!
Thank you for doing this work. I’m very supportive of productive criticism on the Forum. As a moderator, I’d like to recommend this post for tips on how to make criticism more productive. EA is a collective project, and I think that steps such as sharing this feedback with ACE directly and writing a less aggressive title for your post would improve the outcomes of this work.
Thank you for your feedback. We will view the tips, and keep them in mind during our future reviews!
Edit: I have also changed the title of the post. For transparency, the original title was: Animal Charity Evaluators (ACE) is Extremely Bad at Evaluating Charities.
Their evaluation process has been updated (e.g. here), and I’m inclined to wait to see their new evaluations and recommendations before criticizing much, because any criticism based on last year’s work may no longer apply. Their new recommendations come out November 12th.
FWIW, I am sympathetic to your criticisms, as applied to last year’s evaluations. I previously left some constructive criticism here, too.
Hi Michael,
ACE re-evaluates their Recommended Charities every two years. In our review of ACE, all charities mentioned were evaluated in 2023 (the most recent published review cycle). Therefore, every charity mentioned in our review will still be recommended in ACE’s upcoming list of Recommended Charities.
When the new reviews come out, we will be sure to read them though!
Thanks for writing this. I feel like the following is the crux of your criticism of LIC:
You state this as though the answer is “obviously no”, but the answer feels extremely nonobvious to me. I note that you excluded some key things when quoting ACE:
As of this writing, the Facebook fan page still has a post about the lawsuit pinned to the top, because apparently the owner decided to boycott after learning about the cruelty.
It sounds like the Costco board also had to take official action:
Is it worth $200k to get a bunch of bad publicity for Costco, force the board to form a committee and hire an investigator, etc.?
I don’t know, I’m pretty willing to believe that the answer is “no”, but it doesn’t seem obvious to me. I could pretty easily believe that the CEO of the next company they sue would choose to change their policies instead of having to deal with the embarrassment of asking the board to form a committee to investigate.
Hi Ben,
Thank you for your response!
I will address your points, but first I would like to clarify what we believe the crux of the problem is with LIC being deemed a top 11 animal charity by ACE.
In Problem 1 of our review, we state the following:
We go on to detail how, if LIC had spent less than $2,000 on the lawsuit (saving over $200,000) and achieved the exact same outcome, ACE would have assigned LIC a Cost-Effectiveness Score of 1.8. The lowest Cost-Effectiveness Score ACE assigned to any charity in 2023 was 3.3. This means if LIC had spent less than $2,000 on the lawsuit, LIC’s Cost-Effectiveness Score would have been significantly worse than any charity ACE evaluated in 2023.
Instead, LIC spent over $200,000 on the lawsuit, and ACE rewarded them for this inefficiency by giving them a Cost-Effectiveness Score of 3.7 and deeming LIC a top 11 animal charity.
This is the crux of the problem, and it is really an issue with ACE deeming LIC a top 11 animal charity, not with LIC itself. ACE elected to give LIC this distinction, and LIC merely accepted it.
I would also like to note encouraging or valuing lawsuits that fail to state valid legal claims (but burden defendants/garner publicity) risks causing the legal system to take animal rights/welfare cases less seriously. If courts observe a pattern of weak or legally insufficient cases being filed for publicity/to burden the defendant, they will become skeptical of all animal rights/welfare lawsuits—even those with strong legal merit. Prior to being deemed a top 11 animal charity by ACE, every single lawsuit filed by LIC failed to state a valid legal claim.
ACE’s review of LIC contains a section titled “Our Assessment of Legal Impact for Chickens’ Cost Effectiveness”, and the quote you have provided is not part of this section. Our entire review of ACE is about ACE incorrectly calculating cost-effectiveness; consequently, this is the section we decided to focus on. ACE’s review of LIC is over 5,000 words, and we cannot include every quote from ACE’s review of LIC.
Additionally, the quote you’ve provided gives no metrics to gauge how much media attention was received. If media attention is a strong justification for stating a $200,000 lawsuit that failed to state a valid legal claim is “particularly cost-effective” (as ACE put it), ACE should provide metrics regarding how much media attention was received. Ironically, the Facebook post you mentioned appears to have more metrics than ACE’s review of LIC regarding the amount of media attention caused by the Costco lawsuit, since the Facebook post lists the number of likes and comments it received.
The Facebook post you referred to received 56 likes and 83 comments. To my understanding, the post is also not pinned to the top, it is simply the last post the Facebook page has made (it appears that the page has not posted in over 2 years). I do not think this is very strong evidence that LIC’s $200,000 lawsuit that was dismissed for failing to state a valid legal claim was “particularly cost-effective” (as ACE put it).
Correct, the Costco board took official action by rejecting LIC’s demands.
Could you please define what “a bunch of bad publicity for Costco” means? And could you provide evidence that this level of publicity was caused by LIC’s lawsuit?
Costco’s board formed a committee to review and investigate LIC’s demands. The committee then recommended that the board reject the demand, which they did. This does not appear to be a very good outcome.
It is ACE’s job to write charity reviews that provide the empirics necessary to answer questions like the one you’ve asked. From your own statement, it seems like ACE has failed to do this. ACE did not provide metrics on how much media attention the Costco lawsuit caused, and did not provide any insight into how much of a burden it was to form a committee to review and investigate LIC’s demands (I don’t recall ACE’s review even mentioning this).
Yes, thank you, I understand that weighting by budget results in the phenomenon you described. I didn’t comment on this since it sounds like ACE is planning to change it anyway.
I was referring to the publicity listed in ACE’s review. The stories appear to be about the lawsuit so I am not entirely sure what you mean by “could you provide evidence that this level of publicity was caused by LIC’s lawsuit”. See e.g. CNN, Fox.
To clarify: I don’t care about causing burdens to Costco per se. The reason that burdens are relevant is because future companies might prefer to avoid that burden and instead change their policies. I agree it would be good to have a better model of when this would happen and would be excited for someone to make such a model!
If you’re correct in the linked analysis, this sounds like a really important limitation in ACE’s methodology, and I’m very glad you’ve shared this!
In case anyone else has the same confusion as me when reading your summary: I think there is nothing wrong with calculating a charity’s cost effectiveness by taking the weighted sum of the cost-effectiveness of all of their interventions (weighted by share of total funding that intervention receives). This should mathematically be the same as (Total Impact / Total cost), and so should indeed go up if their spending on a particular intervention goes down (while achieving the same impact).
The (claimed) cause of the problem is just that ACE’s cost-effectiveness estimate does not go up by anywhere near as much as it should when the cost of an intervention is reduced, leading the cost-effectiveness of the charity as a whole to actually change in the wrong direction when doing the above weighted sum!
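To make this concrete, here is a minimal Python sketch with made-up numbers (illustrative assumptions only, not ACE’s actual figures or formula). Weighting each intervention’s *true* cost-effectiveness by its budget share is mathematically identical to total impact divided by total cost; the inversion described in Problem 1 appears only when what gets weighted is an assigned score that does not rise proportionally when cost falls:

```python
# Hypothetical illustration -- not ACE's actual numbers or methodology.
# Each intervention is (cost, impact, assigned_score); scores are
# illustrative quality ratings on a 1-7 scale.

def direct_ce(interventions):
    """Total impact / total cost: strictly improves when any cost falls."""
    total_cost = sum(c for c, _, _ in interventions)
    total_impact = sum(i for _, i, _ in interventions)
    return total_impact / total_cost

def weighted_true_ce(interventions):
    """Budget-share-weighted average of true per-intervention CE.
    Algebraically identical to direct_ce: the cost terms cancel."""
    total_cost = sum(c for c, _, _ in interventions)
    return sum((c / total_cost) * (i / c) for c, i, _ in interventions)

def budget_weighted_score(interventions):
    """Budget-share-weighted average of assigned scores. If a score does
    not rise when its cost falls, cutting the cost of a high-scoring
    intervention shrinks its weight and drags the average down."""
    total_cost = sum(c for c, _, _ in interventions)
    return sum((c / total_cost) * s for c, _, s in interventions)

before = [(200_000, 400.0, 6.0),   # expensive lawsuit, high score
          (100_000, 100.0, 2.0)]   # cheaper program, low score
after  = [(2_000,   400.0, 6.0),   # same outcome for 1% of the cost
          (100_000, 100.0, 2.0)]

# The identity holds: weighting true CE by budget share changes nothing.
assert abs(weighted_true_ce(before) - direct_ce(before)) < 1e-12

# Direct CE correctly rewards the efficiency gain...
assert direct_ce(after) > direct_ce(before)

# ...but the budget-weighted score punishes it.
print(round(budget_weighted_score(before), 2))  # 4.67
print(round(budget_weighted_score(after), 2))   # 2.08
```

Under these assumed numbers, achieving the same outcomes for 1% of the cost moves the weighted score from 4.67 down to 2.08, which is exactly the wrong direction.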
If this is true it sounds pretty bad. Would be interested to read a response from them.
Of course, the other thing that could be going on here, is that average cost-effectiveness is not the same as cost-effectiveness on the margin, which is presumably what ACE should care about. Though I don’t see why an intervention representing a smaller share of a charity’s expenditure should automatically mean that this is not where extra dollars would be allocated. The two things seem independent to me.
Hi Toby,
Thank you for your reply!
I’m not certain if by cost-effectiveness on the margin, you meant cost-effectiveness in the future if additional funding is obtained. If that’s the case, the following information could be helpful.
ACE does two separate analyses: one for past cost-effectiveness and one for room for more funding. For example, those two sections in ACE’s review of LIC are:
Cost Effectiveness: How much has Legal Impact for Chickens achieved through their programs?
Room For More Funding: How much additional money can Legal Impact for Chickens effectively use in the next two years?
Our review focuses on ACE’s Cost-Effectiveness analysis. Additionally, ACE states (under Criterion 2) that a charity’s Cost-Effectiveness Score “indicates, on a 1-7 scale, how cost effective we think the charity has been [...] with higher scores indicating higher cost effectiveness.”
This is very helpful, thanks!
I strongly upvoted this post because I’m extremely interested in seeing it get more attention and, hopefully, a potential rebuttal. I think this is extremely important to get to the bottom of!
At first glance your critiques seem pretty damning, but I would have to put a bunch of time into understanding ACE’s evaluations first before I would be able to conclude whether I agree with your critiques (I can spend a weekend day doing this and writing up my own thoughts in a new post if there is interest).
My expectation is that if I were to do this I would come out feeling less confident than you seem to be. I’m a bit concerned that you haven’t made an attempt at explaining why ACE might have constructed their analyses this way.
But like I’m pretty confused too. It’s hard to think of much justification for the choice of numbers in the ‘Impact Potential Score’, and deciding the impact of a book based on the average of all books doesn’t seem like the best way to approach things?
Hi Mathias,
Thank you for your comment!
We would definitely be interested in hearing your thoughts. We’ve set post notifications on for your profile, and look forward to seeing your post!
So, I have some mixed views about this post. Let’s start with the positive.
In terms of agreement: I do think organizational critics are valuable, and specifically, critics of ACE in the past have been helpful in improving their direction and impact. I also love the idea of having more charity evaluators (even in the same cause area) with slightly different methods or approaches to determining how to do good, so I’m excited to see this initiative. I also have quite a bit of sympathy for giving higher weight to explicit cost-effectiveness models when it comes to animal welfare evaluations.
I can personally relate to the feeling of being disappointed after digging deeper into the numbers of well-respected EA meta organizations, so I understand the tone and frustration. However, I suspect your arguments may get a lot of pushback on tone alone, which could distract from the more important substance of the post and concepts (I’ll leave that for others to address, as it feels less important, in my opinion).
In terms of disagreement: I will focus on what I think is the crux of the issue, which I would summarize as: (a) ACE uses a methodology that yields quite different results than a raw cost-effectiveness analysis; (b) this methodology seems to have major flaws, as it can lead to clearly incoherent conclusions and recommendations easily; and (c) thus, it is better to use a more straightforward, direct CEA.
I agree with points A and B, but I am much less convinced about point C. To me, this feels a bit like an isolated demand for methodological rigor. Every methodology has flaws, and it’s easy to find situations that lead to clearly incoherent conclusions. Expected value theory itself, using pure EV terms, has well-known issues like the St. Petersburg paradox, the optimizer’s curse, and general model mistakes. CEAs in general share these issues and have additional flaws (see more on this here). I think CEAs are a super useful tool, but they are ultimately a model of reality, not reality itself, and I think EA can sometimes get too caught up in them (whereas the rest of the world probably doesn’t use them nearly enough). GW, which has ~20x the budget of ACE, still finds model errors and openly discusses how softer judgments on ethics and discount factors influence outcomes (and they consider more than just a pure CEA calculation when recommending a charity).
Overall, being pretty familiar with ACE’s methodology and CEAs, I would expect, for example, that a 10-hour CEA of the same organizations would be quite a bit further from the truth of the actual impact or effectiveness of an organization. It’s not clear to me that spending equal time on pure CEAs versus a mix of evaluative techniques (as ACE currently does) would lead to more accurate results (I would probably weakly bet against it). I think this post overstates the importance of discarding a model due to a flaw that can be exploited.
A softer argument, such as “ACE should spend double the percentage of time it currently spends on CEAs relative to other methods” or “ACE should ensure that intervention weightings do not overshadow program-level execution data,” is something I have a lot of sympathy for.
Hi Joey,
Thank you for taking the time to read our review!
I would like to point to Problem 1 and Problem 4 from the review:
Charities can receive a worse Cost-Effectiveness Score by spending less money to achieve the exact same results.
Charities can have 1,000,000 times the impact at the exact same price, and their Cost-Effectiveness Score can remain the same.
Effective giving is all about achieving the greatest impact at the lowest cost. ACE’s methodology is not properly accounting for impact, or for cost.
Using the equation impact / cost at least results in impact being in the numerator, and cost being in the denominator. To me, this alone makes a straightforward, direct CEA a better methodology than the one used by ACE.
I absolutely agree that every methodology has flaws, and we did not mean to imply otherwise. However, the incoherent conclusions described in our review of ACE’s methodology are not one-off instances. They are pervasive problems that impact all of ACE’s reviews.
Thank you for your feedback!
Great analysis, Isaac! I worry the Animal Welfare Fund (AWF) has similar problems (see below), but they are way less transparent than ACE about their evaluations, and therefore much less scrutable. Instead of mostly deferring to AWF, I would rather have donors look over ACE’s evaluations, discuss their findings with others, and eventually publish them online, even if they spend much less time on these activities than you did.
AWF only runs cost-effectiveness analyses (CEAs) for a minority of applications. According to a comment by Karolina Sarek, AWF’s chair, on June 28 (this year):
Comparisons across grants also seem to be lacking. From Giving What We Can’s (GWWC’s) evaluation of AWF in November 2023 (emphasis mine):
GWWC looked into 10 applications:
Karolina also said on June 28 that AWF has improved their methodology since GWWC’s evaluation:
I do not doubt AWF has taken the above steps, but I have no way to check it. I think donating to ACE over AWF is a good way of incentivising transparency, which ultimately can lead to more impact.
Hey Vasco! I agree that AWF should be more transparent, and since I started working on it full-time, we have more capacity for that, and we are planning to communicate about our work more proactively.
In light of that, we just published a post summarizing how 2024 went, what changes we recently introduced, and what we are planning. We touched on updates to our evaluation process as well. Here is the relevant section from that post:
“Grant investigations:
Updated grant evaluation framework: We’ve updated our systematic review process, enabling us to evaluate every application using standardized templates that vary based on the required depth of investigation. This framework ensures a thorough assessment of key factors while maintaining flexibility for grant-specific considerations. For example, for the deep evaluations (which are the vast majority of all evaluations), key evaluation areas include assessment of the project’s Theory of Change, scale of counterfactual impact, likelihood of success, back-of-the-envelope cost-effectiveness and benchmarking, and the expected value of receiving funding. It also includes forecasting grant outcomes. You can read more about our process in the FAQ.
Introduced new decision procedures for marginal grants: We introduced an additional step in our evaluation that enables us to make better decisions about grants that are just below or just above our funding bar. Since AWF gives grants on a rolling basis rather than in rounds, it is important to have a process for this to ensure decisions are consistent.”
We also slightly updated our website and added a new question to the FAQ—I’m copying that below:
“How Does the EA Animal Welfare Fund Make Grant Decisions?
Our grantmaking process consists of the following stages:
Stage 1: Application Processing. When we receive an application, it’s entered into our project management system along with the complete application details, history of previous applications from the applicant, evaluation rubrics, investigator assignments, and other relevant documentation.
Stage 2: Initial Screening. We conduct a quick scope check to ensure applications align with our fund’s mission and show potential for high impact. About 30% of applications are filtered out at this stage, typically because they fall outside our scope or don’t demonstrate sufficient impact potential.
Stage 3: Selecting Primary Grant Investigator and Depth of the Evaluation. For applications that pass the initial screening, we assign investigators who are most suitable for a given evaluation. Based on various heuristics, such as the size of the grant, uncertainty, and potential risk, the Fund’s Chair also determines the depth of the evaluation.
Stage 4: In-Depth Evaluation. Every grant application undergoes a systematic review. For each level of depth of investigation required, AWF has an evaluation template that fund managers follow. The framework balances ensuring that all key factors have been considered and that evaluations are consistent, while leaving space for additional, grant-specific crucial considerations. For the deep evaluations (which are the vast majority of all evaluations), the primary investigator typically examines:
Theory of Change (ToC) - examining how activities translate into improvements for animals and whether the evidence supports its merits
Scale of counterfactual impact—assessing the problem’s scale, neglectedness, and strategic importance
Likelihood of success—evaluating track record, team competence, and concrete plans
Cost-effectiveness and benchmarking—conducting calculations to estimate impact per dollar and compare it to relevant benchmarks
Value of funding—analyzing counterfactuals and long-term sustainability
Forecasting—forecasting the probability that the project will succeed or fail and due to what reasons (validity of the ToC or performance in achieving planned outcomes)
In the case of evaluations that require the maximum level of depth, a secondary investigator critically reviews the completed write-up, raises additional questions and concerns, and provides alternative perspectives or recommendations.
Stage 5: Collective Review and Voting. After the evaluation, each application undergoes a thorough collective assessment. The Fund Chair and at least two Fund Managers review the analysis. All Fund Managers without conflicts of interest can contribute additional insights and discuss key questions through dedicated channels. Finally, each Fund Manager assigns a score, which helps us systematically compare the most promising grants.
Stage 6: Final Recommendation. Looking at the average score, the Fund Chair approves grants that are clearly above our funding bar and rejects those clearly below it. For grants near our funding threshold, we conduct another step where all fund managers compare those marginal grants against each other to select the strongest proposals.
Once decisions are finalized, approved grants move to our grants team for contracting and reporting setup.
Throughout this process, we maintain detailed documentation and apply consistent standards to ensure we select the most promising opportunities to help animals most effectively.”
Thanks, Karolina! Great updates.
It feels like this needs a response from both ACE and Legal Impact for Chickens. (I’m not suggesting it should be a quick one; some things are important enough to warrant careful review. I agree with @abrahamrowe that it would probably have been better to ask for their comments before publishing.)
I think it is possible for a charity focusing on taking legal action to be impactful without [consistent] legal success, which the review doesn’t really acknowledge. A large part of the theory of change around suing over corporate bad behaviour is the idea that it will deter bad behaviour in future, by making standards compliance more cost-effective than defending lawsuits.
Deterrent effects however are a more complicated theory of change than actually winning cases and forcing actors to change. And it may be very difficult to have a deterrent effect if cases are typically dismissed.
To that extent I’m quite surprised to learn that Legal Impact for Chickens apparently hasn’t yet had any victories, based on what I had heard about that organization. I don’t think this necessarily reflects badly on the organization, which is a young charity focused on a legal process which inevitably takes time. But it does mean the error bars for their impact are rather large, and could mean a nonzero possibility they aren’t [yet] having an impact at all. It would be interesting to hear more about metrics used (both by LIC and ACE, and other charities with similar theory of change for that matter) to evaluate the impact of an unsuccessful lawsuit, and how substantial those are.
Some of the questions raised about ACE’s weightings are quite independent of the example given. It would be interesting to hear from ACE if and how the evaluation criteria for their [apparently mostly subjective] impact scoring take into account the idea that a charity could achieve a higher score by subdividing campaigns, and if and how they intend to update impact assessments in cases like the example of books either failing to reach a non-trivial number of people or being phenomenally successful, even if the case they make for veganism was not originally assessed as particularly evidence-based.
I think this would have been an interesting contribution to the Animal Welfare vs GHD debate week. From the limited amount I read of it, it seemed that even people (on different sides of the debate) whose analysis was very thorough weren’t taking into account the more straightforward possibility that some of the highlighted top animal advocacy charities simply weren’t close to being as effective [yet] at achieving their goals as suggested, regardless of philosophical positions and empirical claims about welfare levels.
Hi David,
Thank you for your reply!
I definitely agree that this is possible! However, as you said
ACE evaluated 3 “legal actions” in their review of LIC:
2 of the legal actions were dismissed under Rule 12(b)(6) for failing to state a valid legal claim. 12(b)(6) dismissals occur very early on in the legal process, making any legal expenses incurred by the Defendants relatively low. Additionally, encouraging or valuing lawsuits that fail to state valid legal claims but cost the defendant money risks causing the legal system to take animal rights/welfare cases less seriously. If courts observe a pattern of weak or legally insufficient cases being filed to burden defendants, they will become skeptical of all animal rights/welfare lawsuits—even those with strong legal merit.
The 3rd legal action ACE evaluated was not actually a legal action, but rather a public comment submission (ACE still classified it as a legal action). The public comment was rejected, and it is difficult to see how this would have a positive impact.
ACE endorses LIC as a top charity. Currently, I don’t think this endorsement is justified given LIC’s track record, and I don’t think ACE provided a very strong justification for it. Here is a quote from ACE’s review of LIC:
“We think that out of all of Legal Impact for Chickens’ achievements, the Costco shareholder derivative case is particularly cost effective because it scored high on achievement quality.”
The Costco shareholder derivative case cost LIC over $200,000 and was dismissed for failing to state a valid legal claim. It is difficult to understand why ACE thinks this is a particularly cost effective achievement.
Could you elaborate on what you mean by this?
I wasn’t aware of that week. Maybe we’ll be able to prepare something for it next year!
Thank you for your feedback!
I agree with this, and particularly agree that the quote you highlighted below does not seem like good justification. I also think your comment (elsewhere in this thread) that their track record is a “bad one” might be going a little too far.[1] As I say, I was surprised to find that LIC had not yet had any legal success, given that I’d heard about them mostly through hearing positive commentary on their cost effectiveness
I meant that there were criticisms you raised about the overall methodology that had wider implications than just LIC. Possibly I could have worded that better.
There was an animal welfare vs GHD debate week on this forum. Honestly, I hope they don’t repeat it![2]
I think a charity aiming to encourage compliance that never filed any lawsuits unless they were almost certain to succeed would probably underperform too, and $200k is not an especially expensive legal case, though there are certainly more proven cost-effective ways to save lives for that sort of money. That said, I haven’t read the lawsuit and wouldn’t know enough about relevant law to know whether the basis for dismissal was blindingly obvious or not...
I think there are probably more specific and less polarizing topics for debate. And polarizing topics more likely to yield concrete results, which probably includes this one.