First of all, thank you for the extensive comments!
I can give more context during our AMA next week if helpful (I won’t have much time to engage in the coming few days, unfortunately), but wanted to react quickly to avoid a misunderstanding about our views here. I’ve copy-pasted the relevant section from the report below:
To be clear, there are strong limitations to this recommendation:
We didn’t ourselves evaluate THL’s work directly, nor did we compare it to other charities (e.g., ACE’s other recommendations).
The availability of evidence here may be high relative to other interventions in animal welfare, but is still low compared to interventions we recommend in global health and wellbeing. We haven’t directly evaluated Open Philanthropy, Rethink Priorities, or Founders Pledge as evaluators.
We have questions about the external validity of the evidence for corporate campaigns, i.e. whether they are as cost-effective when applied in new contexts (e.g. low- and middle-income countries in Africa) as they seem to have been where the initial evidence was collected (mainly in the US and Europe).
We also have questions about the extent to which the evidence for corporate campaigns is out of date: the Founders Pledge and Rethink Priorities reports are more than four years old, and we would expect diminishing returns to corporate campaigns over time, as the “low-hanging fruit” in terms of cost-effectiveness is picked first.
Taken together, all of this means we expect funding THL’s current global corporate campaigns to be (much) less cost-effective than the corporate campaigns in 2016-2017, which were evaluated in those reports.^1
^1 It is worth noting that Open Philanthropy confirmed to us that it thinks so as well: its referral is not a claim that funding THL’s corporate campaigns will be exactly as cost-effective as it probably was a couple of years ago, when THL achieved big wins on a small budget, but a claim that funding them is likely still among the most cost-effective options in the space, and that THL can productively use a lot of extra funding without strongly diminishing marginal returns to funding currently provided.
So in short, we share your impression that THL’s work is (much) less cost-effective than it was a few years ago. We are aware of Open Phil’s views on this, and their referral of THL’s work to us took these diminished expected returns into account. The FP and RP reports weigh (much) less heavily in our recommendation of THL’s current work than ACE’s and OP’s recommendations, but we think those reports still provide a useful (and publicly accessible) reference on corporate campaigns as an intervention more generally.
Hi Simon,
I’m back to work and able to reply with a bit more detail now (though also time-constrained as we have a lot of other important work to do this new year :)).
I still do not think any (immediate) action on our part is required. Let me lay out the reasons why:
(1) Our full process and criteria are explained here. As you seem to agree (judging from your comment above), we need clear and simple rules for what is and isn’t included, in part because we have a very small team and need to prioritize. A very brief summary of these rules/this process would currently be: first determine which evaluators to rely on (also note our plans for this year), and then rely on their recommendations. We do not generally have the capacity to review individual charity evaluations, and would only do so, and potentially diverge from a trusted evaluator’s recommendation, under exceptional circumstances. (I don’t believe we have had such a circumstance this giving season, but I may misremember.)
(2) There were no strong reasons to diverge from FP’s recommendation of StrongMinds at the time they recommended them, or to do an in-depth review of FP’s evaluation ourselves, and I think there still aren’t. As I said before, you make a few useful points in your post, but I think Matt’s reaction and the subsequent discussion satisfactorily explain why Founders Pledge chose to recommend StrongMinds and why your comments don’t (immediately) change their view on this: StrongMinds doesn’t need to meet GiveWell-tier levels of confidence, and it easily clears FP’s bar in expectation even with the issues you mention taken into account; nearly all the decision-relevant reasoning is already publicly available in the 2019 report and HLI’s recent review. I would of course be very interested, and we could reconsider our view, if any ongoing discussion brings to light new arguments or if FP is unable to back up any claims they made, but so far I haven’t seen any red or even orange flags.
(3) The above should be enough for GWWC not to prioritize taking any action related to StrongMinds at the moment, but I happen to have a bit more context here than usual, as I was a co-author on the 2019 FP report on StrongMinds, and none of the five issues you raise is new or surprising to me or changes my view of StrongMinds very much. Very briefly on each (note: I don’t have much time / will mostly leave this to Matt / some of my knowledge may be outdated or my memory may be off):
I agree the overall quality of evidence falls far short of e.g. GiveWell’s standards (cf. Matt’s comments), and I would have agreed on this back in 2019. At this point, I certainly wouldn’t take FP’s 2019 cost-effectiveness analysis literally: I would deflate the results by quite a bit to account for quality of evidence, and I know FP have done so internally for at least the past ~2 years. However, AFAIK such accounting, done reasonably, isn’t enough to change the overall conclusion that StrongMinds meets the cost-effectiveness bar in wellbeing terms (see the toy sketch after this list for the kind of adjustment I mean). I should also note that HLI’s cost-effectiveness analysis seems to take into account more pieces of evidence, though I haven’t reviewed it, only skimmed it.
As you say yourself, the 2019 FP report already accounted for social desirability bias to some extent, and it further highlights this bias as one of its key uncertainties (section 3.8, p. 31).
I disagree with depression being overweighted here for various reasons, including that DALYs plausibly underweight mental health (see section 1, pp. 8-9 of the FP mental health report). Also note that HLI’s recent analysis, AFAIK, doesn’t rely on DALYs in any way.
I don’t think the reasons StrongMinds give for not collecting more evidence (than they already are) are as unreasonable as you seem to think. I’d need to delve more into the specifics to form a view here, but just want to reiterate StrongMinds’s first reason: running high-quality studies is generally very expensive, and may often not be the best decision for a charity from a cost-effectiveness standpoint. Even though I think the sector as a whole could probably still do with more (of the right type of) evidence generation, from my experience I would guess it’s also relatively common that charities collect more evidence (of the wrong kind) than would be optimal.
I don’t like what I see in at least some of the examples of communication you give, and if I were evaluating StrongMinds currently I would certainly want to give them this feedback (in fact, I believe I did back in 2018, which I think prompted them to make some changes). However, though I’d agree that these examples provide some update on how thoroughly one should check claims StrongMinds makes more generally, I don’t think they should meaningfully change one’s view on the cost-effectiveness of StrongMinds’s core work.
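As promised above, here is a toy sketch of the kind of evidence-quality deflation I mentioned in the first point. To be clear, every number in it is made up for illustration; this is not FP’s internal adjustment or HLI’s method, just one simple way such a discount could work:

```python
# Illustrative only: all numbers below are hypothetical, not FP's internal figures.
# One simple way to "deflate" a headline cost-effectiveness estimate for quality
# of evidence: shrink it toward a sceptical prior, weighted by how much credence
# the evidence base earns.

headline_estimate = 10.0  # hypothetical: wellbeing units per $1,000 from a CEA
sceptical_prior = 1.0     # hypothetical: what you'd expect of a typical charity
evidence_weight = 0.4     # hypothetical credence in the study design (0 to 1)

adjusted = (evidence_weight * headline_estimate
            + (1 - evidence_weight) * sceptical_prior)
print(adjusted)  # 4.6: heavily discounted, yet possibly still above a funding bar
```

The point is only that even a fairly aggressive discount can leave the adjusted estimate above a reasonable funding bar; whether it actually does depends entirely on the prior and the weight you choose.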
(4) Jeff suggested (and some others seem to like) the idea of GWWC changing its inclusion criteria and only recommending/top-rating organisations for which an up-to-date public evaluation is available. This is something we discussed internally in the lead-up to this giving season, but we decided against it and I still feel that was and is the right decision (though I am open to further discussion/arguments):
There are very few charities for which full public and up-to-date evaluations are available, and coverage for some worldviews/promising cause areas is structurally missing. In particular, there are currently hardly any full public and up-to-date evaluations in the mental health/subjective well-being, longtermist and “meta” spaces. And note that, by this standard, we wouldn’t be able to recommend any funds except for those just regranting to already-established recommendations.
If the main reason for this were that we don’t know of any cost-effective places to donate in these areas/according to these worldviews, I would agree that we should just go with what we know, or at least highlight that standards are much lower in these areas.
However, I don’t think this is the case: we do have various evaluators/grantmakers looking into these areas (though too few yet, IMO!) and arguably identifying very cost-effective donation opportunities (in expectation), but they often don’t prioritise sharing these findings publicly or updating public evaluations regularly. Having worked at one of these organisations myself (FP), my impression is that this is generally for very good reasons, mainly related to resource constraints/prioritisation, as Jeff notes himself.
In an ideal world—where these resource constraints wouldn’t exist—GWWC would only recommend charities for which public, up-to-date evaluations are available. However, we do not live in that ideal world, and as our goal is primarily to provide guidance on what are the best places to give to according to a variety of worldviews, rather than what are the best explainable/publicly documented places to give, I think the current policy is the way to go.
Obviously it is very important that we are transparent about this, which we aim to do by clearly documenting our inclusion criteria, explaining why we rely on our trusted evaluators, and highlighting the evidence that is publicly available for each individual charity. Providing this transparency has been a major focus for us this giving season, and though I think we’ve made major steps in the right direction, there’s probably still room for improvement: any feedback is very welcome!
Note that one reason more public evaluations would seem to be good/necessary is accountability: donors can check and give feedback on the quality of evaluations, providing the right incentives and useful information to evaluators. This sounds great in theory, but in my experience public evaluation reports are almost never read by donors (this post is an exception, which is why I’m so happy with it, even though I don’t agree with the author’s conclusions), and they come at a very high resource cost to create and maintain: writing a public report can take up about half of the total time spent on an evaluation (!). This leaves us with an accountability and transparency problem that I think is real, and which is one of the main reasons for our planned research direction this year at GWWC.
Lastly, FWIW, I agree that we actively recommend StrongMinds (and this is our intention), even though we generally recommend that donors give to funds over individual charities.
I believe this covers (nearly) all of the GWWC-related comments I’ve seen here, but please let me know if I’ve missed anything!