Why I’m concerned about Giving Green
I am a forecaster and occasional independent researcher. I also work in a volunteer capacity for SoGive, which has included some analysis of the climate space in order to provide advice to some individual donors interested in this area. This work has involved ~20 hours of conference calls over the last year with donors and organisations, one of which was the Clean Air Task Force, although for the last few months my primary focus has been internal work on moral weights. I began the research for this piece in a personal capacity, and the opinions below are my own, not those of SoGive.
I received input on some early drafts, for which I am extremely grateful, from Sanjay Joshi (SoGive’s founder), as well as Aaron Gertler and Linch Zhang; however, I again want to emphasise that the opinions expressed, and especially any mistakes in the below, are mine alone. I’m also very grateful to Giving Green for taking the time to have a call with me about my thinking here. I provided a copy of the post to them in advance, and they have indicated that they’ll be providing a response to the points below.
I think that Giving Green has the potential to be incredibly impactful, not just on the climate but also on the EA/Effective Giving communities. Many people, especially young people, are extremely concerned about climate change, and very excited to act to prevent it. Meta-analysis of climate charities has the chance to therefore have large first-order effects, by redirecting donations to the most effective organisations within the climate space. It also, if done well, has the potential to have large second-order effects, by introducing people to the huge multiplier on their impact that cost-effectiveness research can have, and through that to the wider EA movement. I note that at least one current CEA staff member took this exact path into EA. With this said, I am concerned about some aspects of Giving Green in its current form, and having discussed these concerns with them, felt it was worth publishing the below.
Concerns about research quality
Giving Green’s evaluation process involves substantial evidence collection and qualitative evaluation, but eschews quantitative modelling in favour of a combination of metrics which do not have a simple relationship to cost-effectiveness. In three cases, detailed below, I have reservations about the strength of Giving Green’s recommendations. Giving Green also currently recommends the Clean Air Task Force, which I enthusiastically endorse, but which Founders Pledge had already identified as promising before Giving Green’s founding, and Tradewater, which I have not evaluated. What this boils down to is that in every case where I investigated an original recommendation made by Giving Green, I was concerned by the analysis to the point where I could not agree with the recommendation.
Despite the unusual approach, especially compared to standard EA practice, the research and methodology are presented by Giving Green in a way which implies a level of concreteness comparable to major existing charity evaluators such as GiveWell. As well as the quantitative aspect mentioned above, major evaluators are notable for the high degree of rigour in their modelling, with arguments carefully connected to concrete outcomes, and explicit consideration of downside risks and ways that they could be wrong. One important benefit of the more usual approach is that it makes research much easier to critique, as causal reasoning is laid out explicitly, and key assumptions are identified and quantified. When research lacks this style, not only does the potential for error increase, but it becomes much more difficult and time-intensive to critique, meaning errors in the analysis are less likely to be identified and publicised.
There being considerable room for improvement in Giving Green’s analysis is not, of itself, a major issue. Giving Green is a new organisation, primarily the spare-time work of two people, and mistakes should be expected from any new organisation, let alone one operating under such time constraints. For example, GiveWell made many mistakes in its early years; similarly, concerns about poor research quality at ACE were raised productively a few years ago, and the impression I have from people who were involved at the time is that things have since substantially improved.
Media coverage risks overselling
Giving Green has, however, had an extremely successful launch, and has enthusiastically been promoted as a highlight of the effective giving/EA community. In some cases, I think that this promotion has implied a level of certainty behind Giving Green’s recommendations beyond that which is warranted, and indeed beyond that which they themselves have.
For example, this Atlantic article describes Giving Green as follows, before directly comparing them to GiveWell:
Giving Green is part of the effective-altruism movement, which tries to answer questions such as “How can someone do the most good?” with scientific rigor. Or at least with econometric rigor…
This Vox article lists recommendations by Giving Green alongside Founders Pledge recommendations, describing both as:
the most high-impact, cost-effective, evidence-based charities to donate to if you want to improve US climate policy
Giving Green has also been highlighted in various more internal EA discussions, including an extremely positive post on the EA Forum, as well as more neutral writeups in the Effective Altruism and Giving What We Can newsletters. The (even newer) organisation High Impact Athletes mentions Giving Green alongside Founders Pledge as a source of recommendations in their FAQ, though Giving Green does not feature prominently elsewhere on the site. Edit: the reference to Giving Green has now been removed. While this piece was being drafted, another new organisation launched, with the aim of promoting and facilitating effective giving in Sweden. It lists GiveWell, Giving Green, and ACE with equal prominence, again strongly implying an equivalence between the three. I opened this post with the case for why I am excited to see the emergence of an organisation like Giving Green, and this media success and their extremely slick website (seriously, go look at it) strengthen that case. But precisely because of that success, I am concerned, especially among EAs, that the gap between the rigour their presentation implies and the analysis actually underlying the recommendations poses a nontrivial reputational risk, especially if the headline recommendations are promoted without adequate qualification.
Finally, it feels like both Giving Green and the EA community could have seen this coming. Cool Earth was, for a long time, something like the default EA answer to the question “I really want to give to a climate charity; which is the best one?”, despite what turned out to be significant flaws in the analysis. It is still, as far as I know, the only climate charity mentioned in “Doing Good Better”. Thanks to the excellent work of Founders Pledge, among others, it is no longer the case that any recommendation is better than none.
My goal with this piece
I hope that Giving Green will, at least while they are still growing and learning, consider changing the priority with which the organisations they have researched are displayed, so that potential donors are strongly encouraged to donate to organisations where Giving Green concurs with the research of the more established effective climate research organisation, Founders Pledge. My current personal recommendation for individual donors interested in climate change, or for EAs talking to friends who are looking for donation advice, is the one organisation recommended by both Giving Green and Founders Pledge: the Clean Air Task Force.
Details of what I’m concerned about, including questions I put to Giving Green and details of their responses, are below.
Giving Green’s overall approach
Giving Green’s analysis of offsets consists of assessing five factors, none of which consider cost. Their approach is described in this document, from which the image below outlining their framework is taken.
When I asked why cost was not explicitly modelled, Giving Green responded that the cost estimates produced by the organisations themselves are often unreliable, and that the true cost per ton of carbon is very uncertain and difficult to estimate. I agree with both claims.
However, something being difficult to model is not, in principle, a reason not to try. It could well be the case that optimistic estimates of the cost-effectiveness of one organisation clearly underperform pessimistic estimates of the cost effectiveness of another, and in such cases there seems to be little reason to continue to recommend the first.
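As a toy illustration of this point (all numbers below are hypothetical, invented purely to make the argument concrete), two cost-effectiveness estimates can each carry an order of magnitude of uncertainty and still yield a robust ranking, so long as the intervals do not overlap:

```python
# Hypothetical cost-per-ton estimates (USD/tCO2e) with wide error bars.
# These figures are invented for illustration, not taken from any analysis.
org_a = {"optimistic": 50.0, "pessimistic": 500.0}   # e.g. a direct offset
org_b = {"optimistic": 0.1, "pessimistic": 10.0}     # e.g. a policy charity

# If org B's pessimistic (worst-case) cost per ton still beats org A's
# optimistic (best-case) cost, the ranking survives the uncertainty.
ranking_robust = org_b["pessimistic"] < org_a["optimistic"]
print(f"Ranking robust to uncertainty: {ranking_robust}")
```

In cases like this, wide error bars are an argument for presenting the comparison with heavy caveats, not for declining to make it.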
Giving Green agrees with the consensus EA view that the framing of “offsetting personal emissions” is unhelpful, stating, for example, in their launch post that:
Carbon offsets are a mechanism to contribute to certified projects in an attempt to “undo” climate damage done by individuals or businesses. We find this framing unhelpful, and instead argue that individuals and organizations should view offsets simply as a philanthropic contribution to a pro-climate project with an evidence-based approach to reducing emissions, rather than a way to eliminate their contribution to climate change.
Despite this, Giving Green has decided to make recommendations in the offset space, noting in the same article that:
In 2019 the voluntary carbon offset market transacted $330 million, and it looks poised for massive growth.
I would be very excited to see research by Giving Green into whether their approach of recommending charities which are, by their own analysis, much less cost-effective than the best options is indeed justified. This analysis has not yet been performed by Giving Green, although they stated that they were very confident it would turn out to be justified.
Specifically, I’m interested in estimates/analysis of:
What fraction of Giving Green’s donors will be people who come to Giving Green via EA or EA adjacent routes (where the most likely donation they would have made otherwise is to one of the FP top charities), and end up donating less effectively?
What fraction of people who would otherwise not have come across EA climate change analysis, but who come across Giving Green, would have donated more effectively if Giving Green had presented them with only those recommendations where there is consensus on their effectiveness?
Giving Green has not yet quantitatively modelled either aspect, although their impression of the climate donation space, which they view as comprised of several distinct groups, each with markedly different worldviews, gives them confidence in the portfolio approach.
Activism vs. Insider policy influence
From Giving Green’s “Recommendations” page:
Similarly to the above, there has not yet been any quantitative analysis by Giving Green of the tradeoff between making better, narrower recommendations at the risk of putting some people off, and making broader recommendations that appeal to more people but include some options which are much less effective than others. This is concerning: if organisations differ considerably in cost-effectiveness, then ranking them equally is potentially a significant mistake. This would be a concern even if Giving Green currently recommended only one option, or made multiple recommendations with very similar EV. As the discussion below lays out, however, I think that Giving Green’s current recommendations do in fact differ dramatically in cost-effectiveness.
In Giving Green’s approach to policy recommendations, they lay out three reasons behind their choice not to rank organisations with different approaches: quantitative cost-effectiveness analysis is uncertain, not everyone agrees on the correct theory of change, and donating to multiple organisations offers the opportunity to hedge against political uncertainty.
As discussed above, quantitative estimates potentially having wide error bars is not in itself a reason not to perform the analysis. It is possible that the difference between organisations is clearly larger than the size of the error. Secondly, the process of producing quantitative models forces an organisation to be explicit about their reasoning, in a way which makes it much easier to analyse and respond to.
The existence of disagreement is similarly not, on its own, reason enough to eschew quantitative analysis, even if that disagreement is about political futures and/or moral weights. As a stand-out example of how the latter can be handled, see GiveWell’s work. As for political uncertainty, considering the way different political futures might affect the effectiveness of potential recommendations is exactly the sort of analysis it would be great to see from Giving Green.
Hedging has been the subject of much previous discussion, so I won’t go into much detail here, other than to say that in general hedging arguments are not sufficient alone to justify spreading donations across options with significant differences in EV.
The Sunrise Movement Education Fund (TSM)
The description of TSM as a “High-Potential” organisation (as opposed to Clean Air Task Force (CATF), which is described as a “Good Bet”, and is one of the organisations recommended by Founders Pledge), implied to me that Giving Green’s position was that the choice with the highest EV currently is CATF but that it is worth presenting both options.
When I asked Giving Green about this, they made it clear that this was not the case. They believe that a direct comparison of the cost effectiveness of CATF and TSM would be too uncertain to be meaningful, and hence they fully endorse donations to either. In the Vox piece linked above, Dan Stein explicitly states:
I would push for a two-part strategy, because I think the way policy gets made is through these insider-outsider coalitions
There are strong reasons, laid out below, to believe that TSM does indeed significantly underperform CATF. They fall under two key themes: low confidence in marginal impact, and uncertainty about the sign of the impact.
Low confidence in marginal impact:
The case for donations to TSM being impactful on the margin feels thin; The Sunrise Movement has thousands of volunteers and is not obviously funding-constrained. Similarly, within the field of climate change, progressive climate activism hardly seems neglected. If anything, grassroots climate activism has been the single most visible feature of the Western fight against climate change in the last few years.
Giving Green’s ITN analysis of US policy change ranked climate activism as the most neglected area, but the justification they provide is based on data from six years ago:
Between 2011-2015, the largest donors in environmental philanthropy allocated about 6.9% of all their funding to grassroots activism and mobilization efforts, which are generally not as well funded as “insider” methods such as campaigns and lobbying
Giving Green do not discuss the impact on neglectedness of the large and rapidly growing number of volunteers for TSM. More importantly, there is no mention of how different the activism landscape looks now compared to 2015. As one concrete illustration of how different progressive activism looks in a wider sense, Greta Thunberg is a household name. Her first protest was in 2018.
Uncertainty about the sign of impact:
Even in a world where potential CATF and TSM donors are different enough that recommending TSM does not meaningfully impact donations to CATF, and where marginal donations to TSM help them achieve their goals, this does not mean the recommendation of TSM is necessarily good in expectation.
More worryingly, TSM’s explicit strategy of attempting to polarise the debate rather than looking for consensus seems like it could backfire extremely easily.
Making climate change a partisan issue might look promising in the short term, given the current Democratic trifecta, though even there the wafer-thin majority and the existence of the filibuster somewhat dampen the case. Over even the medium term, however, such an approach carries an obvious and potentially very large downside. This would be concerning anyway, but given the proven track record of groups like CATF at achieving exactly the sort of bipartisan consensus that political polarisation could permanently damage, it seems very unwise to recommend both.
The potentially negative effects highlight again the advantage of a quantitative model over merely picking “the best of each different approach”. It is of little use for an intervention to be the most promising within its particular strategy if there is a significant chance of that strategy being actively harmful. It is notable that CATF, Giving Green’s other policy recommendation, which was not only first recommended by Founders Pledge but has since also been recommended by SoGive and Legacies Now, does not pose a significant downside risk, strongly indicating that it does better in expectation.
While not directly related to the concerns discussed above, a choice of language in the TSM writeup is also concerning. Giving Green refers to TSM’s “non-partisan get-out-the-vote activities”. While one could argue on a technicality that these activities were non-partisan, writing in this way risks making Giving Green appear either naive or disingenuous. The Sunrise Movement Education Fund is registered as a non-partisan arm of the broader, and obviously partisan, Sunrise Movement; the line doesn’t feel integral to the analysis, and cutting it seems like a good idea.
BURN stoves
Here, again, I have split the discussion into two sections. The first describes why I think the strength of the RCT evidence has been overstated; the second details why extremely strong evidence would be needed to recommend BURN given the specifics of how offsets work in this case:
Concerns about the presentation of RCT evidence:
From Giving Green’s recommendation:
[We recommend] BURN stoves on the weight of strong Randomized Controlled Trial (RCT) evidence in support of the causality of emissions reductions, as well as demonstrated co-benefits.
The single RCT being referred to studied the effect of different pricing systems on willingness to pay. While evidence was also collected on fuel use, and a 154-person subsample was studied 18 months later, calling this “strong RCT evidence [of emissions reductions]” risks appearing sloppy or even deliberately disingenuous. When people hear “Strong RCT evidence in support of X”, it is reasonable for them to assume that the primary aim of the RCT was to investigate X, and this is not the case.
Basing the analysis on a single RCT and then referring to this as “strong RCT evidence” does not appear consistent with the norms of other evaluators in the EA space (I believe this would be atypical for GiveWell, and can confirm that it would be atypical for SoGive). All the more so when the rest of the evidence base is highly heterogeneous.
Giving Green does not recommend cookstoves in general due to a wider review of the evidence:
We do not feel comfortable recommending cookstove offsets in general, as the RCT literature shows that the required assumptions are frequently not satisfied.
When I asked Giving Green about this recommendation, they replied that the wider literature on cookstoves is heterogeneous, rather than outright negative. As the two principal uncertainties about the usefulness of cookstoves concern fuel use reduction and long-term adoption, and the RCT above provides evidence about both, they feel it makes a strong case for BURN. Following this exchange I am less skeptical about BURN than I was; however, I still think there is an important difference between evidence about X extracted from an RCT studying Y, and an RCT directly about X.
As well as the concern about transparency above, I note that picking the best looking intervention from a heterogeneous set makes you particularly susceptible to the optimiser’s curse, and that multiple hypothesis testing is a nontrivial risk even when dealing with RCTs directly investigating variables of interest. Most importantly, for the reasons detailed below, the evidence would need to be extremely strong in order for BURN to be worth recommending overall.
Specifics of donating to BURN:
Even ignoring the concerns above, there is no concrete mechanism by which donations to BURN will lead to more cookstoves being sold. BURN themselves state donations will be
used to invest in research and development, as well as to fund certain aspects of customer engagement, branding, and warranty services.
There’s a plausible causal mechanism here which may lead to more stoves being sold, but isn’t Giving Green’s model of carbon offsets that people prefer them due to increased certainty? Again, if this was modelled quantitatively it would be useful to see, but without such modelling the recommendation is hard to understand.
Furthermore, BURN is a company, not a charity, so there is less recourse to ensure that the marginal effect of a donation is to do good (as opposed to increasing profits).
BURN plausibly saves its customers money in the long run. It was unclear to me whether this was the primary reason behind the recommendation, as the approach to recommending offsets claims the following:
Giving Green only uses GHG reductions to determine which offsets to recommend, and therefore it is not necessary for an offset to have co-benefits to gain our recommendation. However, as many offset purchasers would like to buy offsets with co-benefits, we highlight them in the analysis of our recommended offsets.
However, BURN’s endorsement on the Carbon Offsets page has the following label:
Decrease emissions and save families money
Giving Green confirmed when asked that this was not a factor in BURN’s recommendation.
Climeworks
Not mentioning the substantial funding pipeline Climeworks has from Stripe seems like a significant oversight, especially if (as discussed below) the principal reason to fund Climeworks is that they might be better in the future. I expect almost all of the value of offsetting with Climeworks right now to be in marginally increasing the chance they end up as a successful company, given the tiny amount of carbon sequestered for the price. Stripe’s involvement seems to notably reduce the risk that they fail due to lack of funds, which is the main lever that buying offsets has to pull.
Giving Green confirmed that the long-term effects of donations to Climeworks were the principal reason for the recommendation, however they did not agree with me that these effects being positive were conditional on Climeworks’s survival and eventual cost competitiveness. They argued that donations were able to send a price signal about permanent carbon capture which was important even if Climeworks ultimately failed.
Giving Green state:
The main drawback of Climeworks is that it is currently very expensive (around $1000/ton) to remove carbon relative to other options. This may be justified by the fact that supporting Climeworks will hopefully go toward reducing the cost of their frontier carbon-removal technology. (emphasis mine)
Giving Green confirmed that they have not attempted to model the expected value of the various long-term effects discussed above, or even the probability of Climeworks ultimately being successful. Some public forecasts on related topics do exist. Metaculus currently puts the chance of Climeworks’ pricing dropping below $50/T by 2030, conditional on its survival, at 3%. Climeworks itself only has a long-term price target of $100-200/T, though it is not clear whether this is adjusted for inflation. Metaculus currently puts the chance of Climeworks still existing in 2030 at 1 in 3.
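Putting the two Metaculus forecasts quoted above together gives a rough unconditional figure (a back-of-envelope sketch; it assumes the two forecasts can simply be multiplied, with no dependence beyond the explicit conditioning):

```python
# Figures taken from the Metaculus forecasts cited above.
p_alive_2030 = 1 / 3            # chance Climeworks still exists in 2030
p_cheap_given_alive = 0.03      # chance of <$50/T by 2030, given survival

# Unconditional chance Climeworks offers <$50/T offsets by 2030:
# P(cheap) = P(cheap | alive) * P(alive)
p_cheap = p_cheap_given_alive * p_alive_2030
print(f"P(<$50/T by 2030) = {p_cheap:.1%}")  # 1.0%
```

On these forecasts, the unconditional chance of Climeworks reaching $50/T by 2030 is on the order of 1%, which is the kind of number any expected-value case for the recommendation would need to engage with.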
Finally, it is noteworthy that directly purchasing offsets which are overpriced in immediate terms is not the only way to try to positively affect the future of the offset market. Carbon180, another Founders Pledge recommendation, focuses on policy advocacy, business engagement, and innovation support for carbon removal/negative emissions technology.