Impact Evaluation in EA
Summary
Given EA’s history and values, I’d have expected impact evaluation to be a distinguishing feature of the movement. In fact, impact evaluation seems fairly rare in the EA space.
There are some things specific actors could do for EA to get more of the benefits of impact evaluation. For example, organisations that don’t already could carry out evaluations of their impact, and a well-suited individual could start an organisation to carry out impact evaluations and analysis of the EA movement.
Overall I’m unsure to what extent more focus on impact evaluation would be an improvement. On the one hand, establishing impact is challenging for many EA activities and impact evaluation can be burdensome. On the other hand, an organisation’s historic impact seems very action-relevant to its future activities and current levels of impact evaluation seem low.
What Is Impact Evaluation?
Over the last year I’ve been speaking to EA orgs about their impact evaluation. By impact evaluation I mean setting up a theory of change (ToC), choosing metrics and methods of evaluation, and carrying out evaluations. Impact evaluation can be done internally, by a funder, or by another external evaluator.
Why is Impact Evaluation Important?
Whereas companies have a clear metric to understand their success (profit), the social impact of a nonprofit is much harder to see. As a result, if a nonprofit wants to have a good sense of its success then it’s likely to need to make an explicit assessment of its impact. For this reason impact evaluation is often viewed as a core part of strategy and operations for nonprofits, particularly in the Global Health and Development space.
Concretely, the main benefits of impact evaluation to an organisation are:
Confidence that targeted impacts are achieved
Decision-making being more closely tied to desired social impact (including organisational alignment)
Identifying changes to activities to increase impact and mitigate harms
Sharing progress with stakeholders, and highlighting successes publicly
I’d expect Impact Evaluation to be quite common in EA
Given the history of EA, I’d have expected impact evaluation to be quite common in the EA movement. One early EA message was something like “even programs that sound good and are implemented by people with the best intentions can be ineffectual or even harmful, and you might not realise this until you rigorously look into the outcomes”.
EA also has values and norms that align with impact evaluation. The movement has a focus on rigour, quantification, and impact. It has a reputation for making use of cost-effectiveness analysis and placing weight on ‘getting to the truth of the matter’, and there is a culture of actually caring about good outcomes.
Impact evaluation is fairly rare in EA
My impression is that impact evaluation is actually quite rare in the EA space, in the following ways:
As far as I can tell many organisations in EA, perhaps somewhere between 30 and 70% (outside of GHD work), including the larger ones:[1]
Don’t have explicit theories of change
Don’t carry out or publish assessments of their impact
Don’t have explicit internal functions focussing on impact evaluation
There aren’t established and sophisticated ways in the movement to evaluate the impact of many common EA activities, such as movement-building and policy change
To be clear, there definitely are evaluation efforts in EA, such as:
Many orgs do publish impact reports with meaningful information and detailed metrics
Funders carry out work that might be considered impact evaluation to support grants
Rethink Priorities carries out the EA survey annually, and there have been other surveys such as OP’s 2020 survey and the CEA events team’s 2023 survey
Individuals have occasionally explored early methodological approaches to impact assessment for various EA areas (e.g. see examples here and here)
There are multiple charity evaluators (GiveWell, GWWC, Founders Pledge, ACE, EA Funds, Giving Green, potentially others) focussed on impact evaluation for funders
Culturally, organisations seem motivated by impact, and it is common to carry out cost-effectiveness calculations and reason in social impact terms when making decisions
But overall I’m still surprised by how low the level of impact evaluation in EA is.
Potentially justified reasons for this
This might be the optimal level of impact evaluation. Some potentially well justified reasons for the current level are:
It’s challenging for many EA activities
The impact of many EA activities is indirect, hard to see and/or a long way in the future, making evaluation challenging. This increases the burden of impact evaluation work, and reduces the pay-off. Some organisations I spoke to have attempted this work in the past but found it too challenging without enough reward, so reasonably shelved it for the time being. (This is also the most obvious explanation for why impact evaluation is much more common for GHD organisations than for e.g. policy, AI safety and movement-building organisations.)
Orgs are busy
Many organisations I spoke to expressed a desire to have better impact evaluation and/or a theory of change, but it was on a long list of things to do.
Orgs might have strong priors about activities
Organisations might have strong views that their work is worth carrying out and expect that the evidence from impact evaluation would be weak, so that the results of an evaluation would be unlikely to change their focus.
Some low- to medium-cost opportunities
Here are some actions that different actors could take if they wanted to do or encourage more impact evaluation:
Strengthen internal evaluations – organisations above a certain size (e.g. >10 employees) could carry out a minimal level of impact evaluation, including:
Creating an explicit theory of change, with a description of what they do, what they hope it will lead to, and how they evaluate their success
Running internal impact evaluations to assess their social impact (and potentially publishing the results)
Having staff dedicated to impact evaluation or with it included in their duties
Leadership teams being invested in the results of evaluations
Carry out external evaluations – individuals who want to work on this problem and are well-suited to this type of work could:
Carry out impact evaluations for EA organisations (or provide advice on evaluations carried out internally)
They could focus on a specific cause area or sub-section of EA, since evaluation methods differ between them
Carry out analysis of the EA movement as a whole, including historic impact and potential risks and opportunities
Will MacAskill points out here that EA should potentially have an org focussed on identifying potential risks of harm.[2] To me the more natural idea would be an evaluation org focussed on assessing the progress of the EA movement more generally, since if this is done properly it would include potential harms.
This analysis could be distributed to EA org leaders, or published online. For analysis made public, it would be important to get the buy-in of central EA orgs and be considerate of PR risks.
Collect and centralise resources and improve methodology
An individual could:
Collect public impact evaluations and methodological resources in e.g. a wiki
Develop evaluation methods for common EA cause areas or interventions (this is perhaps best done while simultaneously carrying out actual evaluations)
CEA could refresh and deepen the public impact page
Conclusion
Overall, I’m fairly uncertain to what extent more focus on impact evaluation in EA would be an improvement.
On the one hand, impact evaluation is challenging for many EA activities and can carry a high resource burden, without a guarantee that it will affect decisions in significant ways.
On the other hand, the historic impact of activities seems very action-relevant to future activities, and current levels of evaluation seem low.
If someone was particularly excited and well-suited to work on impact evaluation in EA then that seems more unambiguously positive and I could imagine them being really useful to many organisations.
Thanks to Stephen Clare, Ben Clifford and Devon Fritz for providing comments on a draft of this post.
[1] I’m basing this on conversations I’ve had with ~20 orgs, and on quickly checking the websites of ~10 other prominent EA orgs for a public ToC or impact report. Organisations in the second set may carry out this work but not make it public.
[2] “Someone could set up an organisation or a team that’s explicitly taking on the task of assessing, monitoring and mitigating ways in which EA faces major risks, and could thereby fail to provide value to the world, or even cause harm.”
Thanks for writing this!
For me, the value of independent impact evaluation seems particularly clear, though I would agree that orgs doing it in-house is still usually better than nothing.
You mention difficulty, orgs being busy, and orgs having strong priors as possible reasons for the lack of impact evaluation. I’d speculate that financial cost is perhaps the largest factor. Orgs that want to have an impact evaluation, but can’t afford the time cost, could readily commission external evaluations (were finance no issue).[1] RP’s Surveys and Data Analysis Team has provided impact assessment for a couple of core meta orgs, and provided pro bono consultation on impact assessment to a larger number of orgs. I know our other departments have done cost-effectiveness models in other areas, and I see several other people mentioning in the comments that they do this. My impression is that people often dramatically overestimate the extent to which large EA orgs have sufficiently unlimited funding that they can just pay for anything they’d find valuable, without cost being an issue. But for small and medium-sized EA projects it seems particularly clear that they often could not afford to pay, even though many of them (in my experience) value impact assessment.
Related to both the points above, it seems to me that interest from funders is one of the biggest potential drivers of whether orgs do impact assessment. If funders wanted orgs to have external impact assessments, this would serve as a strong incentive for orgs. Funders could even consider a heuristic that projects receiving >$X a year should dedicate $Y to external impact assessment. Of course, for that to work, funders would need to provide commensurate additional funding.[2]
[1] Granted, commissioning an external impact evaluation still entails a non-trivial time cost, since the org needs to engage with and provide information to the external assessor for the evaluation to be useful.
[2] Anecdotally, I encounter lots of examples where orgs, of various sizes, are interested in receiving surveys or other private analyses which would help assess their impact, but can’t afford to pay for them.
Just chiming in to say that for EA Netherlands, financial cost is definitely a big factor. Another factor is that for a long time, we didn’t have a sufficiently established programme to evaluate. A third is that, until recently, we didn’t know anything about M&E other than ‘we should do an impact evaluation!’.
Fortunately, the second and third factors are beginning to change, so hopefully we’ll actually be able to commission something soon.
However, realistically, we’d only have a few thousand to spend, and I don’t know how much expertise that would get us. So, if there’s anyone in this thread who thinks they can help us given our low budget, please do reach out!
Thanks David!
I agree independence has advantages. OTOH I think there are also important advantages to internal impact evaluation: the results are more likely to be bought into internally, and important context or nuance is less likely to be missed. For making a theory of change specifically, I think it’s quite important this is done internally, usually. Overall I think the ideal setup would quite often be for organisations to have their own internal impact evaluation function.
And that’s interesting on funder interest. In a few cases, organisations I’ve spoken to have been able to get specific grants for impact evaluation. But orgs might also choose to reallocate their existing budget, without needing additional funding, if they consider impact evaluation an essential function. E.g. for a fixed budget, they might decide that they should be allocating at least 5% or 10% to impact evaluation. (But I guess this might be harder if it required pulling back on existing activities.)
I kind of see this as more at the org level than the funder level tbh, similar to any other spending decision facing an organisation. Perhaps that’s because I’m thinking most about the benefits to orgs themselves. But I definitely still agree that funder interest is a big driver.
Thanks for the reply!
I agree there are some advantages to internal evaluation. But I think “the results are more likely to be bought into internally” is, in many cases, only an advantage insofar that orgs are erroneously more likely to trust their own work than external independent work.
That said, I agree that the importance of orgs bringing important context and nuance (and just basic information) to the evaluation can hardly be over-stated. My general take here is that the ideal arrangement is for the org and external evaluators to work very closely on an evaluation, so they can combine the benefits of insider knowledge and external expertise.
I would even say that in those kinds of cases, it’s not extremely important whether the evaluation is primarily led by the org or primarily led by the external evaluator (so long as there’s still scope for the external evaluator to offer an independent, and maybe even dissenting, take on the org’s conclusions). I think people can reasonably disagree about how important it is that, in addition, the external evaluator is truly independent (i.e. ideally funded by an external funder, not selected and contracted by the org in question, which obviously risks biasing the evaluator).
I actually think it could be quite reasonable for an org to place more weight on an internal evaluation than on an external one, but apart from that I fully agree with all you say!
FWIW, I am amenable to being commissioned to do impact evaluation. Interested parties should contact me here.
Nice post!
I do agree there is a potential gap for more impact evaluation in the EA space, and that it is commonplace for many non-EA NGOs/organisations to be required to have a certain percentage of their programme budget set aside for monitoring & evaluation purposes… so it feels like something similar for EA organisations could be easily achieved.
A potential option, though it would need far more exploration, is a central EA organisation that is funded by 5% of all OP/GW/EA Funds grants. So if Open Phil gives a $1m grant, $50k is allocated to the central EA impact evaluation organisation, which then adds the recipient org to its list of orgs to work with and carries out an independent evaluation at some agreed point (depending on grant objectives etc.).
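To make the arithmetic concrete, here is a minimal sketch of that set-aside. The function name and the basis-point parameter are my own illustrative choices, not part of any existing funder process; the only figures taken from the proposal are the 5% rate and the $1m example.

```python
def evaluation_set_aside(grant_usd: int, rate_bps: int = 500) -> int:
    """Return the dollars set aside for the central evaluation org.

    rate_bps is the set-aside rate in basis points (500 = 5%), a
    hypothetical parameter used here to keep the arithmetic in integers.
    """
    if grant_usd < 0:
        raise ValueError("grant_usd must be non-negative")
    if not 0 <= rate_bps <= 10_000:
        raise ValueError("rate_bps must be between 0 and 10,000")
    return grant_usd * rate_bps // 10_000


# The example from the comment: a $1m grant at 5%.
set_aside = evaluation_set_aside(1_000_000)
print(f"${set_aside:,} goes to the evaluation org")
```

Under these assumptions the recipient keeps the remaining 95%, unless the funder tops up the grant to cover the set-aside.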
One thing I would stress in particular links to your point about the difficulty of doing M&E in several of the largest EA cause areas (esp. in the GCR space) that have very long (or potentially non-existent) feedback loops and unclear metrics to track. Rather than just accepting it’s too difficult to do impact evaluation, the focus should be on the process of decision making and reasoning in those organisations, which can act as a ‘best alternative’ proxy. This can then be evaluated through independent assessment of items such as theories of change and the use of decision methods like Bayesian belief networks of the route to impact/change.
I too am working on impact evaluation! Feel free to message me.
Thanks for creating this post! +1 to the general notion incl. the uncertainties around whether it is always the most impactful use of time. On a similar note, after working with 10+ EA organizations on theories of change, strategies and impact measurement, I was surprised by how much room there still is to prioritize the highest-leverage activities across the organization (e.g., based on the results of decision-relevant impact analysis). For example, at cFactual, I don’t think we have nailed how we allocate our time. We should probably deprioritize even more activities, double down even more aggressively on the most impactful ones and spend more time exploring new impact growth areas which could outperform existing ones.
Nice post! Relatedly, readers may want to check The value of cost-effectiveness analysis by David Thorstad.
Likewise. That being said, I also like doing impact evals! People are welcome to get in touch.
I have many years of experience with the Energy Efficiency policy analysis community where there is a very strong emphasis on impact evaluation. I also have a little experience seeing impact evaluations implemented in the international development community.
I see four reasons why impact evaluation is not as prevalent in EA as other professional communities:
(1) Impact evaluation can get very bureaucratized and lead to high administrative overhead expenses and organizational constraints which can make programs more costly to implement and less efficient in their operations;
(2) Impact evaluation, if it is to be quantitative, needs clear metrics to measure performance, and this requires consensus on goals and theories of change. While EAs have an open mind, they also have a fairly large diversity of opinions on the details of which specific near-term goals are highest priority. This diversity of opinion makes it difficult to settle on a singular theory of change and, without one, difficult to select clear performance metrics for many projects.
(3) Much of EA is focused on providing a low-overhead service to charitable donors so that their donations can have the most cost-effective impact. Since donors also have a diversity of opinions with respect to specific objectives and goals, the focus is on cost-effectiveness calculations and transparency using a variety of metrics. Thus impact is viewed from the perspective of a diversity of donors who may have different impact goals. Thus in practice impact is often evaluated in terms of “funds directed” which is the response that donors have in terms of the EA movement meeting their image of good cost-effectiveness.
(4) Maximizing marginal cost-effectiveness may be better and more efficient than doing impact evaluations: If there is a supply of “world improving opportunities” and a demand of “wanting to improve the world”, then maximizing the marginal cost-effectiveness of “wanting to improve the world” purchases (i.e. EA donations) should maximize the net surplus value produced from “world improving activities” in general, given the constrained resources available to those wanting to pay for a better world. This is just the law of supply and demand applied to people donating to make a better world. Thus the EA approach of focusing on marginal cost-effectiveness may be better and more economically efficient than performing lots of impact evaluations of specific EA activities.
In short, the “impact evaluation” of the EA movement is in terms of maximizing marginal cost-effectiveness, maximizing the number of people involved and maximizing the funds directed to the causes that meet maximum marginal cost-effectiveness criteria. This may not require a special impact evaluation effort with some specific, consensus “theory of change,” but it may simply require a bit of movement statistics collection and compilation to track the cost effectiveness, the amount of money directed, and the number of people participating in the EA movement.
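As a rough illustration of point (4), here is a toy sketch of allocating a fixed budget by marginal cost-effectiveness. Everything in it is hypothetical: the opportunity names, the value-per-dollar figures and the funding caps are made up, and it assumes each opportunity delivers a constant value per dollar up to its room for more funding, which real opportunities may not.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    name: str
    value_per_dollar: float  # hypothetical marginal cost-effectiveness
    room_for_funding: int    # dollars it can absorb at that rate


def allocate(budget: int, opportunities: list[Opportunity]) -> dict[str, int]:
    """Greedily fund the highest value-per-dollar opportunities first."""
    allocation: dict[str, int] = {}
    remaining = budget
    ranked = sorted(opportunities, key=lambda o: o.value_per_dollar, reverse=True)
    for opp in ranked:
        if remaining <= 0:
            break
        spend = min(remaining, opp.room_for_funding)
        allocation[opp.name] = spend
        remaining -= spend
    return allocation


# Made-up opportunities, purely for illustration.
opportunities = [
    Opportunity("A", value_per_dollar=3.0, room_for_funding=40_000),
    Opportunity("B", value_per_dollar=5.0, room_for_funding=30_000),
    Opportunity("C", value_per_dollar=1.0, room_for_funding=100_000),
]
plan = allocate(60_000, opportunities)
print(plan)
```

The sketch takes the value-per-dollar figures as given. Where those figures come from, and how reliable they are, is the question the post and the reply below are concerned with.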
Thanks a lot Robert!
On 1) great point, I agree.
On 2) I think this shouldn’t be a barrier, since the first step in impact evaluation is establishing a theory of change, and any individual organisation of a given maturity should have a clear picture of this, and consensus on their goals. (I’m not advocating for a singular ToC across the whole EA movement, just for individual organisations)
On 3) I agree this is a factor for evaluations for funders. But I think organisations should carry out internal impact evaluation according to their theory of change, so I think they should still be able to have clear and established goals. They can say to funders “Our Theory of Change is X, and along those lines our impact has been Y”. Funders might also carry out their own evaluations in terms of the goals they most care about.
On 4), I agree there’s a trade-off between spending on direct activities and impact evaluation. Of course the value of impact evaluation is that it may improve the effectiveness of the direct activities. So I don’t think the existence of the trade-off is in itself a reason that impact evaluation shouldn’t be prevalent.