Just a quick comment that I don’t think the above is a good characterisation of how 80k assesses its impact. Describing our whole impact evaluation would take a while, but some key elements are:
We think impact is heavy-tailed, so we try to identify the most high-impact ‘top plan changes’. We do case studies of what impact they had and how we helped. This often involves interviewing the person, and also people who can assess their work. (Last year these interviews were done by a third party to reduce desirability bias.) We then do a rough Fermi estimate of the impact.
We also track the number of a wider class of ‘criteria-based plan changes’, but then take a random sample and make Fermi estimates of their impact so we can compare their value to the top plan changes.
If we had to choose a single metric, it would be something closer to impact-adjusted years of extra labour added to top causes, rather than the sheer number of plan changes (a rough sketch of that kind of estimate follows this comment).
We also look at other indicators like:
There have been other surveys of the highest-impact people who entered EA in recent years, evaluating what fraction came from 80k, which lets us estimate the percentage of the EA workforce that came via 80k.
We look at the EA survey results, which lets us track things like how many people are working at EA orgs and entered via 80k.
We use the number of calls as a lead metric, not an impact metric. Technically it’s the number of calls with people who made an application above a quality bar, rather than the raw number. We’ve checked, and it seems to be a good proxy for the number of impact-adjusted plan changes that result from advising.
This is not to deny that assessing our impact is extremely difficult and ultimately involves a lot of judgement calls (we were explicit about that in the last review), but we’ve put a lot more work into it than the above implies: probably around 5-10% of team time in recent years.
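To make “impact-adjusted years of extra labour” concrete, here is a minimal sketch of that kind of Fermi estimate. The categories, counts, and weights below are hypothetical placeholders, not 80k’s actual data or methodology.

```python
# Hypothetical Fermi estimate of "impact-adjusted years of extra labour".
# Every number below is an illustrative placeholder, not 80k's real data.

# category: (number of plan changes, years of extra labour per change,
#            impact weight relative to a typical plan change)
plan_changes = {
    "top":            (10, 20, 50.0),  # rare, heavy-tailed, very high impact
    "criteria_based": (300, 5, 1.0),   # wider class, valued via sampled Fermi estimates
}

def impact_adjusted_years(changes):
    """Sum years of labour, weighted by each category's impact adjustment."""
    return sum(count * years * weight for count, years, weight in changes.values())

print(f"Impact-adjusted years: {impact_adjusted_years(plan_changes):,.0f}")
# With these placeholders: 10*20*50 + 300*5*1 = 11,500
```

The point of weighting rather than counting is visible in the placeholder output: a handful of top plan changes can dominate a much larger number of ordinary ones.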
I think similar comments could be made about several of the other examples, e.g. GWWC also tracks dollars donated each year to effective charities (now via EA Funds) and total dollars pledged. They track the number of pledges as well, since that’s a better proxy for the community-building benefits.
As @Benjamin_Todd mentioned, GWWC also does report on pledged donations and donations made.
However, Giving What We Can’s core metric on a day-to-day basis is the number of active members who are keeping their pledge. This is in part because the organisation’s aim is “to create a culture where people are inspired to give more, and give more effectively” (a community-building project), and we see pledges as a more correlated and stable reflection of that than the noisy donation data (which a single billionaire can massively skew in any given year). This aim is in service of the organisation’s mission to “Inspire donations to the world’s most effective organisations”. We believe this mission is important in making progress on the world’s most pressing problems (whatever they might be throughout the lifetime of the members).
Because GWWC is cause-diverse, and is not the authority on impact evaluations of the charities its members donate to, it is hard to translate this into exact impact numbers across our membership. We do, however, regularly look at where our members are donating, and in our impact analysis we try to benchmark this to the equivalent money donated to top charities (a rough sketch of that benchmarking follows this comment). We do plan to improve our reporting of impact where it is possible (e.g. donations to GiveWell’s top charities); however, this will never be a complete picture.
Note: I did not choose the mission of GWWC and do not speak on behalf of the board or the founders. However, this is my best understanding of the mission and the core metrics as required for my day-to-day operations. It is also a mission I believe to be impactful not just through direct donations moved but through indirect factors (such as moral circle expansion, movement building, and changing the incentives for charities to be more impact-focused because they see more donors seeking impact).
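To illustrate the benchmarking mentioned above, here is a minimal sketch that converts donations across causes into top-charity-equivalent dollars. The causes, amounts, and effectiveness multipliers are invented for illustration and are not GWWC’s actual figures.

```python
# Hypothetical benchmark: express member donations as the equivalent amount
# given to a reference top charity. All amounts and multipliers are invented.

donations = {
    "GiveWell top charities": 4_000_000,
    "animal welfare":         1_500_000,
    "climate":                  800_000,
}

# Effectiveness relative to the reference top charity (1.0 = parity).
relative_effectiveness = {
    "GiveWell top charities": 1.0,
    "animal welfare":         0.8,
    "climate":                0.5,
}

equivalent = sum(amount * relative_effectiveness[cause]
                 for cause, amount in donations.items())
print(f"Top-charity-equivalent donations: ${equivalent:,.0f}")
# 4,000,000*1.0 + 1,500,000*0.8 + 800,000*0.5 = $5,600,000
```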
Indeed. I can speak to Founders Pledge, which is another of the orgs listed here:
Founders Pledge focuses on the amount of money pledged and the amount of money donated, rather than on the impact those donations have had out in the world.
While these are the metrics we report most prominently, we do of course evaluate the impact these grants are having.
Thanks – does Founders Pledge publish these impact evaluations? Could you point me to an index of them, if so?
https://founderspledge.com/stories/2020-research-review-our-latest-findings-and-future-plans
Thanks… I don’t see impact evaluations of past FP money moved discussed on that page.
Are you pointing to the link out to Lewis’ animal welfare newsletter? That seems like the closest thing to an evaluation of past impact.
Impact = money moved * average charity effectiveness. FP tracks money moved to their recommended charities, and their published research covers the effectiveness of those charities and why they were recommended.
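A minimal sketch of that identity, summed over a portfolio; the money-moved figures and effectiveness scores below are hypothetical placeholders, not FP’s numbers:

```python
# Impact = money moved * average charity effectiveness, summed per charity.
# Money-moved figures and effectiveness scores are hypothetical placeholders.

portfolio = [
    # (charity, money moved in $, effectiveness in 'units of good' per $)
    ("Charity A", 5_000_000, 0.9),
    ("Charity B", 2_000_000, 1.4),
]

impact = sum(moved * effectiveness for _, moved, effectiveness in portfolio)
print(f"Estimated impact: {impact:,.0f} units of good")
# 5,000,000*0.9 + 2,000,000*1.4 = 7,300,000
```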
Forward-looking estimation of a charity’s effectiveness is different from retrospective analysis of that charity’s track record / use of FP money moved.
I agree—but my impression is that they consider track record when making the forward-looking estimates, and they also update their recommendations over time, in part drawing on track record. I think “doesn’t consider track record” is a straw man, though there could be an interesting argument about whether more weight should be put on track record as opposed to other factors (e.g. intervention selection, cause selection, team quality).
I feel like I’m asking about something pretty simple. Here’s a sketch:
FP recommends Charity Z
In the first year after recommending Charity Z, FP attributes $5m in donations to Charity Z because of their recommendation
The next time FP follows up with Charity Z, they ask “What did you guys use that $5m for?”
Charity Z tells them what they used the $5m for
FP thinks about this use of funds, forms an opinion about its effectiveness, and writes about this opinion in their next update of Charity Z
GiveWell basically does this for its top charities.
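For concreteness, here is a minimal sketch of the record such a follow-up loop might keep for each recommendation cycle. The field names and figures are hypothetical, not a description of FP’s or GiveWell’s actual systems:

```python
from dataclasses import dataclass

# Hypothetical record for one retrospective follow-up cycle. The fields and
# figures are illustrative, not FP's or GiveWell's actual process.

@dataclass
class FollowUp:
    charity: str
    attributed_donations: float    # e.g. the $5m attributed to the recommendation
    reported_use_of_funds: str     # what the charity says the money funded
    retrospective_assessment: str  # evaluator's opinion of that use of funds

cycle = FollowUp(
    charity="Charity Z",
    attributed_donations=5_000_000,
    reported_use_of_funds="scaled up core programme in two regions",
    retrospective_assessment="spent as planned; cost per output near estimate",
)
print(cycle)
```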
I asked someone from our impact analytics team to reply here re FP, as he will be better calibrated to share what is public and what is not.
But in principle what Ben describes is correct: we have assessments of charities from our published reports (incl. judgements from partners such as GiveWell), and we relate those to money moved. We also regularly update our assessments; charities get comprehensively re-evaluated every two years or so, with many adjustments in between when things change (funding gaps, political circumstances).
So this critique seems to incorrectly equate headline-figure reporting with all the metrics we and others are optimizing for.