It is hard for me to phrase this question exactly, so here’s the rough idea; hopefully you can get the picture of my conundrum.
Example
An EA starts a charity by themselves, sets up everything themselves using their time and zero financial setup costs. They hire a full-time contractor who trains volunteers to become peer support group leaders within their communities.
Say it takes X hours of the contractor’s time to train one volunteer, costing the charity $500. The volunteer then becomes self-sufficient and continues to volunteer for 20 years. In that time frame, they interact with 2000 people and directly help them increase their life satisfaction temporarily (with that increase sustained for a finite period in those people’s lives), for a total gain of 4000 WELLBYs.
Say that the contractor trains 100 such volunteers (costing $50k) who end up delivering the same identical output, so a total of 400,000 WELLBYs has been gained. The charity then wraps up and dissolves.
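The arithmetic of the example can be sketched as follows. The volunteer-time parameters (`volunteer_hours_each`, `usd_per_volunteer_hour`) are hypothetical additions, not part of the example; they are included only to show how sensitive the headline ratio is to what gets counted as a cost.

```python
# Figures taken from the example above.
COST_PER_VOLUNTEER_USD = 500
WELLBYS_PER_VOLUNTEER = 4_000
NUM_VOLUNTEERS = 100


def cost_effectiveness(volunteer_hours_each=0.0, usd_per_volunteer_hour=0.0):
    """Return (total WELLBYs, total cost in USD, WELLBYs per USD).

    The two parameters are hypothetical: they put a shadow price on
    volunteer time, which the example itself does not specify.
    """
    total_wellbys = NUM_VOLUNTEERS * WELLBYS_PER_VOLUNTEER
    cash_cost = NUM_VOLUNTEERS * COST_PER_VOLUNTEER_USD
    time_cost = NUM_VOLUNTEERS * volunteer_hours_each * usd_per_volunteer_hour
    total_cost = cash_cost + time_cost
    return total_wellbys, total_cost, total_wellbys / total_cost


# Cash costs only, as in the example.
naive = cost_effectiveness()
# Made-up illustration: 1,000 hours per volunteer valued at $10 per hour.
with_time = cost_effectiveness(volunteer_hours_each=1_000, usd_per_volunteer_hour=10)

print(naive)
print(with_time)
```

With cash costs only, the ratio is 8 WELLBYs per dollar; under the made-up shadow price it falls below 1, which is the sense in which the cost definition drives the result.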
Questions
How do you calculate the amount of good done by the EA in setting up this charity? My motivation for asking this is, it seems useful to be able to estimate the net contribution by setting up this charity in such a way that it can be compared to the opportunity cost of doing other things instead. And so far I can’t wrap my head around a framework for doing this in a way that doesn’t have severe risks of double-counting contributions made.
Who did what good and how much? If we naively assume that the contractor and volunteer were doing net zero good before they took on their roles, would you say the EA did 400k WELLBYs of good? Or from the contractor’s perspective, the EA only gave a little push but essentially did nothing, and the actual good was all done by the contractor? Or the volunteers could think the same about the contractor. Or the participants could think the same about the volunteers.
What is considered the cost of this intervention program? Is it really just $50k as some analyses might use? Is it really okay to ignore the volunteers’ time? My confusion on this is that if we claim that the cost-effectiveness is 8 WELLBYs per dollar, this hides externalized costs that can’t arbitrarily scale (for a limited pool of volunteers), while also claiming the credit for the volunteers’ contributions “for free” (this seems wrong if the charity merely convinced the volunteer to switch from another charity to this one).
What if one particular recipient of the intervention made the statement “Your intervention has directly inspired me to set up my own charity with the exact same structure as yours but in a different region”, and they end up contributing another 400,000 WELLBYs? Did the EA contribute 0% to that output, 100%, or somewhere in between? Any answer that’s not 0% seems intuitively wrong to me, and yet that seems to imply that the EA probably caused far less than 400k WELLBYs of gain by starting this charity, even if the EA would otherwise have slept in a cryo chamber instead for the duration of this project.
What if we adjust some assumptions such as the contractor and the volunteers having non-zero counterfactual output? How would a calculation framework factor that in?
I feel like this type of scenario comes up a lot if our chosen way to do good is to enable other people, e.g. provide resources for better productivity, life choices, or decision-making.
My preferred answer is: calculate (or approximate) the Shapley value: <https://forum.effectivealtruism.org/posts/XHZJ9i7QBtAJZ6byW/shapley-values-better-than-counterfactuals>. But there is some controversy; see the comment section.
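As a rough illustration, here is a minimal sketch of a Shapley calculation for the example in the question. The modelling choices are assumptions, not something the question specifies: the three players are the EA, the contractor, and the volunteers treated as a single bloc, and the 400,000 WELLBYs are produced only if all three take part (the naive "zero counterfactual output" case).

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orders) for p, t in totals.items()}


# Assumed model: volunteers are one bloc, and everyone is strictly necessary.
PLAYERS = ["EA", "contractor", "volunteers"]


def wellbys(coalition):
    """400,000 WELLBYs only if all three players participate, else 0."""
    return 400_000 if coalition == frozenset(PLAYERS) else 0


credit = shapley_values(PLAYERS, wellbys)
for player, share in credit.items():
    print(player, float(share))
```

Under these assumptions each player is credited with a third of the total, and the shares sum to 400,000, so nothing is double-counted. Non-zero counterfactual output would be modelled by giving sub-coalitions non-zero values in `wellbys`.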
I’m usually skeptical of using Shapley values, but I think this is exactly the right application for them: credit assignment.
That being said, it’s also not obvious that Shapley values are the best way to assign credit, but I’m not aware of any better formal alternatives. I can think of things I might want to do differently, but it’s hard to formalize and justify any specific approach. I haven’t thought a lot about this, though.
Like even if two specific people were necessary to get something done, neither was replaceable, and neither would have done anything valuable otherwise, I might want to give more credit based on individual outputs, inputs/effort and initiative. But Shapley tells you to split evenly, right? Or maybe can you apply Shapley across all of the (subtask, individual) pairs, and then, for each individual, sum the Shapley values across subtasks?
Also, I guess Shapley values can assign credit to people who didn’t actually do anything, but would have otherwise, and that seems kind of wrong to me. But you can avoid that by just not including them in the model as coalition members.
I wonder if replacing the third property of Shapley values would lead to basically a generalized version of Shapley values, with equal splits being a special case. Maybe this could address those two implications about unequal agents and members who did nothing.
The example with unsuccessful applicants getting credit seems weird but also necessary to me, because they do contribute to the outcome in terms of market competition, but maybe they need a lower weighting than Shapley gives, as you suggest, since they are individually not “critical”.
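The point about unsuccessful applicants can be made concrete with a toy game. This is a sketch under made-up assumptions: an organisation (`org`) needs exactly one of two interchangeable applicants, `A` or `B`, to produce 1 unit of value, and `A` is the one actually hired.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orders) for p, t in totals.items()}


def role_filled(coalition):
    """1 unit of value if the org plus at least one applicant is present."""
    has_applicant = bool({"A", "B"} & coalition)
    return 1 if "org" in coalition and has_applicant else 0


# B applied but was not hired; Shapley still gives B credit.
with_b = shapley_values(["org", "A", "B"], role_filled)
# Dropping B from the model removes that credit and changes the split.
without_b = shapley_values(["org", "A"], role_filled)

print(with_b)
print(without_b)
```

With `B` in the model, the split is 2/3 for the org and 1/6 for each applicant, even though `B` did nothing. Leaving `B` out gives the org and `A` half each, so the choice of who counts as a coalition member changes everyone's credit, not just `B`'s.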
The use case I’m trying to get my head around is comparing the case of starting a new organization vs donating to AMF.
It seems easy to use the normal counterfactual calculation to severely inflate the ROI when making a pitch to investors. Is that solved by only arguing for the ROI of the intervention the charity is using? I’m not sure.
If I start a charity with an intervention with identical outcomes and cost effectiveness to AMF, that seems pretty impressive and good, but have I actually made zero counterfactual benefit because AMF haven’t saturated their ability to make use of donations? It might be net negative if I require more externalized costs than AMF, or conversely it might be net positive if I win funding that other equally good charities would never have won.
If you’re just considering what you should personally do, then you don’t need to worry about how to assign credit to yourself and other agents; just make sure you properly capture what other agents would be doing in your counterfactuals. I think GiveWell does this when assessing the impact of charities that depend on funding from sources other than those going through the charity, like governments partnering with the charity, because they’d plausibly be doing something else useful with that funding, too.
Starting an AMF clone with similar (or worse) direct cost-effectiveness seems counterfactually low or even negative impact to me, if it means your funding would have otherwise (mostly) gone to AMF. And even if you aren’t competing much for funding, it might be more efficient to just fundraise for AMF instead of duplicating overhead and risking doing far worse than AMF because of relative inexperience with the intervention. Or, just start a different charity with the potential to be more cost-effective than AMF. Shapley values, if you include AMF as an agent (or multiple agents) in the coalitional game, require you to consider the counterfactual where neither AMF nor the AMF clone “cooperate”, i.e. neither distributes more bednets, and the AMF clone gets counterfactual impact relative to that counterfactual. But that’s not a reasonable counterfactual, because AMF isn’t going to just not distribute more bednets.
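To illustrate the gap between the two approaches, here is a minimal sketch with made-up numbers. It assumes the clone competes entirely for AMF's funding, so the same 100 units of value are produced whether AMF, the clone, or both exist.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Average each player's marginal contribution over all join orders."""
    totals = {p: Fraction(0) for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orders) for p, t in totals.items()}


def bednet_value(coalition):
    """Hypothetical: the same 100 units as long as at least one charity exists."""
    return 100 if coalition else 0


shapley = shapley_values(["AMF", "clone"], bednet_value)

# Counterfactual impact of the clone, given that AMF operates regardless.
clone_counterfactual = (
    bednet_value(frozenset({"AMF", "clone"})) - bednet_value(frozenset({"AMF"}))
)

print(shapley)
print(clone_counterfactual)
```

Shapley credits the clone with 50 because one of the two join orders compares it against a world with no AMF at all, while its counterfactual impact given that AMF exists is 0.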
This would also be a case where I’d be disinclined to give credit to an AMF clone, if they’d be mostly competing for AMF’s funding (but not if they found another source AMF wouldn’t have been able to attract, even if they tried raising those funds for AMF). AMF was there first, and I don’t want to encourage people to start redundant charities. If a charity has low or negative counterfactual impact from their own perspective (if they modelled things accurately), then the amount of credit they should get is low or negative.
One way I’d want to think about credit assignments is: How should I assign credit in a way that leads to the best outcomes (in expectation, or whatever)? And I don’t think giving (much) credit to competing charity clones does this, if and because they could be doing much more counterfactually valuable things instead.
That being said, there could be some value in competition, e.g. pressure to be better. Also, starting a charity can be useful for gaining experience, as well as training staff for other future roles, but again, someone could aim higher than cloning AMF.
Under the stated assumptions about the AMF clone, I agree with your assessments about low or negative impact. Would you also say, on paper, that Malaria Consortium and AMF are similar enough to being clones that whichever came second probably has/had low impact?
This is a great question, and it’s worth pointing out that some of the same issues could apply to charities working on pretty different things, too, like Helen Keller International or New Incentives, or even totally different cause areas like mental health or animal welfare. When you start a new charity, you want to allow more funding to be spent at a higher marginal cost-effectiveness than otherwise in expectation, keeping in mind that you can adjust your programs, pivot or shut down when you don’t meet this bar. This holds regardless of how their programs or beneficiaries might differ. However, in practice, donors are often not cause-neutral or intervention-neutral and often don’t make allocations between causes based on cost-effectiveness, so there would be less competition over funding between causes than within causes, as well as less between charities working on the same intervention than across interventions (with similar cost-effectiveness). Also, there are other reasons to fund multiple things, e.g. see Open Phil’s post.
On AMF vs MC:
AMF and MC were founded in 2004 and 2003 respectively. This could have been too close together to tell if either would end up very cost-effective, so starting AMF after MC could have still made sense ex ante, because if MC didn’t go well, then AMF could pick up more, and if AMF didn’t go well, but MC did, then AMF could shut down (or pivot or get much less funding). AMF focused on bednets and has been a GiveWell Top Charity since 2011, while MC worked on multiple programs, including bednets and chemoprevention, and has only been a top charity since 2016, for its chemoprevention program. I think it’s reasonable to assume that AMF has a more cost-effective net program, maybe far better, and, based on what GiveWell believed over 2011-2015, AMF probably had a lot of expected counterfactual impact. (Mostly based on AMF’s Wikipedia page and MC’s Wikipedia page.)
In retrospect, MC could have shut down during the years AMF was a top charity but MC wasn’t or while both were top charities but AMF beat MC. But that probably would have been bad in the longer run, because MC has become more cost-effective than AMF, according to GiveWell. Also, having both might allow more funding to be spent at a higher bar for cost-effectiveness. AMF’s and MC’s programs are still different enough that if you are ambiguity averse or consider the possibility of a larger difference in cost-effectiveness between them in the future (so value of information), keeping both programs running now seems good (even if disproportionately funding whichever is more cost-effective). If the programs were the same, then you’d have much less uncertainty about how the two might differ in cost-effectiveness now and in the future, so there’d be less value to diversification.
AMF and MC both vary significantly in marginal cost-effectiveness based on where they can work next (AMF sheet, MC sheet) and GiveWell allocates funding dynamically between them (and other top charities) to maximize impact, through its Top Charities Fund. If it only had one to allocate to, then it would plausibly do far less good, because it would have to go further down in marginal cost-effectiveness.
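The claim that a single option forces the funder further down the marginal cost-effectiveness curve can be sketched with a greedy allocation. Everything here is hypothetical: the charities `A` and `B`, their funding gaps, and the value-per-dollar figures are made up for illustration and are not GiveWell's numbers or method.

```python
def allocate(budget, opportunities):
    """Fund opportunities in descending order of value per dollar.

    opportunities: list of (charity, room_for_funding_usd, value_per_usd).
    Returns (total value, spend per charity).
    """
    total_value = 0.0
    spend = {}
    ranked = sorted(opportunities, key=lambda o: o[2], reverse=True)
    for charity, room, value_per_usd in ranked:
        grant = min(room, budget)
        if grant <= 0:
            break
        total_value += grant * value_per_usd
        spend[charity] = spend.get(charity, 0) + grant
        budget -= grant
    return total_value, spend


# Hypothetical funding gaps with declining marginal cost-effectiveness.
charity_a = [("A", 100, 10), ("A", 100, 4)]
charity_b = [("B", 100, 8), ("B", 100, 3)]

both_value, both_spend = allocate(200, charity_a + charity_b)
a_only_value, a_only_spend = allocate(200, charity_a)

print(both_value, both_spend)
print(a_only_value, a_only_spend)
```

With both charities available, the budget goes to each charity's best gap (total value 1800); with only `A`, half the budget falls to `A`'s weaker second gap (total value 1400).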
One thing I’m now wondering about is whether GiveWell accounts for recipients of both a bednet and preventive antimalarial medicine, if any, in their cost-effectiveness estimates. It looks like AMF and MC work in some of the same countries, like Togo, Nigeria and Chad (AMF sheet, MC sheet).[1] If someone has already received one, then they are at much lower risk of death, so the extra impact of giving them the other should be lower than if they didn’t already receive the first (assuming they would actually use the bednet). So, there’s a risk of double-counting some lives saved. Similarly, I wouldn’t be surprised if vitamin A deficiency made children more vulnerable to death from malaria, so there could be some double-counting with Helen Keller International, too. MC apparently takes bednet coverage into account in deciding where to work, and GiveWell makes adjustments for MC based on bednet coverage and for AMF based on chemopreventive medicine coverage, so maybe there’s little overlap. On the other hand, it’s not just past coverage, but you need to make sure you don’t end up working in the same places in any given year (unless it’s still worth it without any double-counting), which probably requires some coordination. The highest priority regions for both AMF and MC, before coordinating, would probably be regions with low net use and no recent preventive antimalarial medicine distribution, and so overlap substantially.
And even if they didn’t, one charity might have taken on the other charity’s countries if the other charity didn’t exist.
On the other hand, they might not work in the same regions, cities or villages or whatever.
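The double-counting concern for children who receive both a bednet and preventive medicine can be sketched numerically. All figures are made up, and the model assumes the two interventions reduce mortality risk independently and multiplicatively, which may not hold in practice.

```python
# Hypothetical inputs, chosen only for illustration.
CHILDREN = 1_000
BASELINE_RISK = 0.01          # assumed annual malaria mortality risk
NET_REDUCTION = 0.4           # assumed relative risk reduction from a bednet
CHEMO_REDUCTION = 0.5         # assumed relative risk reduction from chemoprevention

baseline_deaths = CHILDREN * BASELINE_RISK

# Each charity's estimate if it ignores the other intervention.
net_alone = baseline_deaths * NET_REDUCTION
chemo_alone = baseline_deaths * CHEMO_REDUCTION
naive_sum = net_alone + chemo_alone

# Deaths averted when the same children receive both, assuming the
# remaining risk is the product of the two remaining-risk fractions.
remaining_fraction = (1 - NET_REDUCTION) * (1 - CHEMO_REDUCTION)
combined = baseline_deaths * (1 - remaining_fraction)

double_counted = naive_sum - combined

print(naive_sum, combined, double_counted)
```

Under these assumptions the two standalone estimates add up to about 9 deaths averted per 1,000 children, while the combined effect is about 7, so roughly 2 would be counted twice if neither estimate adjusted for the other's coverage.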
This is great! Thank you!
I see you’re the author of that post and have probably been thinking about Shapley value for a while. How practical do you think Shapley is for comparing three choices like 1) earning to give 2) starting a charity 3) taking an EA job? Would this be calculated as one game, three separate games, or three games and a combined game?
Hey, would prefer if you thought about this on your own. One clue is: when does Shapley value give a different answer for games amalgamated together, vs considered separately, in general?