On the whole: Interesting write-up, and it certainly works as an intro to how EA forecasting and impact-estimation techniques can be applied in depth. I’d read any of these that came out in the future, and can already think of organizations I’d be curious to see evaluated in this way.
This is similar to Jonas’s second comment, but it seems like concerns about the indirect harms of economic growth or “the reduction of agency” are constants in any evaluation of a program in global poverty/development, or of any program that encourages donations to such programs.
Perhaps this indicates that your models could be filled in over time with “default scores” that apply to all projects within a certain area? For example, any program aiming to reduce poverty could get the same “indirect harm” scores as this project’s anti-poverty side.
----
Something also feels off about noting the potential harms of effects which are generally very good. I’m having trouble coming up with a formal explanation for some reason, so I’ll write this out informally:
If I’m considering funding a project to reduce poverty and grow the economy, and someone tells me that doing so could increase the number of animals that get eaten… these two effects scale together. The more that poverty is reduced, and the faster the economy grows, the more animals are likely to be eaten. This is an “indirect bad outcome” that I’m actually happy to see in some sense, because the existence of the bad outcome indicates that I succeeded in my primary goal.
It’s as though someone were to warn me about donating to an X-risk-reduction organization by pointing out that more humans living their lives implies that more humans will get cancer. Cancer is definitely a harm related to being alive, but it’s one that I’m implicitly prepared to accept in the course of helping more humans exist. If you came back from some point in the future to tell me that the cancer rate had remained constant over time, but that ten billion humans had cancer, I’d probably be very happy to hear the news, because it would imply the existence of hundreds of billions of cancer-free humans spread out across planets or artificial interstellar habitats.
Meanwhile, if someone were to tell me that they think cancer is so bad that it makes additional years of life net-negative, I’d tell them to support promising cancer treatments rather than even considering projects that create more years of human life.
If a socialist tells me they’re concerned about poor people losing autonomy as a result of charitable giving, my response would be something like: “...okay. That’s going to be par for the course in this entire category of projects. By the project’s very nature, it should be clear that it’s not something you’ll want to support if you generally oppose charity.” And then I’d produce a report intended for people who do believe charity generally does more good than harm, because they’re the only ones who might actually satisfy their values by donating.
----
This is oversimplified and unsophisticated, and it’s easy to think of counterarguments. But I still feel as though “losing autonomy” is a very different kind of concern than, say, “Donational falls apart, and its initial corporate partners become much less likely to run effective giving programs in the future”. The latter is inherent to specific features of Donational, rather than specific features of charitable giving, so it helps me decide between Donational and other charitable projects. The former doesn’t help me make that choice.
On another note, I second some of Oli’s concerns; I wish the section on Donational’s basic strategy had been a lot longer. Things I don’t think were addressed:
Who are Donational’s competitors in the corporate-giving space?
How large is the total market for a product like Donational’s?
What is Donational’s pitch to COOs who have predictable objections?
For example, a platform that only features a tiny set of charities will naturally be less appealing to executives than a platform with a wider range, because it will annoy employees and may clash with executives’ own views on the best causes.
I was startled to see Donational decide to limit its selection so drastically—it feels like an enormous product change that will seem like regression to almost all customers. I’m surprised that the CEO found it worth doing just to appeal to our small donor community. What will actual VCs think? (Perhaps the company never intends to raise a private funding round, but VCs aside, there are still customers to consider.)
On the whole, while I understand that this is an atypical organization to evaluate in this way, I felt like I was seeing many indirect/correlational measures of success (“CEOs with these traits tend to run good organizations”) and little explicit discussion of the program’s strategy (“this is how they plan to find companies who might be good customers for their service”). I generally prioritize the latter ahead of the former.
Hi Aaron,
Many thanks for the comments, and sorry for the slow reply—I’ve been travelling. Currently very jet-lagged and a bit ill so let me know if the following isn’t clear/sensible. Also, as in all my responses, these views are my own and not necessarily shared by the whole Rethink team.
>Perhaps this indicates that your models could be filled in over time with “default scores” that apply to all projects within a certain area? For example, any program aiming to reduce poverty could get the same “indirect harm” scores as this project’s anti-poverty side.
I agree. If RG continues, we may well standardise indirect effect scores to some extent, perhaps publishing fairly in-depth analyses of the various issues in separate posts. We discussed this early on in the process but it didn’t make sense to do it for an initial ‘experimental’ evaluation.
>Something also feels off about noting the potential harms of effects which are generally very good … The more that poverty is reduced, and the faster the economy grows, the more animals are likely to be eaten. This is an “indirect bad outcome” that I’m actually happy to see in some sense, because the existence of the bad outcome indicates that I succeeded in my primary goal.
I think I see where you’re coming from, but this seems to constitute a general case against considering indirect effects at all. I suppose there are some that don’t scale linearly with the intended effects, but I’m not sure this example does either (e.g. meat consumption may plateau above a certain income), and it strikes me as a pretty arbitrary criterion to use. (I’m not sure whether you were actually suggesting we use that.)
Maybe this comes back to the point I made to Jonas, i.e. the effect of the charities, both direct and indirect, seems to be captured by the ‘counterfactual dollars moved to top charities’ metric so we should perhaps only consider indirect effects of CAP itself. But it sounds like if we were just evaluating, say, AMF, you’d be opposed to considering its effects on animal consumption, which doesn’t seem right to me. But maybe I’m misunderstanding you.
>It’s as though someone were to warn me about donating to an X-risk-reduction organization by pointing out that more humans living their lives implies that more humans will get cancer.
This example feels different. Assuming your main goal of reducing x-risk is that humans live longer, and the main bad consequence of cancer is shortening of human lives, your primary metric—something like (wellbeing-adjusted) life-years gained—will already account for the additional cancer.
More broadly: it seems like what counts as ‘indirect’ depends on what you’ve decided a priori to be ‘direct’ or intended. So we haven’t included, say, non-creation of net-positive factory-farmed animal lives as an indirect harm of charities that reduce meat consumption, because we think on average intensively-farmed animals’ lives would be net negative; but I would consider something like harm to the livelihoods of farmers an indirect effect, as that is not the goal, and may not be considered at all by potential donors if it weren’t highlighted. Likewise, if the goal of a charity is to mitigate global warming by reducing meat consumption, the effect on animal welfare would be an indirect effect; but because the main aim of the ACE-recommended animal charities seems to be reducing animal suffering, we have considered any consequences for climate change (and antibiotic resistance) to be indirect.
>If a socialist tells me they’re concerned about poor people losing autonomy as a result of charitable giving, my response would be something like: “...okay. That’s going to be par for the course in this entire category of projects… By the project’s very nature, it should be clear that it’s not something you’ll want to support if you generally oppose charity.”
That was in the moral uncertainty section, not indirect effects, and the point was to essentially highlight that for people with this worldview (and a range of others) this project—or at least some of the recipient charities—may look bad. So it seems broadly consistent with your suggested approach, if I’ve understood you correctly.
>I wish the section on Donational’s basic strategy had been a lot longer.
Okay, I can see how that would have been useful. To briefly respond to some of your specific questions:
• I’m not aware of any ‘competitors’ as such. AFAIK there is no comparable organization operating in workplaces.
• ‘Market size’, or something close to it, was factored into the ‘growth rate’ and ‘max scale’ parameters of the model. We didn’t provide justifications for every parameter for time and space reasons but I can dig out my notes on specific ones if you want.
• The selection of charities is going to be a trade-off between mass appeal and effectiveness. Currently Donational recommends a very broad range (based on The Life You Can Save’s, I think) but we felt there were too many that were unproven or seemed likely to be orders of magnitude worse than the top charities, which undermined the impact focus. After some discussion, Ian agreed to limit them to the ACE and GW top charities, plus a US criminal justice one. (If it were my program, I’d probably exclude GiveDirectly, some of the ACE charities, and the criminal justice one, and would perhaps include some longtermist options—but ultimately it’s not my choice.)
I’ll leave your points about strategy for Tee or Luisa to answer.
Thanks for this reply! I don’t have time to engage in much more detail, but I’m now a little more uncertain that my specific qualms with indirect impact are important to the project.
I don’t want to make you dig through your notes just to answer my question; I more intended to make the general point that I’d have liked to have a few more concrete facts that I could use to help me weigh Rethink’s judgment. (For example, if you shared some current numbers on corporate giving, I could assign my own ‘max scale’ parameter and check my intuition against yours.)
Knowing that Donational started out with all or almost all TLYCS charities reduces my concern a lot. The impression I had was that they’d been working with a very broad range of charities and were radically cutting back on their selection.
>I more intended to make the general point that I’d have liked to have a few more concrete facts that I could use to help me weigh Rethink’s judgment.
That’s fair. Initially I was going to write a summary of our evidence and reasoning for all 42 parameters, or at least the 5-10 that the results were most sensitive to. In the end we decided against it for various reasons, e.g.:
- Some were based fairly heavily on information that had to remain confidential, so a lot would have to be redacted.
- Often the 6 team members had different rationales and drew on different information/experiences, so it would be hard in some cases to give a coherent summary.
- Sometimes team members noted their rationales in the elicitation document, but with so many parameters, there wasn’t always time to do this properly. Any summary would therefore also be incomplete.
- The report was already too long and was taking too much time, so this seemed like an easy way of limiting both length and delays.
But maybe it was the wrong call.
>Knowing that Donational started out with all or almost all TLYCS charities reduces my concern a lot. The impression I had was that they’d been working with a very broad range of charities and were radically cutting back on their selection.
I would consider TLYCS’s range very broad, but you may disagree. Anyway, you can see Donational’s current list at https://donational.org/charities
>I would consider TLYCS’s range very broad, but you may disagree.
TLYCS only endorses 22 charities, all of which work in the developing world on causes that are plausibly cost-effective on the level of some GiveWell interventions (even though evidence is fairly weak on some of them—I recall GiveWell being more down on Zusha after their last review). This selection only looks broad if your point of comparison is another EA-aligned evaluator like GiveWell, ACE, or Founder’s Pledge.
Meanwhile, many charitable giving platforms/evaluators support or endorse a much wider range of nonprofits, most of them based in rich countries. Even looking only at Charity Navigator’s perfect scores, you see 60 charities (only 1/4 of which are “international”), and Charity Navigator’s website includes hundreds of other favorable charity profiles. Another example: when I worked at Epic, employees could support more than 100 different charities with the company’s money during the annual winter giving drive.
I also imagine that many corporate giving platforms would try to emphasize their vast selection/”the huge number of charities that have partnered with us”—I’m impressed that Donational was selective from the beginning.
>TLYCS only endorses 22 charities, all of which work in the developing world on causes that are plausibly cost-effective on the level of some GiveWell interventions (even though evidence is fairly weak on some of them...)
It’s plausible that some of these are as cost-effective as the GW top charities, but perhaps not that they are as cost-effective on average, or in expectation.
>This selection only looks narrow if your point of comparison is another EA-aligned evaluator like GiveWell, ACE, or Founder’s Pledge.
You mean only looks broad?
Anyway, I would agree TLYCS’s selection is narrow relative to some others; just not the EA evaluators that seem like the most natural comparators.
>It’s plausible that some of these are as cost-effective as the GW top charities, but perhaps not that they are as cost-effective on average, or in expectation.
I agree, for most values of “plausible”. Otherwise, it would imply TLYCS is catching many GiveWell-tier charities GiveWell either missed or turned down, which is unlikely given their much smaller research capacity. But all TLYCS charities are in the category “things I could imagine turning out to be worthy of support from donors in EA with particular values, if more evidence arose” (which wouldn’t be the case for, say, an art museum).