Hey Abraham,
The end goal of any evaluation criterion is to predict “good done” as well as possible. I broadly agree that no single criterion is likely to rule an intervention in or out fully (including limiting factor, which was one of four in our system). If we knew of a criterion that powerful, there would be no need for complex evaluation.
Although limiting factor is not a pure hard limit, I do not think this changes its usefulness much. An intervention might be low-evidence, and in theory multiple RCTs could be run to improve this, but in practice, if there is, say, a limiting factor on funding (such that multiple RCTs could not be funded), the intervention might stay poorly evidenced indefinitely, even if in theory the level of evidence is not independent of what the movement does. It seems fairly clear that, all else being equal, running an intervention will be easier than running an equivalent intervention that also requires you to build a field of talent or otherwise work on a limiting factor.
In principle I think this could be put into a more numerical form (e.g. included in a CEA), but in practice this has not been done. Historically, maybe the closest thing is the different levels of funding gaps that GiveWell has published for their top charities, but that mostly considers a single possible limiting factor (funding). I would love to see more models of limiting factors and think they would be a natural next step in the current EA talent vs. funding conversations.
A different way to think about this question is to ask whether problem scale or limiting factor is the better predictor of where the most good can be done. I pretty strongly disagree that problem scale is more important than the limiting factor an intervention will hit. Theoretically, the scale of the problem is a harder limit, but that doesn’t really matter if in practice an intervention is never capped by it. We ended up looking at quite a number of charities to consider what was stopping them (including GiveWell and ACE recommendations), and none of them seemed to be capped by problem scale. They had all been stopped by other limiting factors long before that became an issue (for example, with AMF it was funding and logistical bottlenecks, not the number of people with malaria). I think this is even true for the specific case of wild animal suffering interventions. When considering ethical pest control, the absolute number of bugs matters much less than the density per hectare of field or the funding available to a humane insecticides charity. You could imagine a world where the bug populations of colder locations (such as Canada and Russia) were close to 0, and it would do very little to affect the estimated good done, broadly because there would be a ton of work to do in warmer locations before one would expand to Canada, and one would likely hit many limiting factors before expanding that far. How soon these problems hit would be more predictive of impact than whether there were twice or half as many bugs in the world as there are now.
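As a very rough sketch of what putting this into numerical form might look like: treat problem scale as just one cap among several and ask which cap binds first. Everything here is an assumption made for illustration. The `Intervention` class, the `binding_limit` helper, the limit names, and all of the numbers are hypothetical and do not come from our research or from any real charity.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class Intervention:
    """All figures are hypothetical and only illustrate the shape of the model."""
    name: str
    problem_scale: float        # total units that exist (e.g. hectares of field)
    limits: Dict[str, float]    # other caps, expressed in the same units


def binding_limit(iv: Intervention) -> Tuple[str, float]:
    """Return the name and size of whichever cap is hit first."""
    caps = {"problem scale": iv.problem_scale, **iv.limits}
    tightest = min(caps, key=caps.get)
    return tightest, caps[tightest]


# Hypothetical humane-insecticide charity: every number here is made up.
pest_control = Intervention(
    name="humane insecticide (hypothetical)",
    problem_scale=1_000_000,
    limits={"funding": 20_000, "trained staff": 8_000, "logistics": 15_000},
)

# Halve and double the size of the problem and see whether the cap moves.
for factor in (0.5, 1.0, 2.0):
    scenario = replace(pest_control, problem_scale=pest_control.problem_scale * factor)
    limit_name, cap = binding_limit(scenario)
    print(f"scale x{factor}: capped at {cap:,.0f} units by {limit_name}")
```

With these made-up inputs the binding cap is the same in all three scenarios, which is the point about twice or half as many bugs. A real model would need to estimate each cap and how expensive it is to raise, which this sketch does not attempt.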
I think historical evidence like “if this had not been done, X would never have happened” is not a very strong argument unless some research is done systematically and compares both the hits and the misses that occurred (e.g. there were a lot of issues that people tried to get off the ground at that same point in time but that never got traction). To take a clearer example, you could look at a friend who won the lottery: although he clearly benefited from his ticket, buying it would still have been the wrong call from an expected value perspective, and his win certainly would not suggest you should buy a lottery ticket. We have to be careful of survivorship bias. Mainly, we are looking for factors that are predictive of something having the most impact, and singular examples do not tell us much about field building vs. making quicker progress in a more established field. That said, I would be really interested in more systematic research in this area.
That makes a lot of sense. Maybe one way of framing scale + cost-effectiveness could be “how long will a particular cost-effectiveness be applicable in the real world?”, and then there are two ways of describing that cost-effectiveness: either incorporating the costs of raising these limits or not.
In either case, I definitely agree that these should be considered. One other thought: it seems like, in certain ways, a donation to a charity will already account for their efforts to raise limits, to some extent. I don’t know enough about how ACE does cost-effectiveness analysis (and obviously the degree to which this information is incorporated would depend on that), but I could imagine that if you make a statement like “a donation of $100 to The Humane League will help reduce the suffering of X animals”, then in a complete assessment of that donation, some of that funding would be going to their development department (raising the amount of funding available) and some might be going to volunteer cultivation (maybe volunteer capacity is another limiting factor).
So the issue is more that, while the outcome per dollar we are looking at is based on historical performance, over time that outcome per dollar is actually worse, because some of that funding was going towards raising limits and would really need to be applied to animals not yet helped, if that makes sense.
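To make the two framings a bit more concrete, here is a rough sketch with made-up numbers. It reports the same donation twice: once counting only programme spending and once counting the whole donation, including the share spent on raising limits. The function name `outcome_per_dollar`, the 25% share, and the figure of 30 animals are all hypothetical and are not taken from ACE or The Humane League.

```python
def outcome_per_dollar(donation: float, animals_helped: float,
                       share_raising_limits: float) -> dict:
    """Report one donation two ways (all inputs are hypothetical).

    share_raising_limits is the fraction of the donation spent on things like
    fundraising or volunteer cultivation rather than on direct programme work.
    """
    if not 0 <= share_raising_limits < 1:
        raise ValueError("share_raising_limits must be in [0, 1)")
    programme_dollars = donation * (1 - share_raising_limits)
    return {
        # Limit-raising costs left out: only programme spending is counted.
        "excluding_limit_costs": animals_helped / programme_dollars,
        # Limit-raising costs included: the whole donation is counted.
        "including_limit_costs": animals_helped / donation,
    }


# Made-up example: a $100 donation, 25% of which goes to raising limits.
result = outcome_per_dollar(donation=100, animals_helped=30, share_raising_limits=0.25)
for label, value in result.items():
    print(f"{label}: {value:.2f} animals per dollar")
```

The figure that includes limit-raising costs is always the lower of the two. This sketch does not capture the future animals helped once the limits have been raised, which is probably the harder part to estimate.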
Either way, I’m really interested in this—since reading it, I’ve been thinking of how I can incorporate this kind of thinking about cost-effectiveness into my organization—it seems tricky, but definitely worth doing a lot more of. Thanks for posting it!