Thanks for the input and the helpful reply, it definitely clarifies some things, and I really hope this work goes well and better than I expect :)
One comment on the above: I don’t think the specified learning approach will yield the most important learnings.
Especially in these kinds of cases where counterfactuals are difficult to assess, I don’t think self-reports provide very useful data. The incentives are all wrong, and there’s a lot of confirmation bias: implementers see the results they want to see, often unintentionally. BINGOs, with the exception of GiveDirectly, report nearly all of their programs (at least publicly) as successful. As far as I’ve seen, CHAI and PATH are no exception. Just by the nature of doing tricky work, many projects will be complete failures through no fault of anyone, but this usually won’t be reported, often not even internally.
As a datapoint to support/refute my claim, it might be useful to look at the reports for the last 20 times CHAI and PATH have supported TSUs. Due to the nature of TSU work, at the very least a handful of these (I would guess 25%-75%) would not have achieved anything useful, which is obviously fine and to be expected in this high-variance situation. Despite this, I would be fairly confident that if you looked at the reports of the last 20 times CHAI has implemented TSUs, all of them will report overall success (with caveats of course, that’s the standard thing to do), and none of them will say that they were overall a failure, or didn’t achieve anything. If you did find that 5 out of 20 reports said their work was likely useless, then I would be happy to be corrected/proved wrong here!
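To put rough numbers on that intuition, here’s a quick binomial check (the failure rates are just my guesses from above, not data): if failures were being reported honestly, a clean sweep of 20 success reports would be a very surprising outcome.

```python
# If TSU failures really occur at the guessed rates and reports were honest,
# how likely is it that all 20 reports claim overall success?
for failure_rate in (0.25, 0.50, 0.75):  # my guessed range, not measured data
    p_all_success = (1 - failure_rate) ** 20
    print(f"failure rate {failure_rate:.0%}: P(all 20 report success) = {p_all_success:.1e}")
# Even at the low end (25%), a clean sweep is roughly a 0.3% event.
```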
Because of this, I think the higher-risk/more uncertain the work is, the more important it is to rely on external assessment. Where metrics are clearer and crisper, I think it’s more reasonable to rely somewhat on implementing orgs’ own assessments - there’s less room for subjectivity.
Although it’s very difficult to measure success well, I would suggest the following as an alternative to your current suggestion of “Ask PATH and CHAI to report and track...”, which I don’t think will yield much useful information.
1. Have the learnings/assessment done not by the implementers, perhaps even by GiveWell staff yourselves, who are really motivated to find the truth.
2. After discussion with CHAI and PATH (in advance) as to what you all think success might look like, specify some clear numerical/qualitative success metrics, also in advance. Then don’t show your final metrics to CHAI/PATH, so this doesn’t become a gameable target that can be Goodharted.
3. Validate with PATH/CHAI that failure is likely in many TSU cases: state that half of these TSUs (or whatever) are likely to achieve not much, and that’s OK. You might then get more self-reflective/honest feedback from implementers; at least you’d be in with a chance.
To clarify, I’m not trying to say that these orgs are bad/dishonest in any way; I’m just going off my experience with programs of this nature. Incentives, incentives, incentives. With a combination of BINGO implementers and unclear/nebulous success measures, both personal emotional/motivational incentives and organisational incentives for things to “work” mean that, with remarkable consistency, successes are exaggerated and failures suppressed.
GW’s email newsletter just alerted me to their grant writeup on this, in case you still want to look into it / aren’t subbed to them :) The main reservations section is especially worth a look.
Thanks, yes. I don’t think there’s much new in the write-up that hasn’t been discussed already.
The BOTEC is clear and well written as usual. I think (predictably) that they are very optimistic about what TSUs might be able to do. They think that TSUs will:
1) Have a 70% chance of increasing the cost-effectiveness of 20-40 million dollars of health spending in each country (over 100 million in total) by around 20%, through shifting allocation. I struggle to fathom how that could be possible here in Uganda, with my (albeit limited) knowledge of how the Ministry of Health works here. If they could move 20 million of funding to be 30% more cost-effective in even one country, I’d be impressed.
2) Bring in 40 million dollars of counterfactually new health funding in the countries they work in (between 2 and 20 million in each country). This would be a truly brilliant lobbying effort, and again I’m very skeptical. Bringing in counterfactually new funding is very difficult.
If a technical support unit can achieve anything like that in only 18 months, I would be blown away. The tricky thing is that it will be very difficult to tell whether this happened or not—again it would be great to have an external org (or GiveWell themselves) assessing the likely counterfactual here.
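For a rough sense of scale, here’s a minimal sketch of the benefit those two claims imply, using the headline numbers above (adding the two channels as dollar-equivalents, and taking the 40 million at face value, are my simplifications, not necessarily how the BOTEC is structured):

```python
# Naive dollar-equivalent of the two claimed channels, as summarised above.
p_shift_success = 0.70   # stated 70% chance the allocation-shifting works
spend_shifted = 100e6    # "over 100 million in total" across countries
gain = 0.20              # spending becomes "around 20%" more cost-effective
new_funding = 40e6       # claimed counterfactually new funding, taken at face value

expected_equiv = p_shift_success * spend_shifted * gain + new_funding
print(f"${expected_equiv / 1e6:.0f}M expected benefit-equivalent")  # -> $54M
# Set against the ~$4.5M grant mentioned below, these few assumptions
# are doing nearly all of the work in the BOTEC.
```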
On the other hand (to their credit, as it hurts their BOTEC), they are very conservative about the chance of the grant being funded by some other funder. They put this chance at 50%, but I think it’s far, far lower. Very few orgs in the world fund this kind of work to the tune of 4.5 million dollars; I would have put it at more like 20%-30%.
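To see how much that one parameter matters, here’s the standard counterfactual discount (my framing; the BOTEC may structure it differently): GiveWell’s attributable impact scales with the probability that no other funder would have stepped in.

```python
# Only the share of worlds where no other funder steps in counts as
# GiveWell's counterfactual impact (my simplified framing).
def attributable_share(p_other_funder: float) -> float:
    return 1.0 - p_other_funder

print(attributable_share(0.50))  # write-up's assumption -> 0.50
print(attributable_share(0.25))  # midpoint of my 20%-30% guess -> 0.75
# Moving from 50% to ~25% multiplies the attributable impact by 1.5x,
# which partially offsets the optimism in the headline numbers above.
```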
Actually, now that you mention the shifting of health budget allocations, I’m also skeptical, although I’m mostly thinking of CEAP’s experience finding the development budget ~fixed and the remaining sliver fought over by hundreds of NGOs. I take you to be saying it’s the same story for the health budget.
I agree re: determining impact attribution, and they don’t seem to be planning to do that in their “plans for follow-up” section.