This looks pretty similar to a model I wrote with Nick Dunkley way back in 2012 (part 1, part 2). I still stand by that as a reasonable stab at the problem, so I also think your model is pretty reasonable :)
Charity population:
You’re assuming a fixed pool of charities, which makes sense given the evidence gathering strategy you’ve used (see below). But I think it’s better to model charities as an unbounded population following the given distribution, from which we can sample.
That’s because we do expect new opportunities to arise. And if we believe that the distribution is heavy-tailed, a large amount of our expected value may come from the possibility of eventually finding something way out in the tails. In your model we only ever get N opportunities to get a really exceptional charity—after that we are just reducing our uncertainty. I think we want to model the fact that we can keep looking for things out in the tails, even if they maybe don’t exist yet.
I do think that a lognormal is a sensible distribution for charity effectiveness. The real distribution may be broader, but that just makes your estimate more conservative, which is probably fine. I just did the boring thing and used the empirical distribution of the DCP intervention cost-effectiveness (note: interventions, not charities).
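To make the fixed-pool versus unbounded-population point concrete, here is a minimal sketch. It is not the code from the 2012 model; the function name `best_so_far` and the lognormal parameters `mu` and `sigma` are illustrative assumptions. It tracks the best true effectiveness seen so far as we keep drawing new charities from a lognormal.

```python
import random

def best_so_far(num_draws, mu=0.0, sigma=1.5, seed=0):
    """Running maximum of true effectiveness over successive draws.

    mu and sigma are assumed lognormal parameters, chosen only for
    illustration; they are not fitted to any real data.
    """
    rng = random.Random(seed)
    best = 0.0
    history = []
    for _ in range(num_draws):
        # Each draw is a newly discovered charity from the population.
        best = max(best, rng.lognormvariate(mu, sigma))
        history.append(best)
    return history

history = best_so_far(1000)
for n in (10, 100, 1000):
    # A fixed pool of size n can never do better than history[n - 1];
    # an unbounded population keeps the later draws in play.
    print(n, round(history[n - 1], 2))
```

With a fixed pool of N charities the running maximum is frozen at its value after N draws, whereas with an unbounded population it can keep creeping up, and the heavier the tail, the more those late discoveries matter.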
Evidence gathering strategy:
You’re assuming that the evaluator does a lot of evaluating: they evaluate every charity in the pool in every round. In some sense I suppose this is true, in that charities which are not explicitly “investigated” by an evaluator can be considered to have failed the first test by not being notable enough to even be considered. However, I still think this is somewhat unrealistic and is going to drive diminishing returns very quickly, since we’re really just waiting for the errors for the various charities to settle down so that the best charity becomes apparent.
I modelled this process as the evaluator sequentially evaluating a single charity, chosen at random (with replacement). This is also unrealistic, because in fact an evaluator won’t waste their time with things that are obviously bad, but even with this fairly conservative strategy things turned out pretty well.
I think it’s interesting to think about what happens when we model the pool more explicitly, and consider strategies like investigating the top recommendation further to reduce error.
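The one-charity-at-a-time strategy can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the original model: the pool size, the multiplicative lognormal noise on each evaluation, and the rule of recommending the charity with the highest average log-estimate are all choices made here for the example, and `simulate_sequential` is a hypothetical name.

```python
import math
import random

def simulate_sequential(pool_size=50, rounds=500,
                        sigma_true=1.5, sigma_noise=1.0, seed=1):
    """Evaluate one randomly chosen charity per round (with replacement).

    Returns the true effectiveness of the current recommendation after
    each round, plus the best true effectiveness in the pool.
    """
    rng = random.Random(seed)
    # True log-effectiveness of each charity (assumed lognormal).
    true_log = [rng.gauss(0.0, sigma_true) for _ in range(pool_size)]
    log_sums = [0.0] * pool_size
    counts = [0] * pool_size
    picks = []
    for _ in range(rounds):
        i = rng.randrange(pool_size)  # random choice, with replacement
        # One noisy evaluation: true value plus noise on the log scale.
        log_sums[i] += true_log[i] + rng.gauss(0.0, sigma_noise)
        counts[i] += 1
        # Recommend the evaluated charity with the highest mean estimate.
        seen = [j for j in range(pool_size) if counts[j] > 0]
        best = max(seen, key=lambda j: log_sums[j] / counts[j])
        picks.append(math.exp(true_log[best]))
    return picks, math.exp(max(true_log))

picks, best_possible = simulate_sequential()
print(round(picks[9], 2), round(picks[-1], 2), round(best_possible, 2))
```

Picking the highest raw average ignores the fact that lightly evaluated charities have noisier estimates, so a more careful version would shrink estimates towards the prior. A strategy that re-investigates the current top recommendation would replace the `rng.randrange` line with a rule that sometimes returns `best`.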
Increasing scale with money moved:
Charity evaluators have the wonderful feature that their effectiveness scales more or less linearly with the amount of money they move (assuming that the money all goes to their top pick). This is a pretty great property, so worth mentioning.
The big caveat there is room for more funding, or saturation of opportunities. I’m not sure how best to model this. We could instead model charities as “deposits” of effectiveness that are of a fixed size when discovered, and can be exhausted. I don’t know how that would change things, but I’d be interested to see! In particular, I suspect it may be important how funding capacity co-varies with effectiveness. If we find a charity with a cost-effectiveness that’s 1000x higher than our best, but it can only take a single dollar, then that’s not so great.
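One possible way to sketch the “deposits” idea, purely as an illustration: treat each charity as a pair of (effectiveness per dollar, funding capacity in dollars) and spend the money moved on the best deposits first. The `allocate` function, the greedy rule, and all the numbers below are assumptions made up for the example.

```python
def allocate(budget, deposits):
    """Greedy allocation: fund the most cost-effective deposits first.

    deposits is a list of (effectiveness_per_dollar, capacity_in_dollars)
    pairs. Returns the total value produced by spending the budget.
    """
    total = 0.0
    remaining = budget
    for effectiveness, capacity in sorted(deposits, reverse=True):
        spend = min(remaining, capacity)  # cannot exceed room for funding
        total += spend * effectiveness
        remaining -= spend
        if remaining <= 0:
            break
    return total

baseline = [(1.0, 1_000_000.0)]               # current best, lots of room
with_tiny_star = baseline + [(1000.0, 1.0)]   # 1000x better, takes only $1

print(allocate(10_000, baseline))        # 10000.0
print(allocate(10_000, with_tiny_star))  # 10999.0
```

With $10,000 moved, discovering the 1000x charity that can only absorb a single dollar raises total value by about 10%, not 1000x. Once capacity binds, value stops scaling linearly with money moved, and how capacity co-varies with effectiveness determines how much the tail discoveries are actually worth.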