I think tautological measurement is a real concern for basically every meta charity, although I’m not sure I agree with your solution. I think the better solution is external evaluation, by someone like GiveWell or Founders Pledge who has no particular reason to favour CE charities. Typically, these organizations do their own independent research and compare it across their current portfolio of projects. If CE can, for example, fairly consistently incubate charities that GW/FP/etc. rank as best in the world, I think that is at least not organizationally tautological (though it does assume that these charity evaluators are in fact identifying the best areas/charities, and it replicates any flaws they have).
In terms of success rate, I agree 40% is high but I would expect many NGO incubators to be considerably higher than in the for-profit space, for a few reasons (a couple listed below):
General competition: There are just not that many charities aiming for pure impact (in an EA way), unlike the for-profit market. The general efficiency of the charity market is pretty low, and thus there are lots of fairly easy wins.
Scale sensitivity: Generally, successful for-profits are seen as really large-scale ventures (e.g., unicorns) and the market is consistently hunting for those. Debatably, the only charity currently seen as highly impactful that can get to that sort of scale is GiveDirectly. Thus the bar for success in the charity sector is significantly lower in terms of money spent. For example, if we founded a charity running on a $1m-a-year budget that was 2x as effective as top GW ones, we could count that as a success. But an organization of the same size would be considered a rounding error by YC. If we take size expectations into account, it might be more like 1 in 25 charities that we incubated that have any significant chance of getting to unicorn-level size.
Thanks, Joey. Really appreciate you taking the time to engage on these questions.
To be clear, I’m not seriously suggesting ignoring all research from before the decision. I’m just saying that mathematically, an independent test needs its backtest data to exclude all calibration data.
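To make the calibration/backtest point concrete, here is a minimal sketch with made-up numbers: if you select the "best" intervention using noisy estimates and then score it on those same estimates, the score is flattered by selection noise, whereas a fresh, independent measurement is not.

```python
# Hypothetical illustration: why an independent test must exclude the
# calibration data used to make the selection in the first place.
import random

random.seed(0)

# 20 candidate interventions that are, in truth, all equally effective.
true_effect = [1.0] * 20
# Noisy "calibration" estimates used to pick a winner.
calibration = [t + random.gauss(0, 0.5) for t in true_effect]

best = max(range(20), key=lambda i: calibration[i])

# Tautological evaluation: re-scoring on the calibration data.
# The winner's score sits well above 1.0 purely because it won on noise.
print(calibration[best])

# Independent evaluation: a fresh draw is centred on the true value of 1.0.
holdout = true_effect[best] + random.gauss(0, 0.5)
print(holdout)
```

This is the same reason ML practice separates training and test sets: the maximum of many noisy estimates is biased upward, so only data that played no role in the selection gives an unbiased read.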
It strikes me that there are broadly 3 buckets of risk / potential failure:
Execution risk—this is significant and you can only find out by trying, but even then you only really know whether you’re succeeding on the left-hand side of the theory of change
Logic risk—having an external organisation take a completely fresh view should solve most of this
Evidence risk—even with an external organisation marking your homework, they are still probably drawing on the same pool of research and that might suffer from survivorship bias