Ah sorry, I think I might have confused the issue a bit with my footnote. I think I’ve managed to conflate two issues in your mind.
The first is exactly as you say: any intervention worth doing has some effects which are easy to model and some which are difficult (maybe impossible) to model. What GiveWell has done here is completely reasonable: it models what it can and then makes assumptions about how important the other things, like track record, are in comparison to the main cost-effectiveness results.
The second issue is the more subtle one that I was driving at. Imagine you are going to buy a new car, and your friend (who knows about cars) says that modern cars are 10x more fuel efficient than the car you currently drive. Speaking very roughly, there are two strategies you could pick from to choose your next car:
1. Completely ignore your friend, and pick the car that has the best MPG regardless of any other feature. This would be a good strategy if literally all you care about is fuel efficiency, but a bad strategy otherwise (because it is unlikely that the most fuel-efficient car is also the most comfortable to drive, especially if fuel efficiency and comfort trade off against each other).
2. Treat your friend as having offered a useful rule of thumb, and so have an idea in your head about what ‘good’ fuel efficiency looks like. This is a good strategy if cars aren’t really directly comparable along a straightforward scale: a Ford F-150 isn’t ‘better’ or ‘worse’ than a Prius, it is just a different kind of thing.
Both GiveWell (implicitly) and I, in my fertility days (explicitly), argue that QALYs are like cars: you can end up in a situation where you can generate different kinds of QALYs and your best bet is to compare them with a rule of thumb like GiveWell’s 10x multiplier. However, I don’t think GiveWell is correct in making this assumption about charities. There is in fact a single measure, like MPG, which we want to ruthlessly optimise, and therefore we do actually want to pit the F-150 and Prius directly against each other.
However, my point in the essay is that GiveWell don’t actually have to choose. They can build their model as if they are in the first world and directly compare charities against each other, and then make their final decision as though they are in the second world, where different charities offer different profiles of benefit on top of their cost-effectiveness. This is pretty much the commonsense way of choosing a car too: you would look at MPG and directly compare cars in this way, but you might then consider other factors. It would be weird to lump all cars together in your head as ‘better than 10x my previous efficiency’ or ‘worse than 10x my previous efficiency’.