This is an interesting analysis that I haven’t properly digested, so what I’m about to say might be missing something important. But something feels a bit strange about applying this type of approach to this type of question.
For example, couldn’t I write a post titled “Can AI cause human extinction? not on priors” where I look at historical data on “humans killed by machines” (e.g. traffic accidents, factory accidents) as a fraction of the global population, show that it is tiny, and argue it’s extremely unlikely that AI (another type of machine) will wipe us all out?
I think the mistake I’d be making here is lumping AGI in with cars, construction machinery, etc. as one single category. But then I imagine the people who worry about extinction from war are also imagining a kind of war which should belong in a different category from previous wars.
What’s your take on this? Would the AI post actually be valid as well? Or is there an important difference I’m missing?
Edited after more careful reading of the post
As you say in the post, I think all these things can be true:
1) The expected counterfactual value is all that matters (i.e. we can ignore Shapley values).
2) The 3 vaccine programs had zero counterfactual value in hindsight.
3) It was still the correct decision to work on each of them at the time, with the information that was available then.
At the time, none of the 3 programs knew that any of the others would succeed, so the expected value of each program was very high. It’s not clear to me how the ‘12.5%’ figure in your probabilistic analysis gets anything wrong.
If one vaccine program had actually known with certainty that the other two would succeed anyway, and if that really rendered its own counterfactual value 0 (probably not literally true in practice, but granted for the sake of argument), then it seems very plausible to me that it should have focused on other things (despite what a Shapley value analysis might have told it).
It gets more complicated if you imagine that each of the three knew with certainty that the other two could succeed if they tried, but might not necessarily do so, because they would be reasoning similarly. Then it becomes a sort of game of chicken between the three of them as to who will actually do the work, and I think this is the anti-cooperative nature of counterfactual value that you’re alluding to. This is a potential problem with focusing only on counterfactuals, but focusing only on Shapley values has problems too, because it gives the wrong answer in cases where the decisions of others are already set in stone.
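To make the contrast concrete, here’s a minimal sketch (my own illustration, not taken from the post) with three symmetric programs and a total benefit of 1 if at least one of them succeeds. In hindsight, with all three succeeding, each program’s counterfactual value is 0 while its Shapley value is 1/3; ex ante, if each program thinks the others succeed with some probability q (0.5 here, a purely hypothetical number), its expected counterfactual value is substantial.

```python
# Illustrative only: three symmetric programs, total benefit 1 if at least one succeeds.
from itertools import permutations

PLAYERS = ("A", "B", "C")

def value(coalition):
    # Coalition value: 1 if the coalition contains at least one (successful) program.
    # For the hindsight case we assume all three programs succeed.
    return 1.0 if coalition else 0.0

# Hindsight counterfactual value: value with the program minus value without it.
for p in PLAYERS:
    others = tuple(x for x in PLAYERS if x != p)
    print(p, "counterfactual value:", value(PLAYERS) - value(others))  # 0.0 each

# Shapley value: average marginal contribution over all orderings of the players.
def shapley(player):
    orderings = list(permutations(PLAYERS))
    total = 0.0
    for order in orderings:
        before = order[:order.index(player)]
        total += value(before + (player,)) - value(before)
    return total / len(orderings)

for p in PLAYERS:
    print(p, "Shapley value:", round(shapley(p), 3))  # 1/3 each

# Ex ante: if each program thinks the others succeed independently with
# probability q (hypothetical), its expected counterfactual value is
# P(no other succeeds) * total benefit.
q = 0.5
print("expected counterfactual value ex ante:", (1 - q) ** 2 * 1.0)  # 0.25
```

The point is just that the ex ante counterfactual calculation can already justify working on each program, even though the hindsight counterfactual value of each turns out to be zero.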
Toby Ord left a really good comment on the linked post on Shapley values that I think is worth reading, and I would echo his recommendation to read Parfit’s “Five Mistakes in Moral Mathematics” for a really good discussion of these problems.