There are a lot of things I like about this post, from small (e.g. the summary at the top and the table at the end) to large (e.g. it’s a good thing to do given a desire to better understand how to quantify/estimate impact).
Here are some things I am perplexed about or disagree with:
The EAF hiring round estimate misses the enormous realized value of information. As far as I can see, EAF decided to move to London (partly) because of that.
> We moved to London (Primrose Hill) to better attract and retain staff and collaborate with other researchers in London and Oxford.
> Budget 2020: $994,000 (7.4 expected full-time equivalent employees). Our per-staff expenses have increased compared with 2019 because we do not have access to free office space anymore, and the cost of living in London is significantly higher than in Berlin.
The donor lottery evaluation seems to miss that $100K would have been donated otherwise.
Further, I would suggest another decomposition.
Impact = impact of running the donor lottery as a tool (as opposed to donating without ~aggregation) + the counterfactual impact of the particular grants (as opposed to ~expected grants) + misc. side-effects (like a grantmaker joining the LTFF).
I can understand why you added the first two terms. But it seems to me that:
- we can get a principled estimate of the first term from the arguments for donor lotteries (e.g. the epistemic advantage of spending more time per dollar donated, and the time freed up for other donors); one could also get more empirical and run a quick survey here;
- estimating the second term is trickier, because you need to make a guess about the impact of an average epistemically advantaged donation (as opposed to an average donation of $100K, which I think is missing from your estimate).
Both of these are doable because we saw how other donor lottery winners gave their money and how wealthy/invested donors give their money.
A good proxy for the impact of an average donation might come from (a) EA survey donation data or (b) a quick survey of lottery participants. The latter seems superior, because participating in an early donor lottery suggests higher engagement with EA ideas &c.
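For concreteness, here is a minimal Monte Carlo sketch of the decomposition I have in mind; every distribution and number below is a made-up placeholder for illustration, not an estimate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# All distributions below are hypothetical placeholders (units: the post's "Q").
# 1. Value of running the donor lottery as a tool: the epistemic advantage of
#    one winner evaluating $100K in depth vs. many donors evaluating shallowly.
lottery_as_tool = rng.lognormal(mean=np.log(50), sigma=1.0, size=n)
# 2. Counterfactual value of the particular grants, relative to what an
#    average epistemically advantaged winner would have funded.
particular_grants = rng.lognormal(mean=np.log(30), sigma=1.2, size=n)
# 3. Miscellaneous side effects (e.g. a grantmaker later joining the LTFF).
side_effects = rng.lognormal(mean=np.log(10), sigma=1.5, size=n)

total_impact = lottery_as_tool + particular_grants + side_effects
print("median:", np.median(total_impact))
print("90% interval:", np.percentile(total_impact, [5, 95]))
```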
After thinking about it a bit longer: the choice of decomposition depends on what you want to understand better. It seems like your choice is better if you want to empirically understand whether the donor lottery is valuable.
Another weird thing is to see the 2017 Donor Lottery Grant having x5..10 higher impact than 2018 AI Alignment Literature Review and Charity Comparison.
I think it might come down to you not subtracting the counterfactual impact of donating $100K without the lottery from the donor lottery impact estimate.
The basic source of impact for both the donor lottery and the charity review is an epistemic advantage (someone dedicating more time to thinking about and evaluating donations; people being better informed about the charities they are likely to donate to). Given how well received the literature review is, it seems quite likely to be helpful to individual donors; and given that it (according to your guess) influenced $100K..1M, it should be about as impactful as, or more impactful than, an abstract donor lottery.
And it’s hard to see this particular donor lottery as overwhelmingly more impactful than an average one.
> Another weird thing is to see the 2017 Donor Lottery Grant having x5..10 higher impact than 2018 AI Alignment Literature Review and Charity Comparison.

I see now, that is weird. Note that if I calculate the total impact of the $100K to $1M I think Larks moved, it would be 100mQ to 2Q (change the Shapley value fraction in the Guesstimate to 1), which is closer to the 500mQ to 4Q I estimated for the 2017 Donor Lottery. The difference can be attributed to a) investing in organizations which are starting up, b) the high cost of producing AI safety papers, coupled with cause neutrality, and c) further error.
Good point re: value of information
Re: “The donor lottery evaluation seems to miss that $100K would have been donated otherwise”: I don’t think it does. In the “total project impact” section, I clarify that “Note that in order to not double count impact, the impact has to be divided between the funding providers and the grantee (and possibly with the new hires as well).”
Thank you, Nuno!
Am I understanding correctly that the Shapley value multiplier (0.3 to 0.5) is responsible for preventing double-counting?
If so, why don’t you apply it to the positive status effects? That effect was also partially enabled by the funding providers (though maybe less so).
Huh! I am surprised that your Shapley value calculation is not explicit, but it is reasonable.
Let’s limit ourselves to two players (= funding providers who are only capable of shallow evaluations and grantmakers who are capable of in-depth evaluation but don’t have their own funds). You get Shapley mult. = [V(lottery, funding in-depth evaluated projects) − V(default, funding shallowly evaluated projects)] / (2 · V(lottery)). Your estimate of “0.3 to 0.5” implies that shallowly evaluated giving is as impactful as “0 to 0.4” of in-depth evaluated giving.
This x2.5..∞ multiplier is reasonable, but it doesn’t feel quite right to put 10% on above ∞ :)
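As a quick check of the algebra (a sketch, assuming the two-player setup above, where mult = [V(lottery) − V(default)] / (2 · V(lottery))):

```python
def implied_default_to_lottery_ratio(mult: float) -> float:
    """Invert mult = (V_lottery - V_default) / (2 * V_lottery)
    to get V_default / V_lottery."""
    return 1 - 2 * mult

for mult in (0.3, 0.4, 0.5, 0.6):
    print(f"multiplier {mult:.1f} -> V(default)/V(lottery) = "
          f"{implied_default_to_lottery_ratio(mult):+.1f}")
# 0.3 -> +0.4, 0.5 -> 0.0; anything above 0.5 implies a negative value
# for shallowly evaluated giving, i.e. a ratio "above infinity".
```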
This makes me further confused about the gap between the donor lottery and the alignment review.
You are understanding correctly that the Shapley value multiplier is responsible for preventing double-counting, but you’re making a mistake when you say that it “implies that shallowly evaluated giving is as impactful as ‘0 to 0.4’ of in-depth evaluated giving”; the latter doesn’t follow.
In the two-player game, you have Value({}), Value({1}), Value({2}), and Value({1,2}). The Shapley value for player 1 (the funders) is ([Value({1}) − Value({})] + [Value({1,2}) − Value({2})])/2, and the Shapley value for player 2 (the donor lottery winner) is ([Value({2}) − Value({})] + [Value({1,2}) − Value({1})])/2.
In this case, I’m taking [Value({2}) − Value({})] to be ~0 for simplicity, so the value of player 2 is [Value({1,2}) − Value({1})]/2. Note that this is just the counterfactual value divided by a fraction.
If there were more players, it would be a little bit more complicated, but you’d end up with something similar to [Value({1,2,3}) − Value({1,3})]/3. Note again that this is just the counterfactual value divided by a fraction.
But now, I don’t know how many players there are, so I just consider [Value({The world}) − Value({The world without player 2})]/(some estimates of how many players there are).
And the Shapley value multiplier would be 1/(some estimates of how many players there are).
At no point am I assuming that “shallowly evaluated giving is as impactful as 0 to 0.4 of in-depth evaluated giving”; the thing that I’m doing is just allocating value so that the sum of the value of each player is equal to the total value.
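A minimal numerical sketch of the two-player computation above; the `value` function and the numbers in `v` are made-up placeholders for illustration, not the figures from the post:

```python
from itertools import permutations

def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for p in order:
            before = value(frozenset(coalition))
            coalition.add(p)
            totals[p] += value(frozenset(coalition)) - before
    return {p: t / len(orderings) for p, t in totals.items()}

# Player 1 = the funders, player 2 = the donor lottery winner.
# Made-up numbers: the funders alone achieve 40 (shallow giving), the winner
# alone achieves ~0, and together they achieve 100.
v = {frozenset(): 0, frozenset({1}): 40, frozenset({2}): 0, frozenset({1, 2}): 100}
sv = shapley_values([1, 2], lambda s: v[s])
print(sv)                              # player 2 gets (100 - 40) / 2 = 30
print(sv[2] / v[frozenset({1, 2})])    # winner's Shapley multiplier: 0.3
```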
Thank you for engaging!
First, “note that this [misha: Shapley value of the evaluator] is just the counterfactual value divided by a fraction [misha: by two].” Right, this is exactly the same as in my comment; I further divide by the total impact to calculate the Shapley multiplier.
Do you think we disagree?
Why doesn’t my conclusion follow?
Second, you conclude “And the Shapley value multiplier would be 1/(some estimates of how many players there are)”, while your estimate is “0.3 to 0.5”. There were about 30 participants over the two lotteries that year, so you should have ended up with something an order of magnitude smaller, like “3% to 10%”.
Am I missing something?
Third, for the model with more than two players, it’s unclear to me who the players are. If these are the funders + N evaluators, you indeed will end up with (1/N)·(1 − V(funders)/V(lottery)) because
- Shapley multipliers should add up to 1, and
- the Shapley value of the funders is easy to calculate (any coalition without them lacks any impact).
Please note that V(funders) is V(default, …) from the comment above.
(Note that this model ignores that the beneficiary might win the lottery, in which case no donations would be made.)
In the end,
I think that it is necessary to estimate X in “shallowly evaluated giving is as impactful as X times in-depth evaluated giving”, because if X ≈ 1, the impact of the evaluator is close to nil.
I might not understand how you model impact here; please be more specific about the modeling setup and assumptions.
I don’t think that you should split the evaluators, basically because you want to disentangle the impact of evaluation from the impact of funding provision, not to calculate Adam’s personal impact.
Take it to the extreme: it would be pretty absurd to say that an overwhelmingly successful donor lottery (e.g. one seeding a new ACE Top Charity in a yet-unknown but highly tractable area of animal welfare, or discovering an AI alignment prodigy) had less impact than an average comment, just because too many people (100K of them) contributed a dollar each to participate in it.
Yes, we agree
No, we don’t agree. I think that Adam did better than other potential donor lottery winners would have, and so his counterfactual value is higher, and thus his Shapley value is also higher. If all the other donors had been clones of Adam, I agree that you’d just divide by n. Thus, the claim “In every example here, this will be equivalent to calculating counterfactual value, and dividing by the number of necessary stakeholders” is in fact wrong, and I was implicitly doing both of the following in one step: a. calculating Shapley values with “evaluators” as one agent, and b. thinking of Adam’s impact as a high proportion of the SV of the evaluator round.
The rest of our disagreements hinge on 2., and I agree that judging the evaluator step alone would make more sense.