When do you think it’s best to use WFM vs expected value calculation?
Asking as I tend to use EV for charity choice. As I see it, the pro of EV is that it captures variables like scale really nicely, while the pro of a WFM is that it's more robust to imperfect inputs.
I think EV is one valuable (but incomplete) metric for evaluating charities. WFMs (weighted factor models) can capture EV as well as other variables that are harder to incorporate quantitatively. However, creating BOTECs (back-of-the-envelope calculations) to estimate EV is a lot faster than making a full WFM. Which one to use is, in my view, a question of whether the importance of your decision justifies that extra effort or whether your time would be better spent on other decisions/work.
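For concreteness, here is a minimal sketch of the two tools side by side. All numbers, criteria, and weights are hypothetical, purely to show the shape of each calculation:

```python
# Hypothetical numbers throughout -- this only illustrates the shape of each tool.

# BOTEC-style EV estimate: a few multiplied guesses.
people_reached = 10_000        # guess
effect_per_person = 0.02       # QALYs per person reached, guess
cost = 50_000                  # total cost in USD, guess
ev_per_dollar = people_reached * effect_per_person / cost
print(f"EV: {ev_per_dollar:.4f} QALYs per dollar")

# WFM-style score: a weighted sum over several criteria (scored 1-10).
weights = {"scale": 0.3, "neglectedness": 0.2, "tractability": 0.3, "evidence": 0.2}
scores = {"scale": 8, "neglectedness": 5, "tractability": 6, "evidence": 7}
wfm_score = sum(weights[k] * scores[k] for k in weights)
print(f"WFM score: {wfm_score:.1f} / 10")
```

The BOTEC is three multiplications; the WFM takes longer mainly because choosing defensible criteria, weights, and scores is the hard part.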
Regardless of which one you choose, you should be careful not to rely on just one tool. EV reasoning is vulnerable to Pascal's Mugging and the Optimizer's Curse. WFMs are vulnerable to the issues I talked about in my post, and more. The underlying point is that we need to supplement our tools with critical thinking to ensure we're not falling victim to their weaknesses.
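The Optimizer's Curse is easy to demonstrate by simulation: if you pick the option with the highest *estimated* value from a set of noisy estimates, the chosen option's estimate systematically overstates its true value. A small sketch (hypothetical distributions, standard-library only):

```python
import random

random.seed(0)

def average_overestimate(n_options=20, noise_sd=1.0, trials=5000):
    """Pick the option with the highest estimated value, then compare
    that estimate to the option's true value, averaged over many trials."""
    gap = 0.0
    for _ in range(trials):
        true_values = [random.gauss(0, 1) for _ in range(n_options)]
        estimates = [v + random.gauss(0, noise_sd) for v in true_values]
        best = max(range(n_options), key=lambda i: estimates[i])
        gap += estimates[best] - true_values[best]
    return gap / trials

# The gap is reliably positive: selecting on noisy estimates bakes in optimism.
print(f"Average overestimate of the chosen option: {average_overestimate():.2f}")
```

Note that no individual estimate is biased here; the bias comes entirely from the act of selecting the maximum.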
EV reasoning is vulnerable to Pascal’s Mugging and the Optimizer’s Curse.
Hi Evan. One can account for priors (information besides the new evidence) to mitigate these issues, as Holden Karnofsky suggests in his post about not taking expected value estimates literally when they do not incorporate priors. (I think GiveWell does take expected value estimates close to literally, since they incorporate the vast majority of the available evidence.)
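The prior-adjustment idea can be sketched as a standard normal-normal Bayesian update, where the posterior mean is a precision-weighted average of the prior and the new estimate. The specific numbers below are hypothetical:

```python
def posterior(prior_mean, prior_sd, est_mean, est_sd):
    """Normal-normal Bayesian update: the posterior mean is a
    precision-weighted average of the prior and the estimate."""
    w_prior, w_est = 1 / prior_sd**2, 1 / est_sd**2
    mean = (w_prior * prior_mean + w_est * est_mean) / (w_prior + w_est)
    sd = (w_prior + w_est) ** -0.5
    return mean, sd

# Hypothetical prior over cost-effectiveness in some area, plus one
# BOTEC claiming a very high figure.
noisy = posterior(prior_mean=5, prior_sd=3, est_mean=100, est_sd=50)
tight = posterior(prior_mean=5, prior_sd=3, est_mean=100, est_sd=1)
print(noisy)  # very noisy estimate: posterior stays close to the prior
print(tight)  # precise estimate: posterior moves most of the way to it
```

This is the mechanism behind "don't take wild EV estimates literally": the noisier the estimate, the more the prior dominates.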
You are correct that there are ways to mitigate these issues. However, that does not mean that the issues completely disappear or that the method is without weakness.
The fundamental problem remains. As I mentioned in my original post, any system for decision-making trades away some truth for practicality.
A more refined method makes some weaknesses less pronounced, but refinements frequently introduce new types of errors (like the WFM example in my post). We still need to factor methodological bias into our final decisions.
You cite GiveWell as an example of an organization that takes EV estimates “close to literally”. I assume by this you mean the EV estimates they make with respect to cost-effectiveness. However, GiveWell outlines 5 things they keep in mind when considering cost-effectiveness here, including the following:
Because of the many limitations of cost-effectiveness estimates, we consider other factors when recommending programs or grants. For example, confidence in an organization’s track record and the strength of the evidence for an intervention generally also carry significant weight in our investigations.
In other words, GiveWell seems to believe that cost-effectiveness is a useful tool, but not a perfect one. The method carries its own biases, so they acknowledge those limitations and incorporate other factors before making a final decision.
The fundamental problem remains. As I mentioned in my original post, any system for decision-making trades away some truth for practicality.
Agreed. At the same time, I struggle to see practical cases where it makes sense to spend significant time on WFMs. I would rather improve cost-effectiveness analyses (CEAs): for example, by accounting better for priors, modelling more effects, and gathering more evidence to decrease uncertainty in key inputs. GiveWell uses CEAs all the time, but as far as I know it has never included a WFM in its public analyses. Gemini did not find any examples either.
You cite GiveWell as an example of an organization that takes EV estimates “close to literally”. I assume by this you mean the EV estimates they make with respect to cost-effectiveness.
Yes. Elie Hassenfeld, GiveWell’s CEO, mentioned the following on the Clearer Thinking podcast.
GiveWell cost-effectiveness estimates are not the only input into our decisions to fund malaria programs and deworming programs, there are some other factors, but they're certainly 80% plus of the case.
Isabel Arjmand from GiveWell elaborated on the above.
The numerical cost-effectiveness estimate in the spreadsheet is nearly always the most important factor in our recommendations, but not the only factor. That is, we don’t solely rely on our spreadsheet-based analysis of cost-effectiveness when making grants.
We don’t have an institutional position on exactly how much of the decision comes down to the spreadsheet analysis (though Elie’s take of “80% plus” definitely seems reasonable!) and it varies by grant, but many of the factors we consider outside our models (e.g. qualitative factors about an organization) are in the service of making impact-oriented decisions. See this post for more discussion.
For a small number of grants, the case for the grant relies heavily on factors other than expected impact of that grant per se. For example, we sometimes make exit grants in order to be a responsible funder and treat partner organizations considerately even if we think funding could be used more cost-effectively elsewhere.
I struggle to see practical cases where it makes sense to spend significant time on WFMs. I would rather improve cost-effectiveness analyses (CEAs).
I think that is a reasonable decision. I think WFMs are very useful for certain types of decisions, but not always. I use CEAs much more often. My claim is *not* that more people should be using WFMs. If anything, my post should be seen as a warning to those who do.
My claim is that people should take time to understand their tools and account for their weaknesses. Accounting for weaknesses should happen not just within the tool, but outside of it when making the final decision.
I think GiveWell is a good example of this. If CEAs made up 100% of their decision-making process, their decisions would be heavily influenced by the weaknesses of CEAs as a method. However, GiveWell acknowledges these weaknesses and uses CEAs as a primary deciding factor while also incorporating other factors.
My claim is that people should take time to understand their tools and account for their weaknesses. Accounting for weaknesses should happen not just within the tool, but outside of it when making the final decision.
I think GiveWell is a good example of this. If CEAs made up 100% of their decision-making process, their decisions would be heavily influenced by the weaknesses of CEAs as a method. However, GiveWell acknowledges these weaknesses and uses CEAs as a primary deciding factor while also incorporating other factors.
Thanks for writing this up :)