Again, none of this is to say that Bayesianism is fundamentally broken or that high-level Bayesian-ish things like “I have a very skeptical prior so I should not take this estimate of impact at face value” are crazy.
As a real world example:
Venture capitalists frequently fund things that they’re extremely uncertain about. It’s my impression that Bayesian calculations rarely play into these situations. Instead, smart VCs think hard and critically and come to conclusions based on processes that they probably don’t fully understand themselves.
It could be that VCs have just failed to realize the amazingness of Bayesianism. However, given that they’re smart & there’s a ton of money on the table, I think the much more plausible explanation is that hardcore Bayesianism wouldn’t lead to better results than whatever it is that successful VCs actually do.
It’s always worth entertaining multiple models if you can do that at no cost. However, doing that often comes at some cost (money, time, etc). In situations with lots of uncertainty (where the optimizer’s curse is liable to cause significant problems), it’s worth paying much higher costs to entertain multiple models (or do other things I suggested) than it is in cases where the optimizer’s curse is unlikely to cause serious problems.
Hey Kyle, I’d stopped responding since I felt like we were well beyond the point where we were likely to convince one another or say things that those reading the comments would find insightful.
I understand why you think “good prior” needs to be defined better.
As I try to communicate (but may not quite say explicitly) in my post, I think that in situations where uncertainty is poorly understood, it’s hard to come up with priors that are good enough that choosing actions based on explicit Bayesian calculations will lead to better outcomes than choosing actions based on a combination of careful skepticism, information gathering, hunches, and critical thinking.
I’d also be excited to see more people in the EA movement doing the sort of work that I think would put society in a good position for handling future problems when they arrive. E.g., I think a lot of people who associate with EA might be awfully good at pushing for progress in metascience/open science or promoting a free & open internet.
Thanks for raising this.
To be clear, I’m still a huge fan of GiveWell. GiveWell only shows up in so many examples in my post because I’m so familiar with the organization.
I mostly agree with the points Holden makes in his cluster thinking post (and his other related posts). Despite that, I still have serious reservations about some of the decision-making strategies used both at GW and in the EA community at large. It could be that Holden and I mostly agree, but other people take different positions. It could also be that Holden and I agree about a lot of things at a high level but then have significantly different perspectives about how those things should actually manifest themselves in concrete decision making.
For what it’s worth, I do feel like the page you linked to from GiveWell’s website may downplay the role cost-effectiveness plays in its final recommendations (though GiveWell may have a good rebuttal).
In a response to Taymon’s comment, I left a specific example of something I’d like to see change. In general, I’d like people to be more reluctant to brute-force push their way through uncertainty by putting numbers on things. I don’t think people need to stop doing that entirely, but I think it should be done while keeping in mind something like: “I’m using lots of probabilities in a domain where I have no idea if I’m well-calibrated... I need to be extra skeptical of whatever conclusions I reach.”
Just to be clear, much of the deworming work supported by people in the EA community happens in areas where worm infections are more intense or are caused by worm species other than Trichuris & Ascaris. However, I believe a non-trivial amount of deworming done by charities supported by the EA community occurs in areas w/ primarily light infections from those worms.
Sure. To be clear, I think most of what I’m concerned about applies to prioritization decisions made in highly-uncertain scenarios. So far, I think the EA community has had very few opportunities to look back and conclusively assess whether highly-uncertain things it prioritized turned out to be worthwhile. (Ben makes a similar point at https://www.lesswrong.com/posts/Kb9HeG2jHy2GehHDY/effective-altruism-is-self-recommending.)
That said, there are cases where I believe mistakes are being made. For example, I think mass deworming in areas where almost all worm infections are light cases of trichuriasis or ascariasis is almost certainly not among the most cost-effective global health interventions.
Neither trichuriasis nor ascariasis appears to have common/significant/easily-measured symptoms when infections are light (i.e., when there are not many worms in an infected person’s body). To reach the conclusion that treating these infections has a high expected value, extrapolations are made from the results of a study that had some weird features and occurred in a very different environment (an environment with far heavier infections and additional types of worm infections). When GiveWell makes its extrapolations, lots of discounts, assumptions, probabilities, etc. are used. I don’t think people can make this kind of extrapolation reliably (even if they’re skeptical, smart, and thinking carefully). When unreliable estimates are combined with an optimization procedure, I worry about the optimizer’s curse.
Someone who is generally skeptical of people’s ability to productively use models in highly-uncertain situations might instead survey experts about the value of treating light trichuriasis & ascariasis infections. Faced with the decision of funding either this kind of deworming or a different health program that looked highly-effective, I think the example person who ran surveys would choose the latter.
I think it’s super exciting—a really useful application of probability!
I don’t know as much as I’d like to about Tetlock’s work. My understanding is that the work has focused mostly on geopolitical events where forecasters have been awfully successful. Geopolitical events are a kind of thing I think people are in an OK position for predicting—i.e. we’ve seen a lot of geopolitical events in the past that are similar to the events we expect to see in the future. We have decent theories that can explain why certain events came to pass while others didn’t.
I doubt that Tetlock-style forecasting would be as fruitful in unfamiliar domains that involve Knightian-ish uncertainty. Forecasting may not be particularly reliable for questions like:
-Will we have a detailed, broadly accepted theory of consciousness this century?
-Will quantum computers take off in the next 50 years?
-Will any humans leave the solar system by 2100?
(That said, following Tetlock’s guidelines may still be worthwhile if you’re trying to predict hard-to-predict things.)
I’m struggling to understand how your proposed new group avoids the optimizer’s curse, and I’m worried we’re already talking past each other. To be clear, I don’t believe there’s something wrong with Bayesian methods in the abstract. Those methods are correct in a technical sense. They clearly work in situations where everything that matters can be completely quantified.
The position I’m taking is that the scope of real-world problems that those methods are useful for is limited because our ability to precisely quantify things is severely limited in many real-world scenarios. In my post, I try to build the case for why attempting Bayesian approaches in scenarios where things are really hard to quantify might be misguided.
Thanks Max! That paper looks interesting—I’ll have to give it a closer read at some point.
I agree with you that how the reliability of assessments varies between options is crucial.
Can you expand on how you would directly estimate the reliability of charity evaluations? I feel like there are a lot of realistic situations where this would be extremely difficult to do well.
Thanks for the detailed comment!
I expect we’ll remain in disagreement, but I’ll clarify where I stand on a couple of points you raised:
“Optimizer’s curse only matters when comparing better-understood projects to worse-understood projects, but you are talking about “prioritizing among funding opportunities that involve substantial, poorly understood uncertainty.””
Certainly, the optimizer’s curse may be a big deal when well-understood projects are compared with poorly-understood projects. However, I don’t think it’s the case that all projects involving “substantial, poorly understood uncertainty” are on the same footing. Rather, each project is on its own footing, and we’re somewhat ignorant about how firm that footing is.
“We can use prior distributions.”
Yes, absolutely. What I worry about is how reliable those priors will be. I maintain that, in many situations, it’s very hard to defend any particular prior.
“And there is no reason to assume that probabilistic decision makers will overestimate as opposed to underestimate.”
This gets at what I’m really worried about! Let’s assume decisionmakers coming up with probabilistic estimates to assess potential activities don’t have a tendency to overestimate or underestimate. However, once a decisionmaker has made many estimates, there is reason to believe the activities that look most promising likely involve overestimates (because of the optimizer’s curse).
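To make that concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration (the function name, the number of options, the normal distributions for true values and estimation errors); it isn't a model of any real evaluation process. Each estimate is unbiased on its own, yet the option that looks best tends to be one whose estimate came out too high.

```python
import random
import statistics


def simulate_optimizers_curse(n_options=20, noise_sd=1.0, n_trials=2000, seed=0):
    """Average (estimate - true value) for the option with the highest estimate."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(n_trials):
        # Hypothetical true values of each option (unknown to the decisionmaker).
        true_values = [rng.gauss(0.0, 1.0) for _ in range(n_options)]
        # Unbiased estimates: true value plus zero-mean noise.
        estimates = [v + rng.gauss(0.0, noise_sd) for v in true_values]
        # Choose whichever option looks most promising.
        best = max(range(n_options), key=lambda i: estimates[i])
        gaps.append(estimates[best] - true_values[best])
    return statistics.mean(gaps)


mean_gap = simulate_optimizers_curse()
print(f"Average overestimate for the chosen option: {mean_gap:.2f}")
```

With a single option (`n_options=1`) there is no selection step, so the average gap should sit near zero. The positive gap shows up only once you pick the best-looking option out of many.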
“Here’s a question: how are you going to adjust for the optimizer’s curse if you don’t use probability (implicitly or explicitly)?”
This is a great question!
Rather than saying, “This is a hard problem, and I have an awesome solution no one else has proposed,” I’m trying to say something more like, “This is a problem we should acknowledge! Let’s also acknowledge that it’s a damn hard problem and may not have an easy solution!”
That said, I think there are approaches that have promise (but are not complete solutions):
-Favoring opportunities that look promising under multiple models.
-Being skeptical of opportunities that look promising under only a single model.
-Learning more (if that can cause probability estimates to become less uncertain & hazy).
-Doing more things to put society in a good position to handle problems when they arise (or become apparent) instead of trying to predict problems before they arise (or become apparent).
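As a rough sketch of the first two items, here’s one way the “multiple models” check could look in code. The function `robust_picks`, the model names, the option names, and all the numbers are hypothetical; this is just one possible screening rule, not a recommendation for how anyone should actually score opportunities.

```python
def robust_picks(estimates_by_model, top_k=2):
    """Return the options that rank in the top_k under every model."""
    top_sets = []
    for estimates in estimates_by_model.values():
        # Rank this model's options from most to least promising.
        ranked = sorted(estimates, key=estimates.get, reverse=True)
        top_sets.append(set(ranked[:top_k]))
    # Keep only options that every model places near the top.
    return set.intersection(*top_sets)


# Made-up estimates from three hypothetical models for four hypothetical options.
estimates_by_model = {
    "model_a": {"w": 9.0, "x": 7.0, "y": 3.0, "z": 1.0},
    "model_b": {"w": 2.0, "x": 6.0, "y": 5.0, "z": 1.5},
    "model_c": {"w": 1.0, "x": 8.0, "y": 9.0, "z": 2.0},
}
print(robust_picks(estimates_by_model))  # {'x'}
```

Option `w` looks best under `model_a` alone but drops out because the other two models rank it poorly. Only `x` looks promising under all three. The function assumes at least one model is supplied.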
“Here’s what it means, formally: given that I have an equal desire to be right about the existence of God and the nonexistence of God, and given some basic assumptions about my money and my desire for money, I would make a bet with at most 50:1 odds that all-powerful-God exists.”
This is how a lot of people think about statements of probability, and I think that’s usually reasonable. I’m concerned that people are sometimes accidentally equivocating between “I would bet on this with at most 50:1 odds” and “this is as likely to occur as a perfectly fair 50-sided die being rolled and coming up ‘17.’”
“But in Bayesian decision theory, they aren’t on the same footing. They have very different levels of robustness. They are not well-grounded and this matters for how readily we update away from them. Is the notion of robustness inadequate for solving some problem here?”
The notion of robustness points in the right direction, but I think it’s difficult (perhaps impossible) to reliably and explicitly quantify robustness in the situations we’re concerned about.
It’s definitely an interesting phenomenon & worth thinking about seriously.
Any procedures for optimizing for expected impact could go wrong if the value of long-term alliances and relationships isn’t accounted for.
Thanks Milan—I probably should have been a bit more detailed in my summary.
Here are the main issues I see:
-The optimizer’s curse is an underappreciated threat to those who prioritize among causes and programs that involve substantial, poorly understood uncertainty.
-I think EAs are unusually prone to wrong-way reductions: a fallacy where people try to solve messy, hard problems with tidy, formulaic approaches that actually create more issues than they resolve.
--I argue that trying to turn all uncertainty into something like numeric probability estimates is a wrong-way reduction that can have serious consequences.
--I argue that trying to use Bayesian methods in situations where well-grounded priors are unavailable is often a wrong-way reduction. (For what it’s worth, I rarely see EAs actually deploy these Bayesian methods, but I often see people suggest that the proper approaches in hard situations involve “making a Bayesian adjustment.” In many of these situations, I’d argue that something closer to run-of-the-mill critical thinking beats Bayesianism.)
-I think EAs sometimes have an unwarranted bias towards numerical, formulaic approaches over less-quantitative approaches.