Kudos for this write-up, and for your many other posts (both here and on LessWrong, it seems) on uncertainty.
Overall, I’m very much in the “Probabilities are pretty great and should eventually be used for most things” camp. That said, I think the “Scout vs. Soldier” framing applies here, so to speak; investigating both sides is pretty valuable. I’d definitely assign some probability to being wrong here.
My impression is that we’re probably in broad agreement here.
Some quick points that come to mind:
The debate on “are explicit probabilities useful” is very similar to the debates on “are metrics useful”, “are cost-benefit analyses useful”, and “is consequentialist reasoning useful.” I expect there’s a broad correlation between people’s positions on these.
In cases where probabilities are expected to be harmful, hopefully probabilities could be used to tell us so. Like, we could predict that explicit and public use would be harmful.
I’d definitely agree that it’s very possible to use probabilities poorly. I think a lot of Holden’s criticisms here would fall into this camp. Neural nets were honestly quite poor for a while, but thankfully that didn’t lead to scientists abandoning them. I think probabilities are a lot better now, but we could learn to get much better than them later. I’m not sure how we can get much better without them.
The optimizer’s curse can be adjusted for with reasonable use of Bayes. Bayesian hierarchical models should deal with it quite well. There’s been some discussion of this around “Goodhart” on LessWrong.
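To make that concrete, here’s a minimal sketch (all the numbers and the prior are made up purely for illustration, and the adjustment is just a textbook normal-normal posterior mean rather than anyone’s actual method): picking the option with the highest raw estimate systematically overstates its true value, while shrinking estimates toward a prior first largely removes that bias.

```python
import numpy as np

rng = np.random.default_rng(0)

n_options, n_trials = 20, 10_000
prior_mean, prior_sd, noise_sd = 0.0, 1.0, 1.0  # made-up values for illustration

naive_gap, shrunk_gap = [], []
for _ in range(n_trials):
    true_values = rng.normal(prior_mean, prior_sd, n_options)       # real (unknown) values
    estimates = true_values + rng.normal(0.0, noise_sd, n_options)  # noisy measurements

    # Naive: take the highest raw estimate at face value.
    best = np.argmax(estimates)
    naive_gap.append(estimates[best] - true_values[best])

    # Bayesian adjustment: shrink each estimate toward the prior mean
    # (normal-normal posterior mean), then pick the best adjusted option.
    w = prior_sd**2 / (prior_sd**2 + noise_sd**2)
    adjusted = prior_mean + w * (estimates - prior_mean)
    best_adj = np.argmax(adjusted)
    shrunk_gap.append(adjusted[best_adj] - true_values[best_adj])

print(f"average overestimate, naive choice:    {np.mean(naive_gap):+.2f}")
print(f"average overestimate, adjusted choice: {np.mean(shrunk_gap):+.2f}")
```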
I think I agree with pretty much all of that. And I’d say my position is close to yours, though slightly different; I might phrase mine like: “My understanding is that probabilities should always be used by ideal, rational agents with unlimited computational abilities etc. (Though that’s still slightly ‘received wisdom’ for me.) And I also think that most people, and perhaps even most EAs and rationalists, should use probabilities more often. But I doubt they should actually be used for most tiny decisions, by actual humans. And I think they’ve sometimes been used with far too little attention to their uncertainty—but I also think that this really isn’t an intrinsic issue with probabilities, and that intuitions are obviously also very often used overconfidently.”
(Though this post wasn’t trying to argue for that view, but rather to explore the potential downsides relatively neutrally and just see what that revealed.)
I’m not sure I know what you mean by the following two statements: “Probabilities [...] should eventually be used for most things” and “I think probabilities are a lot better now, but we could learn to get much better than them later.” Could you expand on those points? (E.g., would you say we should eventually use probabilities even the 100th time we make the same decision as before about what to put in our sandwiches?)
Other points:
1. Yes, I share that view. But I think it’s also interesting to note it’s not a perfect correlation. E.g. Roser writes:
while I believe that we always have probabilities, this paper refrains from taking a stance on how we ought to decide on the basis of these probabilities. The question whether we have probabilities is completely separate from the question how we ought to make use of them. Here, I only ask the former question. The two issues are often not kept separate: the camp that is in favour of relying on probabilities is often associated with processing them in line with expected utility theory. I myself am in favour of relying on probabilities but I reject expected utility theory (and related stances such as cost-benefit analysis), at least if it comes as a formal way of spelling out a maximizing consequentialist moral stance which does not properly incorporate rights.
2. Yes, I agree. Possibly I should’ve emphasised that more. I allude to a similar point with “It seems the expected value of me bothering to do this EPM is lower than the expected value of me just reading a few reviews and then “going with my gut” (and thus saving time for other things)”, and the accompanying footnote about utilitarianism.
4. I think I’ve seen what you’re referring to, e.g. in lukeprog’s post on the optimizer’s curse. And I think the basic idea makes sense to me (though not to the extent I could actually act on it right away if you handed me some data). But Chris Smith quotes the proposed solution, and then writes:
For entities with lots of past data on both the (a) expected values of activities and (b) precisely measured, realized values of the same activities, this may be an excellent solution.
In most scenarios where effective altruists encounter the optimizer’s curse, this solution is unworkable. The necessary data doesn’t exist.[7] The impact of most philanthropic programs has not been rigorously measured. Most funding decisions are not made on the basis of explicit expected value estimates. Many causes effective altruists are interested in are novel: there have never been opportunities to collect the necessary data.
The alternatives I’ve heard effective altruists propose involve attempts to approximate data-driven Bayesian adjustments as well as possible given the lack of data. I believe these alternatives either don’t generally work in practice or aren’t worth calling Bayesian.
That seems to me like at least a reason to expect the proposed solution to not work very well. My guess would be that we can still use our best guesses to make adjustments (e.g., just try to quantify our vague sense that a randomly chosen charity wouldn’t be very cost-effective), but I don’t think I understand the topic well enough to speak on that, really.
(And in any case, I’m not sure it’s directly relevant to the question of whether we should use EPs anyway, because, as covered in this post, it seems like the curse could affect alternative approaches too, and like the curse doesn’t mean we should abandon our best guess, just that we should be more uncertain about it.)
Hm… Some of this would take a lot more writing than would make sense in a blog post.
On overconfidence in probabilities vs. intuitions:
I think I mostly agree with you. One cool thing about probabilities is that they can be much more straightforwardly verified/falsified and measured using metrics for calibration. If we had much larger systems, I believe we could do a great deal of work to better ensure calibration with defined probabilities.
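For example (a toy sketch with made-up forecasts and outcomes, just to show the mechanics): you can score stated probabilities with a proper scoring rule like the Brier score, and check calibration by comparing forecasts to observed frequencies within buckets.

```python
import numpy as np

# Made-up forecasts (stated probabilities) and outcomes (1 = event happened).
probs = np.array([0.9, 0.8, 0.7, 0.6, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1])
outcomes = np.array([1, 1, 0, 1, 0, 0, 1, 0, 0, 0])

# Brier score: mean squared error of the probabilities (lower is better).
print(f"Brier score: {np.mean((probs - outcomes) ** 2):.3f}")

# Simple calibration check: within each bucket of stated probability,
# how often did the predicted events actually happen?
edges = [0.0, 1 / 3, 2 / 3, 1.0]
for lo, hi in zip(edges[:-1], edges[1:]):
    mask = (probs >= lo) & (probs <= hi) if hi == 1.0 else (probs >= lo) & (probs < hi)
    if mask.any():
        print(f"stated {lo:.2f}-{hi:.2f}: mean forecast {probs[mask].mean():.2f}, "
              f"observed frequency {outcomes[mask].mean():.2f}")
```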
“should eventually be used for most things”
I’m not saying that humans should come up with unique probabilities for most things on most days. One example I’d consider “used for most things” is a case where an AI uses probabilities to tell humans which actions seem best, and humans go with what the AI states. Something similar could be said for “a trusted committee” that uses probabilities as an intermediary.
“we could learn to get much better than them later”
I think there are strong claims that topics like Bayes, causality, and even rationality are still relatively poorly understood, and may be advanced a lot in the next 30-100 years. As we get better with them, I predict we would get better at formal modeling.
I reject expected utility theory (and related stances such as cost-benefit analysis), at least if it comes as a formal way of spelling out a maximizing consequentialist moral stance which does not properly incorporate rights.
This is a complicated topic. I think a lot of Utilitarians/Consequentialists wouldn’t deem many interpretations of rights as metaphysical or terminally-valuable things. Another way to look at it would be to attempt to map the rights to a utility function. Utility functions require very, very few conditions. I’m personally a bit cynical of values that can’t be mapped to utility functions, even if in a highly-uncertain way.
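As one crude illustration of what such a mapping could look like (a toy sketch of my own, with an arbitrary penalty weight, not a claim about how this should actually be formalised): treat a rights violation as a large fixed penalty inside an otherwise ordinary utility function, so it dominates normal trade-offs while still being a single real-valued function over outcomes.

```python
# Toy sketch: fold a "right" into a utility function as a large penalty term.
# The penalty weight is arbitrary; the point is only that the resulting ranking
# of outcomes is still expressed by a single real-valued function.
RIGHTS_VIOLATION_PENALTY = 1e6

def utility(welfare: float, violates_right: bool) -> float:
    return welfare - (RIGHTS_VIOLATION_PENALTY if violates_right else 0.0)

# A high-welfare option that violates a right loses to a modest option that doesn't.
print(utility(100.0, violates_right=True) < utility(10.0, violates_right=False))  # True
```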
But Chris Smith quotes the proposed solution, and then writes…
It’s clear Chris Smith has thought about some of this topic a fair bit, but my impression is that I disagree with him. It’s quite possible that much of the disagreement is semantic; where he says ‘this solution is unworkable’ I may say, ‘the solution results in very wide uncertainty’. I think it’s clear to everyone (the main researchers anyway) that there’s little data about many of these topics, and that Bayesian or any other kind of statistical manipulation can’t fundamentally convert “very little data” into “a great deal of confidence”.
Kudos for identifying that post. The main solution I was referring to was the one described in the second comment:
In statistics the solution you describe is called Hierarchical or Multilevel Modeling. You assume that your data is drawn from a set of distributions which have their parameters drawn from another distribution. This automatically shrinks your estimates of the distributions towards the mean. I think it’s a pretty useful trick to know and I think it would be good to do a writeup but I think you might need to have a decent grasp of bayesian statistics first.
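Roughly, that shrinkage could look like this (an empirical-Bayes sketch on simulated data, with the observation noise assumed known, rather than a full Bayesian hierarchical model): estimate the overall mean and between-group spread from the estimates themselves, then pull each noisy estimate toward the overall mean accordingly.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated setting: each "program" has a true effect drawn from a shared
# distribution, but we only observe a noisy estimate of it.
n_programs, noise_sd = 30, 2.0
true_effects = rng.normal(1.0, 1.0, n_programs)
estimates = true_effects + rng.normal(0.0, noise_sd, n_programs)

# Empirical-Bayes step: estimate the hyperparameters (overall mean and
# between-program variance) from the observed estimates themselves.
grand_mean = estimates.mean()
between_var = max(estimates.var(ddof=1) - noise_sd**2, 1e-9)

# Shrink each estimate toward the overall mean; the noisier the data
# relative to the between-program spread, the stronger the shrinkage.
w = between_var / (between_var + noise_sd**2)
shrunk = grand_mean + w * (estimates - grand_mean)

print("mean squared error, raw estimates:   ", round(float(np.mean((estimates - true_effects) ** 2)), 2))
print("mean squared error, shrunk estimates:", round(float(np.mean((shrunk - true_effects) ** 2)), 2))
```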
The optimizer’s curse arguably is basically within the class of Goodhart-like problems: https://www.lesswrong.com/posts/5bd75cc58225bf06703754b2/the-three-levels-of-goodhart-s-curse

I’m not saying that these are easy to solve, but rather, there is a mathematical strategy to generally fix them in ways that would make sense intuitively. There’s no better approach than to try to approximate the mathematical approach, or go with an approach that in-expectation does a decent job at approximating the mathematical approach.

That all seems to make sense to me. Thanks for the interesting reply!
Just found this post, coming in to comment a year late—Thanks Michael for the thoughtful post and Ozzie for the thoughtful comments!
I’m not saying that these are easy to solve, but rather, there is a mathematical strategy to generally fix them in ways that would make sense intuitively. There’s no better approach than to try to approximate the mathematical approach, or go with an approach that in-expectation does a decent job at approximating the mathematical approach.
I might agree with you about what’s (in some sense) mathematically possible (in principle). In practice, I don’t think people trying to approximate the ideal mathematical approach are going to have a ton of success (for reasons discussed in my post and quoted in Michael’s previous comment).
I don’t think searching for “an approach that in-expectation does a decent job at approximating the mathematical approach” is pragmatic.
In most important scenarios, we’re uncertain what approaches work well in-expectation. Our uncertainty about what works well in-expectation is the kind of uncertainty that’s hard to hash out in probabilities. A strict Bayesian might say, “That’s not a problem—with even more math, the uncertainty can be handled....”
While you can keep adding more math and technical patches to try and ground decision making in Bayesianism, pragmatism eventually pushes me in other directions. I think David Chapman explains this idea a hell of a lot better than I can in Rationalism’s Responses To Trouble.
Getting more concrete: Trusting my gut or listening to domain experts might turn out to be approaches that work well in some situations. If one of these approaches works, I’m sure someone could argue in hindsight that it works because it approximates an idealized mathematical approach. But I’m skeptical of the merits of work done in reverse (i.e., trying to discover non-math approaches by looking for things that will approximate idealized mathematical approaches).
Hmm, I feel like you may be framing things quite differently to how I would, or something. My initial reaction to your comment is something like:
It seems useful to conceptually separate data collection from data processing, where by the latter I mean using that data to arrive at probability estimates and decisions.
I think Bayesianism (in the sense of using Bayes’ theorem and a Bayesian interpretation of probability) and “math and technical patches” might tend to be part of the data processing, not the data collection. (Though they could also guide what data to look for. And this is just a rough conceptual divide.)
When Ozzie wrote about going with “an approach that in-expectation does a decent job at approximating the mathematical approach”, he was specifically referring to dealing with the optimizer’s curse. I’d consider this part of data processing.
Meanwhile, my intuitions (i.e., gut reactions) and what experts say are data. Attending to them is data collection, and then we have to decide how to integrate that with other things to arrive at probability estimates and decisions.
I don’t think we should see ourselves as deciding between either Bayesianism and “math and technical patches” or paying attention to my intuitions and domain experts. You can feed all sorts of evidence into Bayes’ theorem. I doubt any EA would argue we should form conclusions from “Bayesianism and math alone”, without using any data from the world (including even their intuitive sense of what numbers to plug in, or whether people they share their findings with seem skeptical). I’m not even sure what that’d look like.
And I think my intuitions or what domain experts say can very easily be made sense of as valid data within a Bayesian framework. Generally, my intuitions and experts are more likely to indicate X is true in worlds where X is true than where it’s not. This effect is stronger when the conditions for intuitive expertise are met, when experts’ incentives seem to be well aligned with seeking and sharing truth, etc. This effect is weaker when it seems that there are strong biases or misaligned incentives at play, or when it seems there might be.
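To put toy numbers on that (none of these are estimates of real expert reliability; it’s just the structure of the update): if experts are more likely to say “X” when X is true than when it’s false, then hearing them say “X” raises our credence via Bayes’ theorem, and by less when the likelihood ratio is closer to 1.

```python
# Toy Bayes update: treat "an expert says X" as evidence about X.
prior = 0.30             # P(X) before hearing the expert (made up)
p_say_if_true = 0.80     # P(expert says X | X is true)  (made up)
p_say_if_false = 0.30    # P(expert says X | X is false) (made up)

posterior = (p_say_if_true * prior) / (
    p_say_if_true * prior + p_say_if_false * (1 - prior)
)
print(f"P(X | expert says X) = {posterior:.2f}")  # ~0.53

# If biases or incentives make the expert nearly as likely to say X either way,
# the likelihood ratio approaches 1 and the update shrinks toward the prior.
```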
(Perhaps this is talking past you? I’m not sure I understood your argument.)
I largely agree with what you said in this comment, though I’d say the line between data collection and data processing is often blurred in real-world scenarios.
I think we are talking past each other (not in a bad faith way though!), so I want to stop myself from digging us deeper into an unproductive rabbit hole.