One way to think about this phenomenon of reversion to the mean is in the bandit problem setting, where you are choosing between a bunch of levers in order to ascertain their payouts. This is not my area, so take this with a grain of salt, but here is my understanding. You can think of each lever as giving out payoffs with a normal distribution, where the mean and variance of each lever are themselves randomly initialised from some known distribution. There are many ways to choose a lever, and they have pretty different behaviour. You could (1) be a myopic Bayesian, and choose the lever with the highest expected immediate payoff, taking into account that levers with high variance probably aren’t as good as they appear, (2) take a simplistic historical approach, and choose the lever with the highest payoff when you pulled it in the past, or (3) use the “upper confidence bound” (UCB) algorithm, which chooses the lever for which the upper bound of your confidence interval for the payoff is highest. It turns out that option (3)—which is pretty over-optimistic, and not very cautious, about your impact—converges to optimality, and does so more quickly than (noisy) variations of (1) and (2). If you’re following a strategy like (3), then you’ll switch a lot, and the levers that you pull will often not appear optimal in hindsight, but that’s just the consequence of proper exploration.
NB. If any experts want to clarify/fix anything I’ve said then please do.
An idealised Bayesian would know that these “high upper confidence bound” levers probably pay off less than they appear. So although we spend more time thinking about, or focusing on, higher-variance risks, we should not fear or worry about them any extra.
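To make this concrete, here is a minimal sketch of the three strategies. The specifics (unit payoff noise, a standard-normal prior over lever means, a UCB1-style bonus, and crude stand-ins for strategies (1) and (2)) are my own simplifying assumptions, so treat it as illustrative rather than as the canonical comparison.

```python
import math
import random

random.seed(0)
K, T = 10, 5000   # number of levers, pulls per run (illustrative values)
NOISE_SD = 1.0    # payoff noise around each lever's true mean


def simulate(choose):
    """Play T rounds with the given selection rule; return average payoff per pull."""
    means = [random.gauss(0, 1) for _ in range(K)]  # lever means drawn from the N(0,1) prior
    counts, sums = [0] * K, [0.0] * K
    best = [-math.inf] * K  # highest single payoff seen from each lever
    total = 0.0
    for t in range(1, T + 1):
        arm = t - 1 if t <= K else choose(t, counts, sums, best)  # pull each lever once first
        x = random.gauss(means[arm], NOISE_SD)
        counts[arm] += 1
        sums[arm] += x
        best[arm] = max(best[arm], x)
        total += x
    return total / T


def myopic_bayes(t, counts, sums, best):
    # (1) Highest posterior mean: with a N(0,1) prior and unit noise, the posterior
    # mean is sums[i] / (counts[i] + 1), which shrinks noisy estimates toward zero
    # but never deliberately explores.
    return max(range(K), key=lambda i: sums[i] / (counts[i] + 1))


def best_past(t, counts, sums, best):
    # (2) Highest single payoff observed in the past: chases lucky draws.
    return max(range(K), key=lambda i: best[i])


def ucb(t, counts, sums, best):
    # (3) UCB1: sample mean plus an optimism bonus that shrinks as a lever is
    # pulled more, which forces under-explored levers to be tried.
    return max(range(K), key=lambda i: sums[i] / counts[i]
                                       + math.sqrt(2 * math.log(t) / counts[i]))


for name, rule in [("myopic Bayes", myopic_bayes),
                   ("best past payoff", best_past),
                   ("UCB", ucb)]:
    avg = sum(simulate(rule) for _ in range(20)) / 20
    print(f"{name:>17}: average payoff per pull = {avg:.3f}")
```

On most seeds, the UCB rule ends up with the highest long-run average, because the other two rules can lock onto a suboptimal lever early; the exact numbers vary with the seed.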
The bandit problem is definitely related, although I’m not sure it’s the best way to formulate the situation here. The main issue is that the bandit formulation, here, treats learning about the magnitude of a risk and working to address the risk as the same action—when, in practice, they often come apart.
Here’s a toy model/analogy that feels a bit more like it fits the case, in my mind.
Let’s say there are two types of slot machines: one that has a 0% chance of paying and one that has a 100% chance of paying. Your prior gives you a 90% credence that each machine is non-paying.[1]
Unfortunately: When you pull the lever on either machine, you don’t actually get to see what the payout is. However, there’s some research you can do to try to get a clearer sense of what each machine’s “type” is.
And this research is more tractable in the case of the first machine. For example: Maybe the first machine has identifying information on it, like a model number, which might allow you to (e.g.) call up the manufacturer and ask them. The second machine is just totally nondescript.
The most likely outcome, then, is that you quickly find out that the first slot machine is almost certainly non-paying—but continue to have around a 10% credence that the second machine pays.
In this scenario, you should keep pulling the lever on the second machine. You should also, even as a rational Bayesian, actually be more optimistic about the second machine.
(By analogy, I think we actually should tend to fear speculative existential risks more.)
A more sophisticated version of this scenario would have a continuum of slot machine types and a skewed prior over the likelihood of different types arising.
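To put rough numbers on the two-machine story, here is a minimal Bayesian-update sketch. The likelihood ratio of 20 for the research signal is a number I have made up purely for illustration.

```python
def posterior_paying(prior_paying, lr_nonpaying):
    """Posterior P(paying) after a signal that is lr_nonpaying times more
    likely if the machine is non-paying than if it pays."""
    p = prior_paying
    return p / (p + (1 - p) * lr_nonpaying)

prior = 0.10  # 10% prior credence that each machine pays

# Machine 1 is researchable, and the research points to "non-paying".
p1 = posterior_paying(prior, lr_nonpaying=20)  # ~0.0055: almost certainly non-paying

# Machine 2 is nondescript, so there is no evidence to update on.
p2 = prior  # still 0.10

print(p1, p2)  # the rational Bayesian now prefers pulling machine 2
```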
Interesting, that makes perfect sense. However, if there’s no correlation between the payoff of an arm and our ability to know it, then we should eventually find an arm that pays off 100% of the time with high probability, pull that arm, and stop worrying about the unknowable one. So I’m not sure your story explains why we end up fixating on the uncertain interventions (AIS research).
Another way to explain why the uncertain risks look big would be that we are unable to stop society pulling the AI progress lever until we have proven it to be dangerous. Definitely-risky activities just get stopped! Maybe that’s implicitly how your model gets the desired result.
The story does require there to be only a very limited number of arms that we initially think have a non-negligible chance of paying. If there are unlimited arms, then one of them should be both paying and easily identifiable.
So the story (in the case of existential risks) is that there are only a very small number of risks that, on the basis of limited argument/evidence, initially seem like they might lead to extinction or irrecoverable collapse by default. Maybe this set looks like: nuclear war, misaligned AI, pandemics, nanotechnology, climate change, overpopulation / resource depletion.
If we’re only talking about a very limited set, like this, then it’s not too surprising that we’d end up most worried about an ambiguous risk.
I think we could probably invest a lot more time and resources in interventions that are plausibly good, in order to get more evidence about them. We should probably do more research, although I realise this point is somewhat self-serving. For larger donors, this probably means diversifying their giving more if the value of information diminishes steeply enough, which I think might be the case.
Psychologically, I think we should be a bit more resilient to failure and change. When people consider the idea that they might be giving to cause areas that could turn out to be completely fruitless, I think they find it psychologically difficult. In some ways, just thinking “Look, I’m just exploring this to get the information about how good it is, and if it’s bad, I’ll just change. Or, if it doesn’t do as well as I thought, I’ll just change.” can be quite comforting if you worry about these things.
The extreme view that you could have is “We should just start investing time and money in interventions with high expected value, but little or no evidential support.” A more modest proposal, that I tentatively endorse, is “We should probably start explicitly including the value of information in our assessments of causes and interventions, rather than treating it as an afterthought to concrete value.” In my experience, information value can swamp concrete value; and if that is the case, it really shouldn’t be an afterthought, but one of the primary drivers of value in your calculation.
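To illustrate how information value can swamp concrete value, here is a toy expected-value-of-perfect-information calculation. All the numbers are invented for the example.

```python
budget = 100.0            # future spending to be allocated after we learn more
ev_proven = 1.0           # value per $ of a well-evidenced intervention
p_good = 0.5              # credence that an unproven intervention works
v_good, v_bad = 3.0, 0.0  # value per $ if it works / if it doesn't

# Without more information: fund whichever looks better in expectation.
ev_unproven = p_good * v_good + (1 - p_good) * v_bad          # 1.5
value_no_info = budget * max(ev_proven, ev_unproven)          # 150.0

# With perfect information: fund whichever turns out to be better.
value_with_info = budget * (p_good * max(v_good, ev_proven)
                            + (1 - p_good) * max(v_bad, ev_proven))  # 200.0

voi = value_with_info - value_no_info  # 50.0: easily exceeds the concrete
                                       # value of a small exploratory grant
print(voi)
```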
Do you have a sense of how this argument relates to Amanda Askell’s argument for the importance of value of information?
Amanda is talking about the philosophical principle, whereas I’m talking about the algorithm that roughly satisfies it. The principle is that a non-myopic Bayesian will take into account not just the immediate payoff, but also the information value of an action. The algorithm—upper confidence bound—efficiently approximates this behaviour. The fact that UCB is optimistic (about its impact) suggests that we might want to behave similarly, in order to capture the information value. (“Information value of an action” and “exploration value” are synonymous here.)
Thanks!