I'd argue that you need to use a point estimate to decide what bets to make, and that you should make that point estimate by (1) geomean-pooling raw estimates of parameters, (2) reasoning over distributions of all parameters, then (3) taking the arithmean of the resulting distribution-over-probabilities and (4) acting according to that mean probability.
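For concreteness, here is a minimal sketch of how I read that four-step pipeline. All of the expert numbers are made up, and a crude bootstrap over experts stands in for "reasoning over distributions of all parameters"; the specific distributional choice isn't the point.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up expert estimates (rows: experts, columns: conditional probabilities
# p(X|Y) for each stage of the risk model). All numbers are hypothetical.
expert_estimates = np.array([
    [0.9, 0.5, 0.10],
    [0.6, 0.2, 0.01],
    [0.8, 0.9, 0.50],
    [0.7, 0.1, 0.05],
])

# (1) Geomean-pool the raw estimates of each parameter across experts.
pooled_point = np.exp(np.log(expert_estimates).mean(axis=0))

# (2) Reason over distributions of all parameters: here, crudely, by
#     resampling experts with replacement and re-pooling (a bootstrap).
n_draws = 10_000
idx = rng.integers(0, expert_estimates.shape[0],
                   size=(n_draws, expert_estimates.shape[0]))
pooled_draws = np.exp(np.log(expert_estimates[idx]).mean(axis=1))

# Total risk is the product of the conditional probabilities, so each draw
# gives one point in a distribution over the total probability.
total_risk_draws = pooled_draws.prod(axis=1)

# (3) Take the arithmetic mean of that distribution-over-probabilities ...
mean_total_risk = total_risk_draws.mean()

# (4) ... and act according to that mean probability.
print("point estimate from pooled params:", pooled_point.prod())
print("mean of distribution over total risk:", mean_total_risk)
```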
I think "act according to that mean probability" is wrong for many important decisions you might want to take, analogous to buying a lot of trousers with 1.97 legs in my example in the essay. No additional comment if that is what you meant though and were just using shorthand for that position.
Clarifying, I do agree that there are some situations where you need something other than a subjective p(risk) to compare EV(value|action A) with EV(value|action B). I don't actually know how to construct a clear analogy from the 1.97-legged-trousers example when the variable we're taking the mean of is a probability (though I agree that there are non-analogous examples; VOI, for example).
I'll go further, though, and claim that what really matters is what worlds the risk is distributed over, and that expanding the point-estimate probability to a distribution of probabilities, by itself, doesn't add any real value. If it is to be a valuable exercise, you have to be careful what you're expanding and what you're refusing to expand.
More concretely, you want to be expanding over things your intervention won't control, and then asking about your intervention's effect at each point in things-you-won't-control-space, then integrating back together. If you expand over an arbitrary axis of uncertainty, then not only is there a multiplicity of valid expansions, but the natural interpretation will be misleading.
For example, say we have a 10% chance of drawing a dangerous ball from a series of urns, and a 90% chance of drawing a safe one. If we describe it as (1) "50% chance of 9.9% risk, 50% chance of 10.1% risk" or (2) "50% chance of 19% risk, 50% chance of 1% risk" or (3) "10% chance of 99.1% risk, 90% chance of 0.1% risk", how does it change our opinion of <intervention A>? (You can, of course, construct a two-step ball-drawing procedure that produces any of these distributions-over-probabilities.)
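A quick sketch of what I mean by such a two-step procedure: the first draw picks an urn, the second draws a ball from it. The decomposition numbers are copied from above; the simulation is purely illustrative and confirms that all three average out to the same 10% risk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Three decompositions of the same 10% overall risk into a two-step
# ball-drawing procedure: (chance of picking this urn, chance of a
# dangerous ball from it).
decompositions = {
    "(1)": [(0.5, 0.099), (0.5, 0.101)],
    "(2)": [(0.5, 0.19), (0.5, 0.01)],
    "(3)": [(0.1, 0.991), (0.9, 0.001)],
}

for name, urns in decompositions.items():
    # Analytic overall risk: sum over urns of p(urn) * p(danger | urn).
    overall = sum(p_urn * p_danger for p_urn, p_danger in urns)

    # Simulate the two-step procedure to confirm.
    p_urns = np.array([p for p, _ in urns])
    p_dangers = np.array([d for _, d in urns])
    which_urn = rng.choice(len(urns), size=200_000, p=p_urns)
    dangerous = rng.random(200_000) < p_dangers[which_urn]

    print(name, "analytic:", round(overall, 4),
          "simulated:", round(dangerous.mean(), 4))
```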
I think the natural intuition is that interventions are best in (2), because most probabilities of risk are middle-ish, and worst in (3), because the probability of risk is near-determined. And this, I think, is analogous to the argument of the post that anti-AI-risk interventions are less valuable than the point-estimate probability would indicate.
But that argument assumes (and requires) that our interventions can only change the second ball-drawing step, and not the first. So using that argument requires that, in the first place, we sliced the distribution up over things we couldn't control. (If the first step is the thing we can control with our intervention, then interventions are best in the world of (3).)
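Here is a toy illustration of that contrast. The tractability assumptions are mine and purely for illustration: I model a second-step intervention as a fixed nudge in log-odds space (so near-determined risks barely budge) and a first-step intervention as shifting five percentage points of urn-probability from the riskiest urn to the safest. In world (3) the second-step intervention averts almost nothing, while the first-step intervention averts far more risk there than in (1) or (2).

```python
import numpy as np

def shift_log_odds(p, delta):
    """Move a probability by `delta` in log-odds space (toward safety).

    Probabilities near 0 or 1 barely move, middling ones move a lot: one
    toy way to capture 'near-determined risks are hard to budge'."""
    logit = np.log(p / (1 - p))
    return 1 / (1 + np.exp(-(logit - delta)))

decompositions = {
    "(1)": [(0.5, 0.099), (0.5, 0.101)],
    "(2)": [(0.5, 0.19), (0.5, 0.01)],
    "(3)": [(0.1, 0.991), (0.9, 0.001)],
}

for name, urns in decompositions.items():
    baseline = sum(pu * pd for pu, pd in urns)

    # Intervention A: acts only on the second step, nudging each urn's
    # risk by 0.5 in log-odds space.
    second_step = sum(pu * shift_log_odds(pd, 0.5) for pu, pd in urns)

    # Intervention B: acts only on the first step, moving 5 percentage
    # points of urn-probability from the riskiest urn to the safest one.
    (pu_bad, pd_bad), (pu_good, pd_good) = sorted(urns, key=lambda u: -u[1])
    first_step = (pu_bad - 0.05) * pd_bad + (pu_good + 0.05) * pd_good

    print(name,
          "baseline:", round(baseline, 4),
          "| risk averted via step 2:", round(baseline - second_step, 4),
          "| via step 1:", round(baseline - first_step, 4))
```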
Back to the argument of the original post: you're deriving a distribution over several p(X|Y) parameters from expert surveys, and so the bottom-line distribution over total probabilities reflects the uncertainty in experts' opinions on those conditional probabilities. Is it right to model our potential interventions as influencing the resolution of particular p(X|Y) rolls, or as influencing the distribution of p(X|Y) at a particular stage?
I claim it's possible to argue either side.
Maybe a question like "p(much harder to build aligned than misaligned AGI | strong incentives to build AGI systems)" (the second survey question) is split between a quarter of the experts saying ~0% and three-quarters of the experts saying ~100%. (This extremizes the example, to sharpen the hypothetical analysis.) We interpret this as saying there's a one-quarter chance we're ~perfectly safe and a three-quarters chance that it's hopeless to develop an aligned AGI instead of a misaligned one.
If we interpret that as if God will roll a die and put us in the "much harder" world with three-quarters probability and the "not much harder" world with one-quarter probability, then maybe our work to increase the chance that we get an aligned AGI is low-value, because it's unlikely to move either the ~0% or the ~100% much lower (and we can't change the die). If this were the only stage, then maybe all work on AGI risk would be worthless.
But "a three-quarters chance it's hopeless" is also consistent with a scenario where there's a three-quarters chance that AGI development will be available to anyone, and many low-resourced actors will not have alignment teams and will find it ~impossible to develop with alignment, but a one-quarter chance that AGI development will be available only to well-resourced actors, who will find it trivial to add on an alignment team and develop with alignment. In that case, working on AGI risk might not be worthless, since we can work on increasing the chance that AGI development is only available to actors with alignment teams.
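To put toy numbers on the difference between those two readings (the 75/25 split, the ~0%/~100% stand-ins, and the intervention sizes below are all invented for illustration):

```python
# Hypothetical survey split for one stage: a quarter of experts at ~0%,
# three-quarters at ~100% (the extremized toy from above).
p_hopeless_world = 0.75     # P(we are in the "much harder" world)
risk_if_hopeless = 0.99     # p(misalignment | that world), standing in for ~100%
risk_if_easy = 0.01         # p(misalignment | the other world), standing in for ~0%

baseline = (p_hopeless_world * risk_if_hopeless
            + (1 - p_hopeless_world) * risk_if_easy)

# Model A ("God rolls a die"): we can only nudge the within-world risks,
# say by one percentage point each; the die itself is fixed.
model_a = (p_hopeless_world * (risk_if_hopeless - 0.01)
           + (1 - p_hopeless_world) * max(risk_if_easy - 0.01, 0.0))

# Model B (the split reflects who gets to develop AGI, which we can shift):
# the intervention moves 10 percentage points of probability from the
# "anyone can build it, no alignment teams" world to the safer one.
model_b = ((p_hopeless_world - 0.10) * risk_if_hopeless
           + (1 - p_hopeless_world + 0.10) * risk_if_easy)

print("baseline risk:     ", round(baseline, 3))
print("risk under model A:", round(model_a, 3),
      "(averted:", round(baseline - model_a, 3), ")")
print("risk under model B:", round(model_b, 3),
      "(averted:", round(baseline - model_b, 3), ")")
```

The same survey split implies roughly a one-percentage-point reduction under the die-roll reading and roughly a ten-point reduction under the we-can-shift-the-first-stage reading, which is the whole difference in whether the work looks worthwhile.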
I claim that it isn't clear, from the survey results, whether the distribution of experts' probabilities for each step reflects something more like the God-rolls-a-die model or different opinions about the default path of a thing we can intervene on. And if that's not clear, then it's not clear what to do with the distribution-over-probabilities in the main results. Probably they're a step forward in our collective understanding, but I don't think you can conclude, from the high chances of low risk, that there's low value in working on risk mitigation.