Hi Andreas, Excited you are doing this. As you can maybe tell I really liked your paper on Heuristics for Clueless Agents (although not sure my post above has sold it particularly well). Excited to see what you produce on RDM.
Firstly, measures of robustness may seem to smuggle probabilities
This seems true to me (although not sure I would consider it to be “by the backdoor”). Insofar as any option selected through a decision process will in a sense be the one with the highest expected value, any decision tool will have probabilities built in, either implicitly or explicitly. For example, you could see a basic Scenario Planning exercise as implicitly stating that all the scenarios are of reasonable (maybe equal) likelihood.
I don’t think the idea of RDM is to avoid probabilities; it is to avoid the traps of making decisions by expected value calculation. For example, by avoiding explicit predictions it prevents users from making important shifts to plans based on highly speculative estimates. I’d be interested to see if you think it works well in this regard.
Secondly, we wonder why a concern for robustness in the face of deep uncertainty should lead to adoption of a satisficing criterion of choice
Honestly I don’t know (or fully understand this), so good luck finding out. Some thoughts:
In engineering you design your lift or bridge to hold many times the capacity you think it needs, even after calculating all the things you can think of that could go wrong – this helps guard against the things you didn’t think of going wrong. I could imagine a similar principle applying to DMDU decision making – that aiming for the option that is satisfyingly robust to everything you can think of might give a better outcome than aiming elsewhere – as it may be the option that is most robust to the things you cannot think of.
But not sure. Not sure how much empirical evidence there is on this. It also occurs to me that some of the anti-optimizing sentiment could be driven by rhetoric and a desire to be different.
Very late here, but a brainstormy thought: maybe one way one could start to make a rigorous case for RDM is to suppose that there is a “true” model and prior that you would write down if you had as much time as you needed to integrate all of the relevant considerations you have access to. You would like to make decisions in a fully Bayesian way with respect to this model, but you’re computationally limited so you can’t. You can only write down a much simpler model and use that to make a decision.
We want to pick a policy which, in some sense, has low regret with respect to the Bayes-optimal policy under the true model. If we regard our simpler model as a random draw from a space of possible simplified models that we could’ve written down, then we can ask about the frequentist properties of the regret incurred by different decision rules applied to the simple models. And it may be that non-optimizing decision rules like RDM have a favorable bias-variance tradeoff, because they don’t overfit to the oversimplified model. Basically they help mitigate a certain kind of optimizer’s curse.
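To make the setup above a bit more concrete, here is a toy simulation sketch. Everything in it is an assumption made for illustration: the Gaussian "true" values, the way the simple model is drawn (a handful of noisy scenarios), the particular robust-satisficing rule, and all names and parameters (`simulate`, `noise_sd`, `threshold`). It is not RDM as practised, just one way to ask the regret question numerically.

```python
import random
import statistics


def simulate(n_trials=2000, n_options=10, n_scenarios=5,
             noise_sd=1.0, threshold=0.5, seed=0):
    """Compare two decision rules by their regret against the best true option.

    Assumed setup (hypothetical, for illustration only):
    - each option has a "true" value drawn from N(0, 1);
    - the simple model is a random draw: n_scenarios noisy estimates per option.
    """
    rng = random.Random(seed)
    regret_opt, regret_rob, surprise_opt = [], [], []

    for _ in range(n_trials):
        true_vals = [rng.gauss(0.0, 1.0) for _ in range(n_options)]
        best = max(true_vals)

        # One randomly drawn simple model: each scenario is a noisy view of every option.
        scenarios = [[v + rng.gauss(0.0, noise_sd) for v in true_vals]
                     for _ in range(n_scenarios)]

        # Rule 1 (optimizing): average over scenarios, pick the highest estimate.
        means = [statistics.fmean(s[i] for s in scenarios)
                 for i in range(n_options)]
        i_opt = max(range(n_options), key=means.__getitem__)

        # Rule 2 (robust satisficing, assumed form): pick the option that clears
        # the threshold in the most scenarios; break ties by the worst case.
        def robustness(i):
            col = [s[i] for s in scenarios]
            return (sum(x >= threshold for x in col), min(col))

        i_rob = max(range(n_options), key=robustness)

        regret_opt.append(best - true_vals[i_opt])
        regret_rob.append(best - true_vals[i_rob])
        # Optimizer's curse: how much the chosen option's estimate overstates its true value.
        surprise_opt.append(means[i_opt] - true_vals[i_opt])

    return {
        "mean_regret_optimize": statistics.fmean(regret_opt),
        "mean_regret_robust": statistics.fmean(regret_rob),
        "optimizer_surprise": statistics.fmean(surprise_opt),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Which rule comes out ahead here will depend on the noise level, the threshold, and how the simple models are drawn, so this only shows how the comparison could be set up, not that non-optimizing rules win in general.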
This makes sense to me, although I think we may not be able to assume a unique “true” model and prior even after all the time we want to think and use information that’s already accessible. I think we could still have deep uncertainty after this; there might still be multiple distributions that are “equally” plausible, but no good way to choose a prior over them (with finitely many, we could use a uniform prior, but this still might seem wrong), so any choice would be arbitrary and what we do might depend on such an arbitrary choice.
For example, how intense are the valenced experiences of insects and how much do they matter? I think no amount of time with access to all currently available information and thoughts would get me to a unique distribution. Some or most of this is moral uncertainty, too, and there might not even be any empirical fact of the matter about how much more intense one experience is than another (I suspect there isn’t).
Or, for the US election, I think there was little precedent for some of the considerations in this election (how coronavirus would affect voting and polling), so thinking much more about them could only have narrowed the set of plausible distributions so much.
I think I’d still not be willing to commit to a unique AI risk distribution with as much time as I wanted and perfect rationality but only the information that’s currently accessible.
See also this thread.