Very late here, but a brainstormy thought: maybe one way to start making a rigorous case for RDM is to suppose that there is a “true” model and prior that you would write down if you had as much time as you needed to integrate all of the relevant considerations you have access to. You would like to make decisions in a fully Bayesian way with respect to this model, but you’re computationally limited, so you can’t: you can only write down a much simpler model and use that to make a decision.
We want to pick a policy which, in some sense, has low regret with respect to the Bayes-optimal policy under the true model. If we regard our simpler model as a random draw from a space of possible simplified models that we could’ve written down, then we can ask about the frequentist properties of the regret incurred by different decision rules applied to these simplified models. And it may be that non-optimizing decision rules like RDM have a favorable bias-variance tradeoff, because they don’t overfit to the oversimplified model. Basically, they help mitigate a certain kind of optimizer’s curse.
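Here’s a deliberately exaggerated toy sketch of that setup (all the numbers and the particular “robust” rule below are my own illustrative assumptions, not a claim about how RDM itself works). Each trial draws a “true” model (expected utilities over actions) and an oversimplified model of it (a noisy estimate, where a few actions are modeled much more badly than the rest), then compares the frequentist regret, relative to the truly best action, of (a) naively optimizing the oversimplified model and (b) a crude non-optimizing rule that refuses to fully trust the parts of the model we understand least:

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 20_000

# 5 poorly understood actions (lots of simplification error) and 15 well understood ones.
sigma = np.array([2.0] * 5 + [0.25] * 15)
n_actions = len(sigma)

regret_naive = np.empty(n_trials)
regret_penalized = np.empty(n_trials)

for t in range(n_trials):
    true_u = rng.normal(0.0, 1.0, n_actions)                 # the "true" model's expected utilities
    simple_u = true_u + sigma * rng.normal(size=n_actions)   # the simpler model we actually wrote down
    best = true_u.max()                                      # value of the truly best action

    # (a) Trust the simple model and optimize it. The apparent winner is often a
    # poorly modeled action whose estimate happened to be inflated (optimizer's curse).
    regret_naive[t] = best - true_u[np.argmax(simple_u)]

    # (b) Dock each action by one standard deviation of its modeling error before
    # choosing: a biased but lower-variance rule that doesn't overfit to the
    # oversimplified parts of the model.
    regret_penalized[t] = best - true_u[np.argmax(simple_u - sigma)]

print(f"mean regret, optimizing the simple model: {regret_naive.mean():.3f}")
print(f"mean regret, penalized rule:              {regret_penalized.mean():.3f}")
```

In this toy the penalized rule comes out with noticeably lower mean regret, for the bias-variance reason above: the optimizing rule overfits to whichever parts of the simple model happen to overstate an action’s value. Whether anything like this holds for RDM on realistic problems is exactly what would need the rigorous argument.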
This makes sense to me, although I think we may not be able to assume a unique “true” model and prior even after all the time we want to think, using only information that’s already accessible. We could still have deep uncertainty after this: there might still be multiple distributions that seem “equally” plausible, with no good way to choose a prior over them (with finitely many, we could use a uniform prior, but even that might seem wrong), so any choice would be arbitrary, and what we do might depend on that arbitrary choice.
For example, how intense are the valenced experiences of insects and how much do they matter? I think no amount of time with access to all currently available information and thoughts would get me to a unique distribution. Some or most of this is moral uncertainty, too, and there might not even be any empirical fact of the matter about how much more intense one experience is than another (I suspect there isn’t).
Or, for the US election: I think there was little precedent for some of the considerations in this election (e.g. how coronavirus would affect voting and polling), so thinking much more about them could only have narrowed the set of plausible distributions so much.
And I think that, with as much time as I wanted and perfect rationality but only the information that’s currently accessible, I’d still not be willing to commit to a unique AI risk distribution.
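To put the “any choice would be arbitrary” worry in toy numerical terms (all the numbers below are made up for illustration): suppose only three distributions over the relevant states seem plausible and we have no principled way to weight them. Two priors over those distributions that look about equally defensible can recommend different actions, and a worst-case rule over the same set disagrees with the uniform prior too:

```python
import numpy as np

# utilities[action, state]: an ambitious action and a safe one.
utilities = np.array([
    [10.0, -4.0],   # ambitious: great in state 0, bad in state 1
    [ 2.0,  1.0],   # safe
])

# Three distributions over the two states that all seem "equally" plausible.
plausible = np.array([
    [0.7, 0.3],
    [0.5, 0.5],
    [0.2, 0.8],
])

# Expected utility of each action under each plausible distribution.
eu = utilities @ plausible.T

prior_a = np.array([1/3, 1/3, 1/3])   # uniform over the plausible distributions
prior_b = np.array([0.1, 0.2, 0.7])   # a different prior that seems no less defensible

print("EU of each action under each plausible distribution:\n", eu)
print("choice under prior A (uniform):", np.argmax(eu @ prior_a))    # -> the ambitious action
print("choice under prior B:          ", np.argmax(eu @ prior_b))    # -> the safe action
print("worst-case choice over the set:", np.argmax(eu.min(axis=1)))  # -> the safe action
```

With nothing to privilege prior A over prior B, the recommendation really does hinge on an arbitrary modeling choice, which is the kind of situation where a decision rule defined over the whole set of plausible distributions (rather than over one arbitrarily chosen prior) starts to look attractive.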
See also this thread.