Notably, the agent following P_CDT two-boxes because $1,001,000 > $1,000,000 and $1000 > $0, even though this “dominance” argument appeals to two outcomes that are known to be impossible just from the problem statement. I certainly don’t think agents “should” try to achieve outcomes that are impossible from the problem specification itself.
Suppose that we accept the principle that agents never “should” try to achieve outcomes that are impossible from the problem specification. One implication is that it’s false that (as R_CDT suggests) agents that see a million dollars in the first box “should” two-box.
This seems to imply that it’s also false that (as R_UDT suggests) an agent that sees that the first box is empty “should” one-box. By the problem specification, of course, one-boxing when there is no money in the first box is also an impossible outcome. Since decisions to two-box only occur when the first box is empty, this would then imply that decisions to two-box are never irrational in the context of this problem. But I imagine you don’t want to say that.
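To make the symmetry concrete, here is a minimal sketch that enumerates the four (box state, action) pairs and marks which ones the problem specification rules out. It assumes a perfectly reliable predictor who fills the first box exactly when the agent one-boxes; the names `payoff`, `possible`, and `outcomes` are purely illustrative and not part of anyone's formalism.

```python
from itertools import product

# Dollar amounts taken from the figures quoted above.
BIG, SMALL = 1_000_000, 1_000


def payoff(first_box_full: bool, action: str) -> int:
    """Money received for a given box state and in-room action."""
    total = BIG if first_box_full else 0
    if action == "two-box":
        total += SMALL
    return total


def possible(first_box_full: bool, action: str) -> bool:
    """Assumption: the predictor is perfect, so the first box is full
    if and only if the agent one-boxes."""
    return first_box_full == (action == "one-box")


# Map each (box state, action) pair to (payoff, allowed by the spec?).
outcomes = {
    (full, action): (payoff(full, action), possible(full, action))
    for full, action in product([True, False], ["one-box", "two-box"])
}

for (full, action), (dollars, ok) in outcomes.items():
    state = "full" if full else "empty"
    status = "possible" if ok else "impossible"
    print(f"first box {state:5} | {action:7} | ${dollars:>9,} | {status}")
```

Under that assumption, both off-diagonal cells come out impossible: two-boxing with a full first box and one-boxing with an empty one. A principle that excludes the first from what an agent “should” aim at appears to exclude the second as well.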
I think I probably still don’t understand your objection here, so I’m not sure this point is actually responsive to it. But I initially have trouble seeing what potential violations of naturalism/determinism R_CDT could be committing that R_UDT would not also be committing.
(Of course, just to be clear, both R_UDT and R_CDT imply that the decision to commit yourself to a one-boxing policy at the start of the game would be rational. They only diverge in their judgments of what actual in-room boxing decision would be rational. R_UDT says that the decision to two-box is irrational and R_CDT says that the decision to one-box is irrational.)
That should be “a one-boxing policy”, right?
Yep, thanks for the catch! Edited to fix.