As a non-decision theorist, here are some thoughts; well, objections, really.
I think my thoughts may be useful to look at because they represent what a “layman” or non-specialist might think in response to your post.
But I am a scrub. If I am wrong, please feel free to just stomp all over what I write. That would be useful and illustrative. Stomp stomp stomp!
To start, I’ll quote your central example for context:
My main example is a prisoner’s dilemma between perfect deterministic software twins, exposed to the exact same inputs. This example shows, I think, that you can write on whiteboards light-years away, with no delays; you can move the arm of another person, in another room, just by moving your own. This, I claim, is extremely weird... Nevertheless, I think that CDT is wrong. Here’s the case that convinces me most.
Perfect deterministic twin prisoner’s dilemma: You’re a deterministic AI system, who only wants money for yourself (you don’t care about copies of yourself). The authorities make a perfect copy of you, separate you and your copy by a large distance, and then expose you both, in simulation, to exactly identical inputs (let’s say, a room, a whiteboard, some markers, etc). You both face the following choice: either (a) send a million dollars to the other (“cooperate”), or (b) take a thousand dollars for yourself (“defect”).
But defecting in this case, I claim, is totally crazy. Why? Because absent some kind of computer malfunction, both of you will make the same choice, as a matter of logical necessity. If you press the defect button, so will he; if you cooperate, so will he. The two of you, after all, are exact mirror images. You move in unison; you speak, and think, and reach for buttons, in perfect synchrony. Watching the two of you is like watching the same movie on two screens.
To me, it’s an extremely easy choice. Just press the “give myself a million dollars” button! Indeed, at this point, if someone tells me “I defect on a perfect, deterministic copy of myself, exposed to identical inputs,” I feel like: really?
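(For concreteness, here is the payoff structure as I read the quoted setup, as a tiny Python sketch. The table, names, and dollar bookkeeping are mine, not from your post.)

```python
# Payoffs as I read the quoted setup (my own table, in dollars).
# "cooperate" = send $1M to the other; "defect" = keep $1K for yourself.
PAYOFFS = {
    ("cooperate", "cooperate"): 1_000_000,  # the twin's $1M arrives
    ("cooperate", "defect"):            0,  # you sent $1M away, got nothing back
    ("defect",    "cooperate"): 1_001_000,  # your $1K plus the twin's $1M
    ("defect",    "defect"):        1_000,  # just your $1K
}

def my_payoff(my_action: str, twin_action: str) -> int:
    """Dollars I end up with, given my action and my twin's action."""
    return PAYOFFS[(my_action, twin_action)]
```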
Objection 1:
So let’s imagine an agent who shares your thoughts exactly, right up to the moment quoted above.
This agent, at that moment, has just thought through the “extremely easy choice” to cooperate, along with all of the logic leading up to it, and is just about to press the “cooperate” button.
But then, at that moment, they break from your story.
The agent (a deterministic AI system that only wants money) thinks, “Well, I can get $1M + $1K by defecting, so I’m going to do that.”
Being a clone, with every atom, particle, quantum effect, and random draw identical, does not stop this logic. There’s no causal control of any kind, right?
So RIP cooperation. We get the same prisoner’s dilemma effect again.
(Notice that linked thinking between agents doesn’t solve this. They can just think, “Well, my clone must be having the same devious thoughts. Also, purple elephant spaghetti.”)
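In code-ish terms, here is how I picture Objection 1, as a toy sketch of my own (the function, its inputs, and the hard-coded “thoughts” are all made up for illustration):

```python
# Toy model of Objection 1: both copies are literal runs of the same
# deterministic function on the same inputs, last-second thoughts included.

def agent(observations: str) -> str:
    # ...all of the reasoning leading up to the "extremely easy choice"...
    choice = "cooperate"
    # ...and then the last-second thought from Objection 1:
    choice = "defect"  # "Well, I can get $1M + $1K by defecting"
    return choice

my_action = agent("room, whiteboard, markers")
twin_action = agent("room, whiteboard, markers")  # identical code, identical inputs
print(my_action, twin_action)  # prints: defect defect -> each copy keeps $1K
```

Nothing stops that second assignment from being part of the program; it just runs in both rooms, which is the “same prisoner’s dilemma effect” I mean.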
Objection 2:
Let’s say the agents share exactly your strong views about cooperation, and believe, in the strongest possible way, in the acausal control or mirroring that allows them to cooperate (or scratch their noses, perform weird mental gymnastics, etc.).
They cooperate. Great!
Ok, but here the agency/design/control was not exercised by the two agents, but by whatever device or process linked and created them in the first place.
That process established the linkage with such certainty that it allows the agents to sync, or cooperate, across many light-years.
It’s that process that created them and pulled their strings, like a programmer writing a program.
This is a pretty boring form of control, and not what you envision. There’s no control, causal or otherwise, exercised by your agents at all.
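To put the “programmer writing a program” point in code (again a toy of my own, with invented names):

```python
# Toy version of the point: the joint outcome is pinned down by the
# copying/linking step, before either copy ever runs in its room.

def make_twins(agent_source: str) -> tuple[str, str]:
    # The linking/creating process: both rooms get byte-identical code.
    return agent_source, agent_source

room_a_code, room_b_code = make_twins("def agent(obs): return 'cooperate'")
assert room_a_code == room_b_code  # the "sync over light-years" was fixed right here
```

Whatever coordination shows up later is credit to make_twins, so to speak, not to the copies.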
Objection 3:
Again, let’s have the agents do exactly as you suggest and cooperate.
Notice that you assert a sort of abhorrence of the astronomical waste of defecting.
Frankly, you lay it on pretty thick:
To me, it’s an extremely easy choice. Just press the “give myself a million dollars” button! Indeed, at this point, if someone tells me “I defect on a perfect, deterministic copy of myself, exposed to identical inputs,” I feel like: really?
Note that this doesn’t seem like a case where any idiosyncratic predictors are going around rewarding irrationality. Nor, indeed, does it feel to me like “cooperating is an irrational choice, but it would be better for me to be the type of person who makes such a choice” or “You should pre-commit to cooperating ahead of time, however silly it will seem in the moment” (I’ll discuss cases that have more of this flavor later). Rather, it feels like what compels me is a direct, object-level argument, which could be made equally well before the copying or after. This argument recognizes a form of acausal “control” that our everyday notion of agency does not countenance, but which, pretty clearly, needs to be taken into account.
Ok. Your agents follow exactly your thinking, exactly as you describe, and cooperate.
But now, your two agents aren’t really playing “prisoner’s dilemma” at all.
They do have the mechanical, physical payoffs of $1M and $1K in front of them, but as your own emphatic writing lays out, this doesn’t describe their preferences at all.
Instead, it’s better to interpret your agents’ preferences/“game”/“utility function” as having some term for the cost of defecting, or an aversion to astronomical waste.
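Roughly something like this, where the penalty term and its size are invented by me purely for illustration:

```python
# The stated preferences vs. the preferences the cooperate argument
# seems to rely on (numbers are made up).

def stated_utility(dollars: int) -> int:
    return dollars  # "only wants money for yourself"

def revealed_utility(dollars: int, defected: bool) -> int:
    WASTE_AVERSION = 10_000_000  # some big aversion to "astronomical waste"
    return dollars - (WASTE_AVERSION if defected else 0)

# Under revealed_utility, defecting is never worth it, even for $1,001,000.
```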
Please don’t hurt me.
Ok, so I thought about this more and want to double down on my Objection 1:
Consider the following three scenarios for clarity:
Scenario 1: Two identical, self-interested agents play a prisoner’s dilemma in their respective rooms, light-years apart. These two agents are straight out of an Econ 101 lecture. Also, they know they are identical and self-interested. Okie dokie. So we get the “usual” defect/defect single-shot result. Note that we can make these agents identical down to the last molecule and quantum effect, but it doesn’t matter. I think we all accept that we get the defect/defect result.
Scenario 2: We have your process, or Omega, create two identical agents: molecularly identical, identical down to every quantum effect, etc. Again, they know they are identical and self-interested. Now, again, they play the game in their respective rooms, light-years apart. Once I point out that nothing has changed from Scenario 1, I think you would agree we get the defect/defect result.
Scenario 3: We have your process, or Omega, create one primary agent, and then create a puppet or slave of this primary agent that will do exactly what the primary agent does (and we put them in the two rooms with whiteboards, etc.). Now, it’s going to seem counterintuitive how this puppeting works across the light-years, with no causation or information passing between the agents. What’s going on is that, just as in the Newcomb’s box thingy, Omega is exercising extraordinary agency or foresight, probably over both agent and copy: e.g., it has foreseen what the primary agent will do and imposes that on the puppet.
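Here is the structural difference I have in mind between Scenarios 2 and 3, as a toy sketch (the names and the placeholder reasoning are mine):

```python
def agent(obs: str) -> str:
    # stand-in for whatever deterministic reasoning the agent actually runs
    choice = "defect"  # or "cooperate"; whatever the code happens to settle on
    return choice

# Scenario 2: two separate runs of identical code on identical inputs.
action_a = agent("room, whiteboard, markers")
action_b = agent("room, whiteboard, markers")

# Scenario 3: one run, plus a puppet that Omega wires to echo the primary.
primary_action = agent("room, whiteboard, markers")
puppet_action = primary_action  # no second run; the copy just repeats the output
```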
Ok. Now, in Scenario 3, your story about getting the cooperate result does work, because there is true mirroring and the primary agent can trust that the puppet will copy whatever they do.
However, I think your story merely creates Scenario 2, and the puppeting doesn’t go through.
There is no puppeting effect or Omega effect here, and that effect is exactly what has bite in Scenario 3.
The puppeting doesn’t go through because Scenario 2 is the same as Scenario 1.
Another way of seeing this: imagine, in the story in your post, your agent doing something horrific, almost unthinkable, like committing genocide or stroking a cat backwards. The fact that both the agent and the copy are able to do the horrific act, and the fact that they would mirror each other, is not enough for the act to actually happen. Both agents still need to choose to do it.
You get your result by rounding this off. You point out how tempting cooperating looks, which is indeed true, and human subjects probably would cooperate in this situation. But that’s not causation or control.
As a side note, I think this “Omega effect”, or control/agency, is the root of the Newcomb’s box paradox. Basically, CDTers refuse the idea that they are in Omega’s inner loop, or in Omega’s mind’s eye, as they eye the $1,000 box and think they can grab both boxes without consequence. But this rejects the premise of the whole story and doesn’t take Omega’s agency seriously (which is indeed extraordinary, and maybe very hard to imagine). This makes Newcomb’s paradox really uninteresting.
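For reference, here is the arithmetic I have in mind, using the standard Newcomb amounts and assuming the predictor is never wrong:

```python
# Standard Newcomb payoffs: the opaque box holds $1M iff Omega predicted
# one-boxing; the transparent box always holds $1,000.

def payoff(choice: str, prediction: str) -> int:
    opaque = 1_000_000 if prediction == "one-box" else 0
    transparent = 1_000
    return opaque + (transparent if choice == "two-box" else 0)

# If Omega's prediction always matches the choice:
print(payoff("one-box", "one-box"))  # 1000000
print(payoff("two-box", "two-box"))  # 1000
```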
Also, I read all this Newcomb stuff over the last 24 hours, so I might be wrong.