Ha, I think the problem is just that your formalization of Newcomb’s problem is defined so that one-boxing is always the correct strategy, and I’m working with a different formulation. There are four forms of Newcomb’s problem that jibe with my intuition, and they’re all different from the formalization you’re working with:
1. Your source code is readable. Then the best strategy is whatever the best strategy is when you get to publicly commit: e.g., in a game of chicken, you should tear off the steering wheel if you get the chance before your opponent does (see the first sketch after this list).
2. Your source code is readable and so is your opponent’s. Then you get mathy things like mutual simulation and Löb’s theorem (see the second sketch after this list).
3. We’re in the real world, so the only information the other player has for guessing your strategy is things like your past behavior and reputation. (This is by far the most realistic situation, in my opinion.)
4. You’re playing against someone who’s an expert at reading body language, say. Then it might be impossible to fool them unless you can fool yourself into thinking you’ll one-box. But of course, once the boxes are actually in front of you, it would be great for you to have a change of heart.
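To make point 1 concrete, here’s a toy Python sketch (my own illustration, with made-up payoff numbers): once you’ve visibly committed to going straight, your opponent’s best response flips to swerving.

```python
# Chicken, with illustrative payoffs keyed as (my_action, their_action)
# -> (my_payoff, their_payoff). The numbers are made up.
PAYOFFS = {
    ("swerve", "swerve"): (0, 0),
    ("swerve", "straight"): (-1, 1),
    ("straight", "swerve"): (1, -1),
    ("straight", "straight"): (-10, -10),  # crash
}

def best_response(their_action):
    """My payoff-maximizing action, given that the opponent's action is fixed."""
    return max(["swerve", "straight"],
               key=lambda mine: PAYOFFS[(mine, their_action)][0])

# If I tear off the wheel first, I'm visibly committed to "straight",
# so your best response is to swerve, and I get the good outcome.
print(best_response("straight"))  # swerve
```

The commitment only works because it’s observable, which is exactly the “your source code is readable” condition.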
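And for point 2, a minimal sketch of mutual simulation (again my own toy; real Löbian cooperation uses proof search, which I’m crudely approximating here with a recursion-depth cutoff):

```python
def fair_bot(opponent, depth=3):
    """Cooperate iff a bounded simulation says the opponent cooperates with me."""
    if depth == 0:
        # Optimistic base case to break the infinite regress; in the exact
        # version, Löb's theorem is what lets two FairBots bottom out in
        # cooperation rather than in an unproductive loop.
        return "C"
    return "C" if opponent(fair_bot, depth - 1) == "C" else "D"

def defect_bot(opponent, depth=3):
    """Defects no matter who it's playing."""
    return "D"

print(fair_bot(fair_bot))    # C -- mutual simulation finds cooperation
print(fair_bot(defect_bot))  # D -- and isn't exploited by a defector
```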
Your version is something like:
Your opponent can simulate you with 100% accuracy, including unforeseen events, like something unexpected causing you to change your mind.
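In that version, one-boxing wins by construction. Here’s a minimal Python sketch of it (my own toy code, with the standard $1,000,000/$1,000 payoffs): because the predictor fills the opaque box by running your exact decision procedure, prediction and choice can never come apart.

```python
def payoff(strategy):
    """Newcomb payoff when the predictor simulates `strategy` with 100% accuracy."""
    prediction = strategy()  # the predictor's perfect simulation of you
    opaque_box = 1_000_000 if prediction == "one-box" else 0
    choice = strategy()      # your actual decision: the very same procedure
    if choice == "one-box":
        return opaque_box
    return opaque_box + 1_000  # two-boxing adds the transparent $1,000

print(payoff(lambda: "one-box"))  # 1000000
print(payoff(lambda: "two-box"))  # 1000
```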
If we’re creating AIs that others can simulate, then I guess we might as well make them immune to retro blackmail. I still don’t see the implications for humans, who can’t be simulated with 100% fidelity, already have ample intuition about their reputations, and know lots of ways to solve coordination problems.
I posted a couple months ago that I was working on an effective altruism board game. You can now order a copy online!
To recap:
- it’s a cooperative game where you start out as a random human sampled from the real-world distribution of income,
- you try to rack up lots of human QALYs and animal QALYs and reduce existential risk,
- all while answering EA-related trivia questions, donating to effective charities, partaking in classic philosophy thought experiments, and realizing your own private morality,
- and trying to avoid being turned into a chicken.