Flipping Out: The Cosmic Coinflip Thought Experiment Is Bad Philosophy
TL;DR: People often use the thought experiment of flipping a coin, giving 50% chance of huge gain and 50% chance of losing everything, to say that maximizing utility is bad. But the real problem is that our intuitions on this topic are terrible, and there’s no real paradox if you adopt the premise in full.
Epistemic status: confident, but too lazy to write out the math
There’s a thought experiment that I’ve sometimes heard as a counterargument to strict utilitarianism. A god/alien/whatever offers to flip a coin. Heads, it slightly-more-than-doubles the expected utility in the world. Tails, it obliterates the universe. An expected-utility maximizer, the argument goes, keeps taking this bet until the universe goes poof. Bad deal.
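To make the arithmetic behind that argument concrete, here is a minimal sketch. It assumes "slightly more than double" means a multiplier of 2.1 (a number I made up for illustration), measures value in units of the world's current value, and treats tails as a permanent zero. The function name `repeated_flip` is hypothetical.

```python
def repeated_flip(n_flips: int, multiplier: float = 2.1, p_heads: float = 0.5):
    """Return (survival probability, expected value) after accepting n_flips bets in a row.

    Value is in units of the world's current value (start = 1.0).
    Tails zeroes everything permanently, so only the all-heads path contributes.
    The 2.1 multiplier is an illustrative assumption, not a figure from the thought experiment.
    """
    p_survive = p_heads ** n_flips            # chance every flip came up heads
    value_if_survive = multiplier ** n_flips  # world value on that lucky path
    expected_value = p_survive * value_if_survive
    return p_survive, expected_value


for n in (1, 5, 10, 100):
    p, ev = repeated_flip(n)
    print(f"{n:>3} flips: P(universe survives) = {p:.3g}, expected value = {ev:.3g}x")
```

Under these assumptions, expected value creeps upward with every flip while the chance that anything is left to enjoy it halves each time. That gap is the whole force of the objection.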
People seem to love citing this thought experiment when talking about Sam Bankman-Fried. We should have known he was wrong in the head, critics sigh, when he said he’d bet the universe on a coinflip. They have a point; SBF apparently talked about this a lot, and it came up in his trial. I’m not fully convinced he understood the implications, and he certainly had a reckless and toxic attitude towards risk.
But today I’m here to argue that, despite his many, many flaws, SBF got this one right.
There is a lot of value in the universe
Suppose I’m a utilitarian. I value things like the easing of suffering and the flourishing of sapient creatures. Some mischievous all-powerful entity offers me the coinflip deal. On one side is the end of the world. On the other is “slightly more than double everything I value.” What does that actually mean?
It turns out the world is pretty big. There is a lot of flourishing in it. There’s also a lot of suffering, but I happen to arrange my preference-ordering such that the net utility of the world continuing to exist is extremely large. To make this coinflip an appealing trade, the Cosmic Flipper has to offer me something whose value is commensurate with that of the entire world and everyone in it, plus all the potential future value in humanity’s light cone.
That’s a big freaking deal.
The number of offers that weigh heavily enough on the other side of the scale is pretty darn small. “Double the number of people in the world” doesn’t begin to come close; neither does “make everyone twice as happy.” A more appropriate offer IMO might look more like “everyone becomes unaging, doesn’t need to eat or drink except for fun, grows two standard deviations smarter and wiser, and is basically immune to suffering.”
That’s a bet I’d at least consider taking. Odds are, you would too.
(If you don’t, that’s okay, but it means the Cosmic Flipper still isn’t offering you enough. What would need to be on the table for you, personally, to actually consider wagering the fate of the universe on a coinflip? What would the Cosmic Flipper have to offer? How much better does the world have to be, in the “heads” case, that you would be tempted?)
Suppose I do take the bet, and get lucky. How do you double that? Now we’re talking something on the order of “all animals everywhere also stop suffering” and I don’t even know what else.
By the time we get to flipping the coin five, ten, or a hundred times, I literally can’t even conceive of what sort of offer it would take to make a 50% chance of imploding utopia sound like a good price to pay. It’s incredibly difficult to wrap our brains around what “doubling the value in the world” actually means. And that’s just the tip of the iceberg.
We already court apocalypse
The thought experiment gets even more complicated when you factor in existing risks.
If you buy the arguments about threats from artificial superintelligence—which I do, for the record—then our world most likely has only a few years or decades left before we’re eaten by an unaligned machine. If you don’t buy those arguments, there’s still the 1 in 10,000 chance per year that we all nuke ourselves to death (or into the Stone Age), which is similar to the odds that you die this year in a car crash (if you’re in the US). Even if humanity never invents another superweapon, there’s still the chance that Earth gets hit by a meteor or Mother Nature slaughters our civilization with the next Black Death before we get our collective shit together.
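For a rough sense of how a small annual risk compounds, here is a quick sketch. It takes the 1 in 10,000 figure above at face value and assumes the risk is constant and independent from year to year, which is a simplification of mine. The function name `cumulative_risk` is hypothetical.

```python
def cumulative_risk(annual_risk: float, years: int) -> float:
    """Probability of at least one catastrophe within `years`.

    Assumes the annual risk is constant and independent across years
    (an illustrative simplification, not a forecast).
    """
    return 1.0 - (1.0 - annual_risk) ** years


for years in (1, 100, 1_000, 10_000):
    print(f"{years:>6} years: {cumulative_risk(1e-4, years):.2%}")
```

The point is only that "the world as it stands" already comes with a survival curve attached, so any honest accounting of its expected value has to include one.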
What does it mean to “double the expected value of the universe” given the threat of possible extinction? I genuinely don’t know. And we can’t just say “well, holding x-risk constant...” because any change to the world that’s big enough to double its expected utility is going to massively affect the odds of human extinction.
When it comes to thought experiments like this, we can’t just rely on what first pops into our head when we hear the phrase “double expected value.” For the bargain to make sense to a true expected-utility maximizer, it has to still sound like a good deal even after all these considerations are factored in.
Everything breaks down at infinity
OK, so maybe it’s a good idea to flip the coin once or twice, or even many times. But if you take this bet an infinite number of times, then you’re guaranteed to destroy the universe. Right?
Firstly, lots of math breaks down at infinity. Infinity is weird like that. I don’t think there exists a value system that can’t be tied in knots by some contrived thought experiment involving infinite regress, and even if one existed, I doubt it would be one I wanted to endorse.
Secondly, and more importantly, I question whether it is possible even in theory to produce infinite expected value. At some point you’ve created every possible flourishing mind in every conceivable permutation of eudaimonia, satisfaction, and bliss, and the added value of another instance of any of them is basically nil. In reality I would expect to reach a point where the universe is so damn good that there is literally nothing the Cosmic Flipper could offer me that would be worth risking it all.
And given the nature of exponential growth, it probably wouldn’t even take that many flips to get to “the universe is approximately perfect.” Sounds like a pretty good deal.
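Here is a toy model of that saturation argument. It assumes the world's value can never exceed some cap, that heads multiplies value by 2.1 (capped), and that the maximizer decides one flip at a time. All of these numbers and the function name `flips_until_decline` are my own illustrative assumptions.

```python
def flips_until_decline(start: float, cap: float,
                        multiplier: float = 2.1, p_heads: float = 0.5) -> int:
    """Count how many bets a one-step expected-value maximizer accepts before declining.

    Hypothetical model: heads multiplies the world's value by `multiplier`,
    but never above `cap`; tails sets it to zero. The count follows the
    lucky path on which every accepted flip comes up heads.
    """
    value, flips = start, 0
    # Accept only while the expected value of flipping beats standing pat.
    while p_heads * min(value * multiplier, cap) > value:
        value = min(value * multiplier, cap)
        flips += 1
    return flips


for cap in (1e3, 1e6, 1e12):
    print(f"cap = {cap:.0e}x today's value: accepts {flips_until_decline(1.0, cap)} flips")
```

In this sketch the maximizer walks away on its own once the world is worth more than half the cap, and the number of accepted flips grows only logarithmically with the cap.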
Conclusion
The point I’m hoping to make is that this coinflip thought experiment suffers from a gap between the mathematical ideal of “maximizing the expected value in the universe” and our intuitions about it.
On a more specific level, I wish people would stop saying “Of course SBF had a terrible understanding of risk, he took EV seriously!” as though SBF’s primary failing was being a utilitarian, and not being reckless and hopelessly blinkered about the real-world consequences of his actions.
I agree that, since the FTX collapse, people have suddenly all agreed that every one of SBF’s opinions was wrong. So I appreciate the effort to make the case for taking the deal, and to portray the choice as not completely obvious.
To the extent that you’re hoping to save “maximizing utility via maximizing expected value,” I think it’s still an uphill battle. I like Beckstead and Thomas’s “A paradox for tiny probabilities and enormous values” on this, which runs essentially the same thought experiment as “flip the coin many times,” except with the coin weighted to 99.9% heads (and only your own life in play, not the universe). They point out that both positions, “timidity” and “recklessness”, have implausible conclusions.
As a result, I’m ultimately quite troubled, philosophically, by this “concentrating all the value into narrow regions of probability space” feature of EV maximization (though I don’t have a better alternative on hand!). In particular, I’m not confident enough in EV maximization to wager the universe on it. So while I’m more sympathetic than most to the position that the coin flip might be justifiable, I’m still pretty far from wanting to bite that bullet.