Honoring Petrov Day on the EA Forum: 2021
Petrov Day
Today we celebrate not destroying the world. We do so today because 38 years ago, Stanislav Petrov made a decision that averted tremendous calamity. It’s possible that an all-out nuclear exchange between the US and USSR would not have actually destroyed the world, but there are few things with an equal chance of doing so.
As a Lieutenant Colonel of the Soviet Army, Petrov manned the system built to detect whether the US government had fired nuclear weapons on Russia. On September 26th, 1983, the system reported five incoming missiles. Petrov’s job was to report this as an attack to his superiors, who would launch a retaliatory nuclear response. But instead, contrary to the evidence the systems were giving him, he called it in as a false alarm, for he did not wish to instigate nuclear armageddon.
For more information, see: 1983 Soviet nuclear false alarm incident
Petrov is not alone in having made decisions that averted destruction — presidents, generals, commanders of nuclear submarines, and similar also made brave and fortunate calls — but Petrov’s story is salient, so today we celebrate him and all those who chose equally well.
As the world progresses, it’s likely that many more people will face decisions like Petrov’s. Let’s hope they’ll make good decisions! And if we expect to face decisions ourselves, let us resolve to decide wisely!
Mutually Assured Destruction (??)
The Petrov Day tradition is to celebrate Petrov’s decisions and also to practice not destroying things, even when it’s tempting.
In both 2019 and 2020, LessWrong placed a large red button on the frontpage and distributed “launch codes” to a few hundred “trustworthy” people. A launch would bring down the frontpage for the duration of Petrov Day, denying hundreds to thousands of people access to LessWrong. In 2019, all was fine. In 2020… let’s just say some bad decisions were made.
And yet, having a button on your own page that brings down your own site doesn’t make much sense! Why would you have nukes pointed at yourself? It’s also not very analogous to the Cold War nuclear scenario between major world powers.
For those reasons, in 2021, LessWrong is teaming up with the Forum to play a game of mutual destruction. Two buttons, two sets of codes, and two sets of hopefully trustworthy users.
(The button will appear on the homepage on Sunday morning, 8 AM PST.)
If LessWrong chose any launch code recipients they couldn’t trust, the EA Forum will go down, and vice versa. One of the sites going down means that people are blocked from accessing important resources: the destruction of significant real value. What’s more, it will damage trust between the two sites (“I guess your most trusted users couldn’t be trusted to not take down our site”) and also for each site itself (“I guess the admins couldn’t find a hundred people who could be trusted”).
For exact rules of the game, see the final section below.
Last year, it emerged that there was ambiguity about how serious the Petrov Day exercise was. I’ll be as clear as I can via text: there is real value on the line here, and this is a real trust-building exercise that was not undertaken lightly by either LessWrong or the Forum. Both sites have chosen recipients who we hope will understand this.
How Do I Celebrate?
If you were one of the two hundred people to receive launch codes for LessWrong or the Forum, celebrate by doing nothing!
Other ways of celebrating:
You can discuss Petrov Day and threats to humanity with your friends.
You can hold a quiet, dignified ceremony with candles and the beautiful booklets created by Jim Babcock.
And you can also play on hard mode: “During said ceremony, unveil a large red button. If anybody presses the button, the ceremony is over. Go home. Do not speak.”
This has been a common practice at Petrov Day celebrations in Oxford, Boston, Berkeley, New York, and in other rationalist communities. It is often done with pairs of celebrations, each with a button that can end the other.
Rules of the Exercise
The following email was sent last night to 100 users from the EA Forum. 100 LessWrong users received a similar message.
Hello!
I invite you to participate in an exercise to determine whether the EA Forum can find 100 users it can trust with a genuinely high-stakes decision.
This year, we’re joining LessWrong in celebrating Petrov Day — a holiday where we celebrate the non-destruction of the world, and practice not destroying it ourselves.
To prove the goodwill and trust between the two sites, each site is sending “nuclear launch codes” to 100 users we think we can trust. I chose you personally to receive this message.
(For more on why we’ve done this, see this post.)
If you enter your launch codes into the launch console on the Forum’s homepage, they will cause LessWrong’s homepage to go down for the duration of Petrov Day. For the rest of the day, thousands of people will have a hard time using the site; some posts and comments will likely go unwritten. And I’ll have failed in my mission to find 100 people I could trust not to take down our friendly compatriots.
Your code is personalized; if someone enters it, we’ll know whose code took down the site.
This is your code: [CODE]
LessWrong and the Forum both have second-strike capability that will last for one hour after one of the sites is taken down. If the Forum’s homepage disappears, please consider very carefully whether or not you think it is correct to retaliate.
I hope you’ll help us all keep LessWrong safe, and that they’ll do the same for us.
Yours truly,
Aaron Gertler and the EA Forum team
To all, I wish you a safe and stable Petrov Day.
Here is the mirror of this post on LessWrong. You may wish to view it for the discussion there.
- Clarifying the Petrov Day Exercise (26 Sep 2021 22:39 UTC; 111 points)
- Petrov Day 2021: Mutually Assured Destruction? (LessWrong; 22 Sep 2021 1:04 UTC; 99 points)
- Honoring Petrov Day on the EA Forum: 2021 (25 Sep 2021 23:27 UTC; 88 points)
- Petrov Day Retrospective: 2021 (21 Oct 2021 10:12 UTC; 54 points)
- A list of Petrov buttons (LessWrong; 26 Oct 2022 20:50 UTC; 19 points)
- How would you run the Petrov Day game? (26 Sep 2021 23:37 UTC; 17 points)
First of all I’d like to thank the Forum team for their hard work producing this nuclear deterrent. We have been extremely lucky that LessWrong did not heed Bertrand Russell’s advice during their period of nuclear monopoly. However, I am concerned that we have not yet tested these weapons, and hence we cannot be entirely sure they will function as intended. Perhaps a test strike against a lightly populated military target like the https://www.nytimes.com/ would make an effective demonstration?
I had one of the EA Forum’s launch codes, but I decided to permanently delete it as an arms-reduction measure. I no longer have access to my launch code, though I admit that I cannot convincingly demonstrate this.
Thanks for this interesting exercise. Three things I want to say:
#1: For people unaware, pressing the red button means you cannot un-press it, though nothing bad will happen unless you enter a launch code.
After I read this page carefully, I thought it was going to be fine and harmless/reversible for me to press the red button, since I had not received a launch code. I have no intention of bringing the LessWrong site down, and don’t plan on entering any launch code, whether random or someone else’s, into the page. I also thought pressing the button would be anonymous, but entering launch codes would not be.
I was just curious about what the user interface/experience would be if I pressed the button but didn’t enter anything into it. Anyway, apparently pressing the button means you cannot “un-press” it. So if you’re as curious as I was, here’s what it looks like after pressing the button:
It was only after I read this LessWrong postmortem about Petrov Day 2020 that I learned that pressing the red button, even without entering anything into it, will likely be known to the Forum/LW team as having been done by you, though probably not announced publicly. So I’m posting this ahead of time to say that I pressed the button out of curiosity, not out of some bad intention.
#2: I think the EA Forum/LW teams should not publicly name the people who enter random codes into either webpage.
In the same LW postmortem, I also found out that the name of someone entering random codes could be revealed. I think both the EA Forum and LW team should probably not do this. (I haven’t entered a random code, but maybe someone else would, without knowing they’d possibly be publicly named.)
You probably don’t know whether the person entering random codes had read this post and understood the exercise before doing so. And they might feel some distress about being publicly named as having entered a random code.
#3: The EA Forum/LW teams should not read too much into the number of people who press the red button without entering a correct launch code.
I assume they probably won’t read into it, but I thought it would still be worth saying.
As seen in another comment here (which I assume is not a joke), someone accidentally pressed the button. And there are probably others who didn’t understand what the exercise was about, and were just tempted to click a big red button when they first landed on the Forum today. And then there are probably a few people like me who wanted to see what would happen if the button were pressed but no launch codes were entered.
Anyway, hopefully neither site gets taken down!
LessWrong mod speaking here. Just wanted to confirm that everything written here is correct.
To be clear, only the identity of an account that enters a valid code will be shared.
Great, thanks!
You shouldn’t read too much into the number of people pressing the button in terms of malice, but you can read into it in terms of negligence, lack of caution or impulsiveness. It’s how many people saw a big red button and pressed it without first checking what it does. It’s how many people took the chance that pressing it may do something bad even without the launch codes.
I was also curious what happens if you press the button and don’t enter the code, but didn’t check, because I view pressing the button as something you just don’t do—I wouldn’t do it even if a site admin specifically told me “you can press the button without any consequence”.
Though, having pressed the button, it was a good idea to publish how it looks, and you satisfied my curiosity.
I’d fairly strongly disagree with that take. I think it’s an extremely reasonable assumption that a somewhat cartoony red button someone deliberately put at the top of a website does no harm when pressed. Someone deliberately chose to put it there, and most features on websites are optimised for user interaction. Pressing it only looks unreasonable within the strong frame of having cultural context about Petrov Day.
Fair, though I still tend to check what things I press on do before I press them. If there’s no explanation I might still press them, but if it says “learn more” right there I will probably learn more before I do.
Yeah I guess you can read into it in terms of negligence, lack of caution or impulsiveness.
I’m coming to the conclusion that a private Petrov day game is good but a public one without community buy-in leads to a lot of tense disagreements as to what the game means. In some ways that’s a nice analogy for the human condition, in other ways it feels like afterwards we should have some kind of group therapy.
I think I’m softly in favour but I’m glad this only happens once a year. Also I’m 1% worried this is going to end in reputational damage to the community.
I’ve included a series of forecasting questions, in case people excited about forecasting on global catastrophic risks want a fast-feedback way to test their data gathering or calibration.
(Note that there is a relation between these questions—the sum of the last three probabilities is twice the first)
People offering forecasting questions like this is really cool, but is there any way to resolve these questions later and give people track records? Or at that point are we just re-inventing Metaculus too much?
Probably a question for Aaron Gertler / the EA Dev team. Semi-relatedly, is there a way to tag Aaron? That might be another good feature.
You can tag me with a quick DM for now — totally fine if you just literally send the URL of a comment and nothing else, if you want to optimize for speed/ease.
Tagging users to ping them is a much-discussed feature internally, with an uncertain future.
edit: Feature already exists, thanks Ruby!
Another feature request: Is it possible to make other people’s predictions invisible by default and then reveal them if you’d like? (Similar to how blacked-out spoilers work, which you can hover over to see the text.)
I wanted to add a prediction but then noticed that I heavily anchored on the previous responses and didn’t end up doing it.
There’s a user setting that lets you do this.
I agree this would be good to see!
I’m also interested in people’s predictions had the codes been anonymous (not been personalized). In this case, individual reputational risk would be low, so it would mostly be a matter of community reputational risk, and we’d learn more about if EAs or LWers would stab each other in the back (well, inconvenience each other) if they could get away with it.
I mean, having a website shut down is also annoying.
As of this comment: 40%, 38%, 37%, 5%. I haven’t taken into account time passing since the button appeared.
With 395 total codebearer-days, a launch has occurred once. This means that, with 200 codebearers this year, the Laplace prior for any launch happening is 40% ($1 - (1 - \frac{1}{396})^{200}$). The number of participants is about in between 2019 (125 codebearers) and 2020 (270 codebearers), so doing an average like this is probably fine.
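As a rough check of the arithmetic in the estimate above, here is a minimal Python sketch. It takes the comment's per-codebearer launch rate of 1/396 as given and assumes each of the 200 codebearers decides independently. The variable names are illustrative only.

```python
# Codebearer counts quoted in the comment: 125 in 2019 and 270 in 2020.
past_codebearer_days = 125 + 270      # 395 codebearer-days observed so far
past_launches = 1                     # one launch across those two years
codebearers_this_year = 200

# Per-codebearer launch probability as used in the comment: 1/396.
p_single = past_launches / (past_codebearer_days + 1)

# Probability that at least one of 200 independent codebearers launches.
p_any_launch = 1 - (1 - p_single) ** codebearers_this_year

print(f"P(any launch) = {p_any_launch:.3f}")
```

The result is just under 0.40, which matches the 40% quoted. The independence assumption is doing a lot of work here, since public pledges and retaliation make the codebearers' choices correlated.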
I think there’s a 5% chance that there’s a launch but no MAD, because Peter Wildeford has publicly committed to MAD, says 5%, and he knows himself best.
I think the EA Forum is a little bit, but not vastly, more likely to initiate a launch, because the EA Forum hasn’t done Petrov Day before and qualitatively people seem to be having a bit more fun and irreverence over here, so I’m giving 3% of the no-MAD probability to EA Forum staying up and 2% to LessWrong staying up.
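The four figures in this comment can be rebuilt from the pieces it describes. The sketch below assumes the forecasts are, in order, P(any launch), P(LessWrong goes down), P(EA Forum goes down) and P(exactly one site goes down). That ordering is inferred from the reasoning in the comment, because the question text itself is not reproduced in this thread.

```python
# All figures are percentages taken from the comment above.
p_any_launch = 40         # at least one site goes down
p_exactly_one_down = 5    # a launch with no retaliation ("no MAD")
p_only_lw_down = 3        # EA Forum stays up, LessWrong goes down
p_only_eaf_down = 2       # LessWrong stays up, EA Forum goes down

# The two one-sided outcomes should account for the whole no-MAD mass.
assert p_only_lw_down + p_only_eaf_down == p_exactly_one_down

p_both_down = p_any_launch - p_exactly_one_down   # launch plus retaliation
p_lw_down = p_both_down + p_only_lw_down
p_eaf_down = p_both_down + p_only_eaf_down

forecasts = (p_any_launch, p_lw_down, p_eaf_down, p_exactly_one_down)
print(forecasts)

# Relation noted alongside the forecasting questions: the last three
# probabilities sum to twice the first.
assert sum(forecasts[1:]) == 2 * forecasts[0]
```

Under that assumed ordering, the decomposition gives back the stated 40%, 38%, 37% and 5%. The "twice the first" relation holds because each per-site probability counts the both-down case once, and the exactly-one case is counted once on its own and once inside one of the per-site figures.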
Also, the reference class of launches doesn’t fully represent the current situation: last launch was more of a self-destruct. This time, it’s harming another website/community, which seems more prohibitive. So I think the prior is lower than 40%.
There is a chance to remove MAD by removing Peter’s launch codes’ validity, per my request.
My forecasts just before the button appeared: 25%, 17%, 19%, 14%.
I just want to say, I’m having a really bad day, and this made me feel a little better about myself.
So I won’t be launching the missiles, but er… I was also offered codes on LessWrong, so launching still kinda seems like blowing up my own side.
So it looks like we survived? (Yay)
I think it would be good for CEA to provide a clear explanation, that it (not LW) stands behind as an organization, of exactly what real value it views as being on the line here, and why it thinks it was worthwhile to risk that value.
I have a code. How much should I charge in counterfactual donations to effective charities to push it? How much do we think it’s worth to “win” this year’s Petrov day game?
I don’t think this is a terrible question. Personally, somewhere like $2k to $20k; $2k if one only considers the object level value (say $1M/year ÷ 365 days), $20k if one thinks that the value is higher than $1M/year or if one really values the intangibles. And because of the unilateralist curse &c, one should probably tend towards the higher amounts anyways.
Surely the value of the nuke goes down with time? Like having it shut down right now will be kinda annoying for me; much less so at 6:30 AM on Monday.
Fair point.
Yeah, 20k seemed about right to me also.
Ouch. It was a serious question. If someone were to pay 20k to malaria coalition, that’s 4 lives in expectation. Seems reasonable for the loss to our community.
Could someone explain why asking how much we should charge to press the button is taboo?
It could be seen as uncooperative or pushing things too far, like “for what value of donations should an Effective Altruist let me punch them?”
Hmm, I guess I find it strange. To me, asking this question is part of taking this ritual seriously. I.e., how valuable is this ritual to maintain?
The appropriate response to someone with the launch codes to a real nuke suggesting we sell them to terrorists is to shoot them, not to wait to see if the terrorists could pay a lot of money; by comparison a downvote seems very apt!
I mean it depends how much the terrorists are going to pay compared to the value of the damage. In the real world that value is unthinkably high, here, not so much.
This was proposed and discussed 2 years ago here.
One thought in a similar vein is that somebody with the codes should do it the other way around.
They should hold EA/LessWrong to ransom and say that if the communities don’t donate $X by the time Y to GiveWell charities, they will destroy the websites.
This has been suggested to me.
How could we be convinced that the donations were counterfactual?
Also, do you mean you’re (considering) taking bribes (to EA charities) to push the button?
I think I’d ask for the community here to agree first, but if someone suggested an amount that got half the upvotes of the total of this page I’d probably push it. That seems like the ethical choice.
I thought this comment was apt: https://www.lesswrong.com/posts/KQnYogkFTKc9wpWjY/postmortem-to-petrov-day-2020?commentId=Ye23vB8amHhxHLTdz
I can’t really tell if this is supposed to be a game or a sacred ritual.
I like Ben Pace’s response from the linked post:
Just because the site admins gave us the ability to shut down the site does not mean that it is harmless or permissible to do so. Even if they were to tell us it’s a game and it’s permissible to do so (which they did not) that still would not make it harmless nor necessarily permissible. The stakes still affect the permissibility regardless of what they were to say.
If it’s not permissible for me to shut down the site, why is it permissible for Aaron to send unsolicited emails to 100 people inviting them to shut it down?
He didn’t invite anyone to shut it down. He simply gave more people the power to shut it down than already had it and invited us to practice not using that power. (I think this was permissible.)
But for the sake of argument, even if Aaron did invite us to shut it down, that would not mean that Aaron’s action was necessarily permissible. Maybe it would be since service providers have the right to stop providing services, but when the stakes are sufficiently high suddenly deciding to just stop providing a service to harm all your customers seems unethical (e.g. if Bezos and/or whoever else has the authority at Amazon decided to just shut down Amazon without warning).
Some of the LessWrong and Forum moderators are away at a conference, so we won’t be publishing a retrospective right away, but we do expect to publish one eventually. Stay tuned!
An obvious question which I’m keen to hear people’s thoughts on: does MAD work here? Specifically, does it make sense for the EA Forum users with launch codes to commit to a retaliatory attack? The obvious case for it is deterrence. The obvious counterarguments are that the Forum could go down for a reason other than a strike from LessWrong, and that once the Forum is down, it doesn’t help us to take down LW (though this type of situation might be regular enough that future credibility makes it worth it).
Though of course it would be really bad for us to have to take down LW, and we really don’t want to. And I imagine most of us trust the 100 LW users with codes not to use them :)
The question is whether precommitment would actually change behavior. In this case, anyone shutting down either site is effectively playing nihilist, and doesn’t care, so it shouldn’t.
In fact, if it does anything, it would be destabilizing—if “they” commit to pushing the button if “we” do, they are saying they aren’t committed to minimizing damage overall, which should make us question whether we’re actually on the same side. (And this is a large part of why MAD only works if you are both selfish, and scared of losing.)
Furthermore, if there was a user who wanted to take down LW/EA for fun, a precommitment to MAD would only help that user take down an additional site.
Everyone cares about something, so maybe we should precommit to something more .. deterring? It should likely be something that’s not really bad, but still somewhat uncomfortable for the person to experience. (I realize that going down this path of thinking might produce actual outside-game harm)
Er.. I’m reading Khorton’s post now, and apparently people are viewing this game/event thing very differently, so I think with that meta-uncertainty I am unwilling to ruthlessly strategize.
Don’t forget that this is iterated, though. In order to save the site from going down a year from now, we might want to follow through on a tit-for-tat strategy this year.
I’m not certain that this is the correct play, but it is an important distinction from the usual MAD theorizing.
Surely we don’t, as anyone bringing down a site next year would still be some sort of reckless nihilist who just doesn’t care. So tit-for-tat this year wouldn’t actually change anything?
Thanks - I meant to point out that it wasn’t definitively single-shot, unlike actual, you know, destruction.
I know we’re trying to remember when the US and USSR had their weapons pointed at each other but it feels more like the North and South islands of New Zealand are trying to decide whether to nuke each other!
Edit: Not even something so violent—just temporarily inconvenience each other
I briefly saw a “Missile Incoming” message with a 60:00 timer (that wasn’t updating) on the buttons on the front pages of both LW and the EA Forum, at around 12pm EST, on mobile. Both messages were gone when I refreshed. Was this a bug or were they testing the functionality, testing us or preparing to test us?
I think it was an intentional false alarm, to better simulate Petrov’s situation.
They should have left it up longer if they wanted to test us with it, since it was gone when I reloaded the pages and the timer was never updated while it was up, even though each side would have an hour to retaliate (or it was supposed to give the impression that the hour was over, and it was already too late).
Since the timer wasn’t updating on either site, I assume they weren’t testing us (yet).
Correction: The annual Petrov Day celebration in Boston has never used the button.
Thanks — I’ve updated the post.
On the one hand, in order for MAD to work, decision-makers on both sides must be able to give credible threats for a retaliatory strike scenario. This is also true in this experiment’s case: if we assume that this will be iterated on future Petrov Days, then we must show that any tit-for-tat precommitments made are followed through.
But at the same time, if LessWrong takes down the EA Forum, it just seems like wanton destruction to similarly take it down, too. I know that, as a holder of the codes, I should ensure that I’m making a fully credible threat by precommitting to a retaliatory strike, but I want to take precommitments seriously and I don’t feel confident enough to precommit to such an action.
After giving this much thought, I decided to present the perhaps-too-weak claim that if the EA Forum goes down due to a LessWrong user pressing the button, I may press in retaliation. While this is not an idle threat, and I am serious about potentially performing a retaliatory strike, I am falling short of committing myself to that action in advance. I give more of my reasoning in my blog post on this.
(Ultimately, this is moot, since others are already willing to make such a precommitment so I don’t have to.)
I’ve made a much stronger precommitment
oh crap! I accidentally pressed the button :O I’m super sorry
I don’t think anything happens unless you enter the code, too.
Attention EA Forum—I am a chosen user of LessWrong and I have the codes needed to destroy the EA Forum. I hereby make a no first use pledge and I will not enter my codes for any reason, even if asked to do so. I also hereby pledge to second strike—if LessWrong is taken down, I will retaliate.
Ahh, Nixon’s madman strategy.
I downvoted this. I’m not sure if that was an appropriate way to express my views about your comment, but I think you should lift your pledge to second strike, and I think it’s bad that you pledged to do so in the first place.
I think one important disanalogy between real nuclear strategy and this game is that there’s kind of no reason to press the button, which means that for someone pressing the button, we don’t really understand their motives, which makes it less clear that this kind of comment addresses their motives.
Consider that last time LessWrong was persuaded to destroy itself, it was approximately by accident. Especially considering the context of the event we’re commemorating was essentially another accident, I think the most likely story for why one of the sites gets destroyed is not intentional, and thus not affected by precommitments to retaliate.
All of this seems consistent with Peter’s pledge to second strike being +EV, as long as he’s lying.
Yeah, that did occur to me. I think it’s more likely that he’s telling the truth, and even if he’s lying, I think it’s worth engaging as if he’s sincere, since other people might sincerely believe the same things.
(I also downvoted Peter’s comment when I first read it, for similar reasons.)
Please don’t retaliate; that just ~doubles the damage for no reason. Per David’s comments, I don’t think threatening retaliation helps the situation here.
Too bad—I am committing to retaliating to establish a deterrent.
What if LessWrong is taken down for another reason? Eg. the organisers of this game/exercise want to imitate the situation Petrov was in, so they create some kind of false alarm
Last year the site looked very obviously nuked. If I see that situation, I will retaliate. If I see some other situation, I will use my best judgement.
Surely after the site has been nuked you will no longer be able to enter the codes, because your silos will have been destroyed? And prior to that you risk mis-classifying our civilian space exploration vehicles, whose optimal launch trajectory just happens to go over LessWrong airspace, as weapons?
I hope we invested in secure second strike capabilities. I think LessWrong has a nuclear triad: we have guest posts on other websites that can launch nukes even after LessWrong itself has been destroyed.
Were you selected to have the codes for both LessWrong and the EA Forum? I see you made a similar post on LW.
That’s correct.
I motion to:
remove Peter Wildeford’s launch codes from the list of valid launch codes for both this forum and LessWrong. Reason: he clearly does not understand that this precommitment is unlikely to deter any of the ‘trusted’ LW users from pressing the button (see this comment by David Mannheim and the discussion below)
evaluate our method of choosing ‘trusted users’. We may want to put specific users that take dangerous actions like these on a blacklist for future instances of Petrov Day.
I would ask how users are chosen, but I imagine that making that knowledge more available increases the risk that it will be misused by nefarious actors.
Retracting my comment because it’s unclear what kind of event (game, ritual, experiment) this is.
That seems harsh.
I think this just gets back to what the game is.
If it’s a game, I think what Peter did was fun and cool.
If it’s a ritual, then yeah maybe it was irresponsible (maybe not I don’t know).
Personally, it made me think about precommitments, which seems good, so I’m glad he did it.
Yeah, my comments should be read as [in-game] comments, not as [ritual] comments, and I mean it all in good nature!
Damn, seeing the social complexity of this event with the uncertainty about what it is quickly made it feel more like a social minefield than a game.
Yeah I actually thought you were legit mad at me rather than just in-game strategizing, so that’s +1 to this game being unnecessarily stressful.
Thanks for clarifying.
Oops! Sorry Peter, not my intention at all!
Haha it’s ok!
Hopefully we can actually play a game version sometime.
I have also used my strong downvote capability to reduce the signal of Peter’s message. I hereby apologize for any harm outside of this game (Peter’s total karma), but I saw no other way.
You could upvote something else I said ;)
I think this is an excellent contribution to the forum: strong upvote! ;)
I can’t parse the concept of ‘precommitment’. I don’t intend to launch a first strike, but maybe something will happen in the next few hours to change my intention, and I don’t have any way to restructure my brain to reduce that possibility to 0. The reverse applies for second striking.
Sure, precommitments are not certain, but they’re a way of raising the stakes for yourself (putting more of your reputation on the line) to make it more likely that you’ll follow through, and more convincing to other people that this is likely.
In other words: of course you don’t have any way to reach probability 0, but you can form intentions and make promises that reduce the probability (I guess technically this is “restructuring your brain”?)
This is not how I understand the term. What you’re describing is how I would describe the word “commitment”. But a “precommitment” is more strict; the idea is that you have to follow through in order to ensure that you can get through a Newcomb’s paradox situation.
You can use precommitments to take advantage of time-travel shenanigans, to successfully one-box Newcomb, or to ensure that near-copies of you (in the multiverse sense) can work together to achieve things that you otherwise wouldn’t.
With that said, it may make sense to say that we humans can’t really precommit in these kinds of ways. But to the extent that we might be able to, we may want to try, so that if any of these scifi scenarios ever do come up, we’d be able to take advantage of them.
Yeah, if precommitment is to be distinguished from regular ‘intending to do a thing’ or ‘stating such intention’, it must be ripping out your steering wheel in a game of chicken.
Making a promise not to do something I didn’t intend to do anyway, where doing it would already harm me socially, doesn’t seem to add much beyond the value of stating my intentions (and the statement could still be a lie).
Totally agreed. Low-key one of my pet peeves is that most commitments are not precommitments.
Hello partner! ;)