Request for input on multiverse-wide superrationality (MSR)

I am cur­rently work­ing on a re­search pro­ject as part of CEA’s sum­mer re­search fel­low­ship. I am build­ing a sim­ple model of so-called “mul­ti­verse-wide co­op­er­a­tion via su­per­ra­tional­ity” (MSR). The model should in­cor­po­rate the most rele­vant un­cer­tain­ties for de­ter­min­ing pos­si­ble gains from trade. To be able to make this model max­i­mally use­ful, I would like to ask oth­ers for their opinions on the idea of MSR. For in­stance, what are the main rea­sons you think MSR might be ir­rele­vant or might not work as it is sup­posed to work? Which ques­tions are unan­swered and need to be ad­dressed be­fore be­ing able to as­sess the merit of the idea? I would be happy about any in­put in the com­ments to this post or via mail to jo­hannes@foun­da­tional-re­

An overview of re­sources on MSR, in­clud­ing in­tro­duc­tory texts, can be found on the link above. To briefly illus­trate the idea, con­sider two ar­tifi­cial agents with iden­ti­cal source code play­ing a pris­oner’s dilemma. Even if both agents can­not causally in­ter­act, one agent’s ac­tion pro­vides them with strong ev­i­dence about the other agent’s ac­tion. Ev­i­den­tial de­ci­sion the­ory and re­cently pro­posed var­i­ants of causal de­ci­sion the­ory (Yud­kowsky and Soares, 2018; Spohn, 2003; Poel­linger, 2013) say that agents should take such ev­i­dence into ac­count when mak­ing de­ci­sions. MSR is based on the idea that (i) hu­mans on Earth are in a similar situ­a­tion as the two AI agents: there prob­a­bly is a large or in­finite mul­ti­verse con­tain­ing many ex­act copies of hu­mans on Earth (Teg­mark 2003, p. 464), but also agents similar but non-iden­ti­cal to hu­mans. (ii) If hu­mans and these other, similar agents take each other’s prefer­ences into ac­count, then, due to gains from trade, ev­ery­one is bet­ter off than if ev­ery­one were to pur­sue only their own ends. It fol­lows from (i) and (ii) that hu­mans should take the prefer­ences of other, similar agents in the mul­ti­verse into ac­count, to pro­duce the ev­i­dence that they do in turn take hu­mans’ prefer­ences into ac­count, which leaves ev­ery­one bet­ter off.

Ac­cord­ing to Oester­held (2017, sec. 4), this idea could have far-reach­ing im­pli­ca­tions for pri­ori­ti­za­tion. For in­stance, given MSR, some forms of moral ad­vo­cacy could be­come in­effec­tive: ad­vo­cat­ing for their par­tic­u­lar val­ues pro­vides agents with ev­i­dence that oth­ers do the same, po­ten­tially neu­tral­iz­ing each other’s efforts. More­over, MSR could play a role in de­cid­ing which strate­gies to pur­sue in AI al­ign­ment. It could be­come es­pe­cially valuable to en­sure an AGI will en­gage in a mul­ti­verse-wide trade.