I find this argument unconvincing. The vast majority of ‘simulations’ humans run are very unlike our actual history. The modal simulated entity to date is probably an NPC from World of Warcraft, a zergling from Starcraft or similar. This makes it incredibly speculative to imagine what our supposed simulators might be like, what resources they might have available and what their motivations might be.
Also the vast majority of ‘simulations’ focus on ‘exciting’ moments: pitched Team Fortress battles, epic RPG narratives, or at least active interaction with the simulators. If you and your workmates are just tapping away on your keyboards in your office doing theoretical existential risk research, the probability that someone like us has spent their precious resources to (re)create you seems radically lower than if you’re (say) fighting a pitched battle.
Thanks for commenting!
I find this argument unconvincing. The vast majority of ‘simulations’ humans run are very unlike our actual history. The modal simulated entity to date is probably an NPC from World of Warcraft, a zergling from Starcraft or similar. This makes it incredibly speculative to imagine what our supposed simulators might be like, what resources they might have available and what their motivations might be.
Agreed, but could you explain why that would be an objection to Brian’s argument?
Also the vast majority of ‘simulations’ focus on ‘exciting’ moments: pitched Team Fortress battles, epic RPG narratives, or at least active interaction with the simulators. If you and your workmates are just tapping away on your keyboards in your office doing theoretical existential risk research, the probability that someone like us has spent their precious resources to (re)create you seems radically lower than if you’re (say) fighting a pitched battle.
I do not know, because I agree with your 1st paragraph about it being quite hard to predict future simulated entities based on past history.
I mainly had in mind Pablo’s summary. It’s been a long time since I read Brian’s essay, and I don’t have bandwidth to review it now, so if he says something substantially different there, my argument might not apply. But basically every argument I remember hearing about how the simulation argument implies we should modify our behaviour presupposes that we have some level of inferential knowledge of our simulators (this presupposition being hidden in the assumption that simulations would be primarily ancestor simulations). This presupposition seems basically false to me, because, for example:
a. A zergling would struggle to gain much inferential knowledge of its simulators’ motivations.
b. A zergling looking around at the scope and complexity of its universe would typically observe that it itself is 2-dimensional (albeit with some quasi-3D properties), and is made from approx 38x94 ‘atoms’. Perhaps more advanced simulations would both be more numerous (and hence a higher proportion of simulationspace) and more complex, but it still seems hard to imagine they’ll average to anything like the same level of complexity as we see in our universe, or have a consistent difference from it.
c. If the simulation argument is correct for a single layer of reality, it seems (to the degree permitted by a and b) far more likely that it’s correct for multiple, perhaps vast numbers of, layers of reality (insert ‘spawn more Overlords’ joke here). Thus the beings whose decisions and motivations a zergling is ultimately trying to guess at are not us, but someone whose distance from us is approx n(|human - zergling|), where n is the number of layers. It’s hard to imagine the zergling, or us, could make any intelligible assumptions at all about them at that level of removal.
To show this in Pablo’s argument:
Now suppose (3) is true. Here it seems plausible that the simulators will restart the simulation very quickly after the sims manage to kill themselves.
For this to be ‘plausible’ is to assert that we know our simulators’ motivations well enough to know that whatever they hoped to gain by running us will ‘plausibly’ be motivating enough for them to do it a second time in much the same form, and that their simulators will at least permit it, and so on.
Another version of the anti-x-risk argument from simulation I’ve heard (and which I confess with hindsight I was conflating Pablo’s with—maybe it’s part of Brian’s argument?) is that the simulators will likely switch off our universe if it expands beyond a certain size due to resource constraints. Again, this argument implies IMO vastly too high confidence in both their motivation and resource limits.
Thanks for explaining that!
Brian concludes that L/S = T*D/F, where:
L is the cost-effectiveness of longtermist interventions.
S is the cost-effectiveness of neartermist interventions.
T “represent[s] how much more important it is to influence a unit of sentience by the average future digital agent than a present-day biological one for these reasons [“future, simulated human might have much higher intensity of experience per unit time, and we may have much greater control over the quality of his experience”]”.
D is “a discount representing how much harder it is to actually end up helping a being in the far future than in the near term, due to both uncertainty and the muted effects of our actions now on what happens later on”.
F is “the fraction of all computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets (both the most intelligent and also less intelligent creatures on those planets)”. “A non-solipsish simulation is one in which most or all of the people and animals who seem to exist on Earth are actually being simulated to a non-trivial level of detail”.
Brian guesses T = 10^4, D = 10^-3, and F = 10^-6, thus concluding L/S = 10^7. I guess you are saying with your comment just above that F should be much lower than 10^-6? For reference, here is Brian’s motivation for F = 10^-6:
It’s very unclear how many simulations of almost-space-colonizing planets superintelligences would run. The fraction of all computing resources spent on this might be close to 100% or might be below 10^-15. It’s hard to predict resource allocation by advanced civilizations. But I set this parameter based on assuming that ~10^-4 of sent-years will go toward ancestor simulations of some sort (this is probably too high, but it’s biased upward in expectation, since, e.g., maybe there’s a 0.05% chance that post-humans devote 20% of sent-years to ancestor simulations), and only 1% of those simulations will be of the almost-space-colonizing period (since there might also be many simulations of the origin of life, prehistory, and the early years after a planet’s “singularity”). If we think that simulations contain more sentience per petaflop of computation than do other number-crunching calculations, then 10^-4 of sent-years devoted to ancestor simulations of some kind may mean less than 10^-4 of all raw petaflops devoted to such simulations.
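As a quick sanity check on the arithmetic, here is a minimal Python sketch that plugs Brian’s illustrative guesses into L/S = T*D/F. The function name `longtermist_ratio` is my own label, not something from Brian’s essay, and the inputs are only the values quoted above.

```python
def longtermist_ratio(t: float, d: float, f: float) -> float:
    """Return L/S = T * D / F, the simplified ratio quoted above."""
    if f <= 0:
        raise ValueError("F must be a positive fraction")
    return t * d / f


# F as motivated in the quote: ~10^-4 of sent-years go to ancestor
# simulations, and ~1% of those cover the almost-space-colonizing period.
F = 1e-4 * 0.01

# T = 10^4 and D = 10^-3 are Brian's illustrative guesses.
ratio = longtermist_ratio(t=1e4, d=1e-3, f=F)
print(f"F = {F:.0e}, L/S = {ratio:.0e}")
```

The point is only that the 10^7 figure follows mechanically from the three guesses, so any disagreement has to be about the inputs (or about whether the model is meaningful at all).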
The informality of that equation makes it hard for me to know how to reason about it. For example:
T, D and F seem heavily interdependent.
I’m just not sure how to parse ‘computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets’. What does it mean for a year of sentient life to be spent simulating something? Do you think he means what fraction of experienced years exist in ancestor simulations? I’m still confused by this after reading the last paragraph.
I’m not sure what the expression’s value represents. Are we supposed to multiply some further estimate we have of longtermist work by 10^7? (If so, what estimate is so low that 10^7 isn’t enough of a multiplier to make it still eclipse all short-termist work?)
If you feel like you understand it, maybe you could give me a concrete example of how to apply this reasoning?
For what it’s worth, I have much more prosaic reasons for doubting the value of explicitly longtermist work both in practice (the stuff I’ve discussed with you before that makes me feel like it’s misprioritised) and in principle (my instinct is that in situations that reduce to a kind of Pascalian mugging, xP(x) where x is a counterfactual payoff increase and P(x) is the probability of that payoff increase, approaches 0 as x tends to infinity).
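To make that last instinct slightly more concrete, here is one way to write it down. The notation is mine, and the power-law tail is just an assumed example of a probability that shrinks fast enough, not a claim about how such probabilities actually behave:

```latex
\lim_{x \to \infty} x \, P(x) = 0,
\qquad \text{e.g. if } P(x) \propto x^{-(1+\varepsilon)} \text{ with } \varepsilon > 0,
\text{ then } x \, P(x) \propto x^{-\varepsilon} \to 0 .
```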
I’m just not sure how to parse ‘computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets’.
I agree.
I think F = “sent-years respecting the simulations of the beings in almost-space-colonizing ancestral planets”/“all sent-years of the universe”. Brian defines sent-years as follows:
I’ll define 1 sent-year as the amount of complexity-weighted experience of one life-year of a typical biological human. That is, consider the sentience over time experienced in a year by the median biological human on Earth right now. Then, a computational process that has 46 times this much subjective experience has 46 sent-years of computation. Computations with a higher density of sentience may have more sents even if they have fewer FLOPS.
I said Brian concluded that L/S = T*D/F, but this was after simplifying L/S = T*D/(E/N + F), where:
E is “the amount of sentience on Earth in the near term (say, the next century or two)”.
“On average, these civilizations [“that are about to colonize space”] will run computations whose sentience is equivalent to that of N human-years”.
Then Brian says:
Everyone agrees that E/N is very small, perhaps less than 10^-30 or something, because the far future could contain astronomical amounts of sentience [see e.g. Table 1 of Newberry 2021]. If F is not nearly as small (and I would guess that it’s not), then we can approximate L/S as T * D / F.
The simulation argument dampening future fanaticism comes from Brian assuming that E/N << F, in which case L/S = T*D/F, and therefore prioritising the future no longer depends on its size. However, for the reasons you mentioned (we are not simulating our ancestors much), I feel like we should a priori expect E/N and F to be similar, and correlated, in which case L/S will still be huge unless it is countered by a very small D (i.e. if the typical low tractability argument against longtermism goes through).
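A rough sketch of how much hinges on whether E/N << F, using the unsimplified L/S = T*D/(E/N + F). The function name and the second value of F are my own assumptions for illustration; only T, D, F = 10^-6 and E/N = 10^-30 come from Brian’s text.

```python
def longtermist_ratio_full(t: float, d: float, e_over_n: float, f: float) -> float:
    """Return L/S = T * D / (E/N + F), the unsimplified form."""
    return t * d / (e_over_n + f)


T, D = 1e4, 1e-3      # Brian's illustrative guesses
E_OVER_N = 1e-30      # "perhaps less than 10^-30 or something"

# Case 1: E/N << F, so the size of the future drops out of the ratio.
dampened = longtermist_ratio_full(T, D, E_OVER_N, f=1e-6)

# Case 2 (assumed for illustration): F is about as small as E/N.
undampened = longtermist_ratio_full(T, D, E_OVER_N, f=1e-30)

print(f"E/N << F: L/S ~ {dampened:.1e}")
print(f"E/N ~ F:  L/S ~ {undampened:.1e}")
```

With these inputs the first case gives roughly 10^7 and the second roughly 5 x 10^30, so the dampening disappears as soon as F is allowed to be about as small as E/N, and only a very small D could then pull L/S back down.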
I’m not sure what the expression’s value represents. Are we supposed to multiply some further estimate we have of longtermist work by 10^7? (If so, what estimate is so low that 10^7 isn’t enough of a multiplier to make it still eclipse all short-termist work?)
I think L/S is just supposed to be a heuristic for how much to prioritise longtermist actions relative to neartermist ones. Brian’s inputs lead to 10^7, but they were mainly illustrative:
This [L/S = 10^7] happens to be bigger than 1, which suggests that targeting the far future is still ~10 million times better than targeting the short term. But this calculation could have come out as less than 1 using other possible inputs. Combined with general model uncertainty, it seems premature to conclude that far-future-focused actions dominate short-term helping. It’s likely that the far future will still dominate after more thorough analysis, but by much less than a naive future fanatic would have thought.
However, it seems to me that, even if one thinks that both E/N and F are super small, L/S could still be smaller than 1 due to super small D. This relates to your point that:
my instinct is that in situations that reduce to a kind of Pascalian mugging, xP(x) where x is a payoff size and P(x) is the probability of that payoff, approaches 0 as x tends to infinity
I share your instinct. I think David Thorstad calls that rapid diminution.
If you feel like you understand it, maybe you could give me a concrete example of how to apply this reasoning?
I think Brian’s reasoning works more or less as follows. Neglecting the simulation argument, if I save one life, I am only saving one life. However, if F = 10^-16[1] of sentience-years are spent simulating situations like my own, and the future contains N = 10^30 sentience-years, then me saving a life will imply saving F*N = 10^14 copies of the person I saved. I do not think the argument goes through because I would expect F to be super small in this case, such that F*N is similar to 1.
Appreciate the patient breakdown :)
This [L/S = 10^7] happens to be bigger than 1, which suggests that targeting the far future is still ~10 million times better than targeting the short term. But this calculation could have come out as less than 1 using other possible inputs. Combined with general model uncertainty, it seems premature to conclude that far-future-focused actions dominate short-term helping. It’s likely that the far future will still dominate after more thorough analysis, but by much less than a naive future fanatic would have thought.
This is more of a sidenote, but given all the empirical and model uncertainty in any far-future oriented work, it doesn’t seem like adding a highly speculative counterargument with its own radical uncertainties should meaningfully shift anyone’s priors. It seems like a strong longtermist could accept Brian’s views at face value and say ‘but the possibility of L/S being vastly bigger than 1 means we should just accept the Pascalian reasoning and plow ahead regardless’, while a sceptic could point to rapid diminution and say no simulationy weirdness is necessary to reject these views.
(Sidesidenote: I wonder whether anyone has investigated the maths of this in any detail? I can imagine there being some possible proof by contradiction of RD (rapid diminution), along the lines of ‘if there were some minimum amount that it was rational for the muggee to accept, a dishonest mugger could learn that and raise the offer beyond it whereas an honest mugger might not be able to, and therefore, when the mugger’s epistemics are taken into account, you should not be willing to accept that amount’. Though I can also imagine this might just end up as an awkward integral that you have to choose your values for somewhat arbitrarily.)
I think Brian’s reasoning works more or less as follows. Neglecting the simulation argument, if I save one life, I am only saving one life. However, if F = 10^-16[1] of sentience-years are spent simulating situations like my own, and the future contains N = 10^30 sentience-years, then me saving a life will imply saving F*N = 10^14 copies of the person I saved. I do not think the argument goes through because I would expect F to be super small in this case, such that F*N is similar to 1.
For the record, this kind of thing is why I love Brian (aside from him being a wonderful human): I disagree with him vigorously on almost every point of detail on reflection, but he always comes up with some weird take. I had either forgotten or never saw this version of the argument, and was imagining the version closer to Pablo’s that talks about the limited value of the far future rather than the increased near-term value.
That said, I still think I can basically C&P my objection. It’s maybe less that I think F is likely to be super small, and more that, given our inability to make any intelligible statements about our purported simulators’ nature or intentions, it feels basically undefined (or, if you like, any statement whatsoever about its value is ultimately going to be predicated on arbitrary assumptions), making the equation just not parse (or not output any value that could guide our behaviour).
[1] Brian’s F = 10^-6 divided by the human population of 10^10.