According to the simulation argument, either (1) humanity will soon become extinct; (2) posthumanity will never run ancestor simulations; or (3) we are almost certainly living in a simulation. Suppose (1) is true. Then the classical utilitarian case for focusing on existential risk reduction loses much of its force, since we are by assumption doomed to perish quickly anyway. Now suppose (3) is true. Here it seems plausible that the simulators will restart the simulation very quickly after the sims manage to kill themselves. So the case for focusing on existential risk is also weakened considerably. It is only on the second of the three scenarios that extinction is (roughly) as bad as classical utilitarians take it to be. So we can conclude: if you think there is a chance that posthumanity will run ancestor simulations (~2), the prospect of human extinction is much less serious than you thought it was.
If I recall correctly, the argument goes more or less as follows. The larger the future, the greater the likelihood of us being in a short-lived simulation, thus having negligible influence in the far future. The 2 effects cancel out, and therefore the ratio between far future value and near future value does not depend on the size of the future. The ratio is roughly inversely proportional to the fraction of resources going towards simulations, i.e. 2 times as much resources going to simulations means the far future is half as valuable relative to the near term future.
The possibility of us living in a short-lived simulation isn’t enough to count much against longtermism, because it’s also possible we could live in a long-lived simulation or a long-lived world, and those possibilities will be much higher stakes, so still dominate expected value calculations unless we assign them tiny probability together.
I think the argument crucially depends on the assumption that simulations will be disproportionately short-lived, and we have acausal influence over agents in other simulations. If for each long-running world (simulated or otherwise) with moral agents and moral patients, there are N short-lived worlds with (moral) agents and moral patients, and our actions are correlated with those of agents across worlds, then we get to decide for more agents in short-lived worlds than long-lived ones. Basically, acausal influence will boost the expected value of all interventions, but if moral patients are disproportionately in short-lived simulations with agents whose decisions we’re correlated with relative to long-run simulations with agents whose decisions we’re correlated with (or more skewed towards the short-lived than it seems for our own world), acausal influence will disproportionately boost the expected value of neartermist interventions relative to longtermist ones.
Also, ~all of the expected value will be acausal if we fully count the value of acausal influence, based on the evidentialist’s wager and similar, given the possibility of very large or even infinite numbers of agents with whom we’re correlated.
I think the argument crucially depends on the assumption that simulations will be disproportionately short-lived
Yes, the argument depends on Brian’s parameter F not being super small. F is “fraction of all computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets (both the most intelligent and also less intelligent creatures on those planets)”. “A non-solipsish simulation is one in which most or all of the people and animals who seem to exist on Earth are actually being simulated to a non-trivial level of detail”. Brian guessed F = 10^-6, but it feels like it should be much smaller to me. If the value of the future is e.g. 10^30 times the value of this century, it is maybe reasonable to assume that the vast vast majority of computational sent-years are also simulations of the far future, as opposed to simulations of almost-space-colonizing ancestral planets.
I find this argument unconvincing. The vast majority of ‘simulations’ humans run are very unlike our actual history. The modal simulated entity to date is probably an NPC from World of Warcraft, a zergling from Starcraft or similar. This makes it incredibly speculative to imagine what our supposed simulators might be like, what resources they might have available and what their motivations might be.
Also the vast majority of ‘simulations’ focus on ‘exciting’ moments—pitched Team Fortress battles, epic RPG narratives, or at least active interaction with the simulators. If you and your workmates are just tapping away in your office on your keyboard doing theoretical existential risk research, the probability that someone like us has spent their precious resources to (re)create you seems radically lower than if you’re (say) fighting a pitched battle.
I find this argument unconvincing. The vast majority of ‘simulations’ humans run are very unlike our actual history. The modal simulated entity to date is probably an NPC from World of Warcraft, a zergling from Starcraft or similar. This makes it incredibly speculative to imagine what our supposed simulators might be like, what resources they might have available and what their motivations might be.
Agreed, but could you explain why that would be an objection to Brian’s argument?
Also the vast majority of ‘simulations’ focus on ‘exciting’ moments—pitched Team Fortress battles, epic RPG narratives, or at least active interaction with the simulators. If you and your workmates are just tapping away in your office on your keyboard doing theoretical existential risk research, the probability that someone like us has spent their precious resources to (re)create you seems radically lower than if you’re (say) fighting a pitched battle.
I do not know, because I agree with your 1st paragraph about it being quite hard to predict future simulated entities based on past history.
I mainly had in mind Pablo’s summary. It’s been a long time since I read Brian’s essay, and I don’t have bandwidth to review it now, so if he says something substantially different there, my argument might not apply. But basically every argument I remember hearing about how the simulation argument implies we should modify our behaviour presupposes that we have some level of inferential knowledge of our simulators (this presupposition being hidden in the assumption that simulations would be primarily ancestor simulations). This presupposition seems basically false to me, because, for example:
a. A zergling would struggle to gain much inferential knowledge of its simulators’ motivations.
b. A zergling looking around at the scope and complexity of its universe would typically observe that it itself is 2-dimensional (albeit with some quasi-3D properties), and is made from approx 38x94 ‘atoms’. Perhaps more advanced simulations would both be more numerous (and hence a higher proportion of simulationspace) and more complex, but it still seems hard to imagine they’ll average to anything like the same level of complexity as we see in our universe, or have a consistent difference from it.
c. If the simulation argument is correct for a single layer of reality, it seems (to the degree permitted by a and b) far more likely that it’s correct for multiple, perhaps vast numbers of layers of reality (insert ‘spawn more Overlords’ joke here). Thus the people whose decisions and motivations a zergling is trying to ultimately guess at are not us, but someone whose distance from us is approx n(|human - zergling|), where n is the number of layers. It’s hard to imagine the zergling—or us—could make any intelligible assumptions at all about them at that level of removal.
To show this in Pablo’s argument:
Now suppose (3) is true. Here it seems plausible that the simulators will restart the simulation very quickly after the sims manage to kill themselves.
For this to be ‘plausible’ is to assert that we know our simulators’ motivations well enough to know that whatever they hoped to gain by running us will ‘plausibly’ be motivating enough for them to do it a second time in much the same form, and that their simulators will at least permit it, and so on.
Another version of the anti-x-risk argument from simulation I’ve heard (and which I confess with hindsight I was conflating Pablo’s with—maybe it’s part of Brian’s argument?) is that the simulators will likely switch off our universe if it expands beyond a certain size due to resource constraints. Again, this argument implies IMO vastly too high confidence in both their motivation and resource limits.
L is the cost-effectiveness of longtermist interventions.
S is the cost-effectiveness of neartermist interventions.
T “represent[s] how much more important it is to influence a unit of sentience by the average future digital agent than a present-day biological one for these reasons [“future, simulated human might have much higher intensity of experience per unit time, and we may have much greater control over the quality of his experience”]”.
D is “a discount representing how much harder it is to actually end up helping a being in the far future than in the near term, due to both uncertainty and the muted effects of our actions now on what happens later on”.
F is “the fraction of all computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets (both the most intelligent and also less intelligent creatures on those planets)”. “A non-solipsish simulation is one in which most or all of the people and animals who seem to exist on Earth are actually being simulated to a non-trivial level of detail”.
Brian guesses T = 10^4, D = 10^-3, and F = 10^-6, thus concluding L/S = 10^7. I guess you are saying with your comment just above that F should be much lower than 10^-6? For reference, here is Brian’s motivation for F = 10^-6:
It’s very unclear how many simulations of almost-space-colonizing planets superintelligences would run. The fraction of all computing resources spent on this might be close to 100% or might be below 10^-15. It’s hard to predict resource allocation by advanced civilizations. But I set this parameter based on assuming that ~10^-4 of sent-years will go toward ancestor simulations of some sort (this is probably too high, but it’s biased upward in expectation, since, e.g., maybe there’s a 0.05% chance that post-humans devote 20% of sent-years to ancestor simulations), and only 1% of those simulations will be of the almost-space-colonizing period (since there might also be many simulations of the origin of life, prehistory, and the early years after a planet’s “singularity”). If we think that simulations contain more sentience per petaflop of computation than do other number-crunching calculations, then 10^-4 of sent-years devoted to ancestor simulations of some kind may mean less than 10^-4 of all raw petaflops devoted to such simulations.
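As a quick check of the arithmetic in the passage above, here is a minimal Python sketch. The variable names are mine rather than Brian's, and the inputs are just the illustrative guesses quoted in this thread (T = 10^4, D = 10^-3, and F built from ~10^-4 of sent-years times 1%).

```python
import math

# Illustrative guesses quoted in the thread (not measured quantities).
T = 1e4    # relative importance of influencing future digital sentience
D = 1e-3   # discount for how hard it is to help far-future beings

# F is assembled from two guesses: ~10^-4 of sent-years go to ancestor
# simulations, and 1% of those cover the almost-space-colonizing period.
share_ancestor_sims = 1e-4      # hypothetical variable name
share_pre_colonization = 1e-2   # hypothetical variable name
F = share_ancestor_sims * share_pre_colonization

# Simplified ratio L/S = T*D/F.
ratio = T * D / F

print(f"F = {F:.0e}, L/S = {ratio:.0e}")
```

Because F sits in the denominator, any argument for a smaller F pushes L/S up proportionally under this simplified formula.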
The informality of that equation makes it hard for me to know how to reason about it. For example:
T, D and F seem heavily interdependent.
I’m just not sure how to parse ‘computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets’. What does it mean for a year of sentient life to be spent simulating something? Do you think he means what fraction of experienced years exist in ancestor simulations? I’m still confused by this after reading the last paragraph.
I’m not sure what the expression’s value represents. Are we supposed to multiply some further estimate we have of longtermist work by 10^7? (if so, what estimate is it that’s so low that 10^7 isn’t enough of a multiplier to make it still eclipse all short termist work?)
If you feel like you understand it, maybe you could give me a concrete example of how to apply this reasoning?
For what it’s worth, I have much more prosaic reasons for doubting the value of explicitly longtermist work both in practice (the stuff I’ve discussed with you before that makes me feel like it’s misprioritised) and in principle (my instinct is that in situations that reduce to a kind of Pascalian mugging, xP(x) where x is a counterfactual payoff increase and P(x) is the probability of that payoff increase, approaches 0 as x tends to infinity).
I’m just not sure how to parse ‘computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets’.
I think F = “sent-years respecting the simulations of the beings in almost-space-colonizing ancestral planets”/”all sent-years of the universe”. Brian defines sent-years as follows:
I’ll define 1 sent-year as the amount of complexity-weighted experience of one life-year of a typical biological human. That is, consider the sentience over time experienced in a year by the median biological human on Earth right now. Then, a computational process that has 46 times this much subjective experience has 46 sent-years of computation. Computations with a higher density of sentience may have more sents even if they have fewer FLOPS.
I said Brian concluded that L/S = T*D/F, but this was after simplifying L/S = T*D/(E/N + F), where:
E is “the amount of sentience on Earth in the near term (say, the next century or two)”.
“On average, these civilizations [“that are about to colonize space”] will run computations whose sentience is equivalent to that of N human-years”.
Then Brian says:
Everyone agrees that E/N is very small, perhaps less than 10^-30 or something, because the far future could contain astronomical amounts of sentience [see e.g. Table 1 of Newberry 2021]. If F is not nearly as small (and I would guess that it’s not), then we can approximate L/S as T * D / F.
The simulation argument dampening future fanaticism comes from Brian assuming that E/N << F, in which case L/S = T*D/F, and therefore prioritising the future no longer depends on its size. However, for the reasons you mentioned (we are not simulating our ancestors much), I feel like we should a priori expect E/N and F to be similar, and correlated, in which case L/S will still be huge unless it is countered by a very small D (i.e. if the typical low tractability argument against longtermism goes through).
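To make the two regimes concrete, here is a small Python sketch of the unsimplified expression L/S = T*D/(E/N + F). The function name, the normalization E = 1, and the choice F = E/N in the second case are my own assumptions for illustration, not values from Brian's essay.

```python
import math

def longtermist_ratio(T, D, F, E, N):
    """Unsimplified ratio L/S = T*D / (E/N + F), using the thread's symbols."""
    return T * D / (E / N + F)

T, D = 1e4, 1e-3
E = 1.0  # near-term sentience on Earth, normalized to 1 (illustrative assumption)
future_sizes = (1e20, 1e30, 1e40)

# Case 1: F fixed at 1e-6, so E/N << F and the size of the future drops out.
fixed_F = [longtermist_ratio(T, D, 1e-6, E, N) for N in future_sizes]

# Case 2: F shrinks in step with E/N (here F = E/N exactly, a hypothetical
# choice), so the ratio keeps growing with the size of the future.
scaled_F = [longtermist_ratio(T, D, E / N, E, N) for N in future_sizes]

print(fixed_F)
print(scaled_F)
```

In the first case the ratio stays near 10^7 whatever N is, which is the dampening effect. In the second it scales with N, which is the "L/S will still be huge" scenario unless D is correspondingly small.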
I’m not sure what the expression’s value represents. Are we supposed to multiply some further estimate we have of longtermist work by 10^7? (if so, what estimate is it that’s so low that 10^7 isn’t enough of a multiplier to make it still eclipse all short termist work?)
I think L/S is just supposed to be a heuristic for how much to prioritise longtermist actions relative to neartermist ones. Brian’s inputs lead to 10^7, but they were mainly illustrative:
This [L/S = 10^7] happens to be bigger than 1, which suggests that targeting the far future is still ~10 million times better than targeting the short term. But this calculation could have come out as less than 1 using other possible inputs. Combined with general model uncertainty, it seems premature to conclude that far-future-focused actions dominate short-term helping. It’s likely that the far future will still dominate after more thorough analysis, but by much less than a naive future fanatic would have thought.
However, it seems to me that, even if one thinks that both E/N and F are super small, L/S could still be smaller than 1 due to super small D. This relates to your point that:
my instinct is that in situations that reduce to a kind of Pascalian mugging, xP(x) where x is a payoff size and P(x) is the probability of that payoff, approaches 0 as x tends to infinity
I share your instinct. I think David Thorstad calls that rapid diminution.
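One way to picture the instinct is with a toy prior. The sketch below assumes a power-law probability P(x) = x^(-k), which is purely my own illustrative choice and not something taken from Thorstad or from the comments above. With k > 1 the product x*P(x) shrinks towards 0 as x grows; with k = 1 it stays flat.

```python
import math

def expected_payoff(x, exponent=2.0):
    """x * P(x) under the hypothetical prior P(x) = x**(-exponent), for x >= 1."""
    return x * x ** (-exponent)

payoffs = (1e1, 1e10, 1e30)

# With exponent 2, the expected payoff is 1/x and so falls as the claimed payoff grows.
diminishing = [expected_payoff(x) for x in payoffs]

# With exponent 1, probability falls only as fast as the payoff rises, so
# the expected payoff stays at roughly 1 and there is no diminution.
flat = [expected_payoff(x, exponent=1.0) for x in payoffs]

print(diminishing)
print(flat)
```

Whether real-world probabilities fall off this fast is exactly what is in dispute, so this only shows what the claim amounts to, not that it holds.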
If you feel like you understand it, maybe you could give me a concrete example of how to apply this reasoning?
I think Brian’s reasoning works more or less as follows. Neglecting the simulation argument, if I save one life, I am only saving one life. However, if F = 10^-16[1] of sentience-years are spent simulating situations like my own, and the future contains N = 10^30 sentience-years, then me saving a life will imply saving F*N = 10^14 copies of the person I saved. I do not think the argument goes through because I would expect F to be super small in this case, such that F*N is similar to 1.
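The numbers in this example can be checked with a few lines of Python. This is only a sketch of the arithmetic as stated in the comment and its footnote (F = 10^-6 divided by a population of 10^10); the variable names are hypothetical.

```python
import math

F_planet = 1e-6      # Brian's guess for F, as quoted in the thread
population = 1e10    # human population figure used in the footnote
N = 1e30             # assumed sentience-years in the future

# Share of all sentience-years spent simulating one person's situation.
F_person = F_planet / population

# Number of simulated copies affected when one life is saved.
copies = F_person * N

# Value of the per-person share at which F*N equals 1, i.e. no multiplier.
break_even_F = 1 / N

print(f"F per person = {F_person:.0e}")
print(f"copies = {copies:.0e}")
print(f"break-even F = {break_even_F:.0e}")
```

The gap between the per-person F of about 10^-16 and the break-even value of 10^-30 shows how much smaller F would need to be for the multiplier to vanish.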
This [L/S = 10^7] happens to be bigger than 1, which suggests that targeting the far future is still ~10 million times better than targeting the short term. But this calculation could have come out as less than 1 using other possible inputs. Combined with general model uncertainty, it seems premature to conclude that far-future-focused actions dominate short-term helping. It’s likely that the far future will still dominate after more thorough analysis, but by much less than a naive future fanatic would have thought.
This is more of a sidenote, but given all the empirical and model uncertainty in any far-future oriented work, it doesn’t seem like adding a highly speculative counterargument with its own radical uncertainties should meaningfully shift anyone’s priors. It seems like a strong longtermist could accept Brian’s views at face value and say ‘but the possibility of L/S being vastly bigger than 1 means we should just accept the Pascalian reasoning and plow ahead regardless’, while a sceptic could point to rapid diminution and say no simulationy weirdness is necessary to reject these views.
(Sidesidenote: I wonder whether anyone has investigated the maths of this in any detail? I can imagine there being some possible proof by contradiction of RD, along the lines of ‘if there were some minimum amount that it was rational for the muggee to accept, a dishonest mugger could learn that and raise the offer beyond it whereas an honest mugger might not be able to, and therefore, when the mugger’s epistemics are taken into account, you should not be willing to accept that amount’. Though I can also imagine this might just end up as an awkward integral that you have to choose your values for somewhat arbitrarily.)
I think Brian’s reasoning works more or less as follows. Neglecting the simulation argument, if I save one life, I am only saving one life. However, if F = 10^-16[1] of sentience-years are spent simulating situations like my own, and the future contains N = 10^30 sentience-years, then me saving a life will imply saving F*N = 10^14 copies of the person I saved. I do not think the argument goes through because I would expect F to be super small in this case, such that F*N is similar to 1.
For the record, this kind of thing is why I love Brian (aside from him being a wonderful human) - I disagree with him vigorously on almost every point of detail on reflection, but he always comes up with some weird take. I had either forgotten or never saw this version of the argument, and was imagining the version closer to Pablo’s that talks about the limited value of the far future rather than the increased near-term value.
That said, I still think I can basically C&P my objection. It’s maybe less that I think F is likely to be super small, and more that, given our inability to make any intelligible statements about our purported simulators’ nature or intentions it feels basically undefined (or, if you like, any statement whatsoever about its value is ultimately going to be predicated on arbitrary assumptions), making the equation just not parse (or not output any value that could guide our behaviour).
Interesting. But how soon is “soon”? And even if we are a simulation, to all intents and purposes it is real to us. It doesn’t seem like much of a consolation that the simulators might restart the simulation after we go extinct (any more than the Many Worlds interpretation of Quantum Mechanics gives solace over many universes still existing nearby in probability space in the multiverse).
I seem to remember a comment from Carl Shulman saying the risk of simulation shut-down should not be assumed to be less than 1 in 1 million per year (or maybe it was per century). This suggests there is still a long way before it happens. On the other hand, I would intuitively think the risk to be higher if the time we are in really is special. I do not remember whether the comment was taking that into account.
And even if we are a simulation, to all intents and purposes it is real to us. It doesn’t seem like much of a consolation that the simulators might restart the simulation after we go extinct (any more than the Many Worlds interpretation of Quantum Mechanics gives solace over many universes still existing nearby in probability space in the multiverse).
Yes, it is not a consolation. It is an argument for focussing more on interventions which have nearterm benefits, like corporate campaigns for chicken welfare, instead of ones whose benefits may not be realised due to simulation shut-down.
Yes, it is not a consolation. It is an argument for focussing more on interventions which have nearterm benefits, like corporate campaigns for chicken welfare, instead of ones whose benefits may not be realised due to simulation shut-down.
I still don’t think this goes through either. I’m saying we should care about our world going extinct just as much as if it were the only world (given we can’t causally influence the others).
Agreed, but if the lifespan of the only world is much shorter due to risk of simulation shut-down, the loss of value due to extinction is smaller. In any case, this argument should be weighted together with many others. I personally still direct 100 % of my donations to the Long-Term Future Fund, which is essentially funding AI safety work. Thanks for your work in this space!
Thanks for your donations to the LTFF. I think they need to start funding stuff aimed at slowing AI down (/pushing for a global moratorium on AGI development). There’s not enough time for AI Safety work to bear fruit otherwise.
Hi Spencer,
You may be interested in Brian Tomasik’s analysis on How the Simulation Argument Dampens Future Fanaticism. I think its essence is well captured in this short comment by Pablo Stafforini:
If I recall correctly, the argument goes more or less as follows. The larger the future, the greater the likelihood of us being in a short-lived simulation, thus having negligible influence in the far future. The 2 effects cancel out, and therefore the ratio between far future value and near future value does not depend on the size of the future. The ratio is roughly inversely proportional to the fraction of resources going towards simulations, i.e. 2 times as much resources going to simulations means the far future is half as valuable relative to the near term future.
The possibility of us living in a short-lived simulation isn’t enough to count much against longtermism, because it’s also possible we could live in a long-lived simulation or a long-lived world, and those possibilities will be much higher stakes, so still dominate expected value calculations unless we assign them tiny probability together.
I think the argument crucially depends on the assumption that simulations will be disproportionately short-lived, and we have acausal influence over agents in other simulations. If for each long-running world (simulated or otherwise) with moral agents and moral patients, there are N short-lived worlds with (moral) agents and moral patients, and our actions are correlated with those of agents across worlds, then we get to decide for more agents in short-lived worlds than long-lived ones. Basically, acausal influence will boost the expected value of all interventions, but if moral patients are disproportionately in short-lived simulations with agents whose decisions we’re correlated with relative to long-run simulations with agents whose decisions we’re correlated with (or more skewed towards the short-lived than it seems for our own world), acausal influence will disproportionately boost the expected value of neartermist interventions relative to longtermist ones.
Also, ~all of the expected value will be acausal if we fully count the value of acausal influence, based on the evidentialist’s wager and similar, given the possibility of very large or even infinite numbers of agents with whom we’re correlated.
Thanks for clarifying, Michael!
Yes, the argument depends on Brian’s parameter F not being super small. F is “fraction of all computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets (both the most intelligent and also less intelligent creatures on those planets)”. “A non-solipsish simulation is one in which most or all of the people and animals who seem to exist on Earth are actually being simulated to a non-trivial level of detail”. Brian guessed F = 10^-6, but it feels like it should be much smaller to me. If the value of the future is e.g. 10^30 times the value of this century, it is maybe reasonable to assume that the vast vast majority of computational sent-years are also simulations of the far future, as opposed to simulations of almost-space-colonizing ancestral planets.
I find this argument unconvincing. The vast majority of ‘simulations’ humans run are very unlike our actual history. The modal simulated entity to date is probably an NPC from World of Warcraft, a zergling from Starcraft or similar. This makes it incredibly speculative to imagine what our supposed simulators might be like, what resources they might have available and what their motivations might be.
Also the vast majority of ‘simulations’ focus on ‘exciting’ moments—pitched Team Fortress battles, epic RPG narratives, or at least active interaction with the simulators. If you and your workmates are just tapping away in your office on your keyboard doing theoretical existential risk research, the probability that someone like us has spent their precious resources to (re)create you seems radically lower than if you’re (say) fighting a pitched battle.
Thanks for commenting!
Agreed, but could you explain why that would be an objection to Brian’s argument?
I do not know, because I agree with your 1st paragraph about it being quite hard to predict future simulated entities based on past history.
I mainly had in mind Pablo’s summary. It’s been a long time since I read Brian’s essay, and I don’t have bandwidth to review it now, so if he says something substantially different there, my argument might not apply. But basically every argument I remember hearing about how the simulation argument implies we should modify our behaviour presupposes that we have some level of inferential knowledge of our simulators (this presupposition being hidden in the assumption that simulations would be primarily ancestor simulations). This presupposition seems basically false to me, because, for example:
a. A zergling would struggle to gain much inferential knowledge of its simulators’ motivations.
b. A zergling looking around at the scope and complexity of its universe would typically observe that it itself is 2-dimensional (albeit with some quasi-3D properties), and is made from approx 38x94 ‘atoms’. Perhaps more advanced simulations would both be more numerous (and hence a higher proportion of simulationspace) and more complex, but it still seems hard to imagine they’ll average to anything like the same level of complexity as we see in our universe, or have a consistent difference from it.
c. If the simulation argument is correct for a single layer of reality, it seems (to the degree permitted by a and b) far more likely that it’s correct for multiple, perhaps vast numbers of layers of reality (insert ‘spawn more Overlords’ joke here). Thus the people whose decisions and motivations a zergling is trying to ultimately guess at are not us, but someone whose distance from us is approx n(|human - zergling|), where n is the number of layers. It’s hard to imagine the zergling—or us—could make any intelligible assumptions at all about them at that level of removal.
To show this in Pablo’s argument:
For this to be ‘plausible’ is to assert that we know our simulators’ motivations well enough to know that whatever they hoped to gain by running us will ‘plausibly’ be motivating enough for them to do it a second time in much the same form, and that their simulators will at least permit it, and so on.
Another version of the anti-x-risk argument from simulation I’ve heard (and which I confess with hindsight I was conflating Pablo’s with—maybe it’s part of Brian’s argument?) is that the simulators will likely switch off our universe if it expands beyond a certain size due to resource constraints. Again, this argument implies IMO vastly too high confidence in both their motivation and resource limits.
Thanks for explaining that!
Brian concludes that L/S = T*D/F, where:
L is the cost-effectiveness of longtermist interventions.
S is the cost-effectiveness of neartermist interventions.
T “represent[s] how much more important it is to influence a unit of sentience by the average future digital agent than a present-day biological one for these reasons [“future, simulated human might have much higher intensity of experience per unit time, and we may have much greater control over the quality of his experience”]”.
D is “a discount representing how much harder it is to actually end up helping a being in the far future than in the near term, due to both uncertainty and the muted effects of our actions now on what happens later on”.
F is “the fraction of all computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets (both the most intelligent and also less intelligent creatures on those planets)”. “A non-solipsish simulation is one in which most or all of the people and animals who seem to exist on Earth are actually being simulated to a non-trivial level of detail”.
Brian guesses T = 10^4, D = 10^-3, and F = 10^-6, thus concluding L/S = 10^7. I guess you are saying with your comment just above that F should be much lower than 10^-6? For reference, here is Brian’s motivation for F = 10^-6:
The informality of that equation makes it hard for me to know how to reason about it. For example:
T, D and F seem heavily interdependent.
I’m just not sure how to parse ‘computational sent-years spent non-solipsishly simulating almost-space-colonizing ancestral planets’. What does it mean for a year of sentient life to be spent simulating something? Do you think he means what fraction of experienced years exist in ancestor simulations? I’m still confused by this after reading the last paragraph.
I’m not sure what the expression’s value represents. Are we supposed to multiply some further estimate we have of longtermist work by 10^7? (if so, what estimate is it that’s so low that 10^7 isn’t enough of a multiplier to make it still eclipse all short termist work?)
If you feel like you understand it, maybe you could give me a concrete example of how to apply this reasoning?
For what it’s worth, I have much more prosaic reasons for doubting the value of explicitly longtermist work both in practice (the stuff I’ve discussed with you before that makes me feel like it’s misprioritised) and in principle (my instinct is that in situations that reduce to a kind of Pascalian mugging, xP(x) where x is a counterfactual payoff increase and P(x) is the probability of that payoff increase, approaches 0 as x tends to infinity).
I agree.
I think F = “sent-years respecting the simulations of the beings in almost-space-colonizing ancestral planets”/”all sent-years of the universe”. Brian defines sent-years as follows:
I said Brian concluded that L/S = T*D/F, but this was after simplifying L/S = T*D/(E/N + F), where:
E is “the amount of sentience on Earth in the near term (say, the next century or two)”.
“On average, these civilizations [“that are about to colonize space”] will run computations whose sentience is equivalent to that of N human-years”.
Then Brian says:
The simulation argument dampening future fanaticism comes from Brian assuming that E/N << F, in which case L/S = T*D/F, and therefore prioritising the future no longer depends on its size. However, for the reasons you mentioned (we are not simulating our ancestors much), I feel like we should a priori expect E/N and F to be similar, and correlated, in which case L/S will still be huge unless it is countered by a very small D (i.e. if the typical low tractability argument against longtermism goes through).
I think L/S is just supposed to be a heuristic for how much to prioritise longtermist actions relative to neartermist ones. Brian’s inputs lead to 10^7, but they were mainly illustrative:
However, it seems to me that, even if one thinks that both E/N and F are super small, L/S could still be smaller than 1 due to super small D. This relates to your point that:
I share your instinct. I think David Thorstad calls that rapid diminution.
I think Brian’s reasoning works more or less as follows. Neglecting the simulation argument, if I save one life, I am only saving one life. However, if a fraction F = 10^-16[1] of sentience-years is spent simulating situations like my own, and the future contains N = 10^30 sentience-years, then my saving a life implies saving F*N = 10^14 copies of the person I saved. I do not think the argument goes through, because I would expect F to be much smaller in this case, such that F*N is similar to 1.
[1] Brian’s F = 10^-6 divided by the human population of 10^10.
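The arithmetic in this example can be checked with a short Python sketch. It uses exact fractions to avoid floating-point rounding; the variable names are mine, and the break-even value of F is simply 1/N, not a figure from Brian’s essay.

```python
from fractions import Fraction


def copies_saved(F, N):
    """Number of copies affected per life saved: F * N."""
    return F * N


# Figures as stated in the comment and footnote above.
F_BRIAN = Fraction(1, 10**6) / 10**10  # 10^-6 divided by a population of 10^10
N_FUTURE = 10**30                      # sentience-years in the future

# F at which F*N equals 1, i.e. saving a life saves roughly one life.
F_BREAK_EVEN = Fraction(1, N_FUTURE)

if __name__ == "__main__":
    print(F_BRIAN == Fraction(1, 10**16))        # True
    print(copies_saved(F_BRIAN, N_FUTURE))       # 100000000000000
    print(copies_saved(F_BREAK_EVEN, N_FUTURE))  # 1
```

So for F*N to be close to 1 with N = 10^30, F would need to be around 10^-30, fourteen orders of magnitude below the 10^-16 used in the example.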
Appreciate the patient breakdown :)
This is more of a sidenote, but given all the empirical and model uncertainty in any far-future oriented work, it doesn’t seem like adding a highly speculative counterargument with its own radical uncertainties should meaningfully shift anyone’s priors. It seems like a strong longtermist could accept Brian’s views at face value and say ‘but the possibility of L/S being vastly bigger than 1 means we should just accept the Pascalian reasoning and plow ahead regardless’, while a sceptic could point to rapid diminution and say no simulationy weirdness is necessary to reject these views.
(Sidesidenote: I wonder whether anyone has investigated the maths of this in any detail? I can imagine there being some possible proof by contradiction of rapid diminution, along the lines of ‘if there were some minimum amount that it was rational for the muggee to accept, a dishonest mugger could learn that and raise the offer beyond it whereas an honest mugger might not be able to, and therefore, when the mugger’s epistemics are taken into account, you should not be willing to accept that amount’. Though I can also imagine this might just end up as an awkward integral that you have to choose your values for somewhat arbitrarily.)
For the record, this kind of thing is why I love Brian (aside from him being a wonderful human) - I disagree with him vigorously on almost every point of detail on reflection, but he always comes up with some weird take. I had either forgotten or never seen this version of the argument, and was imagining the version closer to Pablo’s that talks about the limited value of the far future rather than the increased near-term value.
That said, I still think I can basically C&P my objection. It’s maybe less that I think F is likely to be super small, and more that, given our inability to make any intelligible statements about our purported simulators’ nature or intentions, it feels basically undefined (or, if you like, any statement whatsoever about its value is ultimately going to be predicated on arbitrary assumptions), making the equation just not parse (or not output any value that could guide our behaviour).
Interesting. But how soon is “soon”? And even if we are a simulation, to all intents and purposes it is real to us. It doesn’t seem like much of a consolation that the simulators might restart the simulation after we go extinct (any more than the Many Worlds interpretation of Quantum Mechanics gives solace over many universes still existing nearby in probability space in the multiverse).
Maybe the simulators will stage an intervention over us reaching the Singularity. I don’t think we can rely on this though (indeed, this is part of the exotic scenarios that make up the ~10% chance that I think we aren’t doomed from AGI by default).
Thanks for engaging, Greg!
I seem to remember a comment from Carl Shulman saying the risk of simulation shut-down should not be assumed to be less than 1 in 1 million per year (or maybe it was per century). This suggests there is still a long time before it happens. On the other hand, I would intuitively think the risk to be higher if the time we are in really is special. I do not remember whether the comment was taking that into account.
Yes, it is not a consolation. It is an argument for focussing more on interventions which have nearterm benefits, like corporate campaigns for chicken welfare, instead of ones whose benefits may not be realised due to simulation shut-down.
I still don’t think this goes through either. I’m saying we should care about our world going extinct just as much as if it were the only world (given we can’t causally influence the others).
Agreed, but if the lifespan of the only world is much shorter due to risk of simulation shut-down, the loss of value due to extinction is smaller. In any case, this argument should be weighted together with many others. I personally still direct 100% of my donations to the Long-Term Future Fund, which is essentially funding AI safety work. Thanks for your work in this space!
Thanks for your donations to the LTFF. I think they need to start funding stuff aimed at slowing AI down (/pushing for a global moratorium on AGI development). There’s not enough time for AI Safety work to bear fruit otherwise.
Thank you Vasco! This seems hard to model, but worthwhile. I’ll think on it.