Amazing first post, really good. Great to see someone mapping out the various implications of long-term risk for the value of existential risk reduction. Some fairly powerful implications.
Some immediate musings:
I wonder what the model would say if you said there was a small chance that the time of perils hypothesis was true (given that it might be hard to work out whether it is true or not).
I am not sure if this makes sense, but I would be curious about a model where r_l starts at the same level as r but falls over time, tending towards 0 (as a Kuznets curve or similar would suggest). How quickly would it have to fall (linearly, exponentially, etc.) to get astronomical value?
I think space might be more tractable than you make out. I think spreading beyond the solar system might be doable in 5-10 centuries. That said, new technology could also keep anthropogenic risks high.
Thanks! It’s good to hear from you. These are really good points. Let me take them in turn. Apologies, it might take a bit to talk through them all.
First point: A small chance of the time of perils: It’s definitely right that if you plug in a moderately small chance (say, one in a million) of a strong version of the Time of Perils Hypothesis into these models, then on many assumptions you will get astronomical value for existential risk reduction.
I think here I’d want to emphasize four things.
(A) Incrementalist resistance to longtermism—“Shedding zeroes”: Lots of pushback against longtermism tries to wreck everything in one shot. These arguments find a single consideration (risk aversion; population ethics; non-consequentialist duties; …) and argue that this single consideration, on its own, is enough to decisively refute longtermism.
The problem with this way of arguing is that often the single consideration has to be very strong—some might say implausibly strong. Suppose I say “you might think the best longtermist intervention is 10^15 times better than the best short-termist intervention. But once you understand the decision-theoretic importance of risk aversion, you should realize we shouldn’t be willing to take risky bets at impacting the long-term future”. Richard Pettigrew makes an excellent argument that goes something like that. (https://forum.effectivealtruism.org/posts/xAoZotkzcY5mvmXFY/longtermism-risk-and-extinction). We liked that argument enough at GPI that we made it a working paper.
But one worry here is that once you realize how much power we’re attributing to risk aversion (enough to outweigh 15 orders of magnitude in expected value!), you might think we’re giving risk aversion a bit too much power in our decision theory. Is the right theory of risk aversion really that powerful?
I want to take a much more “incrementalist” approach to pushing back against longtermism. That approach strings together a number of considerations that might, or might not, on their own, be enough to knock down longtermism, but which together might well be enough to push us away from longtermism.
So suppose you say to me “okay, David, I’m going to put a 10^-5 credence on the Time of Perils”. Great. Now we’ve knocked five orders of magnitude off longtermism, in the sense that you now think the best longtermist interventions are 10^10 times better than the best short-termist interventions.
But now I ask: “Are you sure you’ve got the right normative theory? Other normative theories wouldn’t give quite such a large value to longtermist interventions”. And you say: fine, have another zero.
And then I ask: “But what if the future you’re preserving is a bad one, not a good one?” And you say: fine, have a zero.
If we keep going in this way, longtermism might come out to be true. But it might also turn out to be false. If it turns out that the upshot of my post is that the longtermist has to shed, say, 3-10 orders of magnitude because of uncertainty about the Time of Perils Hypothesis, that takes a lot of ballast out of the longtermist position, in the sense that longtermists no longer have so many zeroes left to shed.
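To make the zero-counting concrete, here’s a toy tally (the specific discounts are placeholders for illustration, not estimates I’d defend):

```python
# Toy "shedding zeroes" tally. Start from a claimed advantage of the best
# longtermist intervention over the best short-termist one, then knock off
# orders of magnitude for each worry. All numbers are purely illustrative.
claimed_advantage_oom = 15  # "10^15 times better"

discounts_oom = {
    "credence of 10^-5 in the Time of Perils Hypothesis": 5,
    "uncertainty about the right normative theory": 1,
    "the future might be bad rather than good": 1,
}

remaining_oom = claimed_advantage_oom - sum(discounts_oom.values())
print(f"Remaining advantage: 10^{remaining_oom}")  # 10^8 here; each further worry sheds more zeroes
```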
(B) Form matters: It matters which kind of Time of Perils Hypothesis we put a small but nontrivial probability on.
So for example, suppose you put probability 10^-5 on the claim “within 200 years, existential risk will drop by 4 orders of magnitude, and it will never, NEVER, ever in all of human history rise above that level”. That’s great news for the longtermist.
But what if you put most of your credence in claims like “within 200 years, existential risk will drop by 4 orders of magnitude. It’ll be low for a while, but every once in a while it will flare up to become about as bad as it is today (if not worse)”? Then you’ve really substantially reduced the expected value of the future, and likewise the expected value of existential risk reduction.
The reason this is important is that if we’re not going to rely on stories about superintelligent AI, it gets harder to motivate Time of Perils views of the first form rather than the second. It’s not as though the destructive technologies threatening humanity today are the last destructive technologies we will ever invent, or as though the irresponsible leaders in charge of those technologies are the last risk-insensitive leaders we’ll ever have.
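If it helps to see the difference in numbers, here’s a minimal toy calculation (illustrative figures of my own, not the model from the post):

```python
# Toy comparison of two forms of the Time of Perils Hypothesis (illustrative
# numbers only). We compute the expected number of future centuries, i.e. the
# sum over t of the probability of surviving through century t.

def expected_centuries(risk_in_century, horizon=100_000):
    survival, total = 1.0, 0.0
    for t in range(horizon):
        survival *= 1.0 - risk_in_century(t)
        total += survival
    return total

R_HIGH, R_LOW = 1e-2, 1e-6  # per-century existential risk: today vs. post-perils

def form_1(t):
    """Within 2 centuries, risk drops by 4 orders of magnitude and stays low forever."""
    return R_HIGH if t < 2 else R_LOW

def form_2(t):
    """Same drop, but risk flares back to today's level for one century in every ten."""
    return R_HIGH if (t < 2 or t % 10 == 0) else R_LOW

print(f"Permanent drop:   ~{expected_centuries(form_1):,.0f} expected future centuries")
print(f"Recurring perils: ~{expected_centuries(form_2):,.0f} expected future centuries")
```

On these toy numbers, the occasional flare-ups cut the expected length of the future by roughly two orders of magnitude relative to the permanent drop.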
(C) Positive arguments: Just because the longtermist would be okay with a (relatively) small probability on the Time of Perils Hypothesis doesn’t get her out of providing an argument for the Time of Perils Hypothesis.
One way of reading this post is as suggesting that some of the arguments commonly made in support of the Time of Perils Hypothesis aren’t enough to ground the levels of probability we’d need to put on Time of Perils Hypotheses of the needed form.
Why wouldn’t they be enough? Here’s one reason.
(D) Priors matter: The Time of Perils Hypothesis makes quite a strong claim. It says that some quantity (existential risk) has been at moderate levels for a while; has recently risen and is currently growing; but will soon drop to an almost unnoticeable level; and will stay at that level literally forever. This isn’t the most common trajectory for quantities found in nature to follow. So if I didn’t tell you anything about why you would expect existential risk to drop, and stay low forever, despite its recent growth, you probably wouldn’t put very much probability on the Time of Perils Hypothesis. And by “not very much” I mean, not even 1 in a million, or maybe not even 1 in a billion. Although those can seem like tiny numbers, there are plenty of claims I can make right now in which you’d invest lower (prior) credence.
This means that if we’re going to get a high enough credence in the Time of Perils Hypothesis, that credence can’t be coming from our priors. It needs to come from an argument in favor of the Time of Perils Hypothesis. That’s not to say, from the armchair, that such an argument can’t be made. It’s only to say that we really need to focus on making arguments, and not (just) describing the probability targets that our arguments need to get us to.
Second point: Speed of the “Kuznets” phenomenon. I’m also curious about this! Any takers on taking a stab at the modeling here? I’m especially interested in models that go materially beyond Leopold’s, because I think he did a really good job at giving us a handle on that model.
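To give anyone who wants to take this up a concrete starting point, here’s the kind of comparison the question suggests (my own illustrative sketch over a finite horizon, not Leopold’s model and not the model in the post):

```python
# Toy sketch: how fast does per-century risk have to decay toward zero before
# the expected number of future centuries becomes astronomical? Illustrative
# only; not the model from the post, and not Leopold's model.

def expected_centuries(risk_in_century, horizon=100_000):
    survival, total = 1.0, 0.0
    for t in range(horizon):
        survival *= 1.0 - risk_in_century(t)
        total += survival
    return total

R0 = 1e-2  # per-century risk today (illustrative)

decay_schedules = {
    "constant (no decay)":                  lambda t: R0,
    "exponential, half-life 500 centuries": lambda t: R0 * 0.5 ** (t / 500),
    "exponential, half-life 50 centuries":  lambda t: R0 * 0.5 ** (t / 50),
    "exponential, half-life 5 centuries":   lambda t: R0 * 0.5 ** (t / 5),
}

for name, schedule in decay_schedules.items():
    print(f"{name:>38}: ~{expected_centuries(schedule):,.0f} expected future centuries")
```

On these made-up numbers the decay has to be quite fast: a half-life of a few centuries recovers most of the horizon’s value, a half-life of 50 centuries roughly half, and a half-life of 500 centuries almost none of it.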
If anyone’s comments here make a material change to the contents of the paper, I hereby promise an acknowledgment in the published paper once I manage to convince a journal to publish this. I can also pay in beer if that’s not against forum rules.
Third point: Tractability of space settlement: That’s a really interesting point to push on. I’d be curious to hear more about your views here! I’m also curious how you think that space settlement affects existential risk.
If you want to take a dive into the academic literature here, the working paper cites nearly every academic source that I could find on this issue. There’s … surprisingly little, and nobody can agree about anything. Probably there’s room for interesting new work here.
I think it’s pretty miscalibrated to assign a 10^-5 (or 1 in 100,000) chance that we’re in the Time of Perils.
Would you be interested in making a $2000:$1 bet that you will change your mind in the next 10 years and think that the chance we’re in the Time of Perils is >50%? (I’m also happy to bet larger numbers at that ratio).
I think this is a pretty good deal for you, if I did the math correctly:
- your fair rate is >=50,000:1 on the proposition being false, so I’m offering you a 25x discount (arithmetic sketched below).
- the proposition could be correct, but you might not update enough to change your mind all the way up to 50% in the next 10 years.
- you get to resolve this bet according to your own beliefs, so there’s some leeway.
- I might forget about this bet and probably won’t chase you for the money.
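For concreteness, here’s the arithmetic behind that 50,000:1 figure (a rough sketch, treating your credence as a martingale, so that your expected credence in 10 years equals your current 10^-5):

$$P(\text{credence in 10 years} \ge 0.5) \le \frac{\mathbb{E}[\text{credence in 10 years}]}{0.5} = \frac{10^{-5}}{0.5} = 2\times 10^{-5} = \frac{1}{50{,}000}.$$

So at your stated credence you should be willing to lay up to 50,000:1 against changing your mind that far, and I’m only asking you to lay $2000:$1, which is the 25x discount.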
- How can you judge calibration vs. miscalibration on a question like this?
- David changing his mind doesn’t seem like a good proxy, because in this context a change of mind might be better explained by cultural factors than by his prior being miscalibrated.
Sure, the more idealized bet would be to commit whatever the equivalent of “my estate” will be in 2223 to giving him $1, and for David’s estate to give my estate back $2000 in inflation-adjusted dollars 1,000,200 years from now.
But this seems hard to pull off logistically. 10 years is already a long time.
David changing his mind doesn’t seem like a good proxy, because in this context a change of mind might be better explained by cultural factors
I don’t know, man, it sure feels like at some level “the progenitor of a theory disavows it after some deliberation” should be one of the stronger pieces of evidence we have that a theory is false, in worlds where empirical evidence is very hard to get quickly.
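I like the category of bets that says: “You believe X; I predict you will change your mind, let’s bet on it”. I think this does promote good epistemics.

I think that Holden Karnofsky’s version of the “Time of Perils” is based on transformative AI development, which I agree with.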
It’s good to hear from you. That’s exactly right—many EAs want to argue for the Time of Perils Hypothesis based on developments in AI.
My focus in this post is on versions of the Time of Perils Hypothesis that don’t rely on AI. If I’m right that these versions of the Time of Perils Hypothesis get into some trouble, then that might put more weight behind my fourth conclusion: it’s important to make sure we’re right about AI and its relationship to the Time of Perils.
I’m a bit hesitant to say many substantive things about AI in this post, because I think they’d probably take us a fair bit beyond the model. One nice way to take the post would simply be as emphasizing the importance of AI.
I do have some work in progress on the Singularity Hypothesis if that’s helpful. Shoot me an email if interested.
I didn’t think he argued for a time of perils. I’ve not read all his stuff, but I can’t think of anything I’ve read by Holden that suggests he thinks future risk will go down very significantly after AGI. Do you have a link at all?
Also, what I have seen seems more to be saying that this century is risky (pessimism), which goes against the astronomical value thesis.
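Yeah, I think I got this one wrong.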
The term “most important century” pretty directly suggests that this century is unique, and I assume that includes its unusually large amount of x-risk (given that Holden seems to think that the development of TAI is both the biggest source of x-risk this century and the reason why this might be the most important century).
Holden also talks specifically about lock-in, which is one way the time of perils could end.
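See e.g. here: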
It’s possible, for reasons outlined here, that whatever the main force in world events is (perhaps digital people, misaligned AI, or something else) will create highly stable civilizations with “locked in” values, which populate our entire galaxy for billions of years to come.
If enough of that “locking in” happens this century, that could make it the most important century of all time for all intelligent life in our galaxy.
I want to roughly say that if something like PASTA is developed this century, it has at least a 25% chance of being the “most important century” in the above sense.
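Thanks great – will have a read :-)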
It’s definitely right that if you plug in a moderately small chance (say, one in a million) of a strong version of the Time of Perils Hypothesis into these models, then on many assumptions you will get astronomical value for existential risk reduction.
What do you mean here by a strong version of the Time of Perils Hypothesis?
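Depends on the model! In this model I mean: very small N and r_l.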