I don’t see this. First, David’s claim is that a short time of perils with low risk thereafter seems unlikely—which is only a fraction of hypothesis 4, so I can easily see how you could get H3+H4_bad:H4_good >> 10:1
I don’t even see why it’s so implausible that H3 is strongly preferred to H4. There are many hypotheses we could make about time varying risk:
- Monotonic trend (many varieties)
- Oscillation (many varieties)
- Random walk (many varieties)
- …
If we aren’t trying to carefully consider technological change (and ignoring AI seems to force us not to do this carefully) then it’s not at all clear how to weigh all the different options. Many possible weightings do support hypothesis 3 over hypothesis 4:
- If we expect regular oscillation or time-symmetric random walks, then I think we usually get H3 (integrated oscillation = high risk; the lack of risk in the past suggests that the period of oscillation is long; see the numerical sketch after this list)
- If we expect rare, sudden changes then we get H3
- Monotonic trend obviously favours H3
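A minimal numerical sketch of the “integrated oscillation = high risk” point. The hazard shapes and numbers below are illustrative assumptions, not anything from the thread:

```python
import numpy as np

def cumulative_risk(per_century_risk):
    """Probability of at least one existential catastrophe across the whole series."""
    return 1 - np.prod(1 - np.asarray(per_century_risk))

centuries = np.arange(1000)  # 100,000 years in century-long steps

# Illustrative hazard shapes (all numbers are made up, purely for intuition):
transient   = np.where(centuries < 2, 0.1, 0.0001)                       # short spike, then low (H4-like)
oscillating = 0.05 * (1 + np.sin(2 * np.pi * centuries / 200)) + 0.0001  # slow regular oscillation
rising      = np.minimum(0.001 * 1.01 ** centuries, 0.2)                 # monotonic trend, capped

for name, hazard in [("transient spike", transient),
                     ("regular oscillation", oscillating),
                     ("monotonic trend", rising)]:
    print(f"{name:20s} cumulative risk = {cumulative_risk(hazard):.3f}")
```

Only the transient-spike shape leaves much survival probability; anything that keeps even a modest average hazard going integrates to near-certain catastrophe over enough centuries, which is the sense in which these weightings favour H3.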
If I imagine going through this exercise, I wouldn’t be that surprised to see H3 strongly favoured over H4 - but I don’t really see it as a very valuable exercise. The risk under consideration is technologically driven, so not considering technology very carefully seems to be a mistake.
Fair overall. I talked to some other people, and I think I missed the oscillation model when writing my original comment, which in retrospect is a pretty large mistake. I still don’t think you can buy that many 9s on priors alone, but sure, if I think about it more maybe you can buy 1-3 9s. :/
First, David’s claim is that a short time of perils with low risk thereafter seems unlikely.
Suppose you were put to cryogenic sleep. You wake up in the 41st century. Before learning anything about this new world, is your prior really[1] that the 41st-century world is as perilous as (or more perilous than) the 21st century?
[1] tbc, this thought experiment only works for people who think current x-risk is high.
If we aren’t trying to carefully consider technological change (and ignoring AI seems to force us not to do this carefully) then it’s not at all clear how to weigh all the different options. Many possible weightings do support hypothesis 3 over hypothesis 4
[...]
Yeah I need to think about this more. I’m less used to thinking about mathematical functions and more used to thinking about plausible reference classes. Anyway, when I think about (rare but existent) other survival risks that jump from 1 in 10,000 per time period to 1 in 5, I get the following observations:
Each of the following seems plausible:
- risk goes up and stays up for a while before going back down to background levels
  - Concrete example: some infectious diseases
- risk goes down to baseline almost immediately
  - Concrete example: somebody shoots me. (If they miss and I get away, or some time after I recover, I assume my own mortality risk is back to baseline.)
- risk (quickly) goes up monotonically until either you die or it goes back down again
  - Concrete example: Russian roulette
- risk starts going down but never quite goes back to baseline
  - Concrete example: “curable” cancer, infectious disease with long sequelae
On the other hand, the following seem rarer or approximately never happen (at least among the limited set of things I’ve considered):
- risk constantly stays as high as 1:4 for many timesteps
  - Like, this just feels like a pretty absurd risk profile (see the quick arithmetic after this list).
  - I don’t think even death row prisoners, kamikaze pilots, etc., have this profile, for context.
  - Which makes me a bit confused about long “time of perils” or continuously elevated constant-risk models.
- risk goes back to much lower than background levels
  - Convoluted stories are possible, but it’s hard to buy even one OOM, I think.
  - Which makes me (somewhat) appreciate that arguments for existentially stable utopia (ESU) are disadvantaged in the prior.
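A quick back-of-the-envelope check on why sustained 1:4 risk per timestep is such a strange profile to observe anyone surviving under (the timestep counts are arbitrary):

```python
# Probability of surviving N consecutive timesteps at a constant 1-in-4 risk per step.
for n in (5, 10, 20, 40):
    print(f"survive {n:2d} steps at 25% risk/step: {0.75 ** n:.3%}")
# ~23.7%, ~5.6%, ~0.3%, ~0.001%: long-lived observers of such a profile are very rare.
```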
Now maybe I just lack creativity, and I’m definitely not thinking about things exhaustively.
And of course I’m cheating some by picking a reference class I actually have an intellectual handle on (humans have mostly figured out how to estimate amortized individual risks; life insurance exists). Trying to figure out x-risk 100,000 years from now is maybe less like modeling my prognosis after I get a rare disease and more like modeling my 10-year survival odds after being kidnapped by aliens. But in a way this is sort of my point: the whole situation is just really confusing, so your priors should have a bunch of reference classes and some plausible ignorance priors, not the type of thing you have 10^-5 to 10^-9 odds against before you see the evidence.
I’m writing quickly because I think this is a tricky issue and I’m trying not to spend too long on it. If I don’t make sense, I might have misspoken or made a reasoning error.
One way I thought about the problem (quite different to yours, very rough): variation in existential risk rate depends mostly on technology. At a wide enough interval (say, 100 years of tech development at current rates), change in existential risk with change in technology is hard to predict, though following Aschenbrenner and Xu’s observations it’s plausible that it tends to some equilibrium in the long run. You could perhaps model a mixture of a purely random walk and walks directed towards uncertain equilibria.
Also, technological growth probably has an upper limit somewhere, though quite unclear where, so even the purely random walk probably settles down eventually.
There’s uncertainty over a) how long it takes to “eventually” settle down, b) how much “randomness” there is as we approach an equilibrium, and c) how quickly equilibrium is approached, if it is approached.
I don’t know what you get if you try to parametrise that and integrate it all out, but I would also be surprised if it put an overwhelmingly low credence in a short, sharp time of troubles.
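To make the “parametrise that and integrate it all out” step concrete, here is a rough Monte Carlo sketch of the kind of mixture described above. Every number in it (the mixture weight, step size, reversion rate, equilibrium prior, and starting risk) is an assumption made up for illustration, not a claim about the right values:

```python
import numpy as np

rng = np.random.default_rng(0)
N_WORLDS, N_CENTURIES = 100_000, 100
start_logit = np.log(0.1 / 0.9)   # assume ~10% existential risk in the current century

def survival_fraction(mean_reverting: bool) -> float:
    """Random walk in logit(per-century risk); optionally pulled toward an uncertain equilibrium."""
    x = np.full(N_WORLDS, start_logit)
    equilibrium = rng.normal(loc=np.log(1e-4), scale=3.0, size=N_WORLDS)  # assumed prior over long-run levels
    alive = np.ones(N_WORLDS, dtype=bool)
    for _ in range(N_CENTURIES):
        risk = 1.0 / (1.0 + np.exp(-x))
        alive &= rng.random(N_WORLDS) > risk
        pull = 0.1 * (equilibrium - x) if mean_reverting else 0.0
        x = x + pull + rng.normal(scale=0.5, size=N_WORLDS)
    return alive.mean()

# Assumed 50/50 mixture of a pure random walk and a walk toward uncertain equilibria.
p_survive = 0.5 * survival_fraction(False) + 0.5 * survival_fraction(True)
print(f"P(surviving {N_CENTURIES} centuries) = {p_survive:.3f}")
```

The particular number it prints doesn’t matter; the point is that reasonable-looking parametrisations of this family don’t obviously put overwhelmingly low credence on worlds where risk settles down to a low level.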
I think “one-off displacement from equilibrium” probably isn’t a great analogy for tech-driven existential risk.
I think “high and sustained risk” seems weird partly because surviving for a long period under such conditions is weird, so conditioning on survival usually suggests that risk isn’t so high after all—so in many cases risk really does go down for survivors. But this effect only applies to survivors, and the other possibility is that we underestimated risk and we die. So I’m not sure that this effect changes conclusions. I’m also not sure how this affects your evaluation of your impact on risk—probably makes it smaller?
I think this observation might apply to your thought experiment, which conditions on survival.
(As an aside, it might not make a difference mathematically, but numerically one possible difference between us is that I think of the underlying unit as ~logarithmic rather than linear)
Also, technological growth probably has an upper limit somewhere, though quite unclear where, so even the purely random walk probably settles down eventually.
Agreed. An important part of my model is something like nontrivial credence in a) the technological completion conjecture and b) there not being “that many” technologies lying around to be discovered. So when I zoom in and think about technological risks, a lot of my (proposed) model is about a) the underlying distribution of scary vs world-saving technologies, b) whether/how much the world is prepared for each scary technology as it appears, and c) how sharply the lethality of each new scary technology drops off, conditional on survival in the previous timestep.
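A toy rendering of that proposed model, just to pin down the moving parts a), b), c). The parameter values and functional forms are all assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
N_WORLDS, N_TECHS = 50_000, 30       # b) assume a finite stock of technologies left to discover
P_SCARY = 0.3                        # a) assumed share of scary (vs world-saving) technologies
BASE_LETHALITY, DROPOFF = 0.2, 0.5   # c) lethality of a scary tech, and how sharply it falls for survivors

alive = np.ones(N_WORLDS, dtype=bool)
lethality = np.full(N_WORLDS, BASE_LETHALITY)
for _ in range(N_TECHS):
    scary = rng.random(N_WORLDS) < P_SCARY
    killed = scary & (rng.random(N_WORLDS) < lethality)
    alive &= ~killed
    # c) surviving a scary technology is treated as evidence of preparedness,
    #    so the lethality of the next one drops for those worlds.
    lethality = np.where(scary & alive, lethality * DROPOFF, lethality)

print(f"P(surviving all {N_TECHS} technologies) = {alive.mean():.2f}")
```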
I think “high and sustained risk” seems weird partly because surviving for a long period under such conditions is weird, so conditioning on survival usually suggests that risk isn’t so high after all—so risk really does go down for survivors. But this effect only applies to the fraction who survive, so I’m not sure that it changes conclusions
I think I probably didn’t make the point well enough, but roughly speaking: you only care about worlds where you survive, so my guess is that you’ll systematically overestimate long-term risk if your mixture model doesn’t treat survival at each time step as evidence that survival is more likely at future time steps. But you do have to be careful here.
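A minimal sketch of that update with a made-up two-hypothesis mixture; the 20% / 0.1% hazards and the 50/50 prior are assumptions for illustration only:

```python
# Mixture over the per-step hazard rate, updated on observed survival at each step.
prior = {0.20: 0.5, 0.001: 0.5}   # assumed hazards and prior weights
posterior = dict(prior)
for step in range(10):
    # P(survive this step | hazard h) = 1 - h; reweight and renormalise.
    unnorm = {h: w * (1 - h) for h, w in posterior.items()}
    total = sum(unnorm.values())
    posterior = {h: w / total for h, w in unnorm.items()}

print(f"expected hazard under the prior:    {sum(h * w for h, w in prior.items()):.3f}")
print(f"expected hazard after 10 survivals: {sum(h * w for h, w in posterior.items()):.3f}")
```

A mixture that keeps using the prior’s ~10% per step for the far future overstates the risk that surviving worlds actually face, which is the overestimate being pointed at; though, as noted above, the worlds that don’t survive are exactly the ones where the high-hazard hypothesis was true.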
I’m also not sure how this affects your evaluation of your impact on risk—probably makes it smaller?
Yeah, I think this is true. A friend brought up this point: roughly, the important parts of your risk reduction come from temporarily vulnerable worlds. But if you’re not careful, you might “borrow” your risk reduction from permanently vulnerable worlds (giving yourself credit for many microextinctions averted), and also “borrow” your EV_of_future from permanently invulnerable worlds (giving yourself credit for a share of an overwhelmingly large future). To the extent those are different and anti-correlated worlds (which accords with David’s original point, just a bit more nuanced), your actual EV can be a noticeably smaller slice.
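A toy version of that accounting worry, with made-up world types, probabilities, and values; none of these numbers come from the thread:

```python
# name, probability, does my intervention reduce extinction risk by 1 point here?, value of surviving future
worlds = [
    ("temporarily vulnerable",   0.2, True,  1e3),  # risk reduction matters AND the future is long
    ("permanently vulnerable",   0.4, True,  1e0),  # risk reduction "matters" but the future is short anyway
    ("permanently invulnerable", 0.4, False, 1e6),  # huge future, but my risk reduction is moot
]

correct_ev = sum(p * (0.01 if helps else 0.0) * value for _, p, helps, value in worlds)

# The "borrowed" version: average risk reduction times average future value.
avg_reduction = sum(p * (0.01 if helps else 0.0) for _, p, helps, _ in worlds)
avg_value = sum(p * value for _, p, _, value in worlds)
naive_ev = avg_reduction * avg_value

print(f"naive EV:   {naive_ev:9.1f}")   # credit borrowed from anti-correlated worlds
print(f"correct EV: {correct_ev:9.1f}")
```

Because “my risk reduction matters” and “the future is overwhelmingly valuable” hold in mostly different (anti-correlated) worlds here, the naive product overstates the true EV by roughly three orders of magnitude in this toy setup.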
- If we expect regular oscillation or time symmetric random walks, then I think we usually get H3 (integrated oscillation = high risk; the lack of risk in the past suggests that period of oscillation is long)
We can still get H4 if the amplitude of the oscillation or random walk decreases over time, right?
- If we expect rare, sudden changes then we get H3
Only if the sudden change has a sufficiently large magnitude, right?
We can still get H4 if the amplitude of the oscillation or random walk decreases over time, right?
The average needs to fall, not the amplitude. If we’re looking at risk in percentage points (rather than, say, logits, which might be a better parametrisation), small average implies small amplitude, but small amplitude does not imply small average.
Only if the sudden change has a sufficiently large magnitude, right?
The large magnitude is an observation—we have seen risk go from quite low to quite high over a short period of time. If we expect such large magnitude changes to be rare, then we might expect the present conditions to persist.
The average needs to fall, not the amplitude. If we’re looking at risk in percentage points (rather than, say, logits, which might be a better parametrisation), small average implies small amplitude, but small amplitude does not imply small average.
Agreed. I meant that, if the risk is usually quite low (e.g. 0.001 % per century), but sometimes jumps to a high value (e.g. 1 % per century), the cumulative risk (over all time) may still be significantly below 100 % (e.g. 90 %) if the magnitude of the jumps decreases quickly, and risk does not stay high for long.
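A worked version of that scenario under one assumed decay schedule: a jump every ten centuries, each half the size of the last, on top of the low background rate:

```python
import numpy as np

background, first_jump = 1e-5, 1e-2    # 0.001% and 1% per century, as in the example above
risk = np.full(1000, background)       # 1,000 centuries
for k, century in enumerate(range(0, 1000, 10)):
    risk[century] = first_jump * 0.5 ** k   # each jump half the magnitude of the previous one

print(f"cumulative risk over 1,000 centuries = {1 - np.prod(1 - risk):.2%}")  # roughly 3%, far below 100%
```

So whether this family of models ends up closer to H3 or H4 really does hinge on how quickly the jump magnitudes decay and how long risk stays elevated after each jump.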
The large magnitude is an observation—we have seen risk go from quite low to quite high over a short period of time. If we expect such large magnitude changes to be rare, then we might expect the present conditions to persist.
Why should we expect the present conditions to persist if we expect large magnitude changes to be rare?
Because we are more likely to see no big changes than to see another big change.
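As a toy version of that reasoning, with an assumed once-per-10,000-centuries base rate for changes of the recent magnitude:

```python
p_big_change_per_century = 1 / 10_000   # assumed prior rate of jumps as large as the recent one
centuries_ahead = 100
p_another = 1 - (1 - p_big_change_per_century) ** centuries_ahead
print(f"P(another comparably large change within {centuries_ahead} centuries) = {p_another:.1%}")
# roughly 1%: under this prior, most of the mass has the newly elevated risk simply persisting.
```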
if the risk is usually quite low (e.g. 0.001 % per century), but sometimes jumps to a high value (e.g. 1 % per century), the cumulative risk (over all time) may still be significantly below 100 % (e.g. 90 %) if the magnitude of the jumps decreases quickly, and risk does not stay high for long.
I would call this model “transient deviation” rather than “random walk” or “regular oscillation”