> How likely do you think it is that the overall value of the future will be drastically less than it could have been?
It was unclear to me upon several rereads whether “drastically less” is meant to be interpreted in relative terms (intuitive notions of goodness that look more like a ratio) or absolute terms (fully taking into account astronomical waste arguments). If the former, then e.g. 99.9% as good as it could’ve been is still a pretty solid future, and would resolve “no.” If the latter, 0.1% of approximately infinity is still approximately infinity.
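To spell out the two readings (V here is just a stand-in for the future’s total attainable value, not a specific estimate anyone has committed to):

$$
\text{relative loss} = \frac{V - V_{\text{realised}}}{V}, \qquad \text{absolute loss} = V - V_{\text{realised}}
$$

On the relative reading, a 0.1% loss leaves the future 99.9% as good as it could have been; on the absolute reading, 0.001 × V is still astronomically large whenever V is.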
Would be interested if other people had the same confusion, or if I’m somehow uniquely confused here.
I’d also be interested in hearing if others found this confusing. The intent was a large relative change in the future’s value—hence the word “overall”, and the mirroring of some language from Bostrom’s definition of existential risk. I also figured that this would be clear from the fact that the survey was called “Existential risk from AI” (and this title was visible to all survey respondents).
None of the respondents (and none of the people who looked at my drafts of the survey) expressed confusion about this, though someone could potentially misunderstand without commenting on it (e.g., because they didn’t notice there was another possible interpretation).
Example of why this is important: given the rate at which galaxies are receding from us, my understanding is that every day we delay colonizing the universe loses us hundreds of thousands of stars. Thinking on those scales, almost any tiny effect today can have enormous consequences in absolute terms. But the concept of existential risk correctly focuses our attention on the things that threaten a large fraction of the future’s value.
Sure, but how large is large? You said in a different comment that losing 10% of the future is too high/an existential catastrophe, which I think is already debatable (I can imagine some longtermists thinking that getting 90% of the possible value is basically an existential win, and some of the survey respondents thinking that “drastic reduction” actually means more like 30%+ or 50%+). I think you’re implicitly agreeing with my comment that losing 0.1% of the future is acceptable, but I’m unsure if this is endorsed.
If you were to redo the survey for people like me, I’d have preferred a phrasing that says something more like:
> a drastic reduction (>X%) of the future’s value.
Or alternatively, instead of asking for probabilities,
> What’s the expected fraction of the future’s value that would be lost?
Though since a) nobody else raised the same issue I did, and b) I’m not a technical AI safety or strategy researcher and thus am outside of your target audience, this might all be a moot point.
> I can imagine some longtermists thinking that getting 90% of the possible value is basically an existential win
What’s the definition of an “existential win”? I agree that this would be a win, and would involve us beating some existential risks that currently loom large. But I also think this would be an existential catastrophe. So if “win” means “zero x-catastrophes”, I wouldn’t call this a win.
Bostrom’s original definition of existential risk talked about things that “drastically curtail [the] potential” of “Earth-originating intelligent life”. Under that phrasing, I think losing 10% of our total potential qualifies.
> I think you’re implicitly agreeing with my comment that losing 0.1% of the future is acceptable, but I’m unsure if this is endorsed.
?!? What does “acceptable” mean? Obviously losing 0.1% of the future’s value is very bad, and should be avoided if possible!!! But I’d be fine with saying that this isn’t quite an existential risk, by Bostrom’s original phrasing.
> If you were to redo the survey for people like me, I’d have preferred a phrasing that says something more like “a drastic reduction (>X%) of the future’s value.”
Agreed, I’d probably have gone with a phrasing like that.
> ?!? What does “acceptable” mean? Obviously losing 0.1% of the future’s value is very bad, and should be avoided if possible!!! But I’d be fine with saying that this isn’t quite an existential risk, by Bostrom’s original phrasing.
So I reskimmed the paper, and FWIW, by my reading Bostrom’s original phrasing doesn’t seem obviously sensitive to two orders of magnitude. “Drastically curtail” feels more like poetic language than a clear boundary.
He does have some lower bounds:
> However, the true lesson is a different one. If what we are concerned with is (something like) maximizing the expected number of worthwhile lives that we will create, then in addition to the opportunity cost of delayed colonization, we have to take into account the risk of failure to colonize at all. We might fall victim to an existential risk, one where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential.[8] Because the lifespan of galaxies is measured in billions of years, whereas the time-scale of any delays that we could realistically affect would rather be measured in years or decades, the consideration of risk trumps the consideration of opportunity cost. For example, a single percentage point of reduction of existential risks would be worth (from a utilitarian expected utility point-of-view) a delay of over 10 million years.
Taking “decades” conservatively to mean “at most ten decades”, this would suggest that something equivalent to a delay of ten decades (100 years) probably does not count as an existential catastrophe. However, this is a lower bound of 100 / 10 million × 1%, or 10^-7, far smaller than the 10^-3 I mentioned upthread.
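Spelling out that arithmetic (the 100-year figure being my conservative reading of “decades”):

$$
\frac{100 \text{ years}}{10{,}000{,}000 \text{ years}} \times 1\% = 10^{-5} \times 10^{-2} = 10^{-7}
$$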
(I agree that “acceptable” is sloppy language on my end, and losing 0.1% of the future’s value is very bad.)
(I considered just saying “existential risk” without defining the term, but I worried that people sometimes conflate existential risk with things like “extinction risk” or “risk that we’ll lose the entire cosmic endowment”.)
I agree that just saying “existential risk” without defining the term would have been much worse. It might have a technical definition within longtermist philosophy, but I don’t think the term, as broadly understood by EAs, has exactly that meaning.
Unfortunately, even the technical definition relies on the words “destroys” or “drastically curtails”, which leave room for interpretation. I would guess that most people interpret those as “destroys the vast majority [of our potential]”, e.g. reducing the EV of the future to 10% of what it could’ve been or lower. But it sounds like Rob interprets it as reducing the EV by at least 10%, which I would’ve called an example of a non-existential trajectory change.
Actually, I’ve just checked where I wrote about this before, and saw I quoted Ord saying:
> I leave the thresholds vague, but it should be understood that in any existential catastrophe the greater part of our potential is gone and very little remains.
So I think Rob’s “at least 10% is lost” interpretation would indeed be either unusual or out of step with Ord (less sure about Bostrom).
Then perhaps it’s good that I didn’t include my nonstandard definition of x-risk, and we can expect the respondents to be at least somewhat closer to Ord’s definition.
I do find it odd to say that ‘40% of the future’s value is lost’ isn’t an x-catastrophe, and in my own experience it’s much more common that I’ve wanted to draw a clear line between ‘40% of the future is lost’ and ‘0.4% of the future is lost’ than between 90% and 40%. I’d be interested to hear about cases where Toby or others found it illuminating to sharply distinguish 90% and 40%.
I have sometimes wanted to draw a sharp distinction between scenarios where 90% of humans die vs. ones where 40% of humans die; but that’s largely because the risk of subsequent extinction or permanent civilizational collapse seems much higher to me in the 90% case. I don’t currently see a similar discontinuity in ‘90% of the future lost vs. 40% of the future lost’, either in ‘the practical upshot of such loss’ or in ‘the kinds of scenarios that tend to cause such loss’. But I’ve also spent a lot less time than Toby thinking about the full range of x-risk scenarios.
FWIW, I personally don’t necessarily think we should focus more on 90+% loss scenarios than on 1-90% loss scenarios, or even than on <1% loss scenarios (though I’d currently lean against that final focus). I see this as essentially an open question (i.e., the question of which kinds of trajectory changes to prioritise increasing or decreasing the likelihood of).
I do think Ord thinks we should focus more on 90+% loss scenarios, though I’m not certain why. I think people like Beckstead and MacAskill are less confident about that. (I’m lazily not including links, but can add them on request.)
I have some messy, longwinded drafts on something like this topic from a year ago that I could share, if anyone is interested.
I was just talking about what people take x-risk to mean, rather than what I believe we should prioritise.
Some reasons I can imagine for focusing on 90+% loss scenarios:
- You might just have the empirical view that very few things would cause ‘medium-sized’ losses of a lot of the future’s value. It could then be useful to define ‘existential risk’ to exclude medium-sized losses, so that when you talk about ‘x-risks’ people fully appreciate just how bad you think these outcomes would be.
- ‘Existential’ suggests a threat to the ‘existence’ of humanity, i.e., an outcome about as bad as human extinction. (Certainly a lot of EAs—myself included, when I first joined the community!—misunderstand x-risk and think it’s equivalent to extinction risk.)
After googling a bit, I now think Nick Bostrom’s conception of existential risk (at least as of 2012) is similar to Toby’s. In https://www.existential-risk.org/concept.html, Nick divides up x-risks into the categories “human extinction, permanent stagnation, flawed realization, and subsequent ruination”, and says that in a “flawed realization”, “humanity reaches technological maturity” but “the amount of value realized is but a small fraction of what could have been achieved”. This only makes sense as a partition of x-risks if all x-risks reduce value to “a small fraction of what could have been achieved” (or reduce the future’s value to zero).
I still think that the definition of x-risk I proposed is a bit more useful, and I think it’s a more natural interpretation of phrasings like “drastically curtail [Earth-originating intelligent life’s] potential” and “reduce its quality of life (compared to what would otherwise have been possible) permanently and drastically”. Perhaps I should use a new term, like hyperastronomical catastrophe, when I want to refer to something like ‘catastrophes that would reduce the total value of the future by 5% or more’.
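To be explicit about the thresholds in play here, writing L for the fraction of the future’s attainable value that is lost (just my shorthand, not anyone’s official notation):

$$
\begin{aligned}
\text{“hyperastronomical catastrophe” as proposed above:} &\quad L \ge 0.05 \\
\text{the reading of “existential catastrophe” I used upthread:} &\quad L \ge 0.10 \\
\text{Ord’s gloss (“very little remains”):} &\quad L \text{ close to } 1
\end{aligned}
$$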
I agree with everything but your final paragraph. On the final paragraph, I don’t strongly disagree, but:
- To me, “drastically curtail” more naturally means “reduces to much less than 50%” (though that may be biased by me having also heard Ord’s operationalisation of the same term).
- At first glance, I feel averse to introducing a new term for something like “reduces by 5-90%”.
- I think “non-existential trajectory change”, or just “trajectory change”, maybe does an ok job for what you want to say.
  - Technically those terms would also cover 0.0001% losses or the like. But it seems like you could just say “trajectory change” and then also talk about roughly how much loss you mean?
- It seems like if we come up with a new term for the 5-90% bucket, we would also want a new term for other buckets?
I also mentally noted that “drastically less” was ambiguous, though for the sake of my quick forecasts I decided that whether you meant (or whether others would interpret you as meaning) “5% less” or “90% less” didn’t really matter, so I didn’t bother commenting.
Yeah, a big part of why I left the term vague is that I didn’t want people to get hung up on those details when many AGI catastrophe scenarios are extreme enough to swamp them. E.g., focusing on whether the astronomical loss threshold is 80% vs. 50% is beside the point if you think AGI failure almost always means losing 98+% of the future’s value.
I might still do it differently if I could re-run the survey, however. It would be nice to have a number, so we could more easily do EV calculations.
I’d be interested in seeing operationalizations at some subset of {1%, 10%, 50%, 90%, 99%}.* I can imagine that most safety researchers would give nearly identical answers to all of them, but I can also imagine large divergences, so there’s decent value of information here.
*Probably can’t do all 5, at least not at once, because of priming effects.
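As a minimal sketch of what answers to such a multi-threshold battery could give you (all numbers below are invented for illustration; the only real constraint is that P(loss ≥ X) must be non-increasing in X):

```python
# Hypothetical answers from one respondent: probability that AI causes the
# future to lose at least X of its attainable value. Numbers are made up.
thresholds = [0.01, 0.10, 0.50, 0.90, 0.99]
p_at_least = [0.40, 0.30, 0.20, 0.12, 0.08]

# Consistency check: "loses at least 99%" implies "loses at least 90%", etc.,
# so the probabilities must be non-increasing as the threshold rises.
assert all(a >= b for a, b in zip(p_at_least, p_at_least[1:]))

# Crude lower bound on the expected fraction of value lost: within each
# bracket [t_i, t_{i+1}), count only the loss t_i; above the top threshold,
# count only 0.99.
bracket_probs = [p - q for p, q in zip(p_at_least, p_at_least[1:] + [0.0])]
expected_loss_lb = sum(t * p for t, p in zip(thresholds, bracket_probs))
print(f"Lower bound on expected fraction of value lost: {expected_loss_lb:.3f}")
```

Even rough numbers like these would support the kind of EV calculation mentioned upthread, which a single probability for “drastically less” doesn’t.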