(Borrowing some language from a comment I just wrote here.)
If an event occurs that permanently locks us in to an "astronomically good" future that is <X% as valuable as the optimal future, has an existential catastrophe occurred? I'd like to use the term "existential risk" such that the answer is "no" for any value of X that still allows the future to intuitively seem "astronomically good." If a future intuitively seems just extremely, mind-bogglingly good, then saying that an existential catastrophe has occurred in that future before all the good stuff happened just feels wrong.
So in short, I think "existential catastrophe" should mean what we think of when we think of central examples of existential catastrophes. That includes extinction events and (at least some, but not all) events that lock us in to disappointing futures (futures in which, e.g., "we never leave the solar system" or "massive nonhuman animal suffering continues"). But it does not include things that only seem like catastrophes when a total utilitarian compares them to what's optimal.
Per Linch's point that defining existential risk entirely empirically is more or less impossible, I think we should perhaps embrace defining existential risk in terms of value, by picking an arbitrary threshold of value such that, if the world is still capable of reaching that level of value, an existential catastrophe has not occurred.
But rather than use 1% or 50% or 90% of optimal as that threshold, we should use a much lower bar, set approximately at the (extremely fuzzy) boundary of what seems like an "astronomically good future," in order to avoid situations where "an existential catastrophe has occurred, but the future is still going to be extremely good."
One such arbitrary threshold:
If the intrinsic value of the year-long conscious experiences of the billion happiest people on Earth in 2020 is equal to 10^9 utilons, we could say that an "astronomically good future" is a future worth >10^30 utilons.
So if we create an AI that's destined to put the universe to work creating stuff of value in a sub-optimal way (<90% or <1% or even <<1% of optimal), but that will still fill the universe with amazing conscious minds such that the future is still worth >10^30 utilons, then an existential catastrophe has not occurred. But if it's a really mediocre non-misaligned AI that (e.g.) wins out in an Adversarial Technological Maturity scenario and only puts our light cone to use creating a future worth <10^30 utilons, then we can call that an existential catastrophe (and perhaps refer to the result as a disappointing future, which seems to be the sort of future that results from a subset of existential catastrophes).
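To make the arithmetic behind that threshold concrete, here is a minimal sketch in Python (purely illustrative, not from the original comment): the constants encode the 2020 baseline and the 10^30-utilon bar proposed above, and the function name is just a hypothetical label for the classification being suggested.

```python
# Illustrative sketch of the proposed threshold definition.
# The specific numbers are the arbitrary ones chosen above.

EARTH_2020_BASELINE = 1e9      # utilons: year-long experiences of the billion happiest people in 2020
ASTRONOMICAL_THRESHOLD = 1e30  # utilons: proposed bar for an "astronomically good future"
# Note: the threshold is 1e30 / 1e9 = 1e21 times the 2020 baseline.

def is_existential_catastrophe(locked_in_future_value: float) -> bool:
    """Return True if permanently locking in a future of this value (in utilons)
    would count as an existential catastrophe under the threshold above."""
    return locked_in_future_value < ASTRONOMICAL_THRESHOLD

# A sub-optimal AI that still fills the light cone with flourishing minds:
print(is_existential_catastrophe(1e35))  # False: still astronomically good
# A mediocre lock-in (e.g., we never leave the solar system):
print(is_existential_catastrophe(1e12))  # True: a disappointing future
```

The point of the sketch is just that the classification depends only on whether the locked-in future clears the fuzzy "astronomically good" bar, not on how far it falls short of optimal.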