Thanks for this. I think this is very valuable and really appreciate it being set out. I expect to come back to it a few times. One query and one request for further work—from someone, not necessarily you, as this is already a sterling effort!
I’ve heard Thorstad’s TOP talk a couple of times, but it’s now a bit foggy and I can’t remember where his argument ends and yours starts. Is it that Thorstad argues (some version of) longtermism relies on the TOP thesis, but doesn’t investigate whether TOP is true, whereas you set about investigating whether it is?
The request for further work: 18 is a lot of premises for a philosophical argument, and your analysis is very hedged. I recognise you don’t want to claim too much but, as a reader who has thought about this far less than you, I would really appreciate you telling me what you think. Specifically, it would be useful to know which of the premises are the most crucial, in the sense of being least plausible. Presumably, some of the 18 premises aren’t worth worrying about, and we can concentrate our attention on a subset. Or, if you think all the premises are similarly plausible, that would be useful to know too!
Hi Michael, thanks for this.
On 1: Thorstad argues that if you want to jointly hold (1) Existential Risk Pessimism—per-century existential risk is very high—and (2) the Astronomical Value Thesis—efforts to mitigate existential risk have astronomically high expected value—then TOP is the most plausible way to do so. He does look at two arguments for TOP—space settlement and an existential risk Kuznets curve—but says these aren’t strong enough to ground TOP, and that we instead need a version of TOP that appeals to AI. It’s fair to think of this piece as starting from that point, although the motivation for appealing to AI here was more that this seemed to be the most compelling version of TOP to x-risk scholars.
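For what it’s worth, the arithmetic behind that tension is easy to sketch. Here is a rough back-of-the-envelope version (my own illustrative numbers and function names, not Thorstad’s exact model): with a constant per-century extinction risk r, the expected number of centuries survived is the geometric sum (1 - r)/r, so Pessimism caps the future at a handful of centuries; if risk is only high during a short perilous period and then drops, the sum explodes, which is what restores astronomical expected value to mitigation.

```python
# Illustrative numbers only: how constant high per-century risk caps the
# expected future, and how a Time of Perils structure uncaps it.

def expected_centuries_constant(r: float) -> float:
    """Expected centuries survived under a constant per-century risk r."""
    return (1 - r) / r

def expected_centuries_time_of_perils(r_high: float, r_low: float, n_perils: int) -> float:
    """Expected centuries if risk is r_high for n_perils centuries, then r_low forever after."""
    # Centuries accrued (in expectation) during the perilous period...
    during = sum((1 - r_high) ** k for k in range(1, n_perils + 1))
    # ...plus the post-perils tail, conditional on surviving the perils.
    survive_perils = (1 - r_high) ** n_perils
    after = survive_perils * (1 - r_low) / r_low
    return during + after

print(expected_centuries_constant(0.2))                 # ~4 centuries
print(expected_centuries_time_of_perils(0.2, 1e-4, 5))  # ~3,280 centuries
```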
On 2: I don’t think of myself as an expert on TOP and was mostly aiming to summarise premises that seem to be common, hence the hedging. Broadly, I think you do only need the four claims that formed the main headings: (1) high levels of x-risk now, (2) significantly reduced levels of x-risk in the future, (3) a long and valuable / positive-EV future, and (4) a moral framework that places a lot of weight on this future. I think the slimmed-down version of the argument focuses solely on AI as it’s relevant for (1), (2) and (3), but as I say in the piece, I think there are potentially other ways to ground TOP without appealing to AI, and I would be very keen to see those articulated and explored more.
(2) is the part where my credences feel most fragile, especially the parts about AI being sufficiently capable to drastically reduce both other x-risks and the risk from misaligned AI, and about AI remaining aligned near-indefinitely. It would be great to have a better sense of how difficult the various x-risks are to solve and how powerful an AI system we might need to near-eliminate them. The no-unknown-unknowns premise seems like the least plausible of the group, but its very nature makes it hard to know how to cash it out.
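To give a flavour of why those credences do so much work, here is a tiny sensitivity check (again, purely illustrative numbers): the expected length of the post-perils future is roughly (1 - r_low)/r_low centuries, so every order of magnitude of residual per-century risk, whether from alignment slipping or from unknown unknowns, is an order of magnitude of expected future.

```python
# Illustrative sensitivity check: expected post-perils centuries scale roughly
# inversely with whatever residual per-century risk remains.
for r_low in (1e-2, 1e-3, 1e-4, 1e-6):
    print(f"residual per-century risk {r_low:.0e}: ~{(1 - r_low) / r_low:,.0f} expected centuries")
```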