Whether or not aligning AI results in a big positive future, it could still make a hugely positive difference between futures. Independently of this, if AI alignment work includes enough s-risk-mitigating work and avoids enough s-risk-increasing work, it could also prevent big negative futures.
I’m not sure whether it does reduce s-risk on net, though I’m not that informed here. The priorities for s-risk work generally seem different from those for alignment and extinction risk reduction, at least according to CLR and CRS, but AI-related work is still a main priority for them, if not the main one.
EDIT: I decided to break up this comment, making the more general point here and discussing specific views in my reply.
Whatever the appeal of a totalist axiology, several other theories and variations of population ethics are potential contenders, such as person-affecting and critical-level views, and, allowing for moral uncertainty, these should also be given some weight.
On other specific axiologies:
Both person-affecting views and critical-level utilitarianism are compatible with the dominance of very-low-probability events of enormous value, and could imply it if we’re just taking expected values over a sum of welfare.
Critical-level utilitarianism is unbounded when just taken to be the sum of utilities minus a uniform critical level for each moral patient (or each utility value). Wide person-affecting views, according to which future people do matter, but only in the sense that it’s better for better-off people to exist than worse-off people, and asymmetric person-affecting views, on which bad lives are worth preventing, can also generate unbounded differences between potential outcomes and recommend reducing existential risk (assuming risk- and ambiguity-neutral expected “value” maximization), especially to avoid mediocre or bad futures. See Thomas, 2019, especially section 6, Extinction Risk Revisited. Others may have made similar points.
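To make the unboundedness and low-probability-dominance points concrete, here is a minimal sketch in my own notation (w_i for each individual’s welfare, c for the critical level; not taken from Thomas or any other particular source). The critical-level value of an outcome with n moral patients is

$$V_{\text{CL}} = \sum_{i=1}^{n} (w_i - c),$$

and nothing caps n or the w_i, so V_CL can be made arbitrarily large or arbitrarily negative. If we then maximize the expectation $\mathbb{E}[V] = \sum_j p_j V_j$ over outcomes, an outcome with probability $10^{-15}$ but value $10^{30}$ still contributes $10^{15}$, swamping any ordinary-scale outcome; hence the dominance of very-low-probability events mentioned above.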
(The rest of this comment is mostly copied from the one I made here.)
For narrow person-affecting views, see:
Gustafsson, J. E., & Kosonen, P. (20??). Prudential Longtermism.
Carl Shulman. (2019). Person-affecting views may be dominated by possibilities of large future populations of necessary people.
Basically, existing people could have extremely long and therefore extremely valuable lives, through advances in medicine, anti-aging and/or mind uploading.
That being said, it could be the case that accelerating AI development is good and slowing it is bad on some narrow person-affecting views, because AI could help existing people have extremely long lives, through its contributions to medicine, anti-aging and/or mind uploading. See also:
Matthew Barnett. (2023). The possibility of an indefinite AI pause, section The opportunity cost of delayed technological progress.
Chad I. Jones. (2023). The A.I. Dilemma: Growth versus Existential Risk. (talk, slides).
I think most people discount their future welfare substantially, though (perhaps other than for meeting some important life goals, like getting married and raising children), so living so much longer may not be that valuable according to their current preferences. To dramatically increase the stakes, one of the following should hold (I sketch the rough arithmetic after this list):
We don’t use their own current preferences, and instead say their stakes are higher than they would recognize them to be, which may seem paternalistic and will fail to respect their current preferences in other ways.
The vast majority of the benefit comes from the (possibly small and/or atypical) subset of people who don’t discount their future welfare much, which gets into objections on the basis of utility monsters, inequity and elitism (maybe only the relatively wealthy/educated have very low discount rates). Or, maybe these interpersonal utility comparisons aren’t valid in the first place. It’s not clear what would ground them.
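Rough arithmetic for the discounting point, with made-up numbers: suppose someone values each extra year of ordinary life at w and discounts exponentially at 5% per year, i.e. $\delta = 0.95$. Then, from their current perspective, T further years are worth

$$\sum_{t=0}^{T-1} w\,\delta^{t} = w\,\frac{1-\delta^{T}}{1-\delta} \le \frac{w}{1-\delta} = 20\,w,$$

so even an astronomically long life is worth at most about 20 ordinary years to them now. Getting dramatically larger stakes therefore requires one of the two moves above.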
I was somewhat surprised by the lack of distinction between the case where we go extinct and the universe is barren (value 0) and the case of big negative futures filled with suffering. The difference between these cases seems large to me and seems likely to substantially affect the value of x-risk and s-risk mitigation. This is even more the case if you don’t subscribe to symmetric welfare ranges and think our capacity to suffer is vastly greater than our capacity to feel pleasure, which would make the worst possible futures far worse than the best possible futures are good. I suspect this is related to the popularity of the term ‘existential catastrophe’, which collapses any difference between these cases (as well as cases where we bumble along and produce some small positive value, far short of our best possible future).
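A toy illustration of how asymmetric welfare ranges widen that gap (the numbers are made up): suppose each person-moment’s welfare is capped at $+b$ but can go as low as $-10b$. For a future containing N person-moments,

$$V_{\text{best}} \approx b\,N, \qquad V_{\text{worst}} \approx -10\,b\,N,$$

so the drop from a barren universe (0) to the worst future is ten times the gain from a barren universe to the best one, and lumping both under ‘existential catastrophe’ hides exactly the difference that matters most on such a view.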