I had a question that I think is semi-related to this thread, regarding your prediction:
Let’s suppose that one thousand years from now individual people still at least sort of exist, with a population of at least one million, and still largely govern themselves. I think, then, that there is something like a 4-in-5 chance that the portion of people living under a proper democracy will be substantially lower than it is today.
Are you seeing this prediction as including scenarios in which TAI has been developed by then, but things are basically going well, at least one million beings roughly like humans still exist, and the TAI is either agential and well-aligned with humanity and deferring to our wishes[1] or CAIS-like / tool-like?
I think I’d see those scenarios as fitting your described conditions. And I think I’d also see them as among the most likely picture of a good, non-existentially-catastrophic future.[2] So I wonder whether (a) you don’t intend to be accounting for such scenarios, (b) you think they’re much less likely relative to other good futures than I do, or (c) you think good futures are much less likely relative to bad ones than I do?
A related uncertainty I have is what you mean by “individual people still at least sort of exist” in that quote. E.g., would you include whole brain emulations with a fairly similar mind design to current humans?
[1] This could maybe be like a more extreme version of how the US President is “agential” and makes many of the actual decisions, but US citizens still in a substantial sense “govern themselves” because the president is partly acting based on their preferences. (Though obviously that’s different in that there are checks and balances, elections, etc.)
[2] I think the main alternatives would be:
- somehow TAI is never developed, yet we can still fulfil our potential
- humans changing into or being replaced by something very different
- TAI is aligned with our idealised preferences at one point, and then just rolls with that, doing good things but not in any meaningful sense still being actively “governed by human-like beings”
Caveat that I wrote this comment relatively quickly and think a lot of it is poorly operationalised and would benefit from better terminology.
Are you seeing this prediction as including scenarios in which TAI has been developed by then, but things are basically going well, at least one million beings roughly like humans still exist, and the TAI is either agential and well-aligned with humanity and deferring to our wishes[1] or CAIS-like / tool-like?
Yep! I’m including these scenarios in the prediction.
I suppose I’m conditioning on either:
(a) AI has already been truly transformative, but people are still around and still meaningfully responsible for some important political decisions.*
(b) AI hasn’t yet been truly transformative, but people haven’t gone extinct.
I actually haven’t thought enough about the relative probability of these two cases or my actual conditional probabilities for each of them. So my “4-in-5” prediction shouldn’t be taken as very rigorously thought through. I think the outside view is relevant to both cases, but the automation argument is only very relevant to the first case.
*I agree with your analogy here: People might be “meaningfully responsible” in the same way that US citizens are “meaningfully responsible” for US government actions, even though they only provide very occasional and simple inputs.
A related uncertainty I have is what you mean by “individual people still at least sort of exist” in that quote. E.g., would you include whole brain emulations with a fairly similar mind design to current humans?
I’m a little torn here. I’ve gone back and forth on this point, but haven’t really settled on how much including emulations should or should not influence the prediction. (Another sign that my “4-in-5” shouldn’t be taken too seriously.)
If whole brain emulations have largely replaced regular biological people, and mostly aren’t doing work (because other AI systems can do a better job at most relevant cognitive tasks), then the automation argument still applies. But we should also assume, if we’re talking about emulations, that there have been an incredible number of other changes, some of which might be much more relevant than the destruction of the value of labor. For example, surely the ability to make copies of an emulation has implications for the nature of voting.
So, although I still feel that automation pushes in the direction of dictatorship, in the emulation case I do feel a bit silly making mechanistic or “inside view” arguments, given how foreign this possible future is to us. I also think the outside view continues to be relevant. At the same time, though, there might be a somewhat stronger case for just throwing up our hands and beginning from a non-informative 50/50 prior instead of trying to think too hard about base rates.
Yep! I’m including these scenarios in the prediction.
Interesting, thanks. So now I’m thinking that maybe your prediction, if accurate, is quite concerning. It sounds like you believe the future could take roughly the following forms:
1. There are no longer really people, or they no longer really govern themselves. Subtypes:
   - Extinction
   - A post-human/transhuman scenario (could be good or bad)
   - There are still essentially people, but something else is very much in control (probably AI; could be either aligned or misaligned with what we do/should value)
2. There are still essentially people and they still govern themselves, and there’s “something like a 4-in-5 chance that the portion of people living under a proper democracy will be substantially lower than it is today”.
   - That sounds to me like a 4-in-5 chance of something that might probably itself be an existential catastrophe (global authoritarianism that lasts indefinitely long), or might substantially increase the chances of some other existential catastrophe (e.g., because it’s harder to have a long reflection and so bad values get locked in).
So that makes it sound like we might want to aim for good post-human/transhuman scenarios (if aiming for the good versions specifically is relatively tractable), or for good scenarios in which something non-human is very much in control (like developing a friendly agential AI).
But maybe you don’t see possibility 2 as necessarily that concerning? E.g., maybe you think that something like mild or genuinely enlightened and benevolent authoritarianism accounts for a substantial part of the likelihood of authoritarianism?
(Also, I’m aware that, as you emphasise, the “4-in-5” claim shouldn’t be taken too seriously. I’m sort of using it as a springboard for thought—something like “If the rough worldview that tentatively generated that probability turned out to be totally correct, how concerned should I be and what futures should I try to bring about?”

Btw, I’ve now added your forecast to my “Database of existential risk estimates (or similar)”, in the tab for “Estimates of somewhat less extreme outcomes”.)
So that makes it sound like we might want to aim for good post-human/transhuman scenarios (if aiming for the good versions specifically is relatively tractable), or for good scenarios in which something non-human is very much in control (like developing a friendly agential AI).
I’m not sure that follows. I mainly think that the meaning of the question “Will the future be democratic?” becomes much less clear when applied to fully/radically post-human futures. But I’m not sure I see a natural reason to think that these futures would be ‘politically better’ than futures that are more recognizably human. So, at least at the moment, I’m not inclined to treat this as a major reason to push for a more or less post-human future.
That sounds to me like a 4-in-5 chance of something that might probably itself be an existential catastrophe (global authoritarianism that lasts indefinitely long), or might substantially increase the chances of some other existential catastrophe (e.g., because it’s harder to have a long reflection and so bad values get locked in).… But maybe you don’t see [this possibility] as necessarily that concerning? E.g., maybe you think that something like mild or genuinely enlightened and benevolent authoritarianism accounts for a substantial part of the likelihood of authoritarianism?
On the implications of my prediction for future people:
I definitely think of my prediction as, at least, bad news for future people. I’m a little unsure exactly how bad the news is, though.
Democratic governments are currently, on average, much better for the people who live under them. It’s not always possible to be totally sure of causation, but massacres, famines, serious suppressions of liberties, etc., have clearly been much more common under dictatorial governments than under democratic ones. There are also pretty basic reasons to expect democracies to typically be better for the people under them: there’s a stronger link between government decisions and people’s preferences. I expect this logic to hold even if many of the specific ways in which dictatorships are on average worse than democracies (like higher famine risk) become less relevant in the future.
At the same time, I’m not sure we should be imagining a dystopia. Most people alive today live under dictatorial governments, and, for most of these people, daily life doesn’t feel like a boot on the face. The average person in Hanoi, for example, doesn’t think of themselves as living in the midst of catastrophe. Growing prosperity and some forms of technological progress are also reasons to expect quality of life to go up over time, even if the political situation deteriorates.
So I just want to clarify that, even though I’m predicting a counterfactually worse outcome, I’m not necessarily predicting a dystopia for most people, or a scenario in which most people’s lives are net negative. A dystopian future is conceivable, but doesn’t necessarily follow from a lack of democracy.
On the implications of my prediction for “value lock-in,” more broadly:
I think the main benefit of democracy, in this case, is that we should probably expect a wider range of values to be taken into account when important decisions with long-lasting consequences are made. Inclusiveness and pluralism of course don’t always imply morally better outcomes. But moral uncertainty considerations probably push in the direction of greater inclusivity/pluralism being good in expectation. From some perspectives, it’s also inherently morally valuable for important decisions to be made in inclusive/pluralistic ways. Finally, I expect the average dictator to have worse values than the average non-dictator.
I actually haven’t thought very hard about the implications of dictatorship and democracy for value lock-in, though. I think I also probably have a bit of a reflexive bias toward democracy here.
I think the main benefit of democracy, in this case, is that we should probably expect a wider range of values to be taken into account when important decisions with long-lasting consequences are made. Inclusiveness and pluralism of course don’t always imply morally better outcomes. But moral uncertainty considerations probably push in the direction of greater inclusivity/pluralism being good in expectation.
It sounds like you mainly have in mind something akin to preference aggregation. It seems to me that a similarly important benefit might be that democracies are likely more conducive to a free exchange of ideas/perspectives and to people converging on more accurate ideas/perspectives over time. (I have in mind something like the marketplace of ideas concept. I should note that I’m very unsure how strong those effects are, and how contingent they are on various features of the present world which we should expect to change in future.)
Did you mean for your comment to imply that idea as well? In any case, do you broadly agree with that idea?
Interesting, thanks! I think those points broadly make sense to me.
So I just want to clarify that, even though I’m predicting a counterfactually worse outcome, I’m not necessarily predicting a dystopia for most people, or a scenario in which most people’s lives are net negative. A dystopian future is conceivable, but doesn’t necessarily follow from a lack of democracy.
I think this is a good point, but I also think that:
- The use of the term “dystopia” without clarification is probably not ideal.
- A future that’s basically like current-day Hanoi everywhere forever is very plausibly an existential catastrophe (given Bostrom/Ord’s definitions and some plausible moral and empirical views).
  - (This is a very different claim from “Hanoi is supremely awful by present-day standards”, or even “I’d hate to live in Hanoi myself”.)
- In my previous comment, I intended for things like “current-day Hanoi everywhere forever” to be potentially included among the failure modes I’m concerned about.
To expand on those claims a bit:
When I use the term “dystopia”, I tend to essentially have in mind what Ord (2020) calls “unrecoverable dystopia”, which is one of his three types of existential catastrophe, along with extinction and unrecoverable collapse. And he defines an existential catastrophe in turn as “the destruction of humanity’s longterm potential.” So I think the simplest description of what I mean by “unrecoverable dystopia” would be “a scenario in which civilization will continue to exist, but it is now guaranteed that the vast majority of the value that was previously attainable will never be attained”.[1]

(See also Venn diagrams of existential, global, and suffering catastrophes and Clarifying existential risks and existential catastrophes.)
So this wouldn’t require that the average sentient being has a net-negative life, as long as it’s possible that something far better could’ve happened but now is guaranteed to not happen. And it more clearly wouldn’t require that the average person has a net-negative life, nor that the average person perceives themselves to be in a “catastrophe” or “dystopia”.
Obviously, a world in which the average person or sentient being has a net-negative life would be even worse than a world that’s an “unrecoverable dystopia” simply due to “unfulfilled potential”, and so I think your clarification of what you’re saying is useful. But I already wasn’t necessarily thinking of a world with average net-negative lives (though I failed to clarify this).
[1] That said, Ord’s own description of what he means by “unrecoverable dystopia” seems misleading: he describes it as a type of existential catastrophe in which “civilization [is] intact, but locked into a terrible form, with little or no value”. I assume he means “terrible” and “little or no” relative to the incredibly excellent future he considers attainable. But it’d be very easy to interpret his description as meaning the term applies only to futures that are very net-negative.
I also think “dystopia” might not be an ideal term for what Ord and I want to refer to, both because it invites confusion and because it might sound silly/sci-fi/weird.