I had a question that I think is semi-related to this thread, regarding your prediction:
Let's suppose that one thousand years from now individual people still at least sort of exist, with a population of at least one million, and still largely govern themselves. I think, then, that there is something like a 4-in-5 chance that the portion of people living under a proper democracy will be substantially lower than it is today.
Are you seeing this prediction as including scenarios in which TAI has been developed by then, but things are basically going well, at least one million beings roughly like humans still exist, and the TAI is either agential and well-aligned with humanity and deferring to our wishes[1] or CAIS-like / tool-like?
I think I'd see those scenarios as fitting your described conditions. And I think I'd also see them as among the most likely pictures of a good, non-existentially-catastrophic future.[2] So I wonder whether (a) you don't intend to be accounting for such scenarios, (b) you think they're much less likely relative to other good futures than I do, or (c) you think good futures are much less likely relative to bad ones than I do?
A related uncertainty I have is what you mean by "individual people still at least sort of exist" in that quote. E.g., would you include whole brain emulations with a fairly similar mind design to current humans?
[1] This could maybe be like a more extreme version of how the US President is "agential" and makes many of the actual decisions, but US citizens still in a substantial sense "govern themselves" because the president is partly acting based on their preferences. (Though obviously that's different in that there are checks and balances, elections, etc.)
[2] I think the main alternatives would be:
somehow TAI is never developed, yet we can still fulfil our potential
humans changing into or being replaced by something very different
TAI is aligned with our idealised preferences at one point, and then just rolls with that, doing good things but not in any meaningful sense still being actively "governed by human-like beings"
Caveat that I wrote this comment relatively quickly and think a lot of it is poorly operationalised and would benefit from better terminology.
Are you seeing this prediction as including scenarios in which TAI has been developed by then, but things are basically going well, at least one million beings roughly like humans still exist, and the TAI is either agential and well-aligned with humanity and deferring to our wishes[1] or CAIS-like / tool-like?
Yep! I'm including these scenarios in the prediction.
I suppose I'm conditioning on either:
(a) AI has already been truly transformative, but people are still around and still meaningfully responsible for some important political decisions.*
(b) AI hasn't yet been truly transformative, but people haven't gone extinct.
I actually haven't thought enough about the relative probability of these two cases, or about my actual conditional probabilities for each of them. So my "4-in-5" prediction shouldn't be taken as very rigorously thought through. I think the outside view is relevant to both cases, but the automation argument is really only relevant to the first case.
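To make the structure of that conditioning explicit, the "4-in-5" can be read as a weighted average over the two cases. With purely illustrative numbers (not ones I've actually thought hard about):

$$P(\text{much less democracy} \mid \text{people exist and self-govern}) = P(a)\,P(\text{less democracy} \mid a) + P(b)\,P(\text{less democracy} \mid b) \approx 0.7 \times 0.9 + 0.3 \times 0.55 \approx 0.8$$

where the automation argument pushes the conditional probability up in case (a), and only the outside view bears on case (b).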
*I agree with your analogy here: People might be "meaningfully responsible" in the same way that US citizens are "meaningfully responsible" for US government actions, even though they only provide very occasional and simple inputs.
A related uncertainty I have is what you mean by "individual people still at least sort of exist" in that quote. E.g., would you include whole brain emulations with a fairly similar mind design to current humans?

I'm a little torn here. I've gone back and forth on this point, but haven't really settled on how much including emulations should or should not influence the prediction. (Another sign that my "4-in-5" shouldn't be taken too seriously.)

If whole brain emulations have largely replaced regular biological people, and mostly aren't doing work (because other AI systems can do a better job at most relevant cognitive tasks), then the automation argument still applies. But we should also assume, if we're talking about emulations, that there have been an incredible number of other changes, some of which might be much more relevant than the destruction of the value of labor. For example, surely the ability to make copies of an emulation has implications for the nature of voting.

So, although I still feel that automation pushes in the direction of dictatorship in the emulation case, I do feel a bit silly making mechanistic or "inside view" arguments given how foreign this possible future is to us. I also think the outside view continues to be relevant. At the same time, though, there might be a somewhat stronger case for just throwing up our hands and beginning from a non-informative 50–50 prior instead of trying to think too hard about base rates.
Yep! I'm including these scenarios in the prediction.
Interesting, thanks.

So now I'm thinking that maybe your prediction, if accurate, is quite concerning. It sounds like you believe the future could take roughly the following forms:
1. There are no longer really people, or they no longer really govern themselves. Subtypes:
   - Extinction
   - A post-human/transhuman scenario (could be good or bad)
   - There are still essentially people, but something else is very much in control (probably AI; could be either aligned or misaligned with what we do/should value)
2. There are still essentially people and they still govern themselves, and there's "something like a 4-in-5 chance that the portion of people living under a proper democracy will be substantially lower than it is today"
That sounds to me like a 4-in-5 chance of something that might well itself be an existential catastrophe (global authoritarianism that lasts indefinitely long), or might substantially increase the chances of some other existential catastrophe (e.g., because it's harder to have a long reflection, and so bad values get locked in).
So that makes it sound like we might want to aim for good post-human/transhuman scenarios (if aiming for the good versions specifically is relatively tractable), or for good scenarios in which something non-human is very much in control (like developing a friendly agential AI).
But maybe you don't see possibility 2 as necessarily that concerning? E.g., maybe you think that something like mild or genuinely enlightened and benevolent authoritarianism accounts for a substantial part of the likelihood of authoritarianism?
(Also, I'm aware that, as you emphasise, the "4-in-5" claim shouldn't be taken too seriously. I'm sort of using it as a springboard for thought, something like "If the rough worldview that tentatively generated that probability turned out to be totally correct, how concerned should I be, and what futures should I try to bring about?"

Btw, I've now added your forecast to my "Database of existential risk estimates (or similar)", in the tab for "Estimates of somewhat less extreme outcomes".)
So that makes it sound like we might want to aim for good post-human/transhuman scenarios (if aiming for the good versions specifically is relatively tractable), or for good scenarios in which something non-human is very much in control (like developing a friendly agential AI).

I'm not sure that follows. I mainly think that the meaning of the question "Will the future be democratic?" becomes much less clear when applied to fully/radically post-human futures. But I'm not sure I see a natural reason to think that these futures would be "politically better" than futures that are more recognizably human. So, at least at the moment, I'm not inclined to treat this as a major reason to push for a more or less post-human future.
That sounds to me like a 4-in-5 chance of something that might well itself be an existential catastrophe (global authoritarianism that lasts indefinitely long), or might substantially increase the chances of some other existential catastrophe (e.g., because it's harder to have a long reflection, and so bad values get locked in). ... But maybe you don't see [this possibility] as necessarily that concerning? E.g., maybe you think that something like mild or genuinely enlightened and benevolent authoritarianism accounts for a substantial part of the likelihood of authoritarianism?
On the implications of my prediction for future people:
I definitely think of my prediction as, at least, bad news for future people. I'm a little unsure exactly how bad the news is, though.
Democratic governments are currently, on average, much better for the people who live under them. It's not always possible to be totally sure of causation, but massacres, famines, serious suppressions of liberties, etc., have clearly been much more common under dictatorial governments than under democratic governments. There are also pretty basic reasons to expect democracies to typically be better for the people under them: there's a stronger link between government decisions and people's preferences. I expect this logic to hold, even if a lot of the specific ways in which dictatorships are on average worse than democracies (like higher famine risk) become less relevant in the future.
At the same time, I'm not sure we should be imagining a dystopia. Most people alive today live under dictatorial governments, and, for most of these people, daily life doesn't feel like a boot on the face. The average person in Hanoi, for example, doesn't think of themselves as living in the midst of catastrophe. Growing prosperity and some forms of technological progress are also reasons to expect quality of life to go up over time, even if the political situation deteriorates.
So I just want to clarify that, even though I'm predicting a counterfactually worse outcome, I'm not necessarily predicting a dystopia for most people, or a scenario in which most people's lives are net negative. A dystopian future is conceivable, but doesn't necessarily follow from a lack of democracy.
On the implications of my prediction for "value lock-in" more broadly:
I think the main benefit of democracy, in this case, is that we should probably expect a wider range of values to be taken into account when important decisions with long-lasting consequences are made. Inclusiveness and pluralism of course don't always imply morally better outcomes. But moral uncertainty considerations probably push in the direction of greater inclusivity/pluralism being good, in expectation. From some perspectives, it's also inherently morally valuable for important decisions to be made in inclusive/pluralistic ways. Finally, I expect the average dictator to have worse values than the average non-dictator.
I actually haven't thought very hard about the implications of dictatorship and democracy for value lock-in, though. I think I also probably have a bit of a reflexive bias toward democracy here.
I think the main benefit of democracy, in this case, is that we should probably expect a wider range of values to be taken into account when important decisions with long-lasting consequences are made. Inclusiveness and pluralism of course don't always imply morally better outcomes. But moral uncertainty considerations probably push in the direction of greater inclusivity/pluralism being good, in expectation.
It sounds like you mainly have in mind something akin to preference aggregation. It seems to me that a similarly important benefit might be that democracies are likely more conducive to a free exchange of ideas/perspectives and to people converging on more accurate ideas/perspectives over time. (I have in mind something like the marketplace of ideas concept. I should note that I'm very unsure how strong those effects are, and how contingent they are on various features of the present world which we should expect to change in future.)
Did you mean for your comment to imply that idea as well? In any case, do you broadly agree with that idea?
Interesting, thanks! I think those points broadly make sense to me.
So I just want to clarify that, even though I'm predicting a counterfactually worse outcome, I'm not necessarily predicting a dystopia for most people, or a scenario in which most people's lives are net negative. A dystopian future is conceivable, but doesn't necessarily follow from a lack of democracy.
I think this is a good point, but I also think that:
- The use of the term "dystopia" without clarification is probably not ideal
- A future that's basically like current-day Hanoi everywhere forever is very plausibly an existential catastrophe (given Bostrom/Ord's definitions and some plausible moral and empirical views)
   - (This is a very different claim from "Hanoi is supremely awful by present-day standards", or even "I'd hate to live in Hanoi myself")
- In my previous comment, I intended for things like "current-day Hanoi everywhere forever" to potentially count among the failure modes I'm concerned about
To expand on those claims a bit:
When I use the term "dystopia", I tend to essentially have in mind what Ord (2020) calls "unrecoverable dystopia", which is one of his three types of existential catastrophe, along with extinction and unrecoverable collapse. And he defines an existential catastrophe in turn as "the destruction of humanity's longterm potential." So I think the simplest description of what I mean by the term "unrecoverable dystopia" would be "a scenario in which civilization will continue to exist, but it is now guaranteed that the vast majority of the value that previously was attainable will never be attained".[1]

(See also Venn diagrams of existential, global, and suffering catastrophes and Clarifying existential risks and existential catastrophes.)
So this wouldn't require that the average sentient being has a net-negative life, as long as it's possible that something far better could've happened but now is guaranteed not to happen. And it more clearly wouldn't require that the average person has a net-negative life, nor that the average person perceives themselves to be in a "catastrophe" or "dystopia".
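To put the same point in rough symbols (my own gloss, not notation Ord uses): a persisting civilization counts as an "unrecoverable dystopia" in this sense roughly when

$$V_{\text{realized}} \ll V_{\text{attainable}}$$

is permanently locked in, where \(V_{\text{attainable}}\) is the value that was previously attainable and \(V_{\text{realized}}\) is the value that will actually be attained. Nothing in that condition requires the average life to be net negative.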
Obviously, a world in which the average person or sentient being has a net-negative life would be even worse than a world that's an "unrecoverable dystopia" simply due to "unfulfilled potential", and so I think your clarification of what you're saying is useful. But I already wasn't necessarily thinking of a world with average net-negative lives (though I failed to clarify this).
[1] That said, Ord's own description of what he means by "unrecoverable dystopia" seems misleading: he describes it as a type of existential catastrophe in which "civilization [is] intact, but locked into a terrible form, with little or no value". I assume he means "terrible" and "little or no" when compared against an incredibly excellent future that he considers attainable. But it'd be very easy for someone to interpret his description as meaning the term only applies to futures that are very net-negative.
I also think "dystopia" might not be an ideal term for what Ord and I want to be referring to, both because it invites confusion and because it might sound silly/sci-fi/weird.