Really interesting post, thanks! Some random reactions.
“Pretty good” governance failure is possible. We could end up with an outcome that many or most influential people want, but that wiser versions of ourselves would strongly disapprove of. This scenario is plausibly the default outcome of aligned superintelligence: great uses of power are a tiny subset of the possible uses of power, the people/institutions that currently want great outcomes constitute a tiny share of total influence, and neither will those who want non-great outcomes be persuaded nor will those who want great outcomes acquire much influence without us working to increase it.
My first gut reaction is skepticism that this is a likely or stable state. The Earthly utopia scenario will likely not happen, given that seemingly most explorations of humanity’s future prominently feature its expansion to space. Additionally, I suspect that a large fraction of people who seriously start thinking about the longterm future of humanity fall into the camp that you consider “people/institutions that currently want great outcomes”. If this is true, one might suspect that this will become a much stronger faction, and that an aligned AI will have to consider those ambitions, too?
Robin Hanson speculated that the debate between people who want to use our cosmic endowment and those who want to stay local might be the cultural debate of the future. He calls it becoming grabby vs. non-grabby. He worries that a central government will try to restrict grabby expansion because it would be nearly impossible to keep the growing civilization under its control (https://www.overcomingbias.com/2021/07/the-coming-cosmic-control-conflict.html):
If within a few centuries we have a strong world government managing capitalist competition, overpopulation, value drift, and much more, we might come to notice that these and many other governance solutions to pressing problems are threatened by unrestrained interstellar colonization. Independent colonies able to change such solutions locally could allow population explosions and value drift, as well as capitalist competition that beats out home industries. That is, colony independence suggests unmanaged colony competition. In addition, independent colonies would lower the status of those who control the central government.
So authorities would want to either ban such colonization, or to find ways to keep colonies under tight central control. Yet it seems very hard to keep a tight lid on colonies. The huge distances involved make it hard to require central approval for distant decisions, and distant colonists can’t participate as equals in governance without slowing down the whole process dramatically. Worse, allowing just one sustained failure, of some descendants who get grabby, can negate all the other successes. This single failure problem gets worse the more colonies there are, the further apart they spread, and the more advanced technology gets.
I’m kind of skeptical that the desire for absolute control would be strong enough to stamp out any expansionary and exploratory ambitions. I suspect that humans and institutions will converge considerably towards the “making the most of our endowment” stance. As wealth increases, we will learn more about how much value we can create and how much more value is possible than our prosaic imaginations suggest, so an aligned AI will also work towards helping us achieve that sooner or later.
My first gut reaction is skepticism that [a “pretty good” scenario] is a likely or stable state.
I certainly agree that Earthly utopia won’t happen; I just wrote that to illustrate how prosaic values would be disastrous in some circumstances. But here are some similar things that I think are very possible:
Scenarios where some choices that are excellent by prosaic standards unintentionally make great futures unlikely or impossible.
Scenarios where the choices that would tend to promote great futures are very weird by prosaic standards and fail to achieve the level of consensus necessary for adoption.
In retrospect, I should have thought and written more about failure scenarios instead of just risk factors for those scenarios. I expect to revise this post, and failure scenarios would be an important addition. For now, here’s my baseline intuition for a “pretty good” future:
After an intelligence explosion, a state controls aligned superintelligence. Political elites
are not familiar with ideas like long reflection and indirect normativity,
do not understand why such ideas are important,
are constrained from pursuing such goals (perhaps because opposed factions can veto such ideas), or
do not get to decide what to do with superintelligence because the state’s decision-making system is bound by prior decisions about how powerful AI should be used (either directly, by forbidding great uses of AI, or indirectly, by giving decision-making power to groups unlikely to choose a great future).
So the state initially uses AI in prosaic ways and, roughly speaking, thinks of AI in prosaic ways. I don’t have a great model of what happens to our cosmic endowment in this scenario, but since we’re at the point where unwise individuals/institutions are empowered, the following all feel possible:
We optimize for something prosaic
We lock in a choice that disallows intentionally optimizing for anything
We enter a stable state in which we do not choose to optimize for anything
I don’t have much to say about Hanson right now, but I’ll note that a future that involves status-seeking humans making decisions about cosmic-scale policy (for more than a transition period to locking in something great) is probably a failure; success looks more like optimizing the universe.
I suspect that a large fraction of people who seriously start thinking about the longterm future of humanity fall into the camp that you consider “people/institutions that currently want great outcomes”
Historically, sure. But I think that’s due to selection: the people who think about the longterm future are mostly rationalist/EA-aligned. I would be very surprised if a similar fraction of a more representative group had the wisdom/humility/whatever to want a great future, much less the background to understand why we even have a “pretty good” future problem.
If this is true, one might suspect that this will become a much stronger faction
I suspect that humans and institutions will converge considerably towards the “making the most of our endowment” stance.
This would surprise me. I expect poor discourse (in the US, at least) about how to use powerful AI. In particular, I expect:
The discourse will focus on prosaic issues like privacy and the future of work.
People will assume that the universe-scale future looks like “humans flying around in spaceships” and debate what those humans should do, rather than assume it looks like “superintelligent von Neumann probes optimizing for something” and debate what those probes should optimize for (much less recognize that we shouldn’t be thinking about what they should optimize for at all; we should delegate that decision to a better system than current human judgment).
(Also, your comment implies that aligned superintelligence will try to optimize for all humans’ preferences. I would be surprised if this occurs; I expect aligned superintelligence to try to do what its controller says.)
I would be very excited to call to discuss this further. Please PM me if you’re interested.