My own thinking is that war between AIs and humans could happen in many ways. One simple (easy to understand) way is that agents will generally refuse a settlement worse than what they think they could obtain on their own (by going to war), so human irrationality could cause a war when e.g. the AI faction thinks it will win with 99% probability, and humans think they could win with 50% probability, so each side demands more of the lightcone (or resources in general) than the other side is willing to grant.
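The inconsistent-beliefs condition here can be made concrete with a toy bargaining model (my own illustration, not from the comment above; the probabilities are the ones in the paragraph, and the 20% war cost is an assumed parameter). Normalize the stakes to 1; each side rejects any split below its own expected war payoff, so a peaceful settlement exists only if the two demands sum to at most 1:

```python
# Toy model: war destroys a fraction `cost` of the total value (normalized to 1).
# Each side demands at least its subjective expected payoff from fighting, so a
# mutually acceptable split exists iff the two demands sum to at most 1.

def bargaining_range_exists(p_a: float, p_b: float, cost: float) -> bool:
    """p_a, p_b: each side's subjective probability of winning a war.
    cost: fraction of total value destroyed by fighting (assumed here: 0.2)."""
    demand_a = p_a * (1 - cost)  # side A's expected payoff from going to war
    demand_b = p_b * (1 - cost)  # side B's expected payoff from going to war
    return demand_a + demand_b <= 1  # True iff some split satisfies both sides

# Consistent beliefs (probabilities sum to 1): any positive war cost
# leaves room for a deal both sides prefer to fighting.
print(bargaining_range_exists(0.99, 0.01, cost=0.2))  # True (0.792 + 0.008 <= 1)

# Mutual overconfidence as in the example: 99% + 50% > 100%, and a 20%
# war cost isn't enough to close the gap, so no settlement is acceptable.
print(bargaining_range_exists(0.99, 0.50, cost=0.2))  # False (0.792 + 0.4 > 1)
```

With consistent beliefs the destroyed value `cost` is exactly the surplus that makes peace mutually attractive; overconfidence on both sides can eat that surplus and empty the bargaining range.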
This generally makes sense to me. I also think human irrationality could prompt a war with AIs. I don’t disagree with the claim insofar as you’re claiming that such a war is merely plausible (say >10% chance), rather than a default outcome. (Although to be clear, I don’t think such a war would likely cut cleanly along human vs. AI lines.)
On the other hand, humans are currently already irrational and yet human vs. human wars are not the default (they happen frequently, but at any given time on Earth, the vast majority of humans are not in a warzone or fighting in an active war). It’s not clear to me why human vs. AI wars would be more likely to occur than human vs. human wars, if by assumption the main difference here is that one side is more rational.
In other words, if we’re moving from a situation of irrational parties vs. other irrational parties to irrational parties vs. rational parties, I’m not sure why we’d expect this change to make things more warlike and less peaceful as a result. You mention one potential reason:
Also, given that humans often do (or did) go to war with each other, our shared values (i.e. the extent to which we do have empathy/altruism for others) must contribute to the current relative peace in some way.
I don’t think this follows. Humans presumably also had empathy in e.g. 1500, back when war was more common, so how could it explain our current relative peace?
Perhaps you mean that cultural changes caused our present time period to be relatively peaceful. But I’m not sure about that; or at least, the claim should probably be made more specific. There are many things about the environment that have changed since our relatively more warlike ancestors, and (from my current perspective) I think it’s plausible that any one of them could have been the reason for our current relative peace. That is, I don’t see a good reason to single out human values or empathy as the main cause in itself.
For example, humans are now a lot richer per capita, which might mean that people have “more to lose” when going to war, and thus are less likely to engage in it. We’re also a more globalized culture, and our economic system relies more on long-distance trade than it did in the past, making war more costly. We’re also older, in the sense that the median age is higher (and old people are less likely to commit violence), and women, who are perhaps less likely to support hawkish politicians, got the right to vote.
To be clear, I don’t put much confidence in any of these explanations. As of now, I’m very uncertain about why the 21st century seems relatively peaceful compared to the distant past. However, I do think that:
None of the explanations I’ve given above seem well-described as “our values/empathy” made us less warlike. And to the extent our values changed, I expect that was probably downstream of more fundamental changes, like economic growth and globalization, rather than being an exogenous change that was independent of these effects.
To the extent that changing human nature explains our current relatively peaceful era, this position seems to require that you believe human nature is fundamentally quite plastic and can be warped over time pretty easily due to cultural changes. If that’s true, human nature is ultimately quite variable, perhaps more similar to AI than you might have otherwise thought (as both are presumably pushed around easily by training data).
It’s not clear to me why human vs. AI wars would be more likely to occur than human vs. human wars, if by assumption the main difference here is that one side is more rational.
We have more empirical evidence that we can look at when it comes to human-human wars, making it easier to have well-calibrated beliefs about chances of winning. When it comes to human-AI wars, we’re more likely to have wildly irrational beliefs.
This is just one reason war could occur, though. Perhaps a more likely reason is that there won’t be a way to maintain the peace that both sides can be convinced will work, and that is sufficiently cheap that its cost doesn’t eat up all of the gains from avoiding war. For example, how would the human faction know that if it agrees to peace, the AI faction won’t fully dispossess the humans at some future date when it’s even more powerful? Even if AIs are able to come up with some workable mechanisms, how would the humans know that they’re not just a trick?
Without credible assurances (which seem hard to come by), I think that if humans do agree to peace, the most likely outcome is that they get dispossessed in the not-too-distant future, either gradually (for example, getting scammed/persuaded/blackmailed/stolen from in various ways) or all at once. I think society as a whole won’t have a strong incentive to protect humans, because they’ll be almost pure consumers (not producing much relative to what they consume), and such classes of people have often been killed or dispossessed in human history (e.g., landlords after communist takeovers).
I don’t think this follows. Humans presumably also had empathy in e.g. 1500, back when war was more common, so how could it explain our current relative peace?
I mainly mean that without empathy/altruism, we’d probably have even more wars, both now and back then.
To the extent that changing human nature explains our current relatively peaceful era, this position seems to require that you believe human nature is fundamentally quite plastic and can be warped over time pretty easily due to cultural changes.
Well, yes, I’m also pretty scared of this. See this post where I talked about something similar. I guess overall I’m still inclined to push for a future where “AI alignment” and “human safety” are both solved, instead of settling for one in which neither is (which I’m tempted to summarize your position as, but I’m not sure if I’m being fair).
I guess overall I’m still inclined to push for a future where “AI alignment” and “human safety” are both solved, instead of settling for one in which neither is (which I’m tempted to summarize your position as, but I’m not sure if I’m being fair)
For what it’s worth, I’d loosely summarize my position on this issue as being that I mainly think of AI as a general vehicle for accelerating technological and economic growth, along with accelerating things downstream of technology and growth, such as cultural change. And I’m skeptical we could ever fully “solve alignment” in the ambitious sense you seem to be imagining.
In this frame, it could be good to slow down AI if your goal is to delay large changes to the world. There are plausible scenarios in which this could make sense. Perhaps most significantly, one could be a cultural conservative and think that cultural change is generally bad in expectation, and thus more change is bad even if it yields higher aggregate prosperity sooner in time (though I’m not claiming this is your position).
By contrast, I think cultural change can be bad, but I don’t see much reason to delay it if it’s inevitable. And the case against delaying AI seems even stronger here if you care about preserving (something like) the lives and values of people who currently exist, as AI offers the best chance of extending our lifespans, and “putting us in the driver’s seat” more generally by allowing us to actually be there during AGI development.
If future humans were in the driver’s seat instead, but with slightly more control over the process, I wouldn’t necessarily see that as being significantly better in expectation compared to my favored alternative, including over the very long run (according to my values).
(And as a side note, I also care about influencing human values, or what you might term “human safety”, but I generally see this as orthogonal to this specific discussion.)
If future humans were in the driver’s seat instead, but with slightly more control over the process
Why only “slightly” more control? It’s surprising to see you say this without giving any reasons or linking to some arguments, as this degree of alignment difficulty seems like a very unusual position that I’ve never seen anyone argue for before.
I’m a bit surprised you haven’t seen anyone make this argument before. To be clear, I wrote the comment last night on a mobile device, and it was intended to be a brief summary of my position, which perhaps explains why I didn’t link to anything or elaborate on that specific question. I’m not sure I want to outline my justifications for my view right now, but my general impression is that civilization has never had much central control over cultural values, so it’s unsurprising if this situation persists into the future, including with AI. Even if we align AIs, cultural and evolutionary forces can nonetheless push our values far from where they are now. Does that brief explanation provide enough of a pointer to what I’m saying for you to be ~satisfied? I know I haven’t said much here, but I kinda doubt my view on this issue is so rare that you’ve literally never seen someone present a case for it.
Where the main counterargument is that now the groups in power can be immortal and digital minds will be possible.
See also: AGI and Lock-in
I have some objections to the idea that groups will be “immortal” in the future, in the sense of never changing, dying, or rotting, and persisting over time in a roughly unchanged form, exerting consistent levels of power over a very long time period. To be clear, I do think AGI can make some forms of value lock-in more likely, but I want to distinguish a few different claims:
(1) is a future value lock-in likely to occur at some point, especially not long after human labor has become ~obsolete?
(2) is lock-in more likely if we perform, say, a century more of technical AI alignment research, before proceeding forward?
(3) is it good to make lock-in more likely by, say, delaying AI by 100 years to do more technical alignment research, before proceeding forward? (i.e., will it be good or bad to do this type of thing?)
My quick and loose current answers to these questions are as follows:
(1) This seems plausible but unlikely to me in a strong form. Some forms of lock-in seem likely; I’m more skeptical of the more radical scenarios people have talked about.
(2) I suspect lock-in would become more likely in this case, but the marginal effect of more research would likely be pretty small.
(3) I am pretty uncertain about this question, but I lean towards being against deliberately aiming for this type of lock-in. I am inclined to this view for a number of reasons, but one reason is that this policy seems to make it more likely that we restrict innovation and experience system rot on a large scale, causing the future to be much bleaker than it otherwise could be. See also Robin Hanson’s post on world government rot.