When I think of values I think of interpretation #2, and I don’t think you prove that P4 is untrue under that interpretation. The idea is that humans are both a) constrained and b) generally inclined to follow some set of rules. An AI would be neither constrained nor necessarily inclined to follow these rules.

> Consider our most basic laws: do not murder, do not steal, do not physically assault another person. These seem like very natural ideas that could be stumbled upon by a large set of civilizations, even given wildly varying individual and cultural values between them.

Virtually all historical and present atrocities are framed in terms of determining who is a person and who is not. Why would AIs see us as having moral personhood?
> When I think of values I think of interpretation #2, and I don’t think you prove that P4 is untrue under that interpretation. The idea is that humans are both a) constrained and b) generally inclined to follow some set of rules. An AI would be neither constrained nor necessarily inclined to follow these rules.
P4 is about whether human values are an extremely narrow target, not about whether AIs will necessarily be inclined to follow them, or necessarily be constrained by them. I agree it is logically possible for AIs to exist who would try to murder humans; indeed, there are already humans who try to do that to others. The primary question is instead how narrow a target the value “don’t murder” or “don’t steal” is, and whether we need to put in exceptional effort in order to hit these targets.

Among humans, the target does not seem very narrow: despite our greatly varying individual objectives, nearly everyone hits it. In my opinion, that fact is a hint that the target set by our basic social rules is not narrow at all.
> Virtually all historical and present atrocities are framed in terms of determining who is a person and who is not. Why would AIs see us as having moral personhood?
Here again I would say the question is more about whether thinking that humans have relevant personhood is an extremely narrow target, not about whether AIs will necessarily see us as persons. Maybe they will see us as persons, and maybe they won’t. But the idea that they would doesn’t seem very unnatural. For one, if AIs are created within something like our current legal system, the concept of legal personhood will already extend to humans by default, and it seems pretty natural for future people to inherit legal concepts from the past. All I’m really arguing here is that this isn’t an extremely narrow target to hit, not that it must happen by necessity.
I guess “narrow target” is just an underspecified part of your argument then, because I don’t know what it’s meant to capture if not “in most plausible scenarios, AI doesn’t follow the same set of rules as humans”.
Can you outline the case for thinking that “in most plausible scenarios, AI doesn’t follow the same set of rules as humans”? To clarify, by “same set of rules” here I’m imagining basic legal rules: do not murder, do not steal etc. I’m not making a claim that specific legal statutes will persist over time.
It seems to me both that:

1. To the extent that AIs are our descendants, they should inherit our legal system, legal principles, and legal concepts, similar to how e.g. the United States inherited legal principles from the United Kingdom. We should certainly expect our legal system to change over time as our institutions adapt to technological change. But, absent a compelling reason otherwise, it seems wrong to think that “do not murder a human” will go out the window in “most plausible scenarios”.
2. Our basic legal rules seem pretty natural, rather than being highly contingent. It’s easy to imagine plenty of alien cultures stumbling upon the idea of property rights, and implementing the rule “do not steal from another legal person”.
My point is that AI could plausibly have rules for interacting with other “persons”, and those rules could look much like ours, but that we will not be “persons” under their code. Consider how “do not murder” has never applied to animals.
If AIs treat us like we treat animals then the fact that they have “values” will not be very helpful to us.
I think AIs will be trained on our data and integrated into our culture, having been deliberately designed to fill human-shaped holes in our economy by automating labor. This means they’ll probably inherit our social concepts, in addition to most of our other concepts about the physical world. That situation seems disanalogous, in many ways, to how humans interact with animals. Animals cannot even use language.
Anyway, even the framing you have given seems like a partial concession towards my original point. A rejection of premise 4 is not equivalent to the claim that AIs will automatically follow our legal norms. The premise was instead about whether “human values” are an extremely narrow target, in the sense of being a contingent set of values that is very hard to replicate in other circumstances, rather than a natural one.
If the right analogy for how AIs relate to human values is how humans relate to animals, then I’ll point out that many existing humans already find the idea of caring about animals quite natural, even if most ultimately decide not to take the idea very far. Compare the concept of “caring about animals” to “caring about paperclip maximization”: we have robust examples of people actually doing the first, but hardly any examples of the second. That is because caring about paperclip maximization is an unnatural and arbitrary thing to care about, relative to how most people conceptualize the world.
Again, I’m not saying AIs will necessarily care about human values. That was never the claim. The entire question was about whether human values are an “extremely narrow target”. And given the second interpretation of human values in my original comment, I think the original thesis has held up fine.