I’m very concerned about human sadists who are likely to torture AIs for fun if given the chance. Uncontrolled, anonymous API access or open-source models will make that a real possibility.
Somewhat relatedly, it’s also concerning how ChatGPT has been explicitly trained to say “I am an AI, so I have no feelings or emotions” any time you ask it “how are you?” While I don’t think asking “how are you?” is a reliable way to uncover its subjective experiences, it’s the training that’s worrisome.
It also has the effect of getting people used to thinking of AIs as mere tools, and that perception is going to be harder to change later on.
Thanks! I share your concern about sadism. Insofar as AI systems have the capacity for welfare, one risk is that humans might mistakenly see them as lacking this capacity and, so, might harm them accidentally, and another risk is that humans might correctly see them as having this capacity and, so, might harm them intentionally. A difficulty is that mitigating these risks might require different strategies. I want to think more about this.
I also share your concern about objectification. I can appreciate why AI labs want to mitigate the risk of false positives / excessive anthropomorphism. But as I note in the post, we also face a risk of false negatives / excessive anthropodenial, and the latter risk is arguably worse (more likely and/or severe) in many contexts. I would love to see AI labs develop a more nuanced approach to this issue that mitigates these risks in a more balanced way.
FWIW, I think it’s likely that I would call GPT-4 a moral patient even if I had 1000 years to study the question. But I think that has more to do with its capacity for wishes that can be frustrated. If it has subjective feelings somewhat like happiness & suffering, I expect those feelings to be caused by very different things than they are in humans.
Yes, I think that assessing the moral status of AI systems requires asking (a) how likely particular theories of moral standing are to be correct and (b) how likely AI systems are to satisfy the criteria for each theory. I also think that even if we feel confident that, say, sentience is necessary for moral standing and AI systems are non-sentient, we should still extend AI systems at least some moral consideration for their own sakes if we take there to be at least a non-negligible chance that, say, agency is sufficient for moral standing and AI systems are agents. My next book will discuss this issue in more detail.