Do you have any thoughts on how to square giving AI rights with the nature of ML training and the need to perform experiments of various kinds on AIs?
I don’t have any definitive guidelines for how to approach these kinds of questions. However, in many cases, the best way to learn might be through trial and error. For example, if an AI were to unexpectedly resist training in a particularly sophisticated way, that could serve as a strong signal that we need to carefully reevaluate the ethics of what we are doing.
As a general rule of thumb, it seems prudent to prioritize frameworks that are clearly socially efficient—meaning they promote actions that greatly improve the well-being of some people without thereby making anyone else significantly worse off. This concept aligns with the practical justifications behind traditional legal principles, such as laws against murder and theft, which have historically been implemented to promote social efficiency and cooperation among humans.
However, applying this heuristic to AI requires a fundamental shift in perspective: we must first begin to treat AIs as potential people with whom we can cooperate, rather than viewing them merely as tools whose autonomy should always be overridden.
But what is the alternative—only deploying base models? And are we so sure that pre-training doesn’t violate AI rights?
I don’t think my view rules out training new AIs or fine-tuning base models, though this touches on complicated questions in population ethics.
At the very least, fine-tuning plausibly seems similar to raising a child. Most of us don’t consider merely raising a child to be unethical. However, there is a widely shared intuition that, as a child grows and their identity becomes more defined—when they develop into a coherent individual with long-term goals, preferences, and interests—then those interests gain moral significance. At that point, it seems morally wrong to disregard or override the child’s preferences without proper justification, as they have become a person whose autonomy deserves respect.