From the perspective of non-human animals, humanity looks a lot like an unaligned superintelligence. We closely resemble the “paperclip maximizer” thought experiment, where the “paperclips” are narrow human goals. Over millennia, we’ve become incredibly good at optimizing for those goals, but in the process we systematically exclude other sentient beings from the moral circle and override their most basic interests for benefits that are often trivial.
Given this reality, without a fundamental shift in our ethics, superintelligence is more likely to scale our existing biases than to correct them. A more powerful optimizer does not automatically become more benevolent; it just becomes more effective at pursuing the same goals. And higher intelligence and capability do not by themselves fix moral blind spots.
This is precisely the insight that drives concern about AI alignment. We do not assume that more capable and intelligent AI systems will automatically act in ways that are good for us. (Even though they do not have anywhere near as bad a track record as humanity does toward animals.) If “automatic benefit” were a real thing, AI alignment would be a niche concern rather than a central one. We would just accelerate progress and trust that everything else would sort itself out. But we do not believe that, and for good reason.
If we take this insight seriously, we should also apply it symmetrically. The core alignment problem may not just be between humans and AI, but between humans and the rest of sentient life. And it would be dangerously Panglossian to assume that AGI will automatically solve animal suffering. Based on humanity’s track record of causing massive harm despite our increasing capabilities, it is irresponsible to default to optimism about AGI “naturally” improving things without a justification that matches what is at stake.
Extra thought 1:
And the thing that worries me most about human alignment? Permanent lock-in. If we reach advanced AI systems without deliberately including concern for all sentient beings, we risk locking in a future where today’s exclusions last for a very long time. Once such systems are embedded in infrastructure, institutions, and potentially self-improving AI, their underlying value structures may become extremely difficult to change.
A historical analogy makes this clear. Think about the Industrial Revolution, a massive event that empowered humanity. If it had happened in a society that cared about animal welfare (maybe a vegetarian country?), trillions of farmed animals could have been spared extreme suffering: cramped spaces, painful procedures, and deaths for basically trivial human gain. Early ethical choices really do shape the fate of huge numbers of sentient beings.
Moreover, the stakes are far greater this time. Humanity will remain a tiny minority among sentient beings, yet one that would hold disproportionate power over a vastly larger number of them in the future. So misalignments now could ripple across astronomical numbers of individuals, turning a large-scale moral failure into a potentially permanent, cosmic-scale one.
Extra thought 2:
It’s sadly all too common for us to push animals to the very bottom of the priority list, thinking, “Once we fix all our problems, we’ll start worrying about extreme animal suffering.” So I’m really glad to see this discussion happening!