“Goes well for humans” (i.e., for a very long time) worlds are mostly worlds where AGI is fully aligned, both theoretically and empirically, with a CEV-shaped alignment target, which for me logically requires caring about animal welfare. (I also currently believe those worlds are implausible, because no company seems focused on this.)
I struggle to imagine any deliberative or reflective-preference-oriented process that does not give the right answer to the animal welfare question. If such a process doesn’t care about non-human animals, then either animals are not sentient, or the CEV is misaligned with human interests and some humans will die because they don’t check the right boxes (and sentience isn’t one of those boxes), or morality is weird and it’s actually fine to torture sentient beings (possible but implausible).
There are other worlds where “goes well for humans” means an AGI that is corrigible and aligned with some unaltered human values. In those worlds, I expect animals to take a blow in the short term, and possibly over the very long term if the principal does not care about animal suffering. I also expect humanity to do other morally wrong things it doesn’t suspect are wrong, and to die counterfactually much sooner.