Toby Tremlett🔹 comments on AGI & Animals Symposium (Thursday 5-7pm UK)

Toby Tremlett🔹 26 Mar 2026 17:02 UTC
19 points
0 ∶ 0
Nice little Claude summary of the debate so far, which might help identify the missing points:
The debate centres on whether human-aligned AGI would automatically benefit animals, or whether animal-specific interventions are needed.
The pessimistic case is well-represented. Jim Buhler argues we have no good reason to assume AI safety work helps animals — saving humans preserves factory farming, and the claim that empowered humans would improve wild animal welfare rests on untenable assumptions. Simon Eckerström Liedholm (Wild Animal Initiative) estimates only ~30% probability of good animal outcomes conditional on good human outcomes, largely because the most likely alignment path locks in current human values, which permit enormous animal suffering. Hannah McKay (Rethink Priorities) argues that cultivated meat won’t be automatically solved by AGI — regulatory, political, and consumer barriers form a sequential chain where the combined probability of resolution is low.
The bridge position comes from Aidan Kankyoku, who thinks it probably (~70%) goes well for animals but that this isn’t sufficient certainty to neglect animal-specific alignment. He argues animal welfare is now functionally a subsidiary of the “Make AI Go Well” movement.
MichaelDickens contributed three posts: a taxonomy of alignment research by animal-friendliness, a cost-effectiveness model finding alignment-to-animals only marginally more cost-effective than general alignment, and a meditation on how current alignment paradigms (unlike CEV) give him roughly 50/50 odds on animal outcomes.
The discussion thread (~58 comments) skews disagree, though with real spread. The most common argument for disagreement is historical precedent: technological and economic progress has been bad for animals so far, with factory farming as the central exhibit. Value lock-in is the second recurring worry — that alignment to current human values would freeze in a set of preferences that are largely indifferent to animal suffering (SimonM_, Babel, Dylan Richardson, Tristan Katz). Several voters also flag the risk of spreading wild animal suffering to new planets. On the agree side, the strongest argument is economic: post-scarcity conditions erode factory farming’s viability because alternatives become cheaper (OscarD, Erich_Grunewald, Brad West, JDBauman). A few voters (Ronak Mehta at 100% Agree, Ligeia, Artūrs Kaņepājs) argue that a genuinely superintelligent system would recognise animal sentience as morally relevant. A notable cluster sits at or near 0% Agree not because they’re confident things go badly, but because they think the question is unanswerable given the number of branching futures (NickLaing, Seth Ariel Green, Jim Buhler). Peter Wildeford offers a useful split: on a causal reading (alignment mechanisms also help animals) he’s pessimistic; on an evidential reading (conditional on good human outcomes, what world are we in?) he’s somewhat more optimistic.
- Toby Tremlett🔹 26 Mar 2026 17:11 UTC
  3 points
  0 ∶ 0
  Parent
  For example, I think a crux might be the tractability of animal-specific alignment work: can we align AI to specific values, or (just) make it corrigible to our preferences and commands? I don’t know, but this would massively affect my estimate of the tractability here.
  - Jo_🔸 26 Mar 2026 17:30 UTC
    3 points
    0 ∶ 0
    Parent
    This is definitely a hard debate to disentangle, because I would personally reject the question of alignment as a crux. For now, I strongly believe that the total welfare of animals has been entirely uncorrelated with our moral intentions toward animals. Total welfare has mostly changed because of land use, due to human interests.
    I agree that in AGI-transformed futures that go well for humans, human desires may start playing a larger role. However, I expect that whether we mean well for animals (or don’t care much about them) will not be cleanly correlated with outcomes for them.
    There are worlds where we mean well for a large part of animals, stop intentionally killing them, and help certain wild animals. But that world could very well end up having a large population of animals living bad lives.
    On the other hand, out of apathy and even negative feeling toward wild animals, we may decide to limit their spread and use resources in a way that optimizes for human flourishing, over animal abundance. That world could end up being much better for animal welfare.
    Maybe some extreme scenarios tip the scales, for example if we bred incredibly happy genetically modified animals due to positive feelings toward them. But I’m not confident about putting any weight on such utilitarian-leaning scenarios when assessing post-AGI futures, because part of the reason human moral intentions are not correlated with total animal welfare is that humans are not scope-sensitive utilitarians.
    - Alistair Stewart 26 Mar 2026 17:48 UTC
      1 point
      0 ∶ 0
      Parent
      What kinds of values will humans have post-AGI, if AGI goes well for us? We don’t need to be scope-sensitive utilitarians to want to adopt even radical preferences like ending animal exploitation and solving WAS, no? (Most humans don’t like factory farming or the idea of cute animals being eaten alive.)
      - Jo_🔸 26 Mar 2026 18:10 UTC
        1 point
        0 ∶ 0
        Parent
        Solving WAS intuitively seems too niche for people to deliberately change their mind on that, but I could be wrong. After all, the Bible says that the Lion will lie down with the lamb and eat straw like the ox, so it could be that human preferences tend to come back to the idea that animal suffering can be bad even when it doesn’t depend on human actions.
        Alistair Stewart 26 Mar 2026 18:20 UTC
        1 point
        0 ∶ 0
        Parent
        I guess the causal mechanism I’m thinking of here is:
        1. Most humans feel at least a little sad when they see a baby gazelle being eaten alive by hyenas
        2. AGI is so powerful that humans can order it to do things like “stop baby gazelles being eaten alive whilst retaining the beauty of nature and the complexity of ecosystems” and then it’ll just go away and do it somehow
        Maybe this is foolish and naive on my part! And maybe I’m wrong to think our moral preferences/intuitions will be so robust to the disruption of AGI, even if AGI goes well for us.
  - Toby Tremlett🔹 26 Mar 2026 17:54 UTC
    2 points
    0 ∶ 0
    Parent
    PS: it looks like Michael Dickens just posted on this.
  - Alistair Stewart 26 Mar 2026 17:29 UTC
    2 points
    0 ∶ 0
    Parent
    Toby, would you be more optimistic for animals if we can align AGI to specific values rather than just making it corrigible to humans’ preferences and commands?
    My impression is that pro-animal views are (dramatically?) overrepresented at Anthropic relative to the rest of society. If Anthropic gets to AGI first and instils/locks in pro-animal values in/to that AGI, that seems better for animals than if whoever gets to AGI first just makes it purely corrigible, because most humans who operate the purely corrigible AGI won’t be as pro-animal.
    - Toby Tremlett🔹 26 Mar 2026 17:33 UTC
      3 points
      0 ∶ 0
      Parent
      I think in the long run I’d be more confident that corrigible AI would lead to good futures than AI that is aligned to specific values (besides perhaps some side-constraints). This is mainly because I’m pretty clueless and think our current values are likely to be wrong, and I’d rather we had more time to improve them.
      
      I haven’t thought enough about the relationship between power concentration and corrigibility though—I expect that could change my mind.
      - Toby Tremlett🔹 26 Mar 2026 17:34 UTC
        3 points
        0 ∶ 0
        Parent
        Oh yes, but I made the above comment more to represent the view that I’ve seen in some AI x Animals work that we should be working on aligning AGI to pro-animal values, through things like AnimalHarmBench etc.
      - Alistair Stewart 26 Mar 2026 17:44 UTC
        1 point
        0 ∶ 0
        Parent
        This makes sense. I would worry about the purely corrigible AGI being used by actors in such a way that we never get to instil the correct/good/post-long-reflection values in AGI/ASI down the line.
        Toby Tremlett🔹 26 Mar 2026 17:49 UTC
        3 points
        0 ∶ 0
        Parent
        Yep fair, that’s what I mean by “power concentration and corrigibility”. AGI being constrained by some values makes it at least minimally democratic (values are shaped by everyone who makes up a language, especially for LLMs).