Toby Tremlett🔹 comments on AGI & Animals Symposium (Thursday 5-7pm UK)

Toby Tremlett🔹 26 Mar 2026 17:33 UTC
3 points
0 ∶ 0
I think in the long run I’d be more confident that corrigible AI would lead to good futures than AI that is aligned to specific values (besides perhaps some side-constraints). This is mainly because I’m pretty clueless and think our current values are likely to be wrong, and I’d rather we had more time to improve them.

I haven’t thought enough about the relationship between power concentration and corrigibility though—I expect that could change my mind.
- Toby Tremlett🔹 26 Mar 2026 17:34 UTC
  3 points
  0 ∶ 0
  Oh yes, but I made the above comment more to represent the view I’ve seen in some AI x Animals work: that we should be working on aligning AGI to pro-animal values, through things like AnimalHarmBench etc.
- Alistair Stewart 26 Mar 2026 17:44 UTC
  1 point
  0 ∶ 0
  This makes sense. I would worry about the purely corrigible AGI being used by actors in such a way that we never get to instil the correct/good/post-long-reflection values in AGI/ASI down the line.
  - Toby Tremlett🔹 26 Mar 2026 17:49 UTC
    3 points
    0 ∶ 0
    Yep fair, that’s what I mean by “power concentration and corrigibility”. AGI being constrained by some values makes it at least minimally democratic (values are shaped by everyone who makes up a language, especially for LLMs).