For example, we’ve aligned some AI to winning at chess, and now they’re better than any human.
Chess bots are narrow AI, not general AI, which makes the situation very different. We don’t know how to align an ASI to the goal of winning at chess. The most likely outcome would be some sort of severe misalignment—for example, maybe we think we trained the ASI to win at chess, but what actually maximizes its reward signal is the checkmate position, so it builds a fleet of robots to cut down every tree in the world to build trillions of chess sets and arranges every chess board into a checkmate position. See “A simple case for extreme inner misalignment” for more on why this sort of thing would happen.
Chess bots don’t do that because they have no concept of any world existing outside of the game they’re playing, which would not be the case for ASI.
ETA: That’s also why a lot of people oppose building ASI but still want to build powerful-but-narrow AIs like AlphaFold.