AI systems modeling their own training process is a pretty big deal for modeling what AIs will end up caring about, and how well you can control them (cf. the latest Anthropic paper)
AI systems modeling their own training process is a pretty big deal for modeling what AIs will end up caring about, and how well you can control them (cf. the latest Anthropic paper)