There’s a longer discussion of that oft-discussed METR time horizons graph that warrants a post of its own.
My problem with how people interpret the graph is that they slip quickly and wordlessly from step to step in a chain of inferences that I don’t think can be justified. The chain looks something like:
AI model performance on a set of very limited benchmark tasks → AI model performance on software engineering in general → AI model performance on everything humans do
I don’t think these inferences are justifiable.