I think that’s right, but modern AI benchmarks seem to have much the same issue. A human with a modern Claude instance might be able to write code 100x faster than without, but probably less than 2x as fast at choosing a birthday present for a friend.
Ideally you want to integrate over… something to do with the set of all tasks. But it’s hard to say what that something would be, let alone how you’re going to meaningfully integrate it.
I think that’s right, but modern AI benchmarks seem to have much the same issue. A human with a modern Claude instance might be able to write code 100x faster than without, but probably less than 2x as fast at choosing a birthday present for a friend.
Ideally you want to integrate over… something to do with the set of all tasks. But it’s hard to say what that something would be, let alone how you’re going to meaningfully integrate it.