Note that these METR cost vs. time horizon curves are not Pareto frontiers at all. They just correspond to what you get if you cut off the agent early, so they are probably very underelicited relative to “optimal performance for some cost” (e.g. if an agent doesn’t complete some part of the task until it is nearly out of budget, it would do much worse on this metric at low cost; see gpt-5, for which this is true). My guess is that with better elicitation you get closer to the regime I expect.
At some point, METR might run experiments where they try to elicit performance at lower budgets, so that we can actually get a Pareto frontier.
I agree my abstractions might not be the right ones, and maybe there is a cleaner way to think about this.
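To make the truncation vs. Pareto frontier distinction concrete, here is a toy sketch. Everything in it is made up for illustration: the function names, the two "configs", and the numbers are hypothetical and are not METR data or METR's methodology. It uses the fraction of tasks finished as a simple stand-in for time horizon.

```python
def truncation_curve(finish_costs, budgets):
    """Fraction of tasks finished when one fixed run is cut off at each budget."""
    return [sum(c <= b for c in finish_costs) / len(finish_costs) for b in budgets]


def pareto_frontier(curves):
    """Pointwise best score across several elicitation configs, per budget."""
    return [max(scores) for scores in zip(*curves)]


budgets = [2, 4, 8, 16]

# Hypothetical spend at which each config completes each of four tasks.
default_agent = [14, 15, 15, 16]  # only finishes when nearly out of budget
budget_aware = [3, 4, 7, 30]      # tuned for low budgets; never finishes the last task in time

default_curve = truncation_curve(default_agent, budgets)
aware_curve = truncation_curve(budget_aware, budgets)
frontier = pareto_frontier([default_curve, aware_curve])

for b, d, f in zip(budgets, default_curve, frontier):
    print(f"budget={b:>2}  truncated default={d:.2f}  frontier={f:.2f}")
```

In this toy setup the truncated default run scores zero at every budget below 16, while the frontier over both configs is already at 0.5 by budget 4. That is the sense in which a cut-off curve can badly understate what is achievable at low cost.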
Good point about the METR curves not being Pareto frontiers.