Have you put any thought into whether two of your points can be combined?
The constant hazard rate model: there is a predictable difference between the T_“50% success” horizon and the T_“X% success” horizon.
Costs matter, and costs are plausibly rising fast.
In particular, can one use the constant hazard rate model (plus other information?) to go from data on the cost of achieving 50% success to an extrapolated cost of achieving, e.g., 99% success?
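(For concreteness, a minimal sketch of the horizon conversion this relies on, assuming the constant hazard rate model where success probability on a task of length t is exp(-λt); the 70-minute / 99% numbers are just the illustrative ones used in the example below.)

```python
import math

# Constant hazard rate model: P(success on a task of length t) = exp(-lam * t),
# so if t_50 is the 50% horizon, the X% horizon is t_X = t_50 * ln(1/X) / ln(2).
def horizon(t_50_minutes: float, reliability: float) -> float:
    return t_50_minutes * math.log(1 / reliability) / math.log(2)

print(horizon(70, 0.99))  # ~1.0: a 70-minute 50% horizon implies a ~1-minute 99% horizon
```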
I spent a bit of time thinking about this, but I think there’s a missing ingredient: for example,
If you can do a 70-minute task, at 50% reliability, for $100
Then (per constant hazard rate) you can do a 1-minute task, at 99% reliability, for $X
The difficulty is: what can we say about $X?
Presumably it is upper bounded by $100, the cost of doing a 70-minute task at 50% reliability
Presumably it is lower bounded by “cost of doing a 1-minute task at 50% reliability”
==> basically, as you wrote, would love more data from METR on costs...!
(Details: I had ChatGPT attempt to digitize the cost curve for GPT-5 from METR, and I then generated the upper and lower bounds as described above; a rough version of that bounding logic is sketched below.)
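(A minimal sketch of that bounding logic, with a made-up cost curve standing in for the digitized METR data:)

```python
# Hypothetical cost curve: dollars to run a model on a task of a given length
# (minutes) at ~50% reliability. These numbers are invented for illustration;
# they are not METR's actual data.
cost_at_50pct = {1: 2, 10: 15, 70: 100}

t_50, t_99 = 70, 1  # a 70-minute 50% horizon pairs with a ~1-minute 99% horizon

# Upper bound: the same run that gives 50% on a 70-minute task gives ~99% on a
# 1-minute task (per constant hazard rate), and a shorter task shouldn't cost more.
upper_bound = cost_at_50pct[t_50]   # $100
# Lower bound: 99% reliability on a 1-minute task presumably costs at least as
# much as 50% reliability on a 1-minute task, read off the same curve.
lower_bound = cost_at_50pct[t_99]   # $2
print(lower_bound, upper_bound)
```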
Thanks Basil! That’s an interesting idea. The constant hazard rate model just compares two uses of the same model over different task lengths, so if you use it to work out the 99% time horizon (about 1 minute here), the task should cost roughly 1/70th as much ($1.43). Over time, I think these 99%-horizon tasks should rise in cost in roughly the same way as the 50%-horizon ones, as both horizons are increasing in length in proportion. But estimating how that will change in practice is especially dicey as there is too little data.
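(The arithmetic, assuming cost scales roughly in proportion to task length:)

```python
t_50, t_99 = 70, 1.0   # minutes: 70-min 50% horizon, ~1-min 99% horizon (constant hazard rate)
cost_50 = 100          # dollars for the 70-minute task in Basil's example
cost_99 = cost_50 * t_99 / t_50
print(round(cost_99, 2))   # 1.43
```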
Also, note that Gus Hamilton has written a great essay that takes the survival analysis angle I used in my constant hazard rates piece and extends it to show, pretty convincingly, that the hazard rates are actually decreasing. I explain it in more detail here. One upshot is that it gives a different function for estimating the 99% horizon lengths. He also shows that these are poorly constrained by the data: his model disagrees with METR’s by a factor of 20 on how long they are, with even more disagreement for shorter lengths.
(The whole series of essays has been fantastic.)
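(Purely as an illustration of why the 99% horizon is so poorly constrained, here is a sketch using a Weibull survival curve as one example of a decreasing hazard rate; the shape and scale values are made up, and this is not necessarily the functional form Gus Hamilton fits.)

```python
import math

# Weibull survival curve S(t) = exp(-(t/scale)**shape); the hazard rate is
# decreasing whenever shape < 1. Shape and scale here are invented for
# illustration, not fitted to METR's data or taken from Gus Hamilton's essay.
def horizon(reliability: float, shape: float, scale: float) -> float:
    # Solve exp(-(t/scale)**shape) = reliability for t.
    return scale * math.log(1 / reliability) ** (1 / shape)

# Calibrate both curves to the same 50% horizon of 70 minutes.
scale_exp = 70 / math.log(2)                  # constant hazard rate (shape = 1)
shape = 0.6                                   # hypothetical decreasing-hazard shape
scale_weibull = 70 / math.log(2) ** (1 / shape)

print(horizon(0.99, 1.0, scale_exp))          # ~1.0 minute under a constant hazard rate
print(horizon(0.99, shape, scale_weibull))    # ~0.06 minutes: the 99% horizon moves by over an
                                              # order of magnitude just from the hazard assumption
```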