Nice post. The argument for simultaneity (most models train at the same time, and then are evaled at the same time, and then released at the same time) seems ultimately to be based on the assumption that the training cap grows in discrete amounts (say, a factor of 3) each year.
> * We want to prevent model training runs of size N+1 until alignment researchers have had time to study and build evals based on models of size N.
> * For-profit companies will probably all want to start going as soon as they can once the training cap is lifted.
> * So there will naturally be lots of simultaneous runs...
But why not smoothly raise the size limit? (Or approximate that by raising it in small steps every week or month?) The key feature of this proposal, as I see it, is that there is a fixed interval for training and eval, to prevent incentives to rush. But that doesn’t require simultaneity.
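As a rough illustration of the "small steps" version, here is a minimal sketch (assuming the cap grows by a factor of 3 per year, as in the example above, and that steps happen weekly, which is just one possible cadence): approximating smooth growth this way works out to raising the limit by roughly 2% each week.

```python
# Sketch: approximating a smooth 3x/year cap increase with discrete weekly steps.
# The factor of 3 is the illustrative annual growth from the comment above;
# the weekly cadence is an assumption for the sake of the example.
ANNUAL_GROWTH = 3.0
STEPS_PER_YEAR = 52  # weekly updates

step_factor = ANNUAL_GROWTH ** (1 / STEPS_PER_YEAR)
print(f"Per-step cap increase: {step_factor:.4f}x (~{(step_factor - 1) * 100:.1f}% per week)")

# Cap trajectory over one year, starting from a normalized cap of 1.0
cap = 1.0
for week in range(STEPS_PER_YEAR):
    cap *= step_factor
print(f"Cap after one year: {cap:.2f}x the starting cap")  # ~3.00
```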
Interesting suggestion! Continuous or pseudo-continuous threshold raising isn’t something I considered. Here are some quick thoughts:
* Continuous scaling could make eval validity easier, because the jump between eval-train (n−1) and eval-deploy (n) is smaller.
* Continuous scaling encourages training to be done quickly, because you want to get your model launched before it is outdated.
* Continuous scaling means you give up on the idea of models being evaluated side by side.