This is a good point; we agree, thanks! Note that you need to assume that the algorithmic progress that gives you more effective inference compute is the same progress that gives you more effective research compute. This seems pretty reasonable, but it is worth a discussion.
Although note that this argument works only with the CES-in-compute formulation. For the CES in frontier experiments, you would have the ratio $\frac{A\,K_{\text{res}}}{A\,K_{\text{train}}}$, so the $A$ cancels out.[1]
You might be able to avoid this by adding the $A$'s in a less naive fashion. You don't have to train larger models if you don't want to. So perhaps you can freeze the frontier, and then you get $\frac{A\,K_{\text{res}}}{A_{\text{frozen}}\,K_{\text{train}}}$? I need to think more about this point.
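To make the cancellation concrete, here is a sketch using the notation of the comment above, where $A$ is the common algorithmic-progress multiplier on both compute stocks and $A_{\text{frozen}}$ is the (hypothetical) multiplier fixed at the moment you freeze the frontier:

```latex
% CES in frontier experiments: research compute is measured relative to
% frontier training compute, so the shared multiplier A cancels:
\frac{A\,K_{\text{res}}}{A\,K_{\text{train}}} = \frac{K_{\text{res}}}{K_{\text{train}}}

% Freezing the frontier: fixed-capability models keep the multiplier from
% freeze time in the denominator, so progress no longer cancels:
\frac{A\,K_{\text{res}}}{A_{\text{frozen}}\,K_{\text{train}}}
  = \frac{A}{A_{\text{frozen}}}\cdot\frac{K_{\text{res}}}{K_{\text{train}}},
\qquad \text{which grows as } A \text{ grows.}
```

Under this reading, freezing the frontier restores a channel by which algorithmic progress feeds back into the research input, at least while fixed-capability models keep getting cheaper to train.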
Yep, as you say in your footnote, you can choose to freeze the frontier, so you train models of a fixed capability using less and less compute (at least for a while).
Also, updating this would change all of the intelligence explosion conditions, not just the one for $\sigma < 1$.