Hmm, I think I would expect different experience curves for the efficiency of running experiments vs producing cognitive labour (with generally smaller efficiency boosts over time for running experiments). Is there any reason to expect them to behave similarly?
(Though I think I agree with the qualitative point that you could get a software-only intelligence explosion even if you can’t do this with human-only research input, which was maybe your main point.)
Agree that I wouldn’t particularly expect the efficiency curves to be the same.
But if phi > 0 for both types of efficiency, then I think this argument will still go through.
To put it in math, there would be two types of AI software technology, one for experimental efficiency and one for cognitive labour efficiency: A_exp and A_cog. The equations are then:
dA_exp/dt = A_exp^phi_exp F(A_exp K_res, A_cog K_inf)
dA_cog/dt = A_cog^phi_cog F(A_exp K_res, A_cog K_inf)
And then I think you’ll find that, even with sigma < 1, it explodes when phi_exp>0 and phi_cog>0.
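A minimal numerical sketch of these dynamics (my own construction, not from the comment above): a CES aggregator with elasticity sigma < 1 and overall degree lambda, with both compute stocks held fixed at 1 to represent the software-only scenario. The specific parameter values and the blow-up threshold are illustrative assumptions.

```python
def F(x, y, sigma=0.5, lam=0.7):
    # CES aggregator of the two research inputs, raised to lambda
    # (overall returns to research input). sigma < 1 => complements.
    rho = 1 - 1 / sigma  # rho < 0 when sigma < 1
    return (0.5 * x**rho + 0.5 * y**rho) ** (lam / rho)

def simulate(phi_exp, phi_cog, steps=20_000, dt=1e-3, cap=1e12):
    # Forward-Euler integration of
    #   dA_exp/dt = A_exp^phi_exp F(A_exp K_res, A_cog K_inf)
    #   dA_cog/dt = A_cog^phi_cog F(A_exp K_res, A_cog K_inf)
    # with compute stocks K_res = K_inf = 1 held fixed.
    A_exp = A_cog = 1.0
    for _ in range(steps):
        f = F(A_exp, A_cog)
        A_exp, A_cog = (A_exp + dt * A_exp**phi_exp * f,
                        A_cog + dt * A_cog**phi_cog * f)
        if A_cog > cap:
            return True  # numerical stand-in for finite-time blow-up
    return False

# With lam = 0.7: phi + lam > 1 in both sectors explodes;
# phi + lam < 1 in both sectors grows only polynomially.
print(simulate(0.5, 0.5), simulate(0.1, 0.1))  # → True False
```

In the symmetric case the CES collapses to F(A, A) = A^lam, so the dynamics reduce to dA/dt = A^(phi + lam), which makes the blow-up threshold easy to check against the closed-form solution.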
I spent a bit of time thinking about this today.
Let’s adopt the notation in your comment and suppose that F(⋅) is the same across research sectors, with a common λ. Let’s also suppose a common σ<1.
Then we get blow-up in $A_{\text{cog}}$ iff

$$
\begin{cases}
\phi_{\text{cog}} + \lambda > 1 & \text{if } \phi_{\text{cog}} \le \phi_{\text{exp}} \\
\max\{\phi_{\text{cog}},\, \phi_{\text{exp}} + \lambda\} > 1 & \text{if } \phi_{\text{cog}} > \phi_{\text{exp}}
\end{cases}
$$
The intuition for this result is that when $\sigma < 1$, you are bottlenecked by your slower-growing sector.
If the slower-growing sector is cognitive labor, then asymptotically $F \propto A_{\text{cog}}$, and we get $\dot{A}_{\text{cog}} \propto A_{\text{cog}}^{\phi_{\text{cog}}} A_{\text{cog}}^{\lambda}$, so we have blow-up iff $\phi_{\text{cog}} + \lambda > 1$.
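Spelling out the blow-up step (a standard ODE fact, my addition): writing $p = \phi_{\text{cog}} + \lambda$, the asymptotic dynamics reduce to a power-law ODE whose solution diverges in finite time exactly when $p > 1$:

```latex
\dot{A} = A^{p}, \quad A(0) = A_0 > 0
\;\Longrightarrow\;
A(t) = \left( A_0^{1-p} - (p-1)\,t \right)^{\frac{1}{1-p}} \qquad (p \neq 1),
```

which diverges at the finite time $t^* = A_0^{1-p}/(p-1)$ when $p > 1$, but grows only polynomially when $p < 1$ (and exponentially at $p = 1$).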
If the slower-growing sector is experimental compute, then there are two cases. If experimental compute is blowing up on its own, then so is cognitive labor, because by assumption cognitive labor is growing faster. If experimental compute is not blowing up on its own, then asymptotically $F \propto A_{\text{exp}}$ and we get $\dot{A}_{\text{cog}} \propto A_{\text{cog}}^{\phi_{\text{cog}}} A_{\text{exp}}^{\lambda}$. Here we get a blow-up iff $\phi_{\text{cog}} > 1$.[1]
In contrast, if $\sigma > 1$ then $F$ is approximately the fastest-growing sector. You get blow-up in both sectors if either sector blows up. Therefore, you get blow-up iff $\max\{\phi_{\text{cog}} + \lambda,\ \phi_{\text{exp}} + \lambda\} > 1$.
So if you accept this framing, complements vs substitutes only matters if some sectors are blowing up but not others. If the returns to research are high enough in all sectors, then we get an intelligence explosion no matter what. This is an update for me, thanks!
I’m only analyzing blow-up conditions here. You could get e.g. double-exponential growth by having $\phi_{\text{cog}} = 1$ and $\phi_{\text{exp}} + \lambda = 1$.
Nice!
I think that condition is equivalent to saying that A_cog explodes iff either
- phi_cog + lambda > 1 and phi_exp + lambda > 1, or
- phi_cog > 1,

where the second possibility is the unrealistic one where it could explode with just human input.
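The claimed equivalence can be sanity-checked mechanically. A quick brute-force sweep (my own check, over an illustrative grid with λ ≥ 0; quarter steps are exact in binary floating point, so the comparisons are clean):

```python
from itertools import product

def cases_form(phi_cog, phi_exp, lam):
    # Blow-up condition from the case analysis above.
    if phi_cog <= phi_exp:
        return phi_cog + lam > 1
    return max(phi_cog, phi_exp + lam) > 1

def disjunctive_form(phi_cog, phi_exp, lam):
    # Proposed equivalent: both "assisted" conditions hold,
    # or cognitive labor explodes on its own.
    return (phi_cog + lam > 1 and phi_exp + lam > 1) or phi_cog > 1

grid = [i * 0.25 for i in range(9)]  # 0.0, 0.25, ..., 2.0
assert all(
    cases_form(pc, pe, l) == disjunctive_form(pc, pe, l)
    for pc, pe, l in product(grid, grid, grid)
)
print("conditions agree on the whole grid")
```

The equivalence relies on λ ≥ 0: in the case phi_cog ≤ phi_exp, phi_cog + lambda > 1 automatically implies phi_exp + lambda > 1, which is why the case split collapses into the simpler disjunction.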
If your algorithms get more efficient over time at both small and large scales, and experiments test incremental improvements to architecture or data, then experiments should get cheaper to run in proportion to the algorithmic efficiency of cognitive labor. I think this is a better first approximation than assuming experiment costs are constant, and it might hold in practice, especially when you can target small-scale algorithmic improvements.
OK I see the model there.
I guess it’s not clear to me if that should hold if I think that most experiment compute will be ~training, and most cognitive labour compute will be ~inference?
However, over time maybe more experiment compute will be ~inference, as it shifts more to being about producing data rather than testing architectures? That could push back towards this being a reasonable assumption. (Definitely don’t feel like I have a clear picture of the dynamics here, though.)