I am not aware of modeling here, but I have thought about this a bit. Besides what you mention, some other ways I think this story may not pan out (very speculative):
1. At the critical time, the cost of compute for automated researchers may be so high that it's actually not cost-effective to buy labor this way. This would mainly be because many people want to use the best hardware for AI training or other productive work, and this demand overwhelms suppliers and prices skyrocket. This resembles the scenario where labs and governments pay a lot more, except that what they're buying is not altruistically-motivated research. Because autonomous labor is so expensive, it isn't a much better deal than 2023 human labor.
2. A similar problem is that there may be no market for buying autonomous labor because somebody restricts it. Perhaps a government implements compute controls, including on inference, to slow AI progress (because it thinks rapid progress would lead to catastrophe from misalignment). Perhaps the lab that develops the first model capable of autonomous research restricts who can use it. To spell this out: say GPT-6 is capable of massively accelerating research; OpenAI might then make it available only to alignment researchers for 3 months, or alternatively only to cancer researchers. In the first case, autonomous alignment research is probably relatively cheap (I'm assuming OpenAI subsidizes it, though this may not be a good assumption). In the second case, you can't get useful alignment research with your money because you're not allowed to.
3. The intellectual labor we can get out of AI systems at the critical time may be bottlenecked by human labor (i.e., humans are needed to review the output of AI debates, give instructions to autonomous software engineers, or construct high-quality datasets). In this situation, you can't buy much autonomous labor with your money because autonomous labor isn't the limiting factor on progress. This is pretty much the state of things in 2023: AI systems help speed up human researchers, but the compute cost of their doing so is still far below the human costs, and you probably didn't need to save significant money 5 years ago to make this happen.
My current thinking is that there's a >20% chance that EA-oriented funders should be saving significant money to spend on compute for autonomous researchers, and this is an important question for them to gain clarity on. I want to point out that there is probably a partial-automation phase (like point 3 above) before a full-automation phase. The partial-automation phase offers less opportunity to usefully spend money on compute (though plausibly still in the tens of millions of dollars), but our actions are more likely to matter then. After that comes the full-automation phase, where money can be scalably spent to, e.g., differentially speed up alignment over AI capabilities research by hundreds of millions of dollars, but there's a decent chance our actions don't matter at that point.
As you mention, perhaps our actions don't matter then because humans don't control the future. I would emphasize that if we have fully autonomous research happening, with no humans in the loop, without already having good alignment of those systems, it's highly likely that we get disempowered. That is, it might not make sense to aim to do alignment research at that point, because either the crucial alignment work was already done or we lose. Conditional on having aligned systems at that point, having saved money to spend on altruistically-motivated cognitive work probably isn't very important, because economic growth takes off really fast and there's plenty of money to spend on non-alignment altruistic causes. On the other hand, at that point it's the last train on its way to the dragon, and it sure would be sad not to have money saved to buy those bed-nets.