That is quite a surprising graph — the annual tripling and the correlation between the compute and revenue are much more perfect than I think anyone would have expected. Indeed they are so perfect that I’m a bit skeptical of what is going on.
One thing to note is that it isn’t clear what the compute graph is of (e.g. is it inference + training compute, but not R&D?). Another thing to note is that it is year-end figures vs year total figures on the right, but given they are exponentials with the same doubling time and different units, that isn’t a big deal.
There are a number of things I disagree with in the post. The main one relevant to this graph is the implication that the graph on the left causes the graph on the right. That would be genuinely surprising. We’ve seen that the slope on the famous scaling law graphs is about −0.05 for compute — so you need to double compute 20 times to get log-loss to halve. Whereas this story of 3x compute leading to 3x the revenue implies that the exponent for a putative scaling law of compute vs revenue is extremely close to 1.0. And that it remains flukishly close to that magic number despite the transition from pretraining scaling to RL+inference scaling. I could believe a power law exponent of 1.0 for some things that are quite mathematical or physical, but not for the extremely messy relationship of compute to total revenue, which depends on details of:
the changing relationship between compute and intelligence,
the utility of more intelligence to people,
the market dynamics between competitors,
running out of new customers and having to shift to more revenue per customer,
the change from a big upfront cost (training compute) to mostly per-customer charges (inference compute).
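To make the scaling-law arithmetic above concrete, here is a minimal sketch. It assumes a Chinchilla-style power law, loss ∝ compute^(−α), with α ≈ 0.05 as the round number used above (the exact exponent varies between papers):

```python
import math

# Assumed power law: loss ∝ compute^(-alpha), alpha ≈ 0.05.
alpha = 0.05

# To halve the loss we need a compute multiplier k with k^(-alpha) = 1/2,
# i.e. k = 2^(1/alpha) ≈ 2^20 -- roughly a million-fold compute increase.
doublings = 1 / alpha      # ≈ 20 compute doublings
k = 2 ** doublings         # ≈ 1.05 million

# By contrast, "3x compute -> 3x revenue" pins the exponent beta in
# revenue ∝ compute^beta at log(3)/log(3) = 1.0 exactly.
beta = math.log(3) / math.log(3)

print(f"compute doublings to halve loss: {doublings:.0f}")
print(f"compute multiplier: {k:,.0f}")
print(f"implied revenue-vs-compute exponent: {beta}")
```

The contrast is stark: an exponent of 0.05 means a million-fold compute increase buys one halving of loss, while the graph's apparent exponent of 1.0 means every extra dollar of compute shows up one-for-one in revenue.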
More likely is something like reverse causation — that the growth in revenue is driving the amount of compute they can afford. Or it could be that the prices they need to charge scale with the investment they receive in order to buy compute — so they are charging the minimum they can such that revenue growth matches investment growth.
Overall, I’d say that I believe these are real numbers, but I don’t believe the implied model. For example, I don’t believe this trend will continue in the long run, and I don’t think that if they had been able to 10x compute in one of those years the revenue would also have jumped by 10x (unless that is because they effectively choose how much revenue to take, trading market growth for revenue, in order to make this graph work to convince investors).