A floating point operation is more like 1e5 bit erasures today, and is necessarily at least 16 bit erasures at fp16 (and your estimates don’t allow for large precision reductions, e.g. to 1-bit arithmetic). Let’s call it 1.6e21 bit erasures per second, which I think is quite conservative?
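(For scale, a minimal sketch of what that rate implies physically. This is my own illustration, not part of the original exchange; it assumes room temperature, T = 300 K, and the textbook Landauer bound of kT ln 2 joules per erased bit.)

```python
# Sanity check on the quoted figure: at the Landauer bound of k*T*ln(2)
# joules per erased bit and room temperature, 1.6e21 erasures per second
# implies a power floor of only a few watts.
import math

k = 1.380649e-23              # Boltzmann constant, J/K
T = 300.0                     # assumed room temperature, K
erasures_per_s = 1.6e21       # the rate quoted above

joules_per_bit = k * T * math.log(2)                # ~2.87e-21 J per erasure
print(f"{erasures_per_s * joules_per_bit:.1f} W")   # ~4.6 W
```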
I don’t follow you here.
Why is a floating point operation 1e5 bit erasures today?
Why does an fp16 operation necessitate 16 bit erasures? As an example, if we have two 16-bit registers (A, B) and we do a multiplication to get (A, A*B), where are the 16 bits of information loss?
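To make the question concrete, here is a minimal sketch (my own illustration, assuming 16-bit wrap-around arithmetic, i.e. multiplication mod 2^16, and an odd A so that A has a modular inverse): under those assumptions the update (A, B) -> (A, A*B) is a bijection on register states, so B is fully recoverable and no bits need be erased in principle.

```python
# Sketch: (A, B) -> (A, A*B mod 2**16) is reversible when A is odd,
# because odd A has a multiplicative inverse mod 2**16.
M = 1 << 16  # 16-bit registers wrap mod 2**16

def forward(a, b):
    return a, (a * b) % M

def backward(a, ab):
    a_inv = pow(a, -1, M)      # modular inverse; exists iff a is odd
    return a, (a_inv * ab) % M

a, b = 12345, 54321            # a is odd, so the map is invertible
assert backward(*forward(a, b)) == (a, b)   # B recovered: no information lost
```

(For even A the map is many-to-one, so some erasure would be needed in that case; the point is just that nothing forces 16 bits of loss per multiply.)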
(In any case, there’s no real need to reply to this. As someone who has spent a lot of time thinking about the Landauer limit, I’ve come away thinking it’s less relevant than often supposed, and I suspect getting to the bottom of this rabbit hole won’t yield much for us in terms of TAGI timelines.)