I have thoughts, but a question first: you link a Kambhampati tweet where he says,
...as the context window changes (with additional prompt words), the LLM, by design, switches the CPT used to generate next token—given that all these CPTs have been pre-computed?
What does ‘CPT’ stand for here? It’s not a common ML or computer science acronym that I’ve been able to find.
Since nobody else has responded, my best guess would be “conditional probability table”.
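If that reading is right, the quoted tweet would be describing something like the toy sketch below: each context selects one row of a table, and that row is the distribution used for the next token. This is only my illustration of the guess. The table `cpt`, its contexts, its probabilities, and the helper names are all made up, and a real LLM computes the distribution from its weights rather than storing an explicit table.

```python
# Toy conditional probability table (CPT): one row of next-token
# probabilities per context. All entries are invented for illustration.
cpt = {
    ("the",): {"cat": 0.6, "dog": 0.4},
    ("the", "cat"): {"sat": 0.7, "ran": 0.3},
}


def next_token_distribution(context):
    """Return the row of the table selected by the current context."""
    return cpt[tuple(context)]


def most_likely_next(context):
    """Pick the highest-probability token from the selected row."""
    row = next_token_distribution(context)
    return max(row, key=row.get)


# Adding a word to the context switches which row is consulted.
print(most_likely_next(["the"]))
print(most_likely_next(["the", "cat"]))
```

On this reading, "switching the CPT" as the context window changes just means a different row gets looked up once a word is appended to the prompt.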