I’m not sure it’s going to feel like a Zoom call with a colleague any time soon. That’s a pretty high bar IMO that we ain’t anywhere near yet. Several pieces aren’t there yet, including:
(I’m probably misrepresenting a couple of these due to lack of expertise but something like...)
1) Video quality (especially when rendered in real time)
2) Almost insta-replying
3) Facial warmth and expressions
4) LLM sounding exactly like a real person (this one might be closest)
5) Reduction in the processing power required for this to be the norm. Having thousands of these running simultaneously is going to need a lot. (Again, less important.)
I would bet against avatars being this high fidelity and in common use in 3 years, because I think LLM progress is tailing off and there are multiple problems to be solved to get there, but maybe I’m a troglodyte...