Fermi–Dirac Distribution comments on Have your timelines changed as a result of ChatGPT?

Fermi–Dirac Distribution 6 Dec 2022 23:25 UTC
3 points
1 ∶ 0
“text-davinci-003 (which is effectively ChatGPT)”
This is probably a stupid question, but: do we actually know if ChatGPT uses text-davinci-003?
When I talk to ChatGPT with the Network tab of Chrome DevTools open, filter for requests named “conversation,” and look at any request payload, I see that it contains the key-value pair:
model: “text-davinci-002-render”
This seems to indicate that ChatGPT might not be using text-davinci-003.
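For anyone who wants to repeat the check, here is a minimal sketch of reading the model name out of a payload copied from DevTools. It assumes the payload is a JSON object with a top-level "model" key. The function name and the one-key example payload are hypothetical, and the real payload has other fields that are not reproduced here.

```python
import json
from typing import Optional


def model_from_payload(raw_payload: str) -> Optional[str]:
    """Return the value of the top-level "model" key, or None if it is absent."""
    payload = json.loads(raw_payload)
    return payload.get("model")


# Hypothetical payload: only the "model" pair comes from the observation above;
# any other fields in the real request are omitted.
example_payload = '{"model": "text-davinci-002-render"}'
print(model_from_payload(example_payload))
```

This only reports what the client sends in the request. It says nothing about which model actually serves the response.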
- TW123 7 Dec 2022 0:00 UTC
  2 points
  0 ∶ 0
  The blog post says ChatGPT was trained with proximal policy optimization (PPO). This documentation says text-davinci-003 was trained with PPO, but text-davinci-002 was not.
  However, what you’re saying about the request payloads is interesting, because it seems to contradict this. So I’m not quite sure anymore. It’s possible that ChatGPT was trained with PPO on top of the non-PPO text-davinci-002.