I think he would agree with “we wouldn’t have GPT-3 from an economical perspective”. I am not sure whether he would agree that it is a theoretical impossibility. From the transcript:
“Because a lot of the current models are based on diffusion stuff, not just bigger transformers. If you didn’t have diffusion models [and] you didn’t have transformers, both of which were invented in the last five years, you wouldn’t have GPT-3 or DALL-E. And so I think it’s silly to say that scale was the only thing that was necessary because that’s just clearly not true.”
To be clear, the part about the credit assignment problem came up mostly while he was discussing the research at his lab, and he did not explicitly argue that the long-term credit assignment problem is evidence that training powerful AI systems is hard. I included the quote because it was relevant, but it was not an “argument” per se.