That’s a good point. We “actually” can’t be certain that a meta-representation was formed for a particular word in that example. I should have used the word “probably” when talking about meta-representations for individual words. However, we can be fairly confident that ChatGPT formed meta-representations for enough words to go from getting an incorrect answer to a correct answer in the example. I believe we can isolate specific words better in the second and third sessions.
As for associating the word “actually” “with a goal that the following words should be negatively correlated in the corpus with the words that were in the previous message,” the idea of “having a goal” and the concept that the word “actually” goes with negative correlations both seem a bit like signs of meta-representations in and of themselves, but I guess that’s just my opinion.
With respect to all the sessions, there may very well be similar conversations in the corpus of training data in which Person A (like myself) teaches Person B (like ChatGPT), and ChatGPT may just be imitating that learning process (by giving the wrong answer first and then “learning”). I address why that is probably not the case in the “mimicking learning” paragraph.
I would say my overall thesis still stands (since enough words must have meta-reps), but good point on the particulars. Thank you!