> To successfully reason in the way it did, ChatGPT would have needed a meta-representation for the word “actually,” in order to understand that its prior answer was incorrect.
What makes this a meta-representation rather than something next-word-weight-y, like merely associating the appearance of “Actually,” with a learned tendency for the following words to be negatively correlated, across the training corpus, with the words of the previous message?
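For what it's worth, here's a deliberately dumb sketch of the kind of mechanism I mean (toy Python, nothing to do with ChatGPT's actual internals; the penalty, the candidate words, and the example messages are all made up):

```python
# Toy sketch (not ChatGPT): a purely associative "next-word weighter" that,
# on seeing "Actually,", penalizes candidate words that appeared in the
# previous message. No notion of correctness anywhere.

from collections import Counter

def next_word_scores(candidates, previous_message, saw_actually, penalty=2.0):
    """Score candidates by raw frequency, minus a penalty for overlap with the
    previous message whenever the reply starts with 'Actually,'."""
    base = Counter(candidates)                      # crude frequency prior
    prev_words = set(previous_message.lower().split())
    scores = {}
    for word, freq in base.items():
        score = float(freq)
        if saw_actually and word.lower() in prev_words:
            score -= penalty                        # anti-correlate with the prior answer
        scores[word] = score
    return scores

previous = "Yes you should buy three calzones"
candidates = ["yes", "no", "three", "two", "calzones"]
print(next_word_scores(candidates, previous, saw_actually=True))
# Words from the earlier answer ("yes", "three", "calzones") get pushed down,
# which can look like "knowing the prior answer was wrong" without any
# meta-representation of correctness.
```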
I’m also wondering whether the butcher shop and the grocery store gave different answers for a reason other than the name you gave the store. Maybe it was because you gave the quantity in pounds instead of in items?
You previously told ChatGPT, “That’s because you’re basically taking (and wasting) the whole item.” ChatGPT might not associate “pound” with “item” the way it associates a “calzone” with an “item,” so it might not treat your earlier mention of “item” as something that should affect how it predicts the words that come after “pound.”
Or ChatGPT might have a really strong prior association along the lines of pounds → mass → [numbers that show up as decimals in texts about shopping] that overrode your earlier lesson.
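If that’s what happened, it would look like a simple weighting fight, something like this toy arithmetic (every number here is invented, purely to illustrate the hypothesis, not measured from any model):

```python
# Toy sketch of the "strong prior overrides the in-context lesson" idea.

def prefers_whole_numbers(unit, prior_weights, context_bonus):
    """True if the whole-number preference beats the decimal preference."""
    decimal_pref, whole_pref = prior_weights[unit]
    whole_pref += context_bonus      # the earlier "whole item" lesson nudges this up
    return whole_pref > decimal_pref

# Hypothetical prior strengths picked up from text about shopping:
prior_weights = {
    "calzone": (0.2, 0.8),   # items are usually counted in whole numbers
    "pound":   (0.9, 0.1),   # weights are usually written with decimals
}
lesson_bonus = 0.3               # how much the earlier conversation shifts things

for unit in ("calzone", "pound"):
    print(unit, "-> whole number?", prefers_whole_numbers(unit, prior_weights, lesson_bonus))
# calzone -> whole number? True   (the prior already agreed with the lesson)
# pound   -> whole number? False  (0.1 + 0.3 still loses to the 0.9 decimal prior)
```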