This could lead to behaviour analogous to conscious behaviour in humans, but produced by a very different mechanism or for a very different purpose, one that does not actually give rise to qualia. I do not know how we would be able to tell the difference, even in theory.
How about: Remove all text of humans discussing their conscious experiences (or even the existence of consciousness) from the AI’s training set. See if it still claims to have internal experiences.
I don’t think this is a perfect method:
If it still talks about internal experiences, maybe it was able to extrapolate the ability to discuss internal experiences from text that wasn’t removed.
If it doesn’t talk about internal experiences, maybe it has them and just lacks the ability to talk about them. Some animals are probably like this.
Finally, in principle I can imagine that ingesting text related to internal experiences is actually what causes an AI to learn to have them.
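To make the proposed filtering step concrete, here is a minimal sketch in Python, assuming a plain-text corpus and a crude keyword blocklist. The term list and the names mentions_consciousness and filter_corpus are illustrative only; a serious attempt would need a trained classifier and human review rather than substring matching.

```python
from typing import Iterable, Iterator

# Hypothetical blocklist of consciousness-related vocabulary (far from exhaustive).
CONSCIOUSNESS_TERMS = {
    "conscious", "consciousness", "qualia", "sentient", "sentience",
    "subjective experience", "inner experience", "what it is like",
    "phenomenal", "self-aware", "self-awareness",
}

def mentions_consciousness(document: str) -> bool:
    """Return True if the document contains any blocklisted term."""
    text = document.lower()
    return any(term in text for term in CONSCIOUSNESS_TERMS)

def filter_corpus(documents: Iterable[str]) -> Iterator[str]:
    """Yield only documents that never mention the blocklisted vocabulary."""
    for doc in documents:
        if not mentions_consciousness(doc):
            yield doc

if __name__ == "__main__":
    corpus = [
        "The cat sat on the mat.",
        "I have a vivid subjective experience of the colour red.",
        "Stock prices fell sharply on Tuesday.",
    ]
    for kept in filter_corpus(corpus):
        print(kept)
```

A blocklist like this would both over-filter (dropping unrelated text that happens to use these words) and under-filter (missing discussion of inner experience phrased in other terms), which is part of why the extrapolation worry above seems hard to rule out.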