I think this post needs more discussion of concrete examples.
Hypothetically, say you trained an advanced version of GPT on all human chess games, but removed computer games from its database (and human games that cheated with an engine). The reward function here is still “predict the next move”. How “smart” would this GPT version be at chess? How would it fare against Stockfish or AlphaZero?
It seems like the “prediction” goal function is inherently limiting here. GPT-9 would be focusing its compute time and energy on modelling the psychology of a top-of-his-game Magnus Carlsen in a given situation, which would certainly require a brilliant understanding of chess. But it would be learning how to play human chess. Stockfish would crush it (even the present-day version vs future super-GPT), because the goal function of Stockfish is to be good at chess. This is the sense in which I think being an “imitator” limits your intelligence.
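To make the setup concrete, here is a rough sketch of the objective being described: pure next-move prediction on human games (toy data and made-up helper names, not any real training pipeline):

```python
# Toy example: each training pair asks the model to predict the move a human
# actually played given the moves so far, i.e. to imitate, not to find the best move.
human_games = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5"],  # one game as a list of SAN moves
]

def next_move_examples(games):
    # Yield (context, target) pairs: moves 1..k as context, move k+1 as the target.
    for moves in games:
        for k in range(1, len(moves)):
            yield " ".join(moves[:k]), moves[k]

for context, target in next_move_examples(human_games):
    print(repr(context), "->", target)
```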
There’s an important distinction here between predicting the next token in a piece of text and predicting the next action in a causal chain. If you have a computation that is represented by a causal graph, and you train a predictor to predict nodes conditional on previous nodes, then it’s true that the predictor won’t end up being able to do better than the original computational process. But text is not ordered that way! Texts often describe outcomes before describing the details of the events which generated them. If you train on texts like those, you get something more powerful than an imitator. If you train a good enough next-token predictor on chess games where the winner is mentioned before the list of moves, you can get superhuman play by prepending “This is a game which white/black wins:”. If you train a good enough next-token predictor on texts that have the outputs of circuits listed before the inputs, you get an NP-oracle. You’re almost certainly not going to get an NP-oracle from GPT-9, but that’s because of the limitations of the training processes and architectures that this universe can support, not a limitation of the loss function.
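A minimal sketch of what that outcome-first conditioning looks like (hypothetical helper names; `model` stands in for any trained next-token predictor with a `generate` method):

```python
def make_training_text(moves, winner):
    # Training text states the outcome *before* the moves, so predicting the
    # rest of the text requires producing moves consistent with that outcome.
    return f"This is a game which {winner} wins: " + " ".join(moves)

def sample_strong_game(model, winner="white"):
    # At inference time, prepend the outcome you want and let the predictor
    # fill in a move sequence conditioned on it.
    prompt = f"This is a game which {winner} wins: "
    return model.generate(prompt)
```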
I think there very much is a limitation in the loss function, when you consider efficiency of results. In chess, Stockfish and AlphaZero don’t just match the best chess players, they exceed them by a ridiculous margin, and that’s right now. Whereas GPT, with the same level of computation, still hasn’t figured out how not to make illegal moves.
I can’t rule out that a future GPT version will be able to beat the best human, by really good pattern matching on what a “winning” game looks like. But that’s still pattern matching on human games. Stockfish has no such limitation.