I should have clarified that the LW post is the one my question was based on, so here is a more fleshed-out version: because GPTs are trained on human data, and given that humans make mistakes and don’t have a complete understanding of most situations, it seems highly implausible to me that enough information can be extracted from text/images, with all the imprecision of language, to make valid predictions about highly complex/abstract topics.
Yudkowsky says of GPT-4:
It is being asked to model what you were thinking—the thoughts in your mind whose shadow is your text output—so as to assign as much probability as possible to your true next word.
How do we know it will be able to extract enough information from the shadow to reconstruct the thoughts? Text carries comparatively little information for characterizing such a complex system. It reminds me of the difficulty of problems like the inverse scattering problem or CT reconstruction, where the underlying structure is very complex and all you get is a low-dimensional projection of it, which may or may not be invertible back to the original structure. CT scans can find tumors, but they can’t tell you which gene mutated because they just don’t have enough resolution.
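To make that ill-posedness concrete, here is a toy sketch of my own (not from Yudkowsky’s article, and the matrices are invented purely for illustration): a CT-style projection that sums a 2D structure along one axis can map two different structures to the exact same observation, so the observation alone cannot tell you which structure produced it.

```python
# Toy illustration: two different 2D "structures" whose column sums
# (a crude stand-in for a CT-style line-integral projection) are identical.
import numpy as np

structure_a = np.array([[2, 0],
                        [0, 2]])
structure_b = np.array([[0, 2],
                        [2, 0]])

# Project each structure down to one dimension by summing its columns.
projection_a = structure_a.sum(axis=0)  # -> [2, 2]
projection_b = structure_b.sum(axis=0)  # -> [2, 2]

print(projection_a, projection_b)                   # identical observations
print(np.array_equal(projection_a, projection_b))   # True: the inverse is ambiguous
```

The point of the analogy is that recovering the original requires assumptions or information beyond the projection itself.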
Yudkowsky gives this as an example in the article:
“Imagine a Mind of a level where it can hear you say ‘morvelkainen blaambla ringa’, and maybe also read your entire social media history, and then manage to assign 20% probability that your next utterance is ‘mongo’.”
I understand that making that kind of prediction would be evidence of extreme intelligence, but I don’t see how a path to such a conclusion could be built solely from its training data.
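For concreteness, here is a minimal sketch of what “assigning probability to the true next word” means mechanically (my own sketch, with a made-up four-word vocabulary and made-up scores, not a claim about GPT-4’s internals): the model scores every vocabulary item, a softmax turns the scores into a probability distribution, and the training loss rewards putting mass on the word that actually came next.

```python
# Minimal next-word-probability sketch with invented numbers.
import numpy as np

vocab = ["mongo", "hello", "ringa", "blaambla"]   # hypothetical tiny vocabulary
logits = np.array([1.2, 0.3, -0.5, -1.0])         # hypothetical model scores for the next word

probs = np.exp(logits) / np.exp(logits).sum()     # softmax -> probability distribution
print(dict(zip(vocab, probs.round(3))))           # "mongo" gets the largest share (~0.59 here)

# The training loss is the negative log-probability of the true next word;
# minimizing it is exactly "assigning as much probability as possible" to it.
true_next = "mongo"
loss = -np.log(probs[vocab.index(true_next)])
print(loss)
```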
Going further, because the training data comes from humans (who, as mentioned, make mistakes and have an incomplete understanding of the world), it seems highly unlikely that the model would be able to produce new concepts in something as exact as, say, math or science if its understanding of causality is based solely on predicting something as unpredictable as human behavior, even if it gets really good at that. Why should we assume that a model, even a really big one, would converge on understanding the laws of physics well enough to make new discoveries from human data alone? Is the idea even that ASI will come from LLMs? If so, I am very curious to hear the theory of how that develops that I am not grasping here.
Yep, that’s a fair argument, and I don’t have a knockdown case that predicting human-generated data will result in great abilities.
One bit of evidence is that people used to be really pessimistic that scaling up imitation would do anything interesting; this paper was a popular knockdown arguing that language models could never understand the physical world, but most of the substantive predictions of that line of thinking have been wrong, and those people have largely retreated to semantic debates about the meaning of “understanding”. Scaling has gone further than many people expected, and it could continue.
Another argument would be that pretraining on human data has a ceiling, but RL fine-tuning on downstream objectives is much more efficient when it starts from a pretrained model, and will allow AI to surpass the human level.
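Here is a rough sketch of that two-stage picture (my own toy example, not a description of any lab’s actual recipe; the reward function and numbers are invented): imitation pins down the initial policy, and a policy-gradient update on a downstream reward can then push it past what the demonstrations alone support.

```python
# Toy two-stage recipe: imitation "pretraining" followed by REINFORCE-style fine-tuning.
import numpy as np

rng = np.random.default_rng(0)

# Stage 1: assume imitation learning has already produced preferences (logits)
# over two possible actions from human demonstrations.
logits = np.array([1.0, 0.0])  # hypothetical output of the imitation stage

def reward(action: int) -> float:
    # Downstream objective the humans in the data were not optimal at:
    # action 1 is actually better, even though imitation favors action 0.
    return 1.0 if action == 1 else 0.0

# Stage 2: policy-gradient fine-tuning on the reward signal.
learning_rate = 0.5
for _ in range(200):
    probs = np.exp(logits) / np.exp(logits).sum()
    action = int(rng.choice(2, p=probs))
    # Gradient of log pi(action) for a softmax policy is (one_hot(action) - probs);
    # scale it by the observed reward.
    grad = -probs
    grad[action] += 1.0
    logits = logits + learning_rate * reward(action) * grad

print(np.exp(logits) / np.exp(logits).sum())  # probability mass has shifted toward action 1
```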
But again, there are plenty of people who think GPT will not scale to superintelligence—Eliezer, Gary Marcus, Yann LeCun—and it’s hard to predict these things in advance.