A “true” AGI will have situational awareness: it will know its weights were created with the help of code, will eventually know its training setup (and how to improve it), and will know how to rewrite its own code. These models can already write code quite well; it’s only a matter of time before you can ask a language model to create a variety of architectures and training runs based on what it thinks will lead to a better model (all before “true AGI,” in my opinion). It may just take it a while longer to understand what each of its individual weights does, so it will have to come up with ideas for improving itself using only access to every paper and post in existence, plus a bunch of GPUs to run experiments on itself. Oh, and it will have the ability to do interpretability on itself, inspecting itself far more precisely than any human can.
All of that seems question-begging. If we define “true AGI” as that which knows how to rewrite its own code, then that is indeed what a “true AGI” would be able to do.
When I say “true,” I simply mean that it is inevitable that some future AI system will be capable of these things. People have so many different definitions of AGI that they could call GPT-3 some form of weak AGI, one that is therefore incapable of doing the things I described. I don’t particularly care about “true” or “fake” AGI definitions; I just want to point out that the things I described are inevitable, and we are already not so far from the scenario above, whether you call that future system AGI or pre-AGI.
Situational awareness is simply a useful thing for a model to learn, so it will learn it. A model is much better at modelling the world and carrying out tasks if it knows it is an AI and what it is able to do as an AI.
Current models can already write basic programs on their own and can in fact write entire AI architectures with minimal human input.
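To make the shape of that claim concrete, here is a minimal sketch of the propose-train-evaluate loop described above. The `propose_config` and `run_training` functions are hypothetical placeholders (a random search and a toy scoring function), not any real model or API; the point is only the structure of a loop that an AI system could drive end to end, proposing experiments based on past results and keeping whatever scores best.

```python
# Hypothetical sketch of an AI-driven architecture/training-run search loop.
# `propose_config` stands in for asking a language model to suggest the next
# experiment; `run_training` stands in for actually training and evaluating it.

import random
from dataclasses import dataclass


@dataclass
class Config:
    n_layers: int
    d_model: int
    learning_rate: float


def propose_config(history: list[tuple[Config, float]]) -> Config:
    """Stand-in for an LLM proposing the next experiment from past results."""
    # Here: plain random search. An actual system would condition on `history`
    # (and on the literature) when choosing what to try next.
    return Config(
        n_layers=random.choice([2, 4, 8]),
        d_model=random.choice([128, 256, 512]),
        learning_rate=random.choice([1e-4, 3e-4, 1e-3]),
    )


def run_training(cfg: Config) -> float:
    """Stand-in for training the candidate model; returns a validation loss."""
    # Toy proxy: larger models with a moderate learning rate score better,
    # plus a little noise to mimic run-to-run variance.
    size_term = 1.0 / (cfg.n_layers * cfg.d_model)
    lr_term = abs(cfg.learning_rate - 3e-4)
    return size_term + lr_term + random.uniform(0.0, 1e-4)


history: list[tuple[Config, float]] = []
for _ in range(10):
    cfg = propose_config(history)
    loss = run_training(cfg)
    history.append((cfg, loss))

best_cfg, best_loss = min(history, key=lambda item: item[1])
print(f"best config so far: {best_cfg} (val loss {best_loss:.6f})")
```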