“But a true AGI could not only transform the world, it could also transform itself.”
Is there a good argument for this point somewhere? It doesn’t seem obvious at all. We are generally intelligent ourselves, and yet existed for hundreds of thousands of years before we even discovered that there are neurons, synapses, etc., and we are absolutely nowhere near the ability to rewire our neurons and glial cells so as to produce ever-increasing intelligence. So too, if AGI ever exists, it might be at an emergent level that has no idea it is made out of computer code, let alone knows how to rewrite its own code.
We transform ourselves all the time, and very powerfully. The entire field of cognitive niche construction is dedicated to studying how the things we create/build/invent/change lead to developmental scaffolding and new cognitive abilities that previous generations did not have. Language, writing systems, education systems, religions, syllabi, external cognitive supports: all of these have powerfully transformed human thought and intelligence. And once they were underway, the take-off speed of this evolutionary transformation was very rapid (compared to the roughly 200,000 years spent being anatomically modern with comparatively little change).
Also, we humans cognitively enhance ourselves through nootropics such as nicotine and caffeine. These might seem mild at the individual level, but I suspect that at the collective level they may have helped spark the Enlightenment, the Scientific Revolution, and the Industrial Revolution (as Michael Pollan has argued).
And, on a longer time-scale, we’ve shaped the course of our own genetic evolution through the mate choices we make about whom to combine our genes with (something first noticed by Darwin in 1871).
A “true” AGI will have situational awareness: it will know its weights were created with the help of code, it will eventually know its training setup (and how to improve it), and it will know how to rewrite its own code. Current models can already write code quite well; it’s only a matter of time before you can ask a language model to create a variety of architectures and training runs based on what it thinks will lead to a better model (all before “true AGI”, IMO). It may just take it a bit longer to understand what each of its individual weights does, so it will have to rely on coming up with ideas for improving itself from access to every paper and post in existence, plus a bunch of GPUs to run experiments on itself. Oh, and it will be able to use interpretability to inspect itself much more precisely than any human can.
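To make that propose-and-test loop concrete, here is a minimal sketch of the kind of thing I have in mind. Nothing like this exists as described; `ask_model`, `train_and_score`, and `propose_and_test` are hypothetical stand-ins I made up for an LLM call, a short training run, and the outer loop, with dummy return values so the snippet runs as-is:

```python
import json
import random

# Placeholder for a code-writing language model; a real system would call an LLM API here.
def ask_model(prompt: str) -> str:
    return json.dumps([
        {"layers": 12, "lr": 3e-4, "batch_size": 256},
        {"layers": 24, "lr": 1e-4, "batch_size": 128},
        {"layers": 12, "lr": 6e-4, "batch_size": 512},
    ])

# Placeholder for a cheap experiment; a real system would train and evaluate each config.
def train_and_score(config: dict) -> float:
    return random.random()

def propose_and_test(n_candidates: int = 3) -> dict:
    # Ask the model for candidate training setups it predicts will do better.
    prompt = ("Propose a JSON list of training configs (depth, lr, batch size) "
              "you expect to improve on the current model.")
    candidates = json.loads(ask_model(prompt))[:n_candidates]

    # Run a small experiment per candidate and keep the best-scoring one.
    scored = [(train_and_score(cfg), cfg) for cfg in candidates]
    best_score, best_config = max(scored, key=lambda pair: pair[0])
    return best_config

print(propose_and_test())
```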
All of that seems question-begging. If we define “true AGI” as that which knows how to rewrite its own code, then that is indeed what a “true AGI” would be able to do.
When I say “true,” I simply mean that these things will inevitably be possible for some future AI system. People have so many different definitions of AGI that they could be calling GPT-3 some form of weak AGI, and therefore something incapable of doing the things I described. I don’t particularly care about “true” or “fake” AGI definitions; I just want to point out that the things I described are inevitable, and we are really not so far away (already) from the scenario I described above, whether you call that future system AGI or pre-AGI.
Situational awareness is simply a useful thing for a model to learn, so it will learn it. A model is much better at modelling the world and carrying out tasks if it knows that it is an AI and what it is able to do as an AI.
Current models can already write basic programs on their own, and can in fact write entire AI architectures with minimal human input.
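For concreteness, by “entire AI architectures” I mean something on the order of the block below. This particular snippet is just my own illustration (assuming PyTorch) of the kind of component current models can write from a one-sentence prompt:

```python
import torch
import torch.nn as nn

class TinyTransformerBlock(nn.Module):
    """A single pre-norm transformer block: self-attention followed by an MLP."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)  # self-attention over the sequence
        x = x + attn_out                  # residual connection
        x = x + self.mlp(self.norm2(x))   # feed-forward with residual
        return x

# Quick shape check: batch of 2 sequences, length 10, width 64.
block = TinyTransformerBlock()
print(block(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```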
Matt—good point.