My read on this so far is that low estimates for P(doom|AGI) are either borne of ignorance of what the true difficulties in AI alignment are, the product of wishful thinking or a lack of security mindset, or a social phenomenon where people want to sound respectable and non-alarmist, as opposed to being based on any sound technical argument.
After spending a significant amount of my own free time writing up technical arguments that AI risk is overestimated, I find it quite annoying to be told that my reasons must be secretly based on social pressure. No, I just legitimately think you’re wrong, as do a huge number of other people who have been turned away from EA by dismissive attitudes like this.
If I had to state only one argument (there are very many) that P(doom|AGI) is low, it’d be the following.
Conquering the world is really, really, really hard.
Conquering the world starting from nothing is really, really, really, ridiculously hard.
Conquering the world, starting from nothing, when your brain is fully accessible to your enemy for your entire lifetime of plotting, is stupidly, ridiculously, insanely hard.
Every time I point this basic fact out, the response is a speculative science fiction story, or an assertion that “a superintelligence will figure something out”. But nobody actually knows the capabilities of this invention that doesn’t exist yet. I have seen zero convincing arguments to the contrary.
Why is “it will be borderline omnipotent” being treated as the default scenario? No invention in the history of humanity has been that perfect, especially early on. No intelligence in the history of the universe has been that flawless. Can you really be 90% sure that this one will be?
your brain is fully accessible to your enemy for your entire lifetime of plotting
This sounds like you are assuming that mechanistic interpretability has somehow been solved. We are nowhere near on track for that to happen in time!
Also, re “it will be borderline omnipotent”: this is not required for doom. ~Human-level AI hackers, copied a million times and sped up a million times, could destroy civilisation.
It doesn’t seem to me that titotal is assuming MI is solved; having direct access to the brain doesn’t give you full insight into someone’s thoughts either, because neuroscience is basically a pile of unsolved problems, with a growing but still very incomplete picture of both low-level and high-level details. We don’t even have a consensus on how memory is physically implemented.
Nonetheless, if you had a bunch of invasive probes feeding you gigabytes per second of live data from the brain of the genius general of the opposing army, that data would very likely be useful.
A really interesting thing is that, at the moment, this appears in practice to be a very asymmetrical advantage. The high-level reasoning processes that GPT-4 implements don’t seem to be able to introspect on fine-grained details, like “how many tokens are in a given string”. The information is obviously and straightforwardly part of the model, but absent external help the model doesn’t seem to bridge the gap between low-level implementation details and high-level reasoning abilities, much like us.
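As a concrete illustration, the token count the model struggles to introspect is trivially available from the outside, e.g. via OpenAI’s tiktoken library. A minimal sketch (the choice of encoding is an assumption about which model is being probed):

```python
# Count tokens externally with tiktoken -- information encoded in the
# model's own weights, yet not reliably accessible to its high-level
# reasoning without outside help.
import tiktoken

# "cl100k_base" is the encoding used by GPT-4-era models; this choice
# is an assumption made for illustration.
enc = tiktoken.get_encoding("cl100k_base")

text = "How many tokens are in a given string?"
tokens = enc.encode(text)

print(f"{len(tokens)} tokens: {tokens}")
# Asked the same question about this string, the model itself will
# often guess wrong: its own tokenization is opaque to it.
```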
Ok, so the “brain” is fully accessible, but that is near useless at the level of interpretability we have. By comparison, we know far more about human neuroscience. It’s hard to grasp just how large these AI models are: they have on the order of a trillion dimensions. Try plotting that in Wolfram Alpha or MATLAB.
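To make that scale concrete, here is a back-of-the-envelope sketch (the parameter count and precision are illustrative assumptions, not a claim about any specific model):

```python
# Rough scale of a trillion-parameter model.
params = 1e12          # ~a trillion parameters (assumed)
bytes_per_param = 2    # fp16 precision (assumed)

weights_tb = params * bytes_per_param / 1e12
print(f"Raw weights alone: ~{weights_tb:.0f} TB")  # ~2 TB

# A human inspecting one parameter per second, nonstop:
years = params / (60 * 60 * 24 * 365)
print(f"At one parameter per second: ~{years:,.0f} years")  # ~31,710 years
```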
It should be scary in itself that we don’t even know what these models can do ahead of time. Discovering their true capabilities, after the fact of their creation, is an active area of scientific investigation.
Well, the probability of AGI doom doesn’t depend on the probability that AI can ‘conquer the world’.
It only depends on the probability that AI can disrupt the world sufficiently that the latent tensions in human societies, plus all the other global catastrophic risks that other technologies could unleash (e.g. nukes, bioweapons), would lead to some vicious downward spirals, eventually culminating in human extinction.
This doesn’t require AGI or ASI. It could happen just through very good AI-generated propaganda deployed at scale, in multiple languages, in a mass-customized way, by any ‘bad actors’ who want to watch the world burn. And there are many millions of such people.