Developing theories of how to align AI and reasoning about how they could fail (AI alignment research)
AI alignment research will fail, because the ruthless powers who control much of the planet’s population and land mass will simply ignore it. Drug gangs will ignore it. Terrorists will ignore it. Large corporations will ignore it if they calculate they can get away with doing so. Amateur civilian hacker boys on Reddit will ignore it.
Look, I’m sorry to be the party pooper, yell at me if you want, that’s ok, but this is just how it is. Much of the discussion on this well-intentioned forum is grounded in well-meaning, wishful-thinking fantasy.
Intellectual elites at prestigious universities will not control the future of AI. That’s a MYTH.
If you’re currently in college and your teachers are feeding you this myth, ask for a refund!
Why do you think this is true?

Currently, only a few organizations can build large AI models (it costs millions of dollars in energy, computation, and equipment). This will remain the case for a few years. These organizations do seem interested in AI safety research. A lot of things will happen before AI is so commonplace that small actors like “amateur civilian hacker boys” will be able to deploy powerful models. By that time, our capabilities for safety and defense will look quite different from today—largely thanks to people working in AI safety now.
I think there is a case for defending against the use of AI by malicious actors. I just don’t follow your argument that this would invalidate all of AI safety research.
Currently, only a few organizations can build large AI models (it costs millions of dollars in energy, computation, and equipment).
Millions of dollars is chump change for nation states and global corporations. And of course those costs will come down, down, down over time. You know, somebody will build AI systems that build AI systems, the same way I once built websites that build websites.
By that time, our capabilities for safety and defense will look quite different from today—largely thanks to people working in AI safety now.
My apologies, but it doesn’t matter. So long as the knowledge explosion is generating ever more, ever larger threats at an ever-accelerating rate, sooner or later some threat we can’t manage will emerge, and then it won’t matter whether AI safety research was successful or not. AI can’t solve this, because the deciding factor will be the human condition, our maturity, etc.

I’m not against AI research. I’m just trying to make clear that it is addressing symptoms, not root causes.
It’s an interesting question to what degree AI and related technologies will strengthen offensive vs defensive capabilities.
You seem to think that they strengthen offensive capabilities a lot more, leading to “ever larger threats”. If true, this would be markedly different from other areas. For example, in information security, techniques like fuzz testing led to better exploits, but also made software a lot safer overall. In biosecurity, new technologies contribute to new threats, but also speed up detection and make vaccine development cheaper. On the 80000hours.org podcast, Andy Weber discusses how bioweapons might become obsolete. Similar trends might apply to AI.
Overall, it seems this is not as clear-cut a case as you believe it to be.