Hi Zed! Thanks for your post. A couple of responses:
"As critics of the long-termist viewpoint have noted, the base-rate for human extinction is zero."
Yes, but this is tautologically true: Only in worlds where humanity hasn't gone extinct could you make that observation in the first place. (For a discussion of this and some tentative probabilities, see https://www.nature.com/articles/s41598-019-47540-7)
"Instead of outlandish ideas of a new global government capable of unilaterally curtailing compute power or some other factor through force, we should focus on what is practically achievable today. Encouraging firms like OpenAI to red-team their models before release, for example, is practical and limits negative externalities."
Why are the two mutually exclusive? I think you're opening a false dichotomy: as far as I know, x-risk oriented folks are amongst the leading voices calling for red teams or even engaging in this work themselves. (See also: https://forum.effectivealtruism.org/posts/Q4rg6vwbtPxXW6ECj/we-are-fighting-a-shared-battle-a-call-for-a-different)
"Let's assume for a moment that domain experts who warn of imminent threats to humanity's survival from AI are acting in good faith and are sincere in their convictions."
The way you phrase this makes it sound like we have reason to doubt their sincerity. I'd love to hear what makes you think we do!
"For example, a global pause in model training that many advocated for made no reference to the idea's inherent weakness: that is, it sets up a prisoner's dilemma in which the more AI firms voluntarily agree to pause research, the greater the incentive for any one group to defect from the agreement and gain a competitive edge. It makes no mention of practical implementation, nor does it explain how it arrived on its pause time-duration; nor does it recognize the improbability of enforcing a global treaty on AI."
My understanding is that even strong advocates of a pause are aware of its shortcomings and communicate these uncertainties rather transparently; I have yet to meet someone who sees them as a panacea. Granted, the questions you ask need to be answered, but the fact that an idea is thorny and potentially difficult to implement doesn't make it a bad one per se.
"A strict international regime dedicated to preventing proliferation still failed to prevent India, Israel, Pakistan, North Korea, and, likely, Iran from acquiring weapons."
Are you talking about the NPT or the IAEA here? My expertise on this is limited (~90 hours of engagement), but I authored a case study on IAEA safeguards this summer and my overall takeaway was that domain experts like Carl Robichaud still consider these regimes success stories. I'd be curious to hear where you disagree! :)
Thanks for the thoughtful response.
On background extinction rates, rather than go down that rabbit hole, I think my point still stands: any estimate of human extinction risk needs to be rooted in some historical analysis, whether that is a one-in-87,000 chance of Homo sapiens going extinct in any given year, as the Nature piece suggests, or something revised up or down from there.
On false dichotomies: I'd set aside individual behavior for a moment and look at the macro picture. We know from political science basics that elites can meaningfully shift public opinion on issues of low salience. According to Pew, we've seen a 15-point shift in the general public expressing "more concern than excitement" over AI in the United States. Rarely do we see such a marked shift in opinion on any particular issue in such a divided electorate.
Let's put it this way: in a literal sense, yes, one could loudly espouse a belief that AI could destroy humanity within a decade and at the same time, advocate for rudimentary red-teaming to keep napalm recipes out of an LLM's response, but, in practice, this seems to defy common sense and ignores the effect on public opinion.
Imagine we're engineers at a new electric vehicle company. At an all-hands meeting, we discuss one of the biggest issues with the design, the automatic trunk release. We're afraid people might get their hands caught in it. An engineer pipes up and says, "while we're talking about flaws, I think there's a chance that the car might explode and take out a city block." Now, there's nothing stopping us from looking at the trunk release and investigating spontaneous combustion, but in practice, I struggle to imagine those processes happening in parallel in a meaningful way.
Coming back to public opinion, we've seen what happens when novel technology gains motivated opponents, from nuclear fission to genetic engineering, to geoengineering, to stem cell research, to gain of function research, to autonomous vehicles, and on. Government policy responds to voter sentiment, not elite opinion. And fear of the unknown is a much more powerful driver of behavior than a vague sense of productivity gains. My sense is that if we continue to see elites writing op-eds on how the world will end soon, we'll see public opinion treat AI like it treats GMO fruits and veg.
My default is to assume folks are sincere in their convictions (and data shows most people are). I should have clarified that line; it was in reference to claims that outfits calling for AI regulation are cynically putting up barriers to entry and on a path to rent-seeking.
On the pause being a bad idea: my point here is that the very conception is foolish at the strategic level, not that it has practical implementation difficulties. First, what would change in six months? And second, why would creating a prisoner's dilemma lead to better outcomes? It would be like soft drink makers asking for a non-binding pause on advertising: it only works if there's consensus and an enforcement mechanism that would impose a penalty on defectors; otherwise, it's even better for me if you stop advertising, and I continue, stealing your market share.
The IAEA and NPT are their own can of worms, but in general, my broader point here is that even a global attempt to limit the spread of nuclear weapons failed. What is the likelihood of imposing a similar regime on a technology that is much simpler to work with? No centrifuges, no radiation, just code and compute power? I struggle to see how creating an IAEA for AI would have a different outcome.
Do you think a permanent* ban on AI research and development would be a better path than a pause? I agree a six-month pause is likely not to do anything, but far-reaching government legislation banning AI just might, especially if we can get the U.S., China, EU, and Russia all on board (easier said than done!).
*nothing is truly permanent, but I would feel much more comfortable with a more socially just and morally advanced human society having the AI discussion ~200 years from now, than for the tech to exist today. Humanity today shouldn't be trusted to develop AI for the same reason 10-year-olds shouldn't be trusted to drive trucks: it lacks the knowledge, experience, and development to do it safely.
Let's look at the history of global bans:
- They don't work for doping in the Olympics.
- They don't work for fissile material.
- They don't prevent luxury goods from entering North Korea.
- They don't work against cocaine or heroin.
We could go on. And those examples are much easier to implement: there's global consensus and law enforcement trying to stop the drug trade, but the economics of the sector mean an escalating war with cartels only leads to greater payoffs for new market entrants.
Setting aside practical limitations, we ought to think carefully before weaponizing the power of central governments against private individuals. When we can identify a negative externality, we have some justification to internalize it. No one wants firms polluting rivers or scammers selling tainted milk.
Generative AI hasn't shown externalities that would necessitate something like a global ban.
Trucks: we know what the externalities of a poorly piloted vehicle are. So we minimize those risks by requiring competence.
And on a morally advanced society: yes, I'm certain a majority of folks if asked would say they'd like a more moral and ethical world. But that's not the question; the question is who gets to decide what we can and cannot do? And what criteria are they using to make these decisions? Real risk, as demonstrated by data, or theoretical risk? The latter was used to halt interest in nuclear fission. Should we expect the same for generative AI?
The question of "who gets to do what" is fundamentally political, and I really try to stay away from politics, especially when dealing with the subject of existential risk. This isn't to discount the importance of politics, only to say that while political processes are helpful in determining how we manage x-risk, they don't in and of themselves directly relate to the issue. Global bans would also be political, of course.
You may well be right that the existential risk of generative AI, and eventually AGI, is low or indeterminate, and theoretical rather than actual. I don't think we should wait until we have an actual x-risk on our hands to act, because then it may be too late.
You're also likely correct on AI development being unstoppable at this point. Mitigation plans are needed should unfriendly outcomes occur, especially with an AGI, and I think we can both agree on that.
Maybe I'm too cautious when it comes to the subject of AI, but part of what motivates me is the idea that, should the catastrophic occur, I could at least know that I did everything in my power to oppose that risk.
These are all very reasonable positions, and one would struggle to find fault with them.
Personally, I'm glad there are smart folks out there thinking about what sorts of risks we might face in the near future. Biologists have been talking about the next big pandemic for years. It makes sense to think these issues through.
Where I vehemently object is on the policy side. To use the pandemic analogy, it's the difference between a research-led investigation into future pandemics and a call to ban the use of CRISPR. It's impractical and, from a policy perspective, questionable.
The conversation around AI within EA is framed as "we need to stop AI progress before we all die." It seems tough to justify such an extreme policy position.