You are concerned that nudging the world toward pausing AI progress risks global totalitarianism. I do not share this concern because (setting aside how bad it would be) I think global totalitarianism is extremely unlikely.
That makes sense, but I think that’s compatible with what I wrote in the post:
I think there are two ways of viewing this objection. Either it is an argument against the feasibility of an indefinite pause, or it is a statement about the magnitude of the negative consequences of trying an indefinite pause. I think either way you view the objection, it should lower your evaluation of advocating for an indefinite pause.
Pause-like policy regimes don’t need to be indefinite to be good. Most of the benefit of nudging the world toward pausing comes from paths other than increasing P(indefinite pause).
Sure—but that’s compatible with what I wrote:
In this post I’m mainly talking about an indefinite pause, and left an analysis of brief pauses to others. [ETA: moreover I dispute that a totalitarian world government is “extremely unlikely”.]
I agree with you that indefinite pause is the wrong goal to aim for. It does not follow that “EAs actively push[ing] for a generic pause” has substantial totalitarianism-risk downsides.
That’s reasonable. In this post I primarily argued against advocating an indefinite pause. I said in the introduction that the merits of a brief pause are much more uncertain, and that such a pause may be beneficial. It sounds like you mostly agree with me?
I think you’re trying to argue that all proposals that are being promoted as pauses or moratoriums require that there be no further progress during that time, even on safety. I don’t agree; there exists a real possibility that further research is done, experts conclude that AI can be harnessed safely in specific situations, and we can allow any of the specific forms of AI that are safe.
This seems similar to banning nuclear weapons tests while allowing laboratory experiments that ensure we understand nuclear power well enough to make better power plants. We don’t want or need nuclear bombs tested in order to get the benefits of nuclear power, and we don’t want or need unrestricted misaligned AI in order to build safe systems.
I think you’re trying to argue that all proposals that are being promoted as pauses or moratoriums require that there be no further progress during that time, even on safety.
I don’t think I’m arguing that. Can you be more specific about what part of my post led you to think I’m arguing for that position? I mentioned that during a pause, we will get more “time to do AI safety research”, and said that was a positive. I merely argued that the costs of an indefinite pause outweigh the benefits.
Also, my post was not primarily about a brief pause, and I conceded that “Overall I’m quite uncertain about the costs and benefits of a brief AI pause.” I did argue that a brief pause could lead to an indefinite pause, but I took no strong position on that question.
As I argued in my post, I think that we need a moratorium, and one that would lead to an indefinite set of strong restrictions on dangerous AIs, and continued restrictions and oversight on any types of systems that aren’t pretty rigorously provably safe, forever.
The end goal isn’t a situation where we give up on safety; it’s one where we insist that only safe “human-level” but effectively superhuman systems be built—once we can do that at all, which at present I think essentially everyone agrees we cannot.
As I argued in my post, I think that we need a moratorium, and one that would lead to an indefinite set of strong restrictions on dangerous AIs, and continued restrictions and oversight on any types of systems that aren’t pretty rigorously provably safe, forever.
To be clear, I’m fine with locking in a set of nice regulations that can prevent dangerous AIs from coming about, if we know how to do that. I think the concept of a “pause” or “moratorium”—as it is traditionally understood, and explicitly outlined in the FLI letter—doesn’t merely mean that we should have legal rules for AI development. The standard meaning of “moratorium” is that we should not build the technology at all until the moratorium ends.
The end goal isn’t a situation where we give up on safety; it’s one where we insist that only safe “human-level” but effectively superhuman systems be built—once we can do that at all, which at present I think essentially everyone agrees we cannot.
Presently, the fact that we can’t build safe superhuman systems is mostly a side effect of the fact that we can’t build superhuman systems at all. By itself, that’s pretty trivial, and it’s not surprising that “essentially everyone” agrees on this point. However, I don’t think essentially everyone agrees that superhuman systems will be unsafe by default unless we give ourselves a lot of extra time right now to do safety research—and that seems closer to the claim that I’m arguing against in the post.
I don’t think anyone in this discussion, with the partial exception of Rob Bensinger, thinks we’re discussing a pause of the type FLI suggested. And I agree that a facile interpretation of the words leads to that misunderstanding, which is why my initial essay—which was supposed to frame the debate—explicitly tried to clarify that it’s not what anyone is actually discussing.
How much time we need is a critical uncertainty. It seems foolhardy to refuse to build a stop button because we might not need more time.
You say in a different comment that you think we need a significant amount of safety research to make future systems safe. I agree, and think that until that occurs, we need regulation on systems which are unsafe—which I think we all agree are possible to create. And in the future, even if we can align systems, it’s unlikely that we can make unaligned systems impossible. So if nothing else, a Bing-like deployment of potentially aligned but currently unsafe systems is incredibly likely, especially if strong systems are open-sourced so that people can reverse any safety features.
How much time we need is a critical uncertainty. It seems foolhardy to refuse to build a stop button because we might not need more time.
You say in a different comment that you think we need a significant amount of safety research to make future systems safe. I agree, and think that until that occurs, we need regulation on systems which are unsafe—which I think we all agree are possible to create.
I think that AI safety research will occur more or less simultaneously with AI capabilities research. I don’t think it’s a simple matter of needing more safety research before more capabilities. I’d prefer to talk about something like the ratio of spending on capabilities to safety, or the specific regulatory regime we need, rather than how much safety research we need before moving forward with capabilities.
This is not so much a disagreement with what you said, but rather a comment about how I think we should frame the discussion.
I agree that we should be looking at investment, and carefully considering the offense-defense balance of the new technology. Investments into safety seem important, and we should certainly look at how to balance the two sides—but you were arguing against building a stop button, not saying that the real issue is that we need to figure out how much safety research (and, I hope, actual review of models and assurances of safety in each case) is needed before proceeding. I agree with your claim that this is the key issue—which is why I think we desperately need a stop button for the case where it fails, and think we can’t build such a button later.
I don’t think anyone in this discussion, with the partial exception of Rob Bensinger, thinks we’re discussing a pause of the type FLI suggested. And I agree that a facile interpretation of the words leads to that misunderstanding, which is why my initial essay—which was supposed to frame the debate—explicitly tried to clarify that it’s not what anyone is actually discussing.
I think Holly Elmore is also asking for an FLI-type pause. If I’m responding to two members of this debate, doesn’t that seem sufficient for my argument to be relevant?
I also think your essay was originally supposed to frame the debate, but no longer serves that purpose. There’s no indication in the original pause post from Ben West that we need to reply to your post.
Tens of thousands of people signed the FLI letter, and many people have asked for an “indefinite pause” on social media and in various articles in the last 12 months. I’m writing an essay in that context, and I don’t think it’s unreasonable to interpret people’s words at face value.
I don’t want to speak for her, but I believe Holly is advocating both for a public response to dangerous systems, via advocacy, and for shifting the default burden of proof toward those building powerful systems. Given that, stopping the most dangerous types of models—those scaled well beyond current capabilities—until companies accept that they need to prove such models are safe before releasing them is critical. That’s certainly not stopping everything for a predefined period of time.
It seems like you’re ignoring other participants’ views by not responding to their actual ideas and claims. (I also think it’s disingenuous to say “there’s no indication in the original pause post,” when that post was written after you and others saw an outline and then a draft of my post, and then started writing things that didn’t respond to it. You didn’t write your post after he wrote his!)
Again, I think you’re pushing a literal interpretation as the only way anyone could support “Pause,” and the people you’re talking to are actively disagreeing. If you want to address that idea, I will agree you’ve done so, but I think that continuing to insist you’re talking to someone else, discussing a different proposal that I agree is a bad idea, will be detrimental to the discussion.
I also think it’s disingenuous to say “there’s no indication in the original pause post,” when that post was written after you and others saw an outline and then a draft of my post, and then started writing things that didn’t respond to it. You didn’t write your post after he wrote his!
I did write my post after he wrote his, so your claim is false. Also, Ben explicitly told me that I didn’t need to reply to you before I started writing my draft. I’d appreciate it if you didn’t suggest that I’m being disingenuous on the basis of very weak evidence.
I agree with you that some alternatives to “pause” or “indefinite pause” are better
I’m agnostic on what advocacy folks should advocate for; I think advocating indefinite pause is net-positive
I disagree on P(global totalitarianism for AI pause); I think it is extremely unlikely
I disagree with some vibes, like your focus on the downsides of totalitarianism (rather than its probability) and your “presumption in favor of innovation” even for predictably dangerous AI; they don’t seem to be load-bearing for your precise argument but I think they’re likely to mislead incautious readers
I agree with you that some alternatives to “pause” or “indefinite pause” are better
Thanks for clarifying. Assuming those alternative policies compete for attention and trade off against each other in some non-trivial way, I think that’s a pretty big deal.
I think advocating indefinite pause is net-positive
I find it interesting that, in this case, you seem to think that advocacy for X is good even if X is bad. Maybe this is a crux for me? I think EAs shouldn’t advocate bad things just because we think we’ll fail at getting them, and will get some separate good thing instead.
I never said “indefinite pause” was bad or net-negative. Normally I’d say it’s good but I think it depends on the precise definition and maybe you’re using the term in a way such that it’s actually bad.
Clearly sometimes advocacy for a bad thing can be good. I’m just trying to model the world correctly.
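The claim that advocating for X can be net-positive even when X itself would be bad comes down to an expected-value calculation over the outcomes the advocacy actually produces. The sketch below makes that structure explicit with entirely made-up probabilities and values; neither commenter endorses these numbers, and the outcome labels are illustrative assumptions only.

```python
# Toy expected-value decomposition of "advocacy for X" (all numbers are
# made up for illustration): advocacy can be net-positive in expectation
# even if outcome X itself is assumed to be bad, because the advocacy
# mostly produces other outcomes.

outcomes = {
    # outcome: (probability given advocacy, value if it happens)
    "indefinite pause actually enacted": (0.01, -100.0),  # X itself, assumed bad here
    "milder safety regulation instead":  (0.30, +10.0),
    "no policy change":                  (0.69, 0.0),
}

expected_value = sum(p * v for p, v in outcomes.values())
print(f"EV of advocacy = {expected_value:+.2f}")  # prints +2.00 with these numbers
```

Whether the result is positive or negative depends entirely on the assumed probabilities and values, which is where the actual disagreement lies.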
Zach, in a hypothetical world that pauses AI development, how many years do you think it would take medical science, at its current rate of progress (which is close to zero), to find:
(1) treatments for aging
(2) treatments for all forms of dementia?
And once treatments are found, what about the practical difficulty of actually carrying them out? Manipulating the human body is extremely dangerous and risky. Ultimately, all ICUs fail: their patients will always eventually enter a complex failure state that current doctors don’t have the tools or knowledge to stop. (They “always fail” in the sense that if you release ICU patients and wait a few years, they come back, and eventually they will die there.)
It is possible that certain hypothetical medical procedures, like a series of transplants to replace an entire body or editing adult genes across entire organs, are impossible for human physicians to perform without an unacceptable mortality rate. In the same way, there are aircraft that human pilots can’t actually fly; it takes automation and algorithms to fly them at all.
What I am trying to say is that a world free of aging and death is possible, but it is perhaps 50-100 years away with ASI, and 1,000+ years away in AI-pause worlds. (Possibly quite a bit longer than 1,000 years; see the repression of technology in China.)
It seems like if your mental discount rate gives non-negligible weight to people who will exist more than 1,000 years from now, you could support an AI pause. Is this the crux of it? If a human alive today is worth 1.0, what is the worth of someone who might exist in 1,000 years?
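To make the arithmetic behind that question concrete, here is a minimal sketch assuming a constant annual discount rate compounded over 1,000 years. The exponential form and the example rates are illustrative assumptions; the comment above does not commit to any particular discounting model.

```python
# How much a constant annual discount rate shrinks the weight of a person
# living 1,000 years from now, if a person alive today is weighted 1.0.
# The exponential form and the sample rates are assumptions for illustration.

def discounted_weight(annual_rate: float, years: int) -> float:
    """Weight of one future person under constant exponential discounting."""
    return (1.0 - annual_rate) ** years

for rate in (0.0, 0.001, 0.01, 0.05):
    print(f"rate={rate:.3%}  weight after 1000 years = {discounted_weight(rate, 1000):.2e}")

# Output (approximately):
# rate=0.000%  weight after 1000 years = 1.00e+00
# rate=0.100%  weight after 1000 years = 3.68e-01
# rate=1.000%  weight after 1000 years = 4.32e-05
# rate=5.000%  weight after 1000 years = 5.29e-23
```

Even a 1% annual rate drives the weight of someone 1,000 years out to roughly 4×10⁻⁵, while a zero rate keeps it at 1.0, so the choice of discount rate does most of the work in this kind of argument.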
I never said “indefinite pause” was bad or net-negative. Normally I’d say it’s good but I think it depends on the precise definition and maybe you’re using the term in a way such that it’s actually bad.
In that case, I do think the arguments in the post probably address your beliefs. I think the downsides of doing an indefinite pause seem large. I’m curious if you have any direct reply to these arguments, even if you think that we are extremely unlikely to do an indefinite pause.
Clearly sometimes advocacy for a bad thing can be good.
I agree, but as a general rule, I think EAs should be very suspicious of arguments that assert that X is bad but that advocating for X is good.