I don’t think anyone in this discussion, with the partial exception of Rob Bensinger, thinks we’re discussing a pause of the type FLI suggested. And I agree that a facile interpretation of the words leads to that misunderstanding, which is why my initial essay—which was supposed to frame the debate—explicitly tried to clarify that it’s not what anyone is actually discussing.
How much time we need is a critical uncertainty. It seems foolhardy to refuse to build a stop button because we might not need more time.
You say in a different comment that you think we need a significant amount of safety research to make future systems safe. I agree, and think that until that occurs, we need regulation of unsafe systems—which I think we all agree are possible to create. And in the future, even if we can align systems, it’s unlikely that we can make unaligned systems impossible. So if nothing else, a Bing-like deployment of potentially aligned but currently unsafe systems is incredibly likely, especially if strong systems are open-sourced so that people can reverse any safety features.
I think that AI safety research will occur more or less simultaneously with AI capabilities research. I don’t think it’s as simple as saying we need more safety before capabilities. I’d prefer to talk about something like the ratio of spending on capabilities to safety, or the specific regulatory regime we need, rather than how much safety research we need before moving forward with capabilities.
This is not so much a disagreement with what you said, but rather a comment about how I think we should frame the discussion.
I agree that we should be looking at investment, and carefully considering the offense-defense balance of the new technology. Investments in safety seem important, and we should certainly look at how to balance the two sides—but you were arguing against building a stop button, not saying that the real issue is figuring out how much safety research (and, I hope, actual review of models and assurances of safety in each case) is needed before proceeding. I agree with your claim that this is the key issue—which is why I think we desperately need a stop button for the case where it fails, and why I think we can’t build such a button later.
I think Holly Elmore is also asking for an FLI-type pause. If I’m responding to two members of this debate, doesn’t that seem sufficient for my argument to be relevant?
I also think your essay was originally supposed to frame the debate, but no longer serves that purpose. There’s no indication in the original pause post from Ben West that we need to reply to your post.
Tens of thousands of people signed the FLI letter, and many people have asked for an “indefinite pause” on social media and in various articles in the last 12 months. I’m writing an essay in that context, and I don’t think it’s unreasonable to interpret people’s words at face value.
I don’t want to speak for her, but I believe Holly is advocating both for a public response to dangerous systems, via advocacy, and for shifting the default burden of proof onto those building powerful systems. Given that, stopping the most dangerous types of models—those scaled well beyond current capabilities—until companies accept that they need to prove such models are safe before releasing them is critical. That’s certainly not stopping everything for a predefined period of time.
It seems like you’re ignoring other participants’ views by not responding to their actual ideas and claims. (I also think it’s disingenuous to say “there’s no indication in the original pause post,” when that post was written after you and others saw an outline and then a draft of my post, and then started writing things that didn’t respond to it. You didn’t write your post after he wrote his!)
Again, I think you’re pushing a literal interpretation as the only way anyone could support “Pause,” and the people you’re talking to are actively disagreeing. If you want to address that idea, I’ll agree you’ve done so, but I think that continuing to insist you’re talking to someone else, about a different proposal that I agree is a bad idea, will be detrimental to the discussion.
I did write my post after he wrote his, so your claim is false. Also, Ben explicitly told me that I didn’t need to reply to you before I started writing my draft. I’d appreciate it if you didn’t suggest that I’m being disingenuous on the basis of very weak evidence.