Have you had an in-depth discussion with Roman Yampolskiy? (If not, I think you should!)
I think the Overton Window is really shifting on the issue of AGI x-risk now, with it going mainstream. The burden of proof should be on the developers of AGI to prove that it is 100% safe (as opposed to the previous era where it was on the x-risk worriers to prove it was dangerous). Do you have a good post-GPT-4+plugins/AutoGPT (new 2023 era) answer to this question (a mechanistic explanation for why we get an OK outcome, given AGI)?
I’m pushing back against the framing: “this is a suicide race with no benefit from winning.”
If there is a 10% chance of AI takeover, then there is a real and potentially huge benefit from winning the race. But we still should not be OK with someone unilaterally taking that risk.
I agree that AI developers should have to prove that the systems they build are reasonably safe. I don’t think 100% is a reasonable ask, but 90% or 99% seem pretty safe (i.e. robustly reasonable asks).
(Edited to complete cutoff sentence and clarify “safe.”)
Sorry, just to clarify, what do we mean by “safe” here? Clearly a 90% chance of the system not disempowering all of humanity is not sufficient (and neither would a 99% chance, though that’s maybe a bit more debatable), so presumably you mean something else here.
I mean that 90% or 99% seem like clearly reasonable asks, and 100% is a clearly unreasonable ask.
I’m just saying that the argument “this is a suicide race” is really not the way we should go. We should say the risk is >10% and that’s obviously unacceptable, because that’s an argument we can actually win.
Hmm, just to be clear, I think saying “this deployment has a 1% chance of causing an existential risk, so you can’t deploy it” is a pretty reasonable ask.
I agree that I would like to focus on the >10% case first, but I also don’t want to set wrong expectations that I think it’s reasonable at 1% or below.
I agree. When I give numbers I usually say “We should keep the risk of AI takeover beneath 1%” (though I haven’t thought about it very much and mostly the numbers seem less important than the qualitative standard of evidence).
I think that 10% is obviously too high. I think that a society making reasonable tradeoffs could end up with 1% risk, but that it’s not something a government should allow AI developers to do without broader public input (and I suspect that our society would not choose to take this level of risk).
Cool, makes sense. Seems like we are mostly on the same page on this subpoint.
90% or 99% safe is still gambling the lives of 80M-800M humans in expectation (in the limit of scaling to superintelligence). I don’t think it’s acceptable for AI companies, with no democratic mandate, to be unilaterally making that decision!
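For anyone who wants to check the 80M-800M figure, here is a minimal sketch of the arithmetic. It assumes a world population of roughly 8 billion (my assumption, not a number stated in this thread) and treats "doom" as everyone dying; `expected_deaths` is just an illustrative helper name.

```python
# Assumption: world population of roughly 8 billion (not stated in the thread).
WORLD_POPULATION = 8_000_000_000


def expected_deaths(p_catastrophe: float, population: int = WORLD_POPULATION) -> float:
    """Expected lives lost if a catastrophe that kills everyone has probability p_catastrophe."""
    return p_catastrophe * population


# "99% safe" corresponds to a 1% risk, "90% safe" to a 10% risk.
for risk in (0.01, 0.10):
    millions = expected_deaths(risk) / 1e6
    print(f"{risk:.0%} risk: about {millions:.0f}M lives lost in expectation")
```

Under those assumptions the two ends of the range correspond to the 1% and 10% risk levels. The numbers scale linearly with whatever population and probability you plug in.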
Or did you mean to say something to that effect with the truncated sentence “But we still should not be OK with someone”?
Yeah, the sentence cut off. I was saying: obviously a 10% risk is socially unacceptable. Trying to convince someone it’s not in their interest is not the right approach, because doing so requires you to argue that P(doom) is much greater than 10% (at least with some audiences who care a lot about winning a race). Whereas trying to convince policy makers and the public that they shouldn’t tolerate the risk requires meeting a radically lower bar; probably even 1% is good enough.
I think arguing P(doom|AGI) >>10% is a decent strategy. So far I haven’t had anyone give good enough reasons for me to update in the other direction. I think the CEOs in the vanguard of AGI development need to really think about this. If they have good reasons for thinking that P(doom|AGI) ≤ 10%, I want to hear them! To give a worrying example: LeCun is, frankly, sounding like he has no idea of what the problem even is. OpenAI might think they can solve alignment, but their progress on alignment to date isn’t encouraging (this is so far away from the 100% watertight, 0 failure modes that we need). And Google DeepMind are throwing caution to the wind (despite safetywashing their statement with 7 mentions of the word “responsible”/“responsibly”).
The above also has the effect of shifting the public framing toward the burden being on the AI companies to prove their products are safe (in terms of not causing global catastrophe). I’m unsure as to whether the public at large would tolerate a 1% risk. Maybe they would (given the potential upside). But we are not in that world. The risk is at least 50%, probably closer to 99% imo.
Paul, you are saying “50/50 chance of doom” here (on the Bankless podcast). Surely that is enough to be using the suicide race argument!? I mean “it’s not suicide, it’s a coin flip; heads utopia, tails you’re doomed” seems like quibbling at this point. Or at least: you should be explicit when talking to CEOs that you think it’s 50/50 that AGI dooms us!
Re the “you’re” in “you’re doomed”: I used that instead of “we’re doomed”, because when CEOs hear “we’re”, they’re probably often thinking “not we’re, you’re. I’ll be alright in my secure compound in NZ”. But they really won’t! Do they think that if the shit hits the fan with this and there are survivors, the vast majority of those survivors won’t want justice?