> unless you get Pascal’s Mugged by the word “reliably”
I don’t think it’s a case of Pascal’s Mugging. Given the stakes (extinction), even a 1% risk of a lab leak from a next-gen model is more than enough reason not to build it (I think we’re there already).
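To spell out why I don’t think 1% is mugging territory: a Pascal’s Mugging trades on vanishingly small probabilities, whereas 1% applied to extinction-scale stakes gives an enormous expected harm. A back-of-envelope sketch (the 1% is the figure above; the population number is a rough illustration, not a precise estimate):

$$\mathbb{E}[\text{deaths}] = p \times N \approx 0.01 \times (8 \times 10^{9}) = 8 \times 10^{7}$$

That is tens of millions of deaths in expectation, before even counting the loss of all future generations.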
> who aren’t ML experts, so e.g. not Connor Leahy
Connor Leahy is an ML expert (he co-founded EleutherAI before realising x-risk was a massive issue).
> I don’t see any good reason to expect this to change for GPT-4.5.
To me this sounds like you are expecting scaling laws to break? Or are you not factoring in it being given access to other tools such as planners (AutoGPT etc.) or plugins?
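To be concrete about what I mean by scaling laws holding: here is a minimal sketch using the Chinchilla parametric loss fit from Hoffmann et al. (2022). The constants are their published fit; the parameter and token counts in the loop are purely illustrative assumptions, not figures for any actual GPT model:

```python
# Chinchilla parametric loss fit (Hoffmann et al., 2022):
#   L(N, D) = E + A / N**alpha + B / D**beta
E, A, B, ALPHA, BETA = 1.69, 406.4, 410.7, 0.34, 0.28

def predicted_loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for a model with N parameters trained on D tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# Illustrative (hypothetical) scale-ups, NOT actual figures for any model.
for n, d in [(70e9, 1.4e12), (300e9, 6e12), (1e12, 20e12)]:
    print(f"N={n:.0e} params, D={d:.0e} tokens -> predicted loss ~{predicted_loss(n, d):.2f}")
```

Unless those curves break, more compute keeps buying lower loss and correspondingly more capability, and that is before adding planners or plugins on top.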
> Though I do think it would both be an overreaction and would increase x-risk, so I am pretty strongly against it.
How would it increase x-risk!? We’re not talking about a temporary pause with the potential for an overhang, or a local pause with the potential for less safe actors to race ahead. The only sensible (and, I’d say, realistic) pause is global and indefinite, lasting until there is global consensus on x-safety or a global democratic mandate to proceed, and lifted gradually to avoid sudden jumps in capability.
I think you are also likely to be quite biased if you are, in fact, working for (and being paid good money by) a Major AI Lab (why the anonymity?).
I think the crux boils down to you basically saying “we can’t be certain that it would be very dangerous, therefore we should build it to find out”. This, to me, is totally reckless when the stakes are extinction (we really do not want to be FAFO-ing this! Where is your security mindset?). You don’t seem to put much (any?) weight on lab leaks during training (as a result of emergent situational awareness). “Responsible Scaling” is anything but, in the situation we are now in.
Also, the disadvantages you mention around Goodharting make me think that the only sensible way to proceed is to just shut it all down.
You say that you disagree with Nora over alignment optimism, but then also that you “strongly disagree” with “the premise that if smarter-than-human AI is developed in the near future, then we almost surely die, regardless of who builds it” (Rob’s post). In saying this, I think you are also way too optimistic about alignment work on its current trajectory actually leading to x-safety.
Every comment of yours so far has misunderstood or misconstrued at least one thing I said, so I’m going to bow out now.