You’re right, you can’t do it. Therefore we should not call for a pause on systems more powerful than GPT-4, because we can’t reliably calibrate that systems more powerful than GPT-4 would be plausibly very dangerous. </sarcasm>
If you actually engage with what I write, I’ll engage with what you write. For reference, here’s the paragraph around the line you quoted:
We can get much wider support for a conditional pause. Most people can get on board with the principle “if an AI system would be very dangerous, don’t build it”, and then the relevant details are about when a potential AI system should be considered plausibly very dangerous. At one extreme, a typical unconditional pause proposal would say “anything more powerful than GPT-4 should be considered plausibly very dangerous”. As you make the condition less restrictive and more obviously tied to harm, it provokes less and less opposition.
There are also a couple of places where I discuss specific conditional pause proposals; you might want to read those too.
You’re right, you can’t do it. Therefore we should not call for a pause on systems more powerful than GPT-4, because we can’t reliably calibrate that systems more powerful than GPT-4 would be plausibly very dangerous.
The second sentence really doesn’t follow from the first! I would replace “should not” with “should”, and “would” with “would not”[1] [Added 10 Oct: for those reading now, the “</sarcasm>” was not there when I responded]:
We should call for a pause on systems more powerful than GPT-4, because we can’t reliably calibrate that systems more powerful than GPT-4 would not be very dangerous.
We need to employ abundant precaution when dealing with extinction-level threats!
At one extreme, a typical unconditional pause proposal would say “anything more powerful than GPT-4 should be considered plausibly very dangerous”.
I really don’t think this is extreme at all. Outside of the major AI labs and their lobbies (and this community), I think this is mostly regarded as normal and sensible.

[1] and remove the “plausibly”
we can’t reliably calibrate that systems more powerful than GPT-4 would not be very dangerous.
Seems false (unless you get Pascal’s Mugged by the word “reliably”).
We’re pretty good at evaluating models at GPT-4 levels of capability, and they don’t seem capable of autonomous replication. (I believe that evaluation is flawed because ARC didn’t have finetuning access, but we could do better evaluations.) I don’t see any good reason to expect this to change for GPT-4.5.
Or as a more social argument: I can’t think of a single ML expert who’d agree with your claim (including ones who work in alignment, but excluding alignment researchers who aren’t ML experts, so e.g. not Connor Leahy or Eliezer Yudkowsky). Probably some do exist, but it seems like it would be a tiny minority.
I really don’t think this is extreme at all.
I was using the word extreme to mean “one end of the spectrum” rather than “crazy”. I’ve changed it to “end” instead.
Though I do think it would both be an overreaction and would increase x-risk, so I am pretty strongly against it.
unless you get Pascal’s Mugged by the word “reliably”
I don’t think it’s a case of Pascal’s Mugging. Given the stakes (extinction), even a 1% risk of a lab leak for a next gen model is more than enough to not build it (I think we’re there already).
who aren’t ML experts, so e.g. not Connor Leahy
Connor Leahy is an ML expert (he co-founded EleutherAI before realising x-risk was a massive issue).
I don’t see any good reason to expect this to change for GPT-4.5.
To me this sounds like you are expecting scaling laws to break? Or not factoring in it being given access to other tools such as planners (AutoGPT etc.) or plugins.
Though I do think it would both be an overreaction and would increase x-risk, so I am pretty strongly against it.
How would it increase x-risk?! We’re not talking about a temporary pause with potential for an overhang, or a local pause with potential for less safe actors to race ahead. The only sensible (and, I’d say, realistic) pause is global and indefinite, until there is global consensus on x-safety or a global democratic mandate to proceed; and lifted gradually to avoid sudden jumps in capability.
I think you are also likely to be quite biased if you are, in fact, working for (and being paid good money by) a Major AI Lab (why the anonymity?).

Every comment of yours so far has misunderstood or misconstrued at least one thing I said, so I’m going to bow out now.
I think the crux boils down to you basically saying “we can’t be certain that it would be very dangerous, therefore we should build it to find out”. This, to me, is totally reckless when the stakes are extinction (we really do not want to be FAFO-ing this! Where is your security mindset?). You don’t seem to put much (any?) weight on lab leaks during training (as a result of emergent situational awareness). “Responsible Scaling” is anything but, in the situation we are now in.
Also, the disadvantages you mention around Goodharting make me think that the only sensible way to proceed is to just shut it all down.
You say that you disagree with Nora over alignment optimism, but then also that you “strongly disagree” with “the premise that if smarter-than-human AI is developed in the near future, then we almost surely die, regardless of who builds it” (Rob’s post). In saying this, I think you are also way too optimistic about alignment work on its current trajectory actually leading to x-safety.