To stick with your analogy, each time we do evals we thin out the fog a bit, with the intention of clearing it before we reach the edge, as well as improving our ability to stop.
How does doing evals improve your ability to stop? What concrete actions will you take when an eval shows a dangerous result? Do none of them overlap with pausing?
Evals showing dangerous capabilities (such as how to build a nuclear weapon) can be used to convince lawmakers that this stuff is real and imminent.
Of course, you don’t need that if lawmakers already agree with you – in that case, it’s strictly best to not tinker with anything dangerous.
But assuming that many lawmakers will remain skeptical, one function of evals could be “drawing out an AI warning shot, making it happen in a contained and controlled environment where there’s no damage.”
Of course, we wouldn’t want evals teams to come up with AI capability improvements, so evals shouldn’t become dangerous AI gain-of-function research. Still, it’s a spectrum: even clever prompting or small tricks can sometimes unearth hidden capabilities that the model had all along, and that’s exactly the sort of thing evals should warn us about.