FWIW, I just wrote a midterm paper in the format of a memo to the director of OSTP on basically this topic (“The potential growth of AI capabilities and catastrophic risks by 2045”). One of the most frustrating aspects of writing the paper was trying to find externally-credible sources for claims with which I was broadly familiar, rather than links to e.g., EA Forum posts. I think it’s good to see the conceptual explainers, but in the future I would very much hope that a heavily time-constrained but AI-safety-conscious staffer can more easily find credible sources for details and arguments such as the analogy to human evolution, historical examples and analyses of discontinuities in technological progress, the scaling hypothesis and forecasts based heavily on compute trends (please, Open Phil, something other than a Google Doc! Like, can’t we at least get an arXiv link?), why alignment may be hard, etc.
I guess my concern is that as the explainer guides proliferate, it can become harder to find the guides that actually emphasize/provide credible sources… This concern probably doesn’t make new guides net negative, but I think it could potentially be mitigated, maybe by having clear, up-front links to other explainers which do provide better sources. (Or if there were some spreadsheet/list of “here are credible sources for X claim, with multiple variant phrasings of X claim,” that might also be nice...)
+1, I would like things like that too. I agree that having much of the great object-level work in the field route through forums (alongside a lot of other material that is not so great) is probably not optimal.
I will say though that going into this, I was not particularly impressed with the suite of beginner articles out there — sans some of Kelsey Piper’s writing — and so I doubt we’re anywhere close to approaching the net-negative territory for the marginal intro piece.
One approach to this might be a soft norm of trying to arXiv-ify things that would be publishable on arXiv without much additional effort.
I think that another major problem is simply that there is no one-size-fits-all intro guide. I think I saw some guides by Daniel Eth (or someone else?) and a few other people that were denser than the guide you’ve written here, and yeah the intro by Kelsey Piper is also quite good.
I’ve wondered if it could be possible/valuable to have a curated list of the best intros, and perhaps even to make a modular system, so people can customize better for specific contexts. (Or maybe having numerous good articles would be valuable if eventually someone wanted to and could use them as part of a language model prompt to help them write a guide tailored to a specific audience?)
Interesting points. I’m working on a book which is not quite a solution to your issue but hopefully goes in the same direction. And I’m now curious to see that memo :)
Which issue are you referring to? (External credibility?)
I don’t see a reason not to share the paper, although I will caveat that it definitely was a rushed job. https://docs.google.com/document/d/1ctTGcmbmjJlsTQHWXxQmhMNqtnVFRPz10rfCGTore7g/edit
I was referring to external credibility, if you are looking for a scientific paper with the key ideas. Secondarily, an online, modular guide is not quite the frame of the book either (although it could possibly be adapted towards such a thing in the future).