If you apply a security mindset (Murphy’s Law) to the problem of AI alignment, it should quickly become apparent that the problem is very difficult.
FYI I disagree with this. I think that the difficulty of alignment is a complicated and open question, not something that is quickly apparent. In particular, security mindset is about beating adversaries, and it’s plausible that we will train AIs in ways that mostly prevent them from treating us as adversaries.
Interesting perspective, although I’m not sure how much we actually disagree. “Complicated and open”, to me, reads as “difficult” (i.e. the fact that it is still open means it has remained unsolved, for ~20 years now).
And re “adversaries”, I feel like this is not really what I’m thinking of when I apply security mindset to transformative AI (for the most part; see the next paragraph). “Adversary” seems to put too much (malicious) intent into the actions of the AI. Another way of thinking about misaligned transformative AI is as a super-powered computer virus that is in some ways an automatic process, and kills us (manslaughters us?) as collateral damage. It seeps through every hole that isn’t patched. So eventually, in the limit of superintelligence, all the doom flows through the tiniest crack in otherwise perfect alignment (the tiniest crack in our “defences”).
However, having said that, the term “adversaries” is entirely appropriate when thinking of human actors who might maliciously use transformative AI to cause doom (misuse risk, as referred to in the OP). Any viable alignment solution needs to prevent this from happening too! (Because we now know there will be no shortage of such threats.)
Interesting perspective, although I’m not sure how much we actually disagree. “Complicated and open”, to me reads as “difficult”
Is there a rephrasing of the initial statement you would endorse that makes this clearer? I’d suggest “If you apply a security mindset (Murphy’s Law) to the problem of AI alignment, it should quickly become apparent that we do not currently possess the means to ensure that any given AI is safe.”
Yes, I would endorse that phrasing (maybe s/”safe”/”100% safe”). Overall I think I need to rewrite and extend the post to spell things out in more detail, and change the title to something less provocative[1], because I get the feeling that people are knee-jerk downvoting without even reading it, judging by some of the comments (i.e. I’m having to repeat things I already refer to in the OP).
[1] Perhaps “Why the most likely outcome of AGI is doom”?