“ASI x-safety” might be a better term for other reasons (though Nate objects to it here), but by default, I don’t think we should be influenced in our terminology decisions by ‘term T will cause some alignment researchers to have falser beliefs and pursue dumb-but-harmless strategies, and maybe this will be good’. (If anything, by default that should be a reason not to adopt a term.)
Whether current terms cause muddle and misunderstanding doesn’t change my view on this. If they do, IMO we should consider switching to a new term in order to reduce the muddle and misunderstanding. We shouldn’t strategically confuse and mislead people in a new direction just because we accidentally confused or misled people in the past.
What are some better options? Or, what are your current favourites?
“AGI existential safety” seems like the most popular relatively-unambiguous term for “making the AGI transition go well”, so I’m fine with using it until we find a better term.
I think “AI alignment” is a good term for the technical side of differentially producing good outcomes from AI, though it’s an imperfect term insofar as it collides with Stuart Russell’s “value alignment” and Paul Christiano’s “intent alignment”. (The latter, at least, better subsumes a lot of the core challenges in making AI go well.)
Perhaps using “doom” more could work (doom encompasses extinction, permanent curtailment of future potential, and fates worse than extinction).