Is there a plausible (but perhaps impractical?) steelman of “merging with the AI”: building an AI with direct access to your values/preferences and optimizing for those by design? It could simulate the outcomes of candidate actions, query your judgments to filter them, and then you choose. Maybe you would need separate simulations of your brain to keep up. The AI is trained with simulation accuracy as its objective, and is not an RL agent.
Maybe it could show you something to fool you or otherwise take over?
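The loop proposed in the parent comment could be sketched roughly as follows. This is a toy illustration, not an implementation: `simulate`, `human_judgment`, and `human_choice` are hypothetical stand-ins for a prediction-trained world model, a queried human (or brain-simulation proxy), and the human's final pick.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Outcome:
    """A simulated result of taking some action (placeholder)."""
    description: str

def choose_action(
    candidate_actions: List[str],
    simulate: Callable[[str], Outcome],         # model trained on simulation accuracy, not RL
    human_judgment: Callable[[Outcome], bool],  # your judgment, queried to filter outcomes
    human_choice: Callable[[List[str]], str],   # you make the final choice among approved actions
) -> str:
    # Filter: keep only actions whose simulated outcome you approve of.
    approved = [a for a in candidate_actions if human_judgment(simulate(a))]
    if not approved:
        raise ValueError("no action passed the human filter")
    # The human, not the AI, selects from the approved set.
    return human_choice(approved)

# Toy usage with stub callables:
action = choose_action(
    ["plant a garden", "launch untested rockets"],
    simulate=lambda a: Outcome(a),
    human_judgment=lambda o: "untested" not in o.description,
    human_choice=lambda acts: acts[0],
)
# action == "plant a garden"
```

Note that the deception worry above lives inside `simulate`'s output: the filter is only as trustworthy as what the human is shown.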
Just pick a human to upload and let them recursively improve themselves into an SAI. If they’re smart enough to start with, they might be able to keep their goals intact throughout the process.
(This isn’t a strategy I’d choose given any decent alternative, but it’s better than nothing. It’s likely to be irrelevant, though, since it looks like we’re going to get AGI before we’re even close to being able to upload a human.)