While I’m on board with concerns about even aligned AI being aligned with the “wrong” people, I generally dislike using the phrase “the real X” to suggest, often without any justification, that there can be only one “real” X. It creates a false conflict between goals that are arguably complementary. You can’t, after all, guarantee that an AI will be compassionate to animals if you can’t guarantee anything at all about its behaviour or long-term plans.
Fair point. It’s more nuanced later on: “Almost all of the conversations about risk have to do with the potential consequences of AI systems pursuing goals that diverge from what they were programmed to do and that are not in the interests of humans. Everyone can get behind this notion of AI alignment and safety, but this is only one side of the danger. Imagine what could unfold if AI does do what humans want.”