Here is a vaguely related rough project proposal I once wrote (apologies for the academic philosophy jargon):
“Implications of evaluative indeterminacy and ‘ontological crises’ for AI alignment
There is a broadly Wittgensteinian/Quinean view which says that we just can’t make meaningful judgments about situations we’re too unfamiliar with (could apply to epistemic judgments, evaluative judgments, or both). E.g. Parfit briefly mentions (and tries to rebut) this in Reasons and Persons before discussing the more science-fiction-y thought experiments about personal identity.
A more moderate variant is the claim that such judgments are at least underdetermined; e.g. perhaps there are adequacy conditions (say, consistency, knowing all relevant facts, …) on the process by which the judgment is made, but the judgment’s content is sensitive to initial conditions or details of the process that are left open.
One reason to believe in such views could be the anticipation that some of our current concepts will be replaced in the future. E.g. perhaps ‘folk psychology’ will be replaced by some sort of scientific theory of consciousness. In LW terminology, this is known as an ‘ontological crisis’.
Due to its unusual spatiotemporal and conceptual scope, the question of how best to shape the long-term future – and so, by extension, AI alignment – depends on many judgments that seem likely candidates for being un(der)determined if one of those views is true, in areas such as: population ethics on astronomical scales; consciousness and moral patienthood of novel kinds of minds; the axiological value of alien or posthuman civilizations; potential divergence between axiologies that currently, and only contingently, roughly converge (e.g. desire satisfaction vs. hedonism); the ethical implications of creating new universes; the ethical implications of ‘acausal effects’ of our actions across the multiverse; etc. (Some of these things might not even make sense upon scrutiny.)
Some of these issues might be about metaethics; some might be theoretical challenges to consequentialism or other specific theoretical views; some might have practical implications for AI alignment or EA efforts to shape the long-term future more broadly.”