
Indirect normativity

Last edit: 16 Mar 2022 20:49 UTC by Pablo

Indirect normativity is an approach to the AI alignment problem that attempts to specify AI values indirectly, such as by reference to what a rational agent would value under idealized conditions, rather than via direct specification.
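The contrast between direct and indirect specification can be sketched in toy code. Everything below is illustrative only: the function and table names are hypothetical, and real proposals (e.g. Christiano's formalization) replace the lookup-table stand-in with a substantive account of idealized deliberation.

```python
# Toy contrast: direct vs. indirect value specification.
# All names and values here are illustrative assumptions, not any
# published proposal.

def direct_utility(outcome: str) -> float:
    """Direct specification: the designer hard-codes values up front,
    and any mistake in the table is locked in."""
    hand_coded = {"paperclip_maximization": 1.0, "human_flourishing": 0.0}
    return hand_coded.get(outcome, 0.0)

def idealized_deliberation(outcome: str) -> float:
    """Stand-in for 'what a rational agent would value under idealized
    conditions' (more time, more information, fewer biases). In a real
    proposal, defining this process is the hard part, not a lookup."""
    considered_judgment = {"paperclip_maximization": 0.0,
                           "human_flourishing": 1.0}
    return considered_judgment.get(outcome, 0.0)

def indirect_utility(outcome: str) -> float:
    """Indirect specification: point at the idealized process rather
    than at the designer's first-pass guess."""
    return idealized_deliberation(outcome)
```

The point of the sketch is structural: `indirect_utility` contains no object-level values of its own; it delegates to a (here, trivially mocked) idealization procedure, which is what "specifying values indirectly" amounts to.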

Further reading

Bostrom, Nick (2014) Superintelligence: Paths, Dangers, Strategies, Oxford: Oxford University Press, ch. 13.

Christiano, Paul (2012) A formalization of indirect normativity, Ordinary Ideas, April 21.

Yudkowsky, Eliezer (2013) Five theses, two lemmas, and a couple of strategic implications, Machine Intelligence Research Institute’s Blog, May 5.

Related links

AI alignment | motivation selection method

Decomposing alignment to take advantage of paradigms

Christopher King, 4 Jun 2023 14:26 UTC
2 points · 0 comments · 4 min read · EA link