PhD candidate at Goldsmiths College, University of London. Title: ‘Reasons for Persons, or the Good Successor Problem’. Abstract: AI alignment aims for advanced machine intelligences that preserve and enhance human welfare (appropriately defined). In a narrow sense, this includes not reflecting existing societal biases or destabilising political systems; more broadly, it could also mean not creating conditions that result in the extinction or disempowerment of humanity. This project tries to define an alternative, speculative vision of alignment: one that relaxes the assumption (tacit in some alignment discourse) that humans must indefinitely retain control over the future. My exploration instead aims at an ambitious and catholic notion of value (that is, avoiding maximisers of the squiggle/paperclip/hedonium varieties and fleshing out sources of value that don’t hinge on a biological human subject). I draw upon philosophy (moral realism, population ethics, and decision theory) and aesthetic theory (can the human tendency to make and appreciate aesthetic products, something broadly shared across cultures, be generalised to AIs, or is it a contingent, evolutionarily useful practice that arose amongst a particular set of primates?). The project also has an empirical aspect: to the extent possible, I want to enrich these speculations with experiments using LLMs in multi-agent setups.
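As a first pass at that empirical side, here is a minimal sketch (in Python) of the kind of multi-agent setup I have in mind: two LLM agents with opposed system prompts debating in alternating turns. Everything here is hypothetical scaffolding; `call_model`, `run_dialogue`, and the prompt roles are my own placeholder names, and `call_model` merely stands in for whatever chat-completion client one actually uses.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def call_model(messages: List[Message]) -> str:
    """Hypothetical LLM call; replace with a real chat-completion client."""
    raise NotImplementedError


def run_dialogue(
    system_a: str,
    system_b: str,
    opener: str,
    turns: int = 4,
    model: Callable[[List[Message]], str] = call_model,
) -> List[str]:
    """Alternate two agents (e.g. a 'human values' advocate and a
    'successor values' advocate) for a fixed number of turns, each
    responding to the other's last message."""
    transcript: List[str] = [opener]
    history_a: List[Message] = [{"role": "system", "content": system_a}]
    history_b: List[Message] = [{"role": "system", "content": system_b}]
    for turn in range(turns):
        # Even turns speak as agent A, odd turns as agent B.
        speaker_history = history_a if turn % 2 == 0 else history_b
        speaker_history.append({"role": "user", "content": transcript[-1]})
        reply = model(speaker_history)
        speaker_history.append({"role": "assistant", "content": reply})
        transcript.append(reply)
    return transcript
```

The interesting experimental variables would then live in the system prompts and in how one scores the resulting transcripts, rather than in this loop itself.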
ukc10014