As long as state-of-the-art alignment attempts by industry involve eliciting human evaluations of actual or hypothetical AI behaviors (e.g. responses a chatbot might give to a prompt, as in RLHF), it seems important to understand the psychological aspects of such human-AI interactions. I plan to do some experiments on what I call collective RLHF myself, more from a social choice perspective (see http://amsterdam.vodle.it), and can imagine collaborating on similar questions.
Jobst, yes, I think we need a lot more psych research on how to elicit the human values that AI systems are trying to align with. Especially given that some of our most important values either can’t be articulated very well, or are too ‘obvious’ and ‘common-sensical’ to be discussed much, or are embodied in our physical phenotypes rather than articulated in our brains.
This becomes particularly important in human feedback/input about “higher-level” or more “abstract” questions, as in OpenAI’s deliberative mini-public / citizen assembly idea (https://openai.com/blog/democratic-inputs-to-ai).