Lukas, thanks for pulling together all these notes. To me, “cooperative AI” stands out and might deserve its own page(s). This terminology covers remarkably broad and disparate pursuits. In the words of Dafoe et al. (mostly of the Cooperative AI Foundation):
“A first cluster consists of AI–AI cooperation, tackling ever more difficult, rich and realistic settings (see ‘Four elements of cooperative intelligence’).”—this is notably the focus of FOCAL@CMU, who are looking at “game theory appropriate for advanced, autonomous AI agents – with a focus on achieving cooperation”.
“A second is AI–human cooperation, for which we will need to advance natural-language understanding, enable machines to learn about people’s preferences, and make machine reasoning more accessible to humans.”—big problems but plenty happening here, of course, with RLHF and research on alignment (representation, etc.).
“A third cluster is work on tools for improving (and not harming) human–human cooperation, such as ways of making the algorithms that govern social media better at promoting healthy online communities.”
This last one seems neglected, in my view, probably because it is an inherently less straightforward and more interdisciplinary problem to tackle. But it’s also arguably the one with the single greatest upside potential. Will MacAskill, in describing “the best possible future”, imagines “technological advances… in the ability to reflect and reason with one another”. Already today, there’s a wealth of social psychology research on what creates connection and cooperation; AI might help put these ideas into practice at scale, enabling us to understand one another, connect, and achieve things together. In a narrow sense, that might help scientists collaborate. In a bigger sense, it might ultimately reverse societal polarization and help unite humankind, in a way that reduces existential risk and increases upside potential more than anything else we could do.