Could you say a little bit about how this approach compares to Christiano’s Iterated Amplification?
To my mind they are fully complementary: Iterated Amplification is a general scheme for AI alignment, whereas this post describes an application area where we could use and learn more about various alignment schemes. I personally think using amplification for aligning recommender systems is very much worth trying. It would have great direct positive effects if it worked, and the experiment would shed light on the viability of the scheme as a whole.
Thanks. I guess I’m fuzzy on what your actual research proposal is.
Are you proposing to implement an Iterated Amplification approach on existing recommender systems?
Or are you more agnostic about specific implementations? (“Hey, better alignment of recommender systems seems important, but we don’t yet know what to do about that specifically.”)
Definitely the latter. Though I would frame it more optimistically as “better alignment of recommender systems seems important, there are a lot of plausible solutions out there, let’s prioritize them and try out the few most promising ones”. Actually doing that prioritization was out of scope for this post, but it is definitely something we want to do, and we are looking for collaborators on it.