About: Reinforcement learning is increasingly applied to social, data-limited environments where agents must interact with others without the luxury of online experimentation. However, standard offline RL benchmarks are non-social. This seminar introduces Molten Pot, a benchmark for offline mixed-motive social RL that spans five substrates from DeepMind’s Melting Pot environment and includes ~1 TB of trajectory data. Testing agents across three levels of social complexity reveals a significant ‘social robustness gap’.
Hi everyone! Make sure to join the Cooperative AI Foundation’s next seminar: