Scaling Laws in Neural Networks (AI Alignment Speaker Series)
Talk by Jared Kaplan, Johns Hopkins / Anthropic
About this talk
Scaling laws for training AI models
AGI models of the future may be scaled up by multiple orders of magnitude compared to current models. Can we extrapolate the behavior of large models from small models? First, Jared will briefly review scaling laws, introduce alignment and Anthropic’s pragmatic definitions and approach, and provide some very simple baselines and their associated scaling trends. Next, he will discuss preference models and various details of their scaling laws. Finally, he will present some results on using preference modeling to train helpful and harmless language models.
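As background for the extrapolation question above, here is a minimal sketch of how a scaling law is typically fit and extrapolated. It uses the power-law form L(N) = c · N^(−α) reported in the scaling-laws literature; the parameter counts and losses below are illustrative placeholders, not figures from the talk.

```python
import numpy as np

# Hypothetical (parameter count, validation loss) pairs for small models.
# These numbers are made up for illustration only.
n_params = np.array([1e6, 3e6, 1e7, 3e7, 1e8])
losses = np.array([5.0, 4.5, 4.0, 3.6, 3.2])

# A power law L(N) = c * N^(-alpha) is a straight line in log-log space,
# so an ordinary least-squares fit recovers its parameters.
slope, intercept = np.polyfit(np.log(n_params), np.log(losses), 1)
alpha, c = -slope, np.exp(intercept)
print(f"fitted law: L(N) = {c:.2f} * N^(-{alpha:.3f})")

# Extrapolate the fit to a model two orders of magnitude larger than
# anything in the fitting set -- the core question of the talk.
n_large = 1e10
print(f"predicted loss at N = {n_large:.0e}: {c * n_large ** slope:.2f}")
```

Whether such extrapolations hold far beyond the fitting range, and what they imply for alignment, is part of what the talk examines.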
About the speaker series
With the advancement of machine learning research and the expected appearance of Artificial General Intelligence in the near future, it is becoming an extremely important problem to positively shape the development of AI and to align AI values with human values.
In this speaker series, we bring state-of-the-art research on AI alignment into focus for audiences interested in contributing to this field. We will kick off the series by closely examining the potential risks engendered by AGIs and making the case for prioritizing the mitigation of those risks now. Later in the series, we will hear more technical talks on concrete proposals for AI alignment.
See the full schedule and register at https://www.harvardea.org/agathon.
You can participate in the talks in person at Harvard and MIT, or remotely via webinar by registering ahead of time (link above). All talks take place at 5 pm EST (2 pm PST, 10 pm GMT) on Thursdays. Dinner is provided at the in-person venues.