The performance of machine learning models is closely tied to the amount of training data, the compute used in training, and the number of parameters. At Epoch, we’re investigating the key inputs that enable today’s AI systems to reach new heights.
Our recently expanded Parameter, Compute and Data Trends database traces these details for hundreds of landmark ML systems and research papers.
Building on a model and parameter dataset we first introduced in 2021, we’ve collected new data or revised existing records for over 400 systems, enriching them with extra details. In the past six months, we’ve added 240 new language models and 170 compute estimates.
We will continue to maintain this dataset, filling in more historical information and adding significant new releases. It’s a valuable resource for journalists, academics, policymakers, and anyone interested in understanding the trajectory of AI.
Explore the interactive visualization, check out the documentation, and access the data for your own research at epochai.org/data/pcd.
I haven’t explored the new database in depth, but the site looks really cool! I really love the visualizations/interactive charts. Thanks for making and sharing this.
Minor/quick suggestion: have you considered adding a screenshot (or two) to this post?