Friday Dec 05, 2025

DataRater: Meta-Learned Dataset Curation

In this episode, we discuss "DataRater: Meta-Learned Dataset Curation" by Dan A. Calian, Gregory Farquhar, Iurii Kemaev, Luisa M. Zintgraf, Matteo Hessel, Jeremy Shar, Junhyuk Oh, András György, Tom Schaul, Jeffrey Dean, Hado van Hasselt, and David Silver. The paper proposes DataRater, a meta-learning approach that estimates the value of individual training data points in order to improve dataset curation. DataRater is trained with meta-gradients, with the objective of improving training efficiency as measured on held-out data. In the paper's experiments, filtering data with DataRater significantly improves compute efficiency across a range of model scales and datasets.
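For listeners who want a concrete picture of what "rating data with meta-gradients" can mean, here is a toy sketch. It is not the paper's implementation, and everything in it is an illustrative assumption: a one-parameter linear model, one learnable rating logit per training example (rather than a rater network), sigmoid weights, a single inner gradient step, and a hand-derived chain rule in place of automatic differentiation. The function name `meta_train_ratings` and all hyperparameters are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def meta_train_ratings(train, val, inner_lr=0.01, meta_lr=0.5, meta_steps=200):
    """Learn one rating per training example from a one-step meta-gradient.

    Toy assumptions: model is y = w * x, loss is squared error, and the
    inner model is reset to w = 0 at every meta-step.
    """
    logits = [0.0] * len(train)
    for _ in range(meta_steps):
        w = 0.0
        weights = [sigmoid(p) for p in logits]
        # Per-example gradient of (w * x - y)^2 with respect to w.
        grads = [2.0 * (w * x - y) * x for x, y in train]
        # Inner step: gradient descent with rating-weighted examples.
        w_new = w - inner_lr * sum(a * g for a, g in zip(weights, grads))
        # Gradient of the held-out loss with respect to the updated weight.
        g_val = sum(2.0 * (w_new * x - y) * x for x, y in val) / len(val)
        # Chain rule through the inner step:
        # dL_val/dlogit_i = g_val * d(w_new)/d(logit_i).
        for i, (a, g) in enumerate(zip(weights, grads)):
            dw_dlogit = -inner_lr * g * a * (1.0 - a)
            logits[i] -= meta_lr * g_val * dw_dlogit
    return [sigmoid(p) for p in logits]


# Four clean examples (y = 2x) followed by two with flipped labels (y = -2x).
train = [(1.0, 2.0), (2.0, 4.0), (1.0, 2.0), (2.0, 4.0), (1.0, -2.0), (2.0, -4.0)]
# Held-out data is clean.
val = [(1.0, 2.0), (3.0, 6.0)]

ratings = meta_train_ratings(train, val)
# Keep only examples whose learned rating is above one half.
kept = [example for example, r in zip(train, ratings) if r > 0.5]
```

In this toy setup the clean examples pull the model toward the held-out data and the flipped-label examples push it away, so the ratings of the former rise and those of the latter fall. Filtering then amounts to thresholding the learned ratings, which is the general idea the episode summary describes, under the simplifications listed above.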
