Fueling Machine Learning Research

Yahoo has made its massive Yahoo News Feed data set publicly available to spur machine learning research. The data set consists of anonymized user interactions on Yahoo services, including Yahoo News, Yahoo Sports, and Yahoo Finance. At 13.5 terabytes, it contains approximately 110 billion interactions from 20 million users, covering February 2015 to May 2015. Yahoo published the data to help academics working in data science, who often lack access to large-scale data sets that can help train machine learning algorithms and lead to new analytic techniques.