Reddit 放出完整的全站投稿資料

Data is complete from January 01, 2008 thru August 31, 2015. Partial data is available for years 2006 and 2007. The reason for this is that the id's used when Reddit was just a baby were scattered a bit -- but I am making an attempt to grab all data from 2006 and 2007 and will make a supplementary upload for that data once I'm satisfied that I've found all data that is available.

約 42GB 的資料，幾乎是公開的資料都包含進去了：

This dataset represents approximately 200 million submission objects with score data, author, title, self_text, media tags and all other attributes available via the Reddit API.