Web Scale Analytics Reading List

Big data is taking the world by storm, and with it comes an explosion of new ideas and technologies that aim to help us understand what the data is telling us. With VLDB 2012 underway, I decided to take another look at the literature to see what advances are out there and to refresh on the classics. The result of this deep dive is the web scale analytics reading list below. The list is grouped at a high level into column oriented database solutions and online analytical processing (OLAP) solutions. Column oriented databases are far more powerful but also more complicated to implement. As a result, much of the work on them is being done at companies that are building one as a product, the obvious exception being Google. OLAP systems, on the other hand, are less powerful but simpler to implement. For this reason, we see a variety of companies rolling their own solutions in response to their growing analytics problems.

Column Oriented Databases

Brighthouse

An architectural overview of the Brighthouse (now Infobright) database