Digg switches to "NoSQL" Cassandra

The Digg developers have announced that they are abandoning MySQL, the database that has powered the social link-sharing site, and are moving to Cassandra, a distributed, scalable "NoSQL" database. The move is another win for the NoSQL movement, which looks to move from centralised SQL-based databases to distributed key/value stores better suited to large-scale web applications.

Cassandra was open sourced by Facebook in 2008 and is currently being developed as an Apache Incubator project. According to the Cassandra developers, it is in use at Rackspace, Facebook, Twitter and other companies with large, active data sets. The system offers a fault-tolerant, highly available, decentralised data store that can be scaled up by adding hardware nodes. Cassandra implements an "eventually consistent" model, which trades off consistency between the data stores in the system for availability.
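
To make the "eventually consistent" trade-off concrete, here is a minimal toy sketch in Python. It is not Cassandra's implementation or API: the `Replica` and `ToyCluster` classes, the `propagate` step and the key name are all assumptions made up for illustration. A write is accepted as soon as one replica has it, so a read from another replica may briefly return stale data until the write has been delivered everywhere.

```python
class Replica:
    """One hypothetical node holding key -> (timestamp, value)."""

    def __init__(self):
        self.data = {}

    def write(self, key, timestamp, value):
        # Keep only the newest version of a key (last write wins).
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


class ToyCluster:
    """Illustrative cluster: writes reach one replica now, the rest later."""

    def __init__(self, replica_count=3):
        self.replicas = [Replica() for _ in range(replica_count)]
        self.pending = []  # undelivered writes: (replica_index, key, ts, value)
        self.clock = 0     # simple logical clock used as the write timestamp

    def write(self, key, value):
        self.clock += 1
        # The write is acknowledged once the first replica has applied it.
        self.replicas[0].write(key, self.clock, value)
        for index in range(1, len(self.replicas)):
            self.pending.append((index, key, self.clock, value))

    def read(self, key, replica_index):
        return self.replicas[replica_index].read(key)

    def propagate(self):
        # Deliver all outstanding writes; afterwards the replicas agree.
        for index, key, timestamp, value in self.pending:
            self.replicas[index].write(key, timestamp, value)
        self.pending.clear()


cluster = ToyCluster(replica_count=3)
cluster.write("story:42:diggs", 10)
print(cluster.read("story:42:diggs", 0))  # 10 (replica that took the write)
print(cluster.read("story:42:diggs", 2))  # None (stale: write not delivered yet)
cluster.propagate()
print(cluster.read("story:42:diggs", 2))  # 10 (replicas have converged)
```

The cluster stays writable even when some replicas lag behind, which is the availability side of the trade. The cost is the window in which different replicas give different answers.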

The Digg developers looked at a number of open source NoSQL stores and decided to go with Cassandra, in part because of its column orientation, which allows for the storage of reasonably structured data. They tested Cassandra by replacing a critical component on the live Digg site in September 2009 and, based on the results of that experiment, have now moved most of Digg's functionality to Cassandra as the primary data store. The developers have also been working on Cassandra itself, implementing performance improvements such as faster compaction and row-level caching, and are open sourcing their work.
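
As a rough picture of what column orientation means for "reasonably structured" data, the following Python sketch models a column family as rows keyed by a row key, each holding its own set of named columns. This is an illustration only, not Cassandra's API or Digg's schema: the `ToyColumnFamily` class, its methods, and the row and column names are all hypothetical.

```python
class ToyColumnFamily:
    """Hypothetical in-memory model: row key -> {column name -> value}."""

    def __init__(self):
        self.rows = {}

    def insert(self, row_key, column, value):
        # Rows are created on demand and need not share the same columns.
        self.rows.setdefault(row_key, {})[column] = value

    def get(self, row_key, column):
        return self.rows.get(row_key, {}).get(column)

    def get_slice(self, row_key, start, finish):
        # Return the columns whose names fall in [start, finish], in name order.
        row = self.rows.get(row_key, {})
        return [(name, row[name]) for name in sorted(row)
                if start <= name <= finish]


stories = ToyColumnFamily()
stories.insert("story:42", "title", "Digg switches to Cassandra")
stories.insert("story:42", "submitter", "alice")
stories.insert("story:42", "digg_count", 10)
stories.insert("story:43", "title", "Another story")  # fewer columns is fine

print(stories.get("story:42", "submitter"))       # alice
print(stories.get_slice("story:42", "a", "t"))    # columns named from "a" to "t"
```

Each row carries some structure in the form of named columns, but there is no fixed table schema, which sits between a plain key/value store and a relational table.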