The commoditization of technology has reached its pinnacle with the advent of the recent paradigm of Cloud Computing. The Infosys Cloud Computing blog is a platform to exchange thoughts, ideas and opinions with Infosys experts on Cloud Computing.

May 31, 2010

With the increasing popularity of social media and cloud-based services, key-value (K-V) stores are gaining prominence. For cloud-based applications and services that require internet-scale data management, K-V databases are increasingly preferred over traditional relational databases. While vendor products in the relational database space have matured, those in the K-V space are still evolving. Today some of the prominent players in the K-V space are Amazon's SimpleDB, Project Voldemort and CouchDB, among others. In this post, let's take a quick look at how K-V stores differ from relational databases.

Scalability: K-V databases are generally more scalable than relational databases. Because data is stored and addressed by key, it can be partitioned and replicated across nodes with little difficulty. In cloud-based services and applications, where data volume can grow explosively with popularity, this is one of the key requirements. For example, popular applications such as Facebook (Cassandra), LinkedIn (Voldemort) and Twitter (Cassandra) use K-V stores for their data management.
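To make the scalability point concrete, here is a minimal sketch of how a key can decide which node holds a record. The node names and the node_for_key and partition helpers are hypothetical, and the simple modulo scheme is an assumption chosen for brevity; it is not how any particular product is implemented.

```python
import hashlib

def node_for_key(key: str, nodes: list[str]) -> str:
    """Pick a node by hashing the key (simple modulo partitioning)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return nodes[int(digest, 16) % len(nodes)]

def partition(keys, nodes):
    """Group keys by the node that would store them."""
    placement = {node: [] for node in nodes}
    for key in keys:
        placement[node_for_key(key, nodes)].append(key)
    return placement

# Hypothetical cluster and key set, for illustration only.
nodes = ["node-a", "node-b", "node-c"]
keys = [f"customer:{i}" for i in range(1000)]
placement = partition(keys, nodes)

for node, held in placement.items():
    print(node, len(held))
```

Since every lookup needs only the key, no node has to know about rows held elsewhere. Note that plain modulo hashing moves most keys when a node is added or removed, which is why stores in this family tend to use consistent hashing instead.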

Data availability and durability: In K-V stores, data is replicated on different nodes. Updates are propagated to the replicas, and during retrieval the nodes holding the latest data are identified. This gives K-V stores high availability and data durability.
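The replication behaviour described above can be sketched roughly as follows. The Replica and ReplicatedStore classes are hypothetical, and the single global version counter is a simplifying assumption; real stores typically rely on timestamps or vector clocks to decide which copy is the latest.

```python
from dataclasses import dataclass, field

@dataclass
class Replica:
    name: str
    data: dict = field(default_factory=dict)  # key -> (version, value)
    up: bool = True

class ReplicatedStore:
    """Toy store that writes to every reachable replica (assumed design)."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.version = 0  # assumption: one global counter orders all writes

    def put(self, key, value):
        self.version += 1
        for replica in self.replicas:
            if replica.up:
                replica.data[key] = (self.version, value)

    def get(self, key):
        # Collect every reachable copy and return the highest-versioned one.
        found = [r.data[key] for r in self.replicas if r.up and key in r.data]
        if not found:
            raise KeyError(key)
        return max(found, key=lambda pair: pair[0])[1]

replicas = [Replica("n1"), Replica("n2"), Replica("n3")]
store = ReplicatedStore(replicas)
store.put("customer:1", "INDIA")
replicas[2].up = False   # n3 misses the next write
store.put("customer:1", "USA")
replicas[2].up = True    # n3 returns with stale data
replicas[0].up = False   # n1 goes down
latest = store.get("customer:1")
print(latest)
```

Even with one node down and another holding stale data, the read still succeeds and returns the most recent value, which is the availability property the paragraph describes.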

Data integrity: K-V databases offer weaker data integrity because there is typically no way to define a datatype for a value. A string can be stored successfully under a key where an integer was expected, because the value holder is common to all keys, and the application can end up throwing a number format exception when it reads the value back. Relational databases do not allow this, as the check is enforced at the database level. This can also be considered an advantage, however, since users do not have to deal with datatypes in K-V stores.
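A small sketch of the integrity problem, using a plain Python dict as a stand-in for a K-V store. The key name and the read_age and put_int helpers are hypothetical; in Python the counterpart of a number format exception is ValueError.

```python
store = {}

# The store accepts any value: nothing checks that an age should be an integer.
store["customer:1:age"] = "thirty"

def read_age(store, key):
    """Application code that assumes the value is numeric."""
    return int(store[key])

error = None
try:
    read_age(store, "customer:1:age")
except ValueError as exc:
    # The bad value is only discovered at read time, in the application.
    error = str(exc)
print("read failed:", error)

def put_int(store, key, value):
    """Application-side check standing in for a column type constraint."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"{key} expects an integer, got {type(value).__name__}")
    store[key] = value

put_int(store, "customer:1:age", 30)
```

The put_int wrapper shows where the check ends up: in application code rather than in the database.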

Data creation: K-V stores do not follow object-relational mapping. Unlike with relational databases, this binds the data layer to the application logic to some extent. Hence, to create or modify data, a custom API has to be written, rather than using standard API calls or simple SQL queries as with a relational database.

Fetching data: K-V stores make fetching data a little more complex.

For example, in a relational database one can easily fetch data with a simple SQL SELECT query like:

Select CUST_ID from CUSTOMER where LOCATION='INDIA';

In a K-V store, the same has to be implemented as:

Select CUST_ID from CUSTOMER where KEY1='LOCATION' and VALUE1='INDIA';

As the number of conditions increases, query complexity also increases.
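In application code, the comparison above might look like the following sketch, which assumes customer records are stored as attribute maps under the customer ID. The sample data, the STATUS attribute and the find_keys helper are hypothetical.

```python
# Hypothetical data: customer ID -> attribute name/value pairs.
customers = {
    "C001": {"LOCATION": "INDIA", "STATUS": "ACTIVE"},
    "C002": {"LOCATION": "USA", "STATUS": "ACTIVE"},
    "C003": {"LOCATION": "INDIA", "STATUS": "INACTIVE"},
}

def find_keys(store, **conditions):
    """Return the keys whose attributes match every name=value condition."""
    return sorted(
        key
        for key, attrs in store.items()
        if all(attrs.get(name) == wanted for name, wanted in conditions.items())
    )

in_india = find_keys(customers, LOCATION="INDIA")
active_in_india = find_keys(customers, LOCATION="INDIA", STATUS="ACTIVE")
print(in_india)
print(active_in_india)
```

Without a secondary index, a filter on anything other than the key means scanning every record, and each extra condition is another name/value comparison that the application has to spell out itself.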

Extensibility: Relational databases are strongly schema-bound, i.e. any change in table structure needs DBA intervention. In a K-V store, by contrast, one can easily add a column without changing the table structure and without the help of a DBA. Hence, K-V stores provide more flexibility.
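As a rough illustration of this flexibility, the sketch below adds a new attribute to some records without touching the others. The EMAIL attribute, the sample values and the get_attr helper are hypothetical.

```python
# Existing records were written before anyone needed an EMAIL attribute.
customers = {
    "C001": {"LOCATION": "INDIA"},
    "C003": {"LOCATION": "INDIA"},
}

# A new record simply carries the extra attribute: no schema change is needed.
customers["C002"] = {"LOCATION": "USA", "EMAIL": "c002@example.com"}

# An old record can be extended in place as well.
customers["C001"]["EMAIL"] = "c001@example.com"

def get_attr(record, name, default=None):
    """Readers must tolerate records that predate the new attribute."""
    return record.get(name, default)

emails = {cid: get_attr(rec, "EMAIL", "unknown") for cid, rec in customers.items()}
print(emails)
```

The trade-off is that records can now differ in shape, so reading code needs a sensible default for attributes that older records lack.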

Cross-vendor compatibility: Unlike relational databases, each cloud database has its own API, which makes migration from one to another difficult. With relational databases, switching between vendors does not involve as many changes.

Due to advantages such as high availability, simplicity, high scalability, low latency and built-in replication, K-V stores are generally preferred over relational databases for cloud-based applications. Market leaders such as Amazon's SimpleDB, Project Voldemort, Cassandra and CouchDB take care of the technical complexities and offer a simple API for meeting users' requirements, packed with all the advantages of cloud databases.