The Business Pros and Cons of NoSQL

When I first read the post, I thought NoSQL was some sort of protest movement, as in we're not going to take SQL any more. And in a way, it is, but really, it's more useful to think of it as an alternative way to approach data rather than as anti-SQL. Instead of the good ole relational database model we all know and love, NoSQL uses a non-relational, distributed data store approach.

That's pretty geeky, right? I mean, who cares? Let the data admin people sort it out.

Well, it turns out there are some rather intriguing advantages to NoSQL offerings, advantages a business user would notice and appreciate.

In short, NoSQL solutions offer a way to store and use large amounts of data, but with:

Less overhead

Less work

Less downtime

Faster results

But keep in mind, when I say "large amounts of data," I mean LARGE amounts of data. As Malik wrote in his post, Digg manages 40 million visitors a month, for roughly 500 million page views each month. There are 20,000 submissions each day, with 170,000 daily Diggs and 19,000 comments. That's not what you'd call typical of most businesses.

Our primary motivation for moving away from MySQL is the increasing difficulty of building a high performance, write intensive, application on a data set that is growing quickly, with no end in sight. ... As our system grows, it's important for us to span multiple data centers for redundancy and network performance and to add capacity or replace failed nodes with no downtime. We plan to continue using commodity hardware, and to continue assuming that it will fail regularly. All of this is increasingly difficult with MySQL.

I should note that in Digg's case, it's not just about switching from SQL to NoSQL. As Quinn shared in his post, Cassandra offers distinct advantages over other NoSQL tools. Twitter also recently switched to Cassandra.

Quite possibly there are advantages beyond effectively storing and moving massive amounts of data. For instance, last October, someone posted about the possible implications of NoSQL for business intelligence. But for the most part, the discussion is focused on those who trade in large volumes of data.

If you're curious about other NoSQL options and how they work, Database Journal has a nice, short piece coming off last week's NoSQL Live conference in Boston. It looks at some of the other solutions, and includes this piece of cautionary advice from Mark Atwood, community development director at memcached vendor Gear6:

In Atwood's view, the learning curve for many of the NoSQL systems can be too steep, particularly in cases when the RDBMS approach would only require a database with a few tables. He noted that not all of the NoSQL database query tools are as well understood as their SQL counterparts, which have been in use for many years.

I think the catalyst for the NoSQL trend was a desire for scalability, so solving scale and big data problems is certainly key to the space.

That said, the solutions do not use the relational data model, so once they moved beyond it, new opportunities opened up to innovate in other ways. Some of the products make development a lot easier; I've seen a lot of developers of small-scale data apps start to really enjoy developing on NoSQL for productivity reasons.

I find that NoSQL products are good at "online" problems and less suitable for bulk-load data warehousing and business intelligence. SQL is very useful in that context.

This is not to say I'm not a fan of NoSQL, the various "other than SQL" storage models and patterns that are emerging. There are many use cases and scales where one of them will fit very well.

Just don't commit a live site to a NoSQL solution until you and your arch and ops team have spent the time to understand what you're getting into. This is all very new, so there are going to be a lot of surprises, and also a lot of pressure to use something just because it's "new and shiny", instead of old and understood.

For online transaction processing (OLTP), aka data entry, relational databases are a great solution because of their ability to enforce data integrity and handle locking. Retrieving data from huge datasets is not a problem that relational databases handle well, because of the overhead involved in handling data integrity, indexing, complex joins, etc.
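To make the "enforce data integrity" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The customers and orders tables and the function name are made up for illustration; they are not from any system discussed above. It only shows one kind of integrity rule (a foreign key) being enforced by the database itself rather than by application code.

```python
import sqlite3


def try_orphan_insert() -> bool:
    """Return True if the database rejects an order for a customer that does not exist."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off by default, so switch it on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer_id INTEGER NOT NULL REFERENCES customers(id),"
        " total REAL NOT NULL)"
    )
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Acme')")
    # This order points at an existing customer, so it is accepted.
    conn.execute("INSERT INTO orders (customer_id, total) VALUES (1, 19.99)")
    try:
        # Customer 42 does not exist, so the foreign key check should fail.
        conn.execute("INSERT INTO orders (customer_id, total) VALUES (42, 5.00)")
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False
```

Each check like this costs a little work on every write, which is part of the overhead described above.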

NoSQL is a perfect fit for these types of problems because it is built for speed. Keep in mind that NoSQL often deals with data sets that are historical in nature (i.e., transactions, web logs, etc.), which have static rows and don't need to deal with constant updates to the data. Large datasets are often denormalized, or flattened, which uses more space but eliminates the need to join huge tables together. Normalized data is what gives you the data integrity that relational databases enforce. Denormalized data makes the processing sequential, which allows NoSQL to distribute its workload and quickly access and retrieve the desired rows.
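The trade-off between normalized and flattened data can be sketched in a few lines of plain Python. This is a toy illustration with made-up customers and orders, not how any particular NoSQL product works internally. It shows the same question answered once with a join-style lookup and once over flattened rows, and then shows that flattened rows can be split into chunks and processed independently.

```python
# Normalized: two "tables". Answering a per-region question needs a join.
customers = {
    1: {"name": "Acme", "region": "East"},
    2: {"name": "Bolt", "region": "West"},
}
orders = [
    {"order_id": 100, "customer_id": 1, "total": 20.0},
    {"order_id": 101, "customer_id": 2, "total": 35.0},
    {"order_id": 102, "customer_id": 1, "total": 5.0},
]


def totals_by_region_joined(orders, customers):
    """Sum order totals per region, looking up each customer (the join)."""
    totals = {}
    for order in orders:
        region = customers[order["customer_id"]]["region"]
        totals[region] = totals.get(region, 0.0) + order["total"]
    return totals


# Denormalized: customer fields are copied onto every order row.
# This uses more space, but each row now stands on its own.
flat_orders = [{**order, **customers[order["customer_id"]]} for order in orders]


def totals_by_region_flat(rows):
    """Sum order totals per region in one sequential pass, with no lookups."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0.0) + row["total"]
    return totals


def merge_totals(partials):
    """Combine per-chunk results, as separate workers' outputs would be combined."""
    merged = {}
    for partial in partials:
        for region, total in partial.items():
            merged[region] = merged.get(region, 0.0) + total
    return merged


# Because flat rows are self-contained, chunks can be handled independently.
chunks = [flat_orders[:2], flat_orders[2:]]
distributed = merge_totals(totals_by_region_flat(chunk) for chunk in chunks)
```

All three approaches give the same totals here. The difference is that the flat version never has to reach into a second table, which is what makes it easy to spread across machines.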

So for static, historical type data, you just can't beat the speed and efficiency of NoSQL. For dynamic, data entry type data, relational databases are better suited. To put it simply, relational databases solve problems where data integrity is the key; NoSQL solves problems where speed and size are the key. Most data warehouse data has gone through an ETL process where the data integrity issues were handled. That means that many data warehouse datasets are flattened and built for speed.

Useful article. You have made me interested in NoSQL. This month I am trying new ways to promote my business: name badges and different discounts on some products. For now it is going well, but I want to develop my business through the Internet. Are you saying that NoSQL is helpful for large data?

My observation is that NoSQL is often chosen for performance reasons. However, performance is often a bad reason to choose NoSQL, especially if the side effects, like eventual consistency, are poorly understood.
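For readers who haven't run into eventual consistency, here is a deliberately simplified sketch of the side effect. The class and method names are invented for this example and do not correspond to any real product's API. It models one copy of the data that accepts writes and a second copy that only catches up when replication runs.

```python
class EventuallyConsistentStore:
    """Toy model: writes land on a primary copy and reach a secondary copy later."""

    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = []  # writes not yet copied to the secondary

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_from_secondary(self, key):
        # A read served by the secondary may see stale or missing data.
        return self.secondary.get(key)

    def replicate(self):
        # In a real system this happens in the background, after some delay.
        for key, value in self.pending:
            self.secondary[key] = value
        self.pending.clear()


store = EventuallyConsistentStore()
store.write("balance", 100)
stale = store.read_from_secondary("balance")  # not replicated yet
store.replicate()
fresh = store.read_from_secondary("balance")  # now caught up
```

The gap between the write and the replication step is where surprises come from: an application that reads right after writing can get an old answer unless it is designed with that in mind.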

I agree that the organisation and structuring of data is very important and that getting it right is vital. We have databases of name badges supplied, and these need to be stored effectively. NoSQL is definitely worth investigating.

I work for a NoSQL database company. The NoSQL advantages that our customers commonly report are performance, scalability and ease of development (compared to RDBMSs). Having worked in the relational db world for many years, I would agree that relational databases are best suited for OLTP and BI apps where data integrity is key.