After several months of research, review and revision, a white paper I wrote for the SQL Azure team, “NoSQL and the Windows Azure Platform”, has been published by Microsoft. If you go to http://www.microsoft.com/windowsazure/whitepapers and do a find within the page for “NoSQL” you’ll see a link for it. If you’d rather download the PDF directly, you can do so by clicking here. The 25-page (not including cover and TOC) paper provides an introduction to NoSQL database technology, and its major subcategories, for those new to the subject; an examination of NoSQL technologies available in the cloud using Windows Azure and SQL Azure; and a critical discussion of the NoSQL and relational database approaches, including the suitability of each to line-of-business application development.

As I conducted my research for the paper, and read material written by members of the NoSQL community, I found a consistent sentiment toward, and desire for, cleaning the database technology slate. NoSQL involves returning to a basic data storage and retrieval approach. Many NoSQL databases, including even Microsoft’s Azure Table Storage, are premised on basic key-value storage technology – in effect, the same data structure used in collections, associative arrays and cache products. I couldn’t help thinking that the recent popularity of NoSQL is symptomatic of a generational and cyclical phenomenon in computing. As product categories (relational databases in this case) mature, products within them load up on features and create a barrier to entry for new, younger developers. The latter group may prefer to start over with a fresh approach to the category, rather than learn the wise old ways of products whose market presence predates their careers – sometimes by a couple of decades.

The new generation may do this even at the risk of regression in functionality. In the case of NoSQL databases, that regression may include loss of “ACID” (transactional) guarantees; declarative query (as opposed to imperative inspection of collections of rows); comprehensive tooling; and wide availability of trained and experienced professionals. Existing technologies have evolved in response to the requirements, stress tests, bug reports, and user suggestions accumulated over time. And sometimes old technologies can even be used in ways equivalent to the new ones. Two cases in point: the old SQL Server Data Services was a NoSQL store, and its underlying implementation used SQL Server. Even the developer fabric version of Azure Table Storage is implemented using SQL Server Express Edition’s XML columns.
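To make the contrast between declarative query and imperative inspection concrete, here is a minimal Python sketch. It is illustrative only: the order data and the function names are made up for this example, a plain dictionary stands in for a key-value store, and an in-memory SQLite database stands in for a relational one. None of it comes from the white paper or from any Azure API.

```python
import sqlite3

# Hypothetical sample data: order id -> order record (a key-value layout).
orders = {
    "o1": {"customer": "alice", "total": 120.0},
    "o2": {"customer": "bob", "total": 35.5},
    "o3": {"customer": "alice", "total": 80.0},
}


def total_by_customer_imperative(store, customer):
    """Key-value style: walk every value and filter in application code."""
    total = 0.0
    for record in store.values():
        if record["customer"] == customer:
            total += record["total"]
    return total


def load_into_sqlite(store):
    """Copy the same records into a relational table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id TEXT PRIMARY KEY, customer TEXT, total REAL)"
    )
    # The connection as a context manager commits all rows together,
    # or rolls back if any insert fails.
    with conn:
        conn.executemany(
            "INSERT INTO orders VALUES (?, ?, ?)",
            [(key, rec["customer"], rec["total"]) for key, rec in store.items()],
        )
    return conn


def total_by_customer_declarative(conn, customer):
    """Relational style: state what is wanted; the engine decides how."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
        (customer,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = load_into_sqlite(orders)
    print(total_by_customer_imperative(orders, "alice"))
    print(total_by_customer_declarative(conn, "alice"))
```

Both functions produce the same answer. The difference is where the work lives: in the first, the application owns the loop and the filter; in the second, the query describes the result and the database engine chooses how to compute it.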

So if older technologies are proven technologies, and if they can be repurposed to function like some of the newer ones, what causes such discomfort with them? Is it mere folly of younger developers? Are older developers building up barriers of vocabulary, APIs and accumulated, sometimes seldom used, features in their products, to keep their club an exclusive one? In other engineering disciplines, evolution in technology is embraced, built upon, made beneficial to consumers, and contributory to overall progress. But the computing disciplines maintain a certain folk heroism in rejecting prior progress as misguided. For some reason, we see new implementations of established solutions as elegant and laudable. And virtually every player in the industry is guilty of this. I haven’t figured out why this phenomenon exists, but I think it’s bad for the industry. It allows indulgence to masquerade as enlightenment, and it holds the whole field back.

Programming has an artistic element to it; it’s not mere rote science. That’s why many talented practitioners are attracted to the field, and removing that creative aspect of software work would therefore be counter-productive. But we owe it to our colleagues, and to our customers, to conquer fundamentally new problems, rather than provide so many alternative solutions to the old ones. There’s plenty of creativity involved in breaking new ground too, and I dare say it brings more benefit to the industry, even to society. NoSQL is interesting technology and its challenge to established ways of thinking about data does have merit and benefit. Nevertheless, I hope the next disruptive technology to come along says yes to conquering new territory. At the very least, I hope it doesn’t start with “No.”

I agree, Andrew. I too have seen this pattern before. Just look at the numerous incarnations of Microsoft's own data access technologies: ODBC, DAO, RDC, ADO, RIA, Entity Framework, etc. In relation to SQL and NoSQL, it is true that the database vendors have done a poor job of managing unstructured BLOB storage. This led non-database developers to conclude that SQL is not for them, hence they looked elsewhere. For example, SQL Azure still does not support free text searching over simple unstructured text. LINQ is another example of NoSQL, where inexperienced developers have chosen to ignore, at their peril, what is going on under the hood. A final example is Silverlight, where the script kiddies who built it forgot to include support for ADO.NET datasets. My advice to junior developers is to learn and appreciate SQL and RDBMS. My advice to RDBMS vendors is to innovate and extend SQL to support unstructured data.

I think that's good advice all around. That said, I don't think the Silverlight team *forgot* to include ADO.NET support but rather wanted to keep the runtime small and thought its async model would be better supported by data access on the back-end that SL apps could use via service calls.

There are certainly use cases for NoSql. I've actually grown to love working with it. I'm not sure why you said LOB apps need ACIDity as I've gotten along fine without it for 18 months.

Today I had to pair with a developer on another team that uses SQL + NHibernate. Something that I thought should have taken about half an hour ended up taking over 1.5 hours since we had to create DB tables, write NHibernate mappings, look up how to do a certain mapping on Google, then update our DB schema in version control. Not to mention going back and adding a field we forgot (alter table, update model, update mapping, recommit schema, more tests). Working with NoSql I could have done ALL of that just by writing the model itself.

The reason I love NoSql is I can spend more time thinking about how to effectively solve the business problem and much less time thinking about the technology.

While I agree that the route you and the other developer took was less than efficient, I would say that's more an indictment of ORM development (with NHibernate in this case) than of relational databases per se. And checking your schema into version control is not a required step, though it certainly is one that will aid maintainability of the database in the future, especially if new developers inherit your system.

There are plenty of programming interfaces to relational databases that (a) get you there quickly and (b) meanwhile ensure your data is maintained in a transactional and consistent manner, with the regularized schema that the majority of LOB apps require.

I had a skim through your paper and put it aside for later reading, not because I use NoSQL but because I'm as interested in the movement as you are, even though I'm never likely to encounter it professionally. It's too bad you didn't write about the history as well ...

I'm not sure if people are rebelling against the layers and complexity in other mature software (though I wish they would do it more often); but more that programmers enjoy living on the cutting edge and have an ego-maniacal attraction to working somewhere their work runs across dozens/hundreds of machines at once... which is more likely to be a NoSQL environment. IMHO.