The EDB Blog

Which NoSQL Database for New Project?

August 31, 2017

Oh, how I love the title of this 2014 Slashdot request, "Which NoSQL Database For New Project?" The user already knows the type of clients (iPhones and Android phones) and the middleware (Php/Symfony or Ruby/Rails) and then comes the zinger, "I would like to start with a NoSQL solution for scaling …."

In addition to the question of whether Ruby on Rails is a scalable solution, the replies are illuminating, and even funny, e.g. "I'm working on a new independent project. It will soon become the new Facebook, and I'll be billionaire next quarter. The only problem is that I don't know which luxury yacht to buy with all this money."

OK, now on to the serious replies, which are many, seem to come from seasoned Sql and NoSql veterans and all seem to fall under the heading of premature optimization and scaling:

* NoSQL solutions can be ridiculously fast and scale beautifully over billions of rows. Under a billion rows, though, and they're just different from normal databases in various arguably broken ways. By the time you need a NoSQL database, you'll be successful enough to have a well-organized team to manage the transition to a different backend. For a new project, use a rdbms, and enjoy the ample documentation and resources available.

* However, like a lot of other posters, I'm very skeptical that NoSQL is the place to start. SQL databases can do a LOT for you, are very robust and can scale very considerably. As your requirements grow you might find yourself wanting things like indexes, transactions, referential integrity, the ability to manually inspect and edit data using SQL and the ability to store and access more complex structures. You're likely to give yourself a lot of pain if you go straight for NoSQL, and even if you DO need to scale later combining existing SQL and new NoSQL data stores can be a useful way to go.

* So many developers start with the phrase "I need NoSQL so I can scale" and almost all of them are wrong. The chances are your project will never ever ever scale to the kind of size where the NoSQL design decision will win. Its far more likely that NoSQL design choice will cause far more problems (performance etc), than the theoretical scaling issues. … NoSQL does not guarantee scaling, in many cases it scales worse than an SQL based solution. Workout what your scaling problems will be for your proposed application and workout when they will become a problem and will you ever reach that scale. Being on a bandwagon can be fun, but you would be in a better place if you really think through any potential scaling issues. NoSQL might be the right choice but in many places I've seen it in use it was the wrong choice, and it was chosen base on one developers faith that NoSQL scales better rather than think through the scaling issues.

* Given 3 trillion users your options are pretty much limited to horizontal scaling, no SQL etc. but most people never get that far with their applications and in that case, storing the data in a noSQL database and then getting actionable information out of it (which is the hardest part IMO) is a lot of effort spent for something much cheaper and easier done with an rdbms.

* "Why not" is because the cost/benefit analysis is not in NoSQL's favor. NoSQL's downsides are a steeper learning curve (to do it right), fewer support tools, and a more specialized skill set. Its primary benefits don't apply to you. You don't need ridiculously fast writes, you don't need schema flexibility, and you don't need to run complex queries on previously-unknown keys.

* It's a mistake to think that "NoSQL" is a silver bullet for scalability. You can scale just fine using MySQL (FlockDB) or Postresgl if you know what you're doing. On the other, if you don't know what you're doing, NoSQL may create problems where you didn't have them.

* Databases don't scale for people who don't understand SQL, don't understand data normalization, indexing and want to use them as flat files. Unfortunately, a way too common anti-pattern :(

*So default answer to "Which NoSQL database should I use?" is always "Don't use NoSQL."

There were many positive comments about Postgres, both from a relational database perspective and touting its scalability and NoSQL-like features. One poster even wrote about their accounting system using Mongo that they are porting to Postgres.

Bruce Momjian is a co-founder of the PostgreSQL Global Development Group, and has worked on PostgreSQL since 1996 as a committer and community leader. He is a frequent speaker and Postgres evangelist and travels worldwide appearing at conferences to help educate the community on the business...