Build fast, scale big with MongoDB Atlas, a hosted service for the leading NoSQL database. Try it now! Brought to you in partnership with MongoDB.

I have an opinion: most people learned about relational databases
as if RDBMSes were designed to store the ultimate truth about some data.
They figured that once the schema had been properly diagrammed and
normalized, then they could load all their data into it, and finally,
start doing some queries.

Superimpose a logical structure upon the data on the basis of these relationships.

Do you see a step that’s missing? If you’ve deployed and maintained a
large-scale application you’ll probably see what the Wikipedia authors
omitted. In fact, it’s the first step: Figure out what one
question your database must answer. Then, design your schema to answer
that question as fast as possible. And now you’re done. Come to think of
it, you never had to do steps 1 and 2 at all.

There’s a total disconnect between the approaches of introductory SQL
courses and real-world application development, and I think this
disconnect is slowing down adoption of NoSQL.

Consider Facebook Messages. After a (now rather well-publicized) evaluation process, Facebook chose HBase,
a NoSQL data store, as the main database for their message system. I
haven’t talked to anyone there, but I figure they chose it based on this
criterion:

How fast will our database answer the question, “What are this user’s most recent 10 messages?”

They chose the database system that could answer that question the
fastest, and they designed the best schema they could think of to answer
that question. Anything else they need to ask HBase may be slow, or
difficult, but that doesn’t matter, because “What are this user’s most
recent 10 messages?” probably accounts for 99% of the load on their
system.

If you learned about databases in college, following some textbook, I
expect you were guided through a long process of modeling real-world
data using rows and columns, to express some profound truth about the
data. Then, you were introduced to SQL, with which you could query the
data. At the end of the course, maybe there was a brief discussion of
database performance. Probably not.

Data at the scale that the largest websites handle doesn’t work that
way. Large applications design their schemas to answer one question as
quickly as possible, and no other considerations are significant.

The next time you read about a NoSQL database you might wonder, “What
about foreign keys, or normalization? What about transactions? Why
can’t I define secondary indexes? Why are range queries prohibited?”
(I’m just picking some limitations at random—each system is different.)
Consider who built these new database systems, and what their experience
has been. The ideas behind NoSQL databases mostly originated at places
like Google, Amazon, and Yahoo. They build huge systems, and huge
systems’ loads are usually dominated by a handful of queries. Companies
build their database systems from the ground up to optimize the
performance of these queries. NoSQL databases encourage you to figure
out ahead of time, “What one question do I need to answer?” Figure that
out, and choose your database software and your schema based on that.
Nothing else really matters.

Now it's easier than ever to get started with MongoDB, the database that allows startups and enterprises alike to rapidly build planet-scale apps. Introducing MongoDB Atlas, the official hosted service for the database on AWS. Try it now! Brought to you in partnership with MongoDB.