
One of MongoDB’s key strengths has always been developer empowerment: by relying on a flexible schema architecture, MongoDB makes it easier and faster for applications to move through the development stages from proof-of-concept to production and iterate over update cycles as requirements evolve.

However, as applications mature and scale, they tend to reach a stable stage where frequent schema changes are no longer critical or must be rolled out in a more controlled fashion, to prevent undesirable data from being inserted into the database. These controls are especially important when multiple applications write into the same database, or when analytics processes rely on predefined data structures to be accurate and useful.

Along with the rest of the 3.2 “schema when you need it” features, document validation gives MongoDB a new, powerful way to keep data clean. These are definitely not the final set of tools we will provide, but rather an important step in how MongoDB handles schema.

Announcing JSON Schema Validation support

Building upon MongoDB 3.2’s Document Validation functionality, MongoDB 3.6 introduces a more powerful way of enforcing schemas in the database, with its support of JSON Schema Validation, a specification which is part of IETF’s emerging JSON Schema standard.

JSON Schema Validation extends Document Validation in many different ways, including the ability to enforce schemas inside arrays and prevent unapproved attributes from being added. These are the new features we will focus on in this blog post, as well as the ability to build business validation rules.

Starting with MongoDB 3.6, JSON Schema is the recommended way of enforcing Schema Validation. The next section highlights the features and benefits of using JSON Schema Validation.

With this document validation configuration, we not only make sure that both the item and price attributes are present in any order document, but also that item is a string and price a decimal (which is the recommended type for all currency and percentage values). Therefore, the following element cannot be inserted (because of the “rogue” price attribute):
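As a minimal sketch, a 3.2-style validator matching this description, together with a document it would reject, might look like the following. The `_id` and field values of the sample order are hypothetical, and the `db` calls only run when the snippet is loaded in the mongo shell.

```javascript
// Sketch of a 3.2-style validator: plain query operators, no JSON Schema.
const legacyValidator = {
  item: { $type: "string" },
  price: { $type: "decimal" },
};

// Hypothetical order whose price is a string instead of a decimal.
const rogueOrder = { _id: 6666, item: "jkl", price: "rogue", quantity: 1 };

// `db` exists only inside the mongo shell, so guard the server calls.
if (typeof db !== "undefined") {
  db.createCollection("orders", { validator: legacyValidator });
  try {
    db.orders.insertOne(rogueOrder);
  } catch (err) {
    // The server is expected to reject the document.
    print("Insert rejected: " + err.message);
  }
}
```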

Prior to MongoDB 3.6, you could not prevent the addition of misspelled or unauthorized attributes. Let’s see how JSON Schema Validation can prevent this behavior. To do so, we will use a new operator, $jsonSchema:
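A sketch of such a command, assuming the same two required attributes (`item` as a string, `price` as a decimal), could be written as follows. The variable name is illustrative, and the command is only sent when a shell `db` object is available.

```javascript
// Sketch of a collMod command that swaps in a $jsonSchema validator.
const jsonSchemaCommand = {
  collMod: "orders",
  validator: {
    $jsonSchema: {
      bsonType: "object",
      required: ["item", "price"],
      properties: {
        item: { bsonType: "string" },
        price: { bsonType: "decimal" },
      },
    },
  },
};

// Only run against the server when loaded in the mongo shell.
if (typeof db !== "undefined") {
  db.runCommand(jsonSchemaCommand);
}
```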

The JSON Schema above is the exact equivalent of the document validation rule we previously set on the orders collection. Let’s check that our schema has indeed been updated to use the new $jsonSchema operator by using the db.getCollectionInfos() method in the Mongo shell:

db.getCollectionInfos({name:"orders"})

This command prints out a wealth of information about the orders collection. For the sake of readability, below is the section that includes the JSON Schema:
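One way to isolate that section is to read the `options.validator` field of the returned collection description. The helper below is a hypothetical convenience function, not part of the shell.

```javascript
// Hypothetical helper: return the validator of the first collection info entry,
// or undefined when the collection is missing or has no validator.
function extractValidator(collectionInfos) {
  const info = collectionInfos[0];
  return info && info.options ? info.options.validator : undefined;
}

// Only query the server when loaded in the mongo shell.
if (typeof db !== "undefined") {
  printjson(extractValidator(db.getCollectionInfos({ name: "orders" })));
}
```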

First, note the use of the additionalProperties:false attribute: it prevents us from adding any attribute other than those mentioned in the properties section. For example, it will no longer be possible to insert data containing a misspelled pryce attribute. Note that using additionalProperties:false at the root level of the document also makes the declaration of the _id property mandatory: whether or not our insert code sets it explicitly, MongoDB requires this field and creates it automatically if it is absent. Thus, we must include it explicitly in the properties section of our schema.

Second, we have chosen to declare the quantity attribute as either a short or long integer between 1 and 99 (using the minimum, maximum and exclusiveMaximum attributes). Of course, because our schema only allows integers lower than 100, we could simply have set the bsonType property to int. But adding long as a valid type makes application code more flexible, especially if there might be plans to lift the maximum restriction.

Finally, note that the description attribute (present in the item, price, and quantity attribute declarations) is entirely optional and has no effect on the schema aside from documenting the schema for the reader.
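The three points above can be sketched in a single schema. The exact `required` list and the description strings are assumptions made for illustration; the keywords themselves (`additionalProperties`, `minimum`, `maximum`, `exclusiveMaximum`, `description`) are the ones discussed in the text.

```javascript
// Sketch of an enriched schema for the orders collection.
const enrichedSchema = {
  bsonType: "object",
  additionalProperties: false, // reject attributes not listed under "properties"
  required: ["item", "price"], // assumption: quantity stays optional
  properties: {
    _id: {}, // must be declared because additionalProperties is false
    item: { bsonType: "string", description: "'item' must be a string" },
    price: { bsonType: "decimal", description: "'price' must be a decimal" },
    quantity: {
      bsonType: ["int", "long"],
      minimum: 1,
      maximum: 100,
      exclusiveMaximum: true, // 100 itself is excluded, so the range is 1 to 99
      description: "'quantity' must be a short or long integer between 1 and 99",
    },
  },
};

// Only update the collection when loaded in the mongo shell.
if (typeof db !== "undefined") {
  db.runCommand({ collMod: "orders", validator: { $jsonSchema: enrichedSchema } });
}
```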

With the schema above, the following documents can be inserted into our orders collection:
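For illustration, here are two hypothetical orders that should pass, assuming a schema that requires `item` and `price`, allows an optional integer `quantity` from 1 to 99, and forbids any other attribute. The `toDecimal` and `toInt` wrappers are illustrative helpers: they use the shell's `NumberDecimal` and `NumberInt` when available and fall back to plain values elsewhere.

```javascript
// Use the shell's BSON helpers when present; otherwise keep plain values
// so the snippet can still be loaded outside the mongo shell.
const toDecimal = (s) =>
  typeof NumberDecimal !== "undefined" ? NumberDecimal(s) : s;
const toInt = (n) => (typeof NumberInt !== "undefined" ? NumberInt(n) : n);

// Hypothetical sample orders; _id is omitted and generated by MongoDB.
const validOrders = [
  { item: "jkl", price: toDecimal("15.50"), quantity: toInt(99) },
  { item: "mno", price: toDecimal("9.99") }, // quantity left out
];

// Only insert when loaded in the mongo shell.
if (typeof db !== "undefined") {
  db.orders.insertMany(validOrders);
}
```

In the shell, a bare `99` would be stored as a double and fail an integer-only `bsonType` check, which is why the quantity is wrapped.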

You probably noticed that our orders above are seemingly odd: they only contain one single item. More realistically, an order consists of multiple items and a possible JSON structure might be as follows:
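As a sketch, a multi-item order could be shaped like this. The `_id` and `sku` values are hypothetical, and the amounts are chosen to match the figures discussed later in this post. Monetary values are shown as plain numbers for readability; in the shell they would typically be `NumberDecimal` values.

```javascript
// Hypothetical order with an embedded array of line items.
const sampleOrder = {
  _id: 10000,
  total: 141,
  VAT: 0.2,
  totalWithVAT: 169,
  lineitems: [
    { sku: "MDBTS001", quantity: 10, unit_price: 9 },
    { sku: "MDBTS002", quantity: 10, unit_price: 5 },
  ],
};
```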

With the schema above, we enforce that any order inserted or updated in the orders collection contain a lineitems array of 1 to 10 documents that all have sku, unit_price and quantity attributes (with quantity required to be an integer).
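A schema enforcing these constraints might be sketched as follows. Only the rules stated above are encoded; any typing of `sku` or `unit_price` is deliberately left out because it is not specified here.

```javascript
// Sketch of a schema that validates documents nested inside an array.
const orderSchema = {
  bsonType: "object",
  required: ["lineitems"],
  properties: {
    lineitems: {
      bsonType: "array",
      minItems: 1,
      maxItems: 10,
      // "items" applies the nested schema to every element of the array.
      items: {
        bsonType: "object",
        required: ["sku", "unit_price", "quantity"],
        properties: {
          quantity: { bsonType: ["int", "long"] },
        },
      },
    },
  },
};

// Only update the collection when loaded in the mongo shell.
if (typeof db !== "undefined") {
  db.runCommand({ collMod: "orders", validator: { $jsonSchema: orderSchema } });
}
```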

The schema would prevent inserting the following, badly formed document:

However, if you pay close attention to the order above, you may notice that it contains a few errors:

The totalWithVAT attribute value is incorrect (it should be equal to 141*1.20=169.2)

The total attribute value is incorrect (it should be equal to the sum of the line item sub-totals, i.e. 10*9+10*5=140)
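The arithmetic behind these two checks can be reproduced in plain JavaScript (run with Node.js, for example). The order object is a hypothetical reconstruction that uses the figures quoted above, and a small tolerance is used because the values here are binary floating-point numbers rather than decimals.

```javascript
// Hypothetical order using the figures quoted in the text.
const order = {
  total: 141,
  VAT: 0.2,
  totalWithVAT: 169,
  lineitems: [
    { quantity: 10, unit_price: 9 },
    { quantity: 10, unit_price: 5 },
  ],
};

// Sum of quantity * unit_price over all line items.
const expectedTotal = order.lineitems.reduce(
  (sum, li) => sum + li.quantity * li.unit_price,
  0
);

// total * (1 + VAT), computed from the stored total.
const expectedTotalWithVAT = order.total * (1 + order.VAT);

// Compare floating-point values with a small tolerance.
const almostEqual = (a, b) => Math.abs(a - b) < 1e-9;

const totalIsCorrect = almostEqual(order.total, expectedTotal);
const vatIsCorrect = almostEqual(order.totalWithVAT, expectedTotalWithVAT);

console.log(expectedTotal); // 140
console.log(totalIsCorrect, vatIsCorrect); // false false
```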

Is there any way to enforce that total and totalWithVAT values be correct using database validation rules, without relying solely on application logic?

Introducing MongoDB expressive query syntax

Adding more complex business validation rules is now possible thanks to the expressive query syntax, a new feature of MongoDB 3.6.

One of the objectives of the expressive query syntax is to bring the power of MongoDB’s aggregation expressions to MongoDB’s query language. An interesting use case is the ability to compose dynamic validation rules that compute and compare multiple attribute values at runtime. Using the new $expr operator, it is possible to validate the value of the totalWithVAT attribute with the following validation expression:
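In expanded form, such an expression might be sketched as follows. This version uses `$add` to compute 1 + VAT, which is one of several equivalent ways to write it, and the variable name is illustrative.

```javascript
// Sketch: totalWithVAT must equal total * (1 + VAT).
const vatRule = {
  $expr: {
    $eq: [
      "$totalWithVAT",
      {
        $multiply: [
          "$total",
          { $add: [1, "$VAT"] },
        ],
      },
    ],
  },
};

// In the mongo shell, the same expression can be used as a query filter
// to list the orders that satisfy the rule.
if (typeof db !== "undefined") {
  printjson(db.orders.find(vatRule).toArray());
}
```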

The above expression checks that the totalWithVAT attribute value is equal to total * (1+VAT). In its compact form, here is how we could use it as a validation rule, alongside our JSON Schema validation:
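A sketch of such a combined validator is shown below. The `$jsonSchema` part is abbreviated, and its `required` list is an assumption made for illustration. Top-level operators in a validator are implicitly combined, so a document must satisfy both.

```javascript
// Sketch: an $expr rule placed next to a (shortened) $jsonSchema rule.
const combinedValidator = {
  $expr: { $eq: ["$totalWithVAT", { $multiply: ["$total", { $add: [1, "$VAT"] }] }] },
  $jsonSchema: {
    bsonType: "object",
    required: ["lineitems", "total", "VAT", "totalWithVAT"], // assumed list
    properties: {
      lineitems: { bsonType: "array", minItems: 1, maxItems: 10 },
    },
  },
};

// Only update the collection when loaded in the mongo shell.
if (typeof db !== "undefined") {
  db.runCommand({ collMod: "orders", validator: combinedValidator });
}
```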

The total attribute can be validated in a similar way: an expression uses the $map operator to compute each line item’s sub-total, then sums all these sub-totals, and finally compares the result to the total value. To make sure that both the Total and VAT validation rules are checked, we must combine them using the $and operator. Finally, our collection validator can be updated with the following command:
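A sketch of the two rules and the resulting command could look like this. The variable names are illustrative, the `$jsonSchema` part is abbreviated, and `$add` is used here to compute 1 + VAT.

```javascript
// Rule 1: totalWithVAT must equal total * (1 + VAT).
const vatRule = {
  $eq: ["$totalWithVAT", { $multiply: ["$total", { $add: [1, "$VAT"] }] }],
};

// Rule 2: total must equal the sum of quantity * unit_price per line item.
const totalRule = {
  $eq: [
    "$total",
    {
      $sum: {
        $map: {
          input: "$lineitems",
          as: "item",
          in: { $multiply: ["$$item.quantity", "$$item.unit_price"] },
        },
      },
    },
  ],
};

// Both rules are combined with $and inside a single $expr.
const updateValidatorCommand = {
  collMod: "orders",
  validator: {
    $expr: { $and: [totalRule, vatRule] },
    $jsonSchema: {
      bsonType: "object",
      required: ["lineitems", "total", "VAT", "totalWithVAT"], // assumed list
    },
  },
};

// Only update the collection when loaded in the mongo shell.
if (typeof db !== "undefined") {
  db.runCommand(updateValidatorCommand);
}
```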

Next steps

With the introduction of JSON Schema Validation in MongoDB 3.6, database administrators are now better equipped to address data governance requirements coming from compliance officers or regulators, while still benefiting from MongoDB’s flexible schema architecture.

Additionally, developers will find the new expressive query syntax useful to keep their application code base simpler by moving business logic from the application layer to the database layer.

If you want to learn more about everything new in MongoDB 3.6, download our What’s New guide.

Raphael Londner is a Principal Developer Advocate at MongoDB, focused on cloud technologies such as Amazon Web Services, Microsoft Azure and Google Cloud Engine. Previously he was a developer advocate at Okta as well as a startup entrepreneur in the identity management space. You can follow him on Twitter at @rlondner