Leonard Lobel

Leonard Lobel (Microsoft MVP, Data Platform) is the chief technology officer and co-founder of Sleek Technologies, Inc., a New York-based development shop with an early adopter philosophy toward new technologies. He is also a principal consultant at Tallan, Inc., a Microsoft National Systems Integrator and Gold Competency Partner.

Programming since 1979, Lenni specializes in Microsoft-based solutions, with experience that spans a variety of business domains, including publishing, financial, wholesale/retail, health care, and e-commerce. Lenni has served as chief architect and lead developer for various organizations, ranging from small shops to high-profile clients. He is also a consultant, trainer, and frequent speaker at local user group meetings, VSLive, SQL PASS, and other industry conferences.

Lenni has also authored several MS Press books and Pluralsight courses on SQL Server programming.


To begin, let’s be clear that an Azure Cosmos DB container can have only one partition key. I say this from the start in case “multiple partition keys” in the title is somehow misinterpreted to imply otherwise. You always need to come up with a single property to serve as the partition key for every container and choosing the best property for this can sometimes be difficult.

Understanding the Problem

Making the right choice requires intimate knowledge about the data access patterns of your users and applications. You also need to understand how horizontal partitioning works behind the scenes in terms of storage and throughput, how queries and stored procedures are scoped by partition key, and the performance implications of running cross-partition queries and fan-out queries. So there are many considerations to take into account before settling on the one property to partition a container on. I discuss all this at length in a previous blog post, Horizontal Partitioning in Azure Cosmos DB.

But here’s the rub. What if, after all the analysis, you come to realize that you simply cannot settle on a single property that serves as an ideal partition key for all scenarios? Let’s say for example, from a write perspective, you find one property will best distribute writes uniformly across all the partitions in the container. But from a query perspective, you find that using the same partition key results in too much fanning out. Or, you might identify two categories of common queries, where it’s roughly 50/50; meaning, about half of all the queries are of one type, and half are of the other. What do you do if the two query categories would each benefit from different partition keys?

Your brain can get caught in an infinite loop over this until you wind up in that state of “analysis paralysis,” where you recognize that there’s just no single best property to choose as the partition key. To break free, you need to think outside the box. Or, let’s say, think outside the container. Because the solution here is to simply create another container that’s a complete “replica” of the first. This second container holds the exact same set of documents as the first but defines a different partition key.

I placed quotes around the word “replica” because this second container is not technically a replica in the true Cosmos DB sense of the word (where, internally, Cosmos DB automatically maintains replicas of the physical partitions in every container). Rather, it’s a manual replica that you maintain yourself. Thus, it’s your job to keep it in sync with changes when they happen in the first container, which is partitioned by a property that’s optimized for writes. As those writes occur in real time, you need to respond by updating the second container, which is partitioned by a property that’s optimized for queries.

Enter Change Feed

Fortunately, change feed comes to the rescue here. Cosmos DB maintains a persistent record of changes for every container that can be consumed using the change feed. This gives you a reliable mechanism for retrieving changes made to any container, all the way back to the beginning of time. For an introduction to change feed, have a look at my previous blog post, Change Feed – Unsung Hero of Azure Cosmos DB.

In this three-part series of blog posts, I’ll dive into three ways you can use the change feed to implement a solution for synchronizing containers:

Querying the change feed directly

Using the Change Feed Processor (CFP) library

Writing an Azure Functions Cosmos DB trigger

Let’s get started. Assume that we’ve done our analysis and established that city is the ideal partition key for writes, as well as roughly half of the most common queries our users will be running. But we’ve also determined that state is the ideal partition key for the other (roughly half) commonly executed queries. This means we’ll want one container partitioned by city, and another partitioned by state. And we’ll want to consume the city-partitioned container’s change feed to keep the state-partitioned container in sync with changes as they occur. We’ll then be able to direct our city-based queries to the first container, and our state-based queries to the second container, which then eliminates fan-out queries in both cases.

Setting Up

If you’d like to follow along, you’ll need to be sure your environment is set up properly. First, of course, you’ll need to have a Cosmos DB account. The good news here is that you can get a free 30-day account with the “try cosmos” offering, which doesn’t even require a credit card or Azure subscription (just a free Microsoft account). Even better, there’s no limit to the number of times you can start a new 30-day trial. Create your free account at http://azure.microsoft.com/try/cosmosdb.

You’ll need your account’s endpoint URI and master key to connect to Cosmos DB from C#. To obtain them, head over to your Cosmos DB account in the Azure portal, open the Keys blade, and keep it open so that you can handily copy/paste them into the project.

You’ll also need Visual Studio. I’ll be using Visual Studio 2019, but the latest version of Visual Studio 2017 is fine as well. You can download the free community edition at https://visualstudio.microsoft.com/downloads.

Querying the Change Feed Directly

We’ll begin with the raw approach, which is to query the change feed directly using the SDK. The reality is that you’ll almost never want to go this route, except for the simplest small-scale scenarios. Still, it’s worth taking some time to examine this approach first, as I think you’ll benefit from learning how the change feed operates at a low level, and it will enhance your appreciation of the Change Feed Processor (CFP) library which I’ll cover in the next blog post of this series.

Fire up Visual Studio and create a new .NET Core console application. Name the project ChangeFeedDirect and the solution ChangeFeedDemos (we’ll be adding more projects to this solution in parts 2 and 3 of this blog series). Next, add the SDK to the ChangeFeedDirect project from the NuGet package Microsoft.Azure.DocumentDB.Core:

We’ll write some basic code to create a database with the two containers, with additional methods to create, update, and delete documents in the first container (partitioned by city). Then we’ll write our “sync” method that directly queries the change feed on the first container, in order to update the second container (partitioned by state) and reflect all the changes made.

Note: Our code (and the SDK) refers to containers as collections.

We’ll write all our code inside the Program.cs file. First, update the using statements at the very top of the file to get the right namespaces imported:

using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Newtonsoft.Json;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

Next, at the very top of the Program class, set two private string constants for your Cosmos DB account’s endpoint and master key. You can simply copy them from the Keys blade in the Azure portal, and paste them right into the code:

Most of this code is intuitive, even if you’ve never done any Cosmos DB programming before. We first use our endpoint and master key to create a DocumentClient, and then we use the client to create a database named multipk with the two containers in it.

This works by first calling DeleteDatabaseAsync wrapped in a try block with an empty catch block. This effectively results in “delete if exists” behavior to ensure that the multipk database does not exist when we call CreateDatabaseAsync to create it.

Next, we call CreateDocumentCollectionAsync twice to create the two containers (again, a collection is a container). We name the first container byCity and assign it a partition key of /city, and we name the second container byState and assign /state as the partition key. Both containers reserve 400 request units (RUs) per second, which is the lowest throughput you can provision.

Notice the DefaultTimeToLive = -1 option applied to the first container. At the time of this writing, change feed does not support deletes. That is, if you delete a document from a container, it does not get picked up by the change feed. This may be supported in the future, but for now, the TTL (time to live) feature provides a very simple way to cope with deletions. Rather than physically deleting documents from the first container, we’ll just update them with a TTL of 60 seconds. That gives us 60 seconds to detect the update in the change feed, so that we can physically delete the corresponding document in the second container. Then, 60 seconds later, Cosmos DB will automatically physically delete the document from the first container by virtue of its TTL setting. You’ll see all this work in a moment when we run the code.

The other point to call out is the creation of our sync document, which is a special metadata document that won’t get copied over to the second container. Instead, we’ll use it to persist a timestamp to keep track of the last time we synchronized the containers. This way, each time we sync, we can request the correct point in time from which to consume changes that have occurred since the previous sync. The document is initialized with a lastSync value of null so that our first sync will consume the change feed from the beginning of time. Then lastSync is updated so that the next sync picks up precisely where the first one left off.
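Here’s a minimal sketch of what that setup could look like against the Microsoft.Azure.DocumentDB.Core SDK. Treat it as an illustration rather than the exact listing: the class and method names, the placeholder endpoint and key, and the sync document’s id and state values are my assumptions, and the catch block is narrowed to DocumentClientException rather than left completely open.

```csharp
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using System;
using System.Threading.Tasks;

public static class SetupSketch
{
    // Placeholders (assumptions): paste your own values from the Keys blade.
    private const string CosmosEndpoint = "https://<your-account>.documents.azure.com:443/";
    private const string CosmosMasterKey = "<your-master-key>";

    // Builds a container definition. Pass null for defaultTtl to leave TTL disabled.
    public static DocumentCollection BuildCollection(string id, string partitionKeyPath, int? defaultTtl)
    {
        var collection = new DocumentCollection { Id = id, DefaultTimeToLive = defaultTtl };
        collection.PartitionKey.Paths.Add(partitionKeyPath);
        return collection;
    }

    public static async Task CreateDatabaseAsync()
    {
        using (var client = new DocumentClient(new Uri(CosmosEndpoint), CosmosMasterKey))
        {
            var dbUri = UriFactory.CreateDatabaseUri("multipk");

            // "Delete if exists": swallow the error raised when the database is absent.
            try { await client.DeleteDatabaseAsync(dbUri); }
            catch (DocumentClientException) { }

            await client.CreateDatabaseAsync(new Database { Id = "multipk" });

            // 400 RU/s on each container; TTL is enabled (-1) only on byCity.
            var throughput = new RequestOptions { OfferThroughput = 400 };
            await client.CreateDocumentCollectionAsync(dbUri, BuildCollection("byCity", "/city", -1), throughput);
            await client.CreateDocumentCollectionAsync(dbUri, BuildCollection("byState", "/state", null), throughput);

            // Sync metadata document; the id and state values here are hypothetical.
            var syncDoc = new { id = "sync", city = "sync", state = "sync", lastSync = (DateTime?)null };
            var byCityUri = UriFactory.CreateDocumentCollectionUri("multipk", "byCity");
            await client.CreateDocumentAsync(byCityUri, syncDoc);
        }
    }
}
```

Setting DefaultTimeToLive to -1 turns the TTL feature on without expiring anything by default, so only documents that carry their own ttl property will ever be removed.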

Now let’s implement CreateDocuments. This method simply populates three documents in the first container:

Notice that all three documents have city and state properties, where the city property is the partition key for the container that we’re creating these documents in. The state property is the partition key for the second container, where our sync method will create copies of these documents as it picks them up from the change feed. The slogan property is just an ordinary document property. And although we aren’t explicitly supplying an id property, the SDK will automatically generate one for each document with a GUID as the id value.
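As a rough sketch of such a method, assuming an already-constructed DocumentClient: only the Orlando, FL document is named in this walkthrough, so the other two cities and all three slogan values below are placeholders I made up for illustration.

```csharp
using Microsoft.Azure.Documents.Client;
using System.Threading.Tasks;

public static class CreateDocumentsSketch
{
    // Three sample documents. Only Orlando, FL comes from the walkthrough;
    // the other cities and every slogan are hypothetical placeholders.
    public static object[] SampleDocuments()
    {
        return new object[]
        {
            new { city = "Orlando", state = "FL", slogan = "placeholder slogan 1" },
            new { city = "Brooklyn", state = "NY", slogan = "placeholder slogan 2" },
            new { city = "Seattle", state = "WA", slogan = "placeholder slogan 3" },
        };
    }

    public static async Task CreateDocumentsAsync(DocumentClient client)
    {
        var byCityUri = UriFactory.CreateDocumentCollectionUri("multipk", "byCity");
        foreach (var doc in SampleDocuments())
        {
            // No id is supplied, so the SDK generates a GUID id for each document.
            await client.CreateDocumentAsync(byCityUri, doc);
        }
    }
}
```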

We’ll also have an UpdateDocument method to perform a change on one of the documents:

Remember that (currently) the change feed doesn’t capture deleted documents, so we’re using the TTL (time to live) technique to keep our deletions in sync. Rather than calling DeleteDocumentAsync to physically delete the Orlando document, we’re simply updating it with a ttl property set to 60 and saving it back to the container with ReplaceDocumentAsync. To the change feed, this is just another update, so our sync method will pick it up normally as you’ll see in a moment. Meanwhile, Cosmos DB will physically delete the Orlando document from the first container in 60 seconds, giving our sync method up to one minute to pick it up from the change feed and delete it from the second container.
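The TTL technique can be sketched roughly as follows. The class and helper names are assumptions of mine, as is the query used to locate the Orlando document; the 60-second ttl value and the call to ReplaceDocumentAsync are the parts taken from the description above.

```csharp
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;
using System.Linq;
using System.Threading.Tasks;

public static class SoftDeleteSketch
{
    // Marks a document for deletion by giving it a ttl (in seconds).
    public static Document MarkForDeletion(Document doc, int ttlSeconds)
    {
        doc.SetPropertyValue("ttl", ttlSeconds);
        return doc;
    }

    public static async Task DeleteDocumentAsync(DocumentClient client)
    {
        var byCityUri = UriFactory.CreateDocumentCollectionUri("multipk", "byCity");
        var options = new FeedOptions { PartitionKey = new PartitionKey("Orlando") };

        // Single-partition query for the Orlando document (hypothetical lookup).
        var query = client
            .CreateDocumentQuery<Document>(byCityUri, "SELECT * FROM c WHERE c.city = 'Orlando'", options)
            .AsDocumentQuery();

        Document doc = null;
        while (doc == null && query.HasMoreResults)
        {
            var page = await query.ExecuteNextAsync<Document>();
            doc = page.FirstOrDefault();
        }
        if (doc == null) return; // nothing to delete

        // To the change feed this is an ordinary update; Cosmos DB removes the
        // document from byCity once the ttl expires.
        await client.ReplaceDocumentAsync(doc.SelfLink, MarkForDeletion(doc, 60));
    }
}
```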

And finally, the sync method, which is what this whole discussion is all about. Here’s the code for SyncCollections:

Let’s break this down. First, we grab the last sync time from the sync document in the first container. Remember, this will be null the very first time we run this method. Then, we’re ready to query the change feed, which is a two-step process.

For step 1, we need to discover all the partition key ranges in the container. A partition key range is essentially a set of partition keys. In our small demo, where we have only one document each across three distinct partition keys (cities), Cosmos DB will host all three of these documents inside a single partition key range.

Although there is conceptually only one change feed per container, there is actually one change feed for each partition key range in the container. So step 1 calls ReadPartitionKeyRangeFeedAsync to discover the partition key ranges, with a loop that utilizes a continuation token from the response so that we retrieve all of the partition key ranges into a list.

Then, in step 2, we iterate the list to consume the change feed on each partition key range. Notice the ChangeFeedOptions object that we set on each iteration, which identifies the partition key range in PartitionKeyRangeId, and then sets either StartFromBeginning or StartTime, depending on whether lastSync is null or not. If it’s null (which will be true only on the very first sync), then StartFromBeginning will be set to true and StartTime will be set to null. Otherwise, StartFromBeginning gets set to false, and StartTime gets set to the timestamp from the last sync.

After preparing the options, we call CreateDocumentChangeFeedQuery that returns an iterator. As long as the iterator’s HasMoreResults property is true, we call ExecuteNextAsync on it to fetch the next set of results from the change feed. And here, ultimately, is where we plug in our sync logic.

Each result is a changed document. We know this will always include the sync document, because we’ll be updating it after every sync. This is metadata that we don’t need copied over to the second container each time, so we filter out the sync document by testing the city property for “sync.”

For all other changed documents, it now becomes a matter of performing the appropriate create, update, or delete operation on the second container. First, we check to see if there is a ttl property on the document. Remember that this is our indication of whether this is a delete or not. If the ttl property isn’t present, then it’s either a create or an update. In either case, we handle the change by calling UpsertDocumentAsync on the second container (upsert means “update or insert”).

Otherwise, if we detect the ttl property, then we call DeleteDocumentAsync to delete the document from the second container, knowing that Cosmos DB will delete its counterpart from the first container when the ttl expires.
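Putting those two steps together, a sync method along these lines could look like the sketch below. It is not the exact listing: the class, the Classify helper, and the SyncAction enum are names I introduced for illustration, and lastSync is passed in as a parameter rather than read from (and written back to) the sync document, which is left out to keep the example short.

```csharp
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using System;
using System.Collections.Generic;
using System.Net;
using System.Threading.Tasks;

public static class SyncSketch
{
    public enum SyncAction { Skip, Upsert, Delete }

    // Decides what to do with one changed document from the byCity change feed.
    public static SyncAction Classify(string city, int? ttl)
    {
        if (city == "sync") return SyncAction.Skip; // metadata document, never copied
        return ttl.HasValue ? SyncAction.Delete : SyncAction.Upsert;
    }

    public static async Task SyncCollectionsAsync(DocumentClient client, DateTime? lastSync)
    {
        var byCityUri = UriFactory.CreateDocumentCollectionUri("multipk", "byCity");
        var byStateUri = UriFactory.CreateDocumentCollectionUri("multipk", "byState");

        // Step 1: discover every partition key range in the source container.
        var ranges = new List<PartitionKeyRange>();
        string continuation = null;
        do
        {
            var options = new FeedOptions { RequestContinuation = continuation };
            var response = await client.ReadPartitionKeyRangeFeedAsync(byCityUri, options);
            ranges.AddRange(response);
            continuation = response.ResponseContinuation;
        } while (continuation != null);

        // Step 2: consume the change feed of each partition key range.
        foreach (var range in ranges)
        {
            var feedOptions = new ChangeFeedOptions
            {
                PartitionKeyRangeId = range.Id,
                StartFromBeginning = lastSync == null, // true only on the first sync
                StartTime = lastSync,
            };
            var query = client.CreateDocumentChangeFeedQuery(byCityUri, feedOptions);
            while (query.HasMoreResults)
            {
                var changes = await query.ExecuteNextAsync<Document>();
                foreach (var doc in changes)
                {
                    var action = Classify(
                        doc.GetPropertyValue<string>("city"),
                        doc.GetPropertyValue<int?>("ttl"));

                    if (action == SyncAction.Upsert)
                        await client.UpsertDocumentAsync(byStateUri, doc);
                    else if (action == SyncAction.Delete)
                        await DeleteFromByStateAsync(client, doc);
                }
            }
        }
    }

    private static async Task DeleteFromByStateAsync(DocumentClient client, Document doc)
    {
        var uri = UriFactory.CreateDocumentUri("multipk", "byState", doc.Id);
        var state = doc.GetPropertyValue<string>("state");
        var options = new RequestOptions { PartitionKey = new PartitionKey(state) };

        // Ignore "not found" so that re-running a sync stays harmless.
        try { await client.DeleteDocumentAsync(uri, options); }
        catch (DocumentClientException ex) when (ex.StatusCode == HttpStatusCode.NotFound) { }
    }
}
```

Keeping the create/update/delete decision in a small pure method makes it easy to check on its own, without a live Cosmos DB account.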

Let’s test it out. Start the console app and run the DB (create database) and CD (create documents) commands. Then navigate to the Data Explorer in the Azure portal to verify that the database exists with the two containers, and that the byCity container has three documents in it, plus the sync document with a lastSync value of null indicating that no sync has yet occurred:

The byState container should be empty at this point, because we haven’t run our first sync yet:

Back in the console app, run the SC command to sync the containers. This copies all three documents from the first container’s change feed over to the second container, skipping the sync document which we excluded in our code:

Refresh the second container view, and you’ll see that the slogan property has now been updated there as well.

Finally, run the DD command in the console app to delete the Orlando, FL document from the first container. Remember that this doesn’t actually delete the document, but rather updates it with a ttl property set to 60 seconds. Then run SC to sync the containers again:

You can now confirm that the Orlando, FL document is deleted from the second container, and within a minute (upon ttl expiration), you’ll see that it gets deleted from the first container as well.

However, don’t wait longer than a minute after setting the ttl before running the sync or you will run out of time. Cosmos DB will delete the document from the first container when the ttl expires, at which point it will disappear from the change feed and you will lose your chance to delete it from the second container.

What’s Next?

It didn’t take that much effort to consume the change feed, but that’s only because we have a tiny container with just a handful of changes, and we’re manually invoking each sync. To consume the change feed at scale, much more work needs to be done. For example, the change feed on each partition key range of the container can be consumed concurrently, so we could add multithreading logic to parallelize those queries. Long change feeds can also be consumed in chunks, using continuation tokens that we could persist as a “lease,” so that new clients can resume consumption where previous clients left off. We also want the sync automated, so that we don’t need to poll manually.
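To give a feel for just the parallelization piece, here is a small sketch that fans out over partition key ranges with Task.WhenAll. The consumeRange delegate is hypothetical: it stands in for whatever code drains one range’s change feed and reports how many changes it handled. Lease persistence and automated polling are not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelFeedSketch
{
    // Starts one task per partition key range id, waits for all of them,
    // and returns the total number of changes the tasks reported.
    public static async Task<int> ConsumeAllAsync(
        IEnumerable<string> rangeIds,
        Func<string, Task<int>> consumeRange)
    {
        var tasks = rangeIds.Select(consumeRange).ToList();
        int[] counts = await Task.WhenAll(tasks);
        return counts.Sum();
    }
}
```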

Fortunately, the Change Feed Processor (CFP) library handles all these details for you. It was certainly beneficial to start by querying the change feed directly, since exploring that option first is a great way to learn how change feed works internally. However, unless you have very custom requirements, the CFP library is the way to go.

So stay tuned for part 2, where we’ll see how the CFP library lets us implement our multiple partition key solution more easily and much more robustly.

Introduction

Azure Cosmos DB is rapidly growing in popularity, and for good reason. Microsoft’s globally distributed, multi-model database service has massively scalable storage and throughput, provides a sliding scale for consistency, is fully and automatically indexed, and exposes multiple APIs. Throw in a server-side programming model for ACID transactions with stored procedures, triggers, and user-defined functions, along with 99.999% SLAs on availability, throughput, latency, and consistency, and it’s easy to see why Cosmos DB is fast winning the hearts of developers and solution architects alike.

Yet still today, one of the most overlooked capabilities in Cosmos DB is its change feed. This little gem sits quietly behind every container in your database, watches for changes, and maintains a persistent record of them in the order they occur. This provides you with a reliable mechanism to consume a continuous and incremental feed of changes, as documents are actively written or modified in any container.

There are numerous use cases for this, and I’ll call out a few of the most common ones in a moment. But all of them share a need to respond to changes made to a Cosmos DB container. And the first thought that comes to the mind of a relational database developer is to use a trigger for this. Cosmos DB supports triggers as part of its server-side programming model, so it could be natural to think of using this feature to consume changes in real time when you need to.

Unfortunately, though, triggers in Cosmos DB do not fire automatically as they do in the relational world. They need to be explicitly referenced with each change in order to run, so they cannot be relied upon for capturing all changes made to a container. Furthermore, triggers are JavaScript-only, and they run in a bounded execution environment within Cosmos DB that is scoped to a single partition key. These characteristics further limit what triggers can practically accomplish in response to a change.

But with change feed, you’ve got a reliable mechanism for retrieving changes made to any container, all the way back to the beginning of time. You can write code (in your preferred language) that consumes the change feed to process it as needed and deploy that code to run on Azure. This paves an easy path for you to build many different solutions for many different scenarios.

Scenarios for Change Feed

Some of the more common use cases for change feed include:

Replicating containers for multiple partition keys

Denormalizing a document data model across containers

Triggering API calls for an event-driven architecture

Real time stream processing and materialized view patterns

Moving or archiving data to secondary data stores

Each of these deserves its own focused blog post (and will hopefully get one). For the broader context of this overview post, however, I’ll discuss each of them at a high level.

Replicating containers for multiple partition keys

One of the most (perhaps the most) important things you need to do when creating a container is to decide on an appropriate partition key – a single property in your data that the container will be partitioned by.

Now sometimes this is easy, and sometimes it is not. In order to settle on the correct choice, you need a clear understanding of how your data is used (written to and queried), and how horizontal partitioning works. You can read all about this in my previous blog post, Horizontal Partitioning in Azure Cosmos DB.

But what do you do when you can’t decide? What if there are two properties that make good choices, one for write performance, and another that’s better for query performance? This can lead to “analysis paralysis,” a scary condition that is fortunately and easily remedied using change feed.

All you do is create two containers, each partitioned by a different partition key. The first container uses a partition key that’s optimized for writes (it may also be appropriate for certain types of queries), while the second one uses a partition key optimized for most typical queries. Simply use change feed to monitor changes made to the first container as the writes occur and replicate the changes out to the second container.

Your application then writes to the first container and queries from the second container, simple as that! I’ll show you a detailed walkthrough of exactly how to implement this using C# and the Change Feed Processor Library with Azure Functions in my next post.

Denormalizing a document data model across containers

Developers with a background in relational database design often struggle initially with the denormalized approach to data modeling in the NoSQL world of JSON documents. I personally empathize; from my own experience, I know that it can be difficult at first to embrace concepts that run contrary to deeply ingrained practices that span decades of experience in the field.

Data duplication is a case in point. It is considered a big no-no in the normalized world of relational databases with operational workloads, but with NoSQL we often deliberately duplicate data in order to avoid expensive additional lookups. Cosmos DB has no concept of a JOIN across documents, and we can avoid having to perform our own “manual” joins if we simply duplicate the same information across documents in different containers.

This is a somewhat finer-grained version of the previous scenario, which replicates entire documents between two containers. In this case, we have different documents in each container, with data fragments from changed documents in one container being replicated into other (related) documents in another container.

But how do you ensure that the duplicated data remains in sync as changes occur in the source container? Why, change feed of course! Just monitor the source container and update the target container. I’ll show you exactly how to do this in a future post.

Triggering API calls for an event-driven architecture

In this scenario, you source events to a set of microservices, each with a single responsibility. For instance, consider an e-commerce website with a large-scale order processing pipeline. The pipeline is broken up into a set of smaller microservices, each of which can be scaled out independently. Each microservice is responsible for a single task in the pipeline, such as calculating tax on each order, generating tax audit records, processing each order payment, sending orders off to a fulfillment center, and generating shipping notifications.

Thus, you potentially have N microservices communicating with up to N-1 other microservices, which adds significant complexity to the larger architecture. The design can be greatly simplified if, instead, all these microservices communicate through a bus; that is, a persistent event store. And Cosmos DB serves as an excellent persistent event store, because the change feed makes it easy to broker and source these events to each microservice. Furthermore, because the events themselves are persisted, the order processing pipeline itself is very robust and incredibly resilient to failures. You can also query and navigate the individual events, so that this data can be surfaced out through a customer care API.

Real time stream processing and materialized view patterns

The change feed can also be used for performing real time stream processing and analytics. In the order processing pipeline scenario, for example, this would enable you to take all the events and materialize a single view for tracking the order status. You could then easily and efficiently present the order status through a user-facing API.

Other examples of so-called “lambda architectures” include performing real-time analytics on IoT telemetry or building a scoring leaderboard for a massively multiplayer online video game.

Moving or archiving data to secondary data stores

Another common scenario for using the change feed involves replicating data from Cosmos DB as your primary (hot) store to some other secondary (cold) data store. Cosmos DB is a superb hot store because it can sustain heavy write ingestion, and then immediately serve the ingested records back out to a user-facing API.

Over time as the volume of data mounts, however, you may want to offload older data to cold storage for archival. Once again, change feed is a wonderful mechanism to implement a replication strategy that does just that.

Consuming the Change Feed

So how do you actually work with the change feed? There are several different ways, and I’ll conclude this blog post by briefly explaining three of them. (Don’t worry, I’ll drill deeper into all three in the next post!)

Direct Access

First, you can query the change feed directly using the SDK. This raw approach works but is the hardest to implement at large scale. Essentially, you first need to discover all the container’s partitions, and then you query each of them for their changes. You also need to persist state metadata; for example, a timestamp for when the change feed was last queried, and a continuation token for each partition. Plus, you’ll want to optimize performance by spawning multiple tasks across different partitions so that they get processed in parallel.

If all this sounds like a lot of work, it is. Which is why you’ll almost certainly want to leverage the Change Feed Processor (CFP) library instead.

Change Feed Processor (CFP) Library

The CFP library provides a high-level abstraction over direct access that greatly simplifies the process of reading the change feed from all the different partitions of a container. This is a separate NuGet package that you pull into your project, and it handles all the aforementioned complexity for you. It will automatically persist state, track all the partitions of the container, and acquire leases so that you can scale out across many consumers.

To make this work, the CFP library persists a set of leases as documents in another dedicated Cosmos DB container. Then, when you spin up consumers, they attempt to acquire leases as they expire.

All you do is write an observer class that implements IChangeFeedObserver. The primary method of this interface that you need to implement is ProcessChangesAsync, which receives the change feed as a list of documents that you can process as needed. No partitions to worry about, no timestamps or continuation tokens to persist, and no scale-out needs to concern yourself with.

However, you still need to write your own host, and deploy the DLL with your observer class to an Azure app service. Although the process is straightforward, going with Azure Functions instead provides an even easier deployment model.
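For a sense of how little code the observer needs, here is a bare-bones sketch. It assumes version 2.x of the Microsoft.Azure.DocumentDB.ChangeFeedProcessor NuGet package, where the interface lives in the FeedProcessing namespace; the method signatures may differ in other versions, and the LoggingObserver class name is my own. The host that registers the observer is not shown.

```csharp
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.ChangeFeedProcessor.FeedProcessing;
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical observer that just logs the id of every changed document.
public class LoggingObserver : IChangeFeedObserver
{
    // Called when this observer starts processing a partition.
    public Task OpenAsync(IChangeFeedObserverContext context)
    {
        return Task.CompletedTask;
    }

    // Called when processing stops, for example when a lease is lost.
    public Task CloseAsync(IChangeFeedObserverContext context, ChangeFeedObserverCloseReason reason)
    {
        return Task.CompletedTask;
    }

    // Receives each batch of changed documents from the change feed.
    public Task ProcessChangesAsync(
        IChangeFeedObserverContext context,
        IReadOnlyList<Document> docs,
        CancellationToken cancellationToken)
    {
        foreach (var doc in docs)
        {
            Console.WriteLine($"Changed document: {doc.Id}");
        }
        return Task.CompletedTask;
    }
}
```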

Azure Functions

The simplest way to consume the change feed is by using Azure Functions with a Cosmos DB trigger. If you’re not already familiar with Azure Functions, they let you write individual methods (functions), which you deploy for execution in a serverless environment hosted on Azure. The term “serverless” here means that you don’t also have to write a host and deploy an Azure app service to run your code.

Azure Functions are invoked by one of any number of triggers. Yes, Azure Functions also uses the term triggers, but unlike triggers in Cosmos DB, an Azure Functions trigger always fires when its conditions are met. There are several different triggers available, including the one that we care about here, the Cosmos DB trigger. This Azure Functions trigger binds to configuration that points to the container you want to monitor changes on, and the lease collection that gets managed by the CFP library under the covers.

From there, the process is identical to using the CFP library, only the deployment model is dramatically simpler. The Azure Functions method that is bound using the Cosmos DB trigger receives a parameter with the same list of documents that a CFP library’s observer class does in its ProcessChangesAsync method, and you process change feed data the way you need to just the same.
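A function bound this way might look roughly like the sketch below. It assumes the 3.x version of the Microsoft.Azure.WebJobs.Extensions.CosmosDB package; the function name, the CosmosDbConnectionString app setting, and the lease container name are placeholders of mine, while multipk and byCity are the database and container from the multiple partition key walkthrough.

```csharp
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using System.Collections.Generic;

public static class ByCityChangeFeedFunction
{
    [FunctionName("ByCityChangeFeed")]
    public static void Run(
        [CosmosDBTrigger(
            databaseName: "multipk",
            collectionName: "byCity",
            ConnectionStringSetting = "CosmosDbConnectionString", // hypothetical app setting name
            LeaseCollectionName = "lease",                        // hypothetical lease container
            CreateLeaseCollectionIfNotExists = true)]
        IReadOnlyList<Document> documents,
        ILogger log)
    {
        if (documents == null) return;

        // Same list of changed documents a CFP observer would receive.
        foreach (var doc in documents)
        {
            log.LogInformation("Changed document: {id}", doc.Id);
        }
    }
}
```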

What’s Next?

Hopefully, this blog post opened your eyes to the change feed in Azure Cosmos DB. This powerful feature creates all sorts of exciting possibilities across a wide range of use cases. In the next installment, I’ll focus on container replication for multiple partition keys, and walk you step by step through the process of building a change feed solution using direct access, CFP library, and Azure Functions – all with working code samples. So, stay tuned!