To build this system you will use MongoDB’s flexible schema to store
all content “nodes” in a single collection regardless of type. This
guide will provide prototype schema and describe common operations for
the following primary node types:

Basic page

Basic pages are useful for displaying infrequently-changing text
such as an ‘about’ page. With a basic page, the salient information
is the title and the content.

Blog entry

Blog entries record a “stream” of posts from users on the CMS and
store title, author, content, and date as relevant information.

Photo

Photos participate in photo galleries, and store title,
description, author, and date along with the actual photo binary
data.

This solution does not describe schema or process for storing or using
navigational and organizational information.

Although documents in the nodes collection
contain content of different types, all documents have a similar
structure and a set of common fields. Consider the following
prototype document for a “basic page” node type:
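As a rough sketch, such a prototype could be built as a Python dictionary. The field names follow the insert example later in this guide; the concrete values, the `make_basic_page` helper, and the `new_nonce` helper are illustrative assumptions. A real application would normally use a `bson.ObjectId` for the nonce and the author id; a random 24-character hex token stands in here so the sketch runs without a driver installed.

```python
import secrets
from datetime import datetime, timezone


def new_nonce():
    """Stand-in for bson.ObjectId(): a random 24-character hex token (assumption)."""
    return secrets.token_hex(12)


def make_basic_page(section, slug, title, text, author_name):
    """Build a 'basic-page' node document with the common metadata fields."""
    return {
        "nonce": new_nonce(),
        "metadata": {
            "type": "basic-page",
            "section": section,
            "slug": slug,
            "title": title,
            "created": datetime.now(timezone.utc),
            "author": {"id": new_nonce(), "name": author_name},
            "tags": [],
            # For a basic page the detail holds only the page text.
            "detail": {"text": text},
        },
    }


# Hypothetical example values.
about_page = make_basic_page("my-site", "about", "About Us", "We make things.", "Rick")
```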

Most fields are descriptively titled. The section field identifies
groupings of items, as in a photo gallery or a particular blog. The
slug field holds a URL-friendly representation of the node, usually
unique within its section, that the application uses to generate URLs.

All documents also have a detail field that varies with the
document type. For the basic page above, the detail field might hold
the text of the page. For a blog entry, the detail field might
hold a sub-document. Consider the following prototype:
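A minimal sketch of what the detail sub-document of a blog entry might look like. The `publish_on` and `text` keys are taken from the insert example later in this guide; the `make_blog_detail` helper is a hypothetical name used only for illustration.

```python
from datetime import datetime, timezone


def make_blog_detail(text, publish_on=None):
    """Build the `detail` sub-document for a 'blog-entry' node."""
    return {
        # When the entry should become visible; defaults to the current UTC time.
        "publish_on": publish_on or datetime.now(timezone.utc),
        "text": text,
    }


detail = make_blog_detail("I noticed the news from Washington today…")
```

The rest of the node document (nonce, section, slug, title, author, tags) keeps the same shape as for a basic page; only the type and the detail differ.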

Photos require a different approach. Because photos can be much larger
than the node documents, and can exceed MongoDB's 16 MB limit on a
single document, it’s important to separate the binary photo storage
from the node metadata.

GridFS provides the ability to store larger files in MongoDB.
GridFS stores data in two collections, in this case,
cms.assets.files, which stores metadata, and cms.assets.chunks
which stores the data itself. Consider the following prototype
document from the cms.assets.files collection:
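The following is an illustrative sketch of such a files document, written as a Python dictionary. The top-level fields (`length`, `chunkSize`, `uploadDate`, `contentType`) are standard GridFS fields that the driver maintains; everything under `metadata` mirrors the node metadata used elsewhere in this guide. All values, including the file name and resolution under `detail`, are made-up placeholders.

```python
from datetime import datetime, timezone

photo_file_doc = {
    "_id": "<ObjectId>",            # assigned by GridFS; placeholder string here
    "length": 1048576,              # size of the photo in bytes (placeholder)
    "chunkSize": 261120,            # bytes per chunk in cms.assets.chunks
    "uploadDate": datetime(2012, 3, 1, tzinfo=timezone.utc),
    "contentType": "image/jpeg",
    "metadata": {
        "nonce": "<ObjectId>",
        "type": "photo",
        "section": "my-album",
        "slug": "2012-03-invisible-bicycle",
        "title": "Invisible Bicycle",
        "created": datetime(2012, 3, 1, tzinfo=timezone.utc),
        "author": {"id": "<ObjectId>", "name": "Rick"},
        "tags": ["bicycle"],
        "locked": None,             # set to a timestamp while an upload is in progress
        "detail": {
            "filename": "invisible_bicycle.jpg",
            "resolution": [1600, 1200],
        },
    },
}
```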

This section outlines a number of common operations for building and
interacting with the metadata and asset layer of the CMS for all node
types. All examples in this document use the Python programming
language and the PyMongo driver for
MongoDB, but you can implement this system using any language you
choose.

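The most common operations inside of a CMS center on creating and
editing content. Consider the following insert_one() operation (the
older insert() method was removed in PyMongo 4):

    from datetime import datetime
    from bson import ObjectId

    # `db` is an open PyMongo database handle and `user_id` is the
    # author's id; both are assumed to exist already.
    db.cms.nodes.insert_one({
        'nonce': ObjectId(),
        'metadata': {
            'section': 'myblog',
            'slug': '2012-03-noticed-the-news',
            'type': 'blog-entry',
            'title': 'Noticed in the News',
            'created': datetime.utcnow(),
            'author': {'id': user_id, 'name': 'Rick'},
            'tags': ['news', 'musings'],
            'detail': {
                'publish_on': datetime.utcnow(),
                'text': 'I noticed the news from Washington today…'}}})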

Once inserted, your application must have some way of preventing
multiple concurrent updates. The schema uses the special nonce
field to help detect concurrent edits. When the nonce is part of
the query portion of the update operation, an update based on a
stale copy matches no document, so the application can detect the
editing collision and report an error. Consider the following update:
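One way the nonce check might look with a recent PyMongo release, as a sketch rather than a definitive implementation. `update_one` and `matched_count` are PyMongo API; the `nodes` parameter (expected to be `db.cms.nodes`), the `update_text` function, and the `ConflictError` exception are hypothetical names. A random hex token stands in for `ObjectId()` so the sketch has no driver dependency.

```python
import secrets


class ConflictError(Exception):
    """Raised when no document matched the expected nonce (hypothetical name)."""


def update_text(nodes, section, slug, nonce, text):
    """Replace the node's text only if its nonce is still the one the editor read."""
    new_nonce = secrets.token_hex(12)  # stand-in for bson.ObjectId() (assumption)
    result = nodes.update_one(
        # The nonce in the filter makes the update a no-op for stale copies.
        {"metadata.section": section, "metadata.slug": slug, "nonce": nonce},
        # Rotate the nonce so any other editor holding the old one now conflicts.
        {"$set": {"metadata.detail.text": text, "nonce": new_nonce}},
    )
    if result.matched_count == 0:
        # Either another editor saved first or the node does not exist.
        raise ConflictError(f"{section}/{slug} changed since it was loaded")
    return new_nonce
```

The caller keeps the returned nonce and passes it to the next edit. On `ConflictError`, the application would typically reload the node and ask the editor to merge or retry.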

To support updates and queries on the metadata.section and
metadata.slug fields, and to ensure that two editors don’t
create two documents with the same section and slug, create a unique
compound index on these two fields. Use the
following operation at the Python/PyMongo console:
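A short sketch of the index creation using PyMongo's `create_index`. The `ensure_node_indexes` wrapper and its `nodes` parameter are hypothetical; in the setup used by this guide, `nodes` would be `db.cms.nodes`.

```python
def ensure_node_indexes(nodes):
    """Create the unique compound index on (section, slug); returns the index name."""
    return nodes.create_index(
        [("metadata.section", 1), ("metadata.slug", 1)],
        unique=True,
    )
```

With this index in place, inserting a second node with the same section and slug fails with a duplicate key error instead of silently creating a conflicting document.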

Because uploading the photo spans multiple documents and is a
non-atomic operation, you must “lock” the file during upload by
writing datetime.utcnow() in the
record. This helps when there are multiple concurrent editors and lets
the application detect stalled file uploads. This operation assumes
that, for photo upload, the last update will succeed:

    from datetime import datetime, timedelta
    from gridfs import GridFS

    class FileDoesNotExist(Exception):
        pass

    def update_photo_content(input_file, section, slug):
        # `db` is an open PyMongo database handle (assumed to exist)
        fs = GridFS(db, 'cms.assets')
        files = db.cms.assets.files

        # Delete the old version if it's unlocked or was locked more than 5
        # minutes ago
        file_obj = files.find_one({
            'metadata.section': section,
            'metadata.slug': slug,
            'metadata.locked': None})
        if file_obj is None:
            threshold = datetime.utcnow() - timedelta(seconds=300)
            file_obj = files.find_one({
                'metadata.section': section,
                'metadata.slug': slug,
                'metadata.locked': {'$lt': threshold}})
        if file_obj is None:
            raise FileDoesNotExist()
        fs.delete(file_obj['_id'])

        # update content, keep metadata unchanged
        file_obj['metadata']['locked'] = datetime.utcnow()
        with fs.new_file(**file_obj) as upload_file:
            while True:
                chunk = input_file.read(upload_file.chunk_size)
                if not chunk:
                    break
                upload_file.write(chunk)

        # unlock the file
        files.update_one(
            {'_id': upload_file._id},
            {'$set': {'metadata.locked': None}})

As with the basic operations, you can use a much simpler operation
to edit the tags:
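A minimal sketch of such a tag update, assuming a recent PyMongo release. The `update_photo_tags` function and its `files` parameter (expected to be `db.cms.assets.files`) are hypothetical names. Because the tags live entirely in the files document, a single `$set` is enough and no locking is needed.

```python
def update_photo_tags(files, section, slug, tags):
    """Overwrite the tag list of one photo; returns True if a photo matched."""
    result = files.update_one(
        {"metadata.section": section, "metadata.slug": slug},
        {"$set": {"metadata.tags": list(tags)}},
    )
    return result.matched_count == 1
```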

Create a unique index on { 'metadata.section': 1, 'metadata.slug': 1 }
in the cms.assets.files collection
to support the above operations and prevent users from creating or
updating the same file concurrently. Use the following operation in
the Python/PyMongo console:
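A sketch of that operation for the asset side. The `ensure_asset_indexes` helper and its `bucket` parameter are hypothetical; the collection name is derived from the GridFS prefix used in this guide, and `create_index` is the PyMongo call that builds the index.

```python
def ensure_asset_indexes(db, bucket="cms.assets"):
    """Create the unique (section, slug) index on the GridFS files collection."""
    files = db[bucket + ".files"]  # GridFS keeps file metadata in <bucket>.files
    return files.create_index(
        [("metadata.section", 1), ("metadata.slug", 1)],
        unique=True,
    )
```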

To retrieve a list of images based on their tags, use the following
operation:

    from gridfs import GridFS

    # `db` is an open PyMongo database handle; `tag` is the tag to match
    fs = GridFS(db, 'cms.assets')
    image_file_objects = db.cms.assets.files.find({'metadata.tags': tag})
    for image_file_object in image_file_objects:
        image_file = fs.get(image_file_object['_id'])
        # do something with the image file

In a CMS, read performance is more critical than write performance. To
achieve the best read performance in a sharded cluster, ensure
that the mongos can route queries to specific shards.

Also remember that MongoDB cannot enforce a unique index across
shards unless the shard key is a prefix of that index. Using a
compound shard key that consists of
metadata.section and metadata.slug provides the same
uniqueness semantics as described above.
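As a sketch, the shard key could be declared by sending the `shardCollection` command to the admin database through a mongos. The `shard_nodes_collection` helper is a hypothetical name, and the `namespace` argument must be the full `<database>.<collection>` name for your deployment, which this guide does not specify.

```python
def shard_nodes_collection(client, namespace):
    """Shard `namespace` on (metadata.section, metadata.slug) via the admin database."""
    # Field order matters: section first, then slug.
    shard_key = {"metadata.section": 1, "metadata.slug": 1}
    return client.admin.command("shardCollection", namespace, key=shard_key)
```

For example, `shard_nodes_collection(client, "mydb.cms.nodes")`, where `client` is a `MongoClient` connected to a mongos and `mydb` is a placeholder database name. Queries that filter on section, or on section and slug together, can then be routed to a single shard.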

Warning

Consider the actual use and workload of your cluster before
configuring sharding.