MongoDB is a NoSQL document database, developed and open sourced by 10gen. The initial release of the database server was in 2007 and it was made open source in 2009. When MongoDB started to spread, its biggest advantage was its schema-less document structure: it stores JSON-like objects in its own binary format, BSON.

Many developers say that MongoDB is very good for prototyping and great for small websites, because development is fast. There are also big sites that use MongoDB as their data store, such as Foursquare, SourceForge and The New York Times (according to Wikipedia).

MongoDB has a huge fan club (at the time of writing, it is the fifth most widely adopted database engine, with a score of 246.5 on http://db-engines.com), and there are drivers for many languages; some of the most widely adopted ones are for Java, .NET, Node.js, C++, Ruby and, of course, Python.

In this article I'll present the CRUD operations using PyMongo, the official Python driver for MongoDB. In case you are not familiar with MongoDB, there is a very good interactive online tutorial at http://try.mongodb.org.

What are CRUD operations?

The acronym CRUD stands for Create, Read, Update and Delete. These operations are considered the four basic functions of a repository (a.k.a. data storage). Some people extend them with Search, in which case the acronym becomes SCRUD.

CRUD operations can be mapped directly to database operations:

Create matches insert

Read matches select

Update matches update

Delete matches delete
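
To make the mapping concrete, here is a minimal sketch that runs the four operations as SQL statements against an in-memory SQLite database, using Python's standard sqlite3 module. The projects table and its columns are assumptions chosen to resemble the model used later in this article; SQLite appears here only because it needs no server.

```python
import sqlite3

# In-memory database: nothing is written to disk.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE projects (id INTEGER PRIMARY KEY, title TEXT, price REAL)")

# Create -> insert
connection.execute(
    "INSERT INTO projects (id, title, price) VALUES (?, ?, ?)",
    (1, "Website", 1500.0))

# Read -> select
row = connection.execute(
    "SELECT title, price FROM projects WHERE id = ?", (1,)).fetchone()

# Update -> update
connection.execute("UPDATE projects SET price = ? WHERE id = ?", (1800.0, 1))
updated_price = connection.execute(
    "SELECT price FROM projects WHERE id = ?", (1,)).fetchone()[0]

# Delete -> delete
connection.execute("DELETE FROM projects WHERE id = ?", (1,))
remaining = connection.execute("SELECT COUNT(*) FROM projects").fetchone()[0]

connection.close()
```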

CRUD operations with pymongo

Prerequisites

If you want to work with MongoDB, Python and PyMongo, you will have to install all three. Here are the links (each explains the default configuration and the installation steps):

The Data Model

Python is a good choice for working with MongoDB, because Python dictionaries map naturally onto the JSON-like documents that MongoDB stores, so no manual data conversion is needed when saving data to collections (MongoDB collections are the equivalent of tables in relational databases).

The Project class is very simple: the constructor assigns values to the instance attributes. The project as a model (and as a class) has _id, title, description, price and assigned_to attributes. The _id field is special: MongoDB uses it to uniquely identify an entry (document) in a collection. If a document is saved without an _id field, one is generated and added to it automatically. PyMongo provides a Python class for this identifier type, called ObjectId. Invoked without any parameters, it generates a new unique identifier.

In Python, the attributes assigned to an instance are stored in an internal dictionary named __dict__:

def get_as_json(self):
    """ Method returns the JSON representation of the Project object, which can be saved to MongoDB """
    return self.__dict__

So get_as_json(self) is a helper method that returns the instance’s __dict__ attribute. Since PyMongo accepts plain Python dictionaries as documents, the result of get_as_json(self) can be stored directly in MongoDB.

There is also a @staticmethod defined in the class. Python’s static methods are much like static methods in other object-oriented programming languages: they can be invoked using the class name. The build_from_json(json_data) method helps to create new instances of the Project class when loading data from MongoDB.
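
The full listing of the Project class is not reproduced here, so the following is only a sketch reconstructed from the description above. The parameter order, the default values and the body of build_from_json are assumptions. The fallback to uuid4 is also an assumption, added purely so that the sketch runs even where the bson package (installed together with PyMongo) is missing.

```python
import uuid

try:
    # ObjectId lives in the bson package, which is installed with PyMongo
    from bson.objectid import ObjectId as new_id
except ImportError:
    # Assumption: fallback so the sketch runs without PyMongo installed
    new_id = uuid.uuid4


class Project(object):
    """ Model class for a project (sketch based on the article's description) """

    def __init__(self, project_id=None, title=None, description=None,
                 price=0.0, assigned_to=None):
        # generate a new identifier when none is supplied
        self._id = new_id() if project_id is None else project_id
        self.title = title
        self.description = description
        self.price = price
        self.assigned_to = assigned_to

    def get_as_json(self):
        """ Returns the dictionary representation, which can be saved to MongoDB """
        return self.__dict__

    @staticmethod
    def build_from_json(json_data):
        """ Creates a Project from a document (dictionary) loaded from MongoDB """
        if json_data is None:
            raise ValueError("No data to create a Project from")
        return Project(json_data.get('_id'),
                       json_data.get('title'),
                       json_data.get('description'),
                       json_data.get('price'),
                       json_data.get('assigned_to'))


# Round trip: object -> dictionary -> object
project = Project(title="Website", description="Company site",
                  price=1500.0, assigned_to="Anna")
document = project.get_as_json()
copy = Project.build_from_json(document)
```

The round trip at the end shows the two helpers working together: the dictionary produced by get_as_json() is exactly what build_from_json() expects when a document comes back from the database.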

Repository with CRUD operations

The class ProjectsRepository implements the CRUD operations using pymongo:

from pymongo import MongoClient
from bson.objectid import ObjectId
from project import Project


class ProjectsRepository(object):
    """ Repository implementing CRUD operations on projects collection in MongoDB """

    def __init__(self):
        # initializing the MongoClient, this helps to
        # access the MongoDB databases and collections
        self.client = MongoClient(host='localhost', port=27017)
        self.database = self.client['projects']

    def create(self, project):
        if project is not None:
            self.database.projects.insert(project.get_as_json())
        else:
            raise Exception("Nothing to save, because project parameter is None")

    def read(self, project_id=None):
        if project_id is None:
            return self.database.projects.find({})
        else:
            return self.database.projects.find({"_id":project_id})

    def update(self, project):
        if project is not None:
            # the save() method updates the document if this has an _id property
            # which appears in the collection, otherwise it saves the data
            # as a new document in the collection
            self.database.projects.save(project.get_as_json())
        else:
            raise Exception("Nothing to update, because project parameter is None")

    def delete(self, project):
        if project is not None:
            self.database.projects.remove(project.get_as_json())
        else:
            raise Exception("Nothing to delete, because project parameter is None")

The MongoClient class from the PyMongo API creates a connection to the MongoDB server and gives access to its data. In the constructor I create a new instance of MongoClient, passing in the host and port of the MongoDB server. Mine was installed locally, so I used host='localhost' and port=27017; these are the default values of MongoClient, but I added them here as an example of how they can be customized. The databases on the MongoDB server can be accessed the same way as dictionary values in Python, e.g. client['projects']. The create(self, project), update(self, project) and delete(self, project) methods receive a parameter of type Project. Note that the insert(), save() and remove() collection methods used here belong to the older PyMongo API: they were deprecated in PyMongo 3 and removed in PyMongo 4 in favor of insert_one(), replace_one() and delete_one()/delete_many().

The create(self, project) method

Uses the insert() method of PyMongo’s collection API, passing the dictionary representation of the Project object as a parameter. If the project parameter is None, it raises an Exception. In traditional SQL, projects.insert() would be insert into projects(_id, title, description, price, assigned_to) values (…). The insert() method can raise an OperationFailure if an error occurs during the save.

def create(self, project):
    if project is not None:
        self.database.projects.insert(project.get_as_json())
    else:
        raise Exception("Nothing to save, because project parameter is None")

The read(self, project_id) method

Uses the find() method from the PyMongo API. The method takes an optional project_id parameter and queries the database for the project with the given id; if no id is given, it returns all the documents in the collection. In either case find() returns a cursor, which is iterated to get the documents. In SQL, projects.find({}) would be select * from projects. When a project_id is available, the SQL would be select * from projects where _id=project_id.

The update(self, project) method

Uses the save() method from the PyMongo API. The save method is special, because its behavior depends on the data it gets as a parameter. If the passed-in document contains an _id field, it looks up the document with that _id in the collection and replaces it with the passed-in one (inserting it if no such document exists). If the passed-in document does not have an _id, it is inserted into the collection and receives a new _id. The save method can be matched with SQL’s update or insert operations, depending on the scenario.

def update(self, project):
    if project is not None:
        # the save() method updates the document if this has an _id property
        # which appears in the collection, otherwise it saves the data
        # as a new document in the collection
        self.database.projects.save(project.get_as_json())
    else:
        raise Exception("Nothing to update, because project parameter is None")

The delete(self, project) method

Uses the remove() method from the PyMongo API. Please be careful when using this method: its effect cannot be reverted. If remove() is invoked with an empty document or without any parameter, all the documents in the collection are deleted. In SQL, remove() is equal to delete from projects or, when a project_id is available, delete from projects where _id=project_id.

def delete(self, project):
    if project is not None:
        self.database.projects.remove(project.get_as_json())
    else:
        raise Exception("Nothing to delete, because project parameter is None")
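
As a sketch of how the same repository might look on PyMongo 3 or later, where insert(), save() and remove() give way to insert_one(), replace_one() with upsert=True, and delete_one(), consider the class below. The class name ProjectsRepositoryV2 and the idea of passing the collection into the constructor are assumptions made for illustration; they are not part of the original code.

```python
class ProjectsRepositoryV2(object):
    """ Sketch of the repository using the newer PyMongo collection methods """

    def __init__(self, collection):
        # `collection` is expected to be a pymongo Collection, for example
        # MongoClient(host='localhost', port=27017)['projects'].projects
        self.collection = collection

    def create(self, project):
        if project is None:
            raise ValueError("Nothing to save, because project parameter is None")
        # insert_one() returns a result object carrying the new document's _id
        return self.collection.insert_one(project.get_as_json()).inserted_id

    def read(self, project_id=None):
        # an empty filter matches every document in the collection
        query = {} if project_id is None else {"_id": project_id}
        # find() returns a cursor; list() collects the matching documents
        return list(self.collection.find(query))

    def update(self, project):
        if project is None:
            raise ValueError("Nothing to update, because project parameter is None")
        document = project.get_as_json()
        # replace the stored document with the same _id, or insert it when
        # no such document exists (upsert), similar to what save() did
        self.collection.replace_one({"_id": document["_id"]}, document, upsert=True)

    def delete(self, project):
        if project is None:
            raise ValueError("Nothing to delete, because project parameter is None")
        # filtering on _id alone removes at most the one matching document
        self.collection.delete_one({"_id": project.get_as_json()["_id"]})
```

Deleting by _id only, rather than by the whole document as in the remove() call above, is a deliberate choice in this sketch: an empty or partial filter can never match more documents than intended. Passing the collection in also makes it possible to substitute a stand-in object when testing.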

I created a simple console application which demonstrates how the ProjectsRepository can be used. The code within main.py is executed only when it is launched as the main program; this can be checked using the following if statement:

if __name__ == '__main__':
    # the script is executed as main, do something here
    main()

Demo Code

The demo code is simple. I created five methods, four methods for testing the CRUD operations and one main() method which glues the steps together.

Working with PyMongo is easy and fast. Once you are familiar with the 6-7 basic methods of the API, you can build robust and dynamic applications that serve as back ends for web pages or desktop applications.
