ZODB, a NoSQL Database

“The way you use data is the way you store it”

When we need to store large volumes of data, we usually reach for a
relational database. We rarely look for alternatives unless we run
into a bottleneck. Even then, we are likely to spend a lot of effort
optimising the database rather than step outside the relational
model. Non-relational databases have been around for many years. When
object-oriented programming became popular, a number of object
databases were created, but none captured any substantial mind share.
Object-relational mapping software such as Hibernate for Java,
SQLAlchemy for Python and ActiveRecord for Ruby met the need to use
relational databases within the object-oriented programming paradigm.

SQL is a wonderful tool for arbitrary queries on a relational
database. However, we may overestimate how often we need it. For
example, when dealing with a content management system, we are more
likely to need keyword retrieval than a flexible SQL query. We use
keyword search with Gmail, and I have rarely felt the need to narrow
a search to, say, the subject only. Even if I search the subject
line, I still need a keyword search. I can't recall a search where an
index on the subject would have helped, e.g. matching a prefix.
Hence, a keyword search tool like Apache Lucene
(http://lucene.apache.org/) combined with any database, whether
relational or not, can be a superb solution.
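
To make the contrast with SQL concrete, here is a toy sketch of
keyword retrieval using an inverted index in plain Python. It is not
how Lucene works internally, only an illustration of the idea; the
function names build_index and search, and the sample documents, are
made up for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"\w+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Hypothetical sample data, for illustration only.
documents = {
    1: "Minutes of the project meeting",
    2: "Invoice for the project",
    3: "Holiday photos",
}
index = build_index(documents)
print(sorted(search(index, "project meeting")))
```

The query does not care which field a word came from, which is the
behaviour a content management system usually needs.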

In the last few years, the need for web-scale databases has
increased interest in 'NoSQL' databases, a misleading term which is
now often interpreted as 'not only SQL'
(http://nosql-database.org/).
One category of such databases is the object database management
system (ODBMS), and among them is a native object database for
Python, ZODB (http://www.zodb.org/).
Object databases such as ZODB provide ACID transactions. They also
reduce the friction of transforming objects into relational table
rows and back, which makes accessing and manipulating objects more
efficient. There is no need to map all our information needs onto a
well-defined schema, which can be very difficult at times. Imagine a
shopping engine. Each category, or even each product group, may need
its own combination of attributes. Do we create a superset of all
attributes, or do we store key-value pairs? Or should we just dump
them into a string description and interpret the string at runtime?
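
With objects, the dilemma largely disappears, because each object can
simply carry the attributes it needs. The following is a minimal
sketch in plain Python with no database involved; the Product class
and the sample attributes are hypothetical and exist only to
illustrate the point.

```python
class Product:
    """A product whose attribute set is decided per instance."""

    def __init__(self, name, **attributes):
        self.name = name
        # Each product gets only the attributes that make sense for it.
        for key, value in attributes.items():
            setattr(self, key, value)

    def attribute_names(self):
        """Return the names of all attributes, sorted for display."""
        return sorted(vars(self))


# Hypothetical sample data, for illustration only.
shirt = Product("Example Shirt", size="M", colour="blue")
laptop = Product("Example Laptop", ram_gb=16, screen_inches=14)

print(shirt.attribute_names())
print(laptop.attribute_names())
```

No superset schema is required: the shirt has no ram_gb attribute and
the laptop has no size attribute.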

ZODB in Practice

ZODB is like a dictionary. It stores data as key-value pairs, where
the value is a pickled (serialised) object. An object can itself be a
container, which behaves like a dictionary and can store a very large
number of elements.
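
The idea of a dictionary whose values are pickled objects can be
sketched with the standard library alone. This is a toy illustration
of the concept, not ZODB's actual storage code; the store dictionary
and the save and load helpers are made up for this example.

```python
import pickle

# Toy stand-in for the database: each key maps to pickled bytes.
store = {}


def save(key, obj):
    """Serialise the object and keep the bytes under the given key."""
    store[key] = pickle.dumps(obj)


def load(key):
    """Rebuild the object from the bytes stored under the given key."""
    return pickle.loads(store[key])


# Hypothetical sample data, for illustration only.
save("track-1", {"title": "Example Track", "artist": "Example Artist"})
track = load("track-1")
print(track["title"], "by", track["artist"])
```

What ZODB adds on top of this idea includes transactions, persistence
to disk and loading objects only when they are accessed.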

Let us look at a simple example which would be perfectly suitable
for a relational database and see how it may be implemented in ZODB.
We have a set of albums and a set of tracks. We may wish to access a
track and from there, if need be, access the album on which it
appears. Or we may access an album and then access the tracks which
make up the album. In the relational model, we will need a table each
for albums and tracks, and a foreign key from a track to its album.
Suppose we realise that a track can be on multiple albums. The
foreign key no longer suffices, and we will need one more table to
maintain the relationship between albums and tracks.
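
For comparison, the relational version of the many-to-many case might
look like the following sketch, which uses the sqlite3 module from
the standard library. The table names, column names and sample rows
are assumptions made for this example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE album (
    id    INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE track (
    id     INTEGER PRIMARY KEY,
    title  TEXT NOT NULL,
    artist TEXT NOT NULL
);
-- Junction table: one row per (album, track) pairing.
CREATE TABLE album_track (
    album_id INTEGER NOT NULL REFERENCES album(id),
    track_id INTEGER NOT NULL REFERENCES track(id),
    PRIMARY KEY (album_id, track_id)
);
"""


def albums_for_track(conn, track_id):
    """Return the titles of all albums that contain the given track."""
    rows = conn.execute(
        "SELECT album.title FROM album "
        "JOIN album_track ON album_track.album_id = album.id "
        "WHERE album_track.track_id = ? ORDER BY album.title",
        (track_id,),
    )
    return [title for (title,) in rows]


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO album (id, title) VALUES (?, ?)",
                 [(1, "Example Album A"), (2, "Example Album B")])
conn.execute("INSERT INTO track (id, title, artist) VALUES (?, ?, ?)",
             (1, "Example Track", "Example Artist"))
conn.executemany("INSERT INTO album_track (album_id, track_id) VALUES (?, ?)",
                 [(1, 1), (2, 1)])
print(albums_for_track(conn, 1))
```

Every navigation between an album and its tracks goes through a join
on the junction table.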

Now, let us see how we do the same in ZODB. The initial step is to
create or open the database, open a connection and access its root.
Let's write this basic code in app_db.py, as we will need to use it
in each script.

The next step is to define the models we need. Let's write them in
app_models.py. Each track can appear on multiple albums and each
album contains multiple tracks. The only noteworthy line is the
assignment of 1 to the _p_changed attribute, which tells ZODB that a
mutable structure like a list or a dictionary has changed.

Finally, we print the data to see how to access it in ZODB. We
iterate over each album and each track and print the values of the
object. The details flag prevents infinite recursion, since albums
refer to tracks and tracks refer back to albums.

def print_track(track, details=True):
    print('Title: %s by %s' % (track.title, track.artist))
    if details:
        for album in track.albums:
            print_album(album, details=False)
        print('')

db = app_db()
# iterate over albums and tracks
print('List of Albums')
for album in db.dbroot['Albums'].values():
    print_album(album)
print('List of Tracks')
for track in db.dbroot['Tracks'].values():
    print_track(track)
db.close()

Working with ZODB is almost as easy as dealing with dictionaries.
One can use the Python built-in function isinstance to determine the
type of the object we are dealing with and write very versatile and
flexible code. ZODB has been around for over a decade and has been
used in various production environments, though the Zope community
does not seem to have been successful in marketing it to developers
for use outside the Zope (or Plone) environments.
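
As a small illustration of the isinstance point, the sketch below
handles a mixed collection of objects with one function. The classes
are plain Python stand-ins, so the example runs without a database,
and the describe function is made up for this example.

```python
class Album:
    def __init__(self, title):
        self.title = title


class Track:
    def __init__(self, title, artist):
        self.title = title
        self.artist = artist


def describe(obj):
    """Return a one-line description based on the object's type."""
    if isinstance(obj, Album):
        return 'Album: %s' % obj.title
    if isinstance(obj, Track):
        return 'Track: %s by %s' % (obj.title, obj.artist)
    return 'Unknown object: %r' % (obj,)


# A container may hold objects of different types side by side.
items = [Album('Example Album'), Track('Example Track', 'Example Artist'), 42]
for item in items:
    print(describe(item))
```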