Drill-down on Three Major New Modules in Python 2.5 Standard Library : Page 3

The freshly minted 2.5 version of Python has lots of goodies, but the three in this article are the cream of the crop. Find out how ctypes, pysqlite, and ElementTree can save you time and aggravation in this extensive article with a ton of great sample code.

by Gigi Sayfan

Nov 22, 2006

Page 3 of 5

Module No. 2: sqlite3
Python takes data persistence very seriously. If you want to see some evidence just check: http://docs.python.org/lib/persistence.html. You will find 12 modules besides sqlite3. These modules can be classified into two groups: object serialization and dictionary-like databases based on the ndbm interface. I have seen the object serialization modules (especially pickle) used a lot in Python projects and libraries but I have never seen any of the database modules used in any real code. This doesn't say much because I am just one person and maybe my interests don't intersect with the people who use these interfaces. However, there is another path to data persistence that I believe is often traveled. In addition to its plethora of data persistence modules Python also defines a standard DB interface called DB API 2.0.

DB API 2.0 is defined at http://www.python.org/dev/peps/pep-0249/. This API provides a different conceptual model based on connection and cursor objects and is tailored for accessing relational databases. Now, this is an API I have seen used extensively, especially in the web programming world (TurboGears, SQLObject, SQLAlchemy).

There are DB-API 2.0 bindings for all the major relational databases out there and it is used by many libraries and middleware object-relational mappers. The full list is here: http://www.python.org/doc/topics/database/modules/. All these bindings aren't part of the Python standard library and of course the databases themselves are third-party software. This means that if you develop an application that accesses a relational database you have to do some installation; that can be as complicated as it gets for enterprise-scale databases. sqlite3 provides a great option for developers. You can develop your application using the built-in sqlite3 module without even thinking about it, and if you ever need to upgrade to a stronger database your code should be compliant in most cases. I will talk later about porting sqlite code to other databases.

Using sqlite for rapid development is not a new trick. People have used it prior to Python 2.5, but it could get irritating in some cases. Modern Linux distros come with sqlite, but which version? The two ubiquitous versions are sqlite 2.8.x and the 3.x. On Windows you have to download the sqlite3 DLL and put it on your path. Once you had the right version of sqlite installed, you had to download and install the pysqlite bindings, which had confusing nomenclature (pysqlite 1.x for sqlite 2.8.x and pysqlite 2.x for sqlite 3.x). I wasted a whole day trying to install pysqlite 2.1.2 on Gentoo due to conflicts with pysqlite 1.x , which was already installed. Now that it's part of the standard library it is guaranteed to be available and there is no need for special installation on any platform.

sqlite3 itself is a fantastic embedded database. I had the pleasure to work with it directly in C in a couple of C++ projects and through an object-relational mapper (SQLObject) in a Python/TurboGears project. It is not fully ANSI-SQL 92 compliant, but comes close enough. It requires zero administration, provides an easy command line interface for browsing DB files and there are a bunch of graphical front-ends you can use.

Enough talking; let's see some code. The goal of the next example is to develop an inventory system for a role-playing game. The player is a barbarian battle mage that can equip itself with weapons, armor, and rings. The database will contain four tables: item_types, equipment, inventory, and items. The equipment table will contain all the items the hero is currently wearing and wielding. The inventory table will contain all the non-equipment items in its possession. The items table will contain all the items in the game. The item_types table will contain the possible types of items ('armor, 'weapon', 'ring'). This is very simplistic of course and you may come up with a different DB design, but it will do for showcasing sqlite3.

First thing is to create the database. With sqlite3 you can just open a file and if it doesn't exist a fresh empty database is created. How convenient is that? To create a schema and populate it with values you can use a simple text file that contains DDL (data definition language) and SQL commands. Listing 1 shows the game.sql file for the game.db.

You can do it programmatically too, which may be appropriate for insertions. You will see how later. For the time being I want to create a DB file and populate it given game.sql. I invoke the sqlite3 interactive console with a parameter game.db. It creates this DB file automatically (subsequent calls will load the existing file). The sqlite prompt comes up. The version I use (3.1.3) is pretty outdated and is just what's installed on my Mac by default. It will do just fine for playing interactively with databases processed by other 3.x versions of sqlite programmatically. Interactive commands start with a dot ('.'). (For example, type .help to get some help.) I used the .read to load the game.sql file and then .tables to see that all the tables were created. Finally, I select all the items:

Now I have a database with some items, but the hero is unarmed, bankrupt, and naked: no weapon, no inventory, and no armor. If you don't trust me just check the inventory and equipment tables and verify that they are empty.

The sqlite3 module implements the DB-API 2.0 interface. This interface is built on the notions of connections and cursors. Connections allow you to connect to a particular instance of your DB. And cursors allow you to execute DDL and SQL commands, which extract data from the DB and populate the cursor, allowing you to iterate over it.

I'll begin by accessing my items table programmatically. First I have to create a connection by calling the connect function and pass a DB filename. You can set the isolation_level to None to get auto-commit behavior (every statement is committed immediately to the DB). Then you should call the connection's cursor() method to get a cursor you will use to execute commands and browse the results.

Each row in the in the result set is represented by a tuple. The order of the columns corresponds to the definition of the order of fields in the query. In case of 'select * from ...' the order is determined by the schema. Note that there is no need to terminate the command with the mandatory SQL semi-colon.

The whole model is very simple and intuitive: You create a connection and get a cursor, then you execute commands on the cursor, and the cursor returns the results.

To make it a little more interesting I'll define an Item class and a Hero class, which manage equipment and inventory persistently by interacting with the DB. The implementation is very simplistic and should not be used as the basis of industrial strength code. Both classes rely on the global a connected cursor named 'c'. Again, don't do it at home. This is for demonstration purposes only.

The Item class accepts an item_id, selects it from the items table using the 'c' cursor, and provides some accessors to its attributes. The only interesting part is that I have to use the fetchone() method of the cursor to get to the selected item. You can use next() too. Cursors are iterables: Even if the result set contains just one row, you can use either fetchone() or next() to get to it. When you're using the for loop Python calls next on your behalf.

The Hero class (see Listing 2) is more exciting. It provides a pickItem and dropItem methods that insert/delete items to/from the inventory. It also provides a semi-private _show method that simply dumps the content of a table to the standard output using print. In the __init__ method I'm using the new Python 2.5 partial function application feature to define concisely two new methodsshowInventory() and showEquipment()which are just a partial application of _show() method. This is much cleaner than defining actual functions that call _show with a parameter (see Listing 2).

Now, I can create some items in the game world and let the hero pick them up:

Dropping an item is as easy as calling the dropItem() method and passing in an item id to drop (if there multiple items with the same id, all of them will be dropped.)

h.dropItem(i1.id)
h.showInventory()

Output:

----- inventory -----
(26, 1, 2) gold armor
---------------------

You know how to pick up items and store them in the protagonist's inventory. The hero is still freezing though. To equip him with some outerwear you'll need some more code. The equipment table represents all the items the hero wears or can use. I'll add an equip() method to the Hero class. This method will delete an item from the inventory and insert it into the equipment table. This is a non-atomic operation; if only one of them succeed you will either lose an item (if delete succeeds but insert fails) or end up with a duplicate (if delete fails but insert succeeds). Relational databases thrive on such conundrums and sqlite is no different. The way to handle it properly is using transactions.

Remember that the connection operates in auto-commit mode. That means that it commits after each command. However, by wrapping the two operations in begin and commit/rollback commands you get transactional semantics without having to modify the isolation_level. The try-except block guarantees that if something goes wrong the transaction will be rolled back. If everything works, the transaction is committed at the end of the try block.

The Item class seems like a kludge. It's convenient but if you have lots of objects writing such a class for each one gets old fast. There are various solutions to this problem, everything from ad-hoc DAO (data access object) code generators to full-fledged ORM (object-relational mappers). The sqlite3 module features a modest and lightweight solution of its own. Enter the row factory. The connection object has a row_factory attribute that you can set to any callable to control the object that's returned for each row (instead of the default tuple). sqlite3 provides a useful row_factory called Row that returns an object that allows accessing columns by index like a tuple but also by name (insensitive).

sqlite3 has a few other interesting features like an authorizer that can intercept access to any column, converters and adapters to translate between SQL types and Python types and of course the memory database that allows for lighting quick tests. I encourage you to explore this wonderful module further.