Python Data Persistence using MySQL

Packt Publishing

This is the perfect book for refining your Python skills through learning best practices and the latest development techniques. Packed with real-world examples and written by an expert, this is the authoritative tutorial.

To keep things simple though, the article doesn’t discuss how to implement database-backed web pages with Python, concentrating only on how to connect Python with MySQL.

Sample Application

The best way to learn new programming techniques is to write an application that exercises them. This article will walk you through the process of building a simple Python application that interacts with a MySQL database. In a nutshell, the application picks up some live data from a web site and then persists it to an underlying MySQL database. For the sake of simplicity, it doesn’t deal with a large dataset. Rather, it picks up a small subset of data, storing it as a few rows in the underlying database.

In particular, the application gets the latest post from the Packt Book Feed page available at http://feeds.feedburner.com/packtpub/sDsa?format=xml. Then, it analyzes the post’s title, finding appropriate tags for the article associated with the post, and finally inserts information about the post into the posts and posttags underlying database tables. As you might guess, a single post may be associated with more than one tag, meaning a record in the posts table may be related to several records in the posttags table.

Diagrammatically, the sample application components and their interactions might look like this:

Note the use of appsample.py. This script file will contain all the application code written in Python. In particular, it will contain the list of tags, as well as several Python functions packaging application logic.

Software Components

To build the sample discussed in the article you’re going to need the following software components installed on your computer:

Python 2.5.x

MySQLdb 1.2.x

MySQL 5.1

All these software components can be downloaded and used for free. Although you may already have these pieces of software installed on your computer, here’s a brief overview of where you can obtain them.

You can download an appropriate Python release from the Downloads page at Python’s web site at http://python.org/download/. You may be tempted to download the most recent release. Before you choose the release, however, it is recommended that you visit the Python for MySQL page at http://sourceforge.net/projects/mysql-python/ to check what Python releases are supported by the current MySQLdb module that will be used to connect your Python installation with MySQL.

MySQLdb is the Python DB API-2.0 interface for MySQL. You can pick up the latest MySQLdb package (version 1.2.2 at the time of writing) from the sourceforge.net’s Python for MySQL page at http://sourceforge.net/projects/mysql-python/. Before you can install it, though, make sure you have Python installed in your system.

Setting up the Database

Assuming you have all the software components that were outlined in the preceding section installed in your system, you can now start building the sample application. The first step is to create the posts and posttags tables in your underlying MySQL database. As mentioned earlier, a single post may be associated with more than one tag. What this means in practice is that the posts and posttags tables should have a foreign key relationship. In particular, you might create these tables as follows:

As you might guess, you don’t need to populate above tables with data now. This will be automatically done later when you launch the sample.

Developing the Script

Now that you have the underlying database ready, you can move on and develop the Python code to complete the sample. In particular, you’re going to need to write the following components in Python:

tags nested list of tags that will be used to describe the posts obtained from the Packt Book Feed page.

obtainPost function that will be used to obtain the information about the latest post from the Packt Book Feed page.

determineTags function that will determine appropriate tags to be applied to the latest post obtained from the Packt Book Feed page.

insertPost function that will insert the information about the post obtained into the underlying database tables: posts and posttags.

execPr function that will make calls to the other, described above functions. You will call this function to launch the application.

All the above components will reside in a single file, say, appsample.py that you can create in your favorite text editor, such as vi or Notepad.

First, add the following import declarations to appsample.py:

import MySQLdbimport urllib2import xml.dom.minidom

As you might guess, the first module is required to connect Python with MySQL, providing the Python DB API-2.0 interface for MySQL. The other two are needed to obtain and then parse the Packt Book Feed page’s data. You will see them in action in the obtainPost function in a moment.

But first let’s create a nested list of tags that will be used by the determineTags function that determines the tags appropriate for the post being analyzed. To save space here, the following list contains just a few tags. You may and should include more tags to this list, of course.

tags=["Python","Java","Drupal","MySQL","Oracle","Open Source"]

The next step is to add the obtainPost function responsible for getting the data from the Packt Book Feed page and generating the post dictionary that will be utilized in further processing:

Now that you have obtained all the required information about the latest post on the Packt Book Feed page, you can analyze the post’s title to determine appropriate tags. For that, add the determineTags function to appsample.py:

By now, you have both the post and tags to be persisted to the database. So, add the insertPost function that will handle this task (don’t forget to change the parameters specified to the MySQLdb.connect function for the actual ones):

Now let’s test the code we just wrote. The simplest way to do this is through Python’s interactive command line. To start an interactive Python session, you can type python at your system shell prompt.

It’s important to realize that since the sample discussed here is going to obtain some data from the web, you must connect to the Internet before you launch the application. Once you’re connected, you can launch the execPr function in your Python session, as follows:

>>>import appsample>>>appsample.execPr()

If everything is okay, you should see no messages. To make sure that everything really went as planned, you can check the posts and posttags tables. To do this, you might connect to the database with the MySQL command-line tool and then issue the following SQL commands:

Please note that you may see different results since you are working with live data. Another thing to note here is that if you want to re-run the sample, you first need to empty the posts and posttags tables. Otherwise, you will encounter the problem related to the primary key constraints. However, that won’t be a problem at all if you re-run the sample in a few days, when a new post or posts appear on the Packt Book Feed page.

Conclusion

In this article you looked at a simple Python application persisting data to an underlying MySQL database. Although, for the sake of simplicity, the sample discussed here doesn’t offer a web interface, it illustrates how you can obtain data from the Internet, and then utilize it within your application, and finally store that data in the database.

Alerts & Offers

Series & Level

We understand your time is important. Uniquely amongst the major publishers, we seek to develop and publish the broadest range of learning and information products on each technology. Every Packt product delivers a specific learning pathway, broadly defined by the Series type. This structured approach enables you to select the pathway which best suits your knowledge level, learning style and task objectives.

Learning

As a new user, these step-by-step tutorial guides will give you all the practical skills necessary to become competent and efficient.

Beginner's Guide

Friendly, informal tutorials that provide a practical introduction using examples, activities, and challenges.

Essentials

Fast paced, concentrated introductions showing the quickest way to put the tool to work in the real world.

Cookbook

A collection of practical self-contained recipes that all users of the technology will find useful for building more powerful and reliable systems.

Blueprints

Guides you through the most common types of project you'll encounter, giving you end-to-end guidance on how to build your specific solution quickly and reliably.

Mastering

Take your skills to the next level with advanced tutorials that will give you confidence to master the tool's most powerful features.

Starting

Accessible to readers adopting the topic, these titles get you into the tool or technology so that you can become an effective user.

Progressing

Building on core skills you already have, these titles share solutions and expertise so you become a highly productive power user.