Perl and DBI

Databases are a mission-critical part of any company’s resources. If you program in Perl, you’ll want to learn about the DBI, which can help you connect to many popular databases. This article, the first part of a series, is excerpted from chapter 15 of the book Beginning Perl (Apress; ISBN: 159059391X).

It is now time to talk about one of Perl’s best modules: the Database Independent (DBI) module. DBI provides an easy-to-use and portable (both across operating systems and across databases) application programming interface (API) that allows us to connect to a wide variety of databases including Oracle, Sybase, Informix, MySQL, mSQL, PostgreSQL, ODBC, and many others, even files with comma-separated values (CSV). With this module we can access and administer databases from our Perl programs, combining the power and enjoyment of Perl with the usefulness of storing information in a database.

In this chapter we will introduce the concept of SQL and discuss the most common ways to use it. Then we will discuss DBI and the related DBD (Database Driver) modules. We will then write some Perl code to access and update a MySQL database. Finally, we will take our newfound knowledge and connect it with our topic from the last chapter and create a simple web interface to a database by combining Perl, DBI, and CGI. This sounds like fun, so let’s get to it.

Structured Query Language (SQL, pronounced as “EssQueueEl” by most and “Sequel” by some) is a language allowing a programmer access to a relational database. It is relatively easy to use—compared to Perl, learning SQL is a snap. We will talk about some of the most common SQL queries, or commands that access a database, and in talking about them we will describe the language to the point that learning the remaining details will be simply a matter of referring to an SQL book or website.

But we are getting ahead of ourselves. Before we can talk about SQL we need to discuss relational databases.

{mospagebreak title=Introduction to Relational Databases}

There are two important facts about relational databases. First, the content in a relational database is persistent: the data continues to exist after the program that accesses or modifies it has finished running. This is much like writing data to a file on disk, where it stays after the file is created, read from, or modified. Second, relational databases, unlike plain files on disk, allow concurrent access and updates from multiple users and processes. More than one user can access the database at the same time, and the database server takes care of making sure the changes are made to the data in a safe way.

A relational database, simply put, is a database of tables that can relate to one another in some way. A table is a collection of rows of data. Every row of data has the same basic pieces of information, called fields. There are a lot of buzzwords here, so let’s describe each of these by an example.

Let’s say we want to keep some information about our favorite musicians. The information includes their name, phone number (since we often call them up and chat), and the instruments that they play. We might start by creating a list of the musicians like this:1

Roger Waters          555-1212
Geddy Lee             555-2323
Marshall Mathers III  555-3434
Thom Yorke            555-4545
Lenny Kravitz         555-5656
Mike Diamond          555-6767

This list of musicians shows six lines of data. These lines are called rows in relational database-speak. We would take these six rows and place them together into one collection of data, called a table. Normally, when we place data within a table, we want to create a unique identifier for each row, called a key. If we had two different Marshall Mathers III entries in our table, we could use this unique value to access the one we are interested in. We will name the key player_id and name the other columns, or fields, as well:

player_id  name                  phone
1          Roger Waters          555-1212
2          Geddy Lee             555-2323
3          Marshall Mathers III  555-3434
4          Thom Yorke            555-4545
5          Lenny Kravitz         555-5656
6          Mike Diamond          555-6767

What we have created here is a table (let’s name it musicians) with three fields (player_id, name, and phone) and six rows of information. With this one example we have defined most of our relational database buzzwords, except relational.

{mospagebreak title=The Relational of Relational Database}

Normally when we create a database of information, we spread the data out among several different tables. These tables will relate to one another in some way, usually by a key or other field in the table.

As an example, let’s expand our information about musicians to describe what instruments each of our musicians plays and some important facts about those instruments. We could add each instrument to the row in the musicians table, but that would cause a lot of repeated information. For instance, three of our musicians play the guitar, so any information we provide for a guitar would have to be repeated for each of the three musicians. Also, several of our musicians play more than one instrument (for instance, Thom Yorke plays guitar, sings vocals, and also plays keyboard). If we provide each instrument that Thom plays, our table would become big and difficult to work with.

Instead, let’s create another table, named instruments, that will have this information:

inst_id  instrument  type           difficulty
1        bagpipes    reed           9
2        oboe        reed           9
3        violin      string         7
4        harp        string         8
5        trumpet     brass          5
6        bugle       brass          6
7        keyboards   keys           1
8        timpani     percussion     4
9        drums       percussion     0
10       piccolo     flute          5
11       guitar      string         4
12       bass        string         3
13       conductor   for-show-only  0
14       vocals      vocal          5

Now that we have defined some instruments and our opinions of their related difficulties, we somehow need to map the instrument information to the information stored in the musicians table. In other words, we need to indicate how the instruments table relates to the musicians table. We could simply add the inst_id value to the musicians table like this:

player_id  name          phone     inst_id
1          Roger Waters  555-1212  11

and so on, but remember that many of our musicians play more than one instrument. We would then need two rows for Roger Waters (he sings, too) and three rows for Thom Yorke. Repeating their information wastes storage and makes the database needlessly complex. Instead, let’s create another table that will connect these two tables. We will call it what_they_play and it will have two fields: player_id and inst_id.

player_id  inst_id
1          11
1          14
2          12
2          14
3          14
4          7
4          11
4          14
5          11
5          14
6          9

To read all this information and make sense of how it relates, we would first look in the musicians table and find the musician we want, for instance Geddy Lee. We find his player_id, 2, and use that value to look in the what_they_play table. We find two entries in that table for his player_id that map to two inst_id values: 12 and 14. Taking those two values, we use them as the keys in the instruments table and find that Geddy Lee plays the bass and sings for his band.2
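The three-step lookup just described can be sketched in plain Perl, with the tables held in memory rather than in a database. This is an illustration under that assumption, not the way the lookup is done against a real server; the function name instruments_for is invented here, and only the fields the lookup needs are included.

```perl
use strict;
use warnings;

# musicians: [player_id, name]
my @musicians = (
    [1, 'Roger Waters'],  [2, 'Geddy Lee'],     [3, 'Marshall Mathers III'],
    [4, 'Thom Yorke'],    [5, 'Lenny Kravitz'], [6, 'Mike Diamond'],
);

# instruments: inst_id => instrument (only those used in what_they_play)
my %instruments = (
    7 => 'keyboards', 9 => 'drums', 11 => 'guitar', 12 => 'bass', 14 => 'vocals',
);

# what_they_play: [player_id, inst_id]
my @what_they_play = (
    [1, 11], [1, 14], [2, 12], [2, 14], [3, 14], [4, 7],
    [4, 11], [4, 14], [5, 11], [5, 14], [6, 9],
);

# Hypothetical helper: list the instruments played by the named musician.
sub instruments_for {
    my ($name) = @_;

    # Step 1: find the musician's row, and from it the player_id.
    my ($player) = grep { $_->[1] eq $name } @musicians;
    return () unless $player;
    my $player_id = $player->[0];

    # Step 2: collect the inst_id values linked to that player_id.
    my @inst_ids = map { $_->[1] } grep { $_->[0] == $player_id } @what_they_play;

    # Step 3: use each inst_id as a key into the instruments table.
    return map { $instruments{$_} } @inst_ids;
}

print join(', ', instruments_for('Geddy Lee')), "\n";   # prints "bass, vocals"
```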

This example illustrates that the musicians table relates to the instruments table through the what_they_play table. Breaking up the data in our database into separate tables allows us to list the information that we need only once and is often more logical than listing all the information in a single table. This is called normalization.
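One practical payoff of storing each fact only once shows up when a value changes. The sketch below, again using in-memory Perl structures as a stand-in for real tables, compares updating a phone number in a flat one-row-per-instrument layout with updating it in the normalized layout. The new number 555-9999 is made up purely for illustration.

```perl
use strict;
use warnings;

# Flat layout: one row per (musician, instrument), so the phone repeats.
my @flat = (
    { name => 'Thom Yorke', phone => '555-4545', instrument => 'keyboards' },
    { name => 'Thom Yorke', phone => '555-4545', instrument => 'guitar'    },
    { name => 'Thom Yorke', phone => '555-4545', instrument => 'vocals'    },
);

# Normalized layout: the phone is stored once, in the musicians row.
my %musicians      = (4 => { name => 'Thom Yorke', phone => '555-4545' });
my @what_they_play = ([4, 7], [4, 11], [4, 14]);   # untouched by the update

# In the flat layout every repeated row has to be changed.
my $flat_updates = 0;
for my $row (@flat) {
    next unless $row->{name} eq 'Thom Yorke';
    $row->{phone} = '555-9999';    # made-up number, for illustration only
    $flat_updates++;
}

# In the normalized layout a single assignment is enough.
$musicians{4}{phone} = '555-9999';
my $normalized_updates = 1;

print "flat rows changed: $flat_updates\n";
print "normalized rows changed: $normalized_updates\n";
```

If one of the repeated rows in the flat layout were missed, the data would disagree with itself, which is the kind of problem normalization is meant to prevent.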

{mospagebreak title=We Need an SQL Server—MySQL}

Before we can show examples of SQL, we need an SQL server. There are many available to choose from, some that cost money, some that cost a lot of money, and some that are free. Given that we like free, we are going to choose one of the best, most powerful SQL servers available: MySQL.

MySQL (www.mysql.com) is open source and available for many different operating systems. It is relatively easy to install and administer. It is also well documented (http://dev.mysql.com/doc/mysql/en/) and there are many good books available including the excellent The Definitive Guide to MySQL, Second Edition by Michael Kofler (Apress, 2003). MySQL is an excellent choice for small, medium, and large databases. And did we mention it is free?

Installing MySQL

If you are a Linux user, the chances are MySQL is installed already. Do a quick check of your system to see. If not, it will have to be installed.

Installation instructions can be found at the MySQL website (http://dev.mysql.com/doc/mysql/en/Installing.html). Since it is so well documented there, we will not repeat that information here. You can also check out The Definitive Guide to MySQL, Second Edition.

Testing the MySQL Server

Just to be sure all is well, let’s enter a few MySQL commands at the shell prompt to see if everything is working. The following examples assume that the MySQL root user (not to be confused with the Unix root user) has been given a password. Giving the MySQL root user a password is a very good idea if your server will be available over the network: you don’t want a pesky cracker logging into the server and being able to do devastating and destructive things like modifying or deleting your data. Let’s say root’s password is “RootDown”.3