Python Programming on Win32: Chapter 14 - Working with Email

Email is everywhere these days and is so simple that it can be used for many tasks beyond personal communications. It is not uncommon to find a program that sends an email to an administrator when it encounters some critical situation. Forms on the Web often run a simple CGI script that sends the details to a specific email address. Once the volume of these emails increases, an automated script may read the mailbox and process the messages according to some criteria based on the email contents.

For these and many other reasons, it is not a surprise to find that working with email is a common task, particularly for scripting languages. There are many email systems in use, including SMTP/POP3 facilities, Microsoft Exchange Server, and IBM's Domino (previously known as Lotus Notes) among others.

In this chapter, we look at some common techniques for working with email on Windows. For each technology, we develop short examples that send an email and then attempt to get that same mail back.

SMTP and POP3

SMTP is an acronym for Simple Mail Transfer Protocol. This is an Internet standard, specified in RFC-821, and as its name implies, is a protocol for transferring mail messages. When an SMTP server receives a piece of mail, it does one of two things: forwards the email to a host closer to the intended recipient, or if the recipient is local, places the email in the recipient's mailbox. Thus, SMTP provides a technique for putting messages in a mailbox, but it doesn't define a technique for retrieving existing messages from a mailbox. To this end, the Post Office Protocol, Version 3 (POP3) has been designed, as specified in RFC-1725. Its explicit purpose is to allow remote access to a mailbox managed on a remote computer.

In practice, this means that SMTP can send Internet email, and POP3 can retrieve Internet email.

As is common for Internet protocols, both mail protocols use a simple conversation between a client and a server. This conversation is "line-based" (meaning all commands and responses are sent as complete lines) and works exclusively with 7-bit ASCII data. Each protocol defines its own special command and response sequence to support its various options.

The mail messages handled by both these protocols must be formatted as specified in various RFCs, starting with RFC-822, to the latest, which is RFC-1521. In a nutshell, these RFCs define the format of the message header (a list of headers for the message, including the subject, recipient information, etc.), and the message body. The message body must consist of 7-bit ASCII and may optionally include a number of different sections. These sections typically encode binary attachments or alternative renderings of the message text. Messages with multiple sections are typically referred to as Multipurpose Internet Mail Extensions (MIME) messages. Unfortunately, MIME is a complex beast and beyond the scope of this chapter. Python does support various MIME standards, but using and packing everything into an email message is not for the faint hearted. If you have this requirement, and Microsoft Exchange or a slightly higher-level email system is available, you should consider using that.

Sending an SMTP Message

To begin, we'll use Python to send a simple message using the SMTP protocol. Our message will contain the minimum number of message headers, a plain ASCII message body, and no attachments.

To assist in this task, we'll use the Python module smtplib. This module contains a single class, SMTP, that manages the connection with the SMTP server and provides useful methods for interacting with the server.

Sending a basic message using SMTP is so simple that it's not worth writing a sample source file; you can do it at the interactive window. The SMTP class provides the following method:

bad_addresses = sendmail( from, to, message )

from: A string with the address of the sender.

to: A list of strings, one for each recipient.

message: A message as a string formatted as specified in the various RFCs.

So all you need is the message itself, a list of recipients, and your own email address.

As per RFC-822, the format of the message is simple. It consists of a list of message headers, followed by a blank line, followed by the message body. For this demonstration, you can set up a message with the following code:

>>> msg="Subject: Hi from Python\n\nHello."

This defines the subject of the message as "Hi from Python" and the body as "Hello."
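With the message prepared, sending it needs only a connection to an SMTP server and one call to sendmail(). What follows is a minimal sketch rather than a definitive implementation. The function name send_message and its smtp_class parameter are hypothetical (the parameter exists only so a stand-in class can be substituted when no server is available), and the host name and addresses in the usage comment are placeholders. It is written for a current Python 3 interpreter.

```python
import smtplib


def send_message(host, sender, recipients, message, smtp_class=smtplib.SMTP):
    """Send one message and return the dictionary of refused recipients.

    smtp_class is a hypothetical hook for substituting a fake in tests;
    by default the real smtplib.SMTP class is used.
    """
    server = smtp_class(host)  # connects to the server at `host`
    try:
        # sendmail() returns a dict keyed by each refused address;
        # an empty dict means every recipient was accepted.
        return server.sendmail(sender, recipients, message)
    finally:
        server.quit()  # always close the connection


# Usage (placeholder host and address, not run here):
#   msg = "Subject: Hi from Python\n\nHello."
#   failed = send_message("mail.example.com", "you@example.com",
#                         ["you@example.com"], msg)
```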

The result from this function is a dictionary of email addresses in the to list that failed; the dictionary is keyed by the email address, with the error message as the value. In this example you received an empty dictionary, so everything went OK. See the smtplib module documentation for more information on error handling.

Receiving via POP3

POP3 downloads messages from a remote mailbox. As we discussed previously, SMTP is typically used to send Internet email messages, and POP3 is used to receive them.

Like most Internet protocols, POP3 uses a line-based communications protocol, and also like most Internet protocols, you will find a Python module designed to ease working with that protocol; in this case the Python module is poplib.

Before delving into a discussion of POP3, it is worth noting that an improved protocol known as Internet Message Access Protocol (IMAP) has been designed. Although this fixes many of the shortcomings in the POP3 protocol, it's not used as widely as POP3. Therefore, we will discuss using POP3 to ensure the code works on the widest possible range of mail servers. If you need to investigate using the IMAP protocol, you should view the module imaplib and its associated documentation.

There are three steps to establishing a connection to a POP3 mailbox:

1. Connect to the server by creating a poplib.POP3() instance, specifying the hostname.

2. Send the mailbox account name, using the user() method.

3. Send the mailbox password using the pass_() method (pass is a reserved word in Python, hence the trailing underscore).

You now have a valid connection, and the mailbox is locked. While the mailbox is locked, no other connections are possible, so it's important to unlock it by calling the quit() method when you're done. If you don't unlock the mailbox, other mail clients (such as your regular email client) won't be able to connect until the connection times out, which may take some time. A Python finally block is appropriate for this purpose, as the example will show.

POP3 messages are numbered from 1 to n, where n is the number of messages currently in the mailbox. These message numbers are not unique identifiers and are valid only for the given session. So the first step in reading the mailbox is to determine the number of messages using the stat() method. Then you can request each message by number. For the first example, don't bother looping over all the messages; instead, just look at the first message:
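A minimal sketch of these steps, under stated assumptions: the function name fetch_first_message and the pop3_class parameter are hypothetical (the parameter only allows a stand-in class to be used when no server is at hand), and the host, account, and password are whatever your own mailbox requires. It follows the three connection steps listed above, calls stat() and retr(), and uses a finally block so that quit() always unlocks the mailbox. In Python 3, poplib returns each line as a bytes object.

```python
import poplib


def fetch_first_message(host, user, password, pop3_class=poplib.POP3):
    """Return the first message in the mailbox as a list of lines, or None."""
    conn = pop3_class(host)      # step 1: connect to the server
    try:
        conn.user(user)          # step 2: send the account name
        conn.pass_(password)     # step 3: send the password
        count, _size = conn.stat()   # (message count, mailbox size)
        if count == 0:
            return None
        # retr() returns (server response, list of lines, size in octets)
        _response, lines, _octets = conn.retr(1)
        return lines
    finally:
        conn.quit()              # always unlock the mailbox


# Usage (placeholder values, not run here):
#   for line in fetch_first_message("pop.example.com", "account", "secret"):
#       print(line.decode("ascii", "replace"))
```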

This is the same message you sent previously. Notice all the headers this message now has; even though you specified only a few, the mail transport system has added many more. The output shown has had many headers removed for brevity.

At this point you may start to get a little worried. Looking at the code, you can see the message is returned as a list of lines, but many of those lines are headers. Worse, some of the headers are split over multiple lines (as supported by the relevant RFC). Does this mean you need to understand all this before doing anything useful?

Fortunately, Python has library support for parsing and using data of this format. The most basic support can be found in the rfc822.Message() class, but the mimetools module supports an extension to this class that supports the various MIME extensions (as described earlier). Since MIME is an extension to the basic standard, you can safely use it even for non-MIME messages.

A slight complication is that the mimetools.Message() class expects to receive a file object from which it obtains its data, rather than a list of lines! The StringIO (or cStringIO) module can make a file object from a string, but here you have a list of strings. The simplest solution is to join the list back into one large string and feed that into cStringIO.

Once you create mimetools.Message(), all the headers are read, and the file is positioned at the start of the body. You can then use the various methods to examine the headers. Depending on the message content, you can either read the rest of the file to obtain the body or use some of the MIME-specific features to process the various sections.

You can now modify the example to take advantage of this class. Loop over all messages in the mailbox and examine the Subject header for the test message. When you find the message, print the message body and delete the message.
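The following is a sketch of such a loop for a current Python 3 interpreter, not the original listing. One substitution should be stated plainly: the rfc822, mimetools, and cStringIO modules discussed here were removed in Python 3, so this version parses each message with the standard email package, which accepts the joined lines directly and needs no file object. The function name find_and_delete and the pop3_class parameter are hypothetical, and the sketch assumes the test message is a single-part plain-text message.

```python
import email
import poplib
from email import policy


def find_and_delete(host, user, password, subject, pop3_class=poplib.POP3):
    """Delete every message whose Subject equals `subject`; return the bodies."""
    bodies = []
    conn = pop3_class(host)
    try:
        conn.user(user)
        conn.pass_(password)
        count, _size = conn.stat()
        # POP3 message numbers run from 1 to count for this session.
        for number in range(1, count + 1):
            _response, lines, _octets = conn.retr(number)
            # Join the list of byte lines back into one message and parse it.
            msg = email.message_from_bytes(b"\r\n".join(lines),
                                           policy=policy.default)
            if msg["Subject"] == subject:
                # Assumes a single-part text message (no attachments).
                bodies.append(msg.get_content())
                conn.dele(number)  # marked now, removed when quit() is called
    finally:
        conn.quit()
    return bodies


# Usage (placeholder values, not run here):
#   for body in find_and_delete("pop.example.com", "account", "secret",
#                               "Hi from Python"):
#       print(body)
```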

The significant additions to the new example are:

A loop that examines all the messages.

Use of cStringIO to create a file object, as discussed.

Examination of the Subject header of each message using the getheader() method.

If you experiment with this code, you'll see that the Message class has correctly handled the continuation of long header lines. Working with the message headers is made far simpler with the mimetools.Message class and worth the small hoops you need to jump through to use it.

Microsoft Exchange/Outlook

The use of Microsoft messaging products is becoming quite common in larger organizations. The Microsoft Exchange Server is often used at the backend, and various versions of Microsoft Exchange or Microsoft Outlook may be used as the client.

One key feature of Microsoft Exchange is that it exposes a rich and powerful API developers can use to extend their applications. Tasks such as form processing, or processing appointments or contact lists, can all be accessed from a COM interface. Although we will only discuss sending a simple message using Microsoft Exchange, you should peruse the documentation supplied with Exchange to get a feel for its capabilities.

Collaboration Data Objects

Collaboration Data Objects (CDO) is a general-purpose COM automation interface for working with Microsoft Exchange. Because CDO is an automation interface, it's suitable for use with scripting languages, such as Visual Basic, JavaScript, and of course, Python.

CDO has gone through various name changes over its long life. Its evolution can be traced through "Simple MAPI," a set of APIs for Visual Basic 1, through a more general-purpose Visual Basic Extension (VBX), then into a general-purpose COM interface known as Active Messaging, and finally receiving even more features and being renamed CDO.

It provides a rich object model; there are objects for messages, folders, users, distribution lists, etc. The object model is "rooted" from a MAPI session object. The session object identifies the mailbox and provides a list of subfolders, each of which has its own list of messages and subfolders.

First, let's experiment with MAPI from a Python prompt. Create a MAPI session using the standard COM technique:
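A minimal sketch of creating and logging on to a session, under several assumptions: the pywin32 package is installed, the CDO library is registered on the machine under the ProgID "MAPI.Session", and the profile name you pass is one that exists locally. The function name open_mapi_session is hypothetical. The import is placed inside the function so the file can still be loaded on a machine without pywin32.

```python
def open_mapi_session(profile_name):
    """Create a CDO session object and log on with the named mail profile."""
    # Imported here because win32com is available only with pywin32 on Windows.
    from win32com.client import Dispatch

    session = Dispatch("MAPI.Session")  # assumed ProgID of the CDO session
    session.Logon(profile_name)         # profile_name is machine-specific
    return session


# Usage (Windows with CDO installed, not run here):
#   s = open_mapi_session("MS Exchange Settings")
```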

However, since we are indirectly calling the Item() method and documentation is found under the method name, we'll stick to the slightly longer version.

Sending a Message with CDO

The procedure to send an email with CDO is simple; create a new message in the outbox, set the message's properties, and send it. Let's do this interactively using the session object created previously. First, create a new message in the outbox using the Add() method. The CDO documentation states that this takes two parameters: the subject of the message and the text of the message:

>>> newMsg = s.Outbox.Messages.Add("Hi from Python", "Hello")
>>>

Now add a single recipient using the message's Recipients property. The Recipients.Add() method takes two parameters: the display name of the recipient and the email address. Note that the email address must be prefixed with the Exchange Transport to be used; in this case, use the SMTP transport for Internet email addresses:
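The whole sequence can be sketched as a small function that takes an already logged-on session object. This is an illustration under assumptions, not the original listing: the function name send_with_cdo is hypothetical, the display name and address in the usage comment are placeholders, and the call to the message's Send() method, which hands the message to the outbox for delivery, comes from the CDO object model rather than from the text above.

```python
def send_with_cdo(session, subject, body, display_name, address):
    """Create a message in the outbox, address it, and send it."""
    # Add() takes the subject and the text of the new message.
    new_msg = session.Outbox.Messages.Add(subject, body)
    # The address is prefixed with the transport name; SMTP is used
    # here for Internet email addresses.
    new_msg.Recipients.Add(display_name, "SMTP:" + address)
    new_msg.Send()  # assumed CDO method; queues the message for delivery
    return new_msg


# Usage (with the session object s created previously, not run here):
#   send_with_cdo(s, "Hi from Python", "Hello",
#                 "Your Name", "you@example.com")
```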

Now the message is sitting in the outbox, waiting to be delivered. Depending on the local configuration options, it may be some time before the next scheduled connection for delivery and receipt of mail. You can force this by calling the DeliverNow() method on the session:

>>> s.DeliverNow()
>>>

Retrieving a Message with CDO

Now that we have sent our message using Microsoft Exchange, let's write a few lines to read the message back. Depending on the speed of your email server and the route the email takes before getting back, it may take some time for the mail to be returned. At any time you can force the client to connect to the server to check for new messages by calling the DeliverNow() method.

The first thing to do is print the subject of the last message in the inbox:

Another demonstration would be to loop over all messages in the inbox, find the test message sent previously, and delete it. CDO provides special methods for iterating over all messages, in either a forward or reverse direction. You could even allow CDO to perform additional filtering of the messages, but for now, do the filtering yourself.

The methods we will use for iterating are GetFirst() and GetNext(). These are methods of a Messages collection, so the first thing to do is save the Messages collection to a local variable:

>>> messages = s.Inbox.Messages

You can then write a loop checking each message, and when you find one to delete, call the Delete() method on the message. Here's the code:
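A minimal sketch of such a loop follows. The function name delete_matching is hypothetical, and the sketch assumes that GetFirst() and GetNext() return None once there are no more messages, which is how an empty COM result appears in Python. It takes the Messages collection as a parameter and returns the number of messages it deleted.

```python
def delete_matching(messages, subject):
    """Delete every message in the collection whose Subject equals `subject`."""
    deleted = 0
    msg = messages.GetFirst()
    while msg is not None:          # assumed: None signals the end
        if msg.Subject == subject:
            msg.Delete()
            deleted += 1
        msg = messages.GetNext()
    return deleted


# Usage (with the messages variable saved previously, not run here):
#   count = delete_matching(messages, "Hi from Python")
#   print("Deleted", count, "message(s)")
```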

As you can see, the code found and deleted exactly one message. CDO exposes a rich object model for folders and messages; every property imaginable about a message can be obtained. See the CDO documentation for more details.

Conclusion

In this chapter, we presented a quick overview of two common mail systems used on Windows: Internet email and Microsoft Exchange.

The protocols defined by the various Internet standards are still the most common in use for Windows. Many Windows users use email only through an Internet service provider, and the vast majority of these provide email servers that use the POP3 and SMTP protocols. We presented enough information for you to have a basic understanding of these protocols, and how to make use of them from Python. For further information, you should consult the Python documentation for the smtplib and poplib modules.

In many corporate Windows environments, Microsoft Exchange is the mail server of choice. Although Microsoft Exchange generally supports the Internet protocols, it also supports a far more flexible and simple interface using COM. If you work in an Exchange environment, we've given you enough information to get started with the rich model exposed by Exchange. For more information, see the CDO documentation at http://www.microsoft.com/exchange.