This is an example of an incredibly powerful and incredibly irritating tool known as a regular expression parser. A full rant on what regular expressions are and why I hate them would probably run a bit long. But, to amuse myself, some quick examples drawn from around the net:

“RegExp” : “Translation”

“/\s.*\s/” : “Match any string of characters with a whitespace character on each side”

“/\S.*\S/” : “Match any string of characters with two NON-space characters around it, obviously”
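For the curious, here is a quick sketch of what those two patterns actually grab, using Python's standard re module. The sample string is made up for illustration.

```python
import re

text = "say hello world now"  # made-up sample string

# \s.*\s : a whitespace character, then anything (greedy), then whitespace again
spaced = re.search(r"\s.*\s", text).group()
print(repr(spaced))    # ' hello world '

# \S.*\S : a non-whitespace character, anything (greedy), non-whitespace again
unspaced = re.search(r"\S.*\S", text).group()
print(repr(unspaced))  # 'say hello world now'
```

Because .* is greedy, each match stretches as far right as it can, which is a common source of surprises.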

It’s basically absurd, impossible to validate, and just plain unreadable. It’s also the only way to do string parsing in a decent fashion.

The line above takes the string burped out by the IMAP server, something like:
“55 (UID 82 FLAGS (\Seen) BODY[HEADER.FIELDS (SUBJECT FROM)] {65}”
and turns it into a Python dictionary, like:
{ 'id': '55', 'uid': '82', 'flags': '\\Seen' }
To further your education, let me explain in words what each part of the regexp does:

(?P<id>\d*) : “create a Python dictionary entry ‘id’ and fill it with the first number you find”

\(UID (?P<uid>\d*) : “create entry ‘uid’ and fill it with the first number you find after the string ‘(UID’”
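The two pieces above can be tried out with something like the following sketch. The ‘id’ and ‘uid’ groups follow the explanations above; the FLAGS part and the name METADATA_RE are my own assumptions, so the real regexp may well differ.

```python
import re

# Hypothetical pattern pieced together from the explanations above.
# The FLAGS group is an assumption, not the original regexp.
METADATA_RE = re.compile(
    r"(?P<id>\d*) "                 # leading message number
    r"\(UID (?P<uid>\d*) "          # number right after the literal "(UID "
    r"FLAGS \((?P<flags>[^)]*)\)"   # whatever sits inside the FLAGS parentheses
)

line = r"55 (UID 82 FLAGS (\Seen) BODY[HEADER.FIELDS (SUBJECT FROM)] {65}"

match = METADATA_RE.match(line)
# groupdict() is what turns the named groups into dictionary entries
metadata = match.groupdict() if match else {}
print(metadata)  # {'id': '55', 'uid': '82', 'flags': '\\Seen'}
```

Note that the doubled backslash in the output is just how Python prints a single literal backslash inside a string.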

The worst part about using regexps is that, with the wrong sort of programmer around, it becomes a pissing contest where they insist with a straight face that their 180-character monstrosity is both ‘intuitive’ and ‘unlikely to fail’. And I’m the queen of France.

imap_server.search(character-set, search_string) is decently well documented in the Python library docs, or you can always refer to RFC 3501. If reading an RFC doesn’t fill your heart with dread, you haven’t been doing this long enough … or you’ve been doing it too long, I dunno. In any case, search() returns a space-separated string of message IDs that you need to convert into a comma-delimited string. Once that’s done, you can actually fetch the message excerpt data.
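As a rough sketch of that conversion step: imaplib's search() hands back a status plus a one-element list holding the space-separated IDs. The helper name build_fetch_list and the "UNSEEN" criterion are assumptions for illustration, and the sample data stands in for a live server.

```python
def build_fetch_list(search_data):
    """Turn search() data such as [b'1 2 5'] into the string '1,2,5'."""
    raw = search_data[0] or b""    # no hits can show up as None or b''
    if isinstance(raw, bytes):     # Python 3 imaplib returns bytes
        raw = raw.decode("ascii")
    return ",".join(raw.split())

# With a live, logged-in connection this would look something like:
#   status, data = imap_server.search(None, "UNSEEN")
#   fetch_list = build_fetch_list(data)
fetch_list = build_fetch_list([b"1 2 5"])
print(fetch_list)  # 1,2,5
```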

imap_server.fetch takes the comma-delimited fetch_list we prepped earlier, as well as a list of the IMAP metadata and RFC822 headers we want. Note that we request the headers using BODY.PEEK so as not to change the ‘Unread’ status of the messages.
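A minimal sketch of that call, assuming imap_server is an already logged-in imaplib connection with a mailbox selected. The exact item list and the names FETCH_PARTS and fetch_excerpts are my guesses, based on the sample server response shown earlier.

```python
# Assumed item list: BODY.PEEK reads the headers without setting \Seen
FETCH_PARTS = "(UID FLAGS BODY.PEEK[HEADER.FIELDS (SUBJECT FROM)])"

def fetch_excerpts(imap_server, fetch_list):
    """Fetch metadata and selected headers for the IDs in fetch_list."""
    status, data = imap_server.fetch(fetch_list, FETCH_PARTS)
    if status != "OK":
        raise RuntimeError("FETCH failed: %r" % (data,))
    return data
```

Swapping BODY.PEEK for plain BODY would fetch the same headers but mark each message as read along the way.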

Once we’ve got the huge-ass array of messages in f[1] (with f[0] containing our ‘OK’ or ‘NO’ status), we loop through the fetched message (fm) entries. The length test “len(fm) > 1” is there because the buggy Gmail IMAP implementation seems to toss back an extra ‘)’ that would otherwise trip up the parsing of what imap_server.fetch returns. Now we parse the metadata contained in fm[0] and the RFC822 headers in fm[1]. With that, the rest of the code should be fairly readable. We populate a gmail_message object and toss it onto the messages list.
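To make the loop concrete, here is a rough, self-contained sketch. The GmailMessage class, the META_RE pattern, and the sample data are all hypothetical stand-ins, and the real gmail_message object surely has its own fields.

```python
import email
import re
from dataclasses import dataclass

# Hypothetical metadata pattern; the FLAGS group is an assumption
META_RE = re.compile(
    r"(?P<id>\d*) \(UID (?P<uid>\d*) FLAGS \((?P<flags>[^)]*)\)"
)

@dataclass
class GmailMessage:
    """Hypothetical stand-in for the gmail_message object."""
    id: str
    uid: str
    flags: str
    subject: str
    sender: str

def parse_fetched(fetched):
    messages = []
    for fm in fetched:
        # Real entries are (metadata, headers) pairs; the stray ')' is a
        # single character, so the length test skips it.
        if len(fm) > 1:
            meta = META_RE.match(fm[0].decode("ascii"))
            if meta is None:
                continue
            headers = email.message_from_bytes(fm[1])
            messages.append(GmailMessage(
                subject=headers["Subject"],
                sender=headers["From"],
                **meta.groupdict()
            ))
    return messages

# Made-up data shaped like the f[1] list described above
sample = [
    (b"55 (UID 82 FLAGS (\\Seen) BODY[HEADER.FIELDS (SUBJECT FROM)] {65}",
     b"Subject: Hello\r\nFrom: Alice <alice@example.com>\r\n\r\n"),
    b")",
]
messages = parse_fetched(sample)
print(messages[0].uid, messages[0].subject)  # 82 Hello
```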