Dabeaz

Sunday, August 09, 2009

Essential Misconceptions

A few days ago, Mike Riley posted a great review of the new "Python Essential Reference, 4th Edition" on Dr. Dobb's CodeTalk. In that review, he writes:

"While the author could have taken the easy path of regurgitating the online documentation, he has instead reworked the explanation for each class and function call in the Python core library with commendable clarity, frequently accompanying these detailed examinations with extremely useful and meaningful code examples. The book is also very well designed and organized, making it a snap to find information within a matter of seconds."

This is a reviewer who really gets what this book is about. However, for every great review like this, I also encounter comments that simply dismiss the book out-of-hand saying it "offers nothing" over Python's online documentation. With all due respect to Python's fine documentation, I beg to differ.

First and foremost, I've always viewed the Python Essential Reference as a serious programming reference for myself (yes, I always have a copy next to my desk and I use it regularly). Although, I will admit that Python certainly has a lot of online documentation, it's also missing a lot of essential details. For example, I can't count the number of times I've looked at the online documentation for something only to have to go out and do some kind of extended Google search to fill in a missing detail (or worse, having to load the source code for some module and look through it).

Let's look at an example. Suppose you're writing some networking code with the socket module and you want to use the recv(bufsize [, flags]) method of a socket. If you head off to the online documentation you will certainly find some information.

"Receive data from the socket. The return value is a string representing the data received. The maximum amount of data to be received at once is specified by bufsize. See the Unix manual page recv(2) for the meaning of the optional argument flags; it defaults to zero."

Yes, this is all very useful. Especially that part about having to refer to a Unix man page. I'm sure the Windows programmers find that especially useful. If you turn to the Essential Reference p. 483, you'll not only find a description, but you will also get a complete table showing you exactly what can be given for flags along with a brief description of each option. This approach is found throughout the book--with few exceptions are readers simply referred to other documentation. As another example, I would challenge anyone to effectively use something like the setsockopt() or getsockopt() methods of a socket using nothing by Python's online docs.

The other thing that I've tried to do in the book is answer all sorts of questions about tricky interactions between different parts of Python. Take, for example, this question: Can a separate execution thread safely close a generator/coroutine function by invoking the generator's close() method? Sure, that's not the kind of question that comes up every day, but if you know a thing or two about generators and coroutines, you'll know that they are often used in the context of concurrent programming, just like threads. Not only that, threads and generators might be used together (for example, using threads to carry out blocking operations). Thus, it is reasonable to assume that programmers working with both threads and generators in the same program might start to wonder about their possible interaction. I know I did.

If you try to find an answer to this question using the online documentation, you will be searching for some time and probably come up with nothing. Although there is plenty of discussion about generators, the yield statement, and other matters, you really don't find much about generators and threads mixed together. Even PEP 342, the official specification that introduced the generator close() method says nothing on this matter.

Now, let's look at the Essential Reference. First, if you turn to the index and look up "Threads", you will find about a half-page of subentries. In fact, there is even an entry labeled "Threads: close() method of generators, p. 104." If you turn to p. 104, you will find a sentence "if a program is currently iterating on a generator, you should not call close() asynchronously on that generator from a separate thread of execution or from a signal handler."

This is certainly not the only example, but there are a wide variety of similar questions that I try to address. For example, can you use a decorator with a recursive function? (p. 113). Or what is the interaction between the __slots__ feature of a class and inheritance? (p. 133). Or, does the name mangling of private attributes (e.g., __foo) in a class introduce a runtime performance penalty? (p. 128). All of these questions fall into a general category of issues related to the "side-effects" of using various Python features. Although you can find some of this in the online docs, it is often scattered and incomplete. I've tried to fix that.

Finally, I've really tried to make the Essential Reference a kind of programming "cookbook" of sorts. Although its primary goal is to be a reference, I have also incorporated a wide variety of practical examples from the Python training courses that I run. For instance, if you know about the Generators or Coroutines tutorials I presented at PyCON, you'll find similar information. I also include examples that explore tricky interactions and customization features of certain library modules. For example, how do I customize an XML-RPC server to only accept connections from known IP addresses? (p. 494). Or how do I use the ssl module to implement a secure server? (p. 489). Many of these examples are related to things that I've had to figure out once before, but can never quite remember on a day-to-day basis. By putting them in the book, it helps me remember how to do a variety of tricky things.

So, that's about it. I hope people find the book to be useful. If so, tell your friends. If not, feel free to use it for propping up some uneven furniture. Just don't say that it's the same as the online docs.