Python unit testing part 2: the doctest module

The doctest module has been part of the Python standard library since version 2.1.

Ease of use

It's hard to beat doctest in this category. There is no need to write separate test functions/methods. Instead, one simply runs the function/method under test in a Python shell, then copies the expected results and pastes them in the docstring that corresponds to the tested function.

In my example, I simply added the following docstrings to the post_new_entry and delete_all_entries methods of the Blogger class:

I then added the following lines to the __main__ section of the Blogger module:

if __name__ == "__main__":
    import doctest
    doctest.testmod()

Now the Blogger module is "doctest-ready". All you need to do at this point is run Blogger.py:

# python Blogger.py

In this case, the fact that we have no output is a good thing. Doctest-based tests do not print anything by default when the tests pass.

API complexity

There is no API! Note that doctest-enabled docstrings can contain any other text that is needed for documentation purposes.

However, there are some caveats associated with the way doctest interprets the docstrings (most of the following bullet points are lifted verbatim from the doctest documentation):

any expected output must immediately follow the final '>>> ' or '... ' line containing the code, and the expected output (if any) extends to the next '>>> ' or all-whitespace line

expected output cannot contain an all-whitespace line, since such a line is taken to signal the end of expected output

output to stdout is captured, but not output to stderr (exception tracebacks are captured via a different means)

doctest is serious about requiring exact matches in expected output. If even a single character doesn't match, the test fails (so pay attention to trailing whitespace on your copied-and-pasted lines!)

the exact match requirement means that the output must be the same on every run, so in the output that you capture you should try not to:

print a dictionary, because the order of the items can vary from one run to the other

operate with floating-point numbers, because the precision can vary across platforms

print hard-coded addresses, such as the memory address in a default object repr, because they change from run to run

Test execution customization

The output of a doctest run can be made more verbose by means of the -v flag. Here is an example:
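The verbose transcript itself did not survive here, but its general shape can be reproduced with a small stand-in function (`square` below is a throwaway illustration, not part of the Blogger code):

```python
import doctest
import io
from contextlib import redirect_stdout


def square(n):
    """Return n squared.

    >>> square(3)
    9
    """
    return n * n


# In verbose mode doctest prints a Trying/Expecting pair for every
# example, followed by "ok" (or a failure report) -- the same shape
# of output you get from `python Blogger.py -v`.
buf = io.StringIO()
with redirect_stdout(buf):
    doctest.run_docstring_examples(square, {"square": square},
                                   verbose=True, name="square")
output = buf.getvalue()
print(output)
```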

The amount of output seems a bit too verbose to me, but in any case it gives a feel for how doctest actually runs the tests.

One other important customization that can be done for doctest execution is to include the docstrings in a separate file. This can be beneficial when the docstrings become too large and start detracting from the clarity of the code under test, instead of increasing it. Having separate doctest files scales better in my opinion (and this seems to be the direction the Zope project is heading with their test strategy, according again to Jim Fulton's PyCon 2004 presentation).

As an example, consider a text file which I saved as testfile_blogger.

Note that free-flowing text can coexist with the actual output and there is no need to use quotes. This is especially advantageous for interspersing the output with descriptions of test scenarios, special boundary cases, etc.

To have doctest run the tests in this file, you need to put the following 2 lines either in their own module, or replacing the 2 lines at the end of the Blogger module:

import doctest
doctest.testfile("testfile_blogger")

I chose to save these lines in a separate file called doctest_testfile.py. As with testmod, a successful run produces no output.
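Neither the contents of testfile_blogger nor the original transcript survive in this copy, but the whole workflow can be sketched end to end (the prose and the tested expressions below are reconstructions, not the original file):

```python
import doctest
import os
import tempfile
import textwrap

# A reconstructed doctest text file: plain prose mixed with examples,
# with no quoting needed around either.
content = textwrap.dedent("""\
    This file exercises a few list operations.

    First, set up a list:

        >>> entries = []
        >>> len(entries)
        0

    Appending an entry should grow the list by one:

        >>> entries.append("first post")
        >>> len(entries)
        1
    """)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "testfile_blogger")
    with open(path, "w") as f:
        f.write(content)
    # module_relative=False allows an absolute path; like testmod,
    # testfile prints nothing when every example passes.
    results = doctest.testfile(path, module_relative=False)

print("failed:", results.failed, "attempted:", results.attempted)
```

Note that all the examples in one file share a single set of globals, so the `entries` list set up at the top is still visible in the later examples.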

doctest does not provide any set-up/tear-down hooks for managing test fixture state (although this feature is being considered for inclusion in a future release). This can sometimes be an advantage: when your tests do not need to be independent of each other, you avoid the overhead of setting up and tearing down state for each test run.

Test organization and reuse

A new feature of doctest in Python 2.4 is the ability to piggyback on unittest's suite management capabilities. To quote from the doctest documentation:

As your collection of doctest'ed modules grows, you'll want a way to run all their doctests systematically. Prior to Python 2.4, doctest had a barely documented Tester class that supplied a rudimentary way to combine doctests from multiple modules. Tester was feeble, and in practice most serious Python testing frameworks build on the unittest module, which supplies many flexible ways to combine tests from multiple sources. So, in Python 2.4, doctest's Tester class is deprecated, and doctest provides two functions that can be used to create unittest test suites from modules and text files containing doctests. These test suites can then be run using unittest test runners.

The two doctest functions that can create unittest test suites are DocFileSuite, which takes a path to a file as a parameter, and DocTestSuite, which takes a module containing test cases as a parameter. I'll show an example of using DocFileSuite. I saved the following lines in a file called doctest2unittest_blogger.py:
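The file's contents were also lost from this copy; a self-contained sketch of the same wiring (the doctest file here is generated on the fly, and its contents are assumptions) might look like:

```python
import doctest
import os
import tempfile
import unittest

with tempfile.TemporaryDirectory() as tmpdir:
    # A stand-in doctest file; the real testfile_blogger contents are unknown.
    path = os.path.join(tmpdir, "testfile_blogger")
    with open(path, "w") as f:
        f.write(">>> 2 + 2\n4\n")

    # DocFileSuite turns each file into a single unittest test case,
    # no matter how many doctest examples the file contains.
    suite = unittest.TestSuite()
    suite.addTest(doctest.DocFileSuite(path, module_relative=False))

    result = unittest.TextTestRunner(verbosity=2).run(suite)

print("tests run:", result.testsRun, "successful:", result.wasSuccessful())
```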

Note that the output is unittest-specific. As far as unittest is concerned, it executed only 1 test, the one we added via the suite.addTest() call. We can increase the verbosity by calling unittest.TextTestRunner(verbosity=2).run(suite).

To convince myself that the doctest tests are really being run, I edited testfile_blogger and changed the expected return codes from True to False, and the last expected value from 0 to 1, so that both doctest tests would now fail.

As you can see, it's pretty easy to use the strong unittest test aggregation/organization mechanism and combine it with doctest-specific tests.

As a matter of personal taste, I prefer to write my unit tests using the unittest framework, since for me it is more conducive to test-driven development. I write a test, watch it fail, then write the code for making the test pass. It is an organic process that sort of grows on you and changes your whole outlook on code design and development. I find that I don't get the same results with doctest. Maybe I'm just not used to playing with the Python shell that much while I'm developing. For me, the biggest plus of using doctest comes from having top-notch documentation for my code. This style of testing is rightly called "literate testing" or "executable documentation" by the doctest folks. However, I only copy and paste the expected output into the docstrings AFTER I know that my code is working well (because it was already unit-tested with unittest).

Assertion syntax

There is no special syntax for assertions in doctest. Most of the time, assertions will not even be necessary. For example, in order to verify that posting a new entry increments the number of entries by 1, I simply included these 2 lines in the corresponding docstring:

>>> num_entries == init_num_entries + 1
True

Dealing with exceptions

I'll use an example similar to the one I used for the unittest discussions. Here are some simple doctest tests for the sort() list method:

Note that for testing exceptions, I simply copied and pasted the traceback output. doctest will look at the line starting with Traceback, will ignore any lines that contain details likely to change (such as file names and line numbers), and finally will interpret the lines starting with the exception type. Both the exception type (NameError in this case) and the exception details (which can span multiple lines) are matched against the actual output.

The doctest documentation recommends omitting traceback stack details and replacing them by an ellipsis (...), as they are ignored anyway by the matching mechanism.

To summarize, here are some Pros and Cons of using the doctest framework.

doctest Pros

available in the Python standard library

no API to remember, just copy and paste output from shell session

flexibility in test execution via command-line arguments

perfect way to keep documentation in sync with code

tests can be kept in separate files which can also contain free-flowing descriptions of test scenarios, special boundary cases, etc.

doctest Cons

output matching mechanism mandates that output must be the same on every run

no provisions for test fixture/state management

provides test organization only if used in conjunction with the unittest framework

7 comments:

Note that doctest in Python 2.4 has a number of hooks to control how output is compared -- including ellipses (which, like a wildcard, match anything, e.g., an object's address) and a token for matching blank lines.

The original idea of doctest was to test documentation, i.e., make sure your examples are up to date. While it can be used for more than that, the documentation testing alone is a good reason to know and use it. I also find it is great for small functions, where a unittest setup seems too heavy; e.g., a five-line function can easily require twenty or more lines to test with unittest. doctest is generally nice when you have code with few dependencies and little internal state, and when you are doing input/output kind of testing (e.g., call x(foo) and you get y, call x(foo+1) and you get z, x(-1) gives you an exception, etc). I find this to be one of unittest's weakest areas, simply because of the tedium involved in the class declarations.

One last doctest detail that I like is __test__, a magic variable which can be a dictionary of doctest strings. This is somewhere between putting tests in docstrings, and putting tests in a separate text file.

As for TDD, I think it's just fine -- there are certainly places where the interactive prompt doesn't feel entirely predictable, so you'll copy and paste, but often it is predictable (as predictable as in a unittest at least). Or you'll put in a failure condition, but won't bother with identifying the specific exception until you run the test, which is still reasonable when you are only making sure that corner cases fail, not that they fail in a particular way.

I actually prefer it for TDD, because I never know what to start with in unittest. In a doctest, when I don't know what test to write, I start writing documentation, and it lets me ease into what I want the test to say. It seems easier to start with small tests if I think about trying to explain to someone what the finished code will be doing. And, I get some documentation written. :)

I have found that writing doctests requires a different way of thinking about tests, but once you grok it (examples from Zope 3 and PJE's blog help a lot here), the resulting tests are much more readable.

Switching from unittest to doctest is like switching to a different programming language -- you need to change the way you think about tests. As they say, you can write FORTRAN programs in any language.

> As a matter of personal taste, I prefer to write my unit tests using the unittest framework, since for me it is more conducive to test-driven development. I write a test, watch it fail, then write the code for making the test pass. It is an organic process that sort of grows on you and changes your whole outlook on code design and development. I find that I don't get the same results with doctest.

I like unittest, but I've lately been doing some things with nested classes, and I'm wondering if one could do something interesting with doctest for inner classes (in more of a white-box sense), and unittest for more of a black-box feel. It's nice to be able to look at the shape of the code, and have it suggest its usage at a glance...