Make a habit of writing automated tests and running them frequently as you
code!

For a while it might seem like you can get away without doing it.
For a while, you can, sort of.
But learning how to use automated testing might be the most important thing you
can do to improve the quality and reliability of your code. If you don’t
use automated testing, you just aren’t working in a professional way.

Note

Testing is an area where there are lots of disagreements about what things
should be named, and what’s best to do. For example, passionate and endless
arguments rage about the exact meaning of “unit testing” and the right way to do it.
Don’t let this discourage you from getting started with testing.
Please take what’s said here with a grain of salt, and do some research on
your own so you can have informed opinions about testing.

In their simplest form, tests are just bits of code you can run, which raise
an exception when something is broken or incomplete in the code you want to
test. When a test runs and an exception escapes or an assertion fails, the
test fails. A good test fails whenever the right things did not happen.

If you think about it, this isn’t much different from the little snippets of
code you type into a Python interpreter shell, or into throwaway files, to try
out new ideas and see if they work. You just save the code and structure it
a little so you can easily run it over and over, to tell you if you are
finished with something and to detect any new problems you might have caused.
Instead of running the code and inspecting the results by hand, you make it
run automatically, so that failures are detected automatically.
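In its simplest form, such an automatic check can be nothing more than this (the function and the check here are made up purely for illustration):

```python
# A tiny, self-contained test: ordinary code that raises when something
# is wrong.
def double(x):
    return x * 2


def test_double():
    # Raises AssertionError if double() stops doing what it should.
    assert double(2) == 4
    assert double("ab") == "abab"


test_double()  # running the file runs the check; silence means success
```

Run it again after every change, and it will tell you the moment `double` breaks.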

Whenever you check whether your programs work, you’re testing them. At first,
it seems like the easiest thing to do is just to try it out yourself, then fix
problems you see on the fly. This might work pretty well for new projects which
are small (and probably not very important yet). But as your programs do more,
the time required to manually check their behavior
gets longer, and it becomes easier to get lazy or just forget to check some
things. It also starts to waste your life on tedious waiting and repetition.
And as you are fixing and testing one thing, it is often the case that you are
creating bugs somewhere else, in a potentially endless stream of bugs. This
creates insane levels of frustration and is a terrible way to live. It’s also
bad for the software. You get broken software, abandoned projects or both.
The problems only get worse on multiple-person projects with real requirements
and deadlines.

Since you are going to test anyway, you might as well do it in a way that is
both easier and more thorough - by writing automated tests.

So far this is all pretty abstract, and it may be unclear how to actually
get started. So let’s look at an example to get a feel for what testing
looks like.

The silly example task is to write a function which returns a sequence of
strings containing the numbers 1 to 10, with “fizz” appended to multiples of
three and “buzz” appended to multiples of five. (This problem is totally
ridiculous, it exists solely to provide an example that is simple to
understand.)

This test is simple, but a bit silly. Often you cannot write out every
desired output explicitly, or you would want to organize things differently,
but this will do to give you an idea, and it does test the code. Anyway:
better a silly test than no test, right?

Now here’s some silly example code meant to pass the test (fizzbuzz.py):
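One possible version, sketched under the assumption that the test imports a function named fizzbuzz from this module:

```python
# fizzbuzz.py -- one possible implementation; the module and function
# names are assumed to match the test file.

def fizzbuzz():
    result = []
    for n in range(1, 11):
        text = str(n)
        if n % 3 == 0:
            text += "fizz"   # append to multiples of three
        if n % 5 == 0:
            text += "buzz"   # append to multiples of five
        result.append(text)
    return result
```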

There are many possible solutions to this problem, and this one is a bit lame.
The point is this: once you have adequate tests, it becomes much easier
to rewrite without breaking anything. It’s better to get something simple that
you have tests to verify, than to try to get something perfect the first time
without any tests to check it.

The example test file can be easily found and run by utilities like
py.test (pip install pytest) or
nosetests (pip install nose).
For example, as run from the top directory of your project:

py.test

or

nosetests

It can be pretty much that easy, if you have structured your project to make
the tests easily discoverable. These tools typically search directories named
things like test/ or tests/, finding files that start with test_
and looking for test functions which have names starting with test_ or test
classes which have names starting with Test. In some complex cases,
automated test discovery of this kind can be confusing, and you might have to
use verbose flags or consult the tool’s documentation to get a handle on it.
Usually, though, it should save some trouble and make testing a bit more fun.

Several test runners also have useful functionality like running multiple tests
at once or distributing tests across multiple machines.

With minor modification, one could also write and run the test using Python’s
built-in unittest module. I find it to be verbose and lacking in some
useful features, but that’s just my opinion; lots of people use it, and it’s
a reasonable thing to use.

Assuming you named this file test_fizzbuzz2.py, one way to run its tests from
the same directory would be:

python -m unittest -v test_fizzbuzz2

Actually you can use py.test or nosetests to run this style of
test too, if you prefer. Even if you want to use unittest to write
your tests, I recommend picking up py.test or nosetests just to run them.

The idea behind Python’s old doctest module is basically to paste
transcripts of interactive shell sessions into text files or docstrings, and
then run doctest on these files to check that the results given by the code
match the transcript.
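For example, a function whose docstring doubles as a doctest (the function here is invented purely for illustration):

```python
def add(a, b):
    """Return the sum of a and b.

    >>> add(2, 3)
    5
    >>> add("fizz", "buzz")
    'fizzbuzz'
    """
    return a + b
```

Running python -m doctest on the file re-executes the transcript and reports any example whose actual output differs.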

Although it might initially seem like less work than writing real tests,
there are several reasons you might not want to use doctest to test your Python
modules.

The tests are brittle, with small changes affecting the text output.
Making the tests succeed despite non-deterministic runtime variations and
trivial changes across versions tends to require ridiculous contortions that
(in the author’s opinion) remove the apparent ease-of-use advantage for
doctest.

Many tests one should normally do are very awkward to render as text
transcripts, greatly reducing the appeal in practice; for example,
comprehensive doctests tend to result in many complex lines yielding True or
False, interrupted periodically by lines which do nothing but set up or tear
down pieces of the test environment. The format also encourages writing long
lumps of doctest containing multiple tests.

Putting doctests in docstrings clutters up the docstrings, obscuring the
kinds of documentation which really belong there. I don’t like to read long
shell transcripts in docstrings, and I suspect many other Python programmers
feel the same.

Whatever its merits, by now it looks pretty old-fashioned and sloppy to do
most of your testing this way.

However, doctest is still useful for ensuring that examples you give in
documentation work properly. Normally you would configure Sphinx to do this for
you.

This section contains some general and philosophical thoughts on how to test
effectively with less unnecessary pain. It’s aimed primarily at people with
less exposure to testing, who do not necessarily understand or agree with the
need. If that is not you, you probably don’t need to read it. In particular,
if you are a TDD diehard or a unit testing purist, you may find this section
objectionably vague or weakly stated. (Oh well.)

Writing tests can be boring.
Probably the hardest thing about it is getting started. It gets
harder the longer you wait after the beginning of a project. Once you get
started, it’s usually not very hard to continue. Whatever arguments you
encounter regarding the right way to write tests, remember that the core point
is to have automated tests early on, and do not get discouraged. You can refine
your approach, speed up your tests or add other types of tests as you go. Just
get started, so you have something to fix and a reason to fix it.

As you make changes, run your tests frequently so you know when something
starts being broken. This can often let you catch smaller bugs before they
compound and accumulate into giant, mysterious tangles which make the project
hell to work on. At a minimum, this lets you know about impending issues
earlier, when there is more chance of fixing them and avoiding big problems.
Let too much time go between test runs and the tests are likely to become
irrelevant, then ignored, and then you lose all the benefits of having tests at
all.

It is reasonably common to use commit hooks (e.g.,
git commit hooks
or Mercurial commit hooks)
to prevent code from being committed unless it passes the tests.
This is intended to ensure that there are no outright broken versions of
the code that someone might check out, which can save a lot of time (for
example) when one needs to roll back from a deployed version containing a
previously undetected bug to a previous version.

Continuous testing is also a big reason people use CI (Continuous Integration)
servers like Jenkins or
Travis, to monitor the health of committed code in
version control repositories in addition to whatever testing developers are
doing on their own machines.

Though everyone will tell you to do it, don’t misunderstand writing tests as
a matter of holiness: it’s really a practical matter of reducing pain, for you
as much as others. The time and effort of writing tests is justified by the
work that it saves. However, that time and effort shouldn’t be ignored, as
bad practices can increase it and somewhat reduce the advantage of testing.

In extreme cases, highly overdone or stupidly-written tests CAN cause more work
than they save - but that’s rare in practice. It’s much more common that we say
that to ourselves in order to rationalize our own laziness, when it’s not
really true. The tests you write, however imperfect, are almost certain to
catch more bugs in your code than anyone would catch by hand, and to require
much less effort than repeating the same manual testing over and over.

There is a trap for people who underestimate the trouble saved by tests.
It’s so tempting to think: “these tests are taking too much time to write.
I need that time to fix bugs and add features” - especially under pressure to
deliver quickly.
The tighter the deadline and the more urgently your fixes and features are
needed, the worse it will be when you unexpectedly have to drop everything to
fix a pile of emergency bugs. In that extreme urgency, you will have even less
room to write tests to help dig out of this vicious cycle. And the fixes are
very likely to take more work than writing the tests would have.
Urgent and important goals are reasons to take a little time up front to
write tests, not reasons to skip them.
Code that matters needs tests.
(Even for less critical code like quick prototypes, some basic tests can help
a lot in getting something fully functional as directly as possible, or
ensuring that something remains functional through large overhauls in the
code.)

However, given that we are going to write tests,
it’s only sensible to do it in ways that save unnecessary work.
We want tests which are focused on important things. We don’t want tests which are
irrelevant or redundant. We don’t want tests which require a lot of unnecessary
maintenance. Here’s a simple triage strategy for reducing test-writing effort
at the early stages of a project:

1. Start with tests relating directly to subgoals which are already clear. Add
tests to specify more as the design becomes clearer.

2. Check a few simple cases before working
on many complex or subtle cases. A few simple cases should be easy to test
and provide a good basis for refinement - don’t get initially stuck
generating endless cases or testing complex and subtle things unless you know
they are likely to be vital.

3. Try to check that things do what they should without depending on the
specific way they are done. A good test doesn’t need to be updated every time
an implementation detail in the code is changed.

While writing tests, don’t obsess over perfection, and aim for clear over
clever. Being too clever takes extra time and makes things harder later.
Some repetition and lack of elegance can easily be refined if needed,
and are much better than writing things you won’t understand in 6 months.

In order to reach greater maturity, a project does need to commit to specific
goals and design decisions and stop changing everything, so it can get
important features working and keep them working.
This work is reinforced with more comprehensive testing: more tested subgoals,
and more carefully chosen cases (see Test Thoroughly).

Typically this does mean more time allocated to tests, and less flexibility for
new development. That directly reflects the fact that more people are depending
more heavily on the functionality of the project. There’s no way out of
this without removing functionality: a new branch or project can set different
expectations, but it can’t replace the older project perfectly until it passes
its important tests. If we exhaust techniques to test more efficiently, we have
to recognize that the costs of writing and maintaining tests are simply
inherent to the process of developing functional software. More functionality
means more test code.

The most important thing about your tests is that they really test whether your
program is doing what it should. That’s test quality. High-quality tests
are sensitive to the functioning of your program, and nothing else: they must
be able to fail, they must fail when something happens that can hurt the
program’s function, and they shouldn’t fail for completely incidental reasons.

The other aspect of thorough testing is test coverage.
Sometimes bugs can escape notice for some time because they are in parts of the
code that are rarely run. Ideally, the code is not only run, but it’s
actually tested to see that it does what it should.
When you need to make sure more of your code is getting run during tests,
use Ned Batchelder’s tool coverage (very well documented on
its own site, and also used by plugins for py.test and nosetests).
This tool records which code ran during your tests, and
generates a report of what you missed so you can find what code isn’t
being tested. This doesn’t ensure that your tests are meaningful;
that’s still up to you. The right way to use a coverage tool is to find
code you aren’t testing, so you can write new high-quality tests for it.

Testing thoroughly isn’t incompatible with testing strategically. Tests which
don’t really detect anything of interest are just slowing you down every time
you have to run the tests.

Ideally, you would only ever have to develop programs against your favorite
language and your favorite version of that language - whether it is an awesome
new version or a favorite old one. Alas, programmers often need to support
old users and also stay up to date, whether we want to or not.

If you need to support multiple versions of Python (or similarly varied
environments), then use Holger Krekel’s
testing utility, tox.
It takes care of setting up virtualenvs and running tests in those virtualenvs
for you. So you just configure and run tox, and it runs your tests in each
version of Python that you mean to support. If you ever work on porting
a library or keeping it compatible with multiple versions of Python, this is
incredibly useful. Otherwise, it can become unmanageably frustrating to make
changes aimed at one version, only to find that they break the other version.
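The configuration is a small tox.ini file at the root of the project. A minimal sketch might look like this (the environments and dependencies listed are placeholders for whatever your project actually supports):

```ini
[tox]
envlist = py27, py36

[testenv]
deps = pytest
commands = py.test
```

Running tox then builds a virtualenv per listed environment, installs the dependencies, and runs the test command in each.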