Python has a built-in testing framework called unittest. Over time, Python programmers looked for better-suited solutions. Some of them created something entirely new (like doctest), while others built on top of unittest. One of the frameworks that aims to replace unittest is nose. I use it constantly during the development of Cheesecake, and I must admit it has been very helpful: it is easy to set up, integrate and extend to my needs. It has to be said, though, that along with its many features it also has a slightly different philosophy than unittest. In this short article I’m going to describe one of the main issues that nose users may accidentally step into: weak test isolation.

Unittest, of which nose is a great successor, was based on JUnit. One of the core principles of JUnit was that (quoting Martin Fowler) no test should ever do anything that would cause other tests to fail. Nose doesn’t give you such certainty, because (for performance reasons) it uses one interpreter for all tests. We’ll look at examples of the bugs this approach can cause.

This post is not meant to be a rant of any sort. Actually, by learning from these cases I’ve improved my understanding of testing in general. I’d be happy if at least one reader learns something new from my experience. For those of you seeking a quick fix: this issue has been reported as a bug and a solution is under way. Please remember that test isolation won’t be the default, though.

Why is that? Nose reads tests from the directory in lexical order, so test_hello_a.py gets read and executed before test_hello_b.py. Nose runs both tests in the same environment, so the hello module is read and initialized only once. This means any changes made by test A to the hello module will still be present when test B runs.
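The mechanism can be reproduced without nose at all. The sketch below is only an illustration: the hello module is built in memory, its greeting attribute is an assumed name, and the two functions merely play the roles of the two test files.

```python
import sys
import types

# Hypothetical stand-in for the "hello" module under test; the attribute
# name `greeting` is an assumption made for this illustration.
hello = types.ModuleType("hello")
hello.greeting = "Hello"
sys.modules["hello"] = hello


def test_a():
    """Plays the role of test_hello_a.py: changes module state and leaves it."""
    import hello  # served from sys.modules, the module is not re-executed
    hello.greeting = "Goodbye"
    return hello.greeting


def test_b():
    """Plays the role of test_hello_b.py: expects an untouched module."""
    import hello  # the very same module object that test_a modified
    return hello.greeting


result_a = test_a()
result_b = test_b()
print(result_a, result_b)  # both values are "Goodbye"
```

Because both imports resolve to the one object cached in sys.modules, test B sees whatever test A left behind.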

Resolution

This particular example seems like an error on the test runner’s side, because we have two different scripts that interfere with each other in a way that wouldn’t happen if you ran them separately. On the other hand, it may pinpoint a bug in your setup/teardown logic. If you change global state during setup to exercise the behaviour of objects in a certain environment, you should revert to the original state during teardown. With this in mind, test A should look more like this:
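A minimal sketch of such a test, written against plain unittest. The hello module is again an in-memory stand-in and greeting is an assumed attribute name, so treat this as an illustration of the save-and-restore pattern rather than the exact test from the project.

```python
import sys
import types
import unittest

# Hypothetical stand-in for the "hello" module; `greeting` is an assumed name.
hello = types.ModuleType("hello")
hello.greeting = "Hello"
sys.modules["hello"] = hello


class TestHelloA(unittest.TestCase):
    def setUp(self):
        import hello
        self.hello = hello
        # Remember the original value before changing global state.
        self._saved_greeting = hello.greeting
        hello.greeting = "Goodbye"

    def tearDown(self):
        # Put the module back the way we found it; tearDown runs
        # whether the test method passed or failed.
        self.hello.greeting = self._saved_greeting

    def test_changed_greeting(self):
        self.assertEqual(self.hello.greeting, "Goodbye")


def run_test_a():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestHelloA)
    return unittest.TextTestRunner(verbosity=0).run(suite)


outcome = run_test_a()
```

After the run, hello.greeting is back to its original value, so a later test importing hello starts from a clean state.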

Reverting changes may seem impossible in some cases, but you should try hard to do it. Remember what they say: Your code sucks if it isn’t testable. And it isn’t really testable if you can’t isolate the state you’re interested in testing.

Mocking errors

Problem

The state of built-in modules is preserved from one test to the next.

Example

This is similar to the first example, but approaches the problem from a different direction. Now we’re writing a simple locking mechanism that uses files as locks. Currently the code looks like this:

Does TestLockedLocker leave a lock file? No. But it leaves behind a mocked version of os.path.exists, which always returns True. Built-in modules don’t get reloaded for each test, so any changes you make to them will remain during the execution of other tests. The same goes for builtins.

Resolution

The state of the interpreter should be considered external, just like the state of the filesystem or environment variables: if you make any changes to it, remember to revert them at the end. This especially affects __builtins__ and the standard library, where changing one function or variable can affect the majority of other tests.
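One way to make the revert automatic is sketched below. It assumes Python 3, where unittest.mock ships with the standard library; with other tools the same effect can be had by saving the original function in setUp and reassigning it in tearDown. The test body is a placeholder for illustration.

```python
import os.path
import unittest
from unittest import mock

real_exists = os.path.exists


class TestLockedLocker(unittest.TestCase):
    def setUp(self):
        # Replace os.path.exists for the duration of one test only.
        patcher = mock.patch("os.path.exists", return_value=True)
        patcher.start()
        # addCleanup guarantees the original function is put back,
        # even when the test method raises.
        self.addCleanup(patcher.stop)

    def test_every_path_looks_locked(self):
        self.assertTrue(os.path.exists("/no/such/lock/file"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestLockedLocker)
result = unittest.TextTestRunner(verbosity=0).run(suite)

# After the run, the standard library is back in its original state.
restored = os.path.exists is real_exists
```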

Import clashes

Problem

Subdirectories you keep your tests in cannot contain modules with the same names.

Example

As your set of test cases grows, you will probably start arranging them in a hierarchical structure. The first obvious level of partitioning is placing your unit tests and functional tests in separate directories. That is what Grig did for the Cheesecake project, and this is the convention I’ve followed. As I was writing more unit and functional tests, a bit of redundancy in the code started to show up. When the time to refactor came, I took the common functionality and placed it in a handy module, helper.py. That’s when nose got in my way. My directory structure looked like this:

In the example above, the unit test test_this tried to access the unit_help function, which exists in unit/helper.py but not in functional/helper.py. Apparently the functional helper got imported first and was saved as 'helper' in sys.modules, thus preventing “import helper” in the unit test from loading unit/helper.py (remember that all tests share the same interpreter).
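The clash can be reproduced in a few lines. This sketch builds two throwaway directories named functional and unit, each with its own helper.py, and assumes the runner puts each test directory at the front of sys.path before importing from it. The functional_help function is a made-up name; unit_help comes from the description above.

```python
import os
import shutil
import sys
import tempfile

root = tempfile.mkdtemp()
functional_dir = os.path.join(root, "functional")
unit_dir = os.path.join(root, "unit")

# Two different modules that happen to share the name helper.py.
for directory, source in [
    (functional_dir, "def functional_help():\n    return 'functional'\n"),
    (unit_dir, "def unit_help():\n    return 'unit'\n"),
]:
    os.mkdir(directory)
    with open(os.path.join(directory, "helper.py"), "w") as f:
        f.write(source)

# A functional test runs first and imports its helper.
sys.path.insert(0, functional_dir)
import helper
first = helper

# A unit test runs next. unit/ is now first on sys.path, but the import
# statement finds "helper" in sys.modules and never looks at the disk.
sys.path.insert(0, unit_dir)
import helper
second = helper

same_module = first is second
has_unit_help = hasattr(helper, "unit_help")
print(same_module, has_unit_help)  # the cached functional helper wins

# Clean up so the demonstration leaves no trace in the interpreter.
sys.path.remove(unit_dir)
sys.path.remove(functional_dir)
del sys.modules["helper"]
shutil.rmtree(root)
```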

Resolution

Conclusion

The next release of nose will contain an isolation plugin, whose responsibility will be to purge sys.modules between running different test files. This will fix some isolation problems, but not all of them. You will still be able to kill the runner or modify builtins in a way that breaks other tests. A complete solution would involve spawning a new Python process for each test, which is unacceptable from a performance point of view. The current state is good enough, so with a bit of nose-specific knowledge and common sense you can get the best of both worlds: testing speed and stability.
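To give a rough idea of what purging sys.modules between test files means, here is a sketch of the general technique. It is not the plugin's actual implementation; isolated_modules and the module name used in the demonstration are hypothetical.

```python
import sys
import types
from contextlib import contextmanager


@contextmanager
def isolated_modules():
    """Forget every module that was first imported inside the block."""
    saved = set(sys.modules)
    try:
        yield
    finally:
        for name in set(sys.modules) - saved:
            del sys.modules[name]


# Simulate a test file that brings a new module into the interpreter.
with isolated_modules():
    sys.modules["isolation_demo_helper"] = types.ModuleType("isolation_demo_helper")
    seen_inside = "isolation_demo_helper" in sys.modules

seen_after = "isolation_demo_helper" in sys.modules
print(seen_inside, seen_after)
```

Note what this does not cover: modules imported before the block, such as os.path, keep whatever modifications a test made to them, which is why purging fixes import clashes but not leaked mocks.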