<Ctrl-Z>

Monday, November 30, 2015

I'm a big proponent of unit-testing. In a nutshell, unit-testing means writing one program to test that another works as expected. While not perfect (who tests the unit testers?), in my humble experience, unit-testing achieves two very important results:

It dramatically increases the quality of your code (i.e. fewer bugs).

It dramatically increases your confidence when updating your code.

The first result is pretty self-explanatory: when you test your code, you are likely to find bugs and fix them, and thus your code becomes less buggy. One important twist here is that in complicated software, there are parts for which manual testing is very ineffective or impossible, and the only way to really tell that they're working well is by having another program test them (more on this later).

The other result is less obvious. Because unit-testing means writing code to test your code (as opposed to manually running your code to see that it works alright), you end up with the ability to quickly verify that your code is working as expected. This ability is very important when you want to make changes to your code long after you initially wrote it. At that point, no one will remember all the details of what that particular code was designed to do, and everyone will fear that fixing one bug will end up creating new ones. It's unfortunate that in many organizations, the most important part of the code is also the buggiest and the least touched. The reason for this is the fear that fixing one bug there will create a catastrophic new one, and thus the code remains as it is. Unit testing helps prevent this, because you will have the confidence that any updates you make will at the very least meet previous expectations.

Now, writing unit tests is a skill in and of itself. In many cases, writing the code to test the software may be dramatically more complicated than writing the software itself! The good news is that Python has some great tools to help you unit-test your work...

Unittest and nose

Two great tools to be familiar with are unittest and nose. While each can be used stand-alone, I find that they complement each other well.

Let's start with unittest. It's built into Python, and thus you don't have to install anything to use it. Its docs are great, and I highly recommend you look them over. But before you do that, I want to point out some of its highlights.

To use it, one typically creates a new Python file for testing purposes. You can name it "test_some_other_file.py" if you like. In it, you need to import the unittest module, and then create a class that is derived from unittest.TestCase. Then, to actually run the tests, you need to call unittest.main(). Here is a simple example that doesn't do anything:
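A minimal sketch of such a file might look like the following. The class name Tests is simply my choice here, and exit=False is an optional extra so that pasting the file into an interactive session doesn't close it; a plain unittest.main() call works just as well from the command line.

```python
import unittest


class Tests(unittest.TestCase):
    # No test methods yet, so there is nothing for unittest to run.
    pass


if __name__ == "__main__":
    # exit=False is optional: it stops unittest from ending the interpreter
    # once the run is finished.
    unittest.main(exit=False)
```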

If you run it (by running "python test_some_other_file.py"), you should get something about running 0 tests OK (recent Python versions may instead report that no tests ran). Great!

Now, let's make it actually test stuff. To add a test, one only needs to add a method that starts with "test". Thus "testThis" and "test_now" are tests, while "notAtest" is not:
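Something along these lines, assuming the class is called Tests. The method bodies are placeholders of my own, and the unittest.main() call at the bottom of the file stays as it is, so it is left out here:

```python
import unittest


class Tests(unittest.TestCase):
    def testThis(self):
        # Starts with "test", so unittest picks it up.
        pass

    def test_now(self):
        # Also starts with "test", so this is a test as well.
        pass

    def notAtest(self):
        # Does not start with "test": unittest ignores this method.
        pass
```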

So if you run it now, it should say something about passing two tests. Yippie!

But wait, how does it know that a test passes or fails? Basically, if it calls a test method, and that method doesn't raise an exception, then it is assumed that the test passes. If it raises an AssertionError, then it assumes that the test failed. If it raises any other exception, it assumes that something else went wrong, and the test errored-out. In other words, you can think of an AssertionError as a controlled failure, and any other error as an uncontrolled failure. Both are bad. :)

An AssertionError is typically raised by an assert statement. Let's take a look:

If you run it now, you should see a bunch of information. It should say that it ran 3 tests, and that test_another() got an error, with details about it. It should also say that test_that() failed. Notice that it gives you "problem!!" as added information. In an assert statement, the first expression is the test: if it's true, the test "passes", and if it's false, the test fails. If there is a comma, then what comes after it is basically a note with additional info for the tester.

Assignment 1: Create a test file that tests a bunch of random stuff.

Now, asserts are nice and all, but when they fail, they don't give you much information. It's up to you to manually add it as a "helper note" after the comma. To make life easier for you, unittest.TestCase offers a bunch of helper "assert" methods. Let's take a look at one that I really like:

Notice how the two tests are basically the same (and both fail). But when test_annoying() fails, you don't get much information about why, whereas when test_nice() fails, it tells you exactly why: the "there" entries have different values! How nice is that?

Assignment 2: Explore the different helper "assert" methods that unittest has to offer. Play around with a few that look particularly interesting to you.

A "real" example

So far, it's all been very theoretical. In practice, you're going to want to test actual code. Say you have a Python file called "module1.py", which contains:

def square(num):
    return num*num

def is_even(num):
    return is_even % 2

(You can copy-paste this into such a file.)

To test it, you may create a "test_module1.py" which contains:
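A sketch of what that file might contain. To keep it runnable as a single listing, the two functions from module1.py are repeated at the top; in the layout described here, those lines would be replaced by "from module1 import *". The values checked in test_square() are my own picks, and exit=False is optional.

```python
import unittest


# Copied as-is from module1.py; the real test file would instead start
# with "from module1 import *".
def square(num):
    return num*num


def is_even(num):
    return is_even % 2


class Tests(unittest.TestCase):
    def test_square(self):
        self.assertEqual(square(2), 4)
        self.assertEqual(square(-3), 9)

    def test_even(self):
        # Nothing is checked here yet.
        pass


if __name__ == "__main__":
    # exit=False is optional: it keeps an interactive session open after the run.
    unittest.main(exit=False)
```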

Go ahead and run it. It should say that it passed 2 tests OK. Great! So is all good? Of course not! Our test_even() method doesn't really test anything. :)

Assignment 3: fill in test_even() with some tests (is_even() should return True if the number is even, and False if it's odd).

Great! Did it pass your tests? If not, did you find the bug in is_even()? Unit testing is great because it helps you discover subtle (and sometimes not so subtle) bugs.

The nose knows...

Remember that I mentioned nose earlier? It's ok if you didn't. :) I won't be offended, really.

Well, nose has a few nice benefits on top of unittest. The most important one (in my opinion) is that it lets you organize your tests better. Instead of putting your tests in the same folder as your code, nose lets you easily place them in a sub-folder called "tests". It's important for both your main folder and your "tests" folder to contain an empty "__init__.py" file, so that your test files can easily access the files that you want to test. An example will clarify.

So say your folder looks like this:

+ my_code
    - module1.py
    - module2.py
    - main.py

Then, to add tests, you may want to do something like this:

+ my_code
    - __init__.py
    - module1.py
    - module2.py
    - main.py
    + tests
        - __init__.py
        - test_module1.py
        - test_module2.py
        - test_main.py

Now we can take our test_module1.py file from our previous example, and modify it like so:
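A sketch of the modified file, assuming the Tests class and the test_square() / test_even() methods from before. Because of the relative import, this version only works inside the package layout shown above and is not meant to be run directly:

```python
import unittest

# Relative import: module1.py lives one folder above tests/.
from ..module1 import *


class Tests(unittest.TestCase):
    def test_square(self):
        self.assertEqual(square(2), 4)

    def test_even(self):
        # Still empty until you fill it in.
        pass
```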

Please notice two main differences:

Instead of the line "from module1 import *" we now have "from ..module1 import *". The ".." are needed because now test_module1.py is in a sub-folder (tests). This is only possible to run properly if you have the two "__init__.py" files.

There is no longer a call to unittest.main().

With nose, all we need to do to run the tests is run the command "nosetests". We can run it either from the main "my_code" folder, or from the "tests" folder within. Nosetests looks for any file that begins with "test" and checks to see if it has any tests in it. If it does, it runs them for you! The big advantage here is that in a large project, you can have dozens of different test files, and now, with a single command, you can run all of them at once!

Assignment 4: install nose, and put some stuff in "module2.py" that you can test in "test_module2.py". See how running nosetests runs all your tests!

If you want to run the tests in just one file, you can run "nosetests test_module1.py". If you want to run a particular class or a single test method, you can do "nosetests test_module1.py:Tests" or "nosetests test_module1.py:Tests.test_square".

If you want to display the output (if any) of your script as it runs, run "nosetests -s".

This is a no-conflict zone...

An important matter when writing unit tests is making sure that they don't conflict. Not with each other, not with tests in other test files, and not with your system in general. Because the order in which tests run is pretty much arbitrary, it's good practice to make sure each test runs in a "bubble", one that doesn't depend on other tests running before it, and doesn't mess things up for tests running after it.

Creating such a bubble for your tests to run in is not always an easy feat to accomplish. There are some tools that I will talk about that help you achieve this, but it's something that always has to be in your head.

Not every test is difficult to build safely. Testing the square() and is_even() functions above requires no special effort because they don't interact with the environment in any way. But say you had a test that had to read a file, and say the test depends on that file having particular data in it. You may just write the file somewhere once, and then test on it. But what happens if another test modifies or erases that file? In this case, your test will fail, and it may not always be obvious why. A good approach would be to set up the test by re-creating this file before running the actual test; this way you always ensure that the file is the way you expect it to be. Even better would be to clean up after the test by removing this file, or perhaps restoring the file that was originally there.

unittest.TestCase gives you hooks for exactly this. setUp() and tearDown() run right before and right after each test (e.g. test_square() and test_even() from our example above). setUpClass() and tearDownClass(), as the names imply, run once at the start and finish of running a class full of tests.

Thus if you need to have a certain file in a certain place, you can create it in setUp(), and remove it in tearDown(). If for some reason creating this file is very time consuming, you may want to create it once in setUpClass(), and then copy it to where the tests are expecting it in setUp(). Just don't forget to remove it now in tearDownClass().
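As a rough sketch of that file scenario: the class below builds a "master" file once per class and hands every test a fresh copy. The class name, file names, and contents are all made up for illustration, and temporary folders are used so nothing else on the system is touched.

```python
import os
import shutil
import tempfile
import unittest


class FileTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once, before the first test: build the "expensive" master copy.
        cls.master_dir = tempfile.mkdtemp()
        cls.master_path = os.path.join(cls.master_dir, "master.txt")
        with open(cls.master_path, "w") as f:
            f.write("expected data\n")

    @classmethod
    def tearDownClass(cls):
        # Runs once, after the last test: remove the master copy.
        shutil.rmtree(cls.master_dir)

    def setUp(self):
        # Runs before every test: give the test its own fresh copy.
        self.work_dir = tempfile.mkdtemp()
        self.path = os.path.join(self.work_dir, "data.txt")
        shutil.copy(self.master_path, self.path)

    def tearDown(self):
        # Runs after every test: clean up whatever the test left behind.
        shutil.rmtree(self.work_dir)

    def test_read(self):
        with open(self.path) as f:
            self.assertEqual(f.read(), "expected data\n")

    def test_overwrite(self):
        # Changing the copy here cannot affect test_read().
        with open(self.path, "w") as f:
            f.write("something else\n")
        with open(self.path) as f:
            self.assertEqual(f.read(), "something else\n")
```

Whichever order the two tests run in, each one starts from the same file contents.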

Files are just one type of resource that may require set-up and clean-up. If you're working with a database, then it may need to be set up and cleaned up properly as well. A good database framework, like Django, will help you out with this.

The key is, when running a test, to ask yourself how it depends on or affects the environment, and see if, with proper set-up and clean-up, you can remove that dependency or effect.

Let's play pretend...

Code in the real world can get quite complicated, and testing it can become a very messy affair. Say you want to test a function that behaves differently at different times of the day:
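For instance, a function like this one, which I'll assume lives in a file called greetings.py. The hour boundaries and the greeting strings are invented for illustration:

```python
from datetime import datetime


def greetings():
    """Return a greeting that depends on the current hour."""
    hour = datetime.now().hour
    if hour < 5:
        return "Go to sleep!"
    if hour < 12:
        return "Good morning."
    if hour < 18:
        return "Good afternoon."
    return "Good evening."
```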

You can see how testing this may be a problem. Now, you may be able to think of different solutions to this, such as passing hour as a parameter to the function. That may be fine, and in some cases, re-structuring the code to make it cleaner and more testable is a good idea. But in some cases, restructuring is not a good option. Perhaps a better illustration is this:
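A sketch of what such a function could look like. The URL is a placeholder that is not expected to resolve, and fetching the text with urllib is my assumption, not necessarily how the real code would do it:

```python
from urllib.request import urlopen

# Hypothetical address; ".invalid" is a reserved name that should never resolve.
GREETING_URL = "http://greetings.invalid/today"


def greetings():
    try:
        with urlopen(GREETING_URL, timeout=5) as response:
            return response.read().decode("utf-8").strip()
    except OSError:
        # urllib's URLError and socket timeouts are both OSError subclasses,
        # so any connection problem ends up here.
        return "Hello."
```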

In this contrived example, greetings() first attempts to get an appropriate greeting from the internet, but if the internet connection is down, then it returns "Hello." Here the structure of the function is not the problem; the problem is the unreliability of the internet.

Testing these sorts of functions can get messy. If you find yourself putting test-specific code in your code, you are probably doing something wrong:
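To make the anti-pattern concrete, here is a sketch of a time-dependent greetings() with a test hook bolted on. TESTING_HOUR is a hypothetical name, and the greeting strings are made up:

```python
from datetime import datetime

# Hypothetical test hook: tests overwrite this to fake the current hour.
TESTING_HOUR = None


def greetings():
    if TESTING_HOUR is not None:
        # Test-only branch living inside production code.
        hour = TESTING_HOUR
    else:
        hour = datetime.now().hour
    if hour < 5:
        return "Go to sleep!"
    if hour < 12:
        return "Good morning."
    return "Hello."
```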

The above is not only ugly, but may actually introduce new bugs to your code!! In situations like these, it's good to be able to play pretend.

unittest.mock (part of the standard library since Python 3.3), or the separately installed mock package if you're using an older Python, is an extremely powerful tool in the tester's arsenal. Besides the library's documentation, there is a great tutorial that I highly recommend that you go over. Here I will only offer a little glimpse to better understand its power:
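A single-file sketch of the idea, using unittest.mock. To keep it runnable on its own, a stand-in greetings() with made-up return strings is defined next to the test, and the patch targets this file's own datetime name. In the package layout used in this post, the test file would instead do "from .. import greetings", call greetings.greetings(), and patch the datetime name inside that module.

```python
import datetime as dt            # real datetime, used to build the fake "now" values
import unittest
from datetime import datetime   # the name greetings() looks up, and the one patched
from unittest import mock


def greetings():
    # Stand-in for greetings.greetings(); the strings are invented.
    hour = datetime.now().hour
    if hour < 5:
        return "Go to sleep!"
    if hour < 12:
        return "Good morning."
    return "Hello."


class Tests(unittest.TestCase):
    # Patch the name where it is looked up (this module), not where it is defined.
    @mock.patch(__name__ + ".datetime")
    def test_greetings(self, mock_datetime):
        # Pretend it is around 7:15am.
        mock_datetime.now.return_value = dt.datetime(2015, 11, 30, 7, 15)
        self.assertEqual(greetings(), "Good morning.")

        # Same call, but now pretend it is around 1:15am.
        mock_datetime.now.return_value = dt.datetime(2015, 11, 30, 1, 15)
        self.assertEqual(greetings(), "Go to sleep!")
```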

The above example lets me "mock" different return values for datetime.datetime.now(). First, I set it to around 7:15am, and then to 1:15am. Each time, greetings.greetings() is called in the same way, but returns a different (and appropriate) value.

But to get things to work well with nose, notice a few things here:

I'm patching (using '@mock.patch') the datetime module as imported in greetings. This is important because if I mock 'datetime.datetime', it won't affect the greetings module. Relating to this, I had to import the greetings module by doing "from .. import greetings".

Decorating a test function with @mock.patch will cause a mock parameter to be passed to the test function. In this example, I call it mock_datetime.

To change the return value of a function inside a mocked module, simply change the appropriate mocked parameter. Thus to change greetings.datetime.now(), I had to mock greetings.datetime as mock_datetime, and then set mock_datetime.now.return_value to whatever I wanted.

Notice that you can set the return_value as many times as you want within a test function.

The decorator @mock.patch automatically "unmocks" the module as soon as the test function exits, thus there is no additional set-up or clean-up to perform.

Now there is a lot more you can do with the mock module! But I just wanted to give you a little practical glimpse. The beauty here is that mocking lets us test complicated code without having to change it.

Final thoughts...

But more than just a skill, unit-testing is a mindset. For it to have its impact on a project, it's important to understand that writing well-tested code is an investment. If one puts in the time to test the code, one saves time later by dealing with fewer bugs.

But as with any software project, it's important to have good tools at your disposal. Unittest, nose, and mock are such tools. But they are not the only tools out there.

For instance, when looking for a good Python framework for writing web applications, of the many advantages that Django offers, not least is its well thought-out and integrated unit-testing capability. This means that making sure your code works is all that much easier using Django. All the better!

Now, I haven't covered everything that unit-testing may involve. Some other tools / techniques that may (or may not) be of interest are:

doctest: a neat way of having the unit tests be a part of the documentation!!

Code coverage: a convenient way to spot parts of the code that aren't being tested. Of course, this doesn't mean that the parts of the code that are being tested are being tested properly.