Sessions at PyCon US 2012 about Testing

Wednesday 7th March 2012

Are you new to Python and want to learn how to step it up to the next level? Have you wondered about functional programming, closures, decorators, context managers, generators or list comprehensions and when you should use them and how to test them? This hands-on tutorial will cover these intermediate subjects in detail, by explaining the theory behind them then walking through examples.

Tutorial will be in the the form of short lecture, short assignment to practice the concepts.

Attendees should come with a laptop and Python 2.x (3.x differences will be noted).

Tutorial will cover:

Testing (unittest and doctest)

Functional Programming

Functions

Closures

Decorators

Class decorators

Properties

Context Managers

List comprehensions

Iterator pattern

Generators

Materials include an ebook covering the material, slides, handout and assignment code. Prizes to be awarded for completion of assignment.

Friday 9th March 2012

Project Hosting at Google Code is a large, well-established system written mostly in Python. We'll share our battle-born convictions about creating tests for test-unfriendly code and the larger topic of testing.

When launched, Project Hosting’s testing consisted of the stock Subversion test suite and a handful of ad hoc smoke test scripts that required starting the entire system and manually inspecting the test’s output.

Over six years of codebase evolution, tests have been added with varying degrees of coverage and maintainability. Early system design decisions made adding tests difficult: the first tests added to the system used mock objects unwisely and large numbers of mock objects made refactoring costly in time and effort.

Frustration with the difficulty of enhancing the service led us to reevaluate our testing practice and led to the discovery of better ways to test applications of this complexity. We will share our experiences with testing and discuss designing for maintainability and testability and appropriate use of testing tools such as frameworks and mocks.

Most unit tests aren't and their authors suffer for it. What is a unit test, really? How can writing them prevent classic testing problems? If you do write them, what trade-offs are you implicitly making?

Your unit test suite takes three minutes to run 500 tests. On a modern CPU, that's just shy of a billion instructions per test. Can something that takes a billion CPU instructions really be called a "unit"? What is that suite really testing?

Many (and probably most) unit testing failures are related to size. Most common are the suite that takes half an hour to run (so no one runs it), the suite whose runtime scales like lines_of_code^2 (so, again, no one runs it), and the suite that requires huge maintenance for small changes (leading to the "testing is slow" myth).

This talk is about the "unit" in "unit test": what does it really mean, why do we care, and how does it prevent the three crippling problems above? And, of course, if we do shift our focus toward unit tests, what trade-offs are we really making?

Mozilla's projects have thousands of tests, so we've had to venture beyond vanilla test runners to keep things manageable. Our secret sauce can be used with your project as well. Reach beyond the test facilities that came with your project, harnessing pluggable test frameworks, dynamically reordering tests for speed, exploring various mocking libraries, and profiling your way to testing nirvana.

A partial outline:

Intro
Motivation: a test not run is no test at all.
For most web apps, the easiest test speed win is a conquest of I/O.
The nose testrunner
Test discovery lets you organize tests well.
Pluggability
Gluing to projects with custom testrunners: django-nose and test-utils
py.test
Compare to nose. Nose forked from it. Explain history.
Very cool assertion re-evaluation
Plugin compatibility between py.test and nose
Profiling
Start here. Premature optimization sucks.
time on the commandline to divide CPU from I/O
--with-profile
Killing I/O for speedy justice: case study of support.mozilla.com
Fixture speed hacks (a 5x improvement!)
Once-per-class setup
How to use DB transactions to avoid repetitive I/O
Dynamic test reordering and fixture sharing
DB reuse and other startup optimizations
37,583 queries to 4,116. Watch them fly by!
What to do instead of fixtures: the model-maker pattern
Lexical proximity
Lower coupling
Speed
Using mocking to kill the fixtures altogether
mock, the canonical lib
fudge, new declarative hotness
Syntax, capabilities
Example: oedipus, a better API for the Sphinx search engine. I used fudge to unit-test oedipus without requiring devs to set up and populate Sphinx.
Dangers of mocking
Don't mock out your caching unless your invalidation is perfect.
Some of our mistakes in oedipus
The nose-progressive display engine
Test results that are a pain to read don't get read.
Progress indication
Elision of junk frames
Easier round-tripping from test failure to source code
Continuous integration
Motivation
Jenkins
Buildbot
IRC bots
Next steps: what to do once you're CPU-bound
More parallelization.
Multithreading really buys you no speed bump for CPU-bound (or I/O bound?) tasks in Python due to the GIL. (Ref: PyCodeConf talk by David Beazley.)
State of multiprocess plugins in various testrunners.
Mozilla's Jenkins test farm
QA's big stacks of Mac Minis
What global warming? ;-)

The py.test tool presents a rapid and simple way to write tests. This talks introduces common testing terms, basic examples and unique pytest features for writing unit- or functional tests: assertions and dependency injection mechanisms. We also look at other features like distributing test load, new plugins and reasons why some Redhat and Mozilla people choose pytest over other approaches.

The py.test tool presents a rapid and simple way to write tests for your Python code. This talks introduces some common testing terminology and basic pytest usage. Moreover, it discusses some unique pytest features for writing unit- or functional tests. For unit tests, the simple Python "assert" statement is used for coding assertions. As of 2011, this assert support has been perfected for Python 2.6 or higher, finally removing what some people have formerly called "magic side effects". For writing functional or acceptance tests py.test features a unique depdendency injection mechanism for managing test resources. The talk shows how to setup these resources and how to configure it via command line options. More recently, QA folks from Mozilla and Redhat QA people have endorsed come to appreciate these unique features and the general customizability. The talk concludes with a look on other features like distributing test load and other recently released plugins.

In this talk, aimed at intermediate Pythonistas, we'll have a look at some common, simple patterns in code, and the testing patterns that go with them. We'll also discover what makes some code more testable than others, and how mocks and fakes can help isolate the code to be tested (and why you want to do that). Finally, we'll touch on some tools to help make writing and running tests easier.

Overview
You've heard the gospel of 'test, test, test!' over and over again, and may have even felt some jealousy or guilt because you're not using Test-Driven Development. Maybe you've even seen talks or read blog posts about writing 'testable code', but it just hasn't sunk in.

The reality is that writing effective unit tests can be somewhat difficult to wrap your head around. What's a unit test? When is a unit test not a unit test? What's a functional test? When is a Mock really a fake or stub? There's a good bit of lingo, a fair amount of religion, but not enough instruction on effective testing patterns and idiomatic, Pythonic testing practices.

As programming and application architecture is heavily influenced by the use of patterns, it's only logical that those patterns produce the side effect of making the way they'll be tested more predictable, and yet discussions of patterns regularly leave out coverage of testing, and most testing talks fail to link a methodology to patterns in the code. This changes now.

In this talk, aimed at intermediate Pythonistas, we'll have a look at some common, simple patterns in code, and then have a look at the testing patterns that go with them. We'll also discover what makes some code more testable than others, and how mocks and fakes can help truly isolate the code to be tested (and why you really want to do that). Finally, we'll touch on some tools to help make writing and running tests easier.

Outline
What is a unit test? (3 minutes)
Unit Test definition
Unit tests vs. functional, integration, and acceptance tests
Why Unit Tests? (3 minutes)
"Why isolate the code?" (I get this question a lot)
"But, I use functional tests & have 100% coverage!"
Three pieces of code, and how to make it more testable. (8 minutes)
Patterns in code, patterns in tests (15 minutes)
A Simple datetime abstraction library, its patterns and tests.
A REST Client Module, its patterns and tests
A cmd-module-based command shell, its patterns and tests
A microframework-based service, its patterns and tests
Tools You Want to Use (5 minutes)
Mock
Nose
Coverage
Tox
TBD (Possibly PyCharm's test support, which is getting good w/ 2.0, but there are many candidates)

Saturday 10th March 2012

A deep dive into writing tests with Django, covering Django's custom test-suite-runner and the testing utilities in Django, what all they actually do, how you should and shouldn't use them (and some you shouldn't use at all!). Also, guidelines for writing good tests (with or without Django), and my least favorite things about testing in Django (and how I'd like to fix them).

Django has a fair bit of custom test code: a custom TestSuiteRunner, custom TestCase subclasses, some test-only monkeypatches to core Django code, and a raft of testing utilities. I'll cover as much of that code as I find interesting and non-trivial, taking a close look at what it's actually doing and what that means for your tests.

This will be a highly opinionated talk. There are some things in Django's test code I really don't like; I'll talk about why, and how I'd like to see them changed. As a natural part of this, I'll also be outlining some principles I try to follow for writing effective and maintainable tests, and note where Django makes it easy or hard.

This is an "extreme" talk, so I'll be assuming you've used Django and done some testing, and you're familiar with the basic concepts of each. This won't be an introductory "testing with Django" howto.

Can your robot play Angry Birds? On an iPhone? Mine can. I call it "BitbeamBot". It started as an art project, but it has a much more serious practical application: mobile web testing. To trust that your mobile app truly works, you need an end-to-end test on the actual device. BitbeamBot is an Arduino-powered open-source hardware CNC robot that can test any application on any mobile device.

For the confidence that your mobile app truly works, you need an end-to-end test on an actual device. This means the full combination of device manufacturer, operating system, data network, and application. And since mobile devices were meant to be handled with the human hand, you need something like a real hand to do real end-to-end testing. At some point, after lots of repetitive manual testing, the inevitable questions is asked "Can we / should we automate the testing of the old features, so I can focus the manual testing effort on the new features?"

That's where the BitbeamBot comes in. BitbeamBot is an Arduino-powered open-source hardware CNC robot that can test any application on any mobile device -- touching the screen of a mobile device just like a user would. It also uses Python and the Selenium automation API to work its magic. In the future your testing will be automated... with robots.

At the moment, BitbeamBot is just a prototype, but it can play games with simple mechanics, like Angry Birds. However, it's not very smart; it can't yet "see" where objects are on the screen. From my computer, I send electrical signals to two motors to move the pin over any point on an iPhone screen. I then use a third motor to move the pin down to the screen surface and click or drag objects. This open loop, flying-blind approach to automation is how automated software testing was done years ago. It's the worst way to automate. Without any sense of what's actually visible on screen, the script will fail when there's a discrepancy between what's actually on the screen and what you expected to be on screen at the time you wrote the automation script.

A better approach to testing with BitbeamBot will involve closing the feedback loop and have software determine where to click based on what is actually on screen. There are two styles I'll experiment with: black box and grey box. Black box testing is easier to get started with, but grey box testing is more stable long term.

Black box testing requires no internal knowledge of the application. It treats the application like a metaphorical "black box". If something is not visible via the user interface, it doesn't exist. To get this approach to work with BitbeamBot, I'll place a camera over the phone, send the image to my computer, and use an image-based testing tool like Sikuli. Sikuli works by taking a screenshot and then using the OpenCV image recognition library to find elements like text or buttons in the screenshot.

The other style is grey box testing. It's a more precise approach, but it requires access to and internal knowledge of the application. I'll implement this approach by extending the Selenium library and tethering the phone to my computer via a USB cable. With the USB debugging interface turned on, I can ask the application precisely which elements are on screen, and where they are before moving the BitbeamBot pin to the right location.

BitbeamBot's home is bitbeam.org. The hardware, software, and mechanical designs are open source and available on github.