There’s a test code maintenance issue I’ve been grappling with, and watching others grapple with, for a while now. I’ve blogged about some infrastructural things related to it before, but now I think it’s time to talk about the problem itself. The problem shows up as soon as you start writing setUp functions, or custom assertThing functions. And the problem is – where do you put this code?

If you have a single TestCase, it’s easy. But as soon as you have two test classes it becomes more difficult. If you put the code in either class, the other class cannot use your setUp or assertion code. If you create a base class for your tests and put the code there, you end up with a huge base class, with every test paying the total overhead of all your test needs rather than just the overhead needed to test the particular system you want to test, or with a large and growing list of assertions, most of which are irrelevant for most tests.

These choices have to be made because test code is just code, and all the normal issues there – separation of concerns, composition often being better than inheritance, do-one-thing-well – apply to our test code. These issues are exacerbated by pyunit (that is, the Python ‘unittest’ module included with the standard library and extended by various projects).

Let’s look at some of the concerns involved in a test environment: test execution, fixture management, outcome decision making. I’m using slightly abstract terms here because I don’t want to bind the discussion to an existing implementation. The downside is that I need to define these terms a little.

Test execution – by this I mean the basic machinery of running a single test: the test framework calling into user code and receiving back an outcome with details. E.g. in pyunit your test_method() code is called, success is determined by it returning successfully, and other outcomes by raising specific exceptions. Other languages without exceptions might do this by returning an outcome object, or by passing some object into the user code to be called by the test.

Fixture management – the non-trivial code that prepares a situation where you can make assertions. On the small side, creating a few object instances and gluing them together; on the large side, loading data into a database (and creating the database instance at the same time). Isolation issues such as masking out environment variables and creating temp directories are included in this category, in my opinion.

Outcome decision making – possibly the most obtuse label I’ve ever given this; I’m referring to the process of deciding *what* outcome you wish to have happen. This takes different forms depending on your testing framework. For instance, in Python’s doctest:

>>> x

45

provides a specification – the test framework evaluates x, takes repr(x) just as the interactive interpreter would, and then compares that to the string ‘45’. In pyunit assertions are typically used:

self.assertEqual(45, x)

This will evaluate 45 == x and, if the result is not True, raise an exception indicating a Failure has occurred. Unexpected exceptions cause Errors, and in the most recent pyunit, and some extensions, other exceptions can signal that a test should not be run, or should have failed.
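To make those outcomes concrete, here is a minimal sketch using only the standard library ‘unittest’ module. The class and function names are made up for illustration; the test case is built inside a function so that a test runner scanning the module does not collect the deliberately failing examples.

```python
import unittest


def make_outcome_examples():
    """Build a TestCase with one test per outcome (illustrative names)."""

    class OutcomeExamples(unittest.TestCase):
        def test_success(self):
            x = 45
            self.assertEqual(45, x)  # returns normally: success

        def test_failure(self):
            x = 44
            self.assertEqual(45, x)  # raises AssertionError: a Failure

        def test_error(self):
            raise RuntimeError("unexpected")  # any other exception: an Error

        def test_skip(self):
            self.skipTest("not applicable here")  # raises SkipTest: skipped

    return OutcomeExamples


def run_examples():
    """Run the examples and return the TestResult that collected them."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(make_outcome_examples())
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_examples()
print(result.testsRun, len(result.failures), len(result.errors),
      len(result.skipped))
```

The user code only ever returns or raises; it is the framework that turns each exception type into an outcome on the result object.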

So, those are the three concerns that we have when testing; where should each be expressed (in pyunit)? Pragmatically the test execution code is the hardest to separate out: it’s partly outside of ‘user control’, in that the contract is with the test framework. So let’s start by saying that this core facility, which we should very rarely need to change, should be in TestCase.

That leaves fixture management and outcome decision making. Let’s tackle decision making… if you consider the earlier doctest and assertion examples, I think it’s fairly clear that there are multiple discrete components at play. Two in particular I’d like to highlight are: matching and signalling. In the doctest example the matching is done by string matching – the reference object(s) are stringified and compared to an example the test writer provides. In the pyunit example the matching is done by the __eq__ protocol. The signalling in the doctest example is done inside the test framework (so we don’t see any evidence of it at all). In the pyunit example the signalling is done by the assertion method calling self.fail(), that being the defined contract for causing a failure. Now for a more complex example: testing a float. In doctest:

>>> "%0.3f" % x

'0.123'

In pyunit:

self.assertAlmostEqual(0.123, x, places=3)

This very simple check – that a floating point number is effectively 0.123 – exposes two problems immediately. The first, in doctest, is that literal string comparisons are extremely limited. A regex or other language would be much more powerful (and there are some extensions to doctest; the point remains though – the … operator is not enough). The second problem is in pyunit: the contracts of assertEqual and assertAlmostEqual are different, so you cannot substitute one where the other was expected without partial function application – something that, while powerful, is not the most obvious thing to reach for, or to read in code. The JUnit folk came up with a nice way to address this: they decoupled /matching/ and /deciding/ with a new assertion called ‘assertThat’ and a language for matching, expressed as classes. The initial matcher library, hamcrest, is pretty ugly in Python; I don’t use it because it tries too hard to be ‘English-like’ rather than being honest about being code. (Aside: what would ‘is_()’ in a Python library mean to you? Unless you’ve read the hamcrest code, or are not a Python programmer, you’ll probably get it wrong.) However, the concept is totally sound. So, ‘outcome decision making’ should be done by using a matching language totally separate from testing, and a small bit of glue for your test framework. In ‘testtools’ that glue is ‘assertThat’, and the matching language is a narrow Matcher contract (in testtools.matchers) which I’m going to describe here, in case you cannot or don’t want to use the testtools one.

This permits composition and inheritance within your matching code in a pretty clean way. Using == only permits this if you can simultaneously define an __eq__ for your objects that matches with arbitrary sensitivity (e.g. you might not want to be examining the process_id value for a process a test ran, but do want to check other fields).
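As a rough, standalone sketch of such a contract (not the testtools source): a matcher has a match() method that returns None when the value matches and otherwise a mismatch object with a describe() method, plus a __str__ saying what it matches. The glue function assert_that and the AlmostEquals and MatchesAttributes matchers below are hypothetical names invented for this illustration.

```python
from types import SimpleNamespace


class Mismatch:
    """Describes why a value failed to match."""

    def __init__(self, description):
        self._description = description

    def describe(self):
        return self._description


class Equals:
    """Matches anything that compares equal to the expected value."""

    def __init__(self, expected):
        self.expected = expected

    def __str__(self):
        return "Equals(%r)" % (self.expected,)

    def match(self, actual):
        if self.expected == actual:
            return None  # None means "matched"
        return Mismatch("%r != %r" % (self.expected, actual))


class AlmostEquals:
    """Hypothetical matcher: floats equal once rounded to `places`."""

    def __init__(self, expected, places=7):
        self.expected = expected
        self.places = places

    def __str__(self):
        return "AlmostEquals(%r, places=%d)" % (self.expected, self.places)

    def match(self, actual):
        if round(abs(self.expected - actual), self.places) == 0:
            return None
        return Mismatch("%r is not within %d places of %r"
                        % (actual, self.places, self.expected))


class MatchesAttributes:
    """Hypothetical matcher: compose other matchers over named attributes."""

    def __init__(self, **matchers):
        self.matchers = matchers

    def __str__(self):
        parts = ["%s=%s" % (name, matcher)
                 for name, matcher in sorted(self.matchers.items())]
        return "MatchesAttributes(%s)" % ", ".join(parts)

    def match(self, actual):
        for name, matcher in sorted(self.matchers.items()):
            mismatch = matcher.match(getattr(actual, name))
            if mismatch is not None:
                return Mismatch("%s: %s" % (name, mismatch.describe()))
        return None


def assert_that(matchee, matcher):
    """The glue: signal a failure if the matcher reports a mismatch."""
    mismatch = matcher.match(matchee)
    if mismatch is not None:
        raise AssertionError("%r does not match %s: %s"
                             % (matchee, matcher, mismatch.describe()))


# Check the fields we care about and ignore process_id entirely.
process = SimpleNamespace(process_id=4242, exit_code=0, cpu_seconds=0.1231)
assert_that(process, MatchesAttributes(
    exit_code=Equals(0), cpu_seconds=AlmostEquals(0.123, places=3)))
```

Because Equals and AlmostEquals share one contract, either can be dropped in wherever a matcher is expected, and only assert_that knows how the test framework signals failure.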

Now for fixture management. This one is pretty simple really: stop using setUp (and other similar on-TestCase methods). If you use them, you will end up with a hierarchy like this:

Note that there are some things around that offer this sort of convention already: that’s all it is – convention. Pick one, and run with it. But please don’t use setUp; it was a conflated idea in the first place and is a concrete problem. Something like testresources or testscenarios may fit your needs – if it does, great! However they are not the last word – they aren’t convenient enough to replace just calling a simple helper like I’ve presented here.
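One possible shape for such a helper, as a sketch only: a plain function that takes the running test, builds the fixture, and registers its own teardown with the test’s addCleanup. The helper names use_temp_dir and use_env_var, the EXAMPLE_HOME variable and the example test are all invented for this illustration.

```python
import os
import shutil
import tempfile
import unittest


def use_temp_dir(test):
    """Create a temp directory for `test`; it is removed when the test ends."""
    path = tempfile.mkdtemp()
    test.addCleanup(shutil.rmtree, path)
    return path


def use_env_var(test, name, value):
    """Set an environment variable for `test`, restoring the old state after."""
    old = os.environ.get(name)

    def restore():
        if old is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = old

    test.addCleanup(restore)
    os.environ[name] = value


class ConfigFileTest(unittest.TestCase):
    seen_paths = []  # recorded so the cleanup can be checked afterwards

    def test_writes_into_home(self):
        # Each test asks for exactly the fixtures it needs; no setUp.
        home = use_temp_dir(self)
        use_env_var(self, "EXAMPLE_HOME", home)
        self.seen_paths.append(home)
        target = os.path.join(os.environ["EXAMPLE_HOME"], "config")
        with open(target, "w") as f:
            f.write("key=value\n")
        self.assertTrue(os.path.exists(target))


def run_example():
    """Run the example test and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ConfigFileTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

The helpers live in an ordinary module, so any test class can import the ones it wants without sharing a base class.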

To conclude, the short story is:

use assertThat and have a separate hierarchy of composable matchers

use or create a fixture/resource framework rather than setUp/tearDown

any old TestCase that has the outcomes you want should do at this point (but I love testtools).

Free network services – A discussion session led by Bradley Kuhn, Mako & Matt Lee : Libre.fm encouraged last.fm to write an API so they didn’t need to screen scrape; outcome of the network services story still unknown – netbooks without local productivity apps might now work, most users of network office apps are using them because of collaboration. We have a replacement for twitter – status.net, distributed system, but nothing like facebook [yet?]. Bradley says – like the original GNU problem, just start writing secure peer to peer network services to offer the things that are currently proprietary. There is perhaps a lack of an architectural vision for replacing these proprietary things: folk are asking how we will replace ‘the cloud’ aspects of facebook etc – tagging photos and other stuff around the web, while not using hosted-by-other-people-services. I stopped at this point to switch sessions – the rooms were not in sync session time wise.

Mentoring in free software – Leslie Hawthorne: The projector was not working, so Leslie continued a discussion carried over from the previous talk about the use of sexual themes in promoting projects/talk content and the like. This is almost certainly best covered by watching the video. A few themes from it though:

Anyone considering joining a community is assessing whether that community is ‘people like us’ – and for many people, including both women *and* men, blatant sexuality isn’t something that fits the ‘people like us’ assessment. Note that this is in addition to the offensive and inappropriate aspects of the issue.

The lack of support in the community has for at least one project led to a complete loss of the women contributors to that project – and they are still largely lacking many years later.

We then got Leslie’s actual talk. Sadly I missed the start of it – I was outside organising security guards because we had (and boy it was ironic) a very loud, confrontational guy at the front who was replying to every statement, and the tone in the room had gotten to the point that a fight was brewing.

From where I got back:

Check your tone

help people be productive in your community

cultivate creativity

know yourself

do not get caught up in perfectionism

communicate – both big stuff, but also just take the time to talk – how are you going, etc.

Share your mistakes

Guide don’t order

Recognition = Retention

Recognition = Delegation – it’s OK to let other people be responsible for stuff

http://bit.ly/MentorGuide

http://bit.ly/MentoringArticle

Chris Ball, Hanna Wallach, Erinn Clark and Denise Paolucci – Recruiting/retaining women in free software projects. This is not a problem unique to women – things that make it better for women can also increase the recruitment and retention of men. Make a lack of diversity a bug; provide on-ramps – small easy bugs in the bug tracker (tagged as such); have a dedicated women’s sub-project – and permit [well-behaved] men in there – which helps build connections into the rest of the project. Make it clear that mistakes are OK. On retention… recognise first patches and first commits in newsletters and the like. Call out big things or long-wanted features, crediting the person that helped. Have regular discussion of patches and fixes, rather than just the changelog. CMU did a study on undergrad women’s participation in CS: ‘Lack of confidence precedes lack of interest/participation’. Engagement with what they are doing is a key thing too. ‘Women are consistently undervaluing their worth to the free software community’. ‘It’s the personal touch that seems to make a huge difference’. ‘More projects should do a code of conduct – kudos to Ubuntu for doing it’ – Chris Ball.

I found the mentoring and women-in-free-software talks to have extremely similar themes – which is perhaps confirmation or something – but it wasn’t surprising to me. They were both really good talks though!

And that’s my coverage of LibrePlanet – I’m catching a plane after lunch. It’s a good low-key conference, and well put together.

John Gilmore keynote – What do we do next, having produced a free software system for our computers? Perhaps we should aim at Windows? Wine + an extended ndiswrapper to run other hardware drivers + a better system administration interface/resources/manuals. However that means knowing a lot about Windows internals – something that open source developers don’t seem to want to do. We shouldn’t just carry on tweaking – it’s not inspiring; what’s our stretch goal? Discussion followed – ReactOS, continue integrating software and people with a goal of achieving really close integration: software as a human rights issue! ‘Desktop paradigm needs to be replaced’: need to move away from a document based desktop to a device based desktop. Concern about the goal of running binary drivers for hardware: it encourages manufacturers to sell hardware w/out specs; we shouldn’t encourage the idea that that is OK. Lots of concern about cloning, lots of concern about what will bring more freedom to users, and what it will take to have a compelling vision to inspire 50000 free software hackers. Free software in cars – lots of safety issues in e.g. brake controllers, accelerators.

Eben Moglen – ‘We’re at the inflection point of free software’ – because any large scale global projects these days are not feasible without free software. Claims that doing something that scales from tiny to huge environments requires ‘us’ – a claim I would (sadly) dispute. Lots of incoming and remaining challenges. ‘Entirely clear that the patent system’s relationship to technology is pathological and dangerous’ – that I agree with! Patent muggings are a problem – patent holders are unhappy with patents granted to other people. Patent pools are helping slowly as they grow. Companies which don’t care about the freedom aspect of GPLv3 are adopting it because of the patent protection aspects. The patent system is at the head of the list of causes-of-bad-things affecting free software. SFLC is building coalitions outside the core community to protect the interests of the free software community. We are starting to be taken for granted at the high end of mgmt in companies that build on free software. … We face a problem in the erosion of privacy. We need to build a stack, running on commodity hardware, that runs federated services rather than folk needing centralised services.

Marina Zhurakhinskaya on GNOME Shell: Integrates old and new ideas in an overall comprehensive design. Marina ran through the various goals of the shell – growing with users, being delightful, starting simply so new users are not overwhelmed. The activities screen looks pretty nice. The workspace rearrangement UI is really good. The notifications thing is interesting; you can respond to a chat message in-line in the notification.

Richard Stallman on Software as a Service – he presented verbally the case made in the paper. Some key quotes… “All your data on a server is equivalent to total spyware” – I think this is a worst-case analogy; it suggests that you can never trust another party: kind of a sad state of paranoia to assume that all network servers are always out to get you all the time. And I have to ask – should we get rid of Savannah then (because all the data is stored there)? The argument for why Savannah is not SaaS is not convincing: it’s just file storage, so what makes it different to e.g. Ubuntu One? “If there is a server and only a little bit of it is SaaS, perhaps just say don’t worry about it – because that little bit is often the hardest bit to replace.” “Let’s write systems for collaborative word processing that don’t involve a central server” – abiword w/the sharing plugin? RMS seems to be claiming that someone else sysadmining a server for you is better than someone else sysadmining a time-shared server for you: I don’t actually see the difference, unless you’re also asserting that you’ll always have root over ‘your own machine’. The argument seems very fuzzy and unclear to me as to why there is really a greater risk – in particular when there is a commercial relationship with the operator (as opposed to, say, an advertising supported relationship).

GNU Hackers meetups are a face to face meeting to balance the online collaboration that GNU maintainers and contributors do all the time. These are a recent (since 2007) thing, and are having a positive effect within GNU and the FSF.

The LibrePlanet 2010 GNU Hackers meetup runs concurrent with the first day of LibrePlanet.

We started with some project updates:

SipWitch – a project to do discovery of SIP endpoints and setup encryption etc. This looks quite interesting, and is looking for contributors.

Bazaar – I presented an update on where Bazaar is at and what we’re focusing on now and in the future:

short term: merging and collaboration:

merge behaviour

conflict behaviour

develop a rebase that can combine unrelated branches

looms to be polished, or pipelines extended – something to manage long-standing patches for distributions, or other environments that need long lived patch sets.

long term

continuing optimisation of network and local perf

meta-branch operations – mirror collections of branches,

work with many branches at once (many branches in one dir, à la git, hopefully less confusing)

easier ‘get up and go’ for new contributors

now and forever

keep fostering community growth

we’re aiming for negative bug growth – get on top and stay there

Felipe Sanches presented his list of things that should be on the high priority project list:

accessibility since 1st boot

reconfigurable hardware development (FPGA tools) – this is particularly relevant for handling e.g. wifi cards that have a FPGA in the card, so we can replace the non-free microcode.

nonfree firmware issue

–lunch–

John Eaton on Octave. John described the Octave contributors – 30 or so over the years, and never more than 2 at a time. Matlab, the proprietary product that Octave is very similar to, has 2000 staff working at the company producing it. Users seem to expect the two products to be equivalent, and are disappointed that Octave is less capable, and that the community is not as able to do the sort of support that a commercial organisation might have done. Octave would like to gain some more developers and be able to educate users more effectively – convert more to become developers.

Rob Myers, the chief GNU webmaster gave a description of his role: The webmasters deal with adding new content, dealing with mail to webmaster@, which can be queries for the GNU project, random questions about CDs, and an endless flood of spam. The webmasters project is run as a free software project – the site is in CVS (yes CVS), visible on Savannah. Templates could be made nicer and perhaps move to a CMS.

Aubrey Jaffer on cross platform. There is a thing called Water which is meant to replace all the different languages used in web apps – generates html, css, alters the DOM, does what you’d do with javascript. So there is a Water -> backend translator that outputs Java for servers, C# for windows, and so on. (I think, this wasn’t entirely clear). He went on to talk about many of the internals of a thing called Schlep which is used as a compiler to get scheme code running in C/C#/Java so as to make it available to Water backends in different environments.

Matt Lee spoke about GNU FM – GNU FM is a free ‘last.fm’ site. The site is running at http://libre.fm/. 24ish devs, but stalled after 6 months – what’s next? Matt has started GNU Social to build a communication framework for GNU projects to talk to each other – e.g. for each GNU FM site to communicate on the back end, with a particular focus on doing social functionality – groups, friendships, personal info. The wiki page needs ideas!

GNU advisory board discussion… too much to capture, but it focused on GNU-wide issues – things like how projects get contributors, contributions, coordination. Teams were a big discussion point; bug trackers and how to coordinate teams followed on from that; and there is a ‘GNU Source Release Collection’ project to do coordinated releases of GNU software that are all known to work together.

Dell has been offering Ubuntu on selected models for a while. I had however nearly given up hope on being able to buy one, because they hadn’t started doing that in Australia. I am very glad to see this has changed though – check out their notebook page. Not all models yet, but a reasonable number have Ubuntu as an option.

So, we wanted to move a Hudson CI server at Canonical from using chroots to VMs (for better isolation and security), and there is this great product Ubuntu Enterprise Cloud (UEC – basically Eucalyptus). To do this I needed to make some changes to the Hudson EC2 plugin – and that’s where the fun starts. While I focus on getting Hudson up and running with UEC in this post, folk generally interested in the differences between UEC and EC2, or getting a single-machine UEC instance up for testing should also find this useful.

Firstly, getting a test UEC instance installed was a little tricky – I only had one machine to deploy it on, and this is an unusual configuration. Nicely though, it all worked, once a few initial bugs and misconfiguration items got fixed up. I wrote up the crux of the outcome on the Ubuntu community help wiki. See ‘1 Physical system’. The particular trap to watch out for seems to be that this configuration is not well tested, so the installation scripts have a hard time getting it right. I haven’t tried to make it play nice with Network Manager in the loop, but I’m pretty sure that that can be done via interface aliasing or something similar.

Secondly I needed to find out what was different between EC2 and UEC (Note that I was running on Karmic (Ubuntu 9.10) – so things could be different in Lucid). I couldn’t find a simple description of this, so this list may be incomplete:

So the next step then is to modify the Hudson EC2 plugin to support these differences. Fortunately it is in Java, and the Java community has already updated the various libraries (jets3t and typica) to support UEC – I just needed to write a UI for the differences and pass the info down the various code paths. Kohsuke has let me land this now even though it has an average UI (in rev 27366), and I’m going to make the UI better now by consolidating all the little aspects into a couple of URLs. Folk comfortable with building their own .hpi can get this now by svn updating and rebuilding the ec2 plugin. We’ve also filed another bug asking for a single API call to establish the endpoints, so that it’s even easier for users to set this up.

Finally, and this isn’t a UEC difference, I needed to modify the Hudson EC2 plugin to work with the ubuntu user rather than root, as Ubuntu AMIs ship with root disabled (as all Ubuntu installs do). I chose to have Hudson re-enable root, rather than making everything work without root, because the current code paths assume they can scp things as root, so this was less disruptive.

With all that done, it’s now possible to configure up a Hudson instance testing via UEC nodes. Here’s how:

Install UEC and make sure you can run up instances using euca-run-instances, ssh into them and that networking works for you. Make sure you have installed at least one image (EMI aka AMI) to run tests on. I used the vanilla in-store UEC Karmic images.

Install Hudson and the EC2 plugin (you’ll need to build your own until a new release (1.6) is made).

Go to /configure and near the bottom click on ‘Add a new cloud’ and choose Amazon EC2.

Look in ~/.euca/eucarc, or in the zip file that the UEC admin web page lets you download, to get at your credentials. Fill in the Access Key and Secret Access Key fields accordingly. You can put in the private key (UEC holds onto the public half) that you want to use, or (once the connection is fully set up) use the ‘Generate Key’ button to have a dedicated Hudson key created. I like to use one that I can ssh into to look at a live node – YMMV. (Or you could add a user and as many keys as you want in the init script – more on that in a second).

Click on Advanced, this will give you a bunch of details like ‘EC2 Endpoint hostname’. Fill these out.

Sensible values for a default UEC install are: 8773 for both ports, /services/Eucalyptus and /services/Walrus for the base URLs, and SSL turned off. (Note that the online help tells you this as well).

Set an instance cap, unless you truly have unlimited machines. E.g. 5, to run 5 VMs at a time.

Click on ‘Test Connection’ – it should pretty much instantly say ‘Success’.

That’s the Cloud itself configured; now we configure the VMs that Hudson is willing to start. Click on ‘Add’ right above the ‘List of AMIs to be launched as slaves’ text.

Fill out the AMI with your EMI – e.g. emi-E027107D is the Ubuntu 9.10 image I used.

For remote FS root, just put /hudson or something, unless you have a preseeded area (e.g. with a shared bzr repo or something) inside your image.

For description describe the intent of the image – e.g. ‘DB test environment’

For the labels put one or more tags that you will use to tell test jobs they should run on this instance. They can be the same as labels on physical machines – it will act as an overflow buffer. If no physical machines exist, a VM will be spawned when needed. For testing I put ‘euca’

For the init script, it’s a little more complex. You need to set up Java so that Hudson itself can run:

For number of executors, you are essentially choosing the number of CPUs that the instance will request. E.g. putting 20 will ask for an extra-large high-cpu model machine when it deploys. This will then show up as 20 workers on the same machine.

Click save

Now, when you add a job, a new option in the job configuration will appear – ‘tie this job to a node’. Select one of the label(s) you put in for the AMI, and running the job will cause that instance to start up if it’s not already available.

Note that Hudson will try to use java from s3 if you don’t install it, but that won’t work right for a few reasons – I’ll be filing an issue in the Hudson tracker about it, as that’s a bit of unusual structure in the existing code that I’m happier leaving well enough alone.
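For reference, the default endpoint values given in the steps above combine into full URLs as in this small sketch. The function name and the uec.example.com hostname are placeholders invented for illustration; the port and paths are the defaults quoted earlier.

```python
def uec_endpoints(host, port=8773, use_ssl=False):
    """Build the EC2 and S3 (Walrus) endpoint URLs for a default UEC install."""
    scheme = "https" if use_ssl else "http"
    base = "%s://%s:%d" % (scheme, host, port)
    return {
        "ec2": base + "/services/Eucalyptus",
        "s3": base + "/services/Walrus",
    }


# Placeholder hostname; substitute the address of your own cloud controller.
endpoints = uec_endpoints("uec.example.com")
print(endpoints["ec2"])
print(endpoints["s3"])
```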

Looking at using google apps for my home email, as I want to be able to have my home machines totally turned off from time to time.

Found this interesting gem in the sign up agreement (which I have not yet agreed to):

11. PR. Customer agrees not to issue any public announcement regarding the existence or content of this Agreement without Google’s prior written approval. Google may (i) include Customer’s Brand Features in presentations, marketing materials, and customer lists (which includes, without limitation, customer lists posted on Google’s web sites and screen shots of Customer’s implementation of the Service) and (ii) issue a public announcement regarding the existence or content of this Agreement. Upon Customer’s request, Google will furnish Customer with a sample of such usage or announcement.

This is rather asymmetrical: If I agree to the sign up page, I cannot say ‘I am using google apps’, but google can say ‘Robert is using google apps’. While I can appreciate not wanting to be dissed on if something goes wrong, this is very much not open! A couple of implications: Everyone seeking support for google apps in the apps forums is probably in violation of the sign up agreement; we can assume that anyone having a terrible experience has been squelched under this agreement.

Scott recently noted that we don’t have Klingon available in Ubuntu. Klingon is available in ISO 639, so adding it should be straightforward.

Last time I blogged about this three packages needed changing, as well as Launchpad needing a translation team for the language. The situation is a little better now: only two packages need changing as gdm now dynamically looks for languages based on installed locales.

Secondly, langpack-locales has to change for two reasons. Firstly, a locale definition has to be added (locales define a place: a language plus locale information like days of the week, phone number formatting, etc.). Secondly, the language needs to be added to the SUPPORTED list in that package, so that language packs are generated from Launchpad translations.

Now, gdm autodetects, but it turns out that only ‘complete’ locales were being shown. And that on Ubuntu, this was not looking at language pack directories, rather at

/usr/share/locale

which langpack-built packages do not install translations into. So it could be a bit random about whether a language shows up in gdm. Martin Pitt has kindly turned on the ‘with-incomplete-locales’ configure flag to gdm, and this will permit less completely translated locales to show up (when their langpack is installed – without the langpack nothing will show up).