Ok, so micro rant time: this is the effect of not taking things upstream: hardware doesn’t work Out Of The Box.

Very briefly, I purchased a Vodafone prepaid mobile broadband package today, which comes with a modem and SIM. The modem is a K3571-Z, and Ubuntu *thinks* it knows how they work (it doesn’t). So it fails to connect in NetworkManager with a rather opaque ‘NO CARRIER’ message.

Thanks to excellent assistance from ﻿﻿Matt Trudel, we tracked this down to a theory that perhaps modemmanager is using the wrong serial port – and voila, it is. From there, the config file (/lib/udev/rules.d/77-mm-zte-port-types.rules) was an obvious next step – and indeed there is no entry in there for the 19d2:1010 – the K3571-Z. Google found one immediately though, on a Vodafone research site.

The awful shame is this: that was committed to the bcm project in March this year. If Vodafone had shipped off a patch to modemmanager, we could have had that in 10.10, and possibly even in 10.04. There are plenty of users having trouble on Whirlpool etc with this model who would have had a better experience – helping Vodafone’s users be happier.

All it would have taken is an email

I’m sure Vodafone want a great experience for their users, but I think they’re failing to separate out platform improvements – share and share alike, and branding / custom facilities. The net impact is harmful, not helpful.

Tesetrepository has a really nice workflow for fixing a set of failing tests:

Tell it about the failing tests (e.g. by doing a full test run, or running a single known failing test)

Run just the known failing tests (testr run –failing)

Make a change

Goto step 2

As you fix up the tests testr will just give your test runner a smaller and smaller list of tests to run.

However I haven’t been able to use that feature when developing (most) Python programs.

Today though, I added the necessary support to testtools, and as a result subunit (which inherits its thin test runner shim from testtools) now supports –load-list. With this a simple .testr.conf can support this lovely workflow. This is the one used in testrepository itself: it runs the testrepository tests, which are regular unittest tests, using subunit.run – this gives it subunit output, and tells testrepository how to run a subset of tests.
[DEFAULT]
test_command=python -m subunit.run $IDOPTION testrepository.tests.test_suite
test_id_option=--load-list $IDFILE

So a while back I blogged about maintainable test suites. One of the things I’ve been doing since is fiddling with the heart of the fixtures concept.

To refresh your memory, I’m defining fixture as some basic state you want to reach as part of doing a test. For instance, when you’ve mocked out 2 system calls in preparation for some test code – that represent a state you want to reach. When you’ve loaded sample data into a database before running the actual code you want to make assertions about – that also represents a state you want to reach. So does simply combining three or four objects so you can run some code.

Now, there are existing frameworks in python for this sort of thing. testresources and testscenarios both go some way towards this (and I and to blame for them ), so does the zope testrunner with layers, and the testfixtures project has some lovely stuff as well. And this is without even mentioning py.test!

There are a few things that you need from the point of view of running a test and establishing this state:

You need to be able to describe the state (e.g. using python code) that you wish to achieve.

The test framework needs to be able to put that state into place when running the test. (And not before because that might interfere with other tests)

And the state needs to be able to be cleaned up.

Large test suites or test suites dealing with various sorts of external facilities will also often want to optimise this process and put the same state into place for many tests. The (and I’m not exaggerating) terrible setUpClass and setUpModule and other similar helpers are often abused for this.

Why are they terrible? They are terrible because they are fragile; there is no (defined in the contract) way to check that the state is valid for the next test, and its common to see false passes and false failures in tests using setUpClass and similar.

So we also need some way to reuse such expensive things while still having a way to check that test isolation hasn’t been compromised.

Having looked around, I’ve come to the conclusion we’ll all benefit if there is a single core protocol for doing these things, something that can be used and built on in many different ways for many different purposes. There was nothing (that I found) that actually met all these requires and was also tasteful enough that folk might really like using it.

I give you ‘fixtures‘. Or on Launchpad. This small API is intended to be a common contract that all sorts of different higher level test libraries can build on. As such it has little to no policy or syntatic sugar.

It does have a nice core, integration with pyunit.TestCase, and I’m going to add a library of useful generic fixtures (like temporary directories, environment isolators and so on) to it. I’d be delighted to add more committers to the project, and intend to have it be both Python 2.x and 3.x compatible (if its not already – my CI machine isn’t back online after the move yet, I’m short of round tuits).

I do hope that the major frameworks (nose, py.test, unittest2, twisted) will include the useFixture glue themselves shortly; I will offer it as a patch to the code after giving it some time to settle. Further possibilities include declared fixtures for tests, and we should be able to make setUpClass better by letting fixtures installed during it get reset between tests.

I recently moved withing Canonical from being a paid developer of Bazaar to take on a larger challenge Technical Architect for Launchpad. Its been two months now, and its time to put my head up out of the coal face, have a look around and regroup.

When I worked on Bazaar, every day when I started work got up I was working on a tool anyone can use, designed for collaboration upon sourcecode, for people writing software. This is a toolchain component right at the heart of the free software world. Bazaar and tools like it get used everyday to manage, distribute and collaborate on the sourcecode that makes up the components of Ubuntu, Debian, Fedora and so forth. Every time someone new starts using Bazaar for a new free or open source project, well I felt happy – happy that in my small part I’m helping with this revolution we’re carrying out.

Launchpad is pretty similar to Bazaar in some ways. Obviously they are both free software, both are written in Python, and both are sponsored by Canonical, my employer. And they both are designed to assist in collaboration and communication between free software developers – albeit in rather different ways.

Bazaar is a tool anyone can install locally, run as a command line, GUI, or local webserver, and share code either centrally (e.g. by pushing to Launchpad), or in a peer to peer fashion, acting as their own server.

Launchpad, by contrast is a website which (usually) folk will use as a service – in their browser, from the comand line – FTP (for package building), ssh (for Bazaar branch pushing or pulling), or even local GUI programs using the Launchpad API service. This makes it more approachable for first time collaborators, but its less able to be used offline, and it has all the usual caveats of web sites : it needs a username and password, it’s availability depends on the operators – on the team I’m part of. So there’s a lot less room for error: if we do something wrong, the system is unavailable, and users can’t just ‘apt-get install’ an older release.

With Launchpad our goal is to to get all the infrastructure that open source need out of the way, so that they can focus on their code, collaboration within their team – and almost uniquely – collaboration with other teams. As well as being open source, Launchpad is free for all open source projects to use – Ubuntu is our single biggest user – they use it for all bugtracking, translation and package building, and have a hugefraction of the total storage overhead in the database.

Launchpad is a pretty nice system, so people use it, and as a result (on a technical basis) it is suffering from its own success: small corner cases in the code turn up every day or two, code written years ago to deal with a relatively small data set now has to deal with data sets a thousand or more times larger (one table, for instance, has over 600,000,000 rows in it.

For the last two months then, I’ve been working on Launchpad. As Technical Architect, I need to ensure that the things that we (users, stakeholders and developers of Launchpad) want to do are supported by the structure of the system : the platform(s) we’re building on, the way we approach problems, coding standards and diagnostic tools. That sounds pretty dry and hands off, but I’m finding its actually very balanced. I wrote a presentation when I started the job, which encapsulated the challenges I saw in front of the team on this purely technical front, and what I thought I needed to do.

I think I was about right in my expectations: On a typical day, I’ll be hands on in a problem helping get it diagnosed, talking long term structural changes with someone around how to make things more efficient / flexible / maintainable, and writing a small patch here or there to help move things along.

In the two months since I took on this challenge, we’ve made significant headway on the problem of performance for Launchpad : many inefficient code paths have been identified and removed, some new infrastructure has been created as is being rolled out to make individual pages faster, and we’ve massively increased the diagnostic data we get when things go wrong. We’ve introduced facilities for responding more rapidly to issues in the software (but they have to be rolled out across the system) and I hope, over the next 4 months we’ll reach the first of my performance goals: for any webpage in Launchpad, it will complete rendering in 99% of the time. (Note that we already meet this goal if you measure the whole system, but this is biased by some pages being very frequently hit and also being very small).

In their post the author notes that subunit is not easy_installable at the moment. It will be shortly. Thanks to Tres Seaver there is a setup.py for the python component of Subunit, and he has offered to maintain that going forward. His patch is in trunk, and the next release will include a pypi upload.

The next subunit release should be pretty soon too – the unicode support in testtools has been overhauled thanks to Martin[gz], and so we’re in much better shape on Python 2.x than we were before. Python3 for testtools is trouble free in this area because confused strings don’t exist there

There’s a test code maintenance issue I’ve been grappling with, and watching others grapple with for a while now. I’ve blogged about some infrastructural things related to it before, but now I think its time to talk about the problem itself. The problem shows up as soon as you start writing setUp functions, or custom assertThing functions. And the problem is – where do you put this code?

If you have a single TestCase, its easy. But as soon as you have two test classes it becomes more difficult. If you choose either class, the other class cannot use your setUp or assertion code. If you create a base class for your tests and put the code there you end up with a huge base class, and every test paying the total overhead of your test needs, rather than just the overhead needed to test the particular system you want to test. Or with a large and growing list of assertions most of which are irrelevant for most tests.

The reason the choices have to be made is because test code is just code; and all the normal issues there – separation of concerns, composition often being better than inheritance, do-one-thing-well – all apply to our test code. These issues are exacerbated by pyunit (that is the Python ‘unittest’ module included with the standard library and extended by various projects)

Lets look some (some) of the concerns involved in a test environment: Test execution, fixture management, outcome decision making. I’m using slightly abstract terms here because I don’t want to bind the discussion down to an existing implementation. However the down side is that I need to define these terms a little.

Test execution – by this I mean the basic machinery of running a single test: the test framework calling into user code and receiving back an outcome with details. E.g. in pyunit your test_method() code is called, success is determined by it returning successfully, and other outcomes by raising specific exceptions. Other languages without exceptions might do this returning an outcome object, or passing some object into the user code to be called by the test.

Fixture management – the non trivial code that prepares a situation where you can make assertions. On the small side, creating a few object instances and glueing them together, on the large end, loading data into a database (and creating the database instance at the same time). Isolation issues such as masking out environment variables and creating temp directories are included in this category in my opinion.

Outcome decision making – possibly the most obtuse label I’ve ever given this, I’m referring the process of deciding *what* outcome you wish to have happen. This takes different forms depending on your testing framework. For instance, in Python’s doctest:

>>> x

45

provides a specification – the test framework calls str(x) and then compares that to the string ’45′. In pyunit assertions are typically used:

self.assertEqual(45, x)

Will call 45 == x and if the result is not True, raise an exception indicating a Failure has occured. Unexpected exceptions cause Errors, and in the most recent pyunit, and some extensions, other exceptions can signal that a test should not be run, or should have failed.

So, those are the three concerns that we have when testing; where should each be expressed (in pyunit)? Pragmatically the test execution code is the hardest to separate out: Its partly outside of ‘user control’, in that the contract is with the test framework. So lets start by saying that this core facility, which we should very rarely need to change, should be in TestCase.

That leaves fixture management and outcome decision making. Lets tackle decision making… if you consider the earlier doctest and assertion examples, I think its fairly clear that there are multiple discrete components at play. Two in particular I’d like to highlight are: matching and signalling. In the doctest example the matching is done by string matching – the reference object(s) are stringified and compared to an example the test writer provides. In the pyunit example the matching is done by the __eq__ protocol. The signalling in the doctest example is done inside the test framework (so we don’t see any evidence of it at all). In the pyunit example the signalling is done by the assertion method calling self.fail(), that being the defined contract for causing a failure. Now for a more complex example: testing a float. In doctest:

>>> “%0.3f” % x

0.123

In pyunit:

self.assertAlmostEqual(0.123, x, places=3)

This very simple check – that a floating point number is effectively 0.123 exposes two problems immediately. The first, in doctest, is that literal string comparisons are extremely limited. A regex or other language would be much more powerful (and there are some extensions to doctest; the point remains though – the … operator is not enough). The second problem is in pyunit. It is that the contract of assertEqual and assertAlmostEqual are different: you cannot substitute one in where the other was expected without partial function application – something that while powerful is not the most obvious thing to reach for, or to read in code. The JUnit folk came up with a nice way to address this: they decoupled /matching/ and /deciding/ with a new assertion called ‘assertThat’ and a language for matching – expressed as classes. The initial matcher library, hamcrest, is pretty ugly in Python; I don’t use it because it tries too hard to be ‘english like’ rather than being honest about being code. (Aside, what would ‘is_()’ in a python library mean to you? Unless you’ve read the hamcrest code, or are not a Python programmer, you’ll probably get it wrong. However the concept is totally sound. So, ‘outcome decision making’ should be done by using a matching language totally seperate from testing, and a small bit of glue for your test framework. In ‘testtools’ that glue is ‘assertThat’, and the matching language is a narrow Matcher contract (in testtools.matchers) which I’m going to describe here, in case you cannot or don’t want to use the testtools one.

This permits composition and inheritance within your matching code in a pretty clean way. Using == only permits this if you can simultaneously define an __eq__ for your objects that matches with arbitrarily sensitivity (e.g. you might not want to be examining the process_id value for a process a test ran, but do want to check other fields).

Now for fixture management. This one is pretty simple really: stop using setUp (and other similar on-TestCase methods). If you use them, you will end up with a hierarchy like this:

Note that there are some things around that offer this sort of convention already: thats all it is – convention. Pick one, and run with it. But please don’t use setUp; it was a conflated idea in the first place and is a concrete problem. Something like testresources or testscenarios may fit your needs – if it does, great! However they are not the last word – they aren’t convenient enough to replace just calling a simple helper like I’ve presented here.

To conclude, the short story is:

use assertThat and have a seperate hierarchy of composable matchers

use or create a fixture/resouce framework rather than setUp/tearDown

any old TestCase that has the outcomes you want should do at this point (but I love testtools).

Free network services – A discussion session led by Bradley Kuhn, Mako & Matt Lee : Libre.fm encouraged last.fm to write an API so they didn’t need to screen scrape; outcome of the network services story still unknown – netbooks without local productivity apps might now work, most users of network office apps are using them because of collaboration. We have a replacement for twitter – status.net, distributed system, but nothing like facebook [yet?]. Bradley says – like the original GNU problem, just start writing secure peer to peer network services to offer the things that are currently proprietary. There is perhaps a lack of an architectural vision for replacing these proprietary things: folk are asking how we will replace ‘the cloud’ aspects of facebook etc – tagging photos and other stuff around the web, while not using hosted-by-other-people-services. I stopped at this point to switch sessions – the rooms were not in sync session time wise.

Mentoring in free software – Leslie Hawthorne: Projector not working, so Leslie carried on a discussion carried on from the previous talk about the use of sexual themes in promoting projects/talk content and the like. This is almost certainly best covered by watching the video. A few themes from it though:

for anyone considering joining a community, they are assessing whether that community is ‘people like us’ – and for many people, including both women *and* men, blatant sexuality, isn’t something that fits the ‘people like us’ assessment. Note that this is in addition to offensive and inappropriate aspects of the issue.

The lack of support in the community has for at least one project led to a complete loss of the women contributors to that project – and they are still largely lacking many years later.

We then got Leslies actual talk. Sadly I missed the start of it – I was outside organising security guards because we had (and boy it was ironic) a very loud, confrontational guy at the front who was replying to every statement and the tone in the room had gotten to the point that a fight was brewing.

From where I got back:

Check your tone

help people be productive in your community

cultivate creativity

know yourself

do not get caught up in perfectionism

communicate – both big stuff, but also just take the time to talk – how are you going, etc.

Share your mistakes

Guide don’t order

Recognition = Retention

Recognition = Delegation – its ok to let other people be responsible for stuff

http://bit.ly/MentorGuide

http://bit.ly/MentoringArticle

Chris Ball, Hanna Wallach, Erinn Clark and Denise Paolucci — Recruiting/retaining women in free software projects. Not a unique problem to women – things that make it better for women can also increase the recruitment and retention of men. Make a lack of diversity a bug; provide onramps – small easy bugs in the bug tracker (tagged as such), have a dedicated womens sub project – and permit [well behaved ] men in there – helps build connections into the rest of the project. Make it clear that mistakes are ok. On retention… recognise first patches, first commits in newsletters and the like. Call out big things or long wanted features – by the person that helped. Regular discussion of patches and fixes – rather than just the changelog. CMU did a study on undergrad women participation in CS : ‘Lack of confidence preceeds lack of interest/partipation’. Engagement with what they are doing is a key thing too. ‘Women are consistently undervaluing their worth to the free software community’. ‘Its the personal touch that seems to make a huge difference’. ‘More projects should do a code of conduct – kudos to Ubuntu for doing it’ — Chris Ball.

I found the mentoring and women-in-free-software talks to have extremely similar themes – which is perhaps confirmation or something – but it wasn’t surprising to me. They were both really good talks though!

And thats my coverage of LibrePlanet – I’m catching a plane after lunch . Its a good low-key conference, and well put together.

John Gilmore keynote – What do we do next, having produced a free software system for our computers? Perhaps we should aim at Windows? Wine + an extended ndiswrapper to run other hardware drivers + a better system administration interface/resources/manuals. However that means knowing a lot about windows internals – something that open source developers don’t seem to want to do. We shouldn’t just carry on tweaking – its not inspiring; whats our stretch goal? Discussion followed – reactos, continue integrating software and people with a goal of achieving really close integration: software as human rights issue! ‘Desktop paradigm needs to be replaced’ : need to move away from a document based desktop to a device based desktop. Concern about the goal of running binary drivers for hardware: encourages manufacturers to sell hardware w/out specs; we shouldn’t encourage the idea that that is ok. Lots of concern about cloning, lots of concern about what will bring more freedom to users, and what it will take to have a compelling vision to inspire 50000 free software hackers. Free software in cars – lots of safety issues in .e.g brake controllers, accelerators.

Eben Moglen – ‘We’re at the inflection point of free software’ – because any large scale global projects these days are not feasible without free software. Claims that doing something that scales from tiny to huge environment requires ‘us’ — A claim I would (sadly) dispute. Lots of incoming and remaining challenges. ‘Entirely clear that the patent systems relationship to technology is pathological and dangerous’ – that I agree with! Patent muggings are a problem – patent holders are unhappy with patents granted to other people . Patent pools are helping slowly as they grow. Companies which don’t care about the freedom aspect of GPLv3 are adopting it because of the patent protection aspects. Patent system is at the head of the list of causes-of-bad-things affecting free software. SFLC is building coalitions outside the core community to protect the interests of the free software community. We are starting to be taken for granted at the high end of mgmt in companies that build on free software. … We face a problem in the erosion of privacy. We need to build a stack, running on commodity hardware that runs federated services rather than folk needing centralised services.

Marina Zhurakhinskaya on GNOME Shell: Integrates old and new ideas in an overall comprehensive design. Marina ran through the various goals of the shell – growing with users, being delightful, starting simply so new users are not overwhelmed. The activities screen looks pretty nice The workspace rearrangement UI is really good. The notifications thing is interesting; you can respond to a chat message in-line in the notification.

Richard Stallman on Software as a Service – he presented verbally the case made in the paper. Some key quotes… “All your data on a server is equivalent to total spyware” – I think this is a worst-case analogy; it suggests that you can never trust another party: kindof a sad state of paranoia to assume that all network servers are always out to get you all the time. And I have to ask – should we get rid of Savannah then (because all the data is stored there) – the argument for why Savannah is not SaaS is not convincing: its just file storage, so what makes it different to e.g. Ubuntu One? “If there is a server and only a little bit of it is SaaS, perhaps just say don’t worry about it – because that little bit is often the hardest bit to replace.” ”Lets write systems for collaborative word process that don’t involve a central server” — abiword w/the sharing plugin ? RMS seems to be claiming that someone else sysadmining a server for you is better than someone else sysadmining a time-shared server for you: I don’t actually see the difference, unless you’re also asserting that you’ll always have root over your’ own machine’. The argument seems very fuzzy and unclear to me as to why there is really a greater risk – in particular when there is a commercial relationship with the operator (as opposed to, say, an advertising supported relationship).