Scott recently noted that we don’t have Klingon available in Ubuntu. Klingon is available in ISO 639, so adding it should be straightforward.

Last time I blogged about this, three packages needed changing, and Launchpad needed a translation team for the language. The situation is a little better now: only two packages need changing, as gdm now dynamically looks for languages based on installed locales.

Secondly, langpack-locales has to change for two reasons. Firstly a locale definition has to be added (locales define a place – a language and locale information like days of the week, phone number formatting etc). Secondly the language needs to be added to the SUPPORTED list in that package, so that language packs are generated from Launchpad translations.

Now, gdm autodetects, but it turns out that only ‘complete’ locales were being shown, and that on Ubuntu it was not looking at the language pack directories but rather at

/usr/share/locale

which langpack-built packages do not install translations into. So it could be a bit random about whether a language shows up in gdm. Martin Pitt has kindly turned on the ‘with-incomplete-locales’ configure flag to gdm, and this will permit less completely translated locales to show up (when their langpack is installed – without the langpack nothing will show up).
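To see why a language can go missing, here is a rough Python sketch (not gdm's actual detection code) that compares the locale directories found in one tree against another. The langpack location /usr/share/locale-langpack is my assumption about where the language packs install to; adjust it for your system.

```python
import os


def languages_in(directory):
    """Return the set of sub-directory names (one per locale) in directory."""
    if not os.path.isdir(directory):
        return set()
    return {name for name in os.listdir(directory)
            if os.path.isdir(os.path.join(directory, name))}


def hidden_languages(scanned_dir, langpack_dir):
    """Locales present in langpack_dir that a scan of scanned_dir alone misses."""
    return languages_in(langpack_dir) - languages_in(scanned_dir)


if __name__ == "__main__":
    # The second path is an assumed langpack location, not taken from gdm.
    print(sorted(hidden_languages("/usr/share/locale",
                                  "/usr/share/locale-langpack")))
```

Anything this prints is a language with translations installed that a scan of /usr/share/locale alone would never see.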

Jeremy Allison on ‘The elephant in the room – free software and microsoft’. While he works at Google, this talk was ‘off the leash’ – not about Google. As usual, grab the video. We should care about Microsoft because Microsoft’s business model depends on a monopoly [the desktop]. Microsoft are very interested in ‘Open Source’ – Apache, MIT, BSD licenced software – but the GPL is intolerable to them. Jeremy models Microsoft as a collection of warring tribes that hate each other… e.g. Word vs Excel.

The first attack was on protocols – make the protocols more complex and sophisticated. MS have done this on Kerberos, DCE/RPC, HTTP, and higher up the stack via MSIE rendering modes, ActiveX plugins, Silverlight… The EU case was brought about this in the ‘Workgroup Server Market’. MS were fined 1 Billion Euros and forced to document their proprietary protocols.

OOXML showed up rampant corruption in the ISO Standards process – but it got through even though it was a battle against nearly everyone! On the good side it resulted in an investigation into MS dominance in file formats -> MS implemented ODF and MS have had to document their old formats.

All of these things are long term failures for MS… so what next? Patents. Patents are GPL incompatible, but fine with BSD/MIT. The TomTom case is the first direct attack using MS’s patent portfolio. This undermines all the outreach work done by the MS Open Source team – who Jeremy tells us are true believers in open source, trying to change MS from the inside. Look for MS pushing RAND patented standards: such things lock us out.

Netbooks are identified as a key point for MS to fight on – lose that and the desktop position is massively weakened.

We should:

Keep creating free software and content *under a copyleft license*.

Keep pressure on Governments and organisations to adopt open standards and investigate monopolies.

Lobby against software patents.

Search for prior art on relevant patents and destroy them.

Working for a corporation is a moral choice: respectfully call out MS employees.

Paul McKenney did another RCU talk – and as always it was interesting… Optimisation Gone Bad (RCU in Linux 1993-2008). Linux 2.6 -rt patch made RCU much much much more complex with atomic operations, memory barriers, frequent cache misses, and since then it was slowly being whittled back, but there is now a new simpler RCU based around the concept of doing the accounting during context switches & tracking running tasks.

Glyn Moody – Hackers at the end of the world. Rebel Code is now 10 years old… 50+ interviews over a year – and could be considered an archaeology now. I probably haven’t done the keynote justice – it was excellent but high density – you should watch it online.

Glyn talks about open access – various examples like the Public Library of Science (and how the scientific magazine business made 30%-40% profit margins). The Human Genome Project & the ‘Bermuda Principles’: public submission of annotated sequences. In 2000 Celera were going to patent the entire human genome. Jim Kent spent 3 weeks writing a program to join together the sequenced fragments on a cluster of 100 PCs with 800MHz Pentium processors. This was put into the public domain just before Celera completed their processing – and by that action Celera were prevented from patenting *us*.

Openness as a concept is increasing within the scientific community – open access to results, open data, open science (the full process). An interesting aspect to it is ‘open notebook science’ – daily writeups, not peer reviewed: ‘release early, release often’ for science.

Glyn ties together the scientific culture (all science is open to some degree) and artistic culture (artists share and build on / reference each other’s work) by talking about a lag between the free software and free content worlds. In 1999 Larry Lessig set up ‘Copyright’s Commons’ built around an idea of ‘counter-copyright’ – copyleft for non-code. This didn’t really fly, and Creative Commons was set up 2 years later.

Wikipedia and newer sharing concepts like twitter/facebook etc are discussed. But… what about the real world: transparency and governments, or companies? They are opening up.

However, data release != control release. And there are challenges we all need to face:

Global Financial Crisis: “my gain is your loss”. Very opaque system.

Global Environmental Crisis: “my gain is our loss”.

Glyn argues we need a different approach to economic governance: the commons. 2009 Nobel laureate for Economic Sciences – Elinor Ostrom – work on commons and their management via user associations… which is what we do in open source!

Pandora-build. There for support – I’ve contributed patches. Pandora is a set of additional glue and layers to improve autotools and make it easier to work with things like gettext and gnulib, turn on better build flags and so forth. If you’re using autotools it’s well worth watching this talk – or hop on #drizzle and chat to mtaylor.

The open source database survey talk from Selena was really interesting – a useful way of categorising databases and a list of which db’s turned up in which category, e.g. high availability, community development model etc. Key takeaway: there is no one-true-db.

I gave my subunit talk in the early afternoon, reasonably well received I think, though I wish I had been less sick last week: I would have loved to have made the talk more polished.

Ceph seems to be coming along gangbusters. I really think it would be great to use for our bzr hosting backend. 0.19 will stabilise the disk format! However we might not be willing to risk btrfs yet.

Another must-grab-the-video talk: Mako’s keynote. Antifeatures; principles vs pragmatism do come together. The principled side – RMS & the FSF – holds that it is important to control one’s technology because it’s important to control one’s life. The pragmatic side – quality, no vendor lock-in etc. This is a false dichotomy: freedom imparts pragmatic benefits even though it doesn’t intrinsically impart quality or good design: 95% of projects have 5 or fewer contributors; the median number of contributors is 1, and such small collaborations are no different than a closed source one.

Definition of antifeatures – functionality built to make a product do something one does not want it to do. Great example of phone books: spammers pay for access to the lists, and thus we have to pay *not to be listed*, but it’s actually harder to list and print our numbers in the first place. Mako makes a lovely analogy to the mafia there. Similarly with Sony charging 50 dollars not to install trialware on Windows laptops in the past.

Cameras: Canon cameras disabled RAW saving… CHDK, an open source addon for the camera, outputs RAW again. Panasonic are locking down their cameras to reject third party batteries.

The tivo is an example of how focusing on licensing can miss the big picture: free stack, but still locked into a subscription to get an ongoing revenue stream.

Dongles! Mako claimed there wasn’t a facebook appreciation group for dongles… there is.

Github: paying for the billing model – lots of code there to figure out how many projects in a repo, so that they can charge on that basis.

DRM is the ‘mother of all antifeatures’ – 10K people writing DRM code that no users want!

Gabriella Coleman’s keynote was really good; grab it from the videos once they come online.

WETA run Ubuntu for their render farm: 3700 machines, 35000 cores, 7kW per ‘cold’ rack and 22kW per ‘hot’ rack. (Hot racks are rendering, cold racks are storage). Wow. Another talk well worth watching if you are at all interested in the issues related to running large numbers of very active machines in a small space.

And a classic thing from the samba4 talk at the start of the afternoon: MS AD domain controllers do no validation of updates from other domain controllers: classic crunchy surface security. (Discovered by samba4 borking AD while testing r/w replica mode).

Blu-ray on Linux is getting there, however one sad thing is that the Blu-ray standard has no requirement that vendors make players be able to play un-encrypted content – and there are some hints that in fact licences may require them to not play un-encrypted content.

Peter Chubb’s talk on Articulate was excellent for music geeks: MIDI that sounds like music from LilyPond.

Ben Balbo talked about ‘Roll your own dropbox’. Ben works at a multimedia agency, but the staff work locally and don’t use the file server… they use instant messenger to send files around! They tried using subversion… too hard. Dropbox looked good but at 3-7 hundred a month was too pricey given an existing 1.4TB of spare capacity.

He then considered svn + cron but deleted directories cause havoc & something automatic was wanted… so git + cron instead. Key thing used in doing this was having a work area with absolutely no metadata. Conflicts dealt with by filename.conflict.DATESTAMP.HOSTNAME.origextention
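The conflict naming is easy to picture with a few lines of Python. This is my own sketch of the pattern, not Ben's code: the function name and the datestamp format are assumptions, and only the overall shape (filename.conflict.DATESTAMP.HOSTNAME plus the original extension) comes from the talk.

```python
import os
import socket
from datetime import datetime


def conflict_name(path, when=None, hostname=None):
    """Build a conflict copy name: stem.conflict.DATESTAMP.HOSTNAME + extension."""
    when = when or datetime.now()
    hostname = hostname or socket.gethostname()
    # splitext keeps the leading dot on the extension, or returns "" if none.
    stem, ext = os.path.splitext(path)
    # The datestamp layout is an assumption; the talk did not specify one.
    stamp = when.strftime("%Y%m%d%H%M%S")
    return f"{stem}.conflict.{stamp}.{hostname}{ext}"
```

Keeping the original extension at the end means the conflicted copy still opens in the right application.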

It doesn’t trigger off inotify, has no status bar widget, and is single user only at the moment, but it was written to meet the office needs so is sufficient. Interestingly he hadn’t looked at e.g. iFolder.

For a while now I’ve been using subunit as part of my regular development workflow. I would pipe test results to a file, use subunit to report on failures from that file, and be able to inspect all the failures at my leisure without rerunning tests or copy and pasting from far back in my history.

However this is a bit ad hoc, and it’s not trivial to get good pipelines together – while it’s not hard, it’s not obvious either. And commands like tee are less readily available for Windows users.
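For the curious, the tee part of that pipeline can be approximated in portable Python. This is only a sketch of the ad hoc workflow, not what testrepository does internally; run_and_save and its parameters are names I made up for illustration.

```python
import subprocess
import sys


def run_and_save(command, log_path, echo=None):
    """Run command, copying its stdout to log_path and to echo (a binary stream).

    Returns the command's exit code. echo defaults to our own stdout.
    """
    if echo is None:
        echo = getattr(sys.stdout, "buffer", None)
    with open(log_path, "wb") as log:
        proc = subprocess.Popen(command, stdout=subprocess.PIPE)
        for line in proc.stdout:      # read the child's output line by line
            log.write(line)           # keep a copy for later inspection
            if echo is not None:
                echo.write(line)      # and still show it as it arrives
        proc.stdout.close()
        if echo is not None:
            echo.flush()
        return proc.wait()
```

Something like run_and_save([sys.executable, "-m", "subunit.run", "mypackage.tests"], "results.subunit") would then save a stream for later reporting, where mypackage.tests is a placeholder for your own test module.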

So during my holidays I started a small project to automate this workflow. I didn’t get all that much done due to a combination of travel and coming down with a nasty bug near the end of my holidays – which I’m now recovering from. Yay health returning + medicines. If only we had medichines.

However, I managed to get a reasonable first release out the door this evening. Grab it from launchpad or pypi.

Testrepository has a few deps – all listed in INSTALL.txt. Folk on Ubuntu Lucid should be able to just apt-get them all (sudo apt-get install subunit will be enough to run testrepository). If you’re not on Lucid you can grab the debs manually, or use the subunit ppa (sudo add-apt-repository ppa:subunit), though I’ve noticed just today that the karmic subunit build there only works with python 2.5, not the default of 2.6 – I will fix that at some point.

The actual subunit streams are stored in .testrepository in sequentially numbered files (for now at least). So it’s very easy to get at them (for instance, subunit-stats < .testrepository/12).
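Because the runs are just numbered files, finding the most recent one from a script is simple. A small sketch, assuming only the layout described above (numbered files, with anything non-numeric ignored); latest_run_path is a helper name of my own, not part of testrepository.

```python
import os


def latest_run_path(repo_dir=".testrepository"):
    """Return the path of the highest-numbered run file, or None if there is none."""
    numbered = [name for name in os.listdir(repo_dir)
                if name.isascii() and name.isdigit()]
    if not numbered:
        return None
    # Compare numerically so that run 10 sorts after run 9.
    return os.path.join(repo_dir, max(numbered, key=int))
```

The numeric comparison matters: a plain string sort would put 9 after 10.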

If you are not using python, you can still use subunit easily if you are using shunit, ‘check’ or ‘cppunit’. subunit ships with bindings for shunit and cppunit, and check uses libsubunit with the CK_SUBUNIT output mode. TAP users can use tap2subunit to get a subunit stream from a TAP based testsuite.

It’s still early days but I’m finding this much nicer than the adhoc subunit management I was doing before.

Evolution recently moved to a sqlite summary db rather than a custom summary db implementation. It’s great to see such reuse of code.

However, it’s not really a complete transition yet, as I’ve had cause to find out today. I’ve blogged before about performance with the sqlite summary database. Today I was greeted with a crash-on-startup bug which happily has a patch upstream already. Before I looked in the bug tracker though, I did some house cleaning.

I started with a 900MB folders.db. Doing a vacuum on the db dropped that to 300MB. It doesn’t appear to be something that evolution does itself. Firefox too appears to lack an automatic vacuum. sqlite is an embedded database, and it’s wonderful at doing that, but it’s not as install-and-forget as (say) PostgreSQL which does autovacuum. So an additional tip is to vacuum your folders, e.g. with http://www.gnome.org/~sragavan/evolution-rebuild-summarydb, a helper script that will run vacuum on all your account summary db’s. Note that it *does not rebuild*, it solely vacuums, and as such does not add data to or delete data from (modulo bugs in sqlite) the summary database.

After the housecleaning, I checked that the sqlite database was in good condition:

sqlite3 folders.db
pragma integrity_check;

This returned a number of indexing issues, so I reindexed:

reindex;
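The same housekeeping can be scripted with Python's sqlite3 module. This is a sketch of the steps I ran by hand, not anything Evolution ships; check_and_repair is my own name for it. Close Evolution and back up the file before pointing it at a real folders.db.

```python
import sqlite3


def check_and_repair(db_path):
    """Run integrity_check, reindex if it complains, then vacuum.

    Returns the messages from the initial integrity check (["ok"] when healthy).
    """
    # isolation_level=None gives autocommit; VACUUM cannot run in a transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        problems = [row[0] for row in conn.execute("PRAGMA integrity_check")]
        if problems != ["ok"]:
            conn.execute("REINDEX")   # rebuild every index from table data
        conn.execute("VACUUM")        # rewrite the file, reclaiming free pages
        return problems
    finally:
        conn.close()
```

Reindexing only helps with index problems; anything else integrity_check reports needs a closer look.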

Evolution now starts up and crashes in a fraction of a second - a big improvement. Finally, I started looking at the evolution code as I now was fairly confident it was a bug - it was in a sqlite callback function - and the column the function extracts data from (flags) is missing a NOT NULL constraint, but the code doesn't check for NULL - boom. From there to finding the bug report and existing patch was trivial.

And this is where my comment on reliability turns up: Evolution doesn't anticipate NULL flag values in its code, so why does it insert them into the database at all? I suspect it's due to some aspect of the incremental conversion to using sqlite summaries. More concerning for me is the possibility that there are many other such crash bugs lurking in the new sqlite based code.
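The failure mode is easy to reproduce in miniature. The sketch below uses Python and made-up table names (loose, strict) rather than Evolution's real C code or schema; only the flags column name comes from the bug. It shows that a column without NOT NULL happily stores NULL, that the constraint rejects it at insert time, and that a reader can otherwise defend itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: one without the constraint, one with it.
conn.execute("CREATE TABLE loose (uid TEXT, flags INTEGER)")
conn.execute("CREATE TABLE strict (uid TEXT, flags INTEGER NOT NULL)")

# Without NOT NULL the bad row is accepted silently.
conn.execute("INSERT INTO loose VALUES ('msg1', NULL)")


def insert_strict(uid, flags):
    """Try to insert into the constrained table; report whether it was accepted."""
    try:
        conn.execute("INSERT INTO strict VALUES (?, ?)", (uid, flags))
        return True
    except sqlite3.IntegrityError:
        return False


def is_flag_set(flags, bit):
    """Treat a NULL (None) flags value as 'no flags set' instead of crashing."""
    return bool((flags or 0) & bit)
```

Either half would have avoided the crash: the constraint stops the bad row going in, the check stops it blowing up on the way out.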

There are possibly some clues as to the excessive table scans done by evolution in the use of a flags bitset rather than separate columns, but I haven't looked close enough to really say.