There you are, in some Openstreetmap editor, correcting the same typo for the 16th time, cursing contributors who neglect correct capitalization and thinking about how tedious this necessary data gardening is. While JOSM is endowed with unfathomable depths of cartographic potentiality, you long for a way to simply whip out your favourite text editor and apply its familiar power to the pedestrian problem of repeatedly editing text. Or the problem requires editing multiple mutually dependent tags and some XML-aware logic is therefore required – all the same: you just want to perform Openstreetmap editing as text processing.

Of course, as an experienced Openstreetmap gardener, you are well aware of the dangers of casually wielding a rather large chainsaw around our burgeoning yet fragile data nursery. So you understand why automated processing is generally not conducive to improvement in data quality – rare is the automation whose grasp of context equals human judgment. But human judgment could use some power tools… So there.

The meticulous reader might object that making the review an explicit step separate from the editing is superfluous, since no self-respecting cartographer would commit edited data without having performed a review as a mandatory step integral to editing. But the reader who closely observes Openstreetmap activity might counter that this level of self-disciplined care might not be universal, so the step is worth mentioning. Moreover, I’ll add that as soon as any level of automation is introduced, I consider the review a necessary checklist item.

So, first let’s get the data ! There are many ways… The normal JOSM way of course – but your mass editing requirement probably means that you wish to edit a body of data much larger than what the Openstreetmap servers will let JOSM download at once – and, if you ever had to repeatedly download rectangles until you had covered your whole working area, you don’t want to do it again.

Yes, I didn’t take relations into account – there are only a couple of amenity=place_of_worship relations in Senegal’s Openstreetmap data… So adding relations to this query is left as an exercise for the reader.

A gigabyte download and a couple of minutes of osmosis execution later, your data is ready and you have found new appreciation of how fast Overpass Turbo is. Our Osmosis computation might have been a little faster if there was a Senegal planet extract available, but we had to contend with taking the whole of Africa as an input and filtering it through a bounding box.

By the way, the dedicated reader who assiduously tries to reproduce my work might notice that the two methods don’t return the same data. This is because the Overpass Turbo query filters properly by intersection with Senegal’s national borders whereas my Osmosis command uses a rectangular bounding box that includes bits of Mauritania, Mali, Guinea and Guinea Bissau. One can feed Osmosis a polygon produced out of the national borders relations, but I have not bothered with that.
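The difference between the two filters can be sketched in a few lines of Python. This is only an illustration: the triangle below is a made-up toy polygon, not Senegal’s border, and the function names are my own. It shows how a point can fall inside a rectangular bounding box while lying outside the polygon that the box encloses, which is how bits of neighbouring countries end up in the Osmosis output.

```python
def in_bbox(lon, lat, left, bottom, right, top):
    """Rectangular filter: true for any point inside the box."""
    return left <= lon <= right and bottom <= lat <= top


def in_polygon(lon, lat, polygon):
    """Ray-casting test: true if the point lies inside the polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly joined to the first one.
    """
    inside = False
    count = len(polygon)
    for i in range(count):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % count]
        # Only edges that straddle the point's latitude can be crossed
        # by a horizontal ray going east from the point.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical toy "country": a triangle and its bounding box.
toy_border = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
toy_bbox = (0.0, 0.0, 10.0, 10.0)

# This point is kept by the bounding box but rejected by the polygon.
neighbour = (8.0, 8.0)
kept_by_bbox = in_bbox(*neighbour, *toy_bbox)
kept_by_polygon = in_polygon(*neighbour, toy_border)
```
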

The first two example OSM XML elements will not come as a surprise: they both contain <tag k="amenity" v="place_of_worship"/> – but what about the third, which does not ? Take a look at its node id – you’ll find it referred to by one of the first example’s <nd /> elements, which means that this node is one of the six that compose this way. Including nodes used by the ways selected by the query is the role of the --used-node option in the osmosis command.

But anyway, why are we including nodes used by the ways selected by the query ? In the present use-case, I only care about correcting trivial naming errors – so why should I care about the way’s geometry ? Well… Remember the step “3 – Review data” ? Thanks to being able to represent the way geometrically, I can visually find that an English-language name may not be an error because the node that bears it is located in Guinea Bissau, not in Senegal where it would definitely be an error outside of the name:en tag. Lacking this information I would have erroneously translated the name into French. Actually, I first did and only corrected my error after having reviewed my data in JOSM – lesson learned !

Talking about reviewing, is your selection of data correct ? Again, one way to find out is to load it in JOSM to check tags and geographic positions.

And while in JOSM, you might also want to refresh your data – it might have become stale while you were mucking around with osmosis (do you really think I got the query right the first time ?) and Geofabrik’s Planet extracts are only daily anyway… So hit Ctrl-U to update your data – and then save the file.

This concludes step “1 – Get data” – let’s move on to step ‘2 – Edit data’ ! First, do not edit the file you just saved: we will need it later to determine what we have modified. So produce a copy, which is what we’ll edit – execute ‘cp senegal-place_of_worship.osm senegal-place_of_worship.mod.osm’ for example.

Now take your favourite text processing device and go at the data ! I used Vim – here is what it looks like:

Let’s open the file in JOSM and upload this awesome edit to Openstreetmap – here we go !

“Whaaat ? No changes to upload ? But where are my edits ? You said we could just edit and upload !” – No, and anyway I said that you had to review your data beforehand !

Fear not, your edits are safe (if you saved them before closing your editor…) – it is only JOSM that does not know which objects you edited. Looking at the above data, it has no way to determine if any part has been edited. We’ll have to tell it !

“Whaaat ? Do I really have to write or copy/paste action="modify" on the parent Openstreetmap object of every single modification ? You said this article was about automation !” – fear not, I have you covered with this article’s crowning achievement: the OSMXML_mark_modified_JOSM-style script.

Remember when I said earlier “First, do not edit the file you just saved: we will need it later to determine what we have modified. So produce a copy, which is what we’ll edit – ‘cp senegal-place_of_worship.osm senegal-place_of_worship.mod.osm’ ” ? We are now later and the OSMXML_mark_modified_JOSM-style script will not only determine what we have modified but also mark the parent Openstreetmap object of each modification with an action="modify" attribute.
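For readers who want to see the idea in miniature, here is a rough Python sketch of the same principle. It is not the OSMXML_mark_modified_JOSM-style script, which is written in Perl with XML::LibXML; the function names are mine and the logic is a simplification. It only compares objects that exist in both files, so it ignores created or deleted objects, and it treats a mere reordering of tags as a modification.

```python
import xml.etree.ElementTree as ET

OSM_OBJECTS = ("node", "way", "relation")


def fingerprint(element):
    """Hashable summary of an OSM object: its attributes and its children.

    The action attribute is ignored so that an already marked object
    compares equal to its unmarked original.
    """
    attributes = tuple(sorted(
        (key, value) for key, value in element.attrib.items()
        if key != "action"
    ))
    # Children are kept in document order: the order of <nd> matters.
    children = tuple(
        (child.tag, tuple(sorted(child.attrib.items())))
        for child in element
    )
    return attributes, children


def mark_modified(original_root, edited_root):
    """Set action="modify" on every edited object that differs from the
    original one with the same type and id. Returns the number marked."""
    original = {
        (element.tag, element.get("id")): fingerprint(element)
        for element in original_root
        if element.tag in OSM_OBJECTS
    }
    marked = 0
    for element in edited_root:
        if element.tag not in OSM_OBJECTS:
            continue
        before = original.get((element.tag, element.get("id")))
        if before is not None and before != fingerprint(element):
            element.set("action", "modify")
            marked += 1
    return marked


# Possible usage with the files from this article (the output file
# name is a hypothetical choice):
# original = ET.parse("senegal-place_of_worship.osm")
# edited = ET.parse("senegal-place_of_worship.mod.osm")
# mark_modified(original.getroot(), edited.getroot())
# edited.write("senegal-place_of_worship.marked.osm",
#              encoding="UTF-8", xml_declaration=True)
```

Whatever tool does the marking, the result still has to go through the review step in JOSM before any upload.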

Now open the modified file in JOSM and review the result. As I mention in passing in the script’s comments: BLOODY SERIOUSLY REVIEW YOUR CONTRIBUTION IN JOSM BEFORE UPLOADING OR THE OPENSTREETMAP USERS WILL COME TO EAT YOU ALIVE IN YOUR SLEEP ! Seriously though, take care : mindless automatons that trample the daisies are a grievous Openstreetmap faux pas. The Automated Edits code of conduct is mandatory reading.

Ok, I guess you got the message – you can now upload to Openstreetmap:

If you spent too long editing, you might encounter conflicts. Carefully resolve them without stepping on anyone’s toes… And enjoy the map !

Incidentally, this is my first time using XML::LibXML and actually understanding what I’m doing – I love it and there will be more of that !

You have a nice amplifier in a corner of the livingroom, tethered to nice loudspeakers. Alas the spot where you want to control the music from is far away – maybe in another corner of the livingroom, maybe in your office room or maybe on another continent… No problem – for about 40€ we’ll let you switch your music to this remote destination just as easily as switching between the headphones and speakers directly connected to your workstation. The average narcissistic audiophile pays more than that for an RCA cable.

The first step is to set it up with an operating system. Since I love Debian, I chose Raspbian. A handy way to install Raspbian quick & easy is raspbian-ua-netinst, a minimal Raspbian unattended installer that gets its packages from the online repositories – it produces a very clean minimal setup out of the box.

So, go to raspbian-ua-netinst’s latest release page and download the .img.xz file – then put it on the SD card using ‘xzcat /path/to/raspbian-ua-netinst-<latest-version-number>.img.xz > /dev/sdX’ for which you may have to ‘apt-get install xz-utils’. Stick that card into the Raspberry Pi, connect the Raspberry Pi to Ethernet on a segment where it can reach the Internet after having been allocated its parameters through DHCP or NDP+RDNSS – and power it up.

Let raspbian-ua-netinst do its thing for about 15 minutes – time enough to find what IP address your Raspberry Pi uses (look at your DHCP server’s leases or just ‘nmap -sP’ the whole IP subnet to find a Raspberry Pi). Then log in over ssh – default root password is raspbian… Use ‘passwd’ to change it right now.

The default install is quite nice, but strangely doesn’t include a couple of important features… So ‘apt-get install raspi-copies-and-fills rng-tools’ – raspi-copies-and-fills improves memory management performance by using a memcpy/memset implementation optimised for the ARM11 used in Raspberry Pi, rng-tools lets your system use the hardware random number generator for better cryptographic performance. To finish setting up the hardware RNG, add bcm2708-rng to /etc/modules.

Also, the default install at the time of this writing uses Debian Wheezy, which contains a Pulseaudio version too old for our purposes – we need Debian Jessie which offers Pulseaudio 5 instead of Pulseaudio 2. And anyway, Jessie is just plain better – so let your /etc/apt/sources.list look like this:

Then ‘apt-get update && apt-get -y dist-upgrade && apt-get -y autoremove’… This should take a while.

Now install Pulseaudio, the piece of software that will offer a receiving end to your network audio stream: ‘apt-get install pulseaudio’. I assume you have Pulseaudio already set up on the emitter station – your favourite distribution’s default should do fine, as long as it provides Pulseaudio version 5 (use ‘pulseaudio --version’ to check that).

Pulseaudio is primarily designed to cater to desktop usage by integrating with the interactive session of a logged in user – typically under control of the session manager of whatever graphical desktop environment. But we don’t need such a complex thing here – a dumb receptor devoid of any extra baggage is what we want. For this we’ll use Pulseaudio’s system mode. Pulseaudio’s documentation repeatedly hammers that running in system mode is a bad idea – “nobody should run it that way, with the exception of very few cases”… Well – here is one of those very few cases.

In their zeal to discourage anyone from running Pulseaudio in system mode, the Pulseaudio maintainers do not ship any startup script in the distribution packages – this ensures that users who don’t know what they are doing don’t stray off the beaten path of orthodox desktop usage and end up on a forum complaining that Pulseaudio doesn’t work. But it also annoys the other users, who actually need Pulseaudio to run at system startup – but that is easily fixable thanks to another creation of Lennart’s gang: all we need is a single file called a systemd unit… I copied one from this guy who also plays with Pulseaudio network streaming (but in a different way – more on that later). This systemd unit was written for Fedora, but it works just as well for Raspbian… Copy this and paste it in /etc/systemd/system/pulseaudio.service :

Then ‘systemctl enable pulseaudio’ and ‘systemctl start pulseaudio’ – you now have a properly set up Pulseaudio daemon. Now is a good time to take a moment to consider how much more tedious the writing of a SysVinit script would have been compared to just dropping this systemd unit in place.

Now let’s see the meat of this article: the actual setup of the audio stream. If you stumbled upon this article, you might have read other methods to the same goal, such as this one or this one. They rely on the server advertising its Pulseaudio network endpoint through Avahi’s multicast DNS using the module-zeroconf-publish Pulseaudio module, which lets the client discover its presence so that the user can select it as an audio destination after having told paprefs that remote Pulseaudio devices should be available locally. In theory it works well and it probably works well in practice for many people, but Avahi’s behaviour may be moody – or, in technical terms, subject to various network interferences that you may or may not be able to debug easily… Struggling with it led me to finding an alternative. By the way, Avahi is another one of Lennart’s babies – so that might be a factor in Pulseaudio’s strong inclination towards integrating with it.

Discoverability is nice in a dynamic environment but, in spite of my five daughters, my apartment is not that dynamic – my office and the livingroom amplifier won’t be moving anytime soon. So why complicate the system with Avahi ? Can’t we just have a static configuration by declaring a hardcoded link once and for all ? Yes we can, with module-tunnel-sink-new & module-tunnel-source-new !

Module-tunnel-sink-new and module-tunnel-source-new are the reason why we require Pulseaudio 5 – they appeared in this version. They are a reimplementation of module-tunnel-sink, using libpulse instead of reinventing the wheel by using their own implementation of the Pulseaudio protocol. At some point in the future, they will lose their -new suffix and officially replace module-tunnel-{sink,source} – at that moment your setup may break until you rename them in your /etc/pulse configuration to module-tunnel-sink and module-tunnel-source… But that is far in the future – for today it is all about module-tunnel-sink-new and module-tunnel-source-new !

Now let’s configure this ! Remember that we configured the Raspberry Pi’s Pulseaudio daemon in system mode ? That means the relevant configuration is in /etc/pulse/system.pa (not in /etc/pulse/default.pa – leave it alone, it is for the desktop users). So add those two load-module lines to the bottom of /etc/pulse/system.pa – the first one to declare the IP addresses authorized to use the service, the second one to declare the IP address of the client that will use it… Yes – it is a bit redundant, but that is the way (two single load-module lines – don’t mind the spurious carriage return caused by this blog’s insufficient width):

It is possible to authenticate the client more strictly using a cookie file, but for my domestic purposes I decided that identification by IP address is enough – and let’s leave some leeway for my daughters to have fun discovering that, spoofing it and streaming crap to the livingroom.

Also, as some of you may have noticed, this works with IPv6, but it works well with legacy IPv4 too – in which case the address must not be enclosed in brackets.

Then on the client side, add this single load-module line to the bottom of /etc/pulse/default.pa (not in /etc/pulse/system.pa – leave it alone, it is for headless endpoints and your client side is most probably an interactive X session). This is one single load-module line – don’t mind the spurious carriage return caused by this blog’s insufficient width:

Actually I didn’t use sink_name, but I understand you might want to designate your network sink with a friendly nickname rather than an IPv6 address – though why would anyone not find those lovely IPv6 addresses friendly ?

Anyway, log out of your X session, log back in and you’re in business… You have a new output device waiting for you in the Pulseaudio volume control:

So now, while one of your sound applications (such as the sweet Clementine music player pictured here) plays, you can switch it to the remote device:

That’s all folks – it just works !

While you are joyously listening to remote music, let’s have a word about sound quality. Like any sound circuit integrated on a motherboard where it cohabits with a wild bunch of RF emitters, the Raspberry Pi’s sound is bad. The Model B+ claims “better audio – the audio circuit incorporates a dedicated low-noise power supply” but actual testing shows that it is just as bad and sometimes even worse. So I did what I nowadays always do to get decent sound: use a cheap sound adapter on a USB dongle, in the present case a ‘Creative Sound Blaster X-FI Go Pro’ which at 30€ gets you a great bang for the buck.

Upon reboot after upgrading yet another Debian host to sweet Jessie, I was dismayed to lose connectivity – a slight annoyance when administering through the Internet. Later, with screen & keyboard attached to the server, I found that the Intel Ethernet interface using the e1000e module had not come up on boot… A simple ‘ip link set eth0 up’ fixed that… Until the next reboot.

/etc/network/interfaces was still the same as before upgrade, complete with the necessary ‘auto eth0’ line present before the ‘iface eth0 inet static’ line. And everything was fine once the interface had been set up manually.

As the author remarked: “The hardware has a set limit on supported maximum frame size (9018), and with the addition of the VLAN_HLEN (4) in calculating the header size (now it is 22) , the max configurable MTU is now 8996”.
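The arithmetic behind that remark is easy to check. In the sketch below, the totals (9018, 4, 22 and 8996) come from the quote; the split of the remaining 18 bytes into a 14-byte Ethernet header and a 4-byte frame check sequence is my assumption, based on the usual Ethernet constants, and the function name is mine.

```python
# Values taken from the quote above.
HARDWARE_MAX_FRAME = 9018
VLAN_HLEN = 4

# Assumed breakdown of the remaining 18 bytes of overhead.
ETH_HLEN = 14      # destination MAC + source MAC + EtherType
ETH_FCS_LEN = 4    # frame check sequence


def max_mtu(max_frame, count_vlan_header):
    """Largest MTU whose full frame still fits in max_frame bytes."""
    overhead = ETH_HLEN + ETH_FCS_LEN
    if count_vlan_header:
        overhead += VLAN_HLEN
    return max_frame - overhead


mtu_limit_before = max_mtu(HARDWARE_MAX_FRAME, count_vlan_header=False)
mtu_limit_after = max_mtu(HARDWARE_MAX_FRAME, count_vlan_header=True)

# An interface configured with the classic jumbo MTU of 9000 fits
# under the old limit but exceeds the new one.
jumbo_fits_before = 9000 <= mtu_limit_before
jumbo_fits_after = 9000 <= mtu_limit_after
```
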

I just wanted to create an Apache virtual host responding to queries only over IPv6. That should have been most trivial considering that I had already been running a dual-stacked server, with all services accessible over both IPv4 and IPv6.

Following the established IPv4 practice, I set upon configuring the virtual host to respond only to queries directed to a specific IPv6 address. That is done by inserting the address in the opening of the VirtualHost stanza : <VirtualHost [2001:470:1f13:a4a::1]:80> – same as an IPv4 configuration, but with brackets around the address. It is simple and after adding an AAAA record for the name of the virtual host, it works as expected.
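The bracket rule is simple enough to automate if you generate such stanzas from a script. Here is a small Python sketch, using only the standard ipaddress module; the function name is hypothetical, and 192.0.2.1 is a documentation address picked purely for illustration.

```python
import ipaddress


def virtualhost_opening(address, port):
    """Build the opening line of a VirtualHost stanza.

    IPv6 addresses are wrapped in brackets so that the colons of the
    address cannot be confused with the port separator; IPv4 addresses
    are used as they are.
    """
    ip = ipaddress.ip_address(address)
    host = "[{}]".format(ip) if ip.version == 6 else str(ip)
    return "<VirtualHost {}:{}>".format(host, port)


ipv6_opening = virtualhost_opening("2001:470:1f13:a4a::1", 80)
ipv4_opening = virtualhost_opening("192.0.2.1", 80)
```
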

I should rather say it works even better than expected : all sub-domains of the second-level domain I’m using for this virtual host are now serving the same content that the new IPv6-only virtual host is supposed to serve… Ungood – cue SMS and mail from pissed-off users and a speedy rollback of the changes; the joys of cowboy administration in a tiny community-run host with no testing environment. As usual, I am not the first user to fall into the trap. Why Apache behaves that way with an IPv6-only virtual host is beyond my comprehension for now.

And now the IPv6-only virtual host serves as designed and the other virtual hosts are not disturbed. The world is peaceful and harmonious again – except maybe for that ugly post-up declaration in lieu of declaring an aliased interface the way the Unix gods intended.

All that just for creating an IPv6 virtual host… Systems administration or sleep ? Systems administration is more fun !

When I set up an Ubuntu host, I can’t help feeling like I’m installing some piece of proprietary software. Of course that is not the case : Ubuntu is (mostly) free software and as controversial as Canonical’s ambitions, inclusion of non-free software or commercial services may be, no one can deny its significant contributions to the advancement of free software – making it palatable to the desktop mass market not being the least… I’m thankful for all the free software converts that saw the light thanks to Ubuntu. But nevertheless, in spite of all the Ubuntu community outreach propaganda and the involvement of many volunteers, I’m not feeling the love.

It may just be that I have not myself taken the steps to contribute to Ubuntu – my own fault in a way. But as I have not contributed anything to Debian either, aside from supporting my fellow users, religiously reporting bugs and spreading the gospel, I still feel like I’m part of it. When I install Debian, I have a sense of using a system that I really own and control. It is not a matter of tools – Ubuntu is still essentially Debian and it features most of the tools I’m familiar with… So what is it ? Is it an entirely subjective feeling with no basis in consensual reality ?

Again, I’m pretty sure that Mark Shuttleworth means well and there is no denying his personal commitment, but the way the whole Canonical/Ubuntu apparatus communicates is arguably top-down enough to make some of us feel uneasy and prefer going elsewhere. This may be a side effect of trying hard to show the polished face of a heavily marketed product – and thus alienating a market segment from whose point of view the feel of a reassuringly corporate packaging is a turn-off rather than a selling point.

Surely there is more to it than the few feelings I’m attempting to express… But anyway – when I use Debian I feel like I’m going home.

And before you mention I’m overly critical of Ubuntu, just wait until you hear my feelings about Android… Community – what community ?

If you want to skip the making-of story, you can go straight to the laconica2IRC.pl script download. Or in case anyone is interested, here is the why and how…

Some of my best friends are die-hard IRC users that make a point of not touching anything remotely looking like a social networking web site, especially if anyone has ever hinted that it could be tagged as “Web 2.0” (whatever that means). As much as I enjoy hanging out with them in our favorite IRC channel, conversations there are sporadic. Most of the time, that club house increasingly looks like an asynchronous forum for short updates posted infrequently on a synchronous medium… Did I just describe microblogging ? Indeed it is a very similar use case, if not the same. And I don’t want to choose between talking to my close accomplices and opening up to the wider world. So I still want to hang out in IRC for a nice chat from time to time, but while I’m out broadcasting dents I want my paranoid autistic friends to get them too. To satisfy that need, I need to have my IRC voice say my dents on the old boys channel.

The data source could be an OpenMicroblogging endpoint, but being lazy I found a far easier solution : use Laconi.ca’s Web feeds. Such a solution looked easier because there are already heaps of code out there for consuming Web feeds, and it was highly likely that I would find one I could bend into doing my bidding.

To talk on IRC, I had previously had the opportunity to peruse the Net::IRC library with great satisfaction – so it was an obvious choice. In addition, in spite of being quite incompetent with it, I appreciate Perl and I was looking for an excuse to hack something with it.

With knowledge of the input, the output and the technology I wanted to use, I could start implementing. Being lazy and incompetent, I of course turned to Google to provide me with reusable code that would spare me building the script from the ground up. My laziness was of course quick to be rewarded as I found rssbot.pl by Peter Baudis in the public domain. That script fetches an RSS feed and says the new items in an IRC channel. It was very close to what I wanted to do, and it had no exotic dependencies – only the Net::IRC library (alias libnet-irc-perl in Debian) and XML::RSS (alias libxml-rss-perl in Debian).

So I set upon hacking this script into the shape I wanted. I added IRC password authentication (courtesy of Net::IRC), I commented out a string sanitation loop which I did not understand and whose presence caused the script to malfunction, I pruned out the Laconi.ca user name and extraneous punctuation to have my IRC user “say” my own Identi.ca entries just as if I was typing them myself, and after a few months of testing I finally added an option for @replies filtering so that my IRC buddies are not annoyed by the noise of remote conversations.
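The pruning and filtering steps are simple string operations. The Python sketch below is not the laconica2IRC.pl code, which is Perl; it assumes that each feed item title looks like “username: text of the dent”, which is my recollection of the feed format rather than something stated here, and all names in it are hypothetical.

```python
def prune_username(title, username):
    """Strip a leading 'username:' prefix so that the text reads as if
    the IRC user had typed it directly."""
    prefix = username + ":"
    if title.startswith(prefix):
        title = title[len(prefix):]
    return title.strip()


def is_reply(text):
    """A dent that starts with @ is part of a remote conversation."""
    return text.startswith("@")


def lines_to_say(titles, username, skip_replies=True):
    """Turn feed item titles into the lines to say on the channel."""
    lines = []
    for title in titles:
        text = prune_username(title, username)
        if skip_replies and is_reply(text):
            continue
        lines.append(text)
    return lines


# Made-up sample items for a hypothetical user called "alice".
sample_titles = [
    "alice: Upgrading the server tonight",
    "alice: @bob thanks for the tip",
]
channel_lines = lines_to_say(sample_titles, "alice")
```
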

I wanted my own IRC user to “say” the output, and that part was very easy because I use Bip, an IRC proxy which supports multiple clients on one IRC server connection. This script was just going to be another client, and that is why I added password authentication. Bip is available in Debian and is very handy : I usually have an IRC client at home, one in the office, occasionally a CGI-IRC, rarely a mobile client and now this script – and to the dwellers of my favorite IRC channel there is no way to tell which one is talking. And whichever client I choose, I never miss anything thanks to logging and replay on login. Screen with a command-line IRC client provides part of this functionality, but the zero maintenance Bip does so much more and is so reliable that one has to wonder if my friends cling to Irssi and Screen out of sheer traditionalism.

All that remained to do was to launch the script in a sane way. To control this sort of simple and permanently executed piece of code and keep it from misbehaving, Daemon is a good way. Available in Debian, Daemon proved its worth when the RSS file went missing during the Identi.ca upgrade and the script crashed every time it tried to access it for lack of exception catching. Had I simply put it in an infinite loop, it would have hogged significant resources just by running in circles like a headless chicken. Daemon not only restarted it after each crash, but also killed it after a set number of retries in a set duration – thus preventing any interference with the rest of what runs on our server. Here is the Daemon launch command that I have used :

And that’s it… Less cut and paste from Identi.ca to my favorite IRC channel, and my IRC friends who have not yet adopted microblogging don’t feel left out of my updates anymore. And I can still jump into IRC from time to time for a real time chat. I have the best of both worlds – what more could I ask ?

The PGP web of trust is no longer the only application that supports a social graph. With the recent mainstream explosion of social networking and digital identity applications, there is an embarrassing wealth of choices such as Google’s OpenSocial specification that propose a common set of APIs for social applications across multiple sites. Social networking in a web environment, including all forms of publication such as blogging, microblogging, forums and anything else that supports links, is a way to build digital identity. Each person that follows your updates or links to your articles is in effect vouching for the authenticity of your persona, and each one who adds you as a “friend” on a social network is an even stronger vote toward the authenticity of your profile, even if some people add any comer as their “friend”.

The vetting process in social networking applications is in effect just as good as the average key signing outside of a proper key signing process : some will actually check who they are vetting, others will happily sign anything – and it does not matter too much because the whole point of the web of trust is to handle a continuous fabric whose nodes have different reputations and no guarantee of reliability. The result is a weak form of pseudonymous web of trust – just like the PGP web of trust. But with an untrusted technological infrastructure, it is only about strong enough for common social use.

An anaemic GPG web of trust and thriving social networking applications are obvious matches. So what about a social networking application that handles the PGP web of trust ? As usual, similar inputs through similar individuals generate similar outputs – the same problems with the same environment and the same tools handled by people who share backgrounds produce the same conclusions. So now that I am trawling search engines about that concept I find that I am not the only one to have thought about it. Who will be the first to develop a social networking application plug-in that links a profile to a GPG key to facilitate and encourage key signing between members of the same platform that know each other ?

So I decided to give it a spin. ‘apt-cache search ack’ matched 15971 packages related to crack, backup, pack, slack and whatever else you can imagine, but an ‘apt-cache search ack | grep -E ^ack’ solved that problem at once.

In Debian, there was already a kanji code converter named “ack” – but since we don’t intend, as far as I know, to write Kanji anytime soon I thought it was reasonable to alias ack='ack-grep' so that our shell takes advantage of the short name.

After toying with it a little, it appears that Ack is indeed a handy tool which I’ll definitely be using in the future. There is a performance tradeoff when operating with the C locale, but since we strive to be an all UTF-8 shop I don’t care much about that. And anyway, in most interactive situations, brain usage and finger dexterity are far more precious resources than run time or CPU time…

We never asked for a manual rebuild of that RAID array so I started thinking I was on to something interesting. But ever suspicious of easy leads I went checking for some automated actions. Indeed that was a false alarm : I found that a Debian Cron script packaged with mdadm at /etc/cron.d/mdadm contained the following :

# cron.d/mdadm -- schedules periodic redundancy checks of MD devices
# By default, run at 01:06 on every Sunday, but do nothing unless
# the day of the month is less than or equal to 7. Thus, only run on
# the first Sunday of each month. crontab(5) sucks, unfortunately,
# in this regard; therefore this hack (see #380425).
6 1 * * 0 root [ -x /usr/share/mdadm/checkarray ] && [ $(date +%d) -le 7 ] && /usr/share/mdadm/checkarray --cron --all --quiet

So there, Google fodder for the poor souls who like me will at some point wonder why their RAID array spontaneously rebuilds…
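For the curious, the date logic of that cron entry boils down to two conditions: cron restricts the job to Sundays, and the shell test restricts it to the first seven days of the month. Here is a Python sketch of the same rule; the function name is mine and this is only an illustration of the logic, not part of mdadm.

```python
import datetime


def is_first_sunday(day):
    """True when day is the first Sunday of its month.

    Cron's day-of-week field (0) selects Sundays; the [ $(date +%d) -le 7 ]
    test keeps only those falling on the 1st to the 7th.
    """
    is_sunday = day.weekday() == 6   # Monday is 0, Sunday is 6
    in_first_week = day.day <= 7
    return is_sunday and in_first_week


def check_dates(year):
    """All the dates in a year on which the redundancy check would run."""
    day = datetime.date(year, 1, 1)
    one_day = datetime.timedelta(days=1)
    dates = []
    while day.year == year:
        if is_first_sunday(day):
            dates.append(day)
        day += one_day
    return dates
```

Since the first seven days of any month contain exactly one Sunday, check_dates returns twelve dates for any year.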

Now why does the periodic redundancy check appear like a rebuild ? Maybe a more explicit log would be nice there.

I upgraded the Sympa mailing list manager to 5.2.3-2 using the Debian package from the “Testing” repository. The database part of the upgrade procedure was a bit fussy so instead of solving its problems I simply backed up the tables, dropped them, ran the upgrade procedure and restored them. That workaround worked fine for making the Debian packaging system happy.

But Sympa itself was definitely not happy. On starting Sympa I got the following logs in /var/log/sympa.log :

With no database access, Sympa was not operational. Double plus ungood !

The very strange thing is that the database is fine : the right tables with the right fields and the right records are all present. It even worked with the preceding version of Sympa. It looked like Sympa itself was unable to recognize that my database setup was correct, subsequently reported those errors and thereafter refused to run with it at all.

The only problem is that while Sympa was down, people wondered why the messages did not go through and resent some of their messages. None of those messages were lost – they were just piling up in a queue. So when Sympa restarted many duplicates were sent.

But at least now it’s working. So for now I’m going to use dselect to freeze the Sympa Debian package at its current version so that it is kept back next time I upgrade my system.