symlink.jnl

Warning: One of those really boring posts in which I brag about my epic hax. (But well, that's the point of this whole site, isn't it?)

Like many Linux users, I waste a lot of time customizing the hell out of my terminal's appearance. Part of this is creating an awesome shell prompt. Everyone likes to put all sorts of information there – whether you're in a Git or Hg repository, what branch you're on, what the exit status of the last command was (sometimes even expressed in the form of elaborate Unicode emoticons)...

Mine is plain in comparison – it just shows the hostname, path, and branch. So far it looked pretty much like this:

rain ~/pkg/abs/telepathy-mission-control/git/src master
$ foo

Over time I ended up implementing various unusual things in it, however – for example, highlighting the last directory component, or collapsing the path when it becomes too wide for the terminal:
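The collapsing idea can be sketched roughly as follows. This is a Python illustration rather than the actual prompt code; the function name `collapse_path`, the ellipsis marker, and the "drop leading components first" strategy are all assumptions made for the example.

```python
def collapse_path(path: str, width: int, ellipsis: str = "…") -> str:
    """Shorten `path` to roughly `width` characters by dropping leading components."""
    if len(path) <= width:
        return path
    parts = path.split("/")
    # The last component is always kept, even if it alone exceeds `width`.
    kept = [parts[-1]]
    # Walk backwards, re-adding parent directories while the result still fits.
    for part in reversed(parts[:-1]):
        candidate = "/".join([ellipsis, part] + kept)
        if len(candidate) > width:
            break
        kept.insert(0, part)
    return "/".join([ellipsis] + kept)


print(collapse_path("~/pkg/abs/telepathy-mission-control/git/src", 30))
```

A real prompt would compute the width from the terminal size (minus hostname and branch) instead of hardcoding it.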

It generally shows enough of the path to remember where I am, although I'm probably going to adjust it a little bit yet. Today I also changed the highlight to always start at the repository root, which makes things much clearer when dealing with nested repositories – it looks a bit ugly however:

(Disclaimer: Here is my point of view about the Irssi IRC client. It is not particularly objective. It is not about why everyone should suddenly drop Irssi. Nor is it meant to be blindly shoved to every Irssi user you talk with.)

I've used Irssi as my main IRC client for almost 5 years, before switching to Weechat. Despite being pretty much unmaintained (and lacking some features), Irssi is still a good client, but… it has a problem: the users.

Specifically, the users who always feel the need to declare that Irssi is better, that "irssi > *", that Irssi is perfection.

Most users of other IRC clients openly admit that there's some misfeature or something else that they don't like. For example, the way Weechat works, it must wrap overly long URLs into multiple lines, making them unclickable. Meanwhile, Irssi users (at least the vocal ones) insist that their chosen client is perfect, and if it doesn't have a feature, then the answer is always one of a) "you don't want it" (the feature is obviously unnecessary), b) "why would anyone want it" (obviously stupid), or c) "just install a script :)" (it can be implemented using the exposed API).

For example, the nick list. Most clients let you have a sidebar that lists all people currently in the channel, usually sorted by rank. Now, I don't care if it's useful or just clutter for you; that's not my point. My point is that Irssi users always say: "Oh, if I ever wanted a nicklist, I could just install nicklist.pl and have it."

What is always left unsaid is that Irssi does not actually have any API for creating vertical regions, so the script works only if you open a new terminal window running cat ~/.irssi/nicklist-fifo. Alternatively, if you happen to be using SCREEN, the script actually reconfigures Irssi's tty to be narrower than the SCREEN window, and draws directly in the blank space that appears... every single time Irssi's own area is updated. In contrast, even though Weechat has several such scripts (although nicklist is built-in), they do not have to do anything special; they simply create a "bar" and put text inside. (There is no difference between the built-in "nicklist" bar and the scripted "buffers" bar, as far as the user is concerned.)

And there are more examples like that – for example, the cap_sasl.pl script in Irssi doesn't just implement the SASL cap, it has to implement all of capability negotiation on its own, and you cannot write your own scripts that make use of other capabilities unless you change cap_sasl to request them. (Although I have an idea on how this could be done, if the CAP negotiation was split into a second script.)

Somewhat similar to the nicklist example is implementing the server-time capability, which lets bouncers attach the original message timestamp when you connect and see the last messages being replayed. Yes, it is possible to do that from an Irssi script. But again, the only way it can be done is a hack upon a hack:

the script has to implement IRCv3 message tag parsing all by itself, which means other scripts cannot know about future tags;

and it has to set the timestamp_format and log_timestamp options to a fixed value before letting Irssi print the line, and reset them afterwards.
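To give an idea of what "implementing message tag parsing all by itself" involves, here is a minimal Python sketch of IRCv3 tag parsing (not the Irssi script, which is Perl). The function names are made up for the example; the escape rules and the `time` tag follow the IRCv3 message-tags and server-time specifications.

```python
import re
from datetime import datetime, timezone

# Escape sequences defined for IRCv3 tag values.
_UNESCAPE = {"\\:": ";", "\\s": " ", "\\\\": "\\", "\\r": "\r", "\\n": "\n"}


def unescape_tag_value(value: str) -> str:
    # Unknown escapes collapse to the escaped character; a trailing lone backslash is dropped.
    return re.sub(r"\\(.?)", lambda m: _UNESCAPE.get(m.group(0), m.group(1)), value)


def parse_tags(line: str):
    """Split a raw IRC line into (tags dict, remainder of the line)."""
    if not line.startswith("@"):
        return {}, line
    raw_tags, _, rest = line[1:].partition(" ")
    tags = {}
    for item in raw_tags.split(";"):
        if item:
            key, _, value = item.partition("=")
            tags[key] = unescape_tag_value(value)
    return tags, rest.lstrip(" ")


def server_time(tags):
    """Return the 'time' tag as an aware UTC datetime, or None if absent."""
    if "time" not in tags:
        return None
    parsed = datetime.strptime(tags["time"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return parsed.replace(tzinfo=timezone.utc)


tags, rest = parse_tags("@time=2011-10-19T16:40:51.620Z :nick!user@host PRIVMSG #chan :hello")
print(server_time(tags), rest)
```

The parsing itself is the easy part; the hack is in getting the client to print the line with that timestamp.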

Is that a good example of "flexible API"? Not very. But again, it's not really the client itself that's the problem – all clients have all sorts of limitations (like Weechat lacking any sort of hook for defining custom SASL mechanisms) – but rather the users who basically worship it, refusing to admit any imperfections.

Recent versions of Linux translate incoming ICMPv6 "Administratively prohibited" errors (type 1 code 1) to local -EACCES ("Permission denied") errno's, which is an interesting way of being informed that the server's firewall is blocking you. Unfortunately, all other operating systems (Windows, older Linuxes, various BSDs) appear to just ignore these ICMP packets, which is a bit sad – I expected them to at least terminate the TCP connection attempt with something generic like "Connection reset by peer", but instead they just wait until the connection times out.

Then again, the other OSes often do the same even for ICMP "Port unreachable". Also sad. Also strange that even on Linux, only ICMPv6 uses this translation – the equivalent ICMPv4 "Communication administratively prohibited" (type 3 code 13) results in -EHOSTUNREACH, "No route to host".
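From a program's point of view, the translation shows up as the errno of a failed connect(). A small Python sketch, assuming the Linux behaviour described above; the helper names and the wording of the explanations are made up for the example.

```python
import errno
import socket

# errno values a connect() caller may see, with the meanings discussed above.
EXPLANATIONS = {
    errno.EACCES: "administratively prohibited (ICMPv6 type 1 code 1)",
    errno.EHOSTUNREACH: "no route to host (also what ICMPv4 type 3 code 13 becomes)",
    errno.ECONNREFUSED: "connection refused",
    errno.ETIMEDOUT: "timed out without any feedback from the network",
}


def explain(err: OSError) -> str:
    """Map an OSError from connect() to a human-readable explanation."""
    return EXPLANATIONS.get(err.errno, f"other error: {err.strerror}")


def try_connect(host: str, port: int, timeout: float = 10.0) -> str:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"
    except socket.timeout:
        # Raised when our own timeout expires; it carries no errno.
        return EXPLANATIONS[errno.ETIMEDOUT]
    except OSError as err:
        return explain(err)
```

On systems that ignore the ICMP errors, the same call would simply end up in the timeout branch.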

Still, I really like the whole translating remote failures to local errno's thing. Somehow it actually makes me feel as if I'm using a network where everything is integrated and where I'm receiving feedback from the network, instead of just a bunch of computers exchanging data.

Similarly, the ping tool on Windows displays the message "Negotiating IP Security" whenever Windows is performing IPsec key exchange, which is a nice touch – when the same is happening in Linux, the packets just go nowhere. (I don't remember offhand if they're queued or discarded; either way, there's just no feedback.)

(On a related note, IPsec with strongSwan is hella confusing at times.)

Spent the majority of the past year on IRC. Somehow I ended up being an operator in #archlinux, then in freenode's #irchelp, finally even in #systemd. Yes, #systemd was finally registered with network services after three years – for a project like this it's really surprising that the channel hasn't been attacked or invaded by trolls even once.

Kind of wondering why I now have +v in #inspircd, too. Given that I've only used InspIRCd for an hour or two, and I mostly just lurk in the channel... But I'm not complaining.

Messing around with Windows on the desktop PC while sister's out somewhere. (I never got around to installing the TermiServ patch since the reinstall last month, so it only allows one user at a time.) It seems that the smaller disk is about to die sometime this year – SMART just started showing a large number of reallocations and failed writes. Which is a bit unexpected, because the disk hasn't been used for almost anything since the reinstall; it only has a tiny boot partition with NTLDR on it. (For some reason, NTLDR refuses to work at all when started from the larger disk – maybe 1 TB is too much for it?)

On the other hand, I did know that it wasn't going to live long – the Event Log started showing "controller errors" in 2010, and I moved all user files to the new disk in early 2012, so when the data corruption started occurring, I only had to reinstall the OS...and, well, everything else.

There was a time when I tried setting up backups on the desktop, but it was the same story again. WinRAR actually has several useful features – storing multiple versions, NTFS streams, file permissions, &c. – but it also turned out to be much slower than expected, and it could not deal with encrypted or locked files at all. RoboCopy was roughly the same, although much faster.

I even ended up writing my own tool in C#, which would just copy a directory tree but also worked with locked/in-use files (using temporary Shadow Copy snapshots, which XP happens to ... kind of "support"), insufficient permissions (using SeBackupPrivilege to bypass the checks), and even encrypted files (using EFS APIs to read the raw contents, without Windows trying to transparently decrypt them). But it was in C#, and the .NET runtime actually took way too long to even start. So in the end, I still have no real backups of the desktop PC, only a snapshot of F:\Users from before the reinstall.

So I've spent the past week trying to find a good backup program. I still haven't found one.

It could be that my requirements are impossible. I want a tool that would be reasonably fast both when copying data, and when adding a lot of small files; have some form of deduplication to avoid wasting gigabytes of data after I simply move files around; and not require a command-line tool to actually access the backups. But apparently no tool can do all three at once.

I tried rsnapshot (which seems to be just a wrapper around rsync --link-dest), as well as plain rsync combined with btrfs snapshots. While rsync is fast enough, it turns out it is too dumb to detect moves and renames, so if I simply rename ~/Videos/anime to ~/Videos/Anime, or if I move a dozen CD images from ~/TODO to ~/Attic/Software/OS/WinNT, rsync thinks all files are new and spends ages copying them again, instead of hardlinking from the previous snapshot as the --link-dest option normally would. (I'd be happy to know if I'm wrong on this one and if it can actually detect renames.) Plus, copying to a btrfs partition is much slower than expected; only 15 MB/s instead of the usual 25-30 MB/s (that's over USB 2.0).
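One possible workaround is to pre-populate the new snapshot with hardlinks to moved files before rsync runs. The Python sketch below shows the idea only; it is not something rsync or rsnapshot provides, the function names are hypothetical, and it assumes both snapshots live on the same filesystem. Whether rsync then leaves the pre-linked files alone (they should pass its size and mtime check) is an untested assumption.

```python
import filecmp
import os


def index_by_size(root):
    """Map file size -> list of regular files of that size under `root`."""
    sizes = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                sizes.setdefault(os.path.getsize(path), []).append(path)
    return sizes


def prelink_moved(source, previous, new):
    """Hardlink files that were moved since `previous` into the `new` snapshot.

    Returns the list of relative paths that were linked.
    """
    by_size = index_by_size(previous)
    linked = []
    for dirpath, _dirs, files in os.walk(source):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source)
            # Same relative path as before: --link-dest already handles that case.
            if os.path.islink(src) or os.path.lexists(os.path.join(previous, rel)):
                continue
            for old in by_size.get(os.path.getsize(src), []):
                # Byte-for-byte comparison, only for candidates of equal size.
                if filecmp.cmp(src, old, shallow=False):
                    dest = os.path.join(new, rel)
                    os.makedirs(os.path.dirname(dest), exist_ok=True)
                    os.link(old, dest)
                    linked.append(rel)
                    break
    return linked
```

Comparing contents is slow for large files; a real tool would probably trust size plus mtime, or keep a checksum cache.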

I also used obnam for quite a while. It's fast and it has deduplication built in so I can easily keep a few dozen weekly snapshots. But I'm not exactly a fan of having to use obnam restore whenever I want my files out. While that's a rather minor problem, and there apparently is a FUSE plugin in the works, there's also the risk that obnam's repository will get corrupted and won't let me access anything anymore. I'm also not exactly a fan of obnam growing to 1.5 GB of memory during its run – and that wasn't even the entire run, that was maybe 1/3 when I finally killed it. (I do hope that's a bug.) Also, while adding data is fast, obnam is slow when adding files – and if I have a directory with 200k smallish files in it, it goes at maybe 6-10 files per second, which means it takes hours to copy a mere 5.2 GB of Gale chatlogs.

Next option is ZFS with dedup enabled – either rsnapshot or plain rsync with ZFS snapshots would work. The problem with it, however, is that it's a pain in the ass to maintain on Arch. Every time I install a new kernel version from [testing], there are four packages I need to rebuild, and since they all have versioned dependencies (e.g. zfs 0.6.2_3.10.9-1 depends exactly on linux=3.10.9-1 and zfs-utils=0.6.2_3.10.9-1) it means I must remove all ZFS tools entirely, then upgrade my kernel, then start rebuilding ZFS. (Of course, I just wrote a shellscript that sed's the versions out of depends= lines, but that doesn't make it any less of a pain in the ass.)
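The version-stripping step is simple enough to sketch. This is a Python stand-in for the sed one-liner, not the actual script; `strip_versions` is a made-up name, and it only handles a depends=(...) array written on a single line.

```python
import re


def strip_versions(line: str) -> str:
    """Remove version constraints (=, >=, <=, <, >) from a one-line depends=(...) array."""
    match = re.match(r"^(\s*depends=\()(.*)(\)\s*)$", line)
    if not match:
        return line  # not a single-line depends= array; leave untouched
    head, body, tail = match.groups()
    # For each entry, cut everything from the first comparison operator
    # up to (but not including) an optional closing quote.
    items = [re.sub(r"[<>=].*?(?=['\"]?$)", "", item) for item in body.split()]
    return head + " ".join(items) + tail


print(strip_versions("depends=('linux=3.10.9-1' 'zfs-utils=0.6.2_3.10.9-1')"))
```

With the constraints gone, the old packages no longer block the kernel upgrade, although they still need rebuilding afterwards.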

For now, I guess, I'll just stick with rsnapshot and limit it to four, maybe three, snapshots at once... but fuck, how can there not be a backup tool that doesn't suck in some way or other?

A few years ago I wrote a cronjob for updating ~/.ssh/authorized_keys on various servers. (It ended up having the name update-authorized-keys after a few renames.) It basically downloaded my authorized_keys file over HTTP (using one of a dozen HTTP clients to be extra portable), checked if it had my PGP signature on it, and supported some cpp-ish filtering. I was extra careful to look for a specific PGP key by fingerprint and all that.
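The "look for a specific key by fingerprint" part could look something like the sketch below. It is Python rather than the original shell script, the function names are hypothetical, and it assumes GnuPG's machine-readable status output, where a VALIDSIG line carries the signing key's fingerprint as its first argument and (in current versions) the primary key's fingerprint as its last.

```python
import subprocess


def signed_by(status_output: str, wanted_fpr: str) -> bool:
    """Check gpg --status-fd output for a valid signature from the wanted fingerprint."""
    wanted = wanted_fpr.replace(" ", "").upper()
    for line in status_output.splitlines():
        fields = line.split()
        # Format: "[GNUPG:] VALIDSIG <signing-key-fpr> ... <primary-key-fpr>"
        if len(fields) >= 3 and fields[0] == "[GNUPG:]" and fields[1] == "VALIDSIG":
            if wanted in (fields[2].upper(), fields[-1].upper()):
                return True
    return False


def verify_file(sig_path: str, data_path: str, wanted_fpr: str) -> bool:
    """Verify a detached signature and require it to come from one specific key."""
    result = subprocess.run(
        ["gpg", "--batch", "--status-fd", "1", "--verify", sig_path, data_path],
        capture_output=True, text=True)
    return result.returncode == 0 and signed_by(result.stdout, wanted_fpr)
```

Checking only gpg's exit code would accept a good signature from any key in the keyring, which is why the fingerprint comparison matters.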

And several months later, I wrote another cronjob – this time for updating my script collection and my dotfiles – called dist/pull this time. It first updated ~/code over Git, then exec'd the updated version of itself (just in case), which then updated ~/lib/dotfiles (also over Git). Sometimes I would patch dist/pull to do various cleanup jobs, and they would always run at midnight automatically. (As a bonus, it also ran the SSH key updater, instead of having two separate cronjobs.)

And I just realized that despite all my carefulness, I still ended up having an easily pwnable cronjob that automatically downloads and runs code every night without verification. Crap.

Many IRC networks now support SASL as the standard authentication method, which removes certain race conditions, such as your client auto-joining channels before auth is complete – in which case your vhost/cloak would get applied too late, you might be denied entry entirely if the channel requires being authenticated, etc.

One day, out of boredom, I wrote a mostly-pure-Tcl implementation of IRCv3 CAP and SASL for the Eggdrop IRC bot. At the moment, it is located on GitHub Gist, and consists of three Tcl scripts – Base64; CAP negotiation and SASL PLAIN; plus a demo script for several other IRCv3 capabilities.

Saying "mostly-pure-Tcl" because the CAP negotiation still needs a one-line patch to the core code. However, two days ago, the "preinit-server" patch was merged into the main Eggdrop 1.8 repository, so it can be used with the scripts without any modification.

I'm still trying to set up a sane virtual machine network, one that would put the VMs on both the laptop and the desktop in their own networks, routed to each other and to the real LAN, while still letting my own VMs access "LAN subnets only" services on the desktop, like the file sharing.

It's not going well – I ended up running Unbound, BIND, and dnsmasq on the same laptop: Unbound I already had running before as my validating resolver; dnsmasq serves DHCP to the VM network and hosts a simple dynamic-DNS LAN domain for accessing random PCs; BIND hosts a static domain for accessing the two Active Directory realms installed in two VMs, because dnsmasq's static DNS settings are plain stupid. So now I have all my VMs nice and clean in their own net, routed to the real LAN – that is, routed and NATed: LAN hosts should see the VMs' real addresses, but the LAN router/gateway/cheap-ass-DSL-modem should still be able to do its own NAT thing properly, and the desktop also needs to see the NATed addresses when VMs try to access shared files, so that the firewall would let them through... I might have written the stupidest NAT rules ever just to make this work:

This whole mess turned out to be needed because my ISP configures its routerdems to have a "management" network in addition to "user" and "Internet", and that network happens to be using 10.0.0.0/8 (already confused the hell out of me once, when I wanted to connect to a VPN but the traceroute to 10.0.x.x addresses kept going through my ISP) which makes the routerdem think the packets from my VMs aren't actually coming from inside the LAN, so it refuses to apply NAT to them, so my laptop (the VM host) has to NAT all of them to the LAN address range... On the other hand, I still want all VMs to be reachable from the real LAN using their own IP addresses, hence the ACCEPT rule.

Aside: The Spooler service in Windows XP is rather picky about the hostname you use to access it. Apparently, the full UNC path of the printer is sent when connecting to it, so if you're trying to connect to \\snow.virt\FooPrinter but the server thinks it's \\snow.home (not \\snow.virt), it will return "Invalid printer name" to the OpenPrinter request – despite it having already accepted an SMB connection to \\snow.virt\IPC$ without even a blink.

I used to use the wmii tiler for a long time (before going back to GNOME), and recently it seems i3 has become popular, so I decided to try it out. I'm not going to comment on the usability, features, etc. – but I sometimes have really odd criteria for choosing software, so here's one such odd comment.

When I used wmii, it had a really sweet control interface, styled after various plan9 software: the configuration file was essentially a bash script (later ported to various other languages) that had its own event loop. The control interface was 9P over a Unix socket: read /event, write to /ctl, list files under /tags, and so on. You could even mount it as a local filesystem, using native 9p.ko.

(Later I went back to GNOME. The Shell isn't scriptable externally (at least not easily), but overall, almost all programs I run use DBus in some way or other. It's also somewhat nice and consistent.)

Then I tried i3, which claimed to be heavily inspired by wmii – and at least the appearance and the control keys were quite similar. (Although wmii has a simpler layout model – it always splits the screen into columns, similar to acme.) But I was somewhat disappointed at i3's IPC protocol – even though I have zero experience in designing such things, it still looks ugly to me.

There's a "command" message type, and six "get_foo" message types. There's "check the highest bit to see if it's an event reply or normal reply". There are no event names – there's a list of magic number definitions in i3/ipc.h which has to be copied into your i3ipc implementation; this is not a problem by itself, of course, but only as long as the definitions can be assumed to be stable – which, in this case, they can't.
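For the curious, the framing looks roughly like this in Python. It is a sketch based on the i3 IPC documentation (a fixed "i3-ipc" magic string, then payload length and message type as 32-bit integers in native byte order); the helper names are made up, and the only message type used is 0, the "command" type.

```python
import struct

MAGIC = b"i3-ipc"
HEADER = struct.Struct("=6sII")  # magic, payload length, message type; no padding
EVENT_BIT = 1 << 31              # highest bit of the type marks an event
COMMAND = 0                      # copied from i3/ipc.h, like every other number


def pack_message(msg_type: int, payload: bytes = b"") -> bytes:
    """Build a complete request: 14-byte header followed by the payload."""
    return HEADER.pack(MAGIC, len(payload), msg_type) + payload


def unpack_header(data: bytes):
    """Return (payload length, type without the event bit, is_event)."""
    magic, length, msg_type = HEADER.unpack(data[:HEADER.size])
    if magic != MAGIC:
        raise ValueError("not an i3-ipc message")
    return length, msg_type & ~EVENT_BIT, bool(msg_type & EVENT_BIT)


print(unpack_header(pack_message(COMMAND, b"reload")))
```

Everything beyond the header is JSON, so the magic numbers are really the only part that has to be kept in sync by hand.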

Today while discussing home directory permissions and the 'finger' command, I mentioned the long list of users at @linerva.mit.edu. Someone quickly discovered a user or two having their contact information in .plan files, and the general reaction was:

While I've never been at MIT, "public" seems to be the default there – not only that user's contact info, but their entire home directory is world-accessible over AFS. I often do the same; I consider contact information to be more-or-less public, as it has been in the past. So it is quite unusual for me to see other people finding a random user's phone number, and reacting as if it were a precious gem. Even calling it "doxing" just doesn't seem to fit here.

Windows XP has this feature where you can Ctrl+click on taskbar buttons to select multiple windows, the same way you would select multiple files, and then close them at once (or tile/cascade the selected windows). It's something GNOME 3 still lacks.

Today I added my Yahoo IM account to Pidgin, just to see if it still works. It did – and as soon as it connected, I got 10 messages from ten different spambots (apparently YMSG stores offline messages).

I did this after Microsoft decided to kind-of shut down their MSN Messenger servers, to make more space for Skype. The standard servers are already refusing raw MSNP connections, although Pidgin can still connect using its "HTTP method". I'm somewhat amazed that even on various Linux geek channels on IRC, people are saying things along the lines of "good riddance", not realizing that Micros~1 is shutting down a sufficiently reverse-engineered IM protocol in favor of a secret one that requires a tightly locked down client. There are at least a dozen unofficial MSNP clients for both Windows and Linux. Hell, MSN Messenger had official XMPP servers. Meanwhile, who still remembers how the attempts to reverse-engineer Skype went? Not well.

Oh well. Maybe things will get better when Microsoft tries to integrate Skype into its build system and forgets to enable obfuscation, or writes an HTML5 client, or something. Meanwhile, Yahoo! Messenger is still online, as are ICQ and AOL Instant Messenger. I still remember my UIN, it seems. (And I've never had more than three contacts total over all four protocols, but that's off-topic.)

Recently I found another IM protocol, Gale, which feels somehow like a cross between XMPP, Zephyr and IRC.

Zephyr because Gale's interface was designed to resemble Zephyr's, being based purely on transient subscriptions rather than persistent "channels" or "buddy lists". To receive public chat messages, one subscribes to pub@ofb.net; to receive personal IMs, one subscribes to foonly@example.com.

XMPP because they have a similar network topology – each domain has its own server to which clients can connect, and which exchanges messages with other servers. All messages to pub@example.com go through example.com's servers, for example.

IRC because multi-user chats receive as much importance as one-to-one messages, in both the protocol and the community. (I haven't used MUCs in XMPP much, but they have always had that stuck-on-with-chewing-gum feeling, even compared to all the idiosyncrasies of IRC.)

In other words, Gale takes the best parts of all three, while keeping a very simple interface (and one much more scriptable than, say, XMPP). Similar to Zephyr, there's no full-blown client by default, only separate command-line tools for subscribing and for posting a message. You can compose a message in Vim and send it with :w !gsend pub.

Unlike IRC, it's possible to subscribe to the same address from many locations; join/part notifications do not exist; there's no way to know who's reading messages to a public address. The gsub client does support sending special "presence state" messages, but those are merely informative, not persistent. Addresses can be hierarchical – one could subscribe to pub@example.com or only to pub.tv.fox@example.com.
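The hierarchical matching can be pictured as a prefix comparison on the dot-separated local part. This is a Python sketch of that reading only; it assumes that a subscription to pub@example.com also covers pub.tv.fox@example.com, and it is not taken from Gale's source. The function name is made up.

```python
def subscription_matches(subscription: str, address: str) -> bool:
    """Return True if a message to `address` would reach `subscription`.

    Assumption: a subscription covers its own address and everything
    below it in the dot-separated hierarchy, within the same domain.
    """
    sub_local, _, sub_domain = subscription.partition("@")
    local, _, domain = address.partition("@")
    if sub_domain != domain:
        return False
    sub_parts = sub_local.split(".")
    return local.split(".")[:len(sub_parts)] == sub_parts


print(subscription_matches("pub@example.com", "pub.tv.fox@example.com"))
print(subscription_matches("pub.tv.fox@example.com", "pub@example.com"))
```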

There's a downside, too. Gale messages can be encrypted, and to authenticate senders & receivers everyone has an RSA keypair; the keys are verified hierarchically – the "ROOT" key signs TLD keys, the TLD keys sign domain keys, domain keys sign user and/or subdomain keys, user keys can sign subkeys. To set up a new domain, one needs to email their domain's public key to the root key's owner and receive a signed key back. So far, signing has been done all by the same person, Gale's creator Dan Egnor. There have been proposals for a notary, but nobody cares enough to finish them... Nevertheless, the scheme is better than Zephyr's Kerberos-based trust relationships, which simply do not scale above half a dozen realms.

Unfortunately, there are very few users of Gale by now. Maybe a dozen still post to pub@ofb.net to this day; most of them probably have migrated to IRC or XMPP or Skype. Overall, it feels as if Gale should have received a lot more attention than it has.

Update: Since the CVS server described on Gale's website is now defunct, I've obtained a copy of the entire repository and imported it into Git – it's available at github.com/grawity/gale, with minor fixes such as better libc locale support.

The next post, if I ever get around to it, should be about IRC, Zephyr and PSYC.