It's been a busy few weeks - my mom, then my brother and his
girlfriend came to visit, and somewhere in the middle of all that we
had an Artifex staff meeting.

Things are quieting down now. We had a nice family evening, playing
video games and doing a little papercraft. I
tried the tiger,
just on cheap paper and b/w laser printing, and it came out ok. Of course, Max
then wanted to do one of the motorcycles, but I convinced him that we
would do one some other day.

BitTorrent and RSS

There's a thread
going around the net on the benefits of combining RSS with
BitTorrent. I agree there's something there, but want to make a
distinction between the "easy" combination, which is quite feasible
right now, and one that requires a bit more rocket science (actually,
Internet protocol design, but from what I know of both, the latter is
more difficult to do well).

In the "easy" combination, you have your whole RSS infrastructure
exactly as it is now, but use BitTorrent to distribute the
"attachments". People have been experimenting with RSS enclosures (for
speech, music, video, and whatnot) for a while, but they're not hugely
popular yet. One of the reasons is the difficulty and expense of
providing download bandwidth for the large files that people will
typically want to enclose. BitTorrent can solve that.
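
To make the "easy" combination concrete, here's a rough sketch of the
client side. The feed URL is made up, and the convention of an
enclosure pointing at a .torrent file is just an assumption for
illustration; only the Python standard library is used.

    # Sketch: pull ".torrent" enclosure URLs out of an RSS 2.0 feed.
    # The feed URL is hypothetical and error handling is minimal.
    import urllib.request
    import xml.etree.ElementTree as ET

    FEED_URL = "http://example.org/blog/index.rss"   # made-up feed

    def torrent_enclosures(feed_url):
        """Return the .torrent enclosure URLs found in an RSS feed."""
        with urllib.request.urlopen(feed_url) as resp:
            tree = ET.parse(resp)
        return [enc.get("url") for enc in tree.iter("enclosure")
                if enc.get("url", "").endswith(".torrent")]

    for url in torrent_enclosures(FEED_URL):
        print("hand off to a BitTorrent client:", url)

The hand-off to a BitTorrent client is the part deliberately left out
here - that's where the existing BT code would slot in.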

In fact, BitTorrent's strengths seem to mesh well with RSS. BT shines
when lots of people want to download the same largish file at the same
time - it's weaker at providing access to diverse archives with more
random patterns of temporal access. Also, BT scales nicely with the
number of concurrent downloaders - you get about equally good
performance with a dozen or ten thousand. So if someone shoots a
really cool digital video, posts it to their blog, then gets
Slashdotted, it all still flows.

Integrating BT with a daemon that retrieves RSS feeds in the
background has other advantages, as well. If the person opens the file
a while after the download begins (which might be as soon as the RSS
is updated), most or all of the latency of downloading that file can
be hidden. Further, since the BT implementation is released under a
near-public domain license, it should be relatively easy for people to
integrate it into their blog-browsing applications.
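
A prefetching daemon along those lines could be little more than the
loop below. It's only a sketch: find_torrent_enclosures and
start_torrent_download are placeholders rather than real APIs, and a
real aggregator would persist state, honor polling etiquette, and so
on.

    # Sketch of a background prefetch loop: poll feeds and kick off
    # BitTorrent downloads of new enclosures so they're mostly or fully
    # done by the time the reader clicks. Both helpers are placeholders.
    import time

    POLL_INTERVAL = 30 * 60                          # seconds; arbitrary choice
    FEEDS = ["http://example.org/blog/index.rss"]    # made-up feed list

    def find_torrent_enclosures(feed_url):
        # placeholder for the enclosure-extraction step
        return []

    def start_torrent_download(torrent_url):
        # placeholder: hand the torrent to a BT client without blocking
        print("starting download of", torrent_url)

    def prefetch_loop():
        seen = set()                       # enclosures already started
        while True:
            for feed in FEEDS:
                for torrent_url in find_torrent_enclosures(feed):
                    if torrent_url not in seen:
                        seen.add(torrent_url)
                        start_torrent_download(torrent_url)
            time.sleep(POLL_INTERVAL)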

An example of a blog that would work superbly with BT is Chris Lydon's series of
interviews.

But Steve Gillmor's article isn't primarily about enclosures - it
suggests that we can use BT to manage the RSS feed itself. I think
there's something to the idea, but the existing protocol and
implementation isn't exactly what's needed. BT is best at downloading
large static files. You start with a "torrent" file, which is
essentially a list of hashes of the file's pieces, packaged up with the URL
where the "tracker" can be reached. All peers uploading and
downloading the file register with the tracker, and get a list of
other peers to connect with. Then, peers exchange blocks of the file
with each other, using very clever techniques to optimize the overall
throughput. After each block is transferred, its hash is checked
against what's in the torrent file, and the block is discarded if it
doesn't match.
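
That hash check is simple to picture. In the metainfo format, the
"info" dictionary carries the concatenated 20-byte SHA-1 digests of
every piece (the spec's word for these blocks); assuming the bencoded
.torrent has already been decoded into a Python dict - that decoding
step is left out - the check looks roughly like this:

    # Sketch: verify a downloaded piece against the digests stored in
    # the torrent's "info" dictionary. Decoding the bencoded metainfo
    # into the `info` dict is assumed to have happened already.
    import hashlib

    def piece_is_valid(info, index, piece_data):
        """Compare the SHA-1 of a downloaded piece with the expected digest."""
        digests = info["pieces"]           # all 20-byte digests, concatenated
        expected = digests[20 * index : 20 * (index + 1)]
        return hashlib.sha1(piece_data).digest() == expected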

But RSS files themselves are relatively small, so it's unlikely that
all that much bandwidth would be saved sending torrent files and
running a tracker, as opposed to simply sending the RSS file itself.
Further, the big performance problem with RSS is the tradeoff between
polling the feed infrequently, which means long latencies between the
time the feed is updated and the time readers see it, and polling it
frequently, which chews up tons of bandwidth on the server. BT doesn't
do much to help with this - you'd be polling the
torrent file exactly as frequently as you're polling the RSS file now.
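
A few made-up numbers show why that polling cost matters. Everything
in this little calculation - the feed size, the subscriber count, the
polling rates - is invented purely for illustration:

    # Back-of-the-envelope cost of naive RSS polling, with made-up numbers.
    feed_size = 30 * 1024                  # assume a 30 KB RSS file
    subscribers = 10000                    # assume ten thousand readers

    for polls_per_day in (24, 48, 288):    # hourly, half-hourly, every 5 min
        bytes_per_day = feed_size * subscribers * polls_per_day
        print("%4d polls/day -> %.1f GiB/day off the server"
              % (polls_per_day, bytes_per_day / 2.0 ** 30))

With these numbers, five-minute polling costs the server tens of
gigabytes a day even when the feed never changes, while hourly polling
trades that for up to an hour of latency; swapping the RSS file for a
torrent file doesn't change either side of that trade.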

I believe, however, that the BitTorrent protocol could be adapted into
one that solves the problem of change notification. The protocol is
very smart, and already has much of the infrastructure that's needed.
In particular, peers already do notify each other when they receive
new blocks. That's not change notification because the contents of the
blocks are immutable (and that's enforced by checking the hash), but
it's not too hard to see how it could be adapted. At heart, you'd
replace the static piece hashes of the existing torrent file format with
a digital signature. The "publisher" node would then send new
digitally signed blocks into the network, where they'd be propagated
by the peers. There'd be essentially no network activity in between
updates, and, as in the existing BitTorrent protocol, the load on the
publisher node would be about the same whether it was feeding a dozen
or ten thousand listeners. I'd expect latency to scale very nicely
as well (probably as the log of the number of peers, and with
fast propagation along the low latency "backbone" of the peer
network).
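
The verification step in a protocol like that might look like the
sketch below. Ed25519 signatures via PyNaCl are just a convenient
modern stand-in for "a digital signature" - nothing here is a specific
proposal - and the block format is made up:

    # Sketch: the publisher signs each new block; peers verify the
    # signature before accepting and forwarding it. PyNaCl (Ed25519) is
    # only an example signature scheme; the block format is made up.
    from nacl.signing import SigningKey
    from nacl.exceptions import BadSignatureError

    # Publisher side: the verify key is what would live in the modified
    # torrent file, in place of the static piece hashes.
    signing_key = SigningKey.generate()
    verify_key = signing_key.verify_key

    def publish(block_bytes):
        # sign a new block before injecting it into the peer network
        return signing_key.sign(block_bytes)

    def accept_and_forward(signed_block):
        # peer side: only propagate blocks that carry a valid signature
        try:
            block_bytes = verify_key.verify(signed_block)
        except BadSignatureError:
            return None                    # drop forged or corrupted blocks
        # ...hand block_bytes to the local reader, forward to other peers...
        return block_bytes

    signed = publish(b"<rss>...updated feed...</rss>")
    assert accept_and_forward(signed) is not None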

I'd hate to see such a beautiful work of engineering restricted to
just providing RSS feeds - ideally, it would be general enough to
handle all sorts of different applications which require change
notification. One such is the propagation of RPM or Debian package
updates, which obviously has strong requirements for both scaling and
robustness. The main thing that's keeping it from happening, I think,
is the dearth of people who really understand the BitTorrent protocol.

Proof systems

I've been hacking a bit on my toy proof language. Aside from
slowly bringing the verifier up to the point where it checks
everything that should be checked, I'm also hacking up an implementation
of the HOL inference rules, constructed within ZF set theory.
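
For a flavor of what has to be justified, here's modus ponens, one of
the primitive inference rules in the classic HOL formulation; the ZF
embedding has to show that the corresponding operation on encoded
sequents is sound:

    \[
    \frac{\Gamma_1 \vdash s \Rightarrow t \qquad \Gamma_2 \vdash s}
         {\Gamma_1 \cup \Gamma_2 \vdash t}
    \quad(\mathrm{MP})
    \]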

It's immensely satisfying to construct proofs that are
correct with high assurance, which is such a contrast with
hacking code - any time you write nontrivial code, you know it's got
lots of bugs in it, many of which no doubt can be exploited to create
security vulnerabilities.