Hard to believe I've consumed all of 1981's Usenet posts now on
olduse.net, and it's been running for 7 months
already.

Last night, there was a "very long"
post, describing nearly
every node on usenet in 1982. There had been a warning about this post the
day before, since it would take many sites half an hour to download
at 300 baud. It was handily formatted as a shell script, which created
per-node files.
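
A minimal sketch of the shape of such a script (the node names are real sites of the era, the descriptions are elided, and this is not Mark's actual formatting):

    #!/bin/sh
    # each node's description is written out as its own file,
    # using quoted here-documents
    cat > ucbvax <<'EOF'
    (description of the ucbvax node)
    EOF
    cat > decvax <<'EOF'
    (description of the decvax node)
    EOF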

So, I ran this code nobody has run since 1982. It worked. I got files. I
tossed them on the olduse.net wiki, and used some ikiwiki
code TOVA contracted me to write just a few months ago, to make
clickable links on my usenet map.

The map data was contributed in another post a while back. By 1982, usenet is
getting nearly impossible to map with the 1982 technology of ascii art. I enjoyed
throwing graphviz, git, wikis, and the web at it.
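
The mashup boils down to surprisingly little code. A hypothetical sketch, assuming a links.txt of "node neighbor" pairs extracted from the map post:

    # turn the pairs into a graphviz graph and render it
    (
        echo 'graph usenet {'
        awk '{ printf "  \"%s\" -- \"%s\";\n", $1, $2 }' links.txt
        echo '}'
    ) | dot -Tsvg > usenetmap.svg

Making the nodes clickable is then just a matter of adding URL attributes pointing at the per-node wiki pages.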

So, we have a collaboration across time: me, "Mark", a lot of
people who described their usenet nodes, and piles of technology
that make creating a mashup easy. Awesome!

I blog about stuff I find on the olduse.net
blog. It's an open blog;
Koldfront also blogs there, and we welcome other
bloggers.

Some of the highlights for me have included:

As the space shuttle program is winding down, reading the excitement about
the first shuttle flights, and the play-by-play coverage of a launch,
posted to net.columbia by a high school student borrowing his dad's
account. (A newsgroup name that's hard to read without remembering
its fate.)

The announcements of the Motorola M68k, the IBM PC, and the CD-ROM.

Reading the TCP-IP digest, and Postel's plans for launching IPv4 soon,
while the world IPv6 launch is being
planned now. (The naysayers are especially fun to read, including the
guy who was concerned about the address space size, in 1981!)

The general development of usenet: B-news being rolled out, groups
proliferating, and many first inklings of what will become major problems
and developments 5 or 10 years later. A shift in tone is already apparent;
by now usenet is not only about announcements, and there are already some flames.

Today I released two entirely different pieces of software with the
identical version number 3.20120115. Debian developers will also soon be
noticing a piece of software I released with the version number 9.20120115.

I expect to move more of my software to this version number scheme over
time, unless I find something badly wrong with it. It reflects how I think
about versions for my software: there's a kind of continual "now" that
development progresses through, in which individual releases have little
discrete meaning; at the same time, there can be significant
discontinuities that require the user to do something to deal with them
(such as a new debhelper compat version, or a new git-annex repository
format).

Those two things are really all that I need a version number for my
software to communicate. I can do without the rest of the things that
version numbers are used for:

The marketing of version 1.0 and 2.0.

The comparative nuances, such as whether 1.0 to 1.1 is a relatively
big change, and 1.0 to 1.0.1 a relatively small one.

The implication that 0.99 is almost 1.0 ready, and 1.1a is some kind
of alpha release.

There is so much software, with so many version numbers, that any signal
encoded in those numbers is swamped by noise. Even on projects I develop,
a version number like 2.88 is meaningless to me. All I care about is:
how long ago was that version, and has there been a major
compatibility-breaking change since? "2.88" doesn't answer
these questions well; "3.20111111" does.
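
As a bonus, existing tools compare these versions sensibly; for example:

    $ dpkg --compare-versions 3.20111111 lt 3.20120115 && echo older
    older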

It is a little wordy to have the full year in there, and it can be annoying
to remember to set the version to the right date on release day (TODO:
automate). This is balanced by the version not being so wordy as to
include the time of day, which means I might have to do a 3.20120115.1 if I
goof up. These minor problems are worth it to instantly know how old a
version is when a user pastes it into a bug report.
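
That TODO is a one-liner in a release script; a sketch (the major number being whatever the current compatibility level is):

    # stamp today's date into the version at release time
    major=3
    version="$major.$(date +%Y%m%d)"
    echo "releasing version $version"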

And that is probably all I will ever have to say about version numbers. :)

Last year, my new year's resolution was to write in my journal every day.
That actually stuck: I wrote 262 journal entries in 2011. While I've been
keeping a journal intermittently since 1998, last year I doubled the number
of entries in it, and wrote a novel's worth of material -- 53 thousand words!

Most of it is of course banal and mundane stuff. Not as good as Lars's
journal, where he goes into some detail about the code he's working on,
and other work; the excerpts I've seen are quite nice. But after I've
written code, a commit message, documentation, and perhaps bug reports,
I often can't find much more to say in my journal, beyond the bare bones
that I worked on $foo today or faced a particularly hard bug. I also worry
that the journal, and my reluctance to repeat myself, tip the balance away
from blogging: if I write something down in the journal first, it's less
likely to end up on the blog.

Here's my journal for today:

Compare what jokes are funny now with those in 1982.
The 1982 ones from net.jokes on olduse.net seem juvenile.
Now compare which Unix joke man pages are funny now with
those I'm reading from 1982. They seem basically the same.
What would Biella make of this?

Liw noticed ikiwiki OOM on pell. Tracked down to a perl markdown
bug with long lines. Had quite enough of perl markdown; ikiwiki will be
moving to a different engine. Added discount support to it today,
still needs Debian package tho.

[censored]

Really gorgeous sunset, with a high wind, moon, puffy low, fast-moving
clouds. Enjoyed it ecstatically. It's going to get cold soon. Very rainy
early, but then got intermittently sunny; power is holding out ok.

Was going to roast a chicken today, but got distracted and had a large
lunch besides. Need to find some quick food for supper.

I need to start a new book, should it be the River Cottage book about
meat that I stole from Anna, or some SF?

Blogged about journaling, and put this journal entry in it, so also
journaled about blogging. Wrote it somewhat self-consciously.

The benefits for me have ranged from being able to go back and work out
dates of events, to forwarding the odd excerpt to others. The best thing
though is certainly having a regular time of introspection, to look back
over the day.

If you've not got a new year's resolution yet, I recommend this one.
(Learning Haskell would be another good one, if you haven't yet.)

I've been at the cabin, on solar power, for a year now. I have a year of
data!

Everything went pretty well until last month. There was an April rainy
spell where power felt slightly tight. Then over the summer, plenty of
power, no need to conserve. The last month though had what seemed like
weeks of continual grey clouds, where I never saw the sun.

[photo: high noon today]

Of course, even on a sunny day in winter, it does not get far
above the hills, and the peak production window is only a few hours.
This bad combination had my battery power dipping below the 10 volts
that I consider low, down to 9, and even to 8 volts.

I use kerosene lamps in the winter. (I prefer the light anyway.)
I've also started unplugging my Thecus server at night to conserve power,
meaning no internet late or early. For four or so nights, I had no power to
run even my laptop after sunset. On one notable day, there was no power
even in the daytime.

Even when it turned sunny again, I found that the batteries would seem to
charge to 12 volts during the day, but then precipitously drop to 10 and 9
volts at night. I think the problem was not damaged batteries, but that
these Nicads charge most efficiently above 12 volts (14 volts is best), and
there was never enough power saved up to get them full enough that they
could charge really efficiently.

So, I reluctantly spent three days away this week, to let the batteries
soak up sun and recover. It seems to have worked; they've been holding
a 12 volt charge overnight again.

The great thing about git and other distributed version control systems is
that once you clone (or fork) a repository, you have all the data. You
don't have to trust that Github will preserve it; everyone who develops
the project is a backup.

Github carries this principle quite far among the features they provide.
But not all the way. Today I have surveyed their features, and where the
data for each is stored.

repository contents and wikis -- stored in git

issues -- in a database, accessible by an API

notes on commits -- in a database, accessible by an API

relationships between repos (who forked what, pull requests)
-- in a database accessible by an API

your account details and activity -- in a database, accessible
by you via an API

list of all projects and users -- in a closed database (AFAIK)

The two that really stand out are the issues and notes not being stored in
git. This means that, if a project uses github, it gets locked into github
to a degree. The records of bugs and features, all the planning and
communication, are locked away in a database where they cannot be cloned,
and where every developer is not a backup.

Github's intent here is not to control this data to lock you in (to the
extent they want to lock you in, they do that by providing a proprietary UI
that people rave about); it was probably only expedient to use some sort of
database, rather than git, when implementing these features.

They should automatically produce git repository branches containing a
project's issues and notes, based on the contents of their database.
(For notes, git notes is the obviously right storage location.)
Along with ensuring every developer checkout is a backup, this would
allow accessing that data while offline, which is one of the reasons
we use distributed version control.
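
A sketch of how that could work, even done from the outside via their existing API (the repository name and commit id are hypothetical, and a real tool would need to paginate and fetch comments too):

    # snapshot a project's issues onto a dedicated branch
    curl -s https://api.github.com/repos/someuser/someproject/issues \
        > issues.json
    git checkout --orphan github-issues
    git rm -r --cached .
    git add issues.json
    git commit -m "snapshot of issues from the github API"

    # a comment on a commit maps naturally onto git notes,
    # attached to the commit it was made on
    git notes add -m '(comment text from the API)' 0123abc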

The lack of a global list of projects is problematic in a broader sense.
It means that we can't make a backup of all the (public) repositories in
Github (assuming we had the bandwidth and storage to do it). I
recently backed up all the repositories on Berlios.de, when it looked to be
shutting down; this was only possible because they allowed enumerating them
all.
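
Given a way to enumerate, the mirroring itself is nearly trivial; a sketch (repo-list.txt, one clone URL per line, standing in for whatever enumeration a site allows):

    # mirror every repository in the list, history and all
    while read url; do
        git clone --mirror "$url"
    done < repo-list.txt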

People at The Internet Archive say that their archival coverage of free
software is actually quite bad. We trust our version control systems
to save our free software data, but while this works individually, it
will result in data loss globally over time. I'd encourage Github to help
the Internet Archive improve their collections by donating periodic snapshots
of their public git repositories to the Archive. You're located in the same
city, 5 miles apart; they have lots of hard drives (though fewer than usual
right now, during the shortage); this should be pretty easy to do.

Full disclosure: Github has bought me dinner and seemed like
stand-up guys to me.


Just watched the whole Douglas Engelbart demo from
1968.
Somehow I'd only heard of this as the first demo of the computer mouse,
and only seen a brief clip on youtube. All three 30-minute reels of
the film are available online,
and well worth a watch in full.

The mouse is the least of it; the demo includes an outlining text editor,
model-view-controller, hypertext, wiki, domain-specific programming
languages, a precursor to email, bug tracking, version control(?), and a
chorded keyboard. (Ok, that last one didn't really take off.) Probably a
dozen other things I've forgotten. All in a single interface, and all
before I was born.

Just like any tech demo, there are fumbles and mistakes, which is
reassuring to anyone who has tried to give a tech demo.

There's also the awesome crazy hack shown here.
They could only afford these
tiny, round CRTs, so they pointed a television camera at each one, and the
camera image was piped to their television console. (So add KVM switch to
the list of firsts!) The demo
was done in San Francisco, with the computer system remote in Palo Alto, so
in this image you see the text on the CRT overlaid with the video from the
camera.

Engelbart points out that the delay this added to the system acts as a
short-term memory that filtered out flicker in the original display (and
made the mouse have a mouse trail). To me it gives the whole demo a unique
quality, as if it were underwater.

Despite the piping around of audio and video signals, and the multiuser
system, the glaring thing missing from the demo that we have these days is
networking. Although there is this amusing bit at the end where they compile
a regular expression and then apply it, in order to search for documents
containing certain terms, and end up with a hyperlinked list of 10 results,
ordered by relevance. Yes, think Google.

First thought is this: a bug's likelihood of ever being fixed decays with
time, starting when I first read it. If I have to read it a second time,
the bug has already become more complex, since something prevented me from
just fixing it the first time. If more information has to be added to the
bug, that makes it yet more complex. If there is an argument in the bug
about whether it is a bug, or how to fix it, just revisiting the bug at a
later date can become more expensive than it's worth. Much of what is
involved in filing good and effective bug reports follows as obvious
corollaries of this. It also follows that it's best to either fix a bug,
or at least plan how to fix it, immediately upon reading it.

Second thought is about "wontfix". A bug submitter and the developer
responsible for the bug see this state in very different ways, but the name
hides what it really means, which is that there is a meta-bug affecting
either the bug submitter, the developer, or both. Once you realize this,
wontfix bugs, from either side, become a bit personally insulting. They
also quickly decay to uselessness (see first thought), and then just lurk
there wasting the developer's time in various ways. Bug tracking systems
should not provide a "wontfix" state; if they want to track meta-bugs
they should provide a way to reassign such a bug to some other party who
can actually resolve such a meta-bug.

I attended the GitTogether earlier this week. I was tentative about this,
since I'm not really much of a git developer; all my git work is building
stuff on top of it. It turned out great though.

At first it seemed like one of those parties where you don't know anyone.
But then I got to reconnect with Avery Pennarun for the first time since
DebConf 2, and got to know Jonathan Nieder better, and it was also nice
to see Jelmer Vernooij. And the core developers were also very welcoming.
Junio Hamano knew of my work (and I am in awe of his), and Jeff King
thinks my take on SHA1 security issues has value, and has been
expanding on it. Shawn Pearce managed the unconference subtly and well.
Lots of very smart people. At one point I found myself across the table
from Android's lead developer.

I was very happy that everything I think needs improvement in git was
discussed during the unconference:

big files: My postit suggesting this got more checks than almost
anything else, and I briefly presented git-annex at the start of a
session on general scalability -- on its 1-year anniversary. Some ideas
for improved hooks that git-annex and other tools could use are taking shape.
Better scalability to lots of files, and more efficient index files, were
also discussed.

git as a filesystem: There was a consensus that gone are the days
when git was just about managing source code. (I remember being told on #git,
before I wrote etckeeper, that no, git should not be used for that.)

submodules: I was astounded that they're now considering supporting
"floating" submodules, which would track the head of a branch, rather
than the specific rev committed in the superproject (see the sketch after
this list). Many other problems
that have kept me from ever trying submodules are also being worked on.
This seems unlikely to replace mr, but who knows -- at least getting
rid of repo
is a goal.

SHA1 security was discussed for quite a long while, long enough that I
felt a bit guilty for bringing it up, but it was an interesting and
fruitful discussion. I went in thinking
that the checksum basically has to be parameterized, but they have some
good reasons not to do that, and some other good ideas, although what
to do and when best to do it is still open for discussion. Signed
commits are certainly coming soon. Also this
amazing patch was developed.

Metadata storage was briefly discussed, but nobody seemed sure how
to deal with it. Ideas floated included a metastore like tool that
uses mergeable files, or storing metadata in some sort of notes-like
separate branch.
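
As a footnote to the submodules item above, the floating behavior under
discussion might be configured something like this in .gitmodules (purely a
sketch; the actual syntax was still being decided):

    # a "floating" submodule tracking the tip of a branch, rather
    # than a rev recorded in the superproject
    [submodule "lib"]
        path = lib
        url = git://example.com/lib.git
        branch = master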

Visiting California this week and having a great time. Experienced my first
earthquake; visited the Noisebridge hackspace with Seth and Mako; and
yesterday went up to Point Reyes and flew a kite from cliffs over Drake's
Bay.