Thursday, January 29

raspberry pi kernels

Wednesday, January 28

on replicating process

Ok, so here we are. It’s 2015. The gold standard for explaining how you
solved a technical problem to the internet at large is a blog post with things
you can copy and paste or maybe some pictures.

If you’re really lucky, someone actually has a reusable public repository of
some kind. If you’re really lucky, their code works, and if all the gods
are smiling on you at once, their code is documented.

It seems to me that we can do better than this. We possess a great many of the
right tools to do better than this, at least for a lot of common problems.
What does it take to make a given workflow both repeatable and legible to
people without the context we have for a given thing (including ourselves)?
Writing about it is surely desirable, but how do you approach a problem so
that, instead of being scattered across your short term memory and a dozen
volatile buffers, your work becomes a kind of document unto itself?

This is the (beautiful) root of what version control does, after all: It
renders a normally-invisible process legible, and in its newfound legibility,
at least a little susceptible to transmission and reuse.

What do I know works well for transmitting process and discovery, as far as it
goes?

version control (so really git, which is severally horrible but also
brilliant and wins anyway)

Makefiles (except that I don’t understand make at all)

shell scripts (except that shell programming is an utter nightmare)

Debian packages (which are more or less compounded of the above, and
moderately torturous to build)

IRC, if you keep logs, because it’s amazing how much knowledge is most purely
conveyed in the medium of internet chat

Stackoverflow & friends (I hate this, but there it is, it’s a fact, we have to
deal with it no matter how much we hate process jockeys, just like Wikipedia)

screenshots and screencasts (a pain to make, all-too-often free of context, and
yet)

Here are some things that I think are often terrible at this stuff despite
their ubiquity:

web forums like phpBB and stuff (so bad, so ubiquitous, so going to show up
in your google results with the hint you desperately needed, but only if you’re
smart enough to parse it out of the spew)

Here’s one problem: There are a lot of tools that are relatively painless once
you know them, like “let’s just make this a dvcs repo because it’s basically free”,
that if you know they exist and you really want to avoid future suffering you
just get in the habit of using by default. But most people don’t know these
tools exist, or that they’re generally applicable tools and not just
specialist things you might use for the one important thing at your job because
somebody said you should.

debian packaging again

https://www.debian.org/doc/manuals/packaging-tutorial/packaging-tutorial.en.pdf
(actually more helpful than anything else I’ve found so far)

vagrant

Vagrant is a thing for quickly provisioning / tearing down / connecting to
virtual machines. It wraps VirtualBox, among other providers. I think the
basic appeal is that you get cheap, more-or-less disposable environments with a
couple of commands, and there’s scaffolding for simple scripts to configure a
box when it’s brought up, or share directories with the host filesystem. It’s
really lightweight to try out.

Go to the downloads page and install from
there. I used the 64 bit Ubuntu .deb.
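
Once it’s installed, the whole loop for standing up and throwing away a box is
roughly this (the box name is just an example from the public catalog):

$ vagrant init hashicorp/precise64    # writes a Vagrantfile to the current directory
$ vagrant up                          # downloads the box if needed and boots it
$ vagrant ssh                         # gets you a shell inside the VM
$ vagrant destroy                     # throws the machine away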

People around me have been enthusing about this kind of thing for ages, but I
haven’t really gotten around to figuring out why I should care until recently.
I will probably be using this tool for a lot of development tasks.

armhf

During diagnosis, the question becomes, how can I determine whether my Linux
distribution is based on armel or armhf? Turns out this is not as
straightforward as one might think. Aside from experience and anecdotal
evidence, one possible way to ascertain whether you’re running on armel or
armhf is to run the following obscure command:

$ readelf -A /proc/self/exe | grep Tag_ABI_VFP_args

If the Tag_ABI_VFP_args tag is found, then you’re running on an armhf system.
If nothing is returned, then it’s armel. To show you an example, here’s what
happens on a Raspberry Pi running the Raspbian distribution:
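
On Raspbian, which is armhf, you should get back a single line along these
lines:

  Tag_ABI_VFP_args: VFP registers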

It seems like there may be ways to conditionalize this, but at this point I’m
tempted to just pull some simple templating system into my dotfile
stuff and generate a subset of config files on a per-host basis.

Thursday, January 22

deleting files from git history

Working on a project where we included some built files that took up a bunch of
space, and decided we should get rid of those. The git repository isn’t public
yet and is only shared by a handful of users, so it seemed worth thinking about
rewriting the history a bit.

There’s reasonably good documentation for this in the usual places if you look,
but I ran into some trouble.

First, what seemed to work: David Underhill has a good short script from
back in 2009 for using git filter-branch to eliminate particular files from
history:

I recently had a need to rewrite a git repository’s history. This isn’t
generally a very good idea, though it is useful if your repository contains
files it should not (such as unneeded large binary files or copyrighted
material). I also am using it because I had a branch where I only wanted to
merge a subset of files back into master (though there are probably better
ways of doing this). Anyway, it is not very hard to rewrite history thanks to
the excellent git-filter-branch tool which comes with git.

I’ll reproduce the script here, in the not-unlikely event that his writeup goes
away:

#!/bin/bash
set -o errexit
# Author: David Underhill
# Script to permanently delete files/folders from your git repository. To use
# it, cd to your repository's root and then run the script with a list of paths
# you want to delete, e.g., git-delete-history path1 path2
if [ $# -eq 0 ]; then
    exit 0
fi
# make sure we're at the root of git repo
if [ ! -d .git ]; then
    echo "Error: must run this script from the root of a git repository"
    exit 1
fi
# remove all paths passed as arguments from the history of the repo
files=$@
git filter-branch --index-filter "git rm -rf --cached --ignore-unmatch $files" HEAD
# remove the temporary history git-filter-branch otherwise leaves behind for a long time
rm -rf .git/refs/original/ && git reflog expire --all && git gc --aggressive --prune
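
Usage is just the script plus whatever paths you want gone, something like
this (path names here are hypothetical):

$ ./git-delete-history build/huge-binary.tar.gz vendor/some-blob.bin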

A big thank you to Mr. Underhill for documenting this one. filter-branch
seems really powerful, and not as brain-hurting as some things in git land.
The docs are currently pretty good, and worth a read if you’re trying to
solve this problem.

Lets you rewrite Git revision history by rewriting the branches mentioned in
the <rev-list options>, applying custom filters on each revision. Those
filters can modify each tree (e.g. removing a file or running a perl rewrite
on all files) or information about each commit. Otherwise, all information
(including original commit times or merge information) will be preserved.

After this, things got muddier. The script seemed to work fine, and after
running it I was able to see all the history I expected, minus some troublesome
files. (A version with --prune-empty added to the git filter-branch
invocation got rid of some empty commits.) But then:

That second repo is a clone of the original with the script run against it.
Why is it only tens of megabytes smaller, when minus the big binaries I zapped,
it should come in somewhere under 10 megs?

I will spare you, dear reader, the contortions I went through arriving at a
solution for this, partially because I don’t have the energy left to
reconstruct them from the tattered history of my googling over the last few
hours. What I figured out was that for some reason, a bunch of blobs were
persisting in a pack file, despite not being referenced by any commits, and no
matter what I couldn’t get git gc or git repack to zap them.

…where cat-file is a bit of a Swiss army knife for looking at objects, with
-s meaning “tell me a size”.
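
Something like the following, where the long hex id is a stand-in for whichever
blob you’re curious about, and the verify-pack line is the usual trick for
spotting the biggest objects in a pack:

$ git verify-pack -v .git/objects/pack/pack-*.idx | sort -k 3 -n | tail -5   # largest objects last
$ git cat-file -s 0123456789abcdef0123456789abcdef01234567                   # size of one object in bytes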

(An aside: If you are writing software that outputs a size in bytes, blocks,
etc., and you do not provide a “human readable” option to display this in
comprehensible units, the innumerate among us quietly hate your guts. This is
perhaps unjust of us, but I’m just trying to communicate my experience here.)

Also somewhere in there I learned how to use git bisect (which is
really cool and likely something I will use again) and went through and made
entirely certain there was nothing in the history with a bunch of big files
in it.
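
The basic loop is roughly this, with the known-good commit and the test script
as placeholders:

$ git bisect start
$ git bisect bad HEAD
$ git bisect good abc1234          # some commit known to be okay
$ git bisect run ./check.sh        # nonzero exit marks a commit bad
$ git bisect reset                 # return to where you started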

So eventually I got to thinking ok, there’s something here that is keeping
these objects from getting expired or pruned or garbage collected or whatever,
so how about doing a clone that just copies the stuff in the commits that still
exist at this point. Which brings us to:

--local
-l
When the repository to clone from is on a local machine, this flag
bypasses the normal "Git aware" transport mechanism and clones the
repository by making a copy of HEAD and everything under objects and
refs directories. The files under .git/objects/ directory are
hardlinked to save space when possible.
If the repository is specified as a local path (e.g., /path/to/repo),
this is the default, and --local is essentially a no-op. If the
repository is specified as a URL, then this flag is ignored (and we
never use the local optimizations). Specifying --no-local will override
the default when /path/to/repo is given, using the regular Git
transport instead.

And --single-branch:

--[no-]single-branch
Clone only the history leading to the tip of a single branch, either
specified by the --branch option or the primary branch remote’s HEAD
points at. When creating a shallow clone with the --depth option, this
is the default, unless --no-single-branch is given to fetch the
histories near the tips of all branches. Further fetches into the
resulting repository will only update the remote-tracking branch for
the branch this option was used for the initial cloning. If the HEAD at
the remote did not point at any branch when --single-branch clone was
made, no remote-tracking branch is created.
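
Something along the lines of (paths are stand-ins):

$ git clone --no-local --single-branch /path/to/filtered-repo fresh-clone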

I have no idea why --no-local by itself reduced the size but didn’t really do
the job.

It’s possible the lingering blobs would have been garbage collected
eventually, and at any rate it seems likely that in pushing them to a remote
repository I would have bypassed whatever lazy local file copy operation was
causing everything to persist on cloning, thus rendering all this
head-scratching entirely pointless, but then who knows. At least I understand
git file structure a little better than I did before.

For good measure, I just remembered how old much of the software on this
machine is, and I feel like kind of an ass:

This is totally an old release. If there’s a bug here, maybe it’s fixed by
now. I will not venture a strong opinion as to whether there is a bug. Maybe
this is entirely expected behavior. It is time to drink a beer.

postscript: on finding bugs

The first thing you learn, by way of considerable personal frustration and
embarrassment, goes something like this:

Friday, January 16

Wednesday, January 14, 2015

On making a web page that reminds me of a quality I never fully appreciated in
HyperCard.

So I generally am totally ok with scrolling on web pages. I think in
fact it’s a major advantage of the form.

Then again, I just got to indulging a few minutes of thinking about
HyperCard, and I think that this time rather than read the same old
articles about its ultimate doom over and over again, maybe I should do
something by way of recreating part of it that was different from the
web in general.

The web has plenty of stupid carousels and stuff, but despite their example I’m
curious whether HyperCard’s stack model could still hold up as an idea. I was
never sure whether it was the important thing or not. It was so obviously and
almost clumsily a metaphor. (A skeuomorphism which I have never actually
seen anyone bag on when they are playing that game, perhaps because Designer
Ideologues know there’s not much percentage in talking shit about HyperCard.)

I’ll spare you the usual slow-composition narrative of where I go from here,
and jump straight to my eventual first-pass solution.

(Ok, actually I just repurposed a terrible thing I did for some slides a while
back, after recreating about 75% without remembering that I had already written
the same code within the last couple of months. It’s amazing how often that
happens, or I guess it would be amazing if my short term memory weren’t so
thoroughly scrambled from all the evil living I do.)

Tuesday, January 13

rtd / bus schedules / transit data

I’m taking the bus today, so I got to thinking about bus schedules. I use
Google Calendar a little bit (out of habit and convenience more than any
particular love), and I was thinking “why doesn’t my calendar just know the
times of transit routes I use?”

I thought maybe there’d be, say, iCal (CalDAV? What is actually the thing?)
data somewhere for a given RTD schedule, or failing that, maybe JSON or TSV or
something. A cursory search doesn’t turn up much, but I did find these:

Ok, waitasec. What the fuck is going on here? The string 20921 appears
nowhere in these lines. It takes me too long to figure out that the
text files have CRLF line-endings and this is messing with something in
the chain (probably just output from grep, since it’s obviously
finding the string). So:

Why does dos2unix operate in-place on files instead of printing to STDOUT?
It beats me, but I sure am glad I didn’t run it on anything especially
breakable. It does do what you’d expect when piped to, anyway, which is
probably what I should have done.
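
The pipe version is something like this, with the filename standing in for
whichever of the GTFS text files I was grepping:

$ grep 20921 stop_times.txt | dos2unix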

This should probably be where I think oh, right, this is a Google spec—maybe
there’s already some tooling. Failing
that, slurping them into SQLite or something would be a lot less painful. Or
at least using csvkit.
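
The SQLite route would go roughly like this; the filenames are whatever comes
in the GTFS zip, and a newer sqlite3 shell will create the table from the
header row on .import (an older one wants a CREATE TABLE first):

$ sqlite3 rtd.db
sqlite> .mode csv
sqlite> .import stops.txt stops
sqlite> select stop_name from stops where stop_id = '20921';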

Heretic — still a pretty solid game and maybe my favorite iteration of the Doom Engine

Rise of the Triad — there is absolutely no way that ROTT actually
looked as bad as this emulation at the time on baseline hardware, but we’ll let
that slide — the graphics may have been better than they show here, but it
was the Duke Nukem property of its moment, which is to say ultimately a
regressive and not-very-consequential signpost on the way to later
developments

And then I got to thinking about the Adventure Game Toolkit, which was this
sort of declarative, not-really-programmable interpreter for simple adventure
games. The way I remember it, you wrote static descriptions of rooms, objects,
and characters. It was a limited system, and the command interpreter was
pretty terrible, but it was also a lot more approachable than things like TADS
for people who didn’t really know how to program anyway. (Like me at the time.)

I’d like to get AGT running on squiggle.city, just because. It turns out
there’s a portable interpreter called AGiliTY, although maybe not
one that’s well packaged. I’ll probably explore this more.

Wednesday, January 7, 2015

local webservers and static html generation

I haven’t always run an httpd on my main local machine, but I’ve been doing it
again for the last year or two now, and it feels like a major help. I started
by setting up a development copy of display under Apache, then noticed
that it was kind of nice to use it for static files. I’m not sure why it’s any
better than accessing them via the filesystem, except maybe that
localhost/foo is easier to type than file:///home/brennen/something/foo, but
it has definitely made me better at checking things before I publish them.

(Why Apache? Well, it was easier to re-derive the configuration I needed for
p1k3 things under Apache than write it from scratch under nginx, although one
of these days I may make the leap anyway. I don’t see any reason Perl FastCGI
shouldn’t work under nginx. I also still think Apache has its merits, though
most of my domain knowledge has evaporated over the last few years of doing
mainly php-fpm under nginx.)

I’ve resisted the static blog engine thing for a long time now, but lately my
favorite way to write things is a super-minimal Makefile, some files in
Markdown, and a little bit of Perl wrapping Text::Markdown::Discount. I
haven’t yet consolidated all these tools into a single generically reusable
piece of software, but it would probably be easy enough, and I’ll probably go
for it when I start a third book using this approach.

I’d like to be able to define something like a standard book/ dir that would
be to a given text what .git/ is to the working copy of a repo. I suppose
you wouldn’t need much.

book/
    authors
    title
    description
    license
    toc

toc would just be an ordered list of files to include as “chapters” from the
root of the project. You’d just organize it however you liked and optionally
use commands like

book add chapter/index.md after other_chapter/index.md
book move chapter/index.md before other_chapter/index.md

to manage it, though really a text editor should be enough. (Maybe I’m
overthinking this. Maybe there should just be a directory full of chapters
sorted numerically on leading digits or something, but I’ve liked being able to
reorder things in an explicit list.)
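
As a sketch of how little machinery the rendering side might need, assuming the
book/toc layout above and a markdown(1) binary like the one Discount ships:

#!/bin/sh
# hypothetical build step: render each chapter listed in book/toc, in order
> book.html
while read -r chapter; do
    markdown "$chapter" >> book.html
done < book/toc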

I should add a feature to Display.pm for outputting all of its content
statically.

monday, january 5

driving down 36 to see you
i grasp at the scene around me
trying to fix in mind for you
some list or hierarchy
of attributes and aspects:
snow on the hills
snow on the plains
the moon on the snow
sundown on the clouds
the haze over the city lights
electricity vivid and gleaming
within the field of some
greater radiance

Saturday, January 3, 2015

ipv6

I was hanging out on the internet and heard that imt@protocol.club had set up
club6.nl, a tildebox reachable only over ipv6. I applied
for an account and got one (very speedy turnaround,
~imt).

The next problem was how to connect. I am an utter prole when it comes to
networking. The first thing I remembered was that DigitalOcean optionally
supports ipv6 when creating a new droplet, and sure enough they
also have a guide for enabling it on existing droplets.
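
Once the droplet end is sorted out, a quick sanity check from the shell is
something like:

$ ip -6 addr show scope global    # is there a routable v6 address?
$ ping6 -c 3 club6.nl             # can we actually reach the v6-only box?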