Temporary exile

Pleased to say that Continuity Control has
gotten some rather positive feedback while presenting at this
year’s Finovate! While not the only
feedback, here are some of my favorites:
Tweets: one - two - three - four - five
Photos: one - two
Truly proud to be part of the team.

While getting into Bundler
(which is great, btw), I recalled how often I have longed for a way
to do something similar in Debian. What I’d really like to be able
to do is something like aptitude backup and get some sort of
backup file containing all of the packages I had asked to be
installed, so that I could install a new base system, run
aptitude restore, and have those packages be installed. Obviously,
this would pull in the required dependencies, which is what we
want anyway. I’ve seen only a few attempts at this and most are
pretty hairy. Is there a good way to do this currently? Are there
plans for this in the future? Is this an irrational desire?
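The closest thing I know of is dumping dpkg’s selection list, though
it captures every installed package rather than just the ones I
explicitly asked for (filenames here are just for illustration):

    # save the current selections, dependencies and all
    dpkg --get-selections > packages.backup
    # list only packages installed by hand (aptitude marks the rest as automatic)
    aptitude search '~i !~M' -F '%p' > manual-packages.txt

    # on the freshly installed system
    dpkg --set-selections < packages.backup
    apt-get dselect-upgrade

Still not the one-command aptitude backup I am wishing for, but it
gets most of the way there.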

Being a Debian user, I don’t make a habit of compiling things by
hand and on those occasions when I do need to do so, I usually use
apt-build (good article, though a bit old
here).
However, today I had to get a particularly odd shared library with
a very specific version to match a production environment we have
at work. So, I downloaded the tarball, which was quite large, and
then compiled it, which took four and a half minutes on my
dual-core (1.83GHz) ThinkPad running make -j 4. Not all that slow,
really. Once it was done, I looked around (grepped around) for the
output files and was dismayed to discover that there were no shared
libraries at all. After tinkering for about an hour, I finally
figured out that you can request that shared libraries be built by
passing an option to the configure script:
./configure --enable-shared. Shortly thereafter, I had the shared
library version I needed and was on the road again. Just a useful
tip which I hope can be of help to someone.
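For the record, the whole dance was roughly this (assuming an
autotools/libtool-style build, where the freshly built libraries
tend to land in .libs/ directories):

    ./configure --enable-shared
    make -j 4
    # hunt down the resulting shared objects
    find . -name '*.so*'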

My GPG public key expired a few months ago, so I made a new one
and set this one not to expire. Backed up in many places this one
is, so I’ll be excited to get it signed at the upcoming
SCOSUG key signing party. You can get it
right here on this page or
download it right here.
Also, this post was written on my Android phone (which I love)
using the Wordpress app.
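For reference, the key dance itself is only a couple of commands
(the email address below is just a placeholder):

    # generate a new key; answer "0" at the expiry prompt for a key that never expires
    gpg --gen-key
    # export the public key in ASCII armor for posting or uploading
    gpg --armor --export you@example.com > public-key.asc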

So, I know that the Microformats
project has had varying degrees of success in their endeavor to
embed data in HTML such that it does not violate web standards. As
John Resig pointed out,
others have used things like XML namespacing in XHTML to achieve
similar goals. The most notable uses of this technique are probably
to be found in the
applications of
RDFa. However, when looking at
the new
unobtrusive JavaScript helpers
in the forthcoming Rails 3, I was tipped off to the huge scope of
the new data- attributes in HTML5. The custom
data- attributes excite me. In HTML5, any arbitrary attribute may
be included in any element provided that it is prefixed with data-
and doesn’t interfere with the rest of the standard. Anything. So I
can write something like
<p data-mood="snowed-in">cabin fever</p> and have it be perfectly
valid.

I spent some time with a few colleagues yesterday and came to the
conclusion that true enthusiasts gravitate toward talking shop with
one another. This is true even of casual situations. I’ve had
plenty of “getting to know you” and “let’s just hang” social
situations with other geeks where we might start out talking about
any old thing but end up having an enthusiastic exchange about
computers. I don’t view this as negative. On the other hand,
plenty of mailing lists and other communication media devoted to
discussions of specific topics end up going “off-topic” as people
socialize more generally. Fine, so it goes both ways. It’s good to
love what you love and even better to share it.

When stuck inside due to all the snow, there is no better time to
consider the topic of backup. Seriously, backup is important and
the issue is fascinating. The area of backup brings together so
many topics in computing. Think about it! To do backup
successfully, you must deal with data transfer, data integrity
validation, networks, distributed systems, compression and, if
you’re doing it right, cryptography. It is not just computer
science however, but also workflow management and that
somewhat-nebulous-yet-often-referred-to thing of systems thinking.

I got to thinking about the topic of backups and was curious as to
the state of research as well as backup tools. A great intro to
so-called “backup theory” is available on the
‘Backup’ Wikipedia article
and
others have written
on the subject
(Google will verify
that this is true).

As it turns out, advances in storage have
recently offered many new opportunities for improving the way that
a given backup process might work. Several distributed
fault-tolerant filesystems are coming along quite nicely. Though
the Google File System
and the equivalent HDFS project
which is part of Apache’s Hadoop have
gotten much attention, there are other options.
GlusterFS (follow
GlusterFS on Twitter, perhaps?) and
Ceph are two excellent examples of
Free/Open Source Software projects which offer the compelling combo
of fault tolerance and distributed storage. They each employ
replication and have similar architectures insofar as they abstract
individual machines into “chunks” or “blocks” and manage
replication automatically. One interesting difference is that while
GlusterFS exports filesystems “as-is” (see the docs for an
explanation), Ceph exports entire block devices.
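Just to make the GlusterFS side of this concrete, standing up a
small three-way replicated volume might look something like the
following, assuming a release that ships the gluster command-line
tool (hostnames and paths here are invented):

    # on each storage node, set aside a directory to serve as a brick
    mkdir -p /export/brick1
    # from any one node, create a volume replicated across three machines
    gluster volume create backups replica 3 \
        node1:/export/brick1 node2:/export/brick1 node3:/export/brick1
    gluster volume start backups
    # clients then mount it like any other filesystem
    mount -t glusterfs node1:/backups /mnt/backups

Lose a node and the volume keeps serving files from the remaining
replicas.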
So then what about
the theory and goals of backup? In my opinion, backing up is not
enough. Having a backup plan and executing it perfectly doesn’t
mean a thing if the data can’t be recovered. (In fact this issue
relates strongly to a larger discussion of reliability. Rather than
focusing exclusively on total time spent in a failure state,
individuals concerned with reliability would also do well to
consider how fast a system can recover from those failures. If a
system can be up and running again after only 5 minutes, then that
system can go down 12 times before reaching an hour of downtime. If
another system takes 20 minutes to rebound from a failure, then
that system can only go down 3 times!)

I haven’t yet gotten to
thinking about the problem of restoration following failure for
anything other than plain old files. For example, I back up my
personal data to servers in Philadelphia and California in addition
to an external hard drive in my apartment. I make careful use of
old standbys like
tar,
gzip and
rsync along with a checksum
utility like md5sum or, more
recently, something in the
SHA family. Plus I use the
git revision control system for code and
etckeeper for config. By
backing up all of my configuration data in addition to my personal
files, I make it so that I can easily return a given system to a
usable state, if not restoring it perfectly. I have successfully
restored systems by doing little more than a reverse rsync.
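The general shape of the routine is nothing fancy (hostnames and
paths here are made up):

    # snapshot /etc with etckeeper (a thin wrapper around git)
    etckeeper commit "nightly snapshot"
    # bundle up the home directory and record a checksum
    tar czf home-backup.tar.gz /home/me
    sha256sum home-backup.tar.gz > home-backup.tar.gz.sha256
    # ship it to the off-site machines and the external drive
    rsync -avz home-backup.tar.gz* me@philly-server:/backups/
    rsync -avz home-backup.tar.gz* me@california-box:/backups/
    rsync -avz home-backup.tar.gz* /media/external-drive/backups/
    # restoring really is just the reverse rsync
    rsync -avz me@philly-server:/backups/home-backup.tar.gz .

Unglamorous, but it works.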
To be fair, I just spoke about the few systems under my personal
control which constitute a small and limited case. I back up to
different locations which is good practice but my local copy is
certainly not sufficient for anything industrial. If the disk
breaks, I’m out of luck.

Considering a larger-scale system is when
GlusterFS, Ceph and others would come in handy. Obviously there are
a great many books written about
this topic but, for the purposes of
discussion, if I were to build a platform for the reliable storage
of huge amounts of data, my project would look something like
this…

First, I would round up spare computers with room for extra
disks. These machines would not need to be particularly fast or
possess large amounts of memory. I’m not sure of the exact hardware
requirements for either GlusterFS
(vague wiki entry)
or Ceph but it’s hard to imagine that they’d require huge amounts
of anything but disk space. Anyway, if my organization had old
desktops or something which were being replaced then they might be
perfect candidates.

Second comes storage. It seems that at the time
of this writing, one can purchase a 1TB hard disk for around US$85.
Let us assume that 20 desktop machines could be procured and each
had two spare disk slots. For around US$3500 (figuring 2 US$85
disks per machine and a little more for tax+shipping, etc.) one
could buy 40TB of raw storage. Now, it’s not quite that simple as
the fault-tolerance scheme in both GlusterFS and Ceph relies on
replication. Assuming the accepted replication factor of 3 (a norm
adhered to by the Google File System), that would cut the 40TB of
raw storage down to a third, leaving around 13 terabytes of usable,
fault-tolerant storage.
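Spelled out, the back-of-the-envelope math looks like this:

    20 machines × 2 spare slots × 1TB = 40TB of raw disk
    40 disks × US$85 = US$3,400 (call it US$3,500 with tax and shipping)
    40TB ÷ 3 (replication factor) ≈ 13TB of usable space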
Filesystems of this type usually require
cluster control processes which (ideally, I think) reside on
dedicated machines so an extra machine or two would also be
required. I got a good explanation of how metadata servers work in
GFS/HDFS by reading the
HDFS design document,
actually. Metadata servers and other control processes serve
similar functions. Ceph documents are very explicit about not
having a single point of failure whereas GlusterFS is not quite so
adamant. I need some help figuring out if GlusterFS is as
fault-tolerant in that respect.

For under US$4000, one could
theoretically build over 13TB of fault-tolerant distributed storage
(provided that spare machines are plentiful, something which
shouldn’t be a problem for organizations with a semi-regular
hardware replacement cycle).

Now, as for the usage of that storage
system for backup, it’s a different piece of the discussion. I’ve
seen quite a few setups where a single-but-very-large server
provides networked storage to a large number of users via something
like samba or NFS. In this case, the big server is usually blessed
with some complicated RAID arrangement which people put entirely
too much faith in. No one ever listens when told that RAID is not a backup
strategy. Personally, I don’t believe in hardware reliability
because it seems silly to spend money trying to prevent hardware
from failing when you know it’s going to break (eventually) anyway.
I’m not saying that RAID isn’t useful because it *is* useful and
has a place. What I am saying is that it’s not backup. So if an
organization has a big network storage server then where does that
get backed up to? It’s hard to back up 6 TB off site but it can
(certainly) be done. However, if an organization is lucky enough to
have multiple buildings then a distributed storage cluster like I
described earlier would be an excellent addition to the overall
backup infrastructure. Putting a few machines in different
buildings and using it as a place to shadow the main file server
and whatever else needs to be backed up would grant an added
measure of security.
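In practice that shadowing could be as dull as a nightly cron job
pointed at the cluster mount (paths here are invented):

    # in the file server's crontab: mirror onto the distributed volume at 2am
    0 2 * * * rsync -a --delete /srv/share/ /mnt/backups/share-shadow/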
Snow days are a good time for thinking and my
backup jobs went smoothly. However, I feel as if I have begun a
track of study which might yield some good results. Granted, it’s
not just about the technology
(humans screw things up) but the
ability to build reliable and fault-tolerant storage systems for
massive amounts of data using only commodity hardware is a huge
boon to users everywhere. This concludes my backup rant.

As has been spoken about endlessly
(OStatic,
OSnews),
there is a great blog post from 0x1fff with many (it started at 35 and is
now many more)
open source projects from Google.
In fact and indeed, there is some cool stuff on there. I knew about
Caja and
Protocol Buffers (wish there
was a JS port of protocol buffers) but did not know about
CRUSH and
skia. Honestly, there are plenty
of cool projects out there and my already-positive opinion of
Google is only bolstered by the fact that they give back so
willingly. Gotta love it.