Sat, 09 Sep 2017

Restic Systems Backup Setup, Part 1

This is the first in what will undoubtedly be a series of posts
on the new restic-based
system backup setup.

As I detailed
earlier this week, I've started playing around with using restic
for backups. Traditionally, I've used a variant of the venerable
rsync snapshots
method to back up systems, wrapped in some Python and make, of
all things. Some slightly younger scripts slurp everything down
to a machine at home so I've got at least another copy of everything.

In my previous post, I discussed my initial attempt at restic,
simply replicating that home backup destination into
Backblaze B2. That
works, but it feels a bit brute-force, and there have been other
things I've wanted to change about this for a while:

Replicating from colo to home takes an order of
magnitude longer: Backing up the ten or so VMs
I have on my colo machine takes about 10 minutes. Pulling
that down to home takes 100 minutes or so. (I'll note here
that the bulk of my 'large' data is in AFS; what I'm backing
up on systems is primarily configuration files, logs, and
some things that happen to live locally on a system).

Some of this is due to the fact that the replication traffic goes
from Michigan to New York, while the initial backups are all
happening within the same physical host. But the larger part,
I think, is due to the fact that in order to replicate my system
backups, I have to preserve hardlinks. A bit of background
here: the 'rsync snapshots' method works by using the
--link-dest option to rsync. As I back up a system,
if the file hasn't been changed, rsync makes a hardlink to the
corresponding file in the --link-dest directory. This
doesn't use any additional space, and it's an easy way of keeping,
say, fourteen days' worth of backups while only using more space
for the files that change from day to day. Most of my systems
keep that many days of backups around.
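
To make the hardlink trick concrete, here is a minimal Python sketch of the same idea. It is not what rsync does internally, and the function name snapshot and its arguments are made up for illustration. It also compares file contents, where rsync by default compares size and modification time, and it ignores symlinks, ownership, and deletions.

```python
import filecmp
import os
import shutil


def snapshot(source, dest, link_dest=None):
    """Copy source into dest, hardlinking files that are unchanged in link_dest."""
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(dest, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.normpath(os.path.join(dest, rel, name))
            prev = os.path.join(link_dest, rel, name) if link_dest else None
            if prev and os.path.isfile(prev) and filecmp.cmp(src, prev, shallow=False):
                # Unchanged since the previous snapshot: a second name for
                # the same inode, so no additional data blocks are used.
                os.link(prev, dst)
            else:
                # New or changed: this snapshot needs its own copy.
                shutil.copy2(src, dst)
```

Calling snapshot(src, "day2", link_dest="day1") after an earlier snapshot(src, "day1") leaves unchanged files in day2 sharing an inode with their day1 counterparts.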

Since I want to replicate all of those backups (and not, for
example, only replicate the latest day's worth of backups),
but I want to keep the space savings that --link-dest
gets me, I need to use the -H argument to the replicating
rsync so it can scan all the files to be sent to find multiply hard-linked
files. This takes a long, long time, so much so that the
rsync man page warns about it:

Note that -a does not preserve hardlinks, because finding
multiply-linked files is expensive. You must separately specify -H.
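
A rough way to see why this is expensive: to find multiply-linked files, something has to stat every file and remember each inode it has seen. The Python sketch below illustrates that bookkeeping only. It is not rsync's implementation, and the function name hardlink_groups is hypothetical.

```python
import os
from collections import defaultdict


def hardlink_groups(top):
    """Return lists of paths under top that are hardlinks of the same file."""
    by_inode = defaultdict(list)
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)  # one stat call per file in the tree
            if st.st_nlink > 1:
                # (device, inode) identifies the underlying file.
                by_inode[(st.st_dev, st.st_ino)].append(path)
    return [sorted(paths) for paths in by_inode.values() if len(paths) > 1]
```

With fourteen snapshot directories of mostly unchanged files, nearly every path lands in that table, so the scan grows with the total number of names and not with the amount of changed data.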

The backing-up or replicating rsync must run as root:
Of course the rsync on the machine being backed up must run as root,
since it needs to be able to read everything to be backed up. But the
destination side also has to run as root, because I want to preserve
permissions and ownership, and only root can do this. I've long wished
for an rsync 'server' that spoke the rsync protocol out one side and
simply stored everything in some sort of object storage. Unfortunately,
the rsync protocol is less a protocol and more akin to C structs
shoved over a network, as far as I understand. And the protocol isn't
really defined except as "here's some C that makes it go".

Restoring files is done entirely on the backup server:
Because of the previous issue, I didn't want root on the client servers
to ssh in as root on the backup server — I felt it was much safer
and easier to isolate backups by having the backup server reach out to
do backups. There's no ssh key on the client to even be able to get into
the backup server. It's not a big issue, but if I need to restore a
handful of files spread out, I've kinda got to stage them somewhere and then
get them over to the client system. And because the backup server has
a command-restricted ssh key on the client server, it takes some convoluted
paths to get stuff moved around.

Adding additional replicas adds even more suck: Adding
another replica means another 100 minutes somewhere pulling stuff down.
And it also means a full-blown server, someplace where I can run rsync
as root, and it's got to be some place I trust. Also, most of the
really cheap storage to be found is in object storage, not disks (real
or virtual) — part of what attracted me to restic in the first
place.

When I started playing with restic, I saw a tool that could solve a
bunch of those problems. Today I've been playing around with it, and
here are my ideas so far.

Distinct restic repositories: One of the benefits of restic
is the inherent deduplication it does within a repo. And if I were backing
up a large number of systems, I might save something by only having one copy of,
say, /etc/resolv.conf. But really, most of what I'm backing up
is either small configuration files or log files. And these days, the few
tens of gigabytes of backups I have there aren't really worth deduplicating
across systems. That said, the largest consumer of backup space for me (stupidly
unrotated log files that get a little bit appended to them every day)
will still benefit from the deduplication, even if it's only deduplicating
within a single system.

More important than that, however, is that I want isolation between my systems.
For example, the backups of my kerberos kdc are way more important than, say,
web server logs. And I really don't want something that runs on a
public-facing system to be able to see backups for an internal system. So, distinct
repositories.

Use minio as the backend: My first thought when I was going
to experiment was to use the sftp backend to restic. But to isolate things
fully, I'd have to make a distinct user on the backup server to hold backups
for each client, and that sounds like too damn much work.

Unrelated, I've been playing around with minio.
Essentially, it's about the simplest thing you can get that exposes the 90% of
S3 that you want. "Here's an ID and a KEY, list blobs, store blobs, get blobs,
delete blobs". Because it's very simple, it doesn't offer multi-tenancy, so
I will have to run a distinct minio for each client. That said, I think that
should be easy enough, especially if I use something like
runit to manage all of them.
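
As a sketch of what "a distinct minio for each client" might look like under runit, the Python below generates one run script per client. Everything specific here is an assumption: the /srv/minio storage root, the port numbers, and the function name are invented, and the MINIO_ACCESS_KEY and MINIO_SECRET_KEY variable names are the ones minio used at the time of writing, so check the documentation for the version you run.

```python
import shlex


def minio_run_script(client, port, access_key, secret_key,
                     storage_root="/srv/minio"):
    """Build the text of a runit 'run' script for one client's minio."""
    storage = f"{storage_root}/{client}"  # hypothetical per-client directory
    lines = [
        "#!/bin/sh",
        # Each instance gets its own credentials, so one client cannot
        # read another client's repository.
        f"export MINIO_ACCESS_KEY={shlex.quote(access_key)}",
        f"export MINIO_SECRET_KEY={shlex.quote(secret_key)}",
        # Each instance listens on its own port and serves its own directory.
        f"exec minio server --address :{port} {shlex.quote(storage)}",
    ]
    return "\n".join(lines) + "\n"
```

Writing the returned text to a run file in each service directory (one directory per client) would let runit supervise them all. Allocating ports and generating keys is left out.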

Benefit from the combination of minio and restic for replication:
Minio is very simplistic in how it stores objects: some/key/name is
stored as the file /top/of/minio/storage/some/key/name. This has two
benefits: first, because the minio storage directory is also a restic
repository, I can just point a restic client at that directory, and as long
as I have a repository password, I can see stuff there. Second, every file in
the restic repository other than the top level 'config' file is named after
the sha256 hash of the file as it exists on disk, and all files in a repository
are immutable. This makes it trivial to copy a restic repository elsewhere.
While I'll likely start by simply using the b2 command line tool to
sync things into B2, I think you can do it even faster. I haven't looked deeply,
but my gut feeling is that the b2 sync command looks at the sha1
hash of the source file to decide if it needs to re-upload a file that exists
already in B2. We don't need to do that at all; repository files are named after
their sha256 hash, so if the files have the same name, they have the same
contents [0]. So moving stuff around
is incredibly trivial.
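
Here is a minimal Python sketch of that name-only sync, with a second local directory standing in for the remote side. It is an illustration under assumptions, not a B2 client: the function name sync_by_name is made up, it never deletes anything (so it does not follow a prune), and it relies on repository files being written once and never modified.

```python
import os
import shutil


def sync_by_name(src_repo, dst_repo):
    """Copy files missing from dst_repo; never hash or re-read existing ones."""
    copied = []
    for root, _dirs, files in os.walk(src_repo):
        rel = os.path.relpath(root, src_repo)
        for name in files:
            rel_path = os.path.normpath(os.path.join(rel, name))
            dst = os.path.join(dst_repo, rel_path)
            if os.path.exists(dst):
                # Same name means same contents, so there is nothing to check.
                continue
            os.makedirs(os.path.dirname(dst), exist_ok=True)
            tmp = dst + ".partial"
            shutil.copyfile(os.path.join(root, name), tmp)
            os.replace(tmp, dst)  # rename into place once the copy is complete
            copied.append(rel_path)
    return sorted(copied)
```

A second run over an unchanged repository copies nothing and only has to list names on both sides.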

Future niceties. I've got a bunch of other ideas floating around
in the back of my head for restic. One is a repository auditing tool: since nearly
everything in restic is named for the sha256 hash of the file content, I'd like a
tool I could run every day that would pull down, say, 1/30th of the files in the
repository and run sha256 on them, to make sure there's no damage.
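
A rough Python sketch of that auditing idea, run against a repository directory on local disk. The function name audit_slice and the way files are assigned to days (the hash taken modulo 30) are my own assumptions for illustration; files whose names are not 64 hex characters, such as the top-level config, are skipped.

```python
import hashlib
import os


def audit_slice(repo, day, slices=30):
    """Re-hash roughly 1/slices of the repository; return paths that mismatch."""
    damaged = []
    for root, _dirs, files in os.walk(repo):
        for name in files:
            if len(name) != 64:
                continue  # not a sha256-named file (e.g. 'config')
            try:
                bucket = int(name, 16) % slices
            except ValueError:
                continue  # 64 characters, but not hexadecimal
            if bucket != day % slices:
                continue  # this file belongs to another day's slice
            digest = hashlib.sha256()
            path = os.path.join(root, name)
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            if digest.hexdigest() != name:
                damaged.append(path)
    return sorted(damaged)
```

Passing the day of the month as day means every file gets checked about once every 30 runs, since a file's slice is fixed by its name.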

The second is some way of keeping a local cache of the restic metadata so operations
that have to read all of it are much faster. Third, and related, a smarter tool for
syncing repositories. For example, I'd love to, say, keep three days of backups in
my local repository, shove new things to an S3 repository but keep
seven days there, and shove things into B2 and keep them there until my monthly bill finally
makes me care.

Anyways, this has been a brain dump of a few hours of experimentation, so
I'll end this part here.