New Data Integrity Tools

I’ve recently added a couple of tools to my standard set, and by my own rough estimate my data is at least 4x safer for it.

The process was complicated a bit because I’ve become very sensitive about depending only on FOSS tools (for example, as much as I like SublimeText2, I stopped using it because it once demanded to be updated before it would run), but frankly I think that constraint produced better results than I would have reached without it. Because it was something of a hunt, I’d like to recommend the particular tools I settled on: KeepassX, Attic, and Seafile, described individually below.

KeepassX is a password manager that uses a nice, simple, widely supported encrypted database format for account information. It supports all kinds of fields, including free-form text notes, which are nominally for reset questions and such. I also have a few small pieces of data that aren’t strictly passwords but that I’d like encrypted and easy to get to, and those ride around in entries with only the notes field filled.

There are clients that will run everywhere from command lines to cell phones, and the database is a regular file that can be synchronized by whatever external means you would like.

This has allowed me to make my password generation scheme ridiculously more complex, including just using long pseudorandom strings (automatically generated by KeepassX, with nice options to suit different password fields) for things I don’t care if I can log into from memory. This has not only improved the quality of individual passwords, it has also drastically cut down on my password reuse, which is the major security contribution.
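To make the "long random string" idea concrete, here is a minimal sketch of that kind of generator. It is not how KeepassX itself does it; the function name `generate_password` and its `use_symbols` option are made up for illustration, standing in for the per-field options the KeepassX dialog offers.

```python
import secrets
import string


def generate_password(length=32, use_symbols=True):
    """Return a random password of the given length.

    Hypothetical helper for illustration, not KeepassX's generator.
    Letters and digits are always allowed; punctuation is optional,
    for sites that reject it.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    alphabet = string.ascii_letters + string.digits
    if use_symbols:
        alphabet += string.punctuation
    # secrets draws from the operating system's cryptographic RNG
    return "".join(secrets.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    print(generate_password(40))
    print(generate_password(16, use_symbols=False))
```

Since nothing here has to be memorable, every account can get its own string, which is where the reduction in reuse comes from.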

My backup scheme has been shameful lately, basically consisting of periodic manual synchronization to a large HD in my apartment, and occasional manual extraction of tarballs from the web-server. This had to be fixed, and I spent a good month reading about various tools, all of which did things I didn’t like, until I came upon Attic.
Attic does precisely what I want to an amazing degree: deduplicated storage for incremental backups, AES-encrypted archives with decent password/key-phrase generation, and archives that can be mounted via FUSE for inspection. The deduplication is fast and almost startlingly effective. The command line syntax is generally well designed to be comfortable for both humans and scripting (lots of -`date +%Y-%m-%d` inserted into my commands, because Unix).
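For anyone wondering why deduplication makes incremental backups so cheap, here is a toy sketch of the general idea. It is a simplification and not Attic's implementation: it cuts data into fixed-size chunks, whereas Attic picks chunk boundaries from the content itself, and the `ChunkStore` class and its methods are invented for this example.

```python
import hashlib


class ChunkStore:
    """Toy content-addressed store: identical chunks are kept only once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes

    def add(self, data):
        """Store data and return the list of digests needed to rebuild it."""
        recipe = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # no-op if already stored
            recipe.append(digest)
        return recipe

    def restore(self, recipe):
        """Rebuild the original bytes from a list of digests."""
        return b"".join(self.chunks[digest] for digest in recipe)

    def stored_bytes(self):
        """Total size of the unique chunks actually kept."""
        return sum(len(chunk) for chunk in self.chunks.values())


if __name__ == "__main__":
    store = ChunkStore(chunk_size=4)
    monday = store.add(b"aaaabbbbcccc")
    tuesday = store.add(b"aaaabbbbdddd")  # only the last chunk is new
    print(store.stored_bytes())  # 16 bytes kept for 24 bytes backed up
    assert store.restore(tuesday) == b"aaaabbbbdddd"
```

Each backup is just a list of chunk digests, so a second backup of mostly unchanged data only costs the chunks that actually differ.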

The only minor criticisms I’ve come up with so far are that the exclude syntax is a little twitchy (shell globbing risks) and, as with most such tools, can’t be dry-run; the key/passphrase handling is a little clumsy; and (very minor) it demands a fairly hefty Python3 environment.
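One way to work around the missing dry run is to preview exclude patterns against a list of paths before handing them to the backup tool. The sketch below is only an approximation: it uses Python's `fnmatch` rules, which may not match Attic's exclude semantics exactly, and `preview_excludes` is a hypothetical helper, not part of Attic.

```python
import fnmatch


def preview_excludes(paths, patterns):
    """Split paths into (kept, excluded) using fnmatch-style patterns.

    Assumption: patterns behave like Python's fnmatch, where "*" also
    matches "/" characters. The real tool may differ, so treat the
    result as a sanity check rather than a guarantee.
    """
    kept, excluded = [], []
    for path in paths:
        if any(fnmatch.fnmatchcase(path, pattern) for pattern in patterns):
            excluded.append(path)
        else:
            kept.append(path)
    return kept, excluded


if __name__ == "__main__":
    paths = [
        "home/me/notes.txt",
        "home/me/.cache/thumb.png",
        "home/me/build/a.o",
    ]
    kept, excluded = preview_excludes(paths, ["*/.cache/*", "*.o"])
    print("kept:", kept)
    print("excluded:", excluded)
```

The patterns are handled as plain strings here, which also sidesteps the shell globbing problem: on a real command line they need quoting so the shell doesn't expand them first.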

I’m currently running automated nightlies of my webserver’s entire FS, as well as manual periodic pulls of my homedirs and /etc on my laptops (automating laptop backups always sucks), to a disk housed in the Lab’s machine room on campus. If I decide I want to move them, the archives tolerate being handled as normal files, and the tool merely requires Unix file or SFTP access to the target volume, so any cheap storage that allows SFTP will do as a target.
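A nightly job mostly comes down to building a dated archive name and calling `attic create REPOSITORY::ARCHIVE PATH...`. Here is a rough sketch of that step. The function `attic_create_argv`, the repository string, and the host name are placeholders I made up, and the function only builds the argument list without running anything.

```python
import datetime
import socket


def attic_create_argv(repository, paths, today=None, hostname=None):
    """Build the argument list for a dated `attic create` run.

    Hypothetical helper: the archive is named HOSTNAME-YYYY-MM-DD,
    the same idea as appending -`date +%Y-%m-%d` in a shell script.
    """
    today = today or datetime.date.today()
    hostname = hostname or socket.gethostname()
    archive = f"{repository}::{hostname}-{today:%Y-%m-%d}"
    return ["attic", "create", archive, *paths]


if __name__ == "__main__":
    # Placeholder repository location; substitute a real one.
    argv = attic_create_argv("backup@example.org:laptop.attic", ["/etc", "/home/me"])
    print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run(argv, check=True)` from a cron job. Keeping the date and host as parameters makes the naming easy to test without touching a real repository.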

I am running push backups so that the backing storage never needs access to the backed-up data or keys. That does add a small amount of exposure, in that a compromised machine could theoretically be used to wipe out its own backups, but I view that as an acceptable risk. The remaining challenge is that I do most of my computing on laptops, which move about to various addresses and connection capacities, ruining the conventional backup automation schemes.

I’ve never been comfortable with syncing through services that leave my plaintext at rest on someone else’s machines, and have just relied on ssh and rsync. Unfortunately, that is prone to errors and conflicts, is extremely awkward from devices with crappy environments, and requires a level of foresight I don’t always possess, so I wanted something more “Dropbox-like” that would run on my own hardware. This was partly motivated by starting to carry one of the tiny laptops, and a desire to be able to seamlessly work on projects from it as well as from my other machines.

I specifically wanted something that encrypted the data in flight and server-side, had an open-source client, was web-accessible, and could be self-hosted. Seafile is all of the above.

Right now I’m syncing my live set (Keepass DB; notes, which some time ago became a folder of plain text not obscured by any silly tools; the current semester of my school stuff; etc.) through a free 1 GB seacloud.cc instance, but I’m now sold on the stack, and when I get some time I’ll switch to fully self-hosted.

I’m a little skeptical of their encryption (it uses some slightly funny AES modes, and doesn’t protect metadata), but it is way better than most alternatives. It also apparently has issues with extremely large numbers of files, so I intend to use it only for a pruned live set rather than entire homedirs or the like, as much as that would sometimes be convenient. My only other criticism is that the Android client seems to be a little buggy, but overall it just seems to work on every platform I’ve tried, and without unreasonable bandwidth use.

I’m currently pleased with all of my selections, especially by comparison to the other options I explored.
