A RAM-disk based workflow

It’s easy to pick up a laptop today with 16 or more GB of memory, or to spin
up a cloud instance with as much as you need. Ramdisks are scary fast, and
as most cloud instances have poor IO, they’re a great way of getting
high-performance servers without striping multiple disks. As an added benefit,
if your cloud provider offers sub-hour pricing, as GCE does, you’ll save
cash as well as time by finishing well before an equivalent disk-based
workload would, despite using a very fast instance.

I use ramdisks almost exclusively now, with a couple of quick tricks to avoid
data loss. This also has the benefit of keeping the load on my disk or SSD
to an absolute minimum.

The general idea is that for each project, I typically have some virtual
machines for testing stuff out, and a git repo that contains source.

What I want is that all git commits are stored on permanent storage, and that
the VM contains a fully reproducible environment for both build and deployment.

By using a ramdisk, I can be pretty comfortable that the environment is
reproducible, as I need to rebuild it each reboot, which is roughly weekly
for me.

Let’s break this down into 3 parts:

1. create a ramdisk
2. link our git repo into it
3. spin up a ramdisk-based VM to work in

Creating and deleting a ramdisk

These two zsh shell functions are pretty straightforward. You could easily
do this in bash with minor changes.
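A sketch of what they might look like on OS X; the volume name "ramdisk" and
the 4 GB size are my assumptions, not the original values:

```shell
# Sketch: create and destroy a ramdisk on OS X (volume name and 4 GB size
# are assumptions). hdiutil takes a size in 512-byte sectors,
# so 4 GB = 4 * 1024^3 / 512 = 8388608 sectors.
ramdisk-create() {
  local sectors=$((4 * 1024 * 1024 * 1024 / 512))
  local dev
  dev=$(hdiutil attach -nomount ram://${sectors}) || return 1
  # format and mount the raw ram:// device; it appears at /Volumes/ramdisk
  diskutil erasevolume HFS+ ramdisk ${dev}
}

ramdisk-destroy() {
  # ejecting the volume frees both the /dev/disk* device and its RAM
  diskutil eject /Volumes/ramdisk
}
```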

The second function is simpler still: when the ramdisk is ejected, the
corresponding /dev/disk* device and its allocated RAM are freed as well.

Pulling in the git repo

The key point here is that the git repo we are working from will have a
permanent copy of any commits on disk, and we’ll be working from a ramdisk
copy all the time. This uses a neat git trick called git-new-workdir that
I learned from Markus Prinz.

git-new-workdir /projects/couch/git /ramdisk/couch 1.6.x

How git-new-workdir works is ludicrously simple: it creates the .git/
dir using softlinks to the original data, so any commits, stashes, config or
branch changes we make get written to permanent storage, and it uses the
current directory (which in our case is a ramdisk) to store the working
tree. So all we need to do to ensure our changes are stored permanently is to
commit them, or push a branch. There are no extra commands and nothing to remember.
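To make that concrete, here’s a simplified sketch of what git-new-workdir
does under the hood (the real script, from git’s contrib/workdir, adds
argument checking and handles more edge cases):

```shell
# Simplified sketch of git-new-workdir: shared repo state is symlinked back
# to the original, while HEAD and the index stay private to the new dir.
new_workdir() {
  local orig=$1 new=$2 branch=$3
  mkdir -p "$new/.git"
  # shared state points back at the original repo via symlinks...
  for x in config refs logs/refs objects info hooks packed-refs remotes rr-cache; do
    mkdir -p "$new/.git/$(dirname "$x")"
    ln -s "$orig/.git/$x" "$new/.git/$x"
  done
  # ...while HEAD is copied, so each working dir can be on its own branch
  cp "$orig/.git/HEAD" "$new/.git/HEAD"
  ( cd "$new" && git checkout -f "${branch:-$(git rev-parse --abbrev-ref HEAD)}" )
}
```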

The first parameter is the on-disk location of the original repo we are using,
the second is the new location we want to set up, and the optional third one
is the branch we want to check out into our new ramdisk-backed working dir.

I’ve got this aliased as gnw as I use it all the time.

Spinning up a VM

The same thing applies here with vagrant. It’s as simple as softlinking the
Vagrantfile I am using into the ramdisk, assuming my working dir is in the
ramdisk already:

ln -s /projects/couch/Vagrantfile

Then vagrant up as usual and Bob’s your uncle. As the image is already stored
in ~/.vagrant.d/boxes/ we get a nice repeatable image for free. Finally, as
part of my workflow, I already have a provisioner built in for vagrant, using
ansible, which ensures that whether I run a local instance or a cloud server,
the post-installation setup is identical and idempotent.
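A minimal sketch of such a provisioner block in the Vagrantfile (the box name
and playbook path here are assumptions, not the original values):

```ruby
# Sketch of a Vagrantfile with an ansible provisioner; box name and
# playbook path are assumptions.
Vagrant.configure("2") do |config|
  config.vm.box = "freebsd-10"
  config.vm.provision "ansible" do |ansible|
    # the same playbook can be run against a cloud server directly,
    # keeping local and remote provisioning identical
    ansible.playbook = "provision/site.yml"
  end
end
```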

Bonus Hacks

ZFS is an advanced filesystem supporting snapshots, inbuilt lz4 compression,
automatic checksumming to prevent and detect bitrot, and many more features.

It was developed by Sun Microsystems, and luckily was open-sourced before the
Oracle buyout. It’s now available on Linux, OS X, FreeBSD, and Solaris
derivatives such as illumos and SmartOS.

While I’m not worried about bitrot, the compression and snapshot based
replication make ramdisks even better. Compression means that on my 16GB OSX
laptop, I can comfortably run an entire 1GB RAM Windows 7 VM (20GB disk) in
a 10GB ramdisk, and still have a reasonably functional OSX environment. On a
larger 32GB FreeBSD server it’s scary fast. I can keep the original VM image
safely snapshotted on my main disk, and replicate it into the ramdisk at
almost raw disk throughput. With an SSD, it takes under 5 seconds to copy and
launch a fully encapsulated VM in the ramdisk.
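The snapshot-and-replicate step can be sketched as a small function; the
pool and dataset names ("tank/vms/win7", "zdisk") are my assumptions:

```shell
# Sketch: replicate a gold VM image snapshot from the on-disk pool into
# the ramdisk pool. Pool/dataset names are assumptions.
replicate-vm() {
  local snap=tank/vms/win7@golden
  # take the gold snapshot once; subsequent runs just reuse it
  zfs snapshot ${snap} 2>/dev/null || true
  # -F lets us overwrite a stale copy left over in the ramdisk pool
  zfs send ${snap} | zfs receive -F zdisk/win7
}
```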

zsh functions

We need just two functions again: one to create what I’ve named a zdisk, and
another to destroy it.
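A sketch of what these might look like on OS X with OpenZFS installed; the
pool name "zdisk" and the 8 GB size are my assumptions:

```shell
# Sketch: a ZFS pool backed by a ram:// device on OS X with OpenZFS.
# Pool name and 8 GB size are assumptions.
zdisk-create() {
  local sectors=$((8 * 1024 * 1024 * 1024 / 512))
  local dev
  dev=$(hdiutil attach -nomount ram://${sectors}) || return 1
  # lz4 compression stretches the ramdisk well beyond its nominal size
  sudo zpool create -O compression=lz4 zdisk ${dev}
}

zdisk-destroy() {
  # destroying the pool releases its datasets; detach the ram:// device
  # afterwards (hdiutil detach) to free the memory itself
  sudo zpool destroy zdisk
}
```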

Vagrant and ZFS

I store a gold image of all my projects in a zfs dataset, along with the
provisioning script that sets that image up from a base OS. The base OS
images themselves come either from the cloud provider (EC2 or GCE, for example)
or from a reference vagrant box. My entire vagrant setup is also stored in
zfs, as VMs compress really well: ~/.vagrant.d/ is just a softlink
to a zfs mountpoint. And as we are using the ramdisk-based workflow above,
this data rarely changes.
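The one-time setup for that softlink can be sketched as a function; the
dataset name "tank/vagrant" and its default /tank/vagrant mountpoint are my
assumptions:

```shell
# Sketch: move vagrant's box cache onto a compressed zfs dataset, then
# softlink it back into place. Dataset name/mountpoint are assumptions.
vagrant-on-zfs() {
  sudo zfs create -o compression=lz4 tank/vagrant
  sudo chown ${USER} /tank/vagrant
  mv ~/.vagrant.d/* /tank/vagrant/
  rmdir ~/.vagrant.d
  ln -s /tank/vagrant ~/.vagrant.d
}
```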