Gentoo on EC2 From Scratch

I’ve started tinkering with Amazon EC2, and I figured that many might benefit from a Gentoo-from-scratch recipe. There aren’t too many gotchas. Credit is due to this blog for providing a good non-Gentoo-specific overview.

Before trying any of this, you need to be at least a little familiar with EC2. They have a great getting started walkthrough here. Once you know how to start instances and connect to them, you’re halfway there. You’ll also need to set up an S3 account and have an access key and secret key to store your images. On your Gentoo box, install ec2-ami-tools and ec2-api-tools.

The biggest issue we’re going to have to deal with is that you can’t supply your own kernel. The two problems this creates are udev compatibility and kernel modules. The former is dealt with by setting package.mask, and the latter by hunting down module tarballs online. The modules are fairly optional unless you want to bundle a running image (this requires loop support, which isn’t built into the EC2 kernel).

You might be tempted to start from an existing EC2 AMI. If you find a good one, that isn’t a bad idea, but most of them are VERY out of date. Updating a Gentoo install with a pre-EAPI portage and glibc/gcc versions that aren’t even in the tree any longer will be painful. Creating one from scratch isn’t actually that hard.

So, without further ado, here is the recipe (the steps generally follow the Gentoo handbook, so be sure to follow along in that):

Get yourself a stage3 and portage snapshot per the handbook. x86 or amd64 is fine, but be sure to check the Amazon EC2 product page to see which of their offerings are 32/64-bit. This guide is written for 64-bit.

Create yourself a disk image with dd if=/dev/zero of=image.fs bs=1M count=5000 – feel free to tailor the size to your needs but mind the EC2 root filesystem limitations unless you want to run it on elastic storage. The uploaded image will be compressed so you won’t be paying for any unused space in your image.

Point a loopback at your image, format it with a supported filesystem of your choice, and mount it.
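A minimal sketch of the image and loopback steps, assuming GNU dd, root access for the loop and mount commands, and ext3 as the filesystem. The function names, the /dev/loop1 default, and the mount point are only illustrative; pick whatever loop device is free on your box.

```shell
# Create a zero-filled image file. Size is in MiB; 5000 matches the dd example above.
make_image() {
    image=$1
    size_mb=${2:-5000}
    dd if=/dev/zero of="$image" bs=1M count="$size_mb"
}

# Attach the image to a loop device, format it, and mount it. Needs root.
# "losetup -f" prints the first unused loop device if /dev/loop1 is taken.
attach_image() {
    image=$1
    mountpoint=$2
    loopdev=${3:-/dev/loop1}
    losetup "$loopdev" "$image" &&
    mkfs.ext3 "$loopdev" &&
    mkdir -p "$mountpoint" &&
    mount "$loopdev" "$mountpoint"
}

# Undo attach_image: unmount, then release the loop device. Needs root.
detach_image() {
    mountpoint=$1
    loopdev=${2:-/dev/loop1}
    umount "$mountpoint" &&
    losetup -d "$loopdev"
}
```

Typical use would be make_image image.fs, then attach_image image.fs /mnt/gentoo as root, and detach_image /mnt/gentoo once the image is finished.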

Configure sshd to run at startup, and edit your sshd config to allow root to login.
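A rough sketch of the sshd step, meant to be run inside the chroot. It assumes OpenRC (rc-update) and GNU sed; allow_root_login is a hypothetical helper name, and the config path is the usual default rather than anything specific to this guide.

```shell
# Set PermitRootLogin in an sshd config file, replacing an existing
# (possibly commented-out) setting or appending one if none is present.
allow_root_login() {
    conf=${1:-/etc/ssh/sshd_config}
    if grep -Eq '^[#[:space:]]*PermitRootLogin' "$conf"; then
        sed -i -E 's/^[#[:space:]]*PermitRootLogin.*/PermitRootLogin yes/' "$conf"
    else
        echo 'PermitRootLogin yes' >> "$conf"
    fi
}

# Then, still inside the chroot, start sshd at boot (OpenRC):
#   rc-update add sshd default
```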

Exit your chroot and unmount anything mounted inside of it.

Clean up tmp, var/tmp, usr/portage/distfiles, and any other messes you have made. I suspect that to compress the image fully you need to zero the free space (dd if=/dev/zero of=file-inside-filesystem bs=1M count=5000 ; rm file-inside-filesystem).
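The zeroing trick can be wrapped up roughly like this. It is a sketch, assuming GNU dd; zero_free_space and the zero.fill file name are made up for illustration. Without a size cap, dd simply runs until the filesystem is full, and the resulting error is expected.

```shell
# Fill free space under a mounted directory with zeros, then delete the fill
# file, so unused blocks compress well when the image is bundled.
zero_free_space() {
    dir=$1
    count_mb=${2:-}            # optional cap in MiB; empty means "until full"
    fill="$dir/zero.fill"      # hypothetical temporary file name
    if [ -n "$count_mb" ]; then
        dd if=/dev/zero of="$fill" bs=1M count="$count_mb"
    else
        # dd exits non-zero when the disk fills up; that is the goal here.
        dd if=/dev/zero of="$fill" bs=1M || true
    fi
    sync
    rm -f "$fill"
}
```

Run it against the mount point of the image (for example zero_free_space /mnt/gentoo) before unmounting.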

Unmount your image and delete any loops you created. Congratulations, you now have a raw image file suitable for EC2. Now we just need to bundle, upload, and register it.
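For orientation, bundling, uploading, and registering with ec2-ami-tools and ec2-api-tools looks roughly like the sketch below. The key files, account ID, and access keys are placeholders, the environment variable names for the account ID and S3 keys are my own assumptions, and the exact ec2-register syntax depends on the tools version (newer releases also want a name), so check each command’s help output. By default the function only prints the commands; set DRY_RUN=0 to actually run them.

```shell
# Print the command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Bundle a raw image, upload the bundle to S3, and register the manifest.
# All credentials below fall back to placeholder values.
bundle_upload_register() {
    image=$1
    bucket=$2
    manifest="/tmp/$(basename "$image").manifest.xml"
    run ec2-bundle-image -i "$image" -r x86_64 -d /tmp \
        -k "${EC2_PRIVATE_KEY:-pk.pem}" -c "${EC2_CERT:-cert.pem}" \
        -u "${AWS_ACCOUNT_ID:-000000000000}" &&
    run ec2-upload-bundle -b "$bucket" -m "$manifest" \
        -a "${AWS_ACCESS_KEY:-ACCESS_KEY}" -s "${AWS_SECRET_KEY:-SECRET_KEY}" &&
    run ec2-register "$bucket/$(basename "$manifest")"
}
```

Calling bundle_upload_register image.fs my-bucket in dry-run mode is a cheap way to review the three commands before spending time on an upload.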

Congratulations, you now have what should be a working AMI. If you ever need to update it you can just chroot into your image, adjust it, and then re-bundle, upload, and register. If you need to delete an AMI there is a command that will do it, but I usually just use s3cmd.

You might want to see my updated guide on building with a custom kernel.

30 Responses

Interesting, I didn’t spot it on the list of public AMIs, unless it was the 2008.0 hardened image I tried to use. There are some decent 32-bit Gentoo images. Amazon really should provide some way to annotate images. It can be tricky to figure out which ones are worth looking at, and of course you pay just to boot them up.

Your kernel (on the box you’re creating the image from) needs loopback filesystem support enabled (I’m guessing genkernel enables this by default). To use it, run losetup /dev/loop1 image.fs, and the device /dev/loop1 is now pointed at your image. You can do a mkfs.ext3 /dev/loop1 to create a filesystem, or mount /dev/loop1 mountpoint/ to mount it. When you’re done with it and have unmounted the mountpoint, use losetup -d /dev/loop1 to unlock the image file and delete the loopback association.

Loopback filesystems can be very handy things. They basically turn a file into a block device.

Incredible work, who thought it’d be that easy? It’s quite a powerful setup. I just used your ami-11a14f78 to create a custom LAMP server in under an hour. Your /root/.bash_history is also quite helpful.

[…] issue with EC2 is that they supply the kernel, and that already caused difficulties with my first EC2 tutorial – the image I created doesn’t let you create a new snapshot from a running image since […]

Thank you for this howto! How do I create a custom kernel with JFS on /root? I noticed that if I boot into a custom kernel with ext3 on /root I will run out of inodes very fast while still having lots of disk space.

Well, I don’t know if EC2 supports JFS (I’m guessing it does but you’d have to do your research). If it doesn’t then it won’t be able to load your kernel. Now, if it doesn’t you could put your kernel on an ext3 partition and then mount everything else from another partition of your choosing (since you can put JFS or any other filesystem support in your custom kernel). There are also variants on this using initrds, but no matter what the kernel image itself has to be on a filesystem the EC2 bootloader can read.

However, there may be an easier fix: when you create an ext3 partition you can specify things like inode space, or allow room for future volume expansion (useful on LVM/MD/etc). Options like -T small in mkfs.ext3 are probably the easiest way to expand space for inodes. The Gentoo handbook recommends this for disks smaller than 4GB.

Basically, mkfs.ext3 scales the number of inodes based on the size of the disk. Any distro is going to have small-file overhead in /etc, /var, and so on (Gentoo’s portage tree uses quite a few inodes). This becomes disproportionately large if the OS itself is about the only thing on the disk, which of course is common on EC2. So, you need to allocate space for inodes.
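To get a feel for the numbers, here is a rough back-of-the-envelope helper. It assumes mke2fs creates about one inode per inode_ratio bytes, with 16384 as a common default and 4096 as what -T small usually maps to in /etc/mke2fs.conf. Those ratios are assumptions, so check your own mke2fs.conf. estimate_inodes is a hypothetical name.

```shell
# Approximate inode count for a filesystem of size_mb MiB at a given
# bytes-per-inode ratio (the value mke2fs takes with -i, or looks up via -T).
estimate_inodes() {
    size_mb=$1
    inode_ratio=$2
    echo $(( size_mb * 1024 * 1024 / inode_ratio ))
}
```

For the 5000 MiB image in this guide, estimate_inodes 5000 16384 prints 320000 and estimate_inodes 5000 4096 prints 1280000, so the smaller ratio gives four times as many inodes. On a running instance, df -i shows how many inodes are actually in use.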

It took me hours, but once you find the root of the problem everything is simple: in fact, it’s enough to emerge at least app-admin/ec2-ami-tools-1.3.57676-r1 (ec2-ami-tools-1.3.34544 is affected by the “stdin =” bug).

@rich0
Thank you! You rock! It’s very useful to have a working AMI (unfortunately it’s only US and not EU). Any plans for an EBS one?

I tried to set the generator to add After=network.target and After=network-online.target. I have dhcpcd.service in multi-user.target.wants. The issue is that dhcpcd never marks network-online.target as reached, so you end up running the public-keys.start script well before you have a connection.

A better solution is always welcome.

You also need to fix or remove the killall-nash script in /etc/local.d. If you never run that service, you can delete the script that kills it on boot.

Right now, based on Pygoscelis-Papua’s image, I have a hardened Gentoo with systemd working fully in AWS, and not in a degraded state (per systemctl status).

Obviously this guide predates systemd considerably. One of these days I should update it. I’d probably just create a real service in /etc/systemd/system if possible and then the dependencies can be set correctly.