My hobby…

An Ubuntu 10.04 Home Server

I’ve recently been setting up a home server for my parents using Lucid. While it’s not quite a point-and-click affair, the process is a lot more streamlined than it used to be.

They each have their own computer running Windows 7, plus one laptop between them running XP. Mum is also a photographer and generates a large amount of data. Dad generates a fair bit of data too, less than Mum, although he does make the occasional home video.

Backups are an ad-hoc affair. Mum has three hard disks in her computer which she manually copies files between and tries to ensure she has two copies of everything. Dad has a portable external drive which he backs up to infrequently. Between them, neither is confident that they’d get all their data back in the event of a disaster.

Dad also liked how my HTPC (running XBMC) worked, and decided one of those would be nice too. So I decided to setup a home server for them and solve all their computer problems. Well, almost.

I started writing this as a single article, but it got a bit long so I’ve decided to break it up into a series. This first post is an overview, the links to the other posts are at the bottom of this article.

I’m assuming a fairly good degree of technical knowledge here, but if there are any gaps you feel I should add please feel free to leave a comment. I am aiming this at a reader who is familiar with Linux and Ubuntu, has installed software with apt-get or Synaptic, is comfortable with using the command line, and understands the implications of using RAID 5.

Overview

This home server will perform the following tasks:

Play music and video via the TV

Present a file share to the network, with individual folders for Mum and Dad

Backup the contents of their folders nightly to an external hard drive

Provide a GUI-based remote administration interface

Monitor backups and the raid array, sending emails to both Mum and Dad if something is amiss
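
The array-monitoring part of that list is handled by mdadm’s own monitor daemon, which can send mail when an array degrades or a disk fails. A sketch of the relevant fragment of /etc/mdadm/mdadm.conf (the address is a placeholder; mdadm accepts a single MAILADDR, so to reach both Mum and Dad you’d point it at a mail alias that expands to both):

```
# /etc/mdadm/mdadm.conf -- where the monitor daemon sends failure alerts.
# MAILADDR takes one address; use an alias to notify more than one person.
MAILADDR parents@example.com
```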

Software that needs to be configured to perform these tasks:

mdadm for RAID

Xbox Media Center (XBMC) for media playback

Samba for file sharing

Back in Time for backup

NeatX for remote administration
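
On Lucid most of these come straight from apt. A sketch, assuming the package and PPA names below are still current for 10.04 (XBMC and NeatX weren’t in the main repositories at the time, so those two come from their teams’ PPAs; check the projects’ pages before relying on these):

```shell
# Core packages from the Ubuntu repositories
sudo apt-get install mdadm samba backintime-gnome smartmontools

# XBMC and NeatX live in PPAs (repository and package names assumed)
sudo add-apt-repository ppa:team-xbmc/ppa
sudo add-apt-repository ppa:freenx-team/ppa
sudo apt-get update
sudo apt-get install xbmc neatx-server
```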

The main boot device in this case will be an IDE compact flash card. I did this partly because it makes recovery easier (just write an image to a flash card rather than a whole hard drive), but mainly because it frees up a SATA port!

The hardware components for this particular HTPC are:

Gigabyte M85M-US2H motherboard

AMD Athlon II 250

2GB DDR2 RAM

4x 640GB Western Digital 6400AAKS hard drives

1x 1TB Western Digital Green

1x 2TB Western Digital Green (in external eSATA case)

4 Raidon/Stardom hot-swap drive bays

IDE Compact Flash adaptor and 8GB 133x CF card

A note on RAID

The 4x 640GB drives are configured in a RAID 5 array. Personally, this is about as large an array as I would trust RAID 5 with. The future is redundancy at the file system layer, as ZFS and Btrfs provide. ZFS can’t be included in the Linux kernel (for licensing reasons) and Btrfs isn’t close to production-ready yet, so for now I believe RAID is still the most sensible option. But if you’re reading this in 2012, you should probably be using Btrfs instead.
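
For reference, building a four-disk RAID 5 array like this one is a single mdadm command. A sketch, assuming the four data drives appeared as /dev/sdb through /dev/sde (your device names will differ):

```shell
# Create the array across the four 640GB drives (device names are assumptions)
sudo mdadm --create /dev/md0 --level=5 --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Record the array geometry in mdadm.conf so it assembles on boot
sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf

# Then format and mount it as usual
sudo mkfs.ext4 /dev/md0
```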

Storage

The 1TB HDD is just a single disk for media to be played back on the TV. Anything here is considered replaceable (think of it like the internal HDD in a MySky or TiVo box), so it won’t be backed up at all.

The 2TB HDD is the backup drive. Each night the entire RAID array is backed up to it with Back in Time, configured to take snapshots. Since it uses rsync, the backups are incremental and shouldn’t take more than a few minutes to run, depending on how much was changed during the day. Obviously, as the array nears capacity, fewer snapshots will fit on the backup drive. Once it fills, the idea is to replace the 2TB backup HDD with a new one, keep the old one as an archive, delete any data from the RAID array that is no longer current, and start again with a fresh, clean backup disk. Hopefully by then it will be a 3 or 4TB disk and they can keep more snapshots!
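
Back in Time drives all of this through a GUI, but the underlying mechanism is easy to see in a few lines of rsync. A minimal sketch of the snapshot idea (the paths and the `latest` symlink are my own conventions for illustration, not Back in Time’s): unchanged files in each new snapshot are hard links into the previous one, so each run only stores what changed.

```shell
# Take one snapshot of directory $1 into $2/<timestamp>.
# Unchanged files become hard links to the previous snapshot via
# --link-dest, so a snapshot of mostly-unchanged data is cheap.
snapshot() {
    src=$1
    dest=$2
    stamp=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$dest"
    if [ -e "$dest/latest" ]; then
        rsync -a --delete --link-dest="$dest/latest" "$src" "$dest/$stamp"
    else
        rsync -a "$src" "$dest/$stamp"
    fi
    rm -f "$dest/latest"
    ln -s "$stamp" "$dest/latest"   # convenience pointer to the newest snapshot
}
```

Calling something like `snapshot /mnt/raid/ /mnt/backup` from a nightly cron job gives the incremental behaviour described above; deleting an old snapshot directory only frees the blocks no other snapshot still links to.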

The file system on the backup HDD will be NTFS. This is because it supports hard links and is readable by the Windows machines, which is important for my parents when they go to retrieve files from the archive.
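
On the server side the NTFS backup drive is mounted with ntfs-3g. A sketch of the /etc/fstab line, with the UUID as a placeholder for your own drive’s (find it with `sudo blkid`):

```
# Backup drive (NTFS via ntfs-3g); replace the UUID with your drive's
UUID=XXXXXXXXXXXXXXXX  /mnt/backup  ntfs-3g  defaults  0  0
```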

Final notes before we get to the nitty gritty

I had a bit of trouble getting the drive bays lined up with the ports that the OS reported they were attached to. This is important because if mdadm tells Dad that the SATA disk on port x has failed, I need him to know that it’s the disk in bay x. Unfortunately, on the motherboard I used, Ubuntu assigns them like so:

0 – 1
1 – 3
2 – 2
3 – 4

(motherboard port – ubuntu port)

So while your motherboard may be better designed than mine, don’t assume they are in the same order. The links to the follow-up articles are below:
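
One way to work the mapping out on your own board is to ask the kernel which ATA port each device name sits on, then match drives to bays by serial number (hdparm is assumed to be installed; adjust the sd[a-d] glob to your disks):

```shell
# /dev/disk/by-path encodes the controller and port each disk hangs off
ls -l /dev/disk/by-path/

# Print each drive's serial number, then match it against the label
# printed on the physical disk in each bay
for d in /dev/sd[a-d]; do
    echo "$d: $(sudo hdparm -I "$d" | grep 'Serial Number')"
done
```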

Hi Jesse, you’d be right, I haven’t finished writing them. I was setting up the home server for my parents and unfortunately got busy planning a trip overseas. Can’t make any promises on when they’ll be finished, and it’s unlikely to be before 10.10. In the meantime there are other guides to setting up monitoring for mdadm, and the XBMC wiki is pretty good. Hope you can get it set up nicely.

Ok, thanks.
I have Xbox set up on my media box and Ubuntu streaming to it via SMB shares. Is this what you have done, or something fancier? I’d quite like to have a media server that is centrally controlled, remembers (on the server) where I’m up to, and doesn’t stream via UPnP.
I was kind of hoping your setup might do some of this?
I’ve enjoyed experimenting with the articles you have up so far, cheers :)

I think you’re confusing some terminology here, but the setup I did remembers where it’s up to thanks to a handy feature of XBMC – if you play a video that you stopped previously it will ask whether you want to resume or play from the start. It does this whether the video is on the local HDD or an SMB share. Obviously it will not resume if you then start playback from another client – the SMB server has no idea where you’re up to, nor is there any way for the new client to know.

What you probably want is something centrally controlled like MythTV, in which case I’d recommend Mythbuntu rather than what I have done here. XBMC is currently a one-machine solution.

I was reading some talk of a shared SQL database in the XBMC forums which may do something similar. At the very least it synchronised the media library and playback stats between multiple XBMC machines, but last I checked it required recompiling the code with some patches and was really only accessible to developers. Hopefully the feature makes it into a future release, but it’s the kind of feature that can impact a lot of the codebase, so you’re probably better off using something built with it in mind – like MythTV.

Great article, Alex. There’s one feature I’d like to have in my server that I haven’t figured out yet, maybe you’ve got an idea, though. WHS has a pretty cool way of handling redundancy that’s not quite RAID, IIRC, I was able to set individual folders to be duplicated, and the data in those folders would be stored on at least two of the hard drives. The upshot is that I didn’t have to deal with actually setting up a RAID, and I was able to add hard drives of any size at any time and WHS just seemed to handle it. Any ideas for doing this with *nix?

The WHS solution is brilliant for that situation, as adding a drive to this setup requires both expanding the RAID and ensuring the backup volume has enough capacity (you could conceivably have to add two). I’m not aware of anything that mimics this aspect of WHS on Linux, but Btrfs might be good enough once it’s stable.

WHS asks you to choose which folders you make redundant. The way I look at it, if it’s on the RAID volume it’s important, and if it’s not important enough to duplicate it can be stored elsewhere (on a non-RAID volume).

First, thanks for the thorough and fantastic article!! I’m referencing this quite a bit in my 10.04 home server. For reference, it’s a Dell PowerEdge 840, upgraded to a Pentium Dual Core, with 4GB RAM, 4x 1TB SATA drives for data, and 1x SCSI U160 10kRPM 36GB drive for the OS. I have a few questions, from the perspective of someone that only occasionally dips his feet into Linux:

I’m using 10.04 because of its LTS status. But I’ve read that 10.10 has improved support for cloud storage (and experimental btrfs, but that’s neither here nor there). I plan on using my fileserver for FTP and remote desktop (still torn between VNC & NeatX). Do you have any experience or opinion on the use of such a server with cloud storage? Are the improvements in 10.10 not available as stand-alone updates for 10.04, and/or is it worth ditching the LTS operating system for those improvements?

Next, I’ve used ext3 fileshares from Windows boxes without a problem. Even after reading about hard vs soft links, I’m not familiar with the practical use of those links. Do you know what real-world limits or problems I’d encounter using ext4 for my data drive?

Additionally, I’m not concerned about drive readability in a windows box – if my OS drive or the PC goes, I “should” be able to read the drives from a live CD or new OS install (after re-creating the RAID), even if in another box. right?

Related, do you know if it’s possible to export the RAID configuration settings? I plan on keeping a PARTIMAG image of the OS drive on the data drive and backed up separately. Theoretically, I should be able to re-image a new OS drive and have the RAID properly configured, but I’d like the option of quickly re-configuring the RAID on a new install or while booted via live CD.

How is the performance of the eSATA drive during snapshot back-ups? How’s the performance of CompactFlash over IDE? Are you concerned about rewrite limits using the flash memory as the OS drive?

I feel like RAID6 is a good compromise between space and redundancy. Have you tested with other RAID configurations?

Finally, have you tested restoring a degraded RAID in your server’s configuration?

Hi Steven, apologies if I miss a question or two; that was quite a comment!

Basically LTS vs the latest comes down to how “hands on” you’ll be – if you want to set, forget and keep it in service for several years it would probably be more prudent to use the LTS release. Otherwise the latest generally offers worthwhile improvements, and is what I’d use myself. Having said that however, I’ve only used Maverick on my laptop and a couple of test servers thus far, so I don’t know what improvements it has that are relevant to a media PC.

You can absolutely read the drives in another box, and even reassemble the RAID, because mdadm stores the geometry in a superblock. Basically all you would need to do is plug in all the drives, boot a box with mdadm installed (not sure if it’s on the live CD) and run “mdadm --assemble --scan”. The only reason I used NTFS for the backup is to make it easier for my parents to read in the event of disaster (especially since I’m on the other side of the world). Any Linux live CD will read an ext3 or ext4 volume.

Actually I haven’t made a Partimage image of the OS drive, even though that was part of the original intent. After running through the process I really didn’t see the point. My parents wouldn’t be able to do the reimaging anyway, and in six months there’s a new version which I’d rather use, so instead I’d rather document the process and keep copies of any config files I alter. That way it’s easy to do again from scratch with a new version.

Performance of eSATA is excellent – it’s exactly the same as an internal SATA drive. But since the backups are incremental snapshots, USB2 is plenty fast enough as well.

CF over IDE is not especially fast, but not what I’d call slow either; it does depend on the quality of the CF card you use. I actually ended up reverting to a SATA disk for this particular media PC, due to issues with the adaptors. I tried two different ones (both admittedly very cheap) and kept getting write errors. One of the newer SATA ones would probably be better. I did some googling on the issue of write wear, and from what I read CF has some measures to reduce it, so I wasn’t too concerned. You could make the swap file small or put it on the RAID array if you’re worried.

RAID6 is a good compromise; it just really needed an extra disk for this setup. With 8 or so disks it starts to make sense, but with a 4-disk array I’d rather use RAID10.

Yes, I have tested the array degraded; it works fine. I once had trouble adding a disk back into the array, so I deleted it, erased the partition tables on all the disks and recreated it from scratch, only to find the file system still intact after assembly. So it can be pretty resilient and forgiving of mistakes, but if I’d recreated with a different block size or partition offset it would obviously have been hosed.