FreeNAS USB Flash Boot Drives: Mirroring For Fault Tolerance.

FreeNAS encourages using a USB flash drive as the operating system boot drive, which frees all of the motherboard's SATA connectors for data storage drives. I didn’t think commodity USB flash drives were trustworthy enough to hold the operating system, but I was willing to experiment and be proven wrong.

The very first night, I got worrying news from the nightly system check:

  pool: freenas-boot
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://illumos.org/msg/ZFS-8000-9P
  scan: scrub repaired 3K in 0h10m with 0 errors
config:

        NAME          STATE     READ WRITE CKSUM
        freenas-boot  ONLINE       0     0     0
          da0p2       ONLINE       0     0    11

errors: No known data errors

Looking on the bright side, “No known data errors” is comforting, as is the “repaired […] with 0 errors”. It’s nice FreeNAS was able to repair whatever was wrong with my USB stick. I suspect inexpensive commodity USB flash drives frequently encounter errors that are silently corrected by the operating system. Still, an error is an error and it’ll only be a matter of time before I run into a serious problem.
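To keep an eye on this without reading the whole report every night, the device rows can be filtered down to just the ones with a non-zero CKSUM column. This is only a rough sketch: the function name cksum_errors is my own invention, and it assumes the NAME / STATE / READ / WRITE / CKSUM column order shown in the report above.

```shell
# Read 'zpool status' text on stdin and print "device count" for every
# config row whose CKSUM column (5th field) is not zero.
# Assumes the column order NAME STATE READ WRITE CKSUM.
cksum_errors() {
  awk '$2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ \
       && NF >= 5 && $5 != 0 { print $1, $5 }'
}

# Usage on the NAS itself:
#   zpool status freenas-boot | cksum_errors
```

An empty result means every device row currently shows a checksum count of zero.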

Fortunately, the FreeNAS authors had the foresight to make sure a bad boot device does not become a single point of failure. A second drive can be added to the system to act as a mirror of the boot device. If either of them fails, the other can take over.

Much to my dismay, the second USB stick I tried also encountered a data checksum error. I didn’t have much luck figuring out how to interpret the number in the CKSUM column (as far as I can tell, it is a running count of checksum errors detected on that device rather than an error code), but I did learn that it is supposed to be zero. The first stick returned 21, the second 26.

I tried a third USB stick and was relieved to finally see a zero checksum count. The output below was generated when I ran ‘zpool status’ while the third stick was in the middle of replacing the second stick.

Now both boot drives in the mirror set show zero checksum errors, but the mirror volume overall still reports 21 in its CKSUM column, left over from the first USB stick. I’m still learning whether that means anything (bad) and what it would take to reset it to zero.
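The status report itself hints at one way to reset the counters: 'zpool clear'. Here is a cautious sketch of what I would try, assuming the pool name freenas-boot from the report. The run and reset_boot_pool_counters helpers are hypothetical names of my own, and the script only prints the commands unless DRY_RUN is set to 0.

```shell
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 on the NAS itself to really run the commands.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

reset_boot_pool_counters() {
  pool="${1:-freenas-boot}"
  run zpool clear "$pool"   # reset the READ/WRITE/CKSUM error counters
  run zpool scrub "$pool"   # start a scrub that re-reads and verifies the pool
  run zpool status "$pool"  # show current state; the scrub runs in the background
}

# Usage: reset_boot_pool_counters freenas-boot
```

If the counters stay at zero once the scrub has finished, the leftover number was most likely just history from the replaced stick. If they climb again, one of the current sticks deserves suspicion.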