
check integrity of ZFS send streams with zstreamdump

If you wander through the OpenSolaris zfs-discuss archives or look at the ZFS Best Practices Guide, you will encounter references and debates about whether the zfs send and zfs receive commands are suitable for backups. As I've described before, zfs send and zfs receive can be part of a comprehensive backup strategy for high-transaction environments. But people get nervous when we discuss placing a zfs send stream on persistent storage. The reasoning is that if the stream gets corrupted, it is useless. There is an RFE open to improve the robustness of zfs receive, but that is little consolation for someone who has lost data.

The fundamental design of ZFS is exposed in zfs send: the send stream contains an object, not files. This is great for replicating objects, and since ZFS file systems and volumes are objects, it is quite handy. This is also why zfs send and zfs receive do not replace the functionality of an enterprise backup system that works on files. So I expect the two technologies to remain complementary for a very long time.

But there are some simple things that can improve management of zfs send streams. It is a good idea to check the integrity of a stream stored on permanent storage before you try a zfs receive, or just to sleep better at night. You can do that by telling zfs not to actually apply the receive, using the "-n" option to zfs receive, but this only returns a boolean response. Something more concrete and descriptive would be nice...
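A dry-run check along those lines might look like the sketch below. The stream path and the dataset name tank/restored are illustrative assumptions, not from the post, and this assumes a stream previously saved with zfs send.

```shell
# Hedged sketch: dry-run check of a saved send stream.
# The stream path and dataset name are illustrative only.
STREAM=/backup/tank.zstream
if command -v zfs >/dev/null 2>&1 && [ -f "$STREAM" ]; then
    # -n parses the stream without creating anything on disk;
    # -v prints verbose information about the stream as it is read
    zfs receive -n -v tank/restored < "$STREAM"
    echo "dry-run exit status: $?"
else
    echo "skipped: zfs or $STREAM not available on this host"
fi
```

Note that the comments below report the exit status of a "-n" receive is not a reliable corruption check, so treat this as a parse-level sanity check rather than a verification.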

OpenSolaris build 125 brings the zstreamdump(1M) command, which allows you to examine the contents of a zfs send stream. To demonstrate, I made a quick (diving) pool called "zdiving," copied some data to it, took a snapshot, saved the send stream as a file, and ran zstreamdump. Observe:
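The original demo output is not preserved here, but the command sequence behind it would look roughly like the following sketch. The device name c0t1d0 and the file paths are assumptions for illustration.

```shell
# Hedged reconstruction of the demo steps; device and paths are examples.
if command -v zpool >/dev/null 2>&1; then
    zpool create zdiving c0t1d0              # quick pool named "zdiving"
    cp -r /usr/share/man/man1 /zdiving       # copy some data into it
    zfs snapshot zdiving@backup              # snapshot the file system
    zfs send zdiving@backup > /var/tmp/zdiving.zstream
    zstreamdump < /var/tmp/zdiving.zstream   # summarize stream contents
else
    echo "zpool not available; commands shown for illustration only"
fi
```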

Comments

I am very interested in this; however, I confirmed one piece of wrong info in this blog post:

If I store a zfs send stream, intentionally corrupt it (toggling bits in the middle), and then pipe it to "zfs receive -n" ... unfortunately I get a 0 exit status regardless of whether the data stream is actually good or bad.

In order to get a "1" exit status, the -n option cannot be used, which of course means truly doing the restore and writing it to disk.
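The corruption test the commenter describes can be sketched as follows. The "stream" below is a stand-in file rather than a real zfs send stream, and the dataset name tank/test is illustrative.

```shell
# Hedged sketch of the bit-flip test; the "stream" is a stand-in file.
dd if=/dev/zero of=corrupt.zstream bs=1 count=2048   # stand-in stream
# flip one byte in the middle without truncating the file
printf '\377' | dd of=corrupt.zstream bs=1 seek=1000 conv=notrunc
# on a ZFS host, feed the damaged stream to a dry-run receive; per the
# comment it still exits 0, so -n cannot be relied on to catch corruption
if command -v zfs >/dev/null 2>&1; then
    zfs receive -n tank/test < corrupt.zstream
    echo "dry-run exit status: $?"
fi
```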

Hi, I just discovered the inverse of what rahvee said: zstreamdump does not detect a modified stream (simply remove a few trailing bytes from the end of it) and happily accepts it, exiting with status 0, while zfs receive refuses it and exits with 1. So it would be impossible to recover such a stream. Tested on snv_133. Now who/what to believe when checking the stream's integrity?

Richard, ok, I filed a bug in bugster. It is very simple to reproduce: take any stream, copy just the very first byte from it into a new file with dd, then pipe that into zstreamdump; it will exit with status code 0.
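That truncation reproduction can be sketched like this. The file names are illustrative, and the "full" stream here is a stand-in file rather than a real zfs send stream.

```shell
# Hedged sketch of the reported reproduction; "full.zstream" is a
# stand-in file, not a real zfs send stream.
dd if=/dev/zero of=full.zstream bs=1 count=512       # stand-in stream
# copy only the very first byte into a new file
dd if=full.zstream of=truncated.zstream bs=1 count=1
# on a build with zstreamdump, pipe the 1-byte file in; per the bug
# report it still exits with status 0
if command -v zstreamdump >/dev/null 2>&1; then
    zstreamdump < truncated.zstream
    echo "zstreamdump exit status: $?"
fi
```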

