
The Btrfs File-System Repair Tool Is Available

02-21-2012, 08:20 AM

Phoronix: The Btrfs File-System Repair Tool Is Available

After writing about Btrfs LZ4 compression support and the apparent lack of a Btrfs fsck tool, it turns out there is a new Btrfs repair tool; it just isn't widely known, and using it is not recommended at all -- at least at this stage...

While some of the items merged are very handy and useful, anything that can break a filesystem should be used with extreme care in lab conditions. I've applied the new restriper and the new parser code to my btrfs-progs on a NAS system I've built, but I will not touch the fsck. I figure if I have a filesystem that's become corrupt at this stage in the BTRFS development cycle, the developers will probably want to see it before I try to fix it, so they can find out what happened and maybe fix a bug.

But as far as the time it's taken BTRFS to get to this point, I'm actually thinking it's right on track. These days filesystems have so many features internally that need to be designed, written and tested that it's a serious undertaking. Add on top of that all of the externally-facing features modern operating systems expect a filesystem to have, and you've got a substantial project on your hands. Now, add in the fact that you're developing a way to store and retrieve data that could be critical and/or expensive and you've raised the bar well beyond where it's ever been before.

When the ext2 filesystem was first written, there weren't as many interfaces, drives were considerably slower, caching and scheduling systems were not nearly as complex or intelligent as they are now, and, let's face it: Linux wasn't very popular. Now, with things like SSDs becoming ever more commonplace, massive hardware support, bad behaviors being worked around, and the features an enterprise like Oracle will want... I wouldn't want to be starting a new filesystem project.

Anyway, hopefully after Oracle ships a distro with BTRFS as the primary filesystem we will see a large wave of adoption and code maturity (not to say the code isn't mature...but more users = more corner cases). I'd love to have an in-kernel filesystem capable of closing the feature gap on ZFS!

P.S. The community on the BTRFS mailing list is probably the most approachable and friendly community I've seen in a very long time!

Comment

There is one thing I don't understand about all this "changes the filesystem, so it can break it" discussion leading to not releasing code for a repair tool.
Why the hell didn't they just wrap it in a command-line utility that first dd's the filesystem to be repaired to an image and does the repair work there? Then it's absolutely safe, because it never touches the original filesystem, and you can release early and often. The only limitation is that the user needs access to some large filesystem to store the image, which in some cases (with really large filesystems) is a problem, but for many systems it's reasonable to assume you can get hold of a large external hard drive.
So that's what I'd do with a broken Btrfs anyway: first dd it to an image, then use these tools, then loop-mount it, and be happy if it works, having lost nothing but a few hours of work if it doesn't.
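A minimal sketch of that workflow in shell, assuming a damaged Btrfs partition at a hypothetical /dev/sdb1 (here replaced by a scratch file so the demo touches nothing real) and the era's `btrfsck --repair` invocation:

```shell
# Sketch of the "repair a copy, never the original" workflow.
# /dev/sdb1 is a hypothetical damaged Btrfs partition; this demo
# substitutes a scratch file so nothing real is modified.
fs=/tmp/fake_btrfs.img                      # stands in for /dev/sdb1
dd if=/dev/zero of="$fs" bs=1M count=4 2>/dev/null
before=$(md5sum "$fs" | cut -d' ' -f1)      # fingerprint the "disk"

# 1. Image the possibly-corrupt filesystem instead of repairing in place.
dd if="$fs" of=/tmp/repair_copy.img bs=1M 2>/dev/null

# 2. Run the repair tool against the copy only (commented out here):
# btrfsck --repair /tmp/repair_copy.img

# 3. If the repair succeeds, loop-mount the copy and pull your data off:
# mount -o loop /tmp/repair_copy.img /mnt/recovered

# Whatever the repair does to the copy, the original stays bit-for-bit
# identical:
after=$(md5sum "$fs" | cut -d' ' -f1)
[ "$before" = "$after" ] && echo "original untouched"
```

The worst case is losing the time spent imaging; the original device is only ever read.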

Comment


This would only work in the case where BTRFS is on a single drive. One of the biggest advantages of BTRFS is that you can create a single filesystem across multiple devices. So you'd have to dd several devices, change the metadata inside the BTRFS filesystem so it knows which devices to look for (now essentially loop files), and then fsck them. Once that's done, you'd have to reverse the whole process to get the data back.

Assuming you have a simple RAID10-like setup with four 1 TB drives, you'd have to dd 4 TB of data twice. Even at full 6 Gb/s SAS/SATA channel speed, that would take an absolute minimum of about 3.7 hours. No 1 TB drive can actually get near that speed, so in practice it's an unreachable floor. But it proves the point.
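The arithmetic behind that 3.7-hour floor works out as follows (my numbers, not the poster's: a 6 Gb/s link with 8b/10b encoding yields roughly 600 MB/s of usable bandwidth):

```shell
# 4 TB imaged off the array plus 4 TB restored back = 8 TB total transfer.
bytes=$((8 * 1000 ** 4))             # 8 TB, decimal units
rate=$((600 * 1000 ** 2))            # ~600 MB/s usable on a 6 Gb/s link
secs=$((bytes / rate))
echo "$secs seconds"                                           # 13333
awk -v s="$secs" 'BEGIN { printf "%.1f hours\n", s / 3600 }'   # 3.7 hours
```

And that assumes the channel is the only bottleneck; a real spinning 1 TB drive of the time sustained well under 200 MB/s.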

Finally, the point of an fsck tool is to be very fast. The assumption is that a server providing a critical service is down and you have uptime guarantees to meet, so taking the time to dd data off a server simply isn't acceptable. For a home computer, maybe. But BTRFS really shines in an enterprise environment, so the developers' goals aren't really focused on a desktop/laptop situation.