Posted
by
timothy
on Thursday June 09, 2011 @10:06AM
from the taking-a-bold-line dept.

dkd903 writes "According to proposals for Fedora 16, Btrfs will be the default filesystem used in that release. The proposal has been approved by the Fedora Engineering Steering Committee. In Fedora 16, the switch from EXT4 to Btrfs will be a 'simple switch' — it means that major Btrfs features such as RAID and LVM capabilities will not be forced onto users."

Summary: We like it when you test new stuff for us, and our customers are clamoring for this filesystem in RHEL, so we're going to let you try it out on Fedora for a while and experience the hiccoughs. And speaking of new stuff, we're going to finally get around to moving up to grub2 like everyone else, which we haven't bothered to implement even though it's much better, and we allegedly like new stuff.

For some reason I'm getting really low performance on btrfs, both on a single disk and on raid1 configurations. I have tried with nodatacow, and with and without compress, but neither seems to have any effect. Also, I have 90 gigabytes of free space on Storage1, but I get a drive-full error when I try to write there, and rebalancing didn't fix the issue. The btrfs command-line tool is, well, rather incomplete and somewhat buggy: for example, when I query 'btrfs fi df /media/Storage2' -- with Storage2 being the raid1 pool -- it reports the size and usage of the smallest disk in it, not the whole thing. I don't understand why. I have also had some filesystem corruption which cost me quite a bit of data, and again the only way to fix it was rebalancing the whole thing, which takes the whole damn day.
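As a rough illustration of why a mixed-size raid1 pool can report less usable space than the raw total, here is a simplified capacity model (a sketch only, not btrfs's actual chunk allocator -- the function name and the heuristic are mine):

```python
def raid1_usable_bytes(disk_sizes):
    """Rough usable capacity of a raid1-style mirrored pool.

    Every chunk must be mirrored on two different devices, so usable
    space is limited both by half the raw total and by how much the
    smaller devices can pair against the largest one. Simplified
    model for illustration only.
    """
    total = sum(disk_sizes)
    largest = max(disk_sizes)
    # The largest disk can only mirror against the others combined.
    return min(total // 2, total - largest)

# A 2000 GB disk mirrored with a 500 GB disk: usable space is capped
# by the smaller device, which is why the tool's report can look like
# "the size of the smallest disk".
print(raid1_usable_bytes([2000, 500]))  # -> 500
```

With equal-size disks the same formula reduces to half the raw total, which is the familiar mirrored-capacity rule.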

I do understand that it's a filesystem that's still under development, but the tools at least need a lot more work. They're just too incomplete at the moment. I'm not really sure pushing it as the default filesystem for end users is a good idea yet.

Is there even a need for an fsck? For example, ZFS doesn't have one, and I haven't heard of anybody working on one (or of anybody actually wanting one).

Um, yeah, read zfs-discuss. There are helpful folks on there who help people recover their ZFS volumes, but having a tool to do it would be much better.

fsck for ZFS or btrfs means something different than it does for ext* but it's still needed. I just had a client's new 18TB ZFS zvol go TU when the power failed and the UPS->host communication wasn't properly connected. Fortunately it wasn't very important and the important zvol wasn't active when the power failed.

btrfs will be better than ZFS for many use cases once its fsck is stable. For others ZFS will remain better, but you'd better have a battery-backed disk cache or a monitored UPS (neither of which is an appropriate requirement for large swaths of the Fedora user base).

I think the bigger issue with btrfs vs mdadm and ext4 will just be maturity. Btrfs repair tools just haven't evolved to the same place the other tools are at, but there is no fundamental reason why they won't eventually make it there.

As you say, btrfs has the advantage that it knows something about how the space is used: if space is simply free, a repair tool could just clear it and not worry about trying to salvage garbage data, and it could also use free space as working room during recovery. And the copy-on-write design means stripes are less likely to be caught in a transitory state where meaningful data is partially overwritten by other meaningful data, since the filesystem will first try to write the stripe into unused space, leaving a fully intact copy behind if it is interrupted while the array is degraded.
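The write-to-free-space-then-repoint behaviour described above can be sketched as a toy model (purely illustrative -- the class and its layout are made up, not btrfs internals):

```python
class ToyCowStore:
    """Toy copy-on-write stripe store (illustrative, not btrfs code)."""

    def __init__(self, nstripes):
        self.stripes = [None] * nstripes  # physical stripes
        self.ptr = {}                     # logical name -> stripe index

    def _free_stripe(self):
        for i, data in enumerate(self.stripes):
            if data is None:
                return i
        raise RuntimeError("no free space")

    def write(self, name, data):
        # Write the new data into unused space first...
        i = self._free_stripe()
        self.stripes[i] = data
        # ...then repoint. If we crash before this line, the old copy
        # under `name` is still fully intact -- nothing was overwritten.
        old = self.ptr.get(name)
        self.ptr[name] = i
        if old is not None:
            self.stripes[old] = None      # old stripe becomes free again

    def read(self, name):
        return self.stripes[self.ptr[name]]

store = ToyCowStore(2)
store.write("stripe", "old data")
store.write("stripe", "new data")  # lands in the free stripe first
```

The key property is that the overwrite is really an allocate-plus-repoint, so an interruption at any step leaves either the old or the new version fully intact.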

The last time I tried test-converting an existing ext4 into a btrfs on RAID it panicked, but it has been almost a year now...

I'd like to see btrfs implement a proper block-tiering system. They're doing something for storing "hot" blocks on SSD, but what about giving us the full monty, where I can rank storage types myself, assigning a different cost to each type? Hottest blocks in a RAM drive (battery-backed, of course), next step down fast SSD and then slower SSD, followed by Fibre Channel, SAS, SATA and finally tape. Yes, tape. Just create snapshots as backups; those blocks then sit there and drift down to tape storage as required.
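A cost-ranked placement policy like the one wished for above could look something like this sketch (the tier names, the heat scale and the policy are all invented for illustration -- nothing here exists in btrfs):

```python
# Hypothetical tier ranking, fastest-to-access first.
TIERS = ["ramdrive", "fast_ssd", "slow_ssd", "fibre", "sas", "sata", "tape"]

def place_block(heat, capacity_left):
    """Pick a tier for a block based on its access frequency.

    heat: 1.0 = hottest block, 0.0 = stone cold (drifts toward tape).
    capacity_left: dict mapping tier name -> free blocks remaining.
    Purely illustrative policy, not anything btrfs implements.
    """
    # Map heat onto a preferred tier: hottest -> index 0, coldest -> tape.
    preferred = round((1.0 - heat) * (len(TIERS) - 1))
    # Fall down the ranking until a tier actually has free capacity.
    for tier in TIERS[preferred:]:
        if capacity_left.get(tier, 0) > 0:
            return tier
    raise RuntimeError("all tiers full")
```

The "drift down to tape" behaviour falls out naturally: as a block's heat decays over time, repeated placement decisions move it further down the ranking.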

Funny how this has all been done before, back when disks were really slow. I suppose it's the big performance gap between incredibly fast SSDs and mechanical drives that's resurrecting these ideas. With this done, btrfs could be stuck in as a relatively cheap SAN/NAS solution. All done in a big tower case in my loft.

That would still mean a shitload of data to copy around. The point of copy-on-write snapshots is that the cost of copying your whole file system is essentially zero.
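That near-zero cost comes from snapshots sharing data blocks until one side writes. A minimal sketch of the idea (a toy model with made-up names, not btrfs's actual extent trees):

```python
class ToyFS:
    """Toy block-sharing snapshot model (illustrative only)."""

    def __init__(self):
        self.store = {}    # block id -> data
        self.next_id = 0
        self.root = {}     # live tree: filename -> block id

    def write(self, tree, name, data):
        # COW: every write allocates a fresh block, so blocks shared
        # with other snapshots are never touched.
        self.store[self.next_id] = data
        tree[name] = self.next_id
        self.next_id += 1

    def snapshot(self, tree):
        # Snapshotting copies only the small name->block map; the data
        # blocks themselves stay shared, so the cost is ~zero no matter
        # how much data the filesystem holds.
        return dict(tree)

fs = ToyFS()
fs.write(fs.root, "a.txt", "v1")
snap = fs.snapshot(fs.root)       # shares the block holding "v1"
fs.write(fs.root, "a.txt", "v2")  # live tree gets a new block
# snap still points at the "v1" block; the live tree sees "v2".
```

No data was copied at snapshot time; only the later write paid for a new block.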

To make a normal-application analogy: of course you don't really need Undo/Redo -- you can just "Save As" your file at every step of the way -- but Undo/Redo makes things a hell of a lot more convenient. More importantly, it helps you in those situations where you would have considered a "Save As" too much work to bother with, and thus have no backup to fall back on.

Thanks for the correction, as that definitely is a notable difference.

Now, if we can get a filesystem that supports auto-tiering (where it knows which array is SSD and which is spindles, and places data accordingly based on how often it is accessed), that would be great. Outside of EMC's offerings, I don't know of any that are really available.