I have also been looking forward to this; it's one of the major features that has been missing.
This, together with Btrfs's ability to change RAID levels, add new disks to an existing array on the fly, and rebalance, is really nice. Here Btrfs has a real edge over ZFS, which is very static once created.
According to the Btrfs FAQ, there are also plans for per-subvolume and per-file RAID levels, which will come in handy in some cases.

RAID 1 has no two-disk limitation. Judging from Chris Mason's email, it appears that btrfs does. You can read it for performance information.

With that said, Chris claims to have implemented an LRU stripe cache. That would imply double buffering with the Linux disk cache, which wastes memory. He also says that this is a clone of what md raid already does, which implies that it suffers from the RAID write hole. It also means that the md raid code is being duplicated, which is generally a bad thing for maintainability.
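For anyone unfamiliar with the term, here is a toy sketch of the RAID write hole using single XOR parity. It is not btrfs or md code; the `Stripe` class and the `crash_before_parity` flag are made up purely for illustration, and real implementations are far more involved.

```python
def xor(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


class Stripe:
    """Hypothetical in-memory stripe: N data blocks plus one XOR parity block."""

    def __init__(self, data_blocks):
        self.data = list(data_blocks)
        self.parity = bytes(len(data_blocks[0]))
        for block in self.data:
            self.parity = xor(self.parity, block)

    def write(self, index, new_block, crash_before_parity=False):
        old_block = self.data[index]
        self.data[index] = new_block  # the data block reaches disk first
        if crash_before_parity:
            return  # simulated power loss: the parity update never lands
        # Read-modify-write parity update: remove the old block, add the new one.
        self.parity = xor(xor(self.parity, old_block), new_block)

    def reconstruct(self, lost_index):
        """Rebuild one lost data block from parity and the surviving blocks."""
        result = self.parity
        for i, block in enumerate(self.data):
            if i != lost_index:
                result = xor(result, block)
        return result


stripe = Stripe([b"AAAA", b"BBBB", b"CCCC"])
stripe.write(0, b"ZZZZ", crash_before_parity=True)
# Block 1 was never touched, yet it can no longer be rebuilt correctly.
print(stripe.reconstruct(1) == b"BBBB")  # False
```

The nasty part is that the damaged block is one the interrupted write never touched, and nothing looks wrong until a disk actually fails.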

It looks like I was a little unclear here. (And a little mistaken!) With ordinary RAID1, every disk in the system is a copy of the others. I had been given the impression that btrfs RAID1 just means "put everything in two places". So, on my system with btrfs RAID1, my four 2TB drives would give me 4TB of storage, while ordinary RAID1 (assuming I found an implementation that would accept more than two disks) would only give me 2TB. However, a check with "btrfs filesystem df" tells me that I only have 1.30 TB of space! Crap. Well, at least btrfs's flexibility should let me easily switch to RAID5 or RAID6 in the future.
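The capacity arithmetic above can be sketched roughly as follows. This is a simplification that ignores metadata and allocation overhead; the function names are made up for the example, and the "two copies" formula is the usual rule of thumb for a layout that stores each block on two different disks, not something taken from the btrfs source.

```python
def mirror_all_capacity(disks_tb):
    """Classic RAID1: every disk holds a full copy, so the smallest disk is the limit."""
    return min(disks_tb)


def two_copy_capacity(disks_tb):
    """'Put everything in two places': each block lives on two different disks."""
    total = sum(disks_tb)
    largest = max(disks_tb)
    # Half of the raw space, unless one disk is so large that the other
    # disks together cannot hold the second copy of what is stored on it.
    return min(total / 2, total - largest)


disks = [2, 2, 2, 2]  # four 2TB drives, as in the post
print(mirror_all_capacity(disks))  # 2
print(two_copy_capacity(disks))    # 4.0
```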

Also, given that the performance test results were just sitting there in the email, they really should have been included in the article. I'm a bit disappointed, though, that Chris only compared btrfs RAID to MD RAID. I would have liked to see btrfs RAID1 compared to btrfs RAID5/6.

But what about RAID7 (like raidz3 in zfs) and support for ditto blocks?

RAID 7 is not a standard RAID level. As for ditto blocks, I was told by the btrfs developers that they already have something equivalent to ditto blocks. btrfs also lets people turn them off, which makes btrfs' disk format vulnerable to ghost writes.

Originally Posted by benmoran

I'm really looking forward to when this hits mainline. I don't know if this was stated yet, but another advantage of file system aware RAID like btrfs is potentially faster scrubbing. You only need to scrub the blocks that are occupied by file data, which is awesome for massive arrays. Another advantage to btrfs raid is that it's super easy to mount. You can just mount whichever disk you want in the array, and the entire array is assembled automatically. This can certainly have its uses.

Scrubs should verify both data and parity. Otherwise, you can have inconsistent parity that would cause failed rebuilds following disk failures.
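A minimal sketch of why that matters, assuming plain single XOR parity (RAID5-style). The helper names are invented for the example and have nothing to do with the actual btrfs scrub code.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR any number of equal-length byte strings together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def scrub_stripe(data_blocks, stored_parity):
    """Return True if the stored parity matches parity recomputed from the data."""
    return xor_blocks(data_blocks) == stored_parity


def rebuild(surviving_blocks, stored_parity):
    """Reconstruct the single missing data block from the survivors and parity."""
    return xor_blocks(surviving_blocks + [stored_parity])


d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
good_parity = xor_blocks([d0, d1, d2])
stale_parity = bytes(4)  # pretend the parity block was silently corrupted

print(scrub_stripe([d0, d1, d2], good_parity))   # True
print(scrub_stripe([d0, d1, d2], stale_parity))  # False
print(rebuild([d0, d2], stale_parity) == d1)     # False
```

A scrub that only checked the data blocks would report this stripe as healthy, and the problem would only show up during a rebuild.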

Originally Posted by LasseKongo

I have also been looking forward to this; it's one of the major features that has been missing.
This, together with Btrfs's ability to change RAID levels, add new disks to an existing array on the fly, and rebalance, is really nice. Here Btrfs has a real edge over ZFS, which is very static once created.
According to the Btrfs FAQ, there are also plans for per-subvolume and per-file RAID levels, which will come in handy in some cases.

I would say that ZFS has an edge. Performance is more important than the ability to reshape. ZFS's raidz is immune to the RAID write hole. That gives ZFS a considerable performance advantage over both software and hardware RAID 5/6. btrfs raid 5/6 might outperform MD RAID 5/6 in benchmarks, but btrfs RAID 5/6 will still incur a performance penalty from the RAID write hole. What you gain in the ability to reshape with btrfs, you lose in performance. You feel performance throughout the time that a system is deployed, while reshaping is incredibly rare.

RAID 7 is not a standard RAID level. As for ditto blocks, I was told by the btrfs developers that they already have something equivalent to ditto blocks. btrfs also lets people turn them off, which makes btrfs' disk format vulnerable to ghost writes.

Yes, RAID7 is not a standard RAID level. I just made it up to denote a RAID level with block-level striping with triple distributed parity instead of double distributed parity as with RAID6. ZFS implements this kind of parity in raidz3 which yields a fault tolerance of up to three failed drives in a storage pool. If ZFS can implement it without a hitch then so can btrfs.
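As a rough illustration of the trade-off between single, double and triple parity, here is the basic arithmetic. The six 2TB disks are a hypothetical pool chosen for the example, and the function is not modeled on any real tool.

```python
def parity_array(disk_count, disk_tb, parity_disks):
    """Return (usable TB, tolerated disk failures) for striping with distributed parity."""
    if disk_count <= parity_disks:
        raise ValueError("need more disks than parity blocks per stripe")
    usable_tb = (disk_count - parity_disks) * disk_tb
    return usable_tb, parity_disks


# Hypothetical pool: six 2TB disks.
for name, parity in [("RAID5", 1), ("RAID6", 2), ("triple parity (raidz3-style)", 3)]:
    usable, tolerated = parity_array(6, 2, parity)
    print(f"{name}: {usable} TB usable, survives {tolerated} failed disk(s)")
```

Each extra level of parity costs one disk's worth of capacity and buys tolerance for one more failed drive.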

To be honest, I can't see why anyone would even be interested in using RAID5 at all, especially after reading what's on the 'baarf.com' website. There is no sensible reason to go lower than a two-drive fault tolerance when building a storage pool (or a disk array, if you will); disk drives are really not that expensive. So RAID6 or "RAID7" is way more desirable.

I'm not sure if ditto blocks are the *only* way to protect against "ghost writes", or whether they should even be necessary on e.g. a raidzn or a fully mirrored storage pool for full protection against silent corruption. But for disk pools where you don't use any type of RAID (e.g. the system pool or rpool), ditto blocks have turned out to be invaluable. So if you don't want to, say, mirror your system pool, then ditto blocks are a viable option if you want at least some level of protection.

If I remember correctly, ZFS is even more memory hungry. Furthermore, it does not really use extents, and as a result the overall filesystem design seems to be so slow that it can only perform reasonably well with gigabytes of memory buffers. Otherwise it gets painfully slow. Most tuning/install guides suggest not even trying ZFS without ~2 GB or so. Btrfs appears to be less problematic in this regard.

He also says that this is a clone of what md raid already does, which implies that it suffers from the RAID write hole.

I guess that if it were really per-file or per-subvolume, this would be much less of a problem, especially in mixed/multithreaded/... environments.

Also, the ability to reshape may be rarely used, but when you have to reshape many TiBs, it is a great option to have, simply because it's a very troublesome operation otherwise. And btrfs seems to be really flexible when it comes to allocating things, one way or another. And Sun always preferred to publish far more colorful marketing bullshit than the actual design was worth. Sure, they had a business to run, but what they achieved by that is that I now take their achievements with a grain of salt.