Tag: Linux

Ext3 is a successor of Ext2 with support for journaling, which means it keeps a log of all the recent changes it made or is going to make to the file system. This allows fsck to get the file system back into a good state when, for example, a power failure happens. But what is the size of the journal? Reading the manpage for tune2fs, it says the journal needs to be between 1024 and 102400 blocks, which means it can start at 1MB on a file system with a 1KB block size and 4MB on a file system with a 4KB block size.
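
For example, adding a journal with a specific size to an Ext2 file system can be done with tune2fs; the device name and size below are just placeholders:

$ sudo tune2fs -j -J size=128 /dev/sdb1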

So let's start by finding the inode that contains the journal; normally this should be inode 8, unless you have a file system that was upgraded from Ext2 to Ext3/4.
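
Assuming the file system lives on /dev/sda1, the journal inode can be read from the superblock with dumpe2fs:

$ sudo dumpe2fs -h /dev/sda1 | grep -i 'journal inode'
Journal inode:            8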

Now that we have the inode responsible for the in-file-system journal, we can retrieve its details by doing a stat with debugfs on that inode. Debugfs retrieves the details from the inode, and one of them is the allocated size on disk.
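
Assuming /dev/sda1 again, such a stat on inode 8 would look like this; the reported size will of course differ per file system:

$ sudo debugfs -R 'stat <8>' /dev/sda1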

In a previous posting I started with BtrFS, and as mentioned, BtrFS supports snapshotting. With this you can create a point-in-time copy of a subvolume and even create a clone that can be used as a new working subvolume. To start we first need the BtrFS volume itself, which can and must always be identified as subvolid 0, because the default volume to be mounted can be altered to point at a subvolume instead of the real root of a BtrFS volume. We start by updating /etc/fstab so we can mount the BtrFS volume.
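
The entry in /etc/fstab could look like the line below; the label and mountpoint are just the ones I use:

LABEL=datavol /media/btrfs-datavol btrfs noauto,subvolid=0 0 0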

As /media is a temporary file system, meaning it is recreated with every reboot, we need to create a mountpoint for the BtrFS volume before mounting. After that we create two read-only snapshots with a small delay in between. As there is currently no naming convention for snapshots, I adopted the ZFS naming schema with the @-sign as separator between the subvolume name and a timestamp.
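
In commands that could look something like this, assuming a btrfs-progs that supports the -r flag for read-only snapshots; the timestamps are only examples:

$ sudo mkdir /media/btrfs-datavol
$ sudo mount /media/btrfs-datavol
$ sudo btrfs subvolume snapshot -r /media/btrfs-datavol/home /media/btrfs-datavol/home@20111112-1300
$ sudo btrfs subvolume snapshot -r /media/btrfs-datavol/home /media/btrfs-datavol/home@20111112-1305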

Creating snapshots is fun and handy for migrations or as an on-disk backup solution, but they do consume space, as the deltas between snapshots are kept on disk. This means that changes between snapshots remain on disk even when you remove files. Freeing disk space is therefore not only a matter of removing files from the current subvolume, but also of removing the previous snapshots that still reference the removed data.
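
Removing a snapshot that is no longer needed comes down to deleting it as a subvolume:

$ sudo btrfs subvolume delete /media/btrfs-datavol/home@20111112-1300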

As a last step we unmount the BtrFS volume again. This is where ZFS and BtrFS differ too much for my taste. To create and access snapshots on ZFS the zpool doesn't need to be mounted, but then again, with the first few releases of ZFS the zpool needed to be mounted as well. So there is still hope, as BtrFS is still under development.

$ sudo umount /media/btrfs-datavol

Seeing what is possible with BtrFS, Sun's TimeSlider becomes an option. So does the option of Live Upgrades with rollbacks, as is possible with Solaris 11, but for that BtrFS with read-write snapshots needs to be tested in the near future.

After using ZFS on Solaris, I missed the ZFS features on Linux, and with no chance of ZFS coming to Linux I had to make do with MD and LVM. Or at least until BtrFS became mature enough, and since Linux 3.0 that time has slowly come. With Linux 3.0, BtrFS supports automatic defragmentation and scrubbing of volumes. The second is maybe the most important feature of both ZFS and BtrFS, as it can be used to actively scan data on disk for errors.

The first tests with BtrFS were in a virtual machine already a long time ago, but the userland tools were still in development back then. Now the command btrfs follows the path set by Sun Microsystems and basically combines the ZFS commands zfs and zpool. But nothing compares to a test in the real world, and so I broke a mirror and created a BtrFS volume with the name datavol:

$ sudo mkfs.btrfs -L 'datavol' /dev/sdb2

Now we can mount the volume and create a subvolume on it, which we are going to use as our new home volume for the users' home directories.
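
Assuming /mnt as a temporary mountpoint, that comes down to:

$ sudo mount /dev/sdb2 /mnt
$ sudo btrfs subvolume create /mnt/home
$ sudo umount /mnt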

When updating /etc/fstab we can tell mount to use the volume label instead of a physical path to a device or some obscure UUID number. We can also tell it which subvolume to mount.

LABEL=datavol /home btrfs defaults,subvol=home 0 0

After unmounting and disabling the original volume for /home, we can mount everything and copy all the data, with rsync for example, to see how BtrFS works in the real world.

$ sudo mount -a
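
Assuming the original home volume is temporarily mounted on /mnt/home-old, the copy could then be done with:

$ sudo rsync -aHAX /mnt/home-old/ /home/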

As hinted before, scrubbing is important as it lets you verify that all your data and metadata on disk is still correct. By default the scrub repairs what it can, or you can do a read-only test to see if all data can still be accessed. There is even an option to read parts of the volume that are still unused. In the example below the subvolume for /home is being scrubbed, with success.
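
A minimal sketch of that scrub run, using the standard start and status subcommands; the counters in the status output will of course differ per system:

$ sudo btrfs scrub start /home
$ sudo btrfs scrub status /home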

The first glances of BtrFS in the real world are a lot better with kernel 3.1 than they were somewhere around kernel 2.6.30, and I'm slowly starting to say it is becoming ready to be included in RHEL 7 or Debian 8, for example, as the default storage solution, the same as ZFS became in Solaris 11. But it is not all glory, as a lot of work still needs to be done.

The first issue is encryption, as the LUKS era ends with BtrFS: it is not smart to put LUKS between your disks and BtrFS, because you lose the advantage of balancing data between disks when you do mirroring, for example. But then again, LVM has the same issue, where you also first need to set up software RAID with MD, with LUKS on top of it and LVM on top of that. For home directories EncFS may be an option, but it still leaves a lot of areas uncovered that would be covered by LUKS out of the box.

The second issue is the integration of BtrFS in distributions and the handling of snapshots. For now you first need to mount the volume before you can make a snapshot of a subvolume. The same goes for accessing a snapshot, and there I think ZFS still has an advantage with its .zfs directory, accessible to everyone who has access to the file system. But time will tell, and for now the first tests look great.