So I have a NAS with two different filesystems; the primary filesystem holds 33 TB of data. I rsync'd the data over to the second filesystem. df shows about 299 GB less data on the backup filesystem than on the primary filesystem. I wrote a script to walk the filesystem and compare file sizes via ls, and they match, but df shows a different size. Why?

The backup filesystem is a cluster of 4.5 TB disks; DEFAULT_BLOCKSIZE is 4096 and Readahead is 512KB.

The primary filesystem is a cluster of 2.1 and 2.7 TB disks; DEFAULT_BLOCKSIZE is 4096 and Readahead is 2048KB.

Thanks.

pan64

11-06-2012 07:02 AM

You can use du to check the occupied disk space. It would be nice to see how rsync was configured/invoked. Do you have sparse files? Rsync is also able to create hard links in some cases.
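As a rough sketch of the kind of check suggested here, the following Python walks a tree and totals both the apparent size (what ls reports) and the space actually allocated (what du and df are based on). The function name `tree_usage` is made up for this illustration, and it assumes a Unix-like system where `st_blocks` is available and counted in 512-byte units. Each inode is counted once, so hard-linked files are not double-counted.

```python
import os

def tree_usage(root):
    """Return (apparent_bytes, allocated_bytes) for non-directory entries under root.

    Hypothetical helper for illustration. Each (device, inode) pair is
    counted only once, so hard links do not inflate the totals.
    """
    seen = set()
    apparent = 0
    allocated = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # another hard link to an inode already counted
            seen.add(key)
            apparent += st.st_size          # size as shown by ls -l
            allocated += st.st_blocks * 512  # st_blocks is in 512-byte units
    return apparent, allocated
```

Running this on both the source and the backup tree and comparing the two pairs would show whether the apparent sizes match while the allocated sizes differ, which is the pattern described in the original question.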

rknichols

11-06-2012 09:55 AM

Best guess would be sparse files, but only if you used rsync's "-S" (--sparse) option. If you did use that option, any 4096-byte blocks that were all zeros will not use space on the destination regardless of whether they used space on the source. (Without that option, rsync will never make a sparse file, and the destination could use significantly more space than the source.)
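To see which files are candidates for this effect, one option is to flag files whose allocated space is smaller than their apparent size. This is only a sketch: `is_sparse` and `find_sparse` are names invented here, `st_blocks` is assumed to be in 512-byte units, and a file can also show fewer allocated bytes than its length for other reasons (transparent compression, or very small files stored inline in the inode on some filesystems).

```python
import os

def is_sparse(st_size, st_blocks):
    """True if fewer bytes are allocated than the file's apparent size."""
    return st_blocks * 512 < st_size

def find_sparse(root):
    """Return a list of (path, apparent_bytes, allocated_bytes) for
    regular files under root that look sparse (hypothetical helper)."""
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            # Skip symlinks, sockets, devices: only regular files matter here.
            if not os.path.isfile(path) or os.path.islink(path):
                continue
            if is_sparse(st.st_size, st.st_blocks):
                hits.append((path, st.st_size, st.st_blocks * 512))
    return hits
```

Comparing the output for the same path on the source and the destination would show whether rsync wrote holes on the backup where the source had allocated zero-filled blocks.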

Some other cases where less space is used on the destination:

Files that are unlinked from the directory tree on the source but are still held open by some process,

Files that are in the process of being written on the source and thus have some extra space pre-allocated,

Directories that once contained a large number of entries but now contain only a few. Space allocated to a directory file is never reclaimed until the directory is removed entirely. That unneeded space will not be allocated on the destination.

Of those, only the first is likely to involve any significant amount of space.
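The first case, files that are unlinked but still held open, can be checked on Linux by looking at the symlinks under /proc/<pid>/fd, whose targets end in " (deleted)" for such files. The sketch below assumes a Linux /proc layout and enough privileges to read other processes' fd directories; the function names are invented for this example. It returns an empty list on systems without /proc.

```python
import os

DELETED_SUFFIX = " (deleted)"

def is_deleted_target(target):
    """True if a /proc/<pid>/fd link target is marked as deleted."""
    return target.endswith(DELETED_SUFFIX)

def deleted_open_files(proc="/proc"):
    """Return (pid, target, allocated_bytes) for open-but-unlinked files.

    Hypothetical helper. Processes or descriptors that cannot be read
    (permissions, or they exited in the meantime) are skipped.
    """
    results = []
    if not os.path.isdir(proc):
        return results
    for pid in os.listdir(proc):
        if not pid.isdigit():
            continue
        fd_dir = os.path.join(proc, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                target = os.readlink(link)
                if not is_deleted_target(target):
                    continue
                st = os.stat(link)  # follows the link to the open file
            except OSError:
                continue
            results.append((int(pid), target, st.st_blocks * 512))
    return results
```

The list covers every filesystem on the machine, including things like shared-memory segments, so only entries whose path lies on the source filesystem are relevant to the df discrepancy.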

Afterthought: By any chance was the source a filesystem that supported transparent compression?

ptricky

11-07-2012 05:14 PM

Hmmm... on the source filesystem there were a couple of directories that could not be deleted due to dangling inodes, and fixing that requires an fsck. So I think "Space allocated to a directory file is never reclaimed until the directory is removed entirely" is probably the likely cause.
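One way to test this guess is to total the space allocated to the directory files themselves on each side and compare the result with the 299 GB gap. The following is a minimal sketch, assuming `st_blocks` is in 512-byte units; `directory_overhead` is a name made up for this example. Directories that os.walk cannot read are silently skipped, so damaged directories like the ones mentioned above may not be fully counted.

```python
import os

def directory_overhead(root):
    """Return (directory_count, allocated_bytes) for root and every
    directory below it, counting only the directory files themselves."""
    count = 0
    allocated = 0
    for dirpath, dirnames, filenames in os.walk(root):
        st = os.lstat(dirpath)
        count += 1
        allocated += st.st_blocks * 512  # space held by the directory file
    return count, allocated
```

If the difference between the two filesystems is far smaller than the df discrepancy, directory space alone would not explain it.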