The content of 2010-12-02 is mostly hardlinks back to files in the 2010-11-28 directory, but there are a few new or changed files only in 2010-12-02. On Linux, the 'du' utility will tell me the actual size taken by each incremental snapshot. On Windows, Explorer and du under Cygwin are both fooled by hardlinks and show 2010-12-02 taking up a little more space than 2010-11-28.

Is there a Windows utility that will show the correct space actually used?

3 Answers

Try Sysinternals Disk Usage (otherwise known as du). Used with the -u and -v flags, it counts only unique occurrences and shows the usage of each folder as it goes along.

As far as I know the file system doesn't show the difference between the original file and a hard link (that is really the point of a hard link) so you can't discount them on a folder-by-folder basis, but need to do this comparatively.
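For what it's worth, the idea of counting each file only once can be sketched in Python. This is a rough illustration, not how Sysinternals du is implemented. It assumes Python 3 on a file system where os.lstat reports a usable st_dev/st_ino pair (Linux, or NTFS with a recent Python), and the name unique_size is made up for the example. It sums logical file sizes, not allocated clusters.

```python
import os

def unique_size(root):
    """Sum file sizes under root, counting each hard-linked file once."""
    seen = set()   # (device, inode) pairs that were already counted
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # lstat: measure the directory entry itself, do not follow symlinks
            st = os.lstat(os.path.join(dirpath, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue   # another name for data we already counted
            seen.add(key)
            total += st.st_size
    return total
```

Calling unique_size on a single snapshot folder only removes duplicates inside that folder. To compare snapshots, the same "seen" set has to be shared across them, which is why the result depends on what else was scanned.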

To test, I created a folder with 6 files in it and cloned the whole thing. I then created several hard and soft links inside the first folder, some referencing other files in the first folder and some referencing files in the second.

Running du -u -v testFld results in (note the values next to the folders are in KiB):

Notice the mismatch?
The hard links in A that refer to files in B are only counted against A during the "full" run, so B only returns 54 (even though the files were originally in B and hard-linked from A). When you measure B separately (or if you don't use the -u unique flag), it reports its "full" measure of 74.
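The order dependence can be reproduced with a small Python sketch. This is only an illustration under assumptions: it is not how Sysinternals du works internally, the function name charge_in_order is hypothetical, and it relies on os.lstat returning a usable st_dev/st_ino pair. Each file is charged to whichever folder reaches it first.

```python
import os

def charge_in_order(folders):
    """Scan folders in the given order and charge each file
    to the first folder in which it is encountered."""
    seen = set()      # (device, inode) pairs charged so far, shared by all folders
    charged = {}
    for folder in folders:
        total = 0
        for dirpath, _dirnames, filenames in os.walk(folder):
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                key = (st.st_dev, st.st_ino)
                if key not in seen:
                    seen.add(key)
                    total += st.st_size
        charged[folder] = total
    return charged
```

Scanning A and then B charges the shared files to A, so B looks smaller. Scanning B on its own starts with an empty set, so B gets its full size back.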

Thanks, I didn't know about the Sysinternals du, just the Cygwin one. Apparently the Cygwin du does what I want as well; I just didn't think to try it before starting the bounty.
– kbyrd Dec 13 '10 at 15:28

Hardlink support is not turned on out of the box: go to Tools > Options > Scan, re-scan, then use Ctrl-1 and Ctrl-2 to switch between Size and Allocated space. Allocated is actual space used, while Size is the statistic normally reported by other programs.

There is a performance penalty for turning on hardlink support (and symlinks and mounts too if you want that also). The colour palette is garish for my taste, but that seems to be par for the course in this genre. Also be careful when clicking around in the box chart area -- it's easy to accidentally move a folder with a mistaken drag-n-drop when you only meant to expand it.

Windows cannot "detect" hardlinks, since every file is actually a hardlink to a bunch of bytes on the disk.

The du tool detects duplicates, but its result is misleading too: if folder A contains files and B contains only hardlinks to the files in A, then du of A and du of B will return the same answer, namely the size of the files that originally came from A but are now also in B.

This is actually correct: if you deleted A, for example, its files would not be deleted from the disk, because they are still referenced by B. With hard links, which file is the source and which one is the hard link is quite arbitrary and meaningless.
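A quick way to convince yourself of this is the following Python sketch. It is only a demonstration under assumptions: the folder and file names are made up, and it needs a file system that supports os.link (NTFS or a typical Linux file system).

```python
import os
import tempfile

def survives_delete():
    """Create A/file.txt, hard-link it as B/file.txt, delete the A name,
    and read the data back through the B name."""
    with tempfile.TemporaryDirectory() as d:
        a_dir = os.path.join(d, "A")
        b_dir = os.path.join(d, "B")
        os.makedirs(a_dir)
        os.makedirs(b_dir)
        a_file = os.path.join(a_dir, "file.txt")
        b_file = os.path.join(b_dir, "file.txt")
        with open(a_file, "w") as f:
            f.write("payload")
        os.link(a_file, b_file)                  # second name for the same data
        links_before = os.stat(b_file).st_nlink  # link count while both names exist
        os.remove(a_file)                        # remove the "original" name
        links_after = os.stat(b_file).st_nlink   # link count with one name left
        with open(b_file) as f:
            data = f.read()                      # data is still reachable via B
    return links_before, links_after, data
```

The link count drops when the name in A is removed, but the content stays readable through B, so neither name is more "original" than the other.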

Products such as du will list a directory while discounting duplicates.
This will only work if all files and hard-links are contained in one directory.
Many folder-list products do that.

Conclusion: With hard-links, the question of "the actual size used in an NTFS directory" is meaningless.