On Mon, Aug 04, 2003 at 03:56:04PM +0200, Stephan von Krawczynski wrote:
> On Mon, 4 Aug 2003 15:44:15 +0200 Andries Brouwer <aebr@win.tue.nl> wrote:
>
> > Quite a lot of software thinks that the file hierarchy is a tree,
> > if you wish a forest.
> >
> > Things would break badly if the hierarchy became an arbitrary graph.
>
> Care to name one? What exactly is the rule you see broken? Sure you can
> build loops, but you cannot prevent people from doing braindamaged things
> to their data anyway. You would not ban dd either for being able to
> flatten your harddisk only because of one mistyping char...

Only root can 'dd' to my disks, but anyone can do 'mkdir a; ln a a/a' and get even the simple things really messed up. You can't even 'rm -rf' it anymore.

Currently rm doesn't need to concern itself with loops. If something doesn't go away, it is a non-empty directory that needs to be traversed. With loops, it suddenly has to track inode and device numbers to identify potential cycles in the path. Hundreds of simple applications suddenly become more complex. Can you imagine 'rm' running your machine out of memory on a reasonably sized tree?
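To illustrate the bookkeeping rm would need, here is a minimal sketch of cycle-safe traversal. Since Linux refuses hardlinks to directories, a symlink loop stands in for one; the helper name walk_safely is mine, not any real tool's API:

```python
import os
import tempfile

def walk_safely(root):
    """Depth-first traversal that remembers every visited directory's
    (st_dev, st_ino) pair -- the extra state rm would suddenly need if
    the hierarchy could be an arbitrary graph."""
    seen = set()
    stack = [root]
    visited = []
    while stack:
        path = stack.pop()
        st = os.stat(path)              # follows symlinks
        key = (st.st_dev, st.st_ino)
        if key in seen:                 # cycle: already traversed this one
            continue
        seen.add(key)
        visited.append(path)
        for entry in os.scandir(path):
            if entry.is_dir():          # follows symlinks by default
                stack.append(entry.path)
    return visited

# Directory hardlinks are forbidden, so fake 'mkdir a; ln a a/a'
# with a symlink loop instead.
tmp = tempfile.mkdtemp()
os.mkdir(os.path.join(tmp, "a"))
os.symlink(os.path.join(tmp, "a"), os.path.join(tmp, "a", "a"))
print(len(walk_safely(tmp)))  # terminates despite the loop
```

Note that `seen` grows with every directory visited: on a big tree that set is exactly the memory cost the paragraph above complains about.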

Also, all hardlinked name entries point at the same object. However, '..' could be the parent directory of any one of the name entries, but clearly not more than one at a time. So we actually have two (or more) objects that do not have identical contents but share the same supposedly unique object identifier (inode number).
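A toy in-memory model makes the '..' ambiguity concrete. The class and names here are purely illustrative, nothing like the real VFS:

```python
class Inode:
    """Toy inode: one object, possibly reachable under several names."""
    def __init__(self, number):
        self.number = number
        self.entries = {}  # name -> Inode, including ".."

a = Inode(100)
b = Inode(101)
shared = Inode(200)            # one inode, hardlinked into both a and b

a.entries["d"] = shared
b.entries["d"] = shared
# ".." lives on the inode itself, so it can record only ONE parent:
shared.entries[".."] = a

def resolve(start, path):
    node = start
    for comp in path.split("/"):
        node = node.entries[comp]
    return node

print(resolve(a, "d/..") is a)  # True: correct for the path through a
print(resolve(b, "d/..") is a)  # True: but the b-path user expected b
```

Whichever parent '..' stores, every other hardlinked path resolves '..' to the wrong place.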

Should we be allowed to 'unlink' a directory that is non-empty? But then rmdir has to check all children, counting the number of subdirectories (i.e. fetching the attributes of every child) to compensate for the additional link counts added by the '..' entries. And to avoid a race with another unlink, or changes to the directory while we are traversing it, this has to happen while all access to the directory is locked out. Not very scalable, especially when your directories contain tens of thousands of users and gigabytes of data.
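The per-child cost of that check can be sketched as follows; the function is hypothetical, just the scan a non-empty-unlink would be forced to do:

```python
import os
import tempfile

def links_held_by_children(path):
    """What a hypothetical 'unlink non-empty directory' would have to do
    first: examine every child to count subdirectories, because each
    child directory's '..' entry adds one to the parent's link count.
    That is one attribute fetch per child -- and in the real kernel it
    would all have to happen with the directory locked against changes."""
    count = 0
    with os.scandir(path) as it:
        for entry in it:
            if entry.is_dir(follow_symlinks=False):
                count += 1
    return count

tmp = tempfile.mkdtemp()
for i in range(5):
    os.mkdir(os.path.join(tmp, "sub%d" % i))
open(os.path.join(tmp, "file"), "w").close()
print(links_held_by_children(tmp))  # 5 subdirectories -> 5 extra links
```

With five children this is trivial; with tens of thousands of entries it is a long scan under an exclusive lock.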

Now if we don't allow the unlink for non-empty directories, it is possible to create a loop that can never be removed: 'ln / /tmp/foo'.

Typical optimizations for directory traversal make use of the fact that child directories increase the link count. When reading directory entries, once (linkcount - 2) directories have been seen we know all remaining entries are files. The additional references make this optimization useless.
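A sketch of that classic find-style shortcut, under the assumption that the filesystem maintains the link-count invariant (some, e.g. btrfs, keep directory nlink at 1, so the sketch falls back to statting everything there):

```python
import os
import stat
import tempfile

def classify_entries(path):
    """A directory's link count is 2 + (number of subdirectories), so
    once nlink-2 subdirectories have been found, the remaining entries
    must be non-directories and need no stat at all.  Hardlinked
    directories would inflate link counts and kill this shortcut."""
    nlink = os.stat(path).st_nlink
    # Filesystems that don't maintain the invariant report nlink < 2;
    # fall back to statting every entry in that case.
    dirs_left = nlink - 2 if nlink >= 2 else len(os.listdir(path))
    dirs, others = [], []
    for name in sorted(os.listdir(path)):
        if dirs_left > 0:
            if stat.S_ISDIR(os.lstat(os.path.join(path, name)).st_mode):
                dirs.append(name)
                dirs_left -= 1
            else:
                others.append(name)
        else:
            others.append(name)  # no stat needed: cannot be a directory
    return dirs, others

tmp = tempfile.mkdtemp()
for name in ("d1", "d2"):
    os.mkdir(os.path.join(tmp, name))
for name in ("f1", "f2", "f3"):
    open(os.path.join(tmp, name), "w").close()
print(classify_entries(tmp))
```

Once both subdirectories are found, f1-f3 are classified with no stat calls; with hardlinked directories the link count would overstate dirs_left and the shortcut would stat everything anyway.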

NFS exporting a filesystem is another good example. NFS is stateless and currently identifies files with a cookie that contains the (device, inode) pair. But with hardlinked directories, inode numbers alone no longer identify the file: we need to know who the parent is, and the parent's inode number isn't unique anymore either, so we would have to pass device/[list of all parents]/ino.
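The size consequence can be shown with toy handles. This is NOT the real NFS wire format, just an illustration of fixed cookie versus parent chain:

```python
import struct

def tree_handle(dev, ino):
    """In a tree, (device, inode) already names the file uniquely:
    a fixed-size cookie, however deep the file sits."""
    return struct.pack("!QQ", dev, ino)

def graph_handle(dev, parent_inos, ino):
    """With hardlinked directories, the same inode is reachable through
    several parents, so the handle must also spell out which chain of
    parents was used -- it grows with directory depth."""
    return struct.pack("!QQ", dev, ino) + b"".join(
        struct.pack("!Q", p) for p in parent_inos)

fixed = tree_handle(0x801, 1234)
deep = graph_handle(0x801, [2, 17, 93, 441], 1234)
print(len(fixed), len(deep))  # 16 vs 48: no longer a fixed-size cookie
```

A handle that encodes the parent chain is just a pathname in disguise, which is exactly the problem the next paragraph describes.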

As a result we're no better off than sending fixed pathnames over to the NFS server, and anyone who moves a directory from a to b breaks all other clients that happen to hold a reference to a file or directory anywhere in the tree below the recently moved directory.