File systems are semantically impoverished compared to database and keyword systems: it is time to change!

Let's face it, filesystems are semantically impoverished compared to database and keyword systems. Throughout the operating system, developers are fragmenting the OS into multiplicitous name spaces. They are doing it because file systems are failing them, failing to provide adequate answers to their needs. These incompatible name spaces are like separate nations with customs barriers, devastating efficiency by reducing the ability of components to interact. Let us not exclude by accident from our society of objects the way Microsoft excludes by design. Let us, one piece at a time, remove the reasons why the filesystem is not the grand unifying namespace of the OS; let us let it conduct the interactions of the OS components like the bus on a motherboard.

These features I would add are just a few trivial features, enough to start with without biting off too much at a time, but there will be many more. They are designed to be general enough that they will stand for a long time, even in a richer syntax competitive with databases and IR systems.

Enough of application developers implementing one sorry-ass special-purpose namespace after another. Enough of Emacs failing me when my linux-kernel mailbox contains 12000 emails because Emacs had to roll its own namespace, and didn't have enough programmer resources to do it as well as someone who is specialized in namespace and balanced tree design would have. Let us do it right, and do it in the filesystem, and leave the implementing of the same feature in 12 incompatible ways in 12 applications to Microsoft. Or do we want to acknowledge Bill is right when he says that Microsoft's top-down organization creates advantages due to a unity of architecture that Linux cannot match?

It is time for us to stop foisting storage management work off onto the application, so that 100s of developers no longer have to do the work of one FS developer because he is unconcerned with their needs.

Namespace design is the most important question in OS design, and unity of namespaces is to OS utility what road construction was to the Roman Empire: the most important determinant of Roman wealth.

As for worrying about whether FAT and other filesystems will be able to support richer semantics, I say that while other FS developers may wish to constrain us to the least common denominator of FS functionality, I'd be a fool to let them.

All of that said, it is important to design the overloading of file and directory names, and inheritance, and filters, and all of these other features, very carefully, because once we implement we are likely to be stuck with our errors for 15 years.

tytso@mit.edu writes:
>
> Before we go running into a deep technical discussion about how to
> design different streams inside a file, we should first stop ask
> ourselves how they will be *used*.
>
> Something that folks should keep in mind is that as far as I have been
> able to determine, Microsoft isn't actually planning on using streams
> for anything. As near as I can tell it was added so that their SMB
> servers could replace Appleshare servers more efficiently, but that's
> really about it. I don't believe, for example, that MS Office 2000 is
> going to be using the streams functionality at all, and this is for a
> very good reason.
>
> Streams really lose when you need to send them across the internet. How
> do you send a multifork file across FTP? Or HTTP? What if you want to

Forgive me, but why is it harder to send a multifork file across the internet than to send a tar file? I don't doubt you; I am sure they found some way to screw this up. I am just ignorant of MS/Apple ways.

> put the multifork file on a diskette that's formatted with a FAT
> filesystem for transport to another OS? What if you want to tar a
> multifork file? Or use a system utility like /bin/cp or /usr/bin/mc
> that doesn't know about multifork files?
>
> One of the reasons why the Apple resource-fork was a really sucky idea
> in practice was that executables stored dialog boxes, buttons, text, all
> in resources --- which would get lost if you tried to ftp the file
> unless you binhexed or otherwise prepped the file for transfer first.

But if you send a directory, and you use an old implementation of ftp, you have to tar it up first too.
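
To make the comparison with a directory concrete, here is a minimal sketch in Python of packing a multifork file for transfer the same way one would pack a directory. It assumes, purely for illustration, that each fork is exposed as a named byte string; pack_forks, unpack_forks and the fork names are hypothetical and are not any existing filesystem's interface.

```python
import io
import tarfile


def pack_forks(name, forks):
    """Pack a multifork file into one tar archive, one member per fork.

    `forks` maps a (hypothetical) fork name to that fork's bytes.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for fork_name, data in forks.items():
            # Each fork is stored as if it were a file inside a directory.
            info = tarfile.TarInfo(name=f"{name}/{fork_name}")
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_forks(blob):
    """Recover the fork-name -> bytes mapping from a packed archive."""
    forks = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                fork_name = member.name.split("/", 1)[1]
                forks[fork_name] = tar.extractfile(member).read()
    return forks
```

The resulting byte string is an ordinary single-stream file, so it can go over ftp or onto a FAT diskette like any other tar file.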

I do however acknowledge that there is a general problem with virtual files (e.g. symlinks) which is that utilities that don't understand that they are virtual don't know how to pack them up and transfer them. Maybe we would benefit by some sort of general virtual file attribute, and a standard means for accessing them as either their virtual contents or their non-virtual contents. Maybe it would be nice to have this be symmetric across all the different types of virtual files, so that tar does not need to be taught new tricks every time a new virtual file type is implemented.
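
Symlinks already show the two views in miniature. The following is a rough sketch in Python, using only the existing os calls for symlinks; the function names is_virtual, read_virtual and read_nonvirtual are made up for illustration, and a general interface covering other virtual file types is an assumption, not something that exists today.

```python
import os


def is_virtual(path):
    """Stand-in for a general 'virtual file' attribute; only symlinks here."""
    return os.path.islink(path)


def read_virtual(path):
    """Virtual contents: the bytes of whatever the name resolves to."""
    with open(path, "rb") as f:
        return f.read()


def read_nonvirtual(path):
    """Non-virtual contents: what is actually stored under this name.

    For a symlink that is the target string, not the target's data.
    """
    if os.path.islink(path):
        return os.fsencode(os.readlink(path))
    with open(path, "rb") as f:
        return f.read()
```

An archiver written against the second view would store the link itself, and one written against the first would store what it points to, without either needing to know which kind of virtual file it is handling.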

>
> So I question the whole practical utility of file streams in the first
> place. The only place where they don't screw the user is if the
> alternate streams are used to store non-critical information where it
> doesn't matter if the information gets lost when you ftp the file or
> copy the file using a non-multi-fork aware application. For example,
> the icon of the file, so the display manager can more easily figure out
> what icon to associate with the file --- and of course, some people
> would argue with the notion that the icon isn't critical information,
> and that it should be preserved, in which case putting it in a alternate
> stream may not be such a hot idea.
>
> However, for speed reasons, a graphical file manager might do better to
> have a single file that has all of the icons cached in a few dot files
> (for security reasons, you will need a different dot file for each user
> who owns files in a directory). Said dot file would have information
> associating the name of the file, the inode number and mod time with the
> icon. If the icon cache is out of date, and an file appears in a
> directory without also updating the icon cache, the graphical file
> manager will have to find some way of determining the right icon to
> associate with the file. (But, this is a problem the graphical file
> manager would have to deal with anyway). The advantage of using a few
> dot files in each directory is that it will result in a lot fewer system
> calls and files needs that need read and touched than if the graphical
> file manager has to open the icon resource fork in each file just to
> determine which icon to display for that one file. So I don't even buy
> the argument multifork files are required to make graphical file
> managers faster; a few dot files in each directory would actually be
> more efficient, and would work across non-multi-fork aware remote
> filesystems like NFS. I don't think a graphical file manager that only
> worked on specialized filesystems would be all that well received!

Eeewwwww Yuck! Make ext2 work effectively for small files so you don't have to do this sort of thing, please....

Or contribute to reiserfs.... you know, it isn't so bad if you admit that we do some things well, and add code where we don't do things well. We could certainly benefit from your contributions.

It is time for filesystems to come out of deep freeze, it is time for them to aggressively compete with databases and keyword systems, and within 30 years I intend to see them more powerful than any Oracle database. You see, we have an advantage. We have less baggage than they do. Less functionality present now means that we can do things right, which they cannot do because of being tied to relational algebra.

>
> So before we design filesystems that support multi-forks, let's please
> think about how they will be used, and how they will interact with
> current systems that don't really support multiple forks, and in fact
> are quite hostile to the whole concept. What's the point of being able
> to treat a filesystem object as both directory and a file if none of the
> system utilities, file formats (like tar) and internet protocols don't
> really support it? Does it really buy us enough to be worth the effort?
> And if we don't know exactly how it will be used, how will we know what
> sort of performace/feature tradeoffs we need to make before it will be
> useful?
>
> - Ted

I believe that if we implement a protocol for virtual files generally, it will make things simpler and cleaner, and deal with the issues you have raised.

Ted, I would like to take a moment to thank you very much for an excellent filesystem that I have used for many years.

Hans
