"For a fairly scruffy looking guy, I have a surprisingly healthy approach to organising my files. However, I'm constantly pushing up against the limitations of a system that is based around directories. I'm convinced that Linux needs to make greater use of tagging, but I'm also beginning to wonder if desktop Linux could abandon the hierarchical directory structure entirely."

First, we need transactional file systems. There is really no good reason not to have a transactional file system. It would make things like updates, installations, and removals much simpler, and it would make a lot of the common synchronization hacks unnecessary. The thing is, this really isn't that hard. I created a very primitive transactional file system prototype for Linux some months ago, but I haven't had time to finish it (I plan on basing it on Btrfs). Any user could do transactions, and they would never block. The basic algorithm was that if a transaction wanted to write to something that was being read, it would be cancelled, and if it wanted to read something that was being written, it would also be cancelled.

Second, we need indexing of extended attributes. BFS got this right. My music should just be a folder with a bunch of files that have metadata. There should be no database. I should be able to search for songs with complex logical queries, not just simple text searches like you would find in a standard music player (e.g. iTunes, Rhythmbox).
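To make this concrete, here is a toy sketch (in Python, with invented names; it is not how BFS is actually implemented) of the key idea: the file system itself maintains an attribute index that is updated at the moment an attribute is written, and arbitrary logical queries run against those attributes rather than a separate application database.

```python
from collections import defaultdict

class AttrFS:
    """Toy model of a file system that indexes extended attributes
    at write time (loosely inspired by BFS; all names are invented)."""
    def __init__(self):
        self.attrs = {}                # file name -> {attribute: value}
        self.index = defaultdict(set)  # (attribute, value) -> file names

    def set_attr(self, name, attr, value):
        old = self.attrs.setdefault(name, {}).get(attr)
        if old is not None:
            self.index[(attr, old)].discard(name)
        self.attrs[name][attr] = value
        self.index[(attr, value)].add(name)  # index updated immediately

    def query(self, pred):
        """Arbitrary logical query; a real implementation would use the
        index for the equality terms and filter the remainder."""
        return sorted(f for f, a in self.attrs.items() if pred(a))

fs = AttrFS()
fs.set_attr("bwv1046.flac", "Artist", "Bach")
fs.set_attr("bwv1046.flac", "Year", 1721)
fs.set_attr("music4_18.flac", "Artist", "Reich")
fs.set_attr("music4_18.flac", "Year", 1976)

# "Everything by Bach from before 1750" -- a logical query, not a text search:
print(fs.query(lambda a: a.get("Artist") == "Bach" and a.get("Year", 0) < 1750))
# ['bwv1046.flac']
```

The point of the sketch is that there is no userland indexer and no separate store that can drift out of sync: the index lives inside `set_attr` itself.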

Personally, I believe tagging is secondary to all of this. My mind naturally categorizes things hierarchically, but I have had times when I wished a file could be in two folders.

I am quite sure that the reason that none of these ideas have been implemented is not because they are hard, but because people stopped caring. File systems have hardly changed since the 1980s (the interface, not the implementations). I think the biggest problem with Linux is that most people are focused on creating a shiny interface, when the system below is inelegant and full of hacks. Of course, every major OS is like this, but I think it shows more in Linux. This is an area where Linux could really innovate and be better than Windows and Mac OS X.

This has been possible for ages. It's supported by all POSIX-compliant OSes, plus Windows NT4+. It's called a hard link.

Yes, I know about hard links. However, I have to drop to the command line to create them.

Also, I would generally like it so that if I delete the file from one directory, it would disappear from all the others too. Hard links don't work like that. (You could do that with symbolic links, but you would be left with broken links.)

Unless you have come up with a magical method of concurrency, there will always be blocking, unless you take on multi-versioning, but that brings with it its own issues to think through.

Nope, there is no blocking. Whenever a transaction would normally block, it is aborted. If two transactions are competing, the one with the higher priority always wins. Regular file operations are treated as transactions with infinite priority, so they are never aborted or blocked for transactions.
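The conflict rule described above can be sketched in a few lines of Python (a toy model, not the actual prototype; the names are invented): on any read/write conflict, the lower-priority transaction is aborted on the spot, so nothing ever waits, and plain file operations carry infinite priority so they can never lose.

```python
import math

class Tx:
    """A transaction with a priority. Plain (non-transactional) file
    operations are modelled as priority = infinity."""
    def __init__(self, priority):
        self.priority = priority
        self.alive = True

def resolve_conflict(a, b):
    """On a read/write conflict between two transactions, abort the
    lower-priority one immediately instead of blocking either side.
    On a tie, the second comer loses."""
    loser = b if a.priority >= b.priority else a
    loser.alive = False
    return loser

reader = Tx(priority=5)               # an ordinary user transaction
plain_write = Tx(priority=math.inf)   # a regular file operation
loser = resolve_conflict(plain_write, reader)
print(loser is reader, plain_write.alive)  # True True
```

Because aborting is the only outcome of a conflict, there is no lock queue and hence no blocking; the cost is that low-priority transactions may have to retry.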

One thing I forgot to mention: file names. We really need to stop relying on names to locate files. Something like a UUID would be much better. The name would be solely for display purposes, and would just be a regular indexed extended attribute. Links would reference the UUID, not the name. The entire file system would essentially be a giant database. You could query the file system based on any attributes, and the result would be a list of UUIDs. You could then open a file through the UUID. Directory structures could be implemented using a parent attribute that would refer to the "directory" (really a file) containing a file. To get a listing of the files in a directory, you would query for all files with a parent attribute equal to the directory's UUID. Tagging would be implemented in a similar way.
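The scheme above can be sketched as a toy Python model (all names here are invented for illustration): the whole file system is one table keyed by UUID, `name` and `parent` are ordinary attributes, and a directory listing is just a query.

```python
import uuid

store = {}  # uuid -> attribute dict; the whole "file system" is one table

def create(**attrs):
    """Create a file with the given attributes and return its UUID."""
    uid = uuid.uuid4()
    store[uid] = attrs
    return uid

def query(**match):
    """Return the UUIDs of all files whose attributes match."""
    return [u for u, a in store.items()
            if all(a.get(k) == v for k, v in match.items())]

root = create(name="music", type="directory")
create(name="fugue.flac", parent=root)
create(name="canon.flac", parent=root)

# Listing a directory = querying on the 'parent' attribute:
names = sorted(store[u]["name"] for u in query(parent=root))
print(names)  # ['canon.flac', 'fugue.flac']
```

Tagging falls out for free: a `tags` attribute queried the same way gives a file membership in any number of "folders" at once.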

Unfortunately, this is a bit harder to implement. The major problem is dealing with broken links. If you delete a file, do all the references to it go away, or stay broken? Would it be possible to create a file with a specific UUID in order to fix a broken link? These problems are a lot harder to solve, so I would not expect to see a system like this for a long time. It is somewhat similar to WinFS. Does anyone know how WinFS solves these problems?

Only if by "we" you mean Linux. Non-Linux systems have had transactional filesystems for years now (ZFS, HAMMERFS), and support for versioning in the filesystem (VMS).

There is really no good reason not to have a transactional file system. It would make things like updates, installations, and removals much simpler.

You're right, it does. With ZFS: snapshot your filesystem(s), do your updates. If it fails, roll back the snapshot and carry on. If it succeeds, you either keep the snapshot just in case, or you delete it. Works beautifully, even across full OS upgrades.

Second, we need indexing of extended attributes. BFS got this right. My music should just be a folder with a bunch of files that have metadata. There should be no database.

Uhm, what do you call your index, if not a database?

Personally, I believe tagging is secondary to all of this. My mind naturally categorizes things hierarchically, but I have had times when I wished a file could be in two folders.

Some kind of tagging or EA system would be nice, for just this reason. After using GMail and Zimbra for the past couple of years, it's nice being able to physically store messages in a hierarchical manner, but also access them via multiple "folders"/tags where appropriate. And having saved searches (virtual folders) that refresh each time you go into them is absolutely wonderful; something I've missed in GUI file managers like Dolphin.

When people think "database", they might think of a userland process, some kind of metadata storage on -top- of a traditional filesystem and some periodic indexing process. BFS indices (in BeOS and in Haiku) are an integral part of the filesystem. Indexing happens in the filesystem (is done by the filesystem) at the exact time when attributes are created/altered. There is no periodic indexing process, and there is no separate metadata storage. (Which could potentially get out of sync with the target files.)

Only if by "we" you mean Linux. Non-Linux systems have had transactional filesystems for years now (ZFS, HAMMERFS), and support for versioning in the filesystem (VMS).

Nope, that's an entirely different type of transaction. The only "real" transactional file systems (i.e. allow multiple user-level transactions that can be cancelled individually) that I am aware of are TxF for Windows Vista/7, and TxOS for Linux: http://www.cs.utexas.edu/~porterde/txos/

You're right, it does. With ZFS: snapshot your filesystem(s), do your updates. If it fails, roll back the snapshot and carry on. If it succeeds, you either keep the snapshot just in case, or you delete it. Works beautifully, even across full OS upgrades.

That works fine when you only need to do one transaction at a time. There is no reason why a file manager shouldn't be able to do atomic copies or atomic unpacking of archives. Snapshotting the entire file system is not a very general or elegant way to solve the problem.

Uhm, what do you call your index, if not a database?

It is a database, but it's part of the file system (i.e. not updated by applications). Look at BFS on Haiku or BeOS.