ReiserFS at Wikipedia

Hans Reiser, the computer programmer who killed his wife, was formerly known for creating the innovative file system that bears his name. After he was convicted, there were plenty of dreadful Unix jokes (chroot prison, rm -f wife, etc.).

Wikipedia has the best/worst of them, however, thanks to its Borgesian faculty for creating new worlds from the dust of academic propriety.

That Hans Reiser was truly something of a genius in his field of filesystems and databases can be glimpsed by reading his whitepaper, insightful rules of thumb included. The most important (IMHO) part of it is quoted below:

A Naming System Should Reflect Rather than Mold Structure

The importance of not deleting the structure of information is obvious; few would advocate using the keyword model to unify naming. What can be more difficult to see is the harm from adding structure to information; some do recommend the relational model for unifying naming (e.g. OS/400).

By decomposing a primitive of a model into smaller primitives one can end up with a more general model, one with greater flexibility of application. This is the very normal practice of mathematicians, who in their work constantly examine mathematical models with an eye to finding a more fundamental set of primitives, in hopes that a new formulation of the model will allow the new primitives to function more independently, and thereby increase the generality and expressive power of the model. Here I break the relational primitive (a tuple is an unordered set of ordered pairs) into separate ordered and unordered set primitives.

Relational systems force you to use unordered sets of ordered pairs when sometimes what you want is a simple unordered set. Why should a naming system match rather than mold the structure of information? For systems of low complexity, the reasons are deeply philosophical, which means uncompelling. And for multiterabyte distributed systems?…

Reiser’s Rule of Thumb #2: The most important characteristic of a very complex system is the user’s inability to learn its structure as a whole.

We must avoid adding structure, or guarantee that the user will be informed of all structure relevant to his partial information. Avoiding adding structure is both more feasible and less burdensome to the user. Hierarchical, relational, semantic, and hypersemantic systems all force structure on information, structure inherent in the system rather than the information represented. If a system adds structure, and the user is trying to exploit partial knowledge (such as a name embodies), then it inevitably requires the user to learn what was added before he can employ his partial knowledge. With complex systems, the amount added is beyond the capacity of users to learn, and information is lost.

Example: “My name is Kali, your friendly technical support specialist for REGRES. Our system puts the Library of Congress online! How may I help you.”

George doesn’t know Santa Claus’ name: “I’m trying to find the reindeer chimneys christmas man, and I can’t get your system to do it.”

FIGURE 1. Graphical representation of a typical simple unordered set that is difficult for relational systems.

Kali says: “OK, now let’s define a query. is-a equals man, that’s easy. But reindeer? Is reindeer a property of this man?”

“Uh no. I wish I could remember the dude’s name. I read this story about him a long time ago, and all I can remember is that he had something to do with reindeer and chimneys. The story is on-line, somewhere.”

“Reindeer chimneys presents man, that’s the sort of speech pattern I’d expect from a three year old.” Kali corrects him. “Let’s see if we can structure this properly. Is reindeer an instance-of of this man? A member-of of this man? It couldn’t be a generalization of this man. Hmm…”

“No! It’s not that complicated. They just have something to do with him.”

“Pavlov would probably say you associate reindeer with this man, the way the unstructured mind of an animal thinks. But here in technical support we try to help our customers become more sophisticated. Is reindeer a property of this man?”

“No. Try propulsion-provider-for.”

“Do you think that that was the schema the person who put the information in our system used?”

“No. Shoot. I can think of a dozen different columns it could be under. But what are the chances that the ones I think of are going to be the same as the ones the dude who put the information in used?”

Kali feels satisfaction. “Guess it can’t be done, not if you can’t structure your REGRES query properly. I’ll put you down in my log as a closed ticket, 190 seconds to resolution, not bad.”

“A keyword system could handle reindeer chimneys christmas man.” George grumbles as he stares in despair at his display. Unfortunately, the Library of Congress is only one of REGRES’ many reference aids. George could spend his life at it, and he’d never learn its schema.

“But a keyword system would delete even necessary structure inherent to the information. It couldn’t handle our other needs!” Kali says before she hangs up.

In addition to the searcher’s difficulties, having to manufacture structure by specifying the column for reindeer also adds unnecessary cognitive load to the story author’s indexing tasks.
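To make the quoted argument a bit more concrete, here is a minimal Python sketch (mine, not Reiser's) of the two primitives he contrasts: a relational tuple as an unordered set of ordered (column, value) pairs, and a simple unordered set of words. The records, the column names, and the two `*_match` helpers are all hypothetical and invented for illustration; nothing here is taken from the whitepaper or from any real system.

```python
# Hypothetical data: these records and column names are made up for
# illustration and do not come from the whitepaper.

# Relational-style record: an unordered set of ordered (column, value) pairs.
relational_story = frozenset({
    ("is-a", "man"),
    ("propulsion-provider", "reindeer"),
    ("entry-point", "chimneys"),
    ("season", "christmas"),
})

# Simple unordered set: the same words with no columns imposed on them.
keyword_story = frozenset({"man", "reindeer", "chimneys", "christmas"})


def relational_match(record, query):
    """True only if every (column, value) pair in the query is in the record."""
    return query <= record


def keyword_match(record, query):
    """True if every query word is in the record, in any order."""
    return query <= record


# George knows the words but has to guess the indexer's columns.
george_relational = frozenset({("is-a", "man"), ("property", "reindeer")})
george_keywords = frozenset({"reindeer", "chimneys", "christmas", "man"})

print(relational_match(relational_story, george_relational))  # False
print(keyword_match(keyword_story, george_keywords))          # True
```

The wrong guess for a single column ("property" instead of "propulsion-provider") sinks the relational query, while the plain set only needs the words themselves. Kali's parting objection still stands in this sketch, though: the keyword set cannot tell which word plays which role, so structure that really does belong to the information is lost.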

No, it is not true that “rule of thumb” came from a “mediaeval law that a man could beat his wife with a stick no thicker than his thumb.” A quick google for the phrase will give you plenty of background: this is probably all you’ll need to read, though.