I've tried Symantec products in the past, and they are worse than actually having a virus. They slow your PC to a crawl, get their claws into every part of your computer, and are extremely difficult to purge when you finally give up on them.

It describes the ability to add metadata to web content (tags, etc.), and you haven't heard of it because "Web 2.0" is the more popular term. ;)

Personally, I think that metadata/tag-based systems are the wrong road for semantic analysis of web pages. As soon as the semantics of a thing are decided by additional information added to describe that thing, it's open to abuse.

The only advantage is that it's faster than what should be done, which is using good old maths to extract the true 'meaning' of a document or object.

It's not hard. Well, ok, it's a little hard. Oh ok, it's really rather difficult, but there are plenty of places you can get example code or libraries.

Semantic information is not more or less trustworthy than the document itself.

Well, no. It's possible, using some admittedly complex math, to strip out all but the core meaning of a document. It's very hard to hide the meaningful content of a page from properly done semantic analysis. I know this because I've done that kind of thing before (deliberately vague mode active here, sorry).

It offers a useful perspective though. If the costs of storing data in a way that preserves more information are low, why not do it?

True, true, but if that additional data conveys less meaning than correct raw document analysis, it becomes potentially less useful. I would be against the cost of storing it, but then that's me.

If Jack the Ripper writes a document and signs it Joe the Plumber, and you proceed to extract that Joe the Plumber wrote the document, you aren't any better off than if Jack the Ripper explicitly marked the author as Joe the Plumber.

Sure, semantic information added by a second party might have a different level of reliability than the original data, but if you don't have the original data, you can't tell if the implied semantics of the data have been changed, so the situation isn't all that different.

If Jack the Ripper writes a document and signs it Joe the Plumber, and you proceed to extract that Joe the Plumber wrote the document, you aren't any better off than if Jack the Ripper explicitly marked the author as Joe the Plumber.

Semantic processing of documents doesn't involve just taking the content and using it as is (where your scenario would indeed occur); instead, you process the text to remove all but the most prominent points. Trivial things like the author would be lost in such an analysis, and you're left with what the document was essentially about.

I guess what I am getting at is that semantic information mathematically extracted from the source data might be more reliably associated with the data than externally added semantic information, but it isn't necessarily more trustworthy.

Ok, yes, but it's not about trust. I would shy away from words like 'trust' and try to use words like 'robustness' instead. No matter what type of web you have, there will always be

This is the fundamental error of advocates of the Semantic Web: that data have any meaning at all, much less a single "true meaning."

Think of any string of bytes as having the same relationship to a thought as a lossy-compressed document has to the uncompressed document. Lossy compression algorithms depend on shared assumptions, and you can uncompress such a document on the basis of different assumptions than it was compressed with, but it won't give you th

It describes the ability to add metadata to web content (tags, etc.), and you haven't heard of it because "Web 2.0" is the more popular term. ;)

Wrong and wrong. Sort of. 8^)

"The Semantic Web" is the term coined by Tim Berners-Lee, describing the ability to associate data using inference (rather than explicit reference). In his conception, it relies on XML data formats and the ability to use common elements to translate between one and the other.

It's not a terribly easy concept to grok at first, but the basic premise is that in data transformation, you only need to know the two steps closest to you in order to translate (and process) data from numerous sources.
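That "adjacent translation" idea can be sketched in a few lines of Python. Everything here is invented for illustration: format A only knows how to map its field names to a shared vocabulary, format B only knows how to map from that vocabulary, and the translation composes without either side knowing the other.

```python
# Each mapping renames a record's keys toward (or away from) a common
# vocabulary; neither format needs to know about the other directly.
to_common = {"fname": "givenName", "lname": "familyName"}   # format A -> common
from_common = {"givenName": "first", "familyName": "last"}  # common -> format B

def rename(record, mapping):
    """Rename keys according to mapping, dropping fields the mapping doesn't know."""
    return {mapping[k]: v for k, v in record.items() if k in mapping}

record_a = {"fname": "Tim", "lname": "Berners-Lee"}
record_b = rename(rename(record_a, to_common), from_common)
print(record_b)  # {'first': 'Tim', 'last': 'Berners-Lee'}
```

The point of the sketch: adding a format C only requires one new mapping to the common vocabulary, not a mapping to every other existing format.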

I have yet to see "semantic web" fully explained, but Wikipedia is giving some good insight [wikipedia.org] into it, especially into its nebulousness. It is supposed to make web (or in this case, desktop) documents machine-readable.

TFA deals not with the Semantic Web, but rather the "semantic desktop". As it says, "Semantic Web researchers believe the tool will prove a breakthrough for semantic technology. By encouraging people to add semantic meta-data to the information stored on their machines they hope it could succeed where other semantic tools have failed".

You see how it was abused. Any more advanced semantic tools will be similarly abused.

There are other problems, as the wikipedia article explains:

Practical feasibility: Critics question the basic feasibility of a complete or even partial fulfillment of the semantic web. Some develop their critique from the perspective of human behavior and personal preferences, which ostensibly diminish the likelihood of its fulfillment (see e.g., metacrap). Other commentators object that there are limitations that stem from the current state of software engineering itself (see e.g., Leaky abstraction).

Where semantic web technologies have found a greater degree of practical adoption, it has tended to be among core specialized communities and organizations for intra-company projects.[12] The practical constraints toward adoption have appeared less challenging where domain and scope is more limited than that of the general public and the World-Wide Web.[12]

An unrealized idea: The original 2001 Scientific American article by Berners-Lee described an expected evolution of the existing Web to a Semantic Web.[13] Such an evolution has yet to occur. Indeed, a more recent article from Berners-Lee and colleagues stated that: "This simple idea, however, remains largely unrealized."[14]

Censorship and privacy: Enthusiasm about the semantic web could be tempered by concerns regarding censorship and privacy. For instance, text-analyzing techniques can now be easily bypassed by using other words, metaphors for instance, or by using images in place of words. An advanced implementation of the semantic web would make it much easier for governments to control the viewing and creation of online information, as this information would be much easier for an automated content-blocking machine to understand. In addition, the issue has also been raised that, with the use of FOAF files and geo location meta-data, there would be very little anonymity associated with the authorship of articles on things such as a personal blog.

Doubling output formats: Another criticism of the semantic web is that it would be much more time-consuming to create and publish content because there would need to be two formats for one piece of data: one for human viewing and one for machines. However, many web applications in development are addressing this issue by creating a machine-readable format upon the publishing of data or the request of a machine for such data. The development of microformats has been one reaction to this kind of criticism.

Specifications such as eRDF and RDFa allow arbitrary RDF data to be embedded in HTML pages. The GRDDL (Gleaning Resource Descriptions from Dialects of Language) mechanism allows existing material (including microformats) to be automatically interpreted as RDF, so publishers only need to use a single format, such as HTML.

There's actually a pretty good introduction to the semantic web in this month's Communications of the ACM [acm.org]. You're right when you say that the semantic web is, as yet, mostly unrealized. But it has huge potential.

Relational databases were in the same position in the late '60s/early '70s. We needed ways to combine and extract information automatically with a simple and expressive language. Relational database management systems, combined with SQL, were the result, and they were a smashing success.

If web content is readable and meaningful to me, then it already has inherent meaning. Semantic tagging duplicates effort. Google shows us that machines can read content directly, just like people. I see no need to create separate 'machine-readable' meta content alongside the normal content.

Google shows us that machines can read content directly, just like people

Actually NOT "just like people". Google does a simple keyword search, then ranks the various hits in order by criteria they set up, like # of times the search term is repeated, how popular the site is, how many links go to the page returned, etc.

If you do a book search for "Tom Sawyer" you will get Huckleberry Finn returned, after the book "Tom Sawyer" most likely, but the machine will not read either book in any sense of what you would call reading.

If web content is readable and meaningful to me, then it already has inherent meaning. Semantic tagging duplicates effort.

The semantic web is also about accessibility. Take, for example, a blind person surfing the web with a screen reader; do you have any idea how horrible his or her browsing experience is on the web today?

Click on a different page, and there you go listening to the same headers _again_. It can get very frustrating.

That's only due to the poor design of the document AND the reader. For instance, there should be little or nothing before the page's actual content. A blind person who had stumbled across my old personal site would love it - it had an index, which led to individual pages. Assuming his reader skipped meta tags and such (<head>), he would instantly have the title read, followed by the article.

I have yet to see "semantic web" fully explained, but Wikipedia is giving some good insight [wikipedia.org] into it, especially into its nebulousness. It is supposed to make web (or in this case, desktop) documents machine-readable.

Talk about nebulous: look at the mission statement for the OSCA. What does that even MEAN?! It's ironic that an organization devoted to making information more easily consumable cannot even get a decent statement of purpose together.

While I am on my soapbox, am I the only one severely annoyed by slashdot's web2.0-wannabe UI?

No, all the comments about it I've seen are caustic and negative. I haven't seen a single comment saying "wow this is kewl" or even one saying "meh, it's ok." I don't know how good a programmer Pudge is, but I know he sucks badly at design.

"Semantics" is information about meaning (whereas syntax is information about form). Semantic tools try to provide meaning by describing relationships between information atoms. The goal is to create systems which can answer questions like "how old is the president's oldest child?" with just the age, instead of listing all documents which contain the words "old" "president" "oldest" and "child".

The Semantic Web is a failed attempt to extend the WWW via "semantic markup", which allows users/editors/etc to tag content (text, images, data) using a standard format that can be read, processed and exchanged by machines which can then give users more useful pointers to stuff that they care about.

The Semantic Web has failed for a bunch of reasons, with many people tending to blame the tools. However, those of us of a particular epistemological bent believe that it is doomed in principle as currently conceived because "meaning" is a verb, not an adjective.

"These data mean X" is completely incoherent on this view of meaning, like saying "This smell of orange blossoms has Republican leanings." "Meaning" is simply not an attribute of data, any more than political tendencies are an attribute of scents.

The Semantic Web fails to capture almost everything about the entities that do the meaning (people); instead it is based on the belief that meaning is a property of data. Data inspires meaning, but meaning is something that humans do, and the Semantic Web has no effective mechanism for capturing this. That said, with sufficient markup by many individuals on the same data, it should be possible to do something similar to ROC evaluation of the ways people mean, which would greatly enhance the utility of the Semantic Web.

A colleague who works in GIS pointed out a consequence of this phenomenon to me many years ago, when he described an experiment involving a bunch of geologists mapping a particular terrain. At the end of the day, after integrating all their inputs, he could tell who mapped where, but not what anybody mapped.

I've got a better reason why it failed that doesn't require delving into first year philosophy.

People are lazy. Look at any image database and figure out why it's difficult to find something. It's because people don't want to spend 20 minutes filling in tags for a single image that they just want to show off to their friends.

Now expand that to every other form of data type, and it's easy to see why the semantic web never did, and never will, take off without significant AI involvement.

People are lazy. Look at any image database and figure out why it's difficult to find something. It's because people don't want to spend 20 minutes filling in tags for a single image that they just want to show off to their friends.

And even when they do fill in the tags, they're sloppy about it. Things get misspelled and mislabeled all the time. Most people are very inconsistent about labeling even when they're trying their best to do an honest, thorough job. Okay, let me tag this photo "wife", because it has my wife

>... the semantic web never did, and never will take off without significant AI involvement.

I understand that the point of Nepomuk is to allow for automated tagging by the standard tools of the KDE desktop. For instance, say you receive a picture from an IM contact whom KDE also knows (through the address book framework, Akonadi) to live in Europe.

Then Nepomuk would allow you to make search queries such as "Bring up all the pictures that people living in Europe sent me last week". Well, that's the theoretical goal anyway; we will see if they ever get there.
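That theoretical query is easy to sketch in plain Python. The contact and picture records below are hypothetical stand-ins for what Akonadi (contacts) and Nepomuk (file metadata) would actually store; all names, fields, and dates are invented.

```python
from datetime import date

# Invented stand-ins for the contact book and the file-metadata store.
contacts = {"anna": {"continent": "Europe"}, "raj": {"continent": "Asia"}}
pictures = [
    {"file": "alps.jpg",  "sender": "anna", "received": date(2008, 12, 1)},
    {"file": "taj.jpg",   "sender": "raj",  "received": date(2008, 12, 2)},
    {"file": "paris.jpg", "sender": "anna", "received": date(2008, 10, 5)},
]

def pics_from_continent_since(continent, since):
    """'Bring up all the pictures that people living in <continent> sent me since <date>'."""
    return [p["file"] for p in pictures
            if contacts[p["sender"]]["continent"] == continent
            and p["received"] >= since]

print(pics_from_continent_since("Europe", date(2008, 11, 24)))  # ['alps.jpg']
```

The hard part Nepomuk promises is not the query itself but getting the two metadata sources joined up automatically.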

There's one nifty application already: you can create a Folder View plasmoid on your desktop, and instead of making it display ~/Desktop/ as usual, you can make it display the result of a query through the Nepomuk KIO slave. See here [osnews.com] how it works.

I wonder if there's an application that will do as you suggest in even more structured environments where such things really ought to be easily possible.

I'd give a minor digit if my Usenet newsreader would tag every download with where I got it from, when, who posted it, and a few other items that should be easily and consistently retrievable from the message headers (that supposedly conform to a defined format). I'd also love it if my web browser would tag every right-click/downloaded picture with the URL

I'd also love it if my web browser would tag every right-click/downloaded picture with the URL it came from and maybe a few other data elements.

Netscape 4 was doing that in 1997 on Mac OS 7.6 (it went in the "comment" metadata of the file). I'm fairly sure Safari can do it right now, though it may require a plugin (= hack) or twiddling a defaults setting or something.

Apparently light from my point will not be reaching you for at least a few thousand years, as you've made a suggestion (AI involvement) that is doomed to fail for exactly the reasons I laid out in my original post.

No, I just meant that the difference between meaning and data has no bearing on whether the idea of a semantic web has failed.

Even if the meta attributes stored in the semantic web were purely data and not meaning (e.g. the ID3 tags on an MP3), the semantic web would still fail simply because people are inherently lazy and won't generate the extra metadata.

The only way that it could work is if metadata were generated automatically (like EXIF on photos). The current batch of meta information that can be generated automatically is limited.
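As a sketch of that "automatic generation" idea, here is a stdlib-only Python example that harvests tags from what the filesystem already knows about a file, EXIF-style, with no user effort. The tag names are invented and the "JPEG" is a fake placeholder file.

```python
import json
import os
import tempfile

def auto_tags(path):
    """Build a tag record purely from what the system already knows."""
    st = os.stat(path)
    return {
        "filename": os.path.basename(path),
        "bytes": st.st_size,
        "modified": int(st.st_mtime),  # seconds since the epoch
    }

# Create a throwaway fake "photo" so the example is self-contained.
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as f:
    f.write(b"\xff\xd8fake jpeg data")  # 16 bytes, JPEG-ish magic number
    path = f.name

tags = auto_tags(path)
print(json.dumps(tags))
os.unlink(path)
```

Like a camera writing EXIF, nothing here asks the user to type anything; the limitation, as the comment above notes, is how little of this can be harvested automatically.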

I disagree. First of all, the semantic web is just about allowing content creators to associate context with their content [blogspot.com] to facilitate context-sensitive search. The semantic web has lackluster adoption because Google does a great job at context-sensitive search without the context-providing metadata markup.

A more limited version of semantic web has achieved some notable traction. Microformats [blogspot.com] are another way of associating context with content that is more agreeable with content providers.

The Semantic Web is a failed attempt to extend the WWW via "semantic markup", which allows users/editors/etc to tag content (text, images, data) using a standard format that can be read, processed and exchanged by machines which can then give users more useful pointers to stuff that they care about.

You got that exactly backwards.

The WWW was an earlier doomed attempt at semantic markup, and up until the summer of '93 or so it looked like it might work. That's when the early rants about people using the tags to control layout instead of to convey meta information (e.g. using em to get italics in a bibliography, dt/dd to make roman numeral lists, etc.) started--or at least when I first became aware of them. In fact, pretty much the entire history of HTML has been a tension between the language's designers and purists, who want users to care about what markup means, even if it does nothing, and the vast majority of users, who only care about what it does regardless of the "meaning" that may be ascribed to it. Once you can get your head around both perspectives, some of the goofier things in the whole tawdry history (the Table Wars, XML, CSS) make a lot more sense.

Ok, a little more sense. But only if you already knew what people are like.

You need to distinguish between document semantics, which is what the SGML purists wanted for HTML, and real world semantics, which is what the Semantic Web people want. It is indeed instructive to note that the document semantics crowd completely failed in their fight to separate the presentation layer from the document model.

The only mechanism that has gained any general traction is CSS, which is about as far from a "real" document-semantics-based styling language as you can get, and I'm betting I'll hav

"Regardless of cause, given the failure of semantic markup in such a limited and controlled scope, it is very unlikely that it will succeed with the richness and complexity of all the data in the world."

What the heck is *with* this argument? It's completely false.

Look, we already have exactly the same thing as what the Semantic Web is trying to do: it's called SQL. Oh noes! Trying to describe real world information on a computer will fail! It's impossible!

You need to distinguish between document semantics, which is what the SGML purists wanted for HTML, and real world semantics, which is what the Semantic Web people want.

They didn't distinguish all that well themselves, and with good reason: there isn't really a bright line distinction to be made. HTML has long supported a range of markup from the purely presentational (br) through what you are calling real-world semantics (meta-keywords, link, and such).

Actually, I'd say it's too early to say that the Semantic Web has failed. What has clearly failed for now is the vision for how the technology was to be used.

For one thing, it turned out that really, really clever textual matching is a lot more powerful than anybody thought possible. Twenty years or so ago, you'd have thought that you'd need to have some kind of sophisticated metadata to do the kinds of stuff we take for granted in Google today. It turns out that a technology that turns a needle in a hay

It has layers. At the outermost layer, on the scale of the WWW, "conveying meaning" as you describe is indeed futile. Cory Doctorow's "Metacrap" essay sums it up nicely (linked to in this discussion thread): People are dishonest, lazy and stupid when it comes to metadata...and when they aren't there is no way to impose standards that more than one person would agree to insofar as imparting meaning on data.

However even Doctorow admits metadata, at some level and taken in context

"I think that is where something like Nepomuk could succeed where internet-wide semantic standards fail to gain traction. People are lazy but they DO devote more effort to organising their own personal data vs. what is on a web app. I do make more effort to tag photos with metadata in F-spot for example. If there is a "structural-level" standard that could be applied to all files so I can tag spreadsheets, photos, databases, addressbook contacts...whatever..and I could follow one simple, consistent process

Basically, think of the tags that are at the bottom of Slashdot articles. You can tag them with things like PATRIOT ACT, or EFF, or whatever, and in theory it's going to help you search Slashdot and get more relevant articles in your search. Now, when you add that capability to the unwashed internet masses, you see things like story tags with "No" or "itsatrap" or whatever crap people think is funny, which is funny, but ruins its main purpose of helping people find information. Multiply that by the number

I've tried out Nepomuk and, while I have to say that it's promising, it's got miles to go before it's even near ready. The main problem is application support. Sure, you can rate and tag and describe your files in the Dolphin file browser. So what? You can do the same in Vista. This doesn't mean anything if applications don't hook into this and make use of it. Of the apps I've used, Gwenview (a photo viewer) has Nepomuk partially implemented, but it's buggy and you need to compile it yourself with it explicitly enabled (this will apparently change in KDE 4.2). The digiKam developers, whose app already allows you to rate, tag, and describe photos, say that they have no plans of integrating with Nepomuk anytime soon. Amarok 2 has work towards a Nepomuk collection, but the devs say that this will always run alongside the main, MySQL-based collection, and it's nowhere near ready yet. My email is in the cloud, so I can't even begin to talk about KDE-PIM's support or lack thereof.

The other problem at the moment is a lack of ability to query your semantic data. Can I get anything to show all photos with my wife in them that I've rated four or above? Not at the moment. Hopefully this is coming in KDE 4.2, but as it stands at the moment it makes Nepomuk a case of write-only memory.

So, maybe something to get excited about in the future, but not quite yet.
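For what it's worth, the query being asked for ("all photos with my wife in them that I've rated four or above") is trivial once the metadata actually exists; the bottleneck is the write side, not the read side. A hypothetical sketch over invented tag/rating records:

```python
# Invented photo metadata; in Nepomuk this would live in the RDF store.
photos = [
    {"file": "beach.jpg",  "tags": {"wife", "holiday"}, "rating": 5},
    {"file": "office.jpg", "tags": {"work"},            "rating": 2},
    {"file": "dinner.jpg", "tags": {"wife"},            "rating": 3},
]

def query(tag, min_rating):
    """Files carrying the tag whose rating meets the threshold."""
    return [p["file"] for p in photos
            if tag in p["tags"] and p["rating"] >= min_rating]

print(query("wife", 4))  # ['beach.jpg']
```

Which is why the "write-only memory" complaint stings: the data model supports this today, but no shipped UI exposes it.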

Speed? Details? That's up to the optimizers.
From what I know about all this, I would say that you get an open, W3C-standardized desktop search engine. A slow fruit, but it is open and can grow; compare that to patented-getNervousBeforePublishingWinFS-Microsoft.
The query thing is probably possible, but you have to learn the standardized SPARQL language first (oh joy). But wait and see:

Can I get anything to show all photos with my wife in them that I've rated four or above?

someone who has programming skills and also wants to see the pictures of his wife with rating > 4 will hopefully help us all.

Just use digiKam for photos. You can search photos by any metadata embedded in them. Even though digiKam uses SQLite, you can sync metadata between the database and the photos themselves.

In the future there should be an API to sync the database with Nepomuk, but it's currently not planned because Nepomuk is not yet well implemented in KDE 4. As anyone can notice, you can't easily search for files any way other than the Kickoff or Lancelot menu search bar.

Folders are only so useful at organizing things, and we are rapidly approaching the point in our digital lives where we can accumulate far more than we'll ever realistically be able to handle using the "folder" method.

The unfortunate part, as others have pointed out, is that without some sort of significant AI involvement, semantic anything is unlikely to ever reach critical mass.

If you have 25 gigs of family videos and pictures, it's highly unlikely that you'll have the time to sep

As much as we'd like to believe that the lack of something's existence proves it isn't needed, that's not always the case. Often it simply means that we've found measures that work in the interim, which may or may not continue to serve us as the need grows.

There is a reason why they didn't have TVs in the 1800s. And it's not because people back in the 1800s wouldn't have enjoyed them.

Semantic "anything" hasn't taken off because as of yet we don't have the tools necessary to make something semantic in an easy,

In a way, it's *harder* to use hierarchical directories than tags, as there will always be debate about what the 'right hierarchy' is: do you put all your configuration files under /etc, or do you have one configuration subdirectory for each package? With a 'tagged filesystem' this kind of issue becomes obsolete, as you can have both views easily.

Given that I consider the Unix FHS to *suck*, I hope that reinventing Unix with a "tagged" FS will happen, but I'm not holding my breath: there are still many people wh
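The "both views" point can be illustrated in a few lines: with tags, any combination of attributes defines a virtual directory, so the /etc-style view and the per-package view coexist. The paths and tag names below are made up.

```python
# Invented files, each carrying a set of tags instead of one fixed path.
tags = {
    "/files/smb.conf":  {"config", "samba"},
    "/files/smb.so":    {"library", "samba"},
    "/files/sshd.conf": {"config", "ssh"},
}

def view(*wanted):
    """Files carrying every requested tag -- a 'virtual directory'."""
    return sorted(f for f, t in tags.items() if set(wanted) <= t)

print(view("config"))           # the /etc-style view: all config files
print(view("samba"))            # the per-package view: everything samba
print(view("config", "samba"))  # the intersection of both
```

Neither "hierarchy" had to be chosen in advance; both are just queries.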

So currently it is just yet another tagging system... that has the ability to share the tags with your friends... The data on my computer is private... that's why it is on my computer and not on my Facebook/MySpace/Bebo etc. site...

Another poorly implemented, mis-aimed application by AI researchers!?

Can I get anything to show all photos with my wife in them that I've rated four or above?

Another problem is that you have to have it tagged: if you don't tag all the images with a given person in them, then they're not going to show up in a search.

I think that's an issue with the Semantic Web too: not only do items have to be tagged, the tags have to be accurate and trustworthy. It sounds like a nice system that can be undone by a bunch of tag-spamming bozos. It will work better in a controlled environment.

Or it will never be used. When I download photos, I want my browser to tag where they came from (website), or perhaps which keywords I typed in to find them. I don't want to add all this manually.

The amount allowed per file needs to be limited (perhaps 100 keywords) and managed so the useless ones get weeded out. It will probably become an art in itself, but anything less than filesystem support just won't work.

And yes, in Vista you can tag things. But it's tedious, and the OS-level tools for users aren't there.
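A rough sketch of the browser behaviour being asked for, including the suggested 100-keyword cap: on download, record the source URL and search keywords automatically. The storage scheme, function names, and URL here are all invented for illustration.

```python
import json

MAX_TAGS = 100   # the per-file cap suggested above

# Invented sidecar store: filename -> list of machine-generated tags.
store = {}

def record_download(filename, url, keywords):
    """Tag a downloaded file with its source URL and the search keywords,
    truncating to MAX_TAGS so the list can't grow without bound."""
    tags = ([f"source:{url}"] + [f"kw:{k}" for k in keywords])[:MAX_TAGS]
    store[filename] = tags

record_download("cat.jpg", "http://example.org/cat.jpg", ["cat", "funny"])
print(json.dumps(store["cat.jpg"]))
```

Nothing here asks the user to type anything, which is the whole point: tags that cost zero effort are the only ones that reliably get written.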

Yep, that was my first thought as well. Quickly followed by wondering if 'into a collaboration environment which supports both the personal information management and the sharing and exchange across social and organizational relations' was some kind of euphemism for, eh, group pr0n of some kind.

Oh, well, either they have much less dirty minds than mine, or someone's desire for well-indexed pr0n browsing has gotten slightly out of hand.

I assumed it was KumOpen (come open) backwards. I think the real acronym is even stupider than that.

The official acronym is very contrived, so I'm sure it is a "backronym". I also doubt a group of tall-foreheads would deliberately come up with a project name with a suggestive reference like that.

Google and Wikipedia provide the most likely possibilities for the origin:

Nepomuk is a town in the Czech Republic, in the "kraj" (province or region) called "Pilsen". Given this fact, here are some possibilities to explain the name:

* Nepomuk is the birthplace of St. John of Nepomuk, who is considered "the first martyr of the Seal of the Confessional".

It's also the middle name of J. N. Hummel [wikipedia.org], Austrian classical composer of the 18th/19th centuries. AFAIK, he has nothing whatsoever to do with those cutesy ceramic figurines old ladies like to collect.

Nepomuk was the half-dragon in "Jim Button and Luke the Engine Driver". His mother was a hippopotamus. Pleasant enough character and all, but I hope the code looks more like the excellent book and less like a half-dragon, half-hippopotamus hybrid. Yuck!

Oh, shut up. What the fuck is wrong with this site these days, I could swear users here have become more interested in marketing than in technology after OS X became popular.

Nepomuk is the name of one of KDE's small underlying technologies, it's not used for sales and marketing, just like khtml isn't used for sales and marketing. Just shut up if you don't have anything interesting to say.

Yes, I know that Nepomuk means "Networked Environment for Personalized, Ontology-based Management of Unified Knowledge" as stated in the article.

John of Nepomuk is considered the first martyr of the Seal of the Confessional, a patron against calumnies and, because of the manner of his death, a protector from floods.

"Patron against calumnies" sounds good for this kind of project. And he protects us from SYN floods.

I'm glad that they don't prefix everything with K though.

NEPOMUK brings together researchers, industrial software developers, and representative industrial users, to develop a comprehensive solution for extending the personal desktop into a collaboration environment which supports both the personal information management and the sharing and exchange across social and organizational relations.

A quick look at their webpage would have told you that it's not a KDE project. KDE just has the first usable (kinda) implementation.

But hey, this is Slashdot. If we used Google before posting, what would people do?

I've been experimenting with metadata and blogs, and specifically the cluster analysis of those conversations on the web - so far so good ( http://www.wallcloud.net/ [wallcloud.net] ). I'm really interested in seeing how our desktops change as our information starts "clumping" together for us - our contacts, files, work items, etc. arranging themselves on screen. I'd love to have a dev tool that would allow me to right-click and jump to the SQL table I'm hovering over in the code, and maybe gesture to bring up jobs that inter

The Nepomuk web site makes me want to chew my own arm off. Now, I'm familiar with the Semantic Web; I'm excited by the idea of semantic organisation. But this site is the epitome of grim, lifeless European research-ese. It completely fails to convey the technological approach, how it works, or why you should give a damn. I get the impression that the team was more interested in the EC funding than in actually developing a disruptive technology.

Why, oh why, can't researchers spend 15 minutes thinking about how to convey the importance and excitement of what they are trying to do in terms of practical examples?

I'm afraid you'll probably have to wait for some enterprising third party to grab the source and build some of the technology into a different product.

I'm afraid you'll probably have to wait for some enterprising third party to grab the source and build some of the technology into a different product.

Don't be afraid anymore - this was actually the plan of the semantic desktop all along. For instance, a product of the project is http://aperture.sf.net/ [sf.net], which is built into http://eclipse.org/SMILA [eclipse.org] at the moment.

Also, Ansgar Bernardi (the project lead who was interviewed in the TR article) says:

All information is semantic. This Slashdot post is information encoded using English semantics. Unfortunately for the machines, English semantics are way too complicated for them to understand. So they need a simpler set of grammar rules to be able to parse it. But why would anyone want to waste time marking it up just for the benefit of machine readability, when Google can basically accomplish the same thing without all that metadata markup cruft?

Everybody and his uncle tries to make systems that will index every piece of crap on your PC, and it invariably results in a useless and horrible waste of resources. The biggest annoyance is trying to figure out how to turn these damn things off.
Considering that the average user only searches for something once every few years, an on-demand search system makes far more sense.

"Everybody and his uncle tries to make systems that will index every piece of crap on your PC and it invariably results in a useless and horrible waste of resources."

On the contrary, we should seriously be asking ourselves *why*, when all our data is sitting there on our PCs, we've let ourselves get into such a state of disorganisation at the operating system level that a class of program called 'indexer' exists as a third-party tool in the first place.

How come it's not already taken as given that the primary thing an operating system *does* is, you know, *know where all its data is*?

It's as if we're living in an age before 'directories' were invented - or before databases had 'indexes' and 'queries' - and we have to manually write down and key in raw sector numbers every time we open a file. And we're okay with that, because we think - and teach - that that's 'just how computers work'. We've accepted that there's a whole class of things our computers can't do 'because there's no application to do that'.

I do search often here... in BeOS (or Haiku).
But usually I search for things that are in the BFS indices, which are maintained automatically from the xattrs. So it's way faster than find. And if I really need to, I use grep (or TrackerGrep). Now we only need something to fill the xattrs for arbitrary files, like the mail daemon does for mails. SkyOS ported OpenBFS and added an indexer daemon on top to support full-content search.
I still need to see Linux or Vista do that.
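The BFS trick is that the index is maintained by the filesystem at attribute-write time, not by a separate indexer crawling afterwards. A pure-Python toy of that idea (class and attribute names are made up; real BFS does this inside the filesystem):

```python
# Sketch of the BFS idea: attributes on files, with an index that the
# "filesystem" keeps up to date automatically on every attribute write.

class ToyFS:
    def __init__(self):
        self.attrs = {}   # path -> {attr_name: value}
        self.index = {}   # (attr_name, value) -> set of paths

    def set_attr(self, path, name, value):
        self.attrs.setdefault(path, {})[name] = value
        # Index maintenance happens here, on the write path,
        # not in a separate indexer daemon crawling later.
        self.index.setdefault((name, value), set()).add(path)

    def query(self, name, value):
        """Fast lookup: an index hit, no tree walk."""
        return self.index.get((name, value), set())

fs = ToyFS()
fs.set_attr("/mail/1", "MAIL:from", "alice@example.com")
fs.set_attr("/mail/2", "MAIL:from", "bob@example.com")
hits = fs.query("MAIL:from", "alice@example.com")
```

Queries against such an index are O(1)-ish lookups, which is why BFS searches feel instant compared to find.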

I've tried contacting both strigi developers; one doesn't respond and the other says "ask the other guy".

Anyway, I've got about 10,000+ JPEGs off my digicams, all of them commented - in the JPEG's internal comment field. When I read about strigi and other desktop search tools, I was thrilled - I could just search for stuff instead of my old standby jhead *.jpg | grep Comment | gre
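For reference, reading that comment field doesn't strictly need jhead: the JPEG COM segment (marker 0xFFFE) can be parsed directly. A minimal sketch (it ignores EXIF APP1 comments and assumes a well-formed file; the sample bytes are synthetic):

```python
import struct

def jpeg_comments(data):
    """Extract COM (0xFFFE) segment payloads from JPEG bytes.
    Minimal parser: walks marker segments until SOS or end of data."""
    comments = []
    i = 2  # skip SOI (FFD8)
    while i + 4 <= len(data):
        if data[i] != 0xFF:
            break
        marker = data[i + 1]
        if marker in (0xD8, 0xD9):        # SOI / EOI: no length field
            i += 2
            continue
        if marker == 0xDA:                # SOS: compressed data follows
            break
        (length,) = struct.unpack(">H", data[i + 2:i + 4])
        if marker == 0xFE:                # COM segment
            comments.append(data[i + 4:i + 2 + length].decode("latin-1"))
        i += 2 + length
    return comments

# A minimal synthetic JPEG: SOI, one COM segment, EOI.
payload = b"beach sunset 2007"
jpg = (b"\xff\xd8"
       + b"\xff\xfe" + struct.pack(">H", len(payload) + 2) + payload
       + b"\xff\xd9")
found = jpeg_comments(jpg)
```

A desktop indexer plugin is essentially this plus a loop over files, which is why it's so frustrating that the tools don't just do it.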

*Please note that centrifugal is a made-up, non-existent word. The real word should be centripetal. Centrifugal is a made-up force that physics people HATE! So please, everyone, use the word centripetal, not centrifugal. Thanks!

Even if one is using a proper reference frame (not a rotating one), there is still an outward force in the system, namely the reaction force to the centripetal force. That reaction force could legitimately be called a centrifugal force, but it is applied to the central object, not to the outer object, which distinguishes it from what people usually mean when they say centrifugal force.
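To put the force pair down explicitly for a ball of mass m swung on a string at speed v and radius r (a standard textbook setup, just for illustration):

```latex
% Newton's third-law pair in uniform circular motion:
% the string pulls the ball inward, the ball pulls the string outward.
F_{\text{centripetal}} = \frac{m v^2}{r} \quad \text{(on the ball, toward the center)}
F_{\text{reaction}}    = \frac{m v^2}{r} \quad \text{(on the string/central object, pointing outward)}
```

Both forces have the same magnitude; they just act on different bodies, which is exactly the distinction the post is making.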

Never fear - this patent has been badly drafted. You can easily circumvent it by not swinging from a tree: use a wooden frame. Or use ropes for your swing. Or use a different number of chains, say one looped over the tree...

Easy: Just have a bot add "untagged" tags to everything not yet tagged. Then it's tagged, because it's tagged "untagged".

2) Information must be TRUE (otherwise you will get bad deductions).

Also easy: Just remove all wrong information before making your deduction. OK, so how is the computer to know what is wrong? Well, that's of course again semantic information, so just tag anything wrong as "wrong". If some "wrong" tagging happens to be wrong, you can still tag that as "wrong".
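The "untagged" bot from point 1) can be written down literally, which makes the joke concrete: after it runs, tag *coverage* is 100% while tag *meaning* is unchanged. (The item names are made up for the demo.)

```python
# The bot in question, literally implemented. After running it, every
# item carries a tag - demonstrating that coverage alone is meaningless.

def tag_the_untagged(items):
    """items: dict mapping item name -> set of tags. Mutates in place."""
    for name, tags in items.items():
        if not tags:
            tags.add("untagged")

library = {"photo.jpg": {"holiday"}, "mystery.bin": set()}
tag_the_untagged(library)
```

The recursive "wrong"-tagging from point 2) has the same shape, which is why neither trick helps with meaningful tagging.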

1) Easy: Just have a bot add "untagged" tags to everything not yet tagged. Then it's tagged, because it's tagged "untagged".
I smile once again. Tagging everything as "untagged" is simply CRAP. OF COURSE I mean that everything should be tagged in a MEANINGFUL way.
2) Also easy: Just remove all wrong information before making your deduction.
AHAHAH.. yes, yes.. but have you ever seen a HUGE ontology (huge ontologies are usually the most useful ones)? Of course removing WRONG information can be done with rea