Searching Extensible Metadata

Since the point of extensible metadata (XMD) is that it is not tied to the structure of the database, the database client has no built-in command for searching it. The XMD is just a big blob of XML stored in one field of each library entry, so efficient searching will be a serious problem: in the most obvious brute-force method, every entry's entire XMD field has to be parsed each time XMD is searched. This page is for organizing thoughts on how XMD search can be implemented efficiently.

Idea 1: Triple elimination search
Basic premise: Start with a search that is very efficient but inaccurate. Follow it with further searches, each more accurate but run over fewer entries, to narrow the results down to the desired entries. This plan requires a new field in the database: simply a list of the names of the XMD fields that the entry has.

Step one: Search through the XMD names field for the name of the field being searched for. Add the hits (entries that have that particular XMD field) to a list of possible matches.

Step two: Search those matches' raw XMD fields (as text in the DB) for the string being searched for. Prune the possible matches list down to those that have that particular string.

Step three: Parse the XMD for the remaining matches and see if the field being searched in has the string being searched for. Return those that do as matches to the search.
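
The three steps above could be sketched in Python roughly as follows. This is only an illustration under assumptions: the page does not say what the XMD looks like, so the <xmd><field name="..."> layout, the dict-based entries, and the names ENTRIES, xmd_names and triple_elimination_search are all made up for the example. In a real library, steps one and two would presumably be ordinary text searches run by the database rather than Python loops.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry layout (not taken from the real library):
#   "id"        - entry identifier
#   "xmd_names" - the proposed new field: space-separated XMD field names
#   "xmd"       - the raw XMD blob; the <xmd><field name=...> schema is assumed
ENTRIES = [
    {"id": 1, "xmd_names": "composer year",
     "xmd": '<xmd><field name="composer">Bach</field>'
            '<field name="year">1721</field></xmd>'},
    {"id": 2, "xmd_names": "composer",
     "xmd": '<xmd><field name="composer">Brahms</field></xmd>'},
    {"id": 3, "xmd_names": "notes",
     "xmd": '<xmd><field name="notes">Sounds like Bach</field></xmd>'},
    {"id": 4, "xmd_names": "composer notes",
     "xmd": '<xmd><field name="composer">Handel</field>'
            '<field name="notes">Not Bach</field></xmd>'},
]


def triple_elimination_search(entries, field, text):
    """Return ids of entries whose XMD field `field` contains `text`."""
    # Step one: cheap check against the names field only.
    candidates = [e for e in entries if field in e["xmd_names"].split()]

    # Step two: plain substring test on the raw, unparsed XMD text.
    candidates = [e for e in candidates if text in e["xmd"]]

    # Step three: parse only the survivors and check the actual field.
    matches = []
    for e in candidates:
        root = ET.fromstring(e["xmd"])
        for node in root.iter("field"):
            if node.get("name") == field and text in (node.text or ""):
                matches.append(e["id"])
                break
    return matches
```

With this sample data, a search for "Bach" in the "composer" field keeps entries 1, 2 and 4 after step one, entries 1 and 4 after step two, and only entry 1 after step three, since entry 4 mentions Bach in its notes rather than its composer field. One caveat of the sketch: the raw substring test in step two assumes the search string appears unescaped in the XML, so a string containing characters such as & or < would need to be escaped before that step.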

Idea 2: Some form of XMD cache
Basic premise: Store the XMD as live (already parsed) objects on a server somewhere, so that a search does not have to re-parse each entry's XML every time.
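
One possible shape for such a cache, as a minimal in-process sketch rather than a worked-out server design. Everything here is an assumption made for illustration: the XMDCache class, the load_raw_xmd callable that fetches an entry's raw XMD from the database, and the <xmd><field name="..."> layout of the XML are all hypothetical.

```python
import xml.etree.ElementTree as ET


class XMDCache:
    """Keeps parsed XMD in memory so each blob is parsed at most once."""

    def __init__(self, load_raw_xmd):
        # load_raw_xmd is a hypothetical callable: entry id -> raw XMD string.
        self._load_raw_xmd = load_raw_xmd
        self._parsed = {}  # entry id -> {field name: field text}

    def fields(self, entry_id):
        """Return the parsed fields for an entry, parsing on first use."""
        if entry_id not in self._parsed:
            root = ET.fromstring(self._load_raw_xmd(entry_id))
            self._parsed[entry_id] = {
                node.get("name"): node.text or ""
                for node in root.iter("field")
            }
        return self._parsed[entry_id]

    def invalidate(self, entry_id):
        """Drop a cached entry; call this whenever its XMD is edited."""
        self._parsed.pop(entry_id, None)

    def search(self, entry_ids, field, text):
        """Return the ids whose XMD field `field` contains `text`."""
        return [i for i in entry_ids if text in self.fields(i).get(field, "")]
```

The trade-off is memory and staleness in exchange for parse time: repeated searches only touch dictionaries, but every edit to an entry's XMD has to invalidate its cached copy.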
