>Yes, I thought of that, but it always felt like a weird idea to me. I
>can't really explain why.... Clemens, what do you think about this? I
>was imagining something like skipping the link parts that are the same
>in the previous link....and now I know where I got that :)

This seems dangerous to me, since Lucene's analyzers are free to take
liberties with tokens, such as stemming and filtering out stop words. So
a URL like
    /path/to/foo
might get mapped to
    /path/foo
if you used an analyzer that drops stop words ("to" is a typical one).
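
To make the failure mode concrete, here is a toy sketch in plain Python. It
does not call Lucene; the stop-word list and the analyze() helper are
made up for illustration and only mimic what a stop-word filter would do
to the path segments.

```python
# Illustrative stop-word list; an assumption, not Lucene's actual list.
STOP_WORDS = {"a", "an", "the", "to", "of", "in"}


def analyze(path):
    """Split a path into lowercase tokens and drop stop words."""
    tokens = [t for t in path.lower().split("/") if t]
    return [t for t in tokens if t not in STOP_WORDS]


original = "/path/to/foo"
rebuilt = "/" + "/".join(analyze(original))
print(original, "->", rebuilt)  # /path/to/foo -> /path/foo
```

Once "to" has been dropped, there is no way to recover the original URL
from the tokens alone.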

A very common trick for compressing paths is this: give each known URL
prefix a code, and store each entry as the code of its parent prefix plus
its own last path segment. Example:
    /foo          -> 1 = ("foo")
    /foo/bar      -> 2 = (1, "bar")
    /foo/blah     -> 3 = (1, "blah")
    /foo/bar/moo  -> 4 = (2, "moo")
This trick is used often in caching, to reduce the number of lookups
required to find an element in a hierarchical cache.
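
A minimal sketch of the prefix-code table in Python, in case it helps. The
class and method names (PrefixTable, encode, decode) are hypothetical, and
using 0 as the code for the root is my own assumption; this is one way to
do it, not a reference implementation.

```python
class PrefixTable:
    """Gives each URL prefix an integer code, stored as (parent_code, segment)."""

    ROOT = 0  # assumed code for the empty prefix; real codes start at 1

    def __init__(self):
        self._codes = {}    # (parent_code, segment) -> code
        self._entries = {}  # code -> (parent_code, segment)

    def encode(self, path):
        """Return the code for path, assigning codes to any new prefixes."""
        parent = self.ROOT
        for segment in (s for s in path.split("/") if s):
            key = (parent, segment)
            if key not in self._codes:
                code = len(self._codes) + 1
                self._codes[key] = code
                self._entries[code] = key
            parent = self._codes[key]
        return parent

    def decode(self, code):
        """Rebuild the full path by walking parent codes back to the root."""
        segments = []
        while code != self.ROOT:
            code, segment = self._entries[code]
            segments.append(segment)
        return "/" + "/".join(reversed(segments))


table = PrefixTable()
for url in ["/foo", "/foo/bar", "/foo/blah", "/foo/bar/moo"]:
    print(url, "->", table.encode(url))
```

Fed the four URLs in that order, it assigns codes 1 through 4 as in the
example above. Because the codes are assigned by the table rather than
derived from analyzed tokens, the original path can always be rebuilt
with decode().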
--
Brian Goetz
Quiotix Corporation
brian@quiotix.com Tel: 650-843-1300 Fax: 650-324-8032
http://www.quiotix.com