If you use prefix grams only then you'll get a forward-only suggestion
scheme. I've seen several implementation that use that and it works
quite well.
harry potter: ^ha, ^har, ^harr, ^harry, ^harry p, ^harry po..
harry houdini: ^ha, ^har, ^harr, ^harry, ^harry h, ^harry ho..
I prefere the trie-pattern though. Just rememberd there is an old one
in LUCENE-625.
karl
8 apr 2009 kl. 20.50 skrev Matt Schraeder:
> Corerct me if I'm wrong, but I don't think n-grams is really what I'm
> looking for here. I'm not looking for a spellchecker or phrase
> checker
> style suggestive search, but only based on the exact phrases the
> user is
> currently typing. Since Lucene uses term-based searching, I'm not
> sure
> how to have it search on portions of a full phrase. Using a standard
> lucene search typing in "harr" will result in searching for "harr"
> as a
> term, which will not find "Harry Potter". Using ngrams it would find
> "Harry" as a term, but not at the beginning of an entire phrase. This
> would bring back "My Dog Harry" as a result, which isn't what I'm
> looking for. I just want phrases from fields beginning with "Harr"
> only.
>
> I could easily do this all with our database server by simply doing a
> query for "where searchqueries like 'harr%'" but we're trying to limit
> our hits to the database to keep speed up on the site.
>
>>>> karl.wettin@gmail.com 4/8/2009 12:49:45 PM >>>
>
> For this you probably want to use ngrams. Wether or not this is
> something that fits in your current index is hard to say. My guess is
>
> that you want to create a new index with one document per unique
> phrase. You might also want to try to load this index in an
> InstantiatedIndex, that could speed things up quite a bit if the
> corpus is not too large.
>
> If your suggestion text corpus is really large and you only want
> forward-only suggestions then you might want to consider a trie-
> pattern solution instead. These can be rather resource efficient, even
>
> when loaded to memory.
>
> If you have a lot of user load on your search eninge then it might be
>
> interesting to use old user queries as the base of your suggestions
> and perhaps boost a bit on trends, i.e. the more people search for
> something the more it get boosted in the suggestions list.
>
>
> karl
>
> 8 apr 2009 kl. 15.26 skrev Matt Schraeder:
>
>> I want to add a suggestive search similar to google's to
> autocomplete
>> search phrases as the user types. It doesn't have to be very
>> elaborate
>> and for the most part will just involve searching single fields.
> How
>> can I perform a search to be able to fill in autocomplete text?
>>
>> For instance, if I start typing "Harr" it should bring up "Harry
>> Potter" "Harry Houdini" and "Harry S. Truman"
>>
>> I have tried doing search queries for "Harr*" but it's still doing
>> term-based searching rather than searching a full field. To make a
>> field both searchable as the full field as well as tokenized, would
> I
>> have to duplicate the field and make one a keyword field? Is there a
>> more convenient way to do this? I have also considered making a
> second
>> index for suggestive search, which would only have the fields that I
>> want to enable suggestive search on, but this seems like it would be
>> unneccesary duplication of data as well, though it would probably
> make
>> suggestive search faster due to a smaller index.
>>
>> Ideally it would also be nice to be able to rank these terms based
> on
>> the number of times they have been searched for so that the results
>
>> are
>> tailored more to our users rather than simply just the score that
>> Lucene
>> chooses.
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org