Deprecate SpellCheckRequestHandler and replace it with one that does query analysis and spell checks each token

Description

The current spellchecker does not handle multiword queries very well, if at all. Depending on the settings, it either ignores multiword tokens, or it splits on whitespace. It should use the query analyzer associated with the spelling field to produce tokens for spelling.

We should deprecate the current one and replace it with one that is similar, but does the appropriate thing with the query tokens.
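As a rough illustration of the behavior described above, here is a minimal plain-Java sketch of per-token spell checking. It is not Solr code: `analyze` is a hypothetical stand-in for the spelling field's query analyzer (it only splits on non-alphanumeric characters and lowercases), the dictionary is a hard-coded list, and Levenshtein distance is an assumed way of ranking candidates.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class PerTokenSpellCheck {

    // Hypothetical stand-in for the field's query analyzer (assumption):
    // split on anything that is not a letter or digit, then lowercase.
    public static List<String> analyze(String query) {
        List<String> tokens = new ArrayList<>();
        for (String part : query.split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    // Levenshtein edit distance, computed with two rolling rows.
    public static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Maps each analyzed token that is missing from the dictionary
    // to the dictionary word with the smallest edit distance.
    public static Map<String, String> suggest(String query, List<String> dictionary) {
        Map<String, String> suggestions = new LinkedHashMap<>();
        for (String token : analyze(query)) {
            if (dictionary.contains(token)) {
                continue; // token is spelled correctly
            }
            String best = null;
            int bestDistance = Integer.MAX_VALUE;
            for (String word : dictionary) {
                int d = editDistance(token, word);
                if (d < bestDistance) {
                    bestDistance = d;
                    best = word;
                }
            }
            if (best != null) {
                suggestions.put(token, best);
            }
        }
        return suggestions;
    }

    public static void main(String[] args) {
        List<String> dictionary = List.of("solr", "spell", "checker");
        System.out.println(suggest("Solr SPEL chekcer", dictionary));
    }
}
```

Because the query is analyzed before lookup, case differences and multiword input are handled on the server side rather than by the client, which is the point of the proposal.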

Grant Ingersoll
added a comment - 27/Feb/08 12:21 I suppose that depends on whether you think people should still use the existing one. My take is that it should be deprecated, since it doesn't analyze the tokens (so things like case aren't handled) and it can't even take in multiword queries, which would be pretty common. Essentially, it forces you to do things on the client side that shouldn't need to be done there.
Ideally, the new version would be a search component, so that one doesn't have to send separate requests, either.

Otis Gospodnetic
added a comment - 27/Feb/08 15:26 I understand it's different under the hood, just wondering if it would really break things for existing users. If not, perhaps a replacement is enough. No big deal.

Otis Gospodnetic
added a comment - 11/May/08 02:03 I'm about to make some SCRH changes (e.g. read words from one or more files instead of from another index's field, optionally strip diacritics...) and I'm wondering where you are with this, Grant. I'll work off of trunk unless you have something you can attach here.

Shalin Shekhar Mangar
added a comment - 11/May/08 06:49 Otis, I'm working on the changes I described in SOLR-507. Do you think those changes are better suited for a new RequestHandler? I was adding new request parameters to use the field's query analyzer as described in this issue.

Otis Gospodnetic
added a comment - 12/May/08 04:42 Shalin - great! I think at this point it makes sense to (re)write the SCRH as a Search Component, so perhaps it's okay to take the deprecation route Grant proposed if the changes you are making look like they could break things for consumers of the current SCRH.
Oh, do you know roughly when you will have this ready? I'm not trying to be pushy; I just want to plan if/when I should make my SC changes. I'd rather wait for you a little than do similar work in parallel.

Shalin Shekhar Mangar
added a comment - 12/May/08 06:10 Otis - I was being careful not to break compatibility with current clients but I also think it makes sense to implement this as a Search Component from the ground up. Existing clients can continue to use SCRH and new clients can use the search component. That way, we can provide all the latest and greatest features without resorting to unintuitive syntax that may be needed to remain backwards-compatible.

Ryan McKinley
added a comment - 16/Jul/08 14:17 Is the concern here not to break compatibility for folks who are using the /trunk SCRH?
Before releasing 1.3, we could consider reverting SCRH to the 1.2 version – this way we have less code to maintain. As we move forward, are new features added to both?
(I'm fine keeping it in... just want to make sure we consider it before 1.3 release)

Ryan McKinley
added a comment - 16/Jul/08 17:35 I think we should deprecate it and remove it from the example solrconfig.xml.
Come to think of it, we should remove all the deprecated handlers from solrconfig.xml. Dismax is really just a SearchHandler with queryParser=dismax.