I have a problem. When I use a fuzzy index (Stemming_en), the
SWISH::PhraseHighlight module does not work as expected.
Essentially, after creating the $swish object, the script gets the $results
of a query. It then executes the following, as detailed in search.cgi:
my %headers = map { lc($_) => ( $swish->HeaderValue( $index, $_ ) || '' ) }
    $swish->HeaderNames;
my $highlighter = SWISH::PhraseHighlight->new( \%highlight_settings, \%headers );
my %parse_query = parse_query( join ' ', $results->ParsedWords( $index ) );
my $phrases = $parse_query{$metaname};
$highlighter->highlight( \$text, $phrases );
which highlights the appropriate pieces of text in $text. This works fine
for non-stemmed indexes. So far so good.
For fuzzy indexes, this does not work, and I have found the cause of the
problem. When I execute the following from the command line:
X:\cgi-bin\search\modules>swish-e -H 9 -f /path/to/stemmed/index -w memory
# SWISH format: 2.4.2
# Search words: memory
#
# Index File: /path/to/stemmed/index
# ... lots of headers we don't care about
# Fuzzy Mode: Stemming_en
# Search words: memory
# Parsed Words: memori
then we can see why. It appears that the parsed words are the stemmed
versions of the actual search terms. These, if passed to parse_query, do
not match the original search terms, which we obviously want to highlight
as well. The SWISH::FuzzyWord method does not help either, as we no longer
have the original search terms.
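To make the mismatch concrete, here is a minimal sketch that does not use
SWISH-E at all. The naive_highlight helper is hypothetical, a stand-in made
up for illustration, and is not how SWISH::PhraseHighlight matches words
internally; the sample sentence is invented as well. It only shows that the
stemmed form "memori" never occurs in text containing "memory".

```perl
use strict;
use warnings;

# Hypothetical stand-in for a highlighter: wraps whole-word,
# case-insensitive matches of each term in <b> tags.
# This is NOT the SWISH::PhraseHighlight implementation.
sub naive_highlight {
    my ( $text, @terms ) = @_;
    for my $term (@terms) {
        # \Q...\E quotes any regex metacharacters in the term.
        $text =~ s/\b(\Q$term\E)\b/<b>$1<\/b>/gi;
    }
    return $text;
}

# Invented sample text.
my $text = 'Not enough memory to load the index.';

# The stemmed word reported under "Parsed Words": nothing is highlighted.
print naive_highlight( $text, 'memori' ), "\n";

# The word as originally typed: "memory" is highlighted.
print naive_highlight( $text, 'memory' ), "\n";
```

With the stemmed term the text comes back unchanged, while the original
term is wrapped in <b> tags, which is the behaviour I am seeing.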
I have searched the archive but could not find anything about this.
Has anyone experienced this before and found a solution? Or am I
doing something wrong?
Thanks, Jonas