Hi,
I just started with Lucene today, and the first thing I did was try out the
small demo. I followed the instructions in "Getting started - Building and
Installing the Basic Demo" to the letter: I downloaded the JAR files
(2.3.2), unpacked them, and ran the indexer on the src directory. That worked
fine and indexed all the Java files in the directory and its subdirectories. I
didn't try to search for a swearword, but I did try to search for "vector".
The fact that I got only one result whereas the demo says I should get a
bunch of them isn't really the problem. The problem is that I got only one
result although the word "vector" appears in TWO documents:
src/demo/org/apache/lucene/demo/html/HTMLParser.java
src/demo/org/apache/lucene/demo/SearchFiles.java
(I checked that with grep)
When I enter my query, I get a very clear answer:
Enter query:
vector
Searching for: vector
1 total matching documents
1. src/demo/org/apache/lucene/demo/SearchFiles.java
grep's version:
[silenos:apache/lucene/demo] veda> pwd
/home/veda/lucene/lucene-2.3.2/src/demo/org/apache/lucene/demo
[silenos:apache/lucene/demo] veda> grep -i vector * */*
SearchFiles.java: * are all identical, then single norm vector may be shared. */
html/HTMLParser.java: private java.util.Vector jj_expentries = new java.util.Vector();
[silenos:apache/lucene/demo] veda>
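In case the grep output is in doubt, the same check can be sketched in plain
Java with no Lucene involved. This is only an illustration: the class GrepCheck
and the method filesContaining are hypothetical names of mine, not part of the
demo, and unlike the grep line above it walks all subdirectories rather than
just one level.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the Lucene demo: a case-insensitive
// substring search over every regular file below a directory.
class GrepCheck {

    static List<Path> filesContaining(Path root, String word) throws IOException {
        String needle = word.toLowerCase(Locale.ROOT);
        List<Path> hits = new ArrayList<>();
        try (Stream<Path> paths = Files.walk(root)) {
            // Sort so the result order is stable from run to run.
            List<Path> files = paths.filter(Files::isRegularFile)
                                    .sorted()
                                    .collect(Collectors.toList());
            for (Path file : files) {
                // ISO-8859-1 maps every byte to a character, so decoding never fails.
                String text = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
                if (text.toLowerCase(Locale.ROOT).contains(needle)) {
                    hits.add(file);
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // Defaults are assumptions: current directory and the word from my query.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        String word = args.length > 1 ? args[1] : "vector";
        for (Path hit : filesContaining(root, word)) {
            System.out.println(hit);
        }
    }
}
```

Run from the demo directory shown above, I would expect it to list at least the
two files grep found. Note that this is a raw substring match, which is not
necessarily how the demo splits text into searchable terms.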
So my question is a simple one: what happened? Is there special
processing for Java files, as there is for HTML documents, that leaves comments
out? Is this a bug only in the "demo" part of this small program (that would
be surprising, as other queries seem to work fine)? And is there a way to
check the content of my index: which files were actually indexed, or whether
a particular file is in it? Something like a field search, but on the URI of
the file itself. I think I read that this is implementation-dependent, meaning
one could do it programmatically but it's not in the demo, right?
Anyway, thanks for your answers. I hope there is a good answer to this question,
because I'd be rather disappointed if a search engine so obviously ignored some
results...
David
--
View this message in context: http://www.nabble.com/Preliminary%2C-fundamental-question-about-the-demo-tp19367781p19367781.html
Sent from the Lucene - General mailing list archive at Nabble.com.