lucene-java-user mailing list archives

Hi developers,
I am indexing HTML documents in lucene as:
H1:"text in H1 font"
H2:"text in H2 font"
...
H6:"text in H6 font"
content:"all the text"
The problem is that query of a type
+(H1:xyz)
is getting scored with the termFreq of xyz in the H1 field whereas I want
it be scored using the termFreq of xyz in the entire document (i.e.
content field)
Can you point me how to achieve this.
I took a look at Similarity class. It does have a tf() function but it is
actually passed a termFreq value.
Thanks a lot.
PS: I am using lucene for a class project where I am trying to utilize
font information of HTML documents. I am boosting the scores for matches
in H6 field over matches in H5 and so on.
---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org