TRECCollection docnos should be trimmed of whitespace

Details

Description

The CompressingMetaIndex stores items as expected.

Problem: When you retrieve an item from the index with CompressingMetaIndex.getItem(String Key, int docid), trim() is called on the value before it is returned. This is a problem if the stored item contained leading or trailing spaces, because the caller gets back a different string from the one that was stored.
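
To illustrate the reported behaviour, here is a minimal sketch. It uses a hypothetical in-memory store (TrimOnReadSketch, with made-up putItem/getItem methods), not Terrier's actual CompressingMetaIndex; the only assumption carried over from the report is that the getter calls trim() before returning.

```java
import java.util.HashMap;
import java.util.Map;

public class TrimOnReadSketch {
    // Hypothetical stand-in for a metadata store; NOT Terrier's CompressingMetaIndex.
    private final Map<String, String> items = new HashMap<>();

    // Stores the value exactly as given, whitespace included.
    public void putItem(String key, int docid, String value) {
        items.put(key + ":" + docid, value);
    }

    // Mirrors the behaviour described in the report: the value is trimmed on the way out.
    public String getItem(String key, int docid) {
        String raw = items.get(key + ":" + docid);
        return raw == null ? null : raw.trim();
    }

    public static void main(String[] args) {
        TrimOnReadSketch index = new TrimOnReadSketch();
        String stored = "  DOC-001 ";
        index.putItem("docno", 0, stored);

        String returned = index.getItem("docno", 0);
        System.out.println("[" + returned + "]");     // prints [DOC-001]
        System.out.println(stored.equals(returned));  // prints false
    }
}
```

A caller that later compares the returned value against the original, untrimmed string (for example a docno taken straight from the collection) will not get a match.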

Craig Macdonald added a comment - 27/Jul/12 5:13 PM
Hi folks.
I decided that it is OK to trim() metadata, but that docnos from TRECCollection should be trimmed by default, as per Benjamin's suggestion. I am updating the issue title to reflect the change of focus, and I have updated TestTRECCollection to check the docno.
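
For readers following along, the idea of trimming docnos at parse time can be sketched as below. This is only an illustration under stated assumptions: DocnoTrimSketch and extractDocno are hypothetical names, and the regex-based extraction is a simplification, not the actual TRECCollection code or the change made for this issue.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DocnoTrimSketch {
    // TREC documents commonly pad the docno with whitespace inside the tag,
    // e.g. "<DOCNO> AP880212-0001 </DOCNO>".
    private static final Pattern DOCNO =
            Pattern.compile("<DOCNO>(.*?)</DOCNO>", Pattern.DOTALL);

    // Hypothetical helper: returns the trimmed docno, or null if no DOCNO tag is found.
    public static String extractDocno(String doc) {
        Matcher m = DOCNO.matcher(doc);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String doc = "<DOC>\n<DOCNO> AP880212-0001 </DOCNO>\n<TEXT>...</TEXT>\n</DOC>";
        System.out.println("[" + extractDocno(doc) + "]"); // prints [AP880212-0001]
    }
}
```

Trimming once, when the docno is read from the collection, means the stored value and the value returned by a trimming getter are the same string.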