Store lucene version in segment_N

Details

Description

It would be nice to have the version of lucene that wrote segments_N, so that we can use this to determine which major version an index was written with (for upgrading across major versions). I think this could be squeezed in just after the segments_N header.

Activity

This could also be used to throw IndexFormatTooNewException immediately on trying to open an index, whenever the version we see is > Version.LATEST. Uwe Schindler raised this on LUCENE-5952 but I didn't want to make that change there...

Michael McCandless
added a comment - 16/Sep/14 09:35 This could also be used to throw IndexFormatTooNewException immediately on trying to open an index, whenever the version we see is > Version.LATEST. Uwe Schindler raised this on LUCENE-5952 but I didn't want to make that change there...

I don't want to do what the subject says
Or well, i want to creep the scope.

We need the minimum segment version to clean up the IndexFormatTooOldChecks. The version of the commit itself is somewhat unrelated, but its sorta bogus its not exposed to the user. I think currently our IndexUpgrader will fail to properly upgrade an empty commit for example. There isn't even a simple way to check

IMO any version(s) should be ordered before the segments, so we don't try to e.g. load up old codecs that are too old/not there. This will clean up the current logic of readCodec()

Robert Muir
added a comment - 03/Jun/15 18:13 I don't want to do what the subject says
Or well, i want to creep the scope.
We need the minimum segment version to clean up the IndexFormatTooOldChecks. The version of the commit itself is somewhat unrelated, but its sorta bogus its not exposed to the user. I think currently our IndexUpgrader will fail to properly upgrade an empty commit for example. There isn't even a simple way to check
IMO any version(s) should be ordered before the segments, so we don't try to e.g. load up old codecs that are too old/not there. This will clean up the current logic of readCodec()

Ryan Ernst
added a comment - 04/Jun/15 14:17 Thanks Mike! This looks great.
A couple questions:
Do we really need to add compareTo? Couldn't we use the existing onOrAfter? It seems weird to have two ways of comparing versions.
Is there somewhere we could have a more direct test than deletion policy tests? I took a quick look but couldn't find anything unit testing the segment infos reading/writing...

I guess I was hoping we could take it further. We dont technically need to change file formats to implement this, it could be computed from the segments on read in the 4.0-5.2 case? Its just the min() that it finds there. Or does this become too hairy?

Robert Muir
added a comment - 04/Jun/15 14:32
if (format >= VERSION_53) {
// TODO: in the future (7.0? sigh) we can use this to throw IndexFormatTooOldException ... or just rely on the
// minSegmentLuceneVersion check instead:
infos.luceneVersion = Version.fromBits(input.readVInt(), input.readVInt(), input.readVInt());
} else {
// else leave null
}
I guess I was hoping we could take it further. We dont technically need to change file formats to implement this, it could be computed from the segments on read in the 4.0-5.2 case? Its just the min() that it finds there. Or does this become too hairy?

Michael McCandless
added a comment - 05/Jun/15 13:31 It seems weird to have two ways of comparing versions.
Good point, I'll remove the compareTo and just use the existing onOrAfter.
Is there somewhere we could have a more direct test than deletion policy tests?
I'll try to make a more direct test (TestSegmentInfos)...
We dont technically need to change file formats to implement this, it could be computed from the segments on read in the 4.0-5.2 case?
That's what I do in the patch ... if SIS was written by >= 5.3, I pull the min version that we wrote into the header (and throw IndexTooOldExc if it's too old), else I re-compute it on load.

Ryan Ernst
added a comment - 09/Jun/15 17:38 +1 to the new patch.
I think the inline comment about // use safe maps on the VERSION_53 constant can be removed? Seems like a copy/paste issue from the previous comment?