lucene-dev mailing list archives

[jira] Commented: (LUCENE-1691) An index copied over another index can result in corruption

Date: Mon, 15 Jun 2009 10:53:07 GMT

[ https://issues.apache.org/jira/browse/LUCENE-1691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12719513#action_12719513 ]
Michael McCandless commented on LUCENE-1691:
--------------------------------------------
Copying an index over an existing index, without first removing all files in the existing index, is not a supported use case for Lucene.
That is, to restore from a backup, you should create an empty directory and copy your index files back into it.
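As a rough illustration of that restore procedure, here is a minimal sketch using only java.nio.file. The class name RestoreIndex and the method restore are hypothetical, not part of Lucene. It assumes a flat index directory with no subdirectories and that no IndexReader or IndexWriter has the index open while it runs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class RestoreIndex {
    // Hypothetical helper: empties indexDir, then copies the backup's files into it,
    // so that no file from the newer index survives the restore.
    static void restore(Path backupDir, Path indexDir) throws IOException {
        Files.createDirectories(indexDir);

        // Step 1: remove every regular file currently in the index directory.
        try (DirectoryStream<Path> existing = Files.newDirectoryStream(indexDir)) {
            for (Path file : existing) {
                if (Files.isRegularFile(file)) {
                    Files.delete(file);
                }
            }
        }

        // Step 2: copy the backup's files into the now-empty directory.
        try (DirectoryStream<Path> backup = Files.newDirectoryStream(backupDir)) {
            for (Path file : backup) {
                if (Files.isRegularFile(file)) {
                    Files.copy(file, indexDir.resolve(file.getFileName()));
                }
            }
        }
    }
}
```

Restoring into a brand-new directory and then switching the application over to it would work equally well. The point is only that the target holds nothing but the backup's files.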
> An index copied over another index can result in corruption
> -----------------------------------------------------------
>
> Key: LUCENE-1691
> URL: https://issues.apache.org/jira/browse/LUCENE-1691
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Store
> Reporter: Adrian Hempel
> Priority: Minor
> Fix For: 2.4.1
>
>
> After restoring an older backup of an index over the top of a newer version of the index, attempts to open the index can result in CorruptIndexExceptions, such as:
> {noformat}
> Caused by: org.apache.lucene.index.CorruptIndexException: doc counts differ for segment _ed: fieldsReader shows 1137 but segmentInfo shows 1389
> at org.apache.lucene.index.SegmentReader.initialize(SegmentReader.java:362)
> at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:306)
> at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:228)
> at org.apache.lucene.index.MultiSegmentReader.<init>(MultiSegmentReader.java:55)
> at org.apache.lucene.index.ReadOnlyMultiSegmentReader.<init>(ReadOnlyMultiSegmentReader.java:27)
> at org.apache.lucene.index.DirectoryIndexReader$1.doBody(DirectoryIndexReader.java:102)
> at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:653)
> at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:115)
> at org.apache.lucene.index.IndexReader.open(IndexReader.java:316)
> at org.apache.lucene.index.IndexReader.open(IndexReader.java:237)
> {noformat}
> The apparent cause is the strategy of taking the maximum of the ID in the segments.gen file and the IDs of the apparently valid segment files (see lines 523-593 [here|http://svn.apache.org/viewvc/lucene/java/tags/lucene_2_4_1/src/java/org/apache/lucene/index/SegmentInfos.java?annotate=751393]), and using this as the current generation of the index. This will include "stale" segments that existed before the backup was restored.
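
The selection strategy described in the issue can be modelled roughly as follows. This is a simplified sketch, not the code in SegmentInfos.java: the class GenerationSketch and both method names are hypothetical, and it assumes only that generation file names have the form segments_N with N written in base 36.

```java
import java.util.Arrays;
import java.util.List;

public class GenerationSketch {
    private static final String PREFIX = "segments_";

    // Returns the generation encoded in a "segments_N" file name (N is base 36),
    // or -1 for any other file, including segments.gen.
    static long generationOf(String fileName) {
        if (fileName.startsWith(PREFIX)) {
            return Long.parseLong(fileName.substring(PREFIX.length()), Character.MAX_RADIX);
        }
        return -1;
    }

    // Models the strategy from the report: the current generation is the maximum of
    // the value recorded in segments.gen and every generation found in the directory listing.
    static long currentGeneration(List<String> files, long genFromSegmentsGen) {
        long max = genFromSegmentsGen;
        for (String file : files) {
            max = Math.max(max, generationOf(file));
        }
        return max;
    }

    public static void main(String[] args) {
        // The backup was taken at generation 3; segments_5 is left over from the newer index.
        List<String> files = Arrays.asList("segments_3", "segments.gen", "segments_5", "_ed.fdt");
        System.out.println(currentGeneration(files, 3));
    }
}
```

With the listing in main, the stale segments_5 wins over the restored segments_3, so a reader would pair newer segment metadata with the older data files that the restore wrote over them, which is consistent with the doc count mismatch in the stack trace above.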