Lars-Erik Bjørk wrote:
>
> Hi Ann!
>
> Thank you for the detailed response
>
> I will have to reproduce the bug one more time to get answers to some
> of your
> questions. I did some testing with rebuilding the index yesterday, so the
> corrupted database is no longer corrupt :)
>
> I have in-line answers to some of your questions, though
>
> Ann W. Harrison wrote:
>> Lars-Erik,
>>
>>> Sometimes we get too many rows returned when doing LIMIT queries.
>>> F.ex my previous RQG run says:
>>>
>>> SELECT * FROM `E` WHERE `int_key` < 5 returns different result when
>>> executed with predicate 'ORDER BY `int_key` LIMIT 1073741824' (1000
>>> vs. 1459 rows)
>>>
>>> The tables has 1000 rows.
>>
>> That's not nice.
>>
>>>
>>> In IndexWalker::getValidatedRecord, I have added debug printouts,
>>> the recordNumber and the pageNumber whenever we return a valid
>>> record. For the previous example, there are plenty of index pages
>>> used, but only the three first of them contains valid nodes.
>>
>> That's interesting too. If I understand correctly, only three pages
>> contain entries for the current content of the table. What do the other
>> pages contain? Old versions of records? Garbage? Nothing?
Other, not older, sorry :)
> So far I am not sure what the older pages contain. The debug printouts
> I have
> is when we start a LIMIT search (IndexRootPage::positionIndex), when
> we move
> on to the next index page (IndexRootPage::repositionIndex), and when
> we return
> a valid record number from IndexWalker::getValidatedRecord. So the only
> information I have about the 'empty' pages right now is that I see
> that we
> reposition to them, and then move along to the next page.
>
> Just fyi:
> A change that I have in my sandbox, is that if we come across
> BUCKET_END as the
> first entry in a page, we move on to the next page instead of stopping
> the search.
>
>>
>> If there are three "leaf nodes" - level 0 pages - there should be one
>> mostly empty level 1 page for a total of four pages, unless the table
>> was once much larger, or had key values with a ranges that varied with
>> time. If the key values are evenly distributed over time, then there
>> could be empty pages at the end, assuming that the key values got
>> smaller (lower value, not just shorter) over time. Can you print out
>> an unused page to see what's in it?
> As soon as the bug is reproduced, the database remains in this state, and
> I can run the query multiple times and experience the same results every
> time. Getting the information you asked for should be doable. I will
> respond
> to that in an additional mail when I have dug a little deeper.
>>>
>>> The three first pages are page 154, 136, 213 (in that order).
>>>
>>> All the nodes returned as valid from pages 154 and 213 match with
>>> the duplicate rows returned. Page 136 contains valid nodes for all
>>> 1000 records in the table.
>>
>> One page contains all the legitimate values, and it's the middle of
>> the three. Hmmm. Doesn't exactly sound like a failed split, but it
>> could be. Are the entries all ordered correctly within the page?
>> Is there any overlap between page 154 and 213? I'm assuming that
>> 154 and 213 contain about 500 entries. Is page 136 full?
> What I forgot to mention yesterday, is that the value of the indexed
> field
> is set to the same value for every record. Within the page, the nodes
> are however
> orderer according to record number. The are no overlap of the entries
> in page 154
> and 213. Page 154 has about 400 'valid' entries, whereas page 213 has
> about 25. How
> many 'unvalid' entries they have, I do not know. I don't know at the
> present time if
> page 136 is full.
>
>
>>>
>>> This leads me to believe that our index is fishy. Some of the nodes
>>> have two entries in the index, and these two entries are not even
>>> adjacent ...
>>
>> Nasty.
>>
>>>
>>> I assume that if this is the case, we would not notice it during a
>>> regular index search (not using LIMIT), because this would only
>>> result in the same bit in the bitmap being set twice, during the
>>> scan phase. I tried to test this by adding the following to
>>> IndexRootPage::scanIndex
>>>
>>> ASSERT (!bitmap->isSet(number)); // Added this
>>> bitmap->set (number);
>>>
>>> And this asserted during the test.
>>
>> Good test.
>>>
>>> Does this mean anything? Or are there valid conditions where this
>>> assert may happen? I think this looks suspect.
>>
>> No there are no valid cases that I know of where a single value appears
>> twice with the same record number. The record number is part of the
>> upper level key (as you know) to distinguish duplicate values from each
>> other - reduces the index garbage collection cost. But in this case,
>> as Kevin pointed out, garbage collection isn't going to work.
>>
> This seems correct, as the database is left in this corrupted state.
> It is not a transient
> bug (unless the occurrence window is VERY big :)).
>
>
> I will get back to you with more detailed answers later today.
>
>> Interesting indeed.
>>
>> Best regards,
>>
>> Ann
> Best,
>
> Lars-Erik
>
>

Content reproduced on this site is the property of the respective copyright holders. It is not reviewed in advance by Oracle and does not necessarily represent the opinion of Oracle or any other party.