From: Johan De Meersman
Date: September 19 2011 11:19am
Subject: Re: myisamchk error (duplicate key records)
List-Archive: http://lists.mysql.com/mysql/225760
Message-Id: <42b87994-8957-4f4a-b4d9-12ac8b2654b1@zimbra>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
----- Original Message -----
> From: "Hank"
>
> While running a -rq on a large table, I got the following error:
>
> myisamchk: warning: Duplicate key for record at 54381140 against
> record at 54380810
>
> How do I find which records are duplicated (without doing the typical
> self-join or "having cnt(*)>1" query)? This table has 144 million
> rows, so that's not really feasible.
Given that the error is a duplicate *key*, "select from

group by having count() > 1" is an index-covered query, and should thus be perfectly feasible :-)
What I'm not so sure about, is wether the duplicate key will show up correctly in the index - as that index may be marked corrupt - and so, if it falls back to a full tablescan, it's indeed going to take a long time. If it does, however, there's no other option anyway: the only way to do it fast is an index, and that index is untrustworthy.
--
Bier met grenadyn
Is als mosterd by den wyn
Sy die't drinkt, is eene kwezel
Hy die't drinkt, is ras een ezel