I have a table that blew up in the number of regions, I fixed the
configuration and now need to bring the regions back down to our desired
number of 16. I wrote a program to do exactly that and it worked when
tested on a single node cluster, but now in production I get perpetually
stuck on a merge with the log statement in the HMaster log below.
Typically this is the sequence of events:
Table has the following -
RegionA
RegionB
RegionC
RegionMergedA = merge(RegionA, RegionB)
RegionMergeB = merge(RegionMergedA, RegionC)
Then the logs just get stuck repeating the following, where RegionMergeB =
190d50047a820d9b3ef588429c9065ea
The perpetual statement on the master node. I waited 40 minutes to see if
it cleared to no avail.
9:50:58.059 AM INFO
org.apache.hadoop.hbase.master.handler.DispatchMergingRegionHandler
Skip merging regions
registry.medical.records,,1403961571625.190d50047a820d9b3ef588429c9065ea.,
registry.medical.records,\x0D\xFC\xE0\xD3\xFA\xF5FJ\x9D\xC6F\xD3\xE19*\xEAWLAB115\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x16%\xB6,1400656218257.11a7c474dd162ae4d7fc4a5573dda858.,
because region 190d50047a820d9b3ef588429c9065ea has merge qualifier
The logs on the regionserver show a successful merge.
Regions merged, hbase:meta updated, and report to master.
region_a=registry.medical.records,,1403959070153.fa5121800ef7841875a153b3fef48d95.,
region_b=registry.medical.records,\x0C\xEE\xA7\x0CbNE\xA3\xAA\xC1g\x05\x86\xFA\xF3\xFEWLAB169\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x94_(,1400656255740.e750ecca47b5dd64060d7bc40e0c0a57.,merged
region=registry.medical.records,,1403961571625.190d50047a820d9b3ef588429c9065ea..
Region merge took 0sec
How do I clear the merge qualifier on the region that has been merged?
I tried:
*admin.runCatalogScan();
*table.clearRegionCache();
This also happens when I try to merge on the shell:
merge_region '190d50047a820d9b3ef588429c9065ea',
'11a7c474dd162ae4d7fc4a5573dda858'