Description

When I play with ITBLL (from trunk tip), the meta scan sometimes hangs while the cluster is going through a rolling restart. When this happens, the master uses about 1000% CPU, which looks like an infinite loop somewhere. The logs show nothing interesting except some meta scanner RPC calls timing out. jstack shows that the 10 high-QoS RPC handlers are busy with meta scanning.

However, if I run it again without HBASE-10018, things are fine. I suspect it has something to do with the small/reverse scan.

By the way, I see this problem even with log replay off and hfile version = 2.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.

+1 release audit. The applied patch does not increase the total number of release audit warnings.

Jimmy Xiang
added a comment - 11/Apr/14 21:28 I think I found the problem. In StoreFileScanner, we always use Bytes.compareTo(), which is wrong. We should use the right comparator. For meta, the comparator is a little different, which is why the problem shows up with meta scans.
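
To illustrate why the choice of comparator matters, here is a minimal, self-contained sketch. It does not use HBase classes: rawCompare is a plain unsigned lexicographic byte comparison (similar in spirit to Bytes.compareTo()), and fieldCompare is a hypothetical, simplified stand-in for a meta-style comparator. It assumes row keys of the form table,startKey,regionId and compares them field by field. The class name, method names, and sample rows are all made up for the example.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: shows that two valid orderings can disagree on the same rows.
class ComparatorMismatch {

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // Unsigned lexicographic comparison over the whole row.
    static int rawCompare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // Assumption: a row has at least two commas, "table,startKey,regionId".
    // The start key is everything between the first and the last comma.
    static String[] split(String row) {
        int first = row.indexOf(',');
        int last = row.lastIndexOf(',');
        return new String[] {
            row.substring(0, first),
            row.substring(first + 1, last),
            row.substring(last + 1)
        };
    }

    // Hypothetical field-aware comparison: table, then start key, then region id.
    static int fieldCompare(String a, String b) {
        String[] fa = split(a);
        String[] fb = split(b);
        for (int i = 0; i < 3; i++) {
            int d = rawCompare(bytes(fa[i]), bytes(fb[i]));
            if (d != 0) {
                return d;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        String x = "t,a,2";
        String y = "t,a!,1";
        // Raw bytes: ',' (0x2C) sorts after '!' (0x21), so x sorts after y.
        System.out.println("raw says x > y:   " + (rawCompare(bytes(x), bytes(y)) > 0));
        // Field-aware: start key "a" is a prefix of "a!", so x sorts before y.
        System.out.println("field says x < y: " + (fieldCompare(x, y) < 0));
    }
}
```

The two orderings disagree on this pair of rows. A scanner that positions itself using one ordering while the data is sorted by the other can conclude it is before or after a row when it is not, which is consistent with a seek that never makes progress.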

Jimmy Xiang
added a comment - 10/Apr/14 19:19 Looked into it and found that StoreFileScanner#backwardSeek always returns true in some cases. I am testing a patch so that it doesn't return true when seek() returns false.