Jean-Daniel Cryans
added a comment - 29/Nov/12 23:29 Nicolas,
Do you think this log message could be removed?
12/11/29 15:17:36 INFO client.HConnectionManager$HConnectionImplementation: Region TestTable,0001966229,1354231005211.1bbba78dda968874d2981c322ed3319f. moved from 572ba57e-1cab-4f9c-a071-782e5a1a7184.cs1cloud.internal:60020, updating client location cache. New server: 20590793-0e19-4eb4-b2f6-05de8244f716.cs1cloud.internal:60020
Right now I'm running some loading tests and I'm getting walls of text every time a split happens and it's basically the same message repeated hundreds of times. We used to have a similar message before but we removed it since it's pretty spammy (or we set it to DEBUG, can't remember).
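
If the message is kept rather than removed, one conventional mitigation is demoting it to DEBUG behind a level guard so hot paths skip the string concatenation entirely. A minimal, self-contained sketch using java.util.logging (the logger name and method are illustrative, not actual HBase code):

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class LogGuardSketch {
    static final Logger LOG = Logger.getLogger("HConnectionManager.sketch");

    // Builds the relocation message only when DEBUG-level (FINE) logging
    // is enabled; returns null otherwise, so callers never pay for the
    // concatenation when the message would be discarded anyway.
    static String relocationMessage(String region, String oldServer, String newServer) {
        if (!LOG.isLoggable(Level.FINE)) {
            return null;
        }
        return "Region " + region + " moved from " + oldServer
                + ", updating client location cache. New server: " + newServer;
    }

    public static void main(String[] args) {
        // Default effective level is INFO, so the message is suppressed.
        System.out.println(relocationMessage("r1", "a:60020", "b:60020"));
        LOG.setLevel(Level.FINE);
        System.out.println(relocationMessage("r1", "a:60020", "b:60020"));
    }
}
```

The same pattern applies with commons-logging's `LOG.isDebugEnabled()`, which is what HBase code of this era typically used.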

Hadoop QA
added a comment - 11/May/12 10:32 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12526501/5877.v18.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1845//console
This message is automatically generated.

Nicolas Liochon
added a comment - 09/May/12 19:13 Now how does the updateCachelocation help here. If for some reason the opening of the region is not yet done and if the client gets RegionMovedException the client will try to contact the RS thinking the region got moved to it.
Yes, exactly. That's why I kept the sleep in the client code even for this RegionMoved. We could optimize this by adding a timestamp, with a heuristic like: "we give two seconds for the region to move after it's closed on the origin server". Sharing the region state in ZK would be a simpler option, as we would know if the region has moved or not.
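
The two-second heuristic described above can be sketched as a pure predicate the retry loop would consult before sleeping (the names and the 2-second grace period come from this comment; the method itself is an illustration, not HBase API):

```java
public class RegionMoveBackoff {
    // Grace period assumed from the discussion: after a region is closed
    // for a move, give the new server this long to finish opening it.
    static final long GRACE_MS = 2000L;

    // Returns true when the client should still sleep before retrying,
    // i.e. the region was closed so recently that the new server has
    // probably not opened it yet. Past the grace period, retry at once.
    static boolean shouldSleepBeforeRetry(long closedAtMs, long nowMs) {
        return (nowMs - closedAtMs) < GRACE_MS;
    }

    public static void main(String[] args) {
        long closedAt = 10_000L;
        System.out.println(shouldSleepBeforeRetry(closedAt, 10_500L));
        System.out.println(shouldSleepBeforeRetry(closedAt, 13_000L));
    }
}
```

As the comment notes, the rule is wrong exactly when the system is overloaded, which is why sharing the real region state (e.g. via ZK) would be the more robust alternative.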

ramkrishna.s.vasudevan
added a comment - 09/May/12 18:47 Just saw the intent behind this JIRA. It is very important and a good feature. You have done a lot in this, but the JIRA says it is Minor.
It doesn't know if the region is opened yet. We could have this by adding the info in zk, but it would increase the zk load...
Now how does the updateCachelocation help here. If for some reason the opening of the region is not yet done and if the client gets RegionMovedException the client will try to contact the RS thinking the region got moved to it.
Thanks N.

Hadoop QA
added a comment - 09/May/12 10:50 +1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12526123/5877.v18.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1812//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1812//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1812//console
This message is automatically generated.

Hadoop QA
added a comment - 08/May/12 23:52 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12526047/5877.v18.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.regionserver.TestSplitTransactionOnCluster
org.apache.hadoop.hbase.replication.TestReplication
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1801//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1801//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1801//console
This message is automatically generated.

Hadoop QA
added a comment - 08/May/12 07:23 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525963/5877-v17.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
-1 patch. The patch command could not apply the patch.
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1793//console
This message is automatically generated.

Hadoop QA
added a comment - 04/May/12 23:52 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525684/5877-v17.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1775//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1775//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1775//console
This message is automatically generated.

Hadoop QA
added a comment - 04/May/12 22:50 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525676/5877-v16.txt
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestMultiParallel
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1773//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1773//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1773//console
This message is automatically generated.

Ted Yu
added a comment - 04/May/12 21:44 Patch v16 is able to pass TestMultiParallel.
See the following link for the cause of the infinite recursion:
http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/com.cloudera.hadoop/hadoop-core/0.20.2-737/org/apache/hadoop/ipc/RemoteException.java#RemoteException.unwrapRemoteException%28%29

Ted Yu
added a comment - 04/May/12 21:21 I can reproduce the TestMultiParallel failure:
testFlushCommitsNoAbort(org.apache.hadoop.hbase.client.TestMultiParallel) Time elapsed: 1.36 sec <<< ERROR!
java.lang.StackOverflowError
at org.apache.hadoop.hbase.RegionMovedException.find(RegionMovedException.java:91)
at org.apache.hadoop.hbase.RegionMovedException.find(RegionMovedException.java:108)
at org.apache.hadoop.hbase.RegionMovedException.find(RegionMovedException.java:108)
at org.apache.hadoop.hbase.RegionMovedException.find(RegionMovedException.java:108)
at org.apache.hadoop.hbase.RegionMovedException.find(RegionMovedException.java:108)
at org.apache.hadoop.hbase.RegionMovedException.find(RegionMovedException.java:108)
at org.apache.hadoop.hbase.RegionMovedException.find(RegionMovedException.java:108)
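
A cycle-safe way to search a cause chain, sketched below, avoids this class of StackOverflowError: walk iteratively instead of recursively, stop on a self-referential cause, and cap the depth. This is illustrative code, not the actual RegionMovedException.find implementation:

```java
public class CauseChainSearch {
    // Cap protects against arbitrary cycles (a -> b -> a), which the
    // self-cause check alone would not catch.
    static final int MAX_DEPTH = 20;

    // Iteratively searches t's cause chain for an instance of target.
    // A chain that loops back on itself (which can happen when exceptions
    // are rebuilt during RPC unwrapping) terminates here instead of
    // overflowing the stack as a naive recursive walk would.
    static <T extends Throwable> T find(Throwable t, Class<T> target) {
        int depth = 0;
        while (t != null && depth < MAX_DEPTH) {
            if (target.isInstance(t)) {
                return target.cast(t);
            }
            Throwable cause = t.getCause();
            if (cause == t) {
                break; // self-referential cause: stop
            }
            t = cause;
            depth++;
        }
        return null;
    }

    public static void main(String[] args) {
        IllegalStateException inner = new IllegalStateException("moved");
        RuntimeException outer = new RuntimeException(inner);
        System.out.println(find(outer, IllegalStateException.class) == inner);

        // Pathological two-node cycle a -> b -> a: must terminate.
        RuntimeException a = new RuntimeException("a");
        RuntimeException b = new RuntimeException("b", a);
        a.initCause(b);
        System.out.println(find(a, IllegalStateException.class));
    }
}
```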

Hadoop QA
added a comment - 04/May/12 20:29 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525660/5877.v15.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestMultiParallel
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1770//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1770//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1770//console
This message is automatically generated.

Nicolas Liochon
added a comment - 04/May/12 09:06 @stack
You don't want to have RegionMovedException carry a ServerName#toString instead of host and port?
I think it's safer this way, as I have to parse the string afterward. Otherwise, if someone modifies ServerName#toString he will break the parsing in RegionMovedException, a class he may never have heard of (yes, it will break the unit test)
Is this a bug fix?
Unfortunately, it's a feature. The error management is duplicated, and I have to manage both cases, because we don't have the exception when we come back to the result later in the code.
Put the history of moved regions out into its own class?
You're right, it would be better. Will do.
Don't presize this I'd say: private static final long TIMEOUT_REGION_MOVED = (2L * 60L * 1000L);
You would prefer a configurable value?
Stuff is lazily cleared from movedRegions? Should we have a cleaner come visit occasionally?
Agreed, it would be better with a cleaner. Will do as well.
Why you say the above? When we protobuf it, it'll just be an option so it shouldn't be too bad?
Yeah, if it was too bad I would not have proposed it. It's an imperfection to accept, I think. We would not have it if we shared the region state within the cluster with ZK.
@ted
Under what condition would newHrl be null above ?
Oops. Refactoring error. Removed.
Please remove the space between newHrl and ')' below:
Done.
Would the above code result in NPE since I see the following in javadoc:
It should not happen because we test hrl value before. But I added a check on the arguments to make it safer.
Since updateCachedLocations() is used to handle exception, the presizing above may not be needed.
Since updateCachedLocations() is used to handle exception, the presizing above may not be needed.
Yeah, I sized it thinking: if we're doing a rolling restart we may have 100 regions with a wrong location if we're really unlucky. As it's small, any solution would work here, but I prefer to have the size explicitly set, as it says "I thought about it, that's a reasonable size". I added a comment, however.
The indentation of CloseRegionHandler() above is off.
Fixed.
'will contains' -> 'will contain'. 'keep a too old' -> 'keep too old'.
Fixed.
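
The round trip discussed above — host and port carried inside the exception message because hadoop.ipc transports only #getMessage — can be sketched as an encode/parse pair. The field markers and message format here are assumptions for illustration, not the actual RegionMovedException format; the point is that keeping both directions in one class means a format change breaks the paired unit test rather than some distant caller:

```java
public class MovedMessageCodec {
    private static final String HOST_FIELD = "hostname=";
    private static final String PORT_FIELD = "port=";

    // Server side: fold the new location into the message string,
    // the only part of the exception that survives the RPC boundary.
    static String encode(String hostname, int port) {
        return "Region moved to: " + HOST_FIELD + hostname + " " + PORT_FIELD + port + ".";
    }

    // Client side: recover the host from the transported message.
    static String parseHostname(String message) {
        int start = message.indexOf(HOST_FIELD) + HOST_FIELD.length();
        return message.substring(start, message.indexOf(' ', start));
    }

    // Client side: recover the port; the terminating '.' bounds the number.
    static int parsePort(String message) {
        int start = message.indexOf(PORT_FIELD) + PORT_FIELD.length();
        return Integer.parseInt(message.substring(start, message.indexOf('.', start)));
    }

    public static void main(String[] args) {
        String msg = encode("rs1.example.com", 60020);
        System.out.println(parseHostname(msg));
        System.out.println(parsePort(msg));
    }
}
```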

Ted Yu
added a comment - 03/May/12 23:19
+ final Set< String > updateHistory = new HashSet< String >(100);
Since updateCachedLocations() is used to handle exception, the presizing above may not be needed.
+ this (server, rsServices, regionInfo, abort, zk, versionOfClosingNode, eventType, null );
+ }
+
+ protected CloseRegionHandler( final Server server,
The indentation of CloseRegionHandler() above is off.
+ // This map will contains all the regions that we closed for a move.
+ // We add the time it was moved as we don't want to keep a too old information
'will contains' -> 'will contain'. 'keep a too old' -> 'keep too old'.
This is a very useful feature. Good work, N.

Ted Yu
added a comment - 03/May/12 22:47
+ HRegionLocation newHrl = new HRegionLocation(hrl.getRegionInfo(), hostname, port);
+ if (newHrl == null ){
+ // Should not happened. Just in case .
Under what condition would newHrl be null above ?
Please remove the space between newHrl and ')' below:
+ cacheLocation(hrl.getRegionInfo().getTableName(), newHrl );
For updateCachedLocations():
+ hrl : getCachedLocation(tableName, row.getRow()));
Would the above code result in NPE since I see the following in javadoc:
+ * @param row - can be null if hrl is not null .
More review comments to follow.

Hadoop QA
added a comment - 03/May/12 22:01 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12525508/5877.v12.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 9 new or modified tests.
+1 hadoop23. The patch compiles against the hadoop 0.23.x profile.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.client.TestMultiParallel
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1749//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1749//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1749//console
This message is automatically generated.

stack
added a comment - 03/May/12 21:44 You don't want to have RegionMovedException carry a ServerName#toString instead of host and port? Or it doesn't make sense when our cached region exceptions are keyed by hostname+port only?
Is this a bug fix?
@@ -1910,6 +1989,7 @@ public class HConnectionManager {
}
} catch (ExecutionException e) {
LOG.debug( "Failed all from " + loc, e);
+ updateCachedLocations(updateHistory, loc, e);
Put the history of moved regions out into its own class?
Don't presize this I'd say:
+ private static final long TIMEOUT_REGION_MOVED = (2L * 60L * 1000L);
Stuff is lazily cleared from movedRegions? Should we have a cleaner come visit occasionally?
Patch looks fine to me. Nice fat test.
5) The destination is the closeRegion interface is a kind of interface hijacking. Other options would be:
Why you say the above? When we protobuf it, it'll just be an option so it shouldn't be too bad?
The HCM stuff is ugly, but that's not your fault.
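
The movedRegions bookkeeping raised above — a per-region record of the new location with the time it moved, expired either lazily on lookup or by an occasional cleaner pass — could look roughly like this. The 2-minute TIMEOUT_REGION_MOVED mirrors the constant quoted in the review; the class and method names are illustrative assumptions:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MovedRegionsCache {
    static final long TIMEOUT_REGION_MOVED = 2L * 60L * 1000L;

    static final class Entry {
        final String newServer;
        final long movedAtMs; // recorded so stale information is not kept
        Entry(String newServer, long movedAtMs) {
            this.newServer = newServer;
            this.movedAtMs = movedAtMs;
        }
    }

    private final Map<String, Entry> movedRegions = new ConcurrentHashMap<>();

    void addMovedRegion(String encodedName, String newServer, long nowMs) {
        movedRegions.put(encodedName, new Entry(newServer, nowMs));
    }

    // Lazy expiry: a stale entry is dropped when it is next looked up.
    String getNewServer(String encodedName, long nowMs) {
        Entry e = movedRegions.get(encodedName);
        if (e == null) return null;
        if (nowMs - e.movedAtMs > TIMEOUT_REGION_MOVED) {
            movedRegions.remove(encodedName);
            return null;
        }
        return e.newServer;
    }

    // Cleaner pass: what a periodic chore would run, bounding memory even
    // for regions that are never looked up again.
    void cleanExpired(long nowMs) {
        for (Iterator<Map.Entry<String, Entry>> it = movedRegions.entrySet().iterator(); it.hasNext();) {
            if (nowMs - it.next().getValue().movedAtMs > TIMEOUT_REGION_MOVED) it.remove();
        }
    }

    int size() { return movedRegions.size(); }
}
```

Combining both expiry paths is the usual compromise: lazy expiry keeps reads correct, the cleaner bounds memory.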

Nicolas Liochon
added a comment - 03/May/12 20:57 v12, should be final.
1) ServerName is used everywhere in the interface, thanks to protobuf
2) hadoop.ipc serialization of exception is based on the #getMessage. So we have to parse it internally. It's not visible to the exception user.
3) The code to manage the error in the client package is quite complex. We have the exception at the very beginning, and then it's checked again, but we don't have the real exception anymore. I used a new "historyList" to make it work. There is another JIRA for other improvements, in which I could get rid of this (HBASE-5924)
4) Generated with protobuf 2.4.1
5) The destination in the closeRegion interface is a kind of interface hijacking. Other options would be:
sharing the region state in zookeeper
letting the regionserver call the master to get the new server. On paper this would be more efficient than a client -> master call. In both cases we could consider that the client should not connect to the master except for cluster administration (create table, split region, ...). That would increase global reliability. That's for another discussion as well, I think.
6) RegionServerServices has been modified to set a destination when removing a region from the online regions.
7) In another JIRA I will manage the case when the destination is not specified when calling the move function.

Nicolas Liochon
added a comment - 02/May/12 16:57 Generated with protobuf 2.4.1
Here are the things I'm not a big fan of, but for which I don't have a better solution:
the move management in the client code: I think it's possible to change the way we manage error (don't wait for all results before retrying), but that would be for another JIRA
the destination in the closeRegion interface is a kind of interface hijacking. Other options would be:
sharing the region state in zookeeper
letting the regionserver call the master to get the new server. On paper this would be more efficient than a client -> master call. In both cases we could consider that the client should not connect to the master except for cluster administration (create table, split region, ...). That would increase global reliability. That's for another discussion as well, I think.
Here is what I plan to do in the final version
move the handler functional code into a function in HRegionServer: this would allow making addToMovedRegion private instead of public.
Change all the CloseRegionHandler to take a RegionServer instead of a server? I'm not really keen on adding a class RegionServerServices, but maybe I should?
Manage the case when the destination is not specified at the beginning of the move (may be in a different Jira if it's not simple)...
All the previous comments should have been taken into account.

Nicolas Liochon
added a comment - 26/Apr/12 21:59 This patch will benefit any move, not just rolling restart, right?
Yes, but as there is a wait time between two tries, I think the benefit will be minimal vs. the wait time for a single client. I could add a heuristic like: if the region was closed more than 2 seconds ago, consider that it's now available on the new server and don't sleep before the next retry. That could lead to having more network messages if the rule is wrong (and the rule will be wrong when the system is overloaded), and it will add some complexity to the client code. Having the real status of the region would solve this.
Anyway, with the dev already done to cut the link between master & clients, it can help to save a reconnect to master. And during a rolling restart with regions moving everywhere, I think it will make a real difference.
I don't see changes to make use of this new functionality? I'd expect the balancer in master to make use of it?
Yes, it's the changes in AssignmentManager: the changes are in the patch, but are quite small at the end: basically:
- unassign(plan.getRegionInfo());
+ unassign(plan.getRegionInfo(), false, plan.getDestination());
I still need to manage the case when the destination is not specified at the beginning.

stack
added a comment - 26/Apr/12 19:22 @N This patch will benefit any move, not just rolling restart, right?
Here are a quick couple of comments on the patch.
I like addition of RegionMovedException.
Convention is to capitalize static defines: '+ private static final String hostField = "hostname=";' so it should be HOSTFIELD.
It's super ugly that you have to parse the exception message to get exception data member fields... but that's not your fault.
Please keep the style of the surrounding code. This kinda thing is unconventional:
+ private void updateCachedLocations(
+ Set< String > updateHistory,
+ HRegionLocation hrl,
+ Object t) {
Ugh on how ugly it is updating cache. Again, not your fault.
Ted suggests updating interface version. Maybe don't. If you do, you can't get this into a 0.94.1, etc.
I don't see changes to make use of this new functionality? I'd expect the balancer in master to make use of it?

Ted Yu
added a comment - 26/Apr/12 17:50 @N:
If the testing result is favorable, I think Lars may want it in 0.94 as well.
I think making this feature functional in 0.94 cluster would be a good start.
We could have this by adding the info in zk
A separate discussion should be started w.r.t. the above. This would shift load imposed by clients from master to zk quorum.

Nicolas Liochon
added a comment - 26/Apr/12 17:32 Note that I'm currently rewriting the patch, as it conflicts with the protobuf stuff that was committed recently... But the logic hasn't changed.
@ted What we're saving in the current implementation is a call to the master. It can be interesting in itself if the moved region is used by a lot of clients. We could do better by letting the client know that the region is now fully available somewhere else and that there is no need to wait before retrying. But right now the region server only knows that the region is closed and moved to another server. It doesn't know if the region is opened yet. We could have this by adding the info in zk, but it would increase the zk load...

Ted Yu
added a comment - 26/Apr/12 17:26 @N:
I think the following validation in real cluster would illustrate the benefit of this feature:
For a given table, select a region server and note the row key ranges hosted by one region on the region server. Direct client load to this region.
Issue the following command in shell:
hbase> move 'ENCODED_REGIONNAME', 'SERVER_NAME'
at time T.
Difference in client response to region migration around time T with and without the patch would be interesting.

Nicolas Liochon
added a comment - 26/Apr/12 09:25 Can we mark the failure and make this RegionMovedException behave the same as NotServingRegionException ?
Done.
For updateCachedLocations(), please put explanation for parameter on the same line as the parameter:
Done.
'Failed all' -> 'Failed call'
It's an existing comment that we can find again later in the code. It really means "failed all": all the queries on this server failed. I don't mind changing it to something better, but I think we should keep the "all".
'which the server' -> 'which the region'
Done.
Please increase the VERSION of HRegionInterface
Done.
How is the server removed from cache since I see 'continue' above ?
That's what makes this code complex and difficult to change: the error is actually managed later, when we don't have the real exception anymore.
For ServerManager.sendRegionClose(), please add javadoc for destServerName param.
Done.
Is it possible that destServerName is null ?
Safety checks added.
Please change the above to debug log. && Why is the above fatal (regionResult != null)? Step 4 appears in a comment below the above code. Should the above say step 3?
Bad logs fixed.

Nicolas Liochon
added a comment - 25/Apr/12 20:21 v1. On an old trunk, so it's just to give an overview. Includes some bits of HBASE-5844 as well.
There are 3 workarounds in the implementation:
1) As a ServerName is not serializable we use the String dedicated to this kind of issue. Acceptable I think.
2) hadoop.ipc serialization of exception is based on the #getMessage. So we have to parse it internally. It's not visible to the exception user. Still acceptable (?)
3) The code to manage the error in the client package is quite complex. We have the exception at the very beginning, and then it's checked again, but we don't have the real exception anymore. I used a new "updateList" to make it work; I'm looking for another solution here...