Attempts to reconnect to expired ZooKeeper sessions

Details

Type: Task

Status: Resolved

Priority: Major

Resolution: Invalid

Affects Version/s: 0.90.5, 0.92.0

Fix Version/s: None

Component/s: None

Labels: None

Description

After a couple of short network outages, we observed zombie HBase processes repeatedly attempting to reconnect to expired ZooKeeper sessions. We believe this is due to ZOOKEEPER-1159. Opening this issue as a reference to that.

Ken Weiner
added a comment - 22/Dec/11 01:37
I am seeing similar behavior after a restart of my HBase 0.90.4 cluster. The messages below are printed continuously. Is it the same issue? Also, ZOOKEEPER-1159 is not yet resolved. Which upstream issue was marked as invalid?
- Socket connection established to localhost/127.0.0.1:2181, initiating session
- Unable to read additional data from server sessionid 0x1345e2d625d0009, likely server has closed socket, closing socket connection and attempting reconnect

Andrew Purtell
added a comment - 22/Dec/11 01:42
@Ken: Not sure if it is the same issue. Do the ZooKeeper logs say the session expired?

Pardon, when I spoke with Camille at Hadoop World NYC, the verdict was that the patch on ZOOKEEPER-1159 was not correct, and this was the basis for 'ClientCnxn does not propagate session expiration indication'. This got jumbled in my head into an issue resolution. I do have the impression that the JIRA will be resolved that way.
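
For anyone trying to tell whether a process is stuck in this kind of reconnect loop, a rough sketch along the following lines may help. It tallies the client-side "Unable to read additional data from server sessionid ..." message quoted above per session id. The function names and the threshold of 3 are illustrative assumptions, not part of HBase or ZooKeeper, and the script does not check whether the session actually expired; that still has to be confirmed in the ZooKeeper server logs.

```python
import re
from collections import Counter

# Matches the client-side message quoted in this issue and captures the session id.
RECONNECT_RE = re.compile(
    r"Unable to read additional data from server sessionid (0x[0-9a-fA-F]+)"
)


def count_reconnect_attempts(lines):
    """Return a Counter mapping session id -> number of reconnect messages seen."""
    counts = Counter()
    for line in lines:
        match = RECONNECT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def looping_sessions(lines, threshold=3):
    """Return sorted session ids seen in at least `threshold` reconnect messages.

    The threshold is an arbitrary assumption; tune it to the log volume at hand.
    """
    counts = count_reconnect_attempts(lines)
    return sorted(sid for sid, n in counts.items() if n >= threshold)
```

Feeding it the lines of an HBase process log (for example, `looping_sessions(open(path))` with a log path of your choosing) lists the session ids that keep reappearing. A session id that shows up over and over is a candidate to look up in the ZooKeeper server logs.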