My HBase cluster completely stopped working this morning. When I looked at
the log files, I saw the errors below. I am wondering why this happened and
what can be done to avoid it in the future. I restarted the master and
region server and things look OK now, but I don't know how much data I may
have lost in the process.
Can someone help?
2011-12-08 17:21:24,400 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=ip-10-68-145-124.ec2.internal,60020,1318107555030, load=(requests=1709, regions=377, usedHeap=1402, maxHeap=2991): regionserver:60020-0x132d86477bf02c3-0x132d86477bf02c3-0x132d86477bf02c3-0x132d86477bf02c3 regionserver:60020-0x132d86477bf02c3-0x132d86477bf02c3-0x132d86477bf02c3-0x132d86477bf02c3 received expired from ZooKeeper, aborting
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:343)
    at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:261)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:530)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:506)
2011-12-08 17:21:28,310 FATAL org.apache.hadoop.hbase.regionserver.HRegionServer: ABORTING region server serverName=ip-10-68-145-124.ec2.internal,60020,1318107555030, load=(requests=4120, regions=377, usedHeap=1450, maxHeap=2991): Unhandled exception: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ip-10-68-145-124.ec2.internal,60020,1318107555030 as dead server
org.apache.hadoop.hbase.YouAreDeadException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ip-10-68-145-124.ec2.internal,60020,1318107555030 as dead server
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:532)
    at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:95)
    at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:79)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:733)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:594)
    at java.lang.Thread.run(Thread.java:636)
Caused by: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hbase.YouAreDeadException: Server REPORT rejected; currently processing ip-10-68-145-124.ec2.internal,60020,1318107555030 as dead server
    at org.apache.hadoop.hbase.master.ServerManager.checkIsDead(ServerManager.java:201)
    at org.apache.hadoop.hbase.master.ServerManager.regionServerReport(ServerManager.java:259)
    at org.apache.hadoop.hbase.master.HMaster.regionServerReport(HMaster.java:641)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Server.call(HBaseRPC.java:570)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1039)
    at org.apache.hadoop.hbase.ipc.HBaseClient.call(HBaseClient.java:771)
    at org.apache.hadoop.hbase.ipc.HBaseRPC$Invoker.invoke(HBaseRPC.java:257)
    at $Proxy5.regionServerReport(Unknown Source)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.tryRegionServerReport(HRegionServer.java:727)
    ... 2 more
Thanks,
Vinod