I have a 5-box ZooKeeper quorum. 2 of the 5 servers are unable to deserialize the message from the leader, and hence they are not coming up. I tried restarting a few times but am still seeing the same issue. This issue was observed while we were doing the following:

1> We have a script which brings down the ZooKeeper server on the leader. We are trying to test whether a follower takes over as the new leader once the leader is down.

2> The script has been running for 2-3 days. Surprisingly, we didn't see any issue for the first 2 days, but after that we hit this issue, wherein 2 out of 5 ZooKeeper machines are unable to deserialize the message.

It can take quite a bit of time for the leader to communicate the znode data to the followers during startup. initLimit is the amount of time the leader will allow the followers to download the data from the leader and get to a point where they are ready to serve requests to clients. If this time is exceeded, the leader will close the connection to the follower and the quorum process will restart. Your current initLimit value of 20 sec is typically more than enough time. That said, if you have a slow connection or the data is very large, it might be necessary to increase this.

The syncLimit controls the amount of time the quorum members will allow each other to communicate. If the leader doesn't hear from a follower within the syncLimit time, it will drop the follower from the quorum, and vice versa. It's important not to set this parameter too high, as it's one of the ways a server detects networking issues and causes recovery to take place (i.e. the follower will drop out of the quorum and try to reconnect, all clients on that follower will disconnect and reconnect to another server, etc.).
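One thing worth keeping in mind: in zoo.cfg both initLimit and syncLimit are expressed in ticks, not seconds, so the effective timeout is the limit multiplied by tickTime (which is in milliseconds). Here is a rough Python sketch of that arithmetic. The sample values below (tickTime=2000, initLimit=10, syncLimit=5) are assumptions for illustration, not your actual config, and the helper names are made up for this example.

```python
def parse_zoo_cfg(text):
    """Parse key=value lines from zoo.cfg-style text, skipping blanks and comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def limit_seconds(settings, name):
    """Convert a tick-based limit (initLimit or syncLimit) to seconds."""
    tick_ms = int(settings["tickTime"])   # tickTime is in milliseconds
    ticks = int(settings[name])           # the limit is a number of ticks
    return tick_ms * ticks / 1000.0


# Assumed sample config, for illustration only.
SAMPLE_CFG = """
# hypothetical values
tickTime=2000
initLimit=10
syncLimit=5
"""

settings = parse_zoo_cfg(SAMPLE_CFG)
for name in ("initLimit", "syncLimit"):
    print(name, "=", limit_seconds(settings, name), "seconds")
```

With these assumed values, initLimit works out to 20 seconds and syncLimit to 10 seconds. You can paste your own zoo.cfg contents into SAMPLE_CFG to see what your servers are actually using.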

Can you try increasing both initLimit and syncLimit to 100 on all ZK servers and restarting? Note that it's not good to leave syncLimit at such a high value. This issue is fixed by ZOOKEEPER-1521.
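For clarity, the change would look something like the fragment below in zoo.cfg on each server. The tickTime value shown is an assumption (keep whatever you already have). Only the two limit lines are the suggested change, and the rest of your file stays as it is.

```
# tickTime is in milliseconds; 2000 is assumed here, keep your existing value
tickTime=2000
# both limits are counted in ticks
initLimit=100
syncLimit=100
```

With a tickTime of 2000 this gives each limit 200 seconds, which is far more than you would want long term, so treat it as a diagnostic setting and lower syncLimit again once the servers have rejoined the quorum.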