Diego Ongaro
added a comment - 07/Dec/11 03:01 I think I've encountered a client-visible manifestation of this issue.

The ZooKeeper FAQ entry [1] titled "Is there an easy way to expire a session for testing?" explains that when one client closes a session, all other connections using that same session ID will be expired. This feature appears to work with Java clients but not with C clients.

I've attached a C and a Java version of the test case. The basic idea is this:

1) Connect to ZooKeeper.
2) Create a test znode ("/test") if it doesn't exist.
3) Write to the znode.
4) Connect to ZooKeeper with the first connection's session ID and password.
5) Close this new connection.
6) Try to write to the znode again on the first connection.

The write in (6) should never succeed. Using the Java client, the write first throws a ConnectionLossException. Retrying it then throws a SessionExpiredException, as expected. However, when using the C client, the write returns ZOK; this is the wrong behavior.

I've also attached the relevant parts of the server's log file when running against both the Java client and the C client. They diverge in the way jiang guangran originally reported. The key thing to see in these logs is that, in the C client, the session is not terminated cleanly. The client library transparently reconnects a new socket using the same session ID, and the server accepts the session ID that should have been expired.

Based on this behavior, I think there may be two issues with the C client library:

First, it does not cleanly close its sessions. This prevents C clients from forcefully closing each other's sessions.

Second, it attempts to reconnect over a new socket without returning a ZCONNECTIONLOSS to the client. This doesn't matter much to me, but could it affect correctness for some applications?

If you need more info or clarification, feel free to ask by commenting on this issue. I'm also on IRC as 'ongardie'.

Cheers,
Diego

[1] http://wiki.apache.org/hadoop/ZooKeeper/FAQ#A4
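
To make the expected outcome of the six steps concrete, here is a tiny in-memory model of the session semantics the FAQ describes. It is only a sketch: it does not use the ZooKeeper client library, and every name in it (model_server, model_conn, model_connect, model_close, model_write, MODEL_OK, MODEL_SESSION_EXPIRED) is hypothetical. It also skips the intermediate connection-loss error that the Java client reports first.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names belong to the ZooKeeper C API. */
enum model_rc { MODEL_OK = 0, MODEL_SESSION_EXPIRED = -1 };

#define MODEL_MAX_SESSIONS 8

/* The "server" only remembers which session IDs are still live. */
struct model_server {
    long long ids[MODEL_MAX_SESSIONS];
    bool live[MODEL_MAX_SESSIONS];
    size_t count;
};

/* A connection is just a handle onto a session ID held by the server. */
struct model_conn {
    struct model_server *srv;
    long long session_id;
};

/* Return a pointer to the liveness flag for a session, or NULL if unknown. */
bool *model_find(struct model_server *srv, long long id)
{
    size_t i;
    for (i = 0; i < srv->count; i++) {
        if (srv->ids[i] == id)
            return &srv->live[i];
    }
    return NULL;
}

/* Steps 1 and 4: connect, registering the session if the ID is new. */
struct model_conn model_connect(struct model_server *srv, long long id)
{
    struct model_conn c = { srv, id };
    if (model_find(srv, id) == NULL && srv->count < MODEL_MAX_SESSIONS) {
        srv->ids[srv->count] = id;
        srv->live[srv->count] = true;
        srv->count++;
    }
    return c;
}

/* Step 5: a clean close ends the session for every connection sharing it. */
void model_close(struct model_conn *c)
{
    bool *live = model_find(c->srv, c->session_id);
    if (live != NULL)
        *live = false;
}

/* Steps 3 and 6: a write succeeds only while the session is still live. */
enum model_rc model_write(const struct model_conn *c)
{
    const bool *live = model_find(c->srv, c->session_id);
    return (live != NULL && *live) ? MODEL_OK : MODEL_SESSION_EXPIRED;
}
```

In this model, connecting twice with the same ID, closing the second connection, and then writing on the first yields MODEL_SESSION_EXPIRED, which corresponds to the outcome expected in step 6.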

Patrick Hunt
added a comment - 13/Feb/12 05:03 Diego, can you submit this in the form of a patch?
https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToContribute
I'm afraid we missed this originally. When you have something you'd like us to review, please advance the workflow by clicking on the "submit" link. Thanks!

lincoln.lee
added a comment - 15/Jun/12 08:04 Diego, I've also seen java.nio.channels.CancelledKeyException in the server's log when a Python client calls the close method. I found that the client had sent the CLOSE_OP request to the server but exited before the server could read the request completely.

src/c/src/zookeeper.c
@@ -2522,6 +2522,22 @@
/* make sure the close request is sent; we set timeout to an arbitrary
(but reasonable) number of milliseconds since we want the call to block*/
rc=adaptor_send_queue(zh, 3000); // this actually calls send(s, buf, len, MSG_NOSIGNAL)

We can simply sleep for one second after calling send_buffer in the flush_send_queue function of src/c/src/zookeeper.c:

rc = send_buffer(zh->fd, zh->to_send.head);
+ if (timeout>0)sleep(1);
if(rc==0 && timeout==0){

Or add a few more lines, as in the attached ZOOKEEPER-1105.patch (against trunk).

Michi Mutsuzaki
added a comment - 16/Sep/12 03:06 It looks like this patch broke the Windows build. The pollfd struct is not available on Windows, so we need to check "#ifdef WIN32" and do something else for Windows.
https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008/513/console
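
For illustration, a guard of the kind described above could look roughly like the following sketch. It assumes that select() is an acceptable substitute on Windows, which is a guess rather than something decided in this thread, and wait_readable is a hypothetical helper name rather than an existing function in the C client.

```c
#ifdef WIN32
#include <winsock2.h>
#else
#include <poll.h>
#endif

/*
 * Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns >0 if the descriptor is ready, 0 on timeout, <0 on error.
 */
int wait_readable(int fd, int timeout_ms)
{
#ifdef WIN32
    /* No struct pollfd here, so fall back to select(). */
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET((SOCKET)fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* Winsock ignores the first argument of select(). */
    return select(0, &readfds, NULL, NULL, &tv);
#else
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    return poll(&pfd, 1, timeout_ms);
#endif
}
```

Keeping the platform check inside one small function would leave the calling code free of further #ifdef blocks.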

Mahadev konar
added a comment - 16/Sep/12 21:33 Nice catch, Michi. I think I'll revert the patch for 3.4 and trunk, and we can fix it later. I don't think this looks like a blocker for the 3.4 release. Michi, what do you think?

Hadoop QA
added a comment - 28/Sep/12 13:10 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12546985/ZOOKEEPER-1105v1.patch
against trunk revision 1389711.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1193//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1193//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1193//console
This message is automatically generated.

Hadoop QA
added a comment - 08/Oct/13 16:42 -1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12546985/ZOOKEEPER-1105v1.patch
against trunk revision 1530166.
+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no new tests are needed for this patch.
Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1657//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1657//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1657//console
This message is automatically generated.