Github user sudheeshkatkam commented on a diff in the pull request:
https://github.com/apache/drill/pull/442#discussion_r58285085
--- Diff: exec/rpc/src/main/java/org/apache/drill/exec/rpc/RpcBus.java ---
@@ -159,19 +159,15 @@ public ChannelClosedHandler(C clientConnection, Channel channel)
{
@Override
public void operationComplete(ChannelFuture future) throws Exception {
String msg;
- if(local!=null) {
+ if(local != null) {
msg = String.format("Channel closed %s <--> %s.", local, remote);
}else{
msg = String.format("Channel closed %s <--> %s.", future.channel().localAddress(),
future.channel().remoteAddress());
}
- if (RpcBus.this.isClient()) {
- if(local != null) {
- logger.info(String.format(msg));
- }
- } else {
- queue.channelClosed(new ChannelClosedException(msg));
- }
+ logger.info(msg); // should we leave this at info level ?
+
+ queue.channelClosed(new ChannelClosedException(msg));
--- End diff --
@adeneche @jacques-n correct me if I am wrong.
Per my understanding, this logic is incomplete, with or without this change. Let's look
at bit-to-bit communication.
There is one CoordinationQueue **for each instance** of RpcBus (\*Server, \*Client classes
inherit from RpcBus). Requestors use this queue to listen for the outcomes of their requests.
1. DataClient <--> DataServer. DataClient is always the requestor, and there can
be at most two data connections between two bits (A --> B and B --> A).
a. Since a DataClient is created per connection, the client's queue contains outcomes
of requests to one DataServer. When this connection closes, failing all RPC outcomes in the
queue makes sense.
b. There is only one instance of DataServer per Drillbit, and so **one server queue**.
Since DataServer never makes requests, this queue should be empty, so `queue.channelClosed(...)`
should be a no-op.
2. ControlClient <--> ControlServer. This communication is peer-to-peer, i.e. both ControlServer
and ControlClient can make requests and handle requests. The bit initiating a connection is
the ControlClient, and its peer is the ControlServer for the lifetime of this connection. There
can be at most one connection between two bits (A <--> B), and messages are sent both
ways. Now, assume a connection is made.
a. On the ControlClient side, the queue contains outcomes of requests to one ControlServer.
When this connection closes, failing all RPC outcomes in the queue makes sense.
b. There is only one instance of ControlServer per Drillbit, and so **one server queue**.
However, ControlServer can make requests to other bits, and to multiple clients! So this queue
can contain outcomes from multiple connections, and `queue.channelClosed(...)` would fail the
outcomes of requests on **all** connections, not just the one that closed??
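
To make the concern in 2b concrete, here is a minimal sketch, not Drill's actual code. `SketchQueue` is a hypothetical stand-in for the per-RpcBus coordination queue, and it assumes (as described above) that the queue does not record which connection an outcome belongs to. `CompletableFuture` stands in for an RPC outcome.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class SharedQueueSketch {

  /** Hypothetical stand-in for a per-RpcBus coordination queue; not Drill's real class. */
  static class SketchQueue {
    private final AtomicInteger nextId = new AtomicInteger();
    private final Map<Integer, CompletableFuture<String>> outcomes = new ConcurrentHashMap<>();

    /**
     * Registers an outstanding request. The connection name is accepted only to
     * show the caller's intent; by assumption the queue does not track it.
     */
    CompletableFuture<String> register(String connection) {
      CompletableFuture<String> outcome = new CompletableFuture<>();
      outcomes.put(nextId.getAndIncrement(), outcome);
      return outcome;
    }

    /** Fails every outstanding outcome, whichever connection it was sent on. */
    void channelClosed(Exception cause) {
      for (Integer id : new ArrayList<>(outcomes.keySet())) {
        CompletableFuture<String> outcome = outcomes.remove(id);
        if (outcome != null) {
          outcome.completeExceptionally(cause);
        }
      }
    }
  }

  public static void main(String[] args) {
    // One server-side queue shared by requests sent over two different connections.
    SketchQueue serverQueue = new SketchQueue();
    CompletableFuture<String> toBitA = serverQueue.register("bit A");
    CompletableFuture<String> toBitB = serverQueue.register("bit B");

    // Only the connection to bit A closes.
    serverQueue.channelClosed(new RuntimeException("Channel closed: bit A"));

    System.out.println("request to A failed: " + toBitA.isCompletedExceptionally());
    System.out.println("request to B failed: " + toBitB.isCompletedExceptionally());
  }
}
```

In this sketch both lines print `true`: the request to bit B is failed even though its connection is still open. With a queue per connection (cases 1a and 2a), the same `channelClosed(...)` call would only touch outcomes for the closed connection.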