Arun C Murthy
added a comment - 14/Mar/14 05:27 Colin P. McCabe I have a barebones C client already as an offshoot of https://github.com/hortonworks/gohadoop ; would you be interested in using that as a starting point?

Colin P. McCabe
added a comment - 14/Mar/14 17:39 Thanks, Arun C Murthy, I'll definitely take a look. I like Golang.
I have some code here already. Mostly I just need to get it into shape and integrate with Maven and CMake.

Colin P. McCabe
added a comment - 27/Mar/14 02:29 This patch implements an async Hadoop RPC client in C. It uses libuv and protobuf-c.
CMakeLists.txt: the top-level build file which specifies build products.
GenerateProtobufs.cmake: this CMake script invokes the protobuf compiler on the various Hadoop protobuf files.
common/hadoop_err.c: implements hadoop error objects which can be easily created and managed to represent errors throughout the system. They have both an error code and an error message. The error message starts with an exception name.
common/net.c: network utilities. This is very minimal since libuv provides most of what we need.
common/test.h: unit test macros
common/tree.h: BSD-licensed red-black tree, similar to what we use in libhdfs.
common/queue.h: BSD-licensed list implementation
hdfs/namenode-rpc-unit.c: end-to-end test of the RPC framework.
rpc/call.c: represents an in-progress RPC call.
rpc/client_id.c: represents a 16-byte client ID, as specified in Hadoop RPC.
rpc/conn.c: represents an existing TCP connection to a remote Hadoop server.
rpc/messenger.c: the messenger coordinates all network traffic. It owns all the reactor threads. Currently there is only one reactor thread, but in the future we may want more than one to get more parallelism.
rpc/proxy.c: clients use the proxy to start and manage RPCs, similar to how Proxy is used in the Hadoop java code. There is going to be some more glue code here so that each Hadoop RPC will have an equivalent sync and async invocation function.
rpc/reactor.c: a reactor is a single thread calling epoll. The messenger owns all reactors. Connection callbacks happen in the context of the reactor thread.
rpc/shorten.c: shorten is a utility which makes some really long object names generated by protobuf-c shorter.
rpc/varint.c: varint handles the serialization and deserialization of variable length integers.
Build with "cmake . && make"
Unit tests can be run with "make test."

Colin P. McCabe
added a comment - 14/Apr/14 16:39 [~wenwupeng]: sorry, I misspoke. I meant to say 0.11. The version I am using is 0.11.22 (my brain must have collapsed this into 0.12). The error handling API changed a bit from 0.10, which is what I was using earlier. Check it out.

Abraham Elmahrek
added a comment - 30/Apr/14 21:36 In general, +1. Please take a look at the following comments:
I couldn't tell if this had already made it into the hadoop-common build system?
I'm assuming this is a preliminary patch? I think that we'll need more test cases and a valgrind run?
I think in the long term the cmake file will need to be cleaned up. I saw a couple of hacks (i.e. find_library(PROTOC_LIB NAMES libprotoc.so HINTS /opt/protobuf-2.5/lib64/)).
Do we need a cap on how large '&reactor->inbox.pending_calls' can be? I'm concerned that some systems can be intensive enough to cause OOM exceptions. Also, client side throttling may set good boundaries?

Luke Lu
added a comment - 30/Apr/14 21:48 I assume this is a prototype/partial demo, as all the security stuff is missing. What else is missing (including not yet published)? Is there any work left for the newly minted branch committers?

Colin P. McCabe
added a comment - 01/May/14 02:03 I couldn't tell if this had already made it into the hadoop-common build system?
It's not hooked up to Maven yet, no. We'll use the same method we use to hook CMake into the current Maven poms, I think. Shouldn't be too difficult.
I'm assuming this is a preliminary patch? I think that we'll need more test cases and a valgrind run?
I have run valgrind on it and it came up clean. There is one test case here; more will follow.
I think in the long term the cmake file will need to be cleaned up. I saw a couple of hacks (i.e. find_library(PROTOC_LIB NAMES libprotoc.so HINTS /opt/protobuf-2.5/lib64/)).
Fixed, thanks.
Do we need a cap on how large '&reactor->inbox.pending_calls' can be? I'm concerned that some systems can be intensive enough to cause OOM exceptions. Also, client-side throttling may set good boundaries?
I think the first thing to do is implement RPC timeouts, which we can do in a follow-on. This should avoid having queues that grow too much, unless the client is truly unwise about making a ton of calls in a short period.
Luke Lu: yes, we will do user/group security in a follow-up. I think for the "plain" auth method, it will be as simple as just filling in the fields in the header. SASL will be more work. There is definitely lots of work for branch committers here.
+1
Will commit to branch. And file follow-ups...

Yongjun Zhang
added a comment - 06/May/14 04:51 Hi Colin,
Thanks for the impressive work you did!
I attempted to compile the code and ran into an issue:
[yzhang@localhost hadoop-native-core]$ cmake . && make
-- Building hadoop-native-core, the native Hadoop core libraries.
......
Linking C executable shorten
[ 3%] Built target shorten
make[2]: *** No rule to make target `/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/ClientNamenodeProtocol.proto', needed by `protobuf/ClientNamenodeProtocol.pb-c.c'. Stop.
make[1]: *** [CMakeFiles/hdfs-core.dir/all] Error 2
make: *** [all] Error 2
[yzhang@localhost hadoop-native-core]$
The fact that I ran into this may be because of a bug in the script; I need to look into it.
BTW, I saw that in GenerateProtobufs.cmake,
DECLARE_PROTOS(
COMMON_PROTOBUF_SRCS
${CMAKE_CURRENT_BINARY_DIR}/protobuf
"${R}/hadoop-common-project/hadoop-common/src/main/proto/"
${R}/hadoop-common-project/hadoop-common/src/main/proto/GetUserMappingsProtocol.proto
${R}/hadoop-common-project/hadoop-common/src/main/proto/HAServiceProtocol.proto
...
A quick comment here is, I'd suggest replacing the one-character variable ${R} with a multi-character name, so it's easier to see where the variable is set. Actually you used "GENENATOR_DIR" in an earlier part of the script; probably you can change "R" to "HADOOP_DIR" to be consistent.
Thanks.

Yongjun Zhang
added a comment - 06/May/14 20:46 I found that my cmake version is too old, which is the cause of the problem reported above.
CMAKE_CURRENT_LIST_DIR: (since 2.8.3) this is the directory of the listfile currently being processed.

Binglin Chang
added a comment - 07/May/14 06:24 Hi Yongjun, thanks for trying the patch; the issue you mentioned is right.
Actually there are more issues I want to discuss, Colin:
1. String literals and macros need to be separated with a space, e.g. "%" PRId64 "xxx"; some compilers are stricter about this.
2. I think it is pretty safe to use %d instead of "%" PRId32; it seems unnecessary to do more typing.

Colin P. McCabe
added a comment - 07/May/14 18:27 Yongjun said:
[cmake version issue]
As Wenwu commented, this is fixed by HADOOP-10573.
I need to add [stdint.h] to both rpc/conn.c and rpc/proxy.c to solve the problem.
Thanks for this. I will add that include line to both those files in HADOOP-10564.
Binglin said:
1. String literals and macros need to be separated with a space, e.g. "%" PRId64 "xxx"; some compilers are stricter about this.
Hmm. That's odd. No matter how many warning and pedantic options I pass to gcc, I cannot get it to complain about this. It seems like a lot of the existing C code in Hadoop does this, so I'm surprised you had a problem.
Still, this is an easy issue for us to fix, so I will roll it into HADOOP-10564.
2. I think it is pretty safe to use %d instead of "%" PRId32; it seems unnecessary to do more typing.
From a practical perspective, I see what you're saying. PRId32 would only be different from %d if the system were using 16-bit ints, which is extremely unlikely to be the case on any modern system. Still, I like the principle of matching the format specifier to the type. Once you start making exceptions and being clever you have to remember all the exceptions, whereas if you just match the format specifier to the type, you only have to remember to do that. I'll think about it some more...

Binglin Chang
added a comment - 08/May/14 05:44 Hmm. That's odd. No matter how many warning and pedantic options I pass to gcc
http://stackoverflow.com/questions/11869593/c99-printf-formatters-vs-c11-user-defined-literals
Although it is C code, it could be used in C++.

Binglin Chang
added a comment - 12/May/14 10:02 Hi Colin, I have some difficulty understanding your code and some comments; could you please help explain?
1. To my understanding, the RPC client should have a map<callid, call> to record all unfinished calls, but I could not find any code assigning call ids (they are only set to 0) or managing unfinished calls. Could you help me locate that logic?
2. In the demo namenode-rpc-unit, I see each proxy only has one call (the current call). Does this mean the client can only make one RPC at a time? If so, probably every RPC call will need its own rpc_proxy. From the user's standpoint, they may want what Java's interface offers: multiple threads can concurrently call one proxy, which is very common in the HDFS client.
3. hrpc_proxy.call belongs to hrpc_proxy, but in hrpc_proxy_start the call is passed to reactor->inbox.pending_calls, which may have a longer life cycle than hrpc_proxy, so there may be a potential bug in hrpc_proxy.call?

Colin P. McCabe
added a comment - 12/May/14 18:36 1. To my understanding, the RPC client should have a map<callid, call> to record all unfinished calls, but I could not find any code assigning call ids (they are only set to 0) or managing unfinished calls. Could you help me locate that logic?
Good question. We should be using incrementing call IDs like the Java client does, but this is not implemented yet, as you pointed out.
You probably already know this, but I guess I ought to comment here... The rationale behind call id in general is that in some future version of the Java RPC system, we may want to allow multiple calls to be "in flight" at once. Then, the server could send back a response and tag it with the appropriate "call id" of the call in flight. However, this is not implemented currently in the Java RPC server, as I understand it. So it's difficult to test on the C client.
Regardless, I agree that we should implement this.
2. In the demo namenode-rpc-unit, I see each proxy only has one call (the current call). Does this mean the client can only make one RPC at a time?
No. The client can make multiple calls at once by using multiple proxies.
If so, probably every RPC call will need its own rpc_proxy. From the user's standpoint, they may want what Java's interface offers: multiple threads can concurrently call one proxy, which is very common in the HDFS client.
The users won't be using proxies... only the native HDFS client has to deal with them. From the library user's perspective, they are calling hdfsOpen, hdfsClose, etc.
3. hrpc_proxy.call belongs to hrpc_proxy, but in hrpc_proxy_start the call is passed to reactor->inbox.pending_calls, which may have a longer life cycle than hrpc_proxy, so there may be a potential bug in hrpc_proxy.call?
There's no bug. You just can't de-allocate the proxy while it is in use. This problem can only be solved by garbage collection or memory management of some form, which C doesn't have. I will add a comment to proxy.h to make this requirement clearer.

Binglin Chang
added a comment - 13/May/14 03:59 The rationale behind call id in general is that in some future version of the Java RPC system, we may want to allow multiple calls to be "in flight" at once
I guess I always thought this was already implemented, because the client can already make parallel calls, and there are multiple RPC handler threads on the server side already; doing this should be natural and easy, although I haven't tested it. Are you sure about this? If so I can try to add this in Java...
From the library user's perspective, they are calling hdfsOpen, hdfsClose, etc.
So those methods all need to initialize hrpc_proxy again (which needs the server address, user, and other configs). What I am trying to say is that maybe proxy and call can be separated: the proxy can be shared, with a call on the stack for each call. Maybe it's too late to change that; just my two cents.
You just can't de-allocate the proxy while it is in use.
So there should be a method for the user to cancel an ongoing RPC (and also to make sure that after the cancel completes, there is no more memory access to hrpc_proxy and the call). It looks like hrpc_proxy_deactivate can't do this yet?

Binglin Chang
added a comment - 13/May/14 05:36 So there should be a method for the user to cancel an ongoing RPC
I thought more about this; adding a timeout to the call also works and seems like a better solution.

Colin P. McCabe
added a comment - 13/May/14 18:43 So those methods all need to initialize hrpc_proxy again (which needs the server address, user, and other configs). What I am trying to say is that maybe proxy and call can be separated: the proxy can be shared, with a call on the stack for each call. Maybe it's too late to change that; just my two cents.
I think the performance is actually going to be pretty good, since we're just putting an object on the stack and doing some memory copying. I have some code which implements the native filesystem which I will post soon... I think some of this will make more sense when you see how it gets used.
So there should be a method for the user to cancel an ongoing RPC (and also to make sure that after the cancel completes, there is no more memory access to hrpc_proxy and the call). It looks like hrpc_proxy_deactivate can't do this yet?
The most important use-case for cancelling an RPC is when shutting down the filesystem in hdfsClose. We can already handle that by calling hrpc_messenger_shutdown, which will abort all in-progress RPCs.
I thought more about this; adding a timeout to the call also works and seems like a better solution.
Yeah, I want to implement timeouts. The two most important timeouts are how long we should wait for a response from the server and how long we should keep around an inactive connection.

Binglin Chang
added a comment - 14/May/14 07:33 I think the performance is actually going to be pretty good
I am not worried about performance, it just may cause more redundant code, wait to see some code then
Speak of redundant code, there are too many redundant codes in xxx.call.c, is it possible to do this using functions rather than generating repeat code?
The rationale behind call id in general is that in some future version of the Java RPC system, we may want to allow multiple calls to be "in flight" at once
I looked closely to the rpc code, looks like concurrent rpc is supported, unit test in TestRPC.testSlowRpc test this.

Colin P. McCabe
added a comment - 15/May/14 00:27 I am not worried about performance; it just may cause more redundant code. I'll wait to see some code, then. Speaking of redundant code, there is a lot of repeated code in xxx.call.c. Is it possible to do this using functions rather than generating repeated code?
The point of the generated functions is to provide type safety, so you can't pass the wrong request and response types to the functions. It also makes remote procedure calls look like a local function call, which is one of the main ideas in RPC.
I want to post the rest of the native client code, but I am waiting for a +1 on HADOOP-10564.
I looked closely at the RPC code; it looks like concurrent RPC is supported. The unit test TestRPC.testSlowRpc tests this.
I think we are talking about two different things. What I am talking about is sending two different responses at once on the same TCP socket. What you are talking about (and what TestRPC.testSlowRpc tests) is making two RPCs at once from a client. Clearly it is possible to make more than one RPC at once from a client; otherwise, our performance would be pretty poor. But last time I investigated it, each TCP socket could only do one request at once.

Binglin Chang
added a comment - 15/May/14 03:05 I think TestRPC.testSlowRpc should only use one socket, based on the socket reuse logic, so the responses also go through one socket. The test basically does the following:
client call (id=0) -> server
client call (id=1) -> server
server response (id=1) -> client
client call (id=2) -> server
server response (id=2) -> client
server response (id=0) -> client
So the client should recognize the callid in its response-handling logic; otherwise the responses are mismatched (it would return response 1 for call 0).
But last time I investigated it, each TCP socket could only do one request at once.
Do you mean the current native code, or the Java code?

Binglin Chang
added a comment - 15/May/14 05:30 The point of the generated functions is to provide type safety, so you can't pass the wrong request and response types to the functions. It also makes remote procedure calls look like a local function call, which is one of the main ideas in RPC.
We can keep the functions, but the repeated code in these functions can be eliminated using abstraction, so as to reduce the binary code size.

Colin P. McCabe
added a comment - 15/May/14 19:57 <multiple call ids in flight discussion>
I was basing my belief that we don't support multiple call IDs in flight at once off a casual conversation I had (off the record) with some folks at Hadoop Summit Europe. It's possible that the code has improved since then, or that they were out of date. I admit that I haven't scrutinized the server code closely enough to give a definitive answer here.
There is an easy way to resolve this, of course: we can modify the C code to put multiple call IDs in flight at once, and see if the server loses its marbles.
The important thing to remember is that this is an optimization. Even if we never do it, we'll still have a usable native client. So I'm going to try to get the basic stuff done first, then perhaps we can circle back on this. Or maybe if one of you guys wants to do it in parallel that would work out too. The tricky part is testing... we need some way to force multiple calls to be in flight on the channel so we know that it works.
We can keep functions, but the repeated code in these functions can be eliminated using abstraction, so as to reduce the binary code size.
I have a patch which implements most of the native client, based on the existing RPC code. The library I generate is only 3 MB, even including a bunch of stuff which has nothing to do with RPC. So although I can see that there might be a potential to optimize code size, I don't think it should be our highest priority right now.
You also have to keep in mind that code which is not used will be stripped out by the linker. So if we don't use the async version of a certain RPC (for example), that code will not become part of libhdfs.so. So although you might look at the generated code and go "OMG so much code!" it's really not that bad. This is similar to how in C++, every time you template std::map on a different type, you get another set of std::map functions in your binary. In practice, it is usually not a problem.
Still, if you want to work on optimizing generated code size, I would welcome any patches. The challenge would be to reduce the code size while still maintaining RPC-specific error messages and not regressing performance.

Doug Cutting
added a comment - 16/May/14 19:44 Neither a server nor a client must implement multiple calls in flight. Both the Java client and server do implement this feature. But if a client only ever submits requests serially over a connection then it can ignore the response callid, since while only a single call is ever outstanding, only that call will ever be responded to. Similarly, a server might be implemented to serially respond to requests so long as it copies call ids from requests to responses.

Colin P. McCabe
added a comment - 16/May/14 20:36 Neither a server nor a client must implement multiple calls in flight. Both the Java client and server do implement this feature. But if a client only ever submits requests serially over a connection then it can ignore the response callid, since while only a single call is ever outstanding, only that call will ever be responded to.
Thanks, Doug. That is the strategy we're using now in the native client-- one call at a time. I don't think it would be too tough to put multiple calls in flight at some point, though... maybe I'll take a look...

Haohui Mai
added a comment - 06/Jun/14 23:00 I'm wondering whether using C, instead of a more modern language like C++11, is a requirement. It looks to me that the code has to reinvent a lot of wheels for the client to be functional. It also looks to me that the code assumes callers properly manage constructs like hadoop_err, which could be quite error-prone.

Colin P. McCabe
added a comment - 07/Jun/14 00:09 The code is implementing the existing libhdfs interface, which is a C interface. The way callers use this interface is not going to change. hadoop_err is an internal type which is not exposed to callers. No wheels are being reinvented... we are using libuv for our portability layer and other libraries where appropriate. We will provide clients for other languages like Objective C, C++, D, golang, Python, whatever by layering them on top of the C client, which is very easy to do with C.

Haohui Mai
added a comment - 10/Jun/14 20:55 No wheels are being reinvented... we are using libuv for our portability layer and other libraries where appropriate.
What concerns me is that the code has to bring in a lot more dependencies in plain C, which carries a high maintenance cost. For example, this patch contains implementations of at least linked lists, splay trees, hash tables, and rb trees. There is a lot of overhead in implementing, reviewing, and testing that code. For example, a lot of time has to be wasted on issues like the following:
https://issues.apache.org/jira/browse/HADOOP-10640?focusedCommentId=14026841&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14026841
The above link demonstrates that even when the code is directly copied from other places, the overhead of reviewing and maintaining it is non-negligible. I anticipate that down the road the problem will only get worse. For example, have you considered supporting filenames in unicode? In that case I think libicu might need to be brought into the picture.
It looks to me that it is much more compelling to implement the code in a more modern language, say, C++11, where much of the headache right now is taken away by a mature standard library.

Colin P. McCabe
added a comment - 11/Jun/14 00:59 What concerns me is that the code has to bring in a lot more dependencies in plain C, which carries a high maintenance cost
Currently, the libraries we depend on are: libuv, for portability primitives; protobuf-c, for protobuf functionality; expat, for XML parsing; and liburiparser, for parsing URIs. None of that functionality is provided by the C++ standard library, so your statement is false.
For example, this patch contains implementations of at least linked lists, splay trees, hash tables, and rb trees. There is a lot of overhead in implementing, reviewing, and testing that code.
A lot of this code is not new. For example, we were using tree.h (which implements splay trees and rb trees) previously in libhdfs. The maintenance burden was not high. In fact, it was zero, because we never had to fix a bug in tree.h. So once again, your statement is just false.
htable.c got a review because it is new code. I would hardly call reviewing new code a "maintenance burden." And anyway, there is a standard C way to use hash tables... the hcreate_r, hsearch_r, and hdestroy functions. We would like to use the standard way, but Windows doesn't implement these functions.
For example, have you considered supporting filenames in unicode? In that case I think libicu might need to be brought into the picture.
First of all, the question of whether we should use libicu is independent of the question of whether we should use C++. libicu has a C interface, and the standard C++ libraries and runtime don't provide any unicode functionality beyond what the standard C libraries provide.
Second of all, I see no reason to use libicu. All the strings we are dealing with are UTF-8 supplied to and from protobuf. This means that they are null-terminated and can be printed and handled with existing string functions. libicu might come into the picture if we wanted to start normalizing unicode strings or using wide character strings. But we don't need or want to do that.
It looks to me that it is much more compelling to implement the code in a more modern language, say, C++11, where much of the headache right now is taken away by a mature standard library.
C++ first came on the scene in 1983. That is 31 years ago. C++ may be a lot of things, but "modern" isn't one of them. I was a C++ programmer for 10 years. I know the language about as well as anyone can. I specifically chose C for this project because of a few things.
Firstly, the challenge of maintaining a consistent C++ coding style is very, very large. This is true even when everyone is a professional C++ programmer working under the same roof. For a project like Hadoop, where C/C++ is not everyone's first language, the challenge is just unsupportable. The C++ learning curve is just much higher than C. You have to know everything you have to know for C, plus a lot of very tricky things that are unique to C++.
There are a lot of contentious issues in the community, like: use exceptions, or don't use exceptions? Use global constructors, or don't use global constructors? Use boost, or don't use boost? Use C++0x / C++11 / C++14, or use some older standard? Use runtime type information (dynamic_cast, typeof), or don't use runtime type information? Operator overloading, or no operator overloading?
There are reasonable arguments for each of these positions. For example, exceptions harm performance because of the need to maintain data to do stack unwinding (see http://preshing.com/20110807/the-cost-of-enabling-exception-handling/). That's just if you don't use them... if you do use them, exceptions turn out to be a lot slower than return codes. They also can make code difficult to follow. C++ doesn't have checked exceptions, so you can never really know what any function will throw. For this reason, some fairly smart people at Google have decided to ban exceptions from their coding standard. This, in turn, means that it's difficult for libraries to throw exceptions, since open source projects using the Google coding standard (and there are a lot of them) can't deal with exceptions. Of course, without exceptions, certain things in C++ are very hard to do. (By the way, I'm not interested in having the argument for/against exceptions here, just in noting that there is huge fragmentation here and reasonable people on both sides.)
A similar story could be told about all the other choices. The net effect is that we have to police a very large set of arbitrary style decisions that just wouldn't come up at all if we just used C.
C++ library APIs have binary compatibility issues. A lot of them. Just take a look at http://techbase.kde.org/Policies/Binary_Compatibility_Issues_With_C++. Again, how are we going to ensure that everyone follows these rules? It's nearly impossible. Considering the number of issues we've had maintaining API compatibility in Java, with Java's much simpler rules, this is just a deal-breaker. Whereas with C, the rules for maintaining binary compatibility are very simple.
C is available on every platform out there, even AIX. C++11 is only available on a subset of those platforms. This is another advantage of plain old C.
But more importantly, it's easier to bind other higher-level languages to C than it is to C++. For example, in Python you can use ctypes to easily wrap a C library: https://docs.python.org/2/library/ctypes.html. Do you want to use ctypes with C++? Then you're out of luck: http://stackoverflow.com/questions/1615813/how-to-use-c-classes-with-ctypes. A similar story could be told about golang, and most other high-level languages. You have to write a lot of boilerplate to wrap C++, and almost none for C.
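For instance, wrapping a C function from Python takes only a few lines of ctypes. This sketch loads the platform's libc and calls strlen; it assumes a Unix-like system where find_library can locate libc:

```python
import ctypes
import ctypes.util

# Load the C standard library. On Linux find_library("c") resolves to
# something like "libc.so.6"; CDLL(None) is a reasonable fallback there.
libc = ctypes.CDLL(ctypes.util.find_library("c") or None)

# Declare strlen's signature so ctypes marshals arguments correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"libhdfs"))  # 7
```

Exactly the same approach would let a Python program call into a C libhdfs; a C++ interface would need an extern "C" shim or generated bindings first.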
If we were writing a new daemon or something, then I might consider C++, even C++11. Yes, C++11 added some good things. auto was a good idea (borrowed from golang or someplace), and move constructors are nice. But none of it addresses the problems above, and all of it just adds more complexity for people to master. What we are writing is just a client, and it's not that thick. Especially the YARN client, which just makes some RPCs and that's it. And the code is nearly done.
I'm not interested in having a language flamewar here. C has some advantages, and C++ has another set. For this particular project, the former outweigh the latter. I'm very familiar with C++ and I don't need a lecture on its advantages, having been a user for a decade.
If you are interested in writing a C++ interface for libhdfs or libyarn, then by all means do that. I think this interface should be in a header file only, to avoid the binary compatibility issues I mentioned earlier. Since the header file would be compiled by each client, we would be free to change it whenever we liked without worrying about binary compatibility.

Haohui Mai
added a comment - 11/Jun/14 02:15
Currently, the libraries we depend on are: libuv, for portability primitives, protobuf-c, for protobuf functionality, expat, for XML parsing, and liburiparser, for parsing URIs. None of that functionality is provided by the C++ standard library, so your statement is false.
A lot of this code is not new. For example, we were using tree.h (which implements splay trees and rb trees), previously in libhdfs. The maintenance burden was not high. In fact, it was zero, because we never had to fix a bug in tree.h. So once again, your statement is just false.
htable.c got a review because it is new code. I would hardly call reviewing new code a "maintenance burden." And anyway, there is a standard C way to use hash tables... the hcreate_r, hsearch_r, and hdestroy functions. We would like to use the standard way, but Windows doesn't implement these functions.
I fail to understand what point you're trying to make. My point is that you can write much less code in a modern language with better standard libraries, which makes things much easier to review and maintain. For example, when you're working on trunk, how many times do you have to put up a 200 KB patch like this jira? How many big patches are in this feature branch? Please be considerate of the reviewers of the patch.
Firstly, the challenge of maintaining a consistent C++ coding style is very, very large. ...
For example, exceptions harm performance...
C++ library APIs have binary compatibility issues....
Arguably you can implement what you want in C++ and C equally well. Coding styles and performance can be a problem.
However, before any of them I'm much more concerned about the correctness of the current code. For example, I'm seeing that the code allocates hadoop_err on the common paths, and it has to clean it up on all error paths. I'm also seeing many calls to strcpy(), as well as calls to *printf() with non-constant format strings.
My questions are: (1) does the code contain no memory leaks, buffer overflows, or format string overflows? (2) does the code always pass function pointers with the correct type? I'm perfectly happy to +1 your patches as long as you can show your code is indeed free of these common defects.
Given the amount of code in the branch, it might be an issue worth looking at at some point, before a merge vote is called.

Colin P. McCabe
added a comment - 11/Jun/14 17:33 I'm also seeing many calls to strcpy(), as well as calls to *printf() with non constant format strings.
There are 10 calls to strcpy in the whole code base, which is 37 files now. All of them are in cases where we calculate the destination buffer size beforehand based on the source string size, so there is no problem.
printf and similar functions are not a problem because we have __attribute__((format(printf))), which warns about cases where the format string doesn't match the varargs. And we only ever use snprintf, vsnprintf, and the other functions which print into a buffer of a known size.
My questions are: (1) does the code contain no memory leaks, buffer overflows, or format string overflows? (2) does the code always pass function pointers with the correct type? I'm perfectly happy to +1 your patches as long as you can show your code is indeed free of these common defects.
The compiler checks whether function pointers are the correct type. We have compile-time checks that printf's arguments match its format string, we don't ever use non-constant format strings, and we use the versions that take a maximum length. I use valgrind to spot memory leaks.
I think running static and dynamic analysis tools on the code is always a good idea. Having good unit test coverage is also a good idea. The native code will always have burdens that Java doesn't have, because it is not garbage collected. But I think with care, those burdens can be managed in a client, just like we manage them in the existing libhdfs.

Haohui Mai
added a comment - 11/Jun/14 18:06 printf and similar functions are not a problem because we have __attribute__((format(printf))), which warns about cases where the format string doesn't match the varargs. And we only ever use snprintf, vsnprintf, and the other functions which print into a buffer of a known size.
Just a random grep: here is a call to vsprintf, not vsnprintf:
+ vsprintf(err->msg + class_len + SUFFIX_LEN, fmt, ap2);
+ va_end(ap2);
+ va_end(ap);
While trying to enforce good coding style is a good thing, it is a best-effort claim but not a proof. I'm not trying to question your credibility on the claim, as any human can make mistakes on such a large code base.
The point is that the type checker in the compiler gives you a proof automatically. With proper use, compilers do this job much better than humans, so please leave the job to the compiler.
Additionally, the caller can easily trigger a buffer overflow by passing "%s" as a format string. Alternatively, using stringstream will never have this problem. Note that the code uses the C++ runtime anyway, so why not just write the code in a type-safe way and let the type system do the proof for you?
I use valgrind to spot memory leaks.
valgrind is not a panacea. Valgrind is a dynamic tool that spots a leak only when the code runs into the buggy path. Many leaks happen in error-handling paths, and I'm skeptical that (1) the unit tests are able to cover all of them, and (2) you'll ever be able to run valgrind in production considering its overhead (10x~100x). Alternatively, the issue is largely addressed at C++ compile time if the code passes smart pointers with discipline.
In short, they are best-effort approaches to address the common but critical defects I've raised. With proper uses, a modern language like C++ is able to provide much higher assurance (type checked proofs) in security and reliability. Why not embracing these tools instead of putting heavy dependency and responsibility on code review to catch these bugs?

Suresh Srinivas
added a comment - 11/Jun/14 18:48 Can you please tone down the rudeness. The following comments are unnecessary for productive discussion:
I'm very familiar with C++ and I don't need a lecture on its advantages, having been a user for a decade.
You have already made your point on your c++ cred. Let's continue the discussion in the right tone.

Suresh Srinivas
added a comment - 11/Jun/14 18:54 Haohui Mai, can you please provide information on how many lines of code are unnecessary and can be replaced by c++ standard libraries? Of course that is one of the factors and not the only reason to choose the direction of this solution.

Colin P. McCabe
added a comment - 11/Jun/14 20:56 Just a random grep, here is a call to vsprintf but not vsnprintf:
The reason why that is a vsprintf is that we use vsnprintf above with the same format string to calculate the length of the buffer we'll need.
Additionally, the caller can easily trigger a buffer overflow by passing "%s" as a format string. Alternatively, using stringstream will never have this problem. Note that the code uses the c++ runtime anyway, so why not just write the code in a type-safe way and let the type system do the proof for you?
The caller can't "easily trigger a buffer overflow" in the code above, because if the caller passes %s followed by a string, we will compute the length of that string as part of fmt_len. Let's follow through the code.
fmt_len = vsnprintf(test_buf, sizeof(test_buf), fmt, ap);
if (fmt_len < 0) {
    ierr = (struct hadoop_err*)&HADOOP_VSNPRINTF_ERR;
    goto done;
}
class = errno_to_class(code);
class_len = strlen(class);
msg_len = class_len + SUFFIX_LEN + fmt_len + 1;
err->msg = malloc(msg_len);
if (!err->msg) {
    ierr = (struct hadoop_err*)&HADOOP_OOM_ERR;
    goto done;
}
strcpy(err->msg, class);
strcpy(err->msg + class_len, SUFFIX);
vsprintf(err->msg + class_len + SUFFIX_LEN, fmt, ap2);
So you can see that msg_len will depend on fmt_len. If the length of the string the caller supplies to %s increases, the length of the buffer we allocate will increase.
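For readers following along, here is a self-contained sketch of the same two-pass pattern (simplified and hypothetical — the function name and prefix handling are illustrative, not the actual hadoop_err.c code): the first vsnprintf only measures, the buffer is sized from that measurement, so the second, unbounded vsprintf cannot overflow by construction.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Simplified sketch of the two-pass sizing pattern (not the actual
 * hadoop_err.c code).  prefix stands in for the exception-class string. */
char *format_with_prefix(const char *prefix, const char *fmt, ...)
{
    va_list ap, ap2;
    int fmt_len;
    size_t prefix_len = strlen(prefix);
    char *msg;

    va_start(ap, fmt);
    va_copy(ap2, ap);
    fmt_len = vsnprintf(NULL, 0, fmt, ap);  /* pass 1: measure only (C99) */
    va_end(ap);
    if (fmt_len < 0) {
        va_end(ap2);
        return NULL;
    }
    msg = malloc(prefix_len + (size_t)fmt_len + 1);
    if (!msg) {
        va_end(ap2);
        return NULL;
    }
    strcpy(msg, prefix);
    vsprintf(msg + prefix_len, fmt, ap2);   /* pass 2: fits by construction */
    va_end(ap2);
    return msg;
}
```

The va_copy is needed because the first vsnprintf consumes the va_list, so the second pass must run against a fresh copy with the same arguments.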
valgrind is not a panacea. Valgrind is a dynamic tool; it spots a leak only when the code runs down the buggy path. Many leaks happen in the error-handling path, and I'm skeptical that (1) the unit tests are able to cover all of them, and (2) you'll ever be able to run valgrind in production considering its overheads (10x~100x). Alternatively, the issue is largely addressed at C++ compile time if the code passes smart pointers with discipline.
C++ is not a panacea. I have worked on large C++ projects in the past. They have memory leaks too, sometimes quite serious ones. All of them ran valgrind, thread sanitizer, and other dynamic tools to spot the leaks. Once you get a complex object ownership structure with shared_ptr, unique_ptr, and auto_ptr, plus passing bare references and pointers... it's easy to make mistakes.
In short, these are best-effort approaches to the common but critical defects I've raised. With proper use, a modern language like C++ is able to provide much higher assurance (type-checked proofs) of security and reliability. Why not embrace these tools instead of placing heavy dependency and responsibility on code review to catch these bugs?
So far, we haven't found any defects. And C++ doesn't provide "type checked proofs" of anything, since it is an unsafe language (much like C, which it's based on), which supports arbitrary casting, pointer aliasing, arbitrary code execution, and so on. If you wanted provable correctness, you might try something like SML or Haskell plus a theorem prover like Coq. But what does this have to do with a native client?
You have already made your point on your c++ cred. Let's continue the discussion in the right tone.
Suresh, I apologize if this came across as rude. My intention was just to point out that I have considered the merits of C versus C++. I have experience with both. It seemed like there was an assumption that I was not familiar with C++.
I feel somewhat frustrated, because I don't think JIRA is a suitable forum for programming language advocacy. This JIRA doesn't have anything to do with C versus C++. It is about the RPC code, which was already reviewed and committed. There are plenty of places on the web advocating a particular programming language-- does this have to be one of them? If I opened a JIRA about switching Hadoop to Scala, would the discussion continue "in the right tone"? If I pointed out a bunch of stuff that would be "fixed by scala" (that wasn't actually buggy), what would the tone be then? (Note that I have nothing against Scala-- it's a fine language.)

Haohui Mai
added a comment - 11/Jun/14 22:31 So far, we haven't found any defects.
So far I'm yet to be convinced that the code is of good quality. I can see two implications of "haven't found any defects": either the code is of good quality, or the code has not yet gone through full-scale reviews and testing. I'm unsure which implication I should take away. I haven't gone through this patch, but other patches might have some similar issues:
https://issues.apache.org/jira/browse/HADOOP-10640?focusedCommentId=14025478&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14025478
Due to the complexity of the code, I'm deeply concerned that it cannot be properly reviewed to at least the same quality as the rest of the code in Hadoop.
This JIRA doesn't have anything to do with C versus C++. It is about the RPC code, which was already reviewed and committed.
You're missing the point. From the perspective of a reviewer, the key question to answer is what quality the code is currently in. There are two problems I can see with the current state of the code:
The code is yet to be separated into logical pieces that can be thoroughly reviewed.
The control flow involves several heap allocations, and it is nontrivial to ensure that there are no resource leaks.
I call out C++ because it is widely available when developing native code. You're more than welcome to do it in C if you can address the above requirements. I'm not sure what your plan is for this branch, but if you plan to merge it to trunk, in my opinion the above comments have to be properly addressed.
And C++ doesn't provide "type checked proofs" of anything, since it is an unsafe language (much like C, which it's based on). If you wanted provable correctness, you might try something like SML or Haskell plus a theorem prover like Coq. But what does this have to do with a native client?
At the very least, the code should be free of the following common defects: resource leaks, buffer overflows, and control-flow hijacking. It would be great if you can fix your code to systematically show that. I call out C++ because it allows you to make a much stronger claim that the code is correct (obviously it can't save you if you intentionally screw it up), but it seems the current code does not even have this luxury.
By the way, if you're interested in how far off C/C++ are from a memory-safe language, you might want to take a look at the SAFECode project (http://safecode.cs.illinois.edu/index.html). My impression is that the current code is too far off.
If I opened a JIRA about switching Hadoop to Scala, would the discussion continue "in the right tone"? If I pointed out a bunch of stuff that would be "fixed by scala" (that wasn't actually buggy), what would the tone be then? (Note that I have nothing against Scala-- it's a fine language.)
There is more interest in writing more robust and efficient code than in which language you implement it in. Personally I'm fairly open to this discussion, if you can demonstrate the benefits as I pointed out in this jira.

Colin P. McCabe
added a comment - 12/Jun/14 17:52 If you want to help improve the code, reviews and testing are welcome. We always respond to questions or comments about a particular piece of code, just as I did in this thread. There is a lot of native code in Hadoop already, even outside this branch. For example, libwebhdfs, libhadoop, fuse_dfs, and the JNI HDFS stuff. All of those projects are good places to start if you want to make a contribution. That would be constructive. I don't think this thread is constructive.

Haohui Mai
added a comment - 12/Jun/14 19:03 Just to recap my comments:
The patches in this branch are too big to review. Is it possible to separate utility functions in separate jiras to facilitate reviews?
Can you post your general guidelines on how the code uses standard library functions and how it prevents buffer overflows?
How are resource leaks prevented systematically in the code, especially when there are multiple gotos in a function?
Especially for (2) and (3), we should rely on automatic / semi-automatic tools to do that – it should not push the responsibility onto the reviewers, given the amount of code that needs to be reviewed. (The size of the patches has already exceeded 500k.)
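One widely used C answer to the "multiple gotos" question is the single-cleanup-label idiom. A hypothetical sketch (not code from the branch) of the discipline: every resource has exactly one owner variable, initialized to NULL, and one label releases whatever was acquired, whichever error path ran.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical illustration of the single-cleanup-label idiom.
 * Reads the first line of a file into a heap buffer owned by the caller. */
int copy_first_line(const char *path, char **out)
{
    FILE *fp = NULL;
    char *buf = NULL;
    int ret = -1;

    fp = fopen(path, "r");
    if (!fp)
        goto done;
    buf = malloc(256);
    if (!buf)
        goto done;
    if (!fgets(buf, 256, fp))
        goto done;
    *out = buf;
    buf = NULL;   /* ownership transferred to the caller */
    ret = 0;
done:
    free(buf);    /* free(NULL) is a no-op */
    if (fp)
        fclose(fp);
    return ret;
}
```

Because every path exits through the one label, a reviewer only has to check that each owner variable is NULL-initialized and nulled on ownership transfer, rather than auditing every goto individually.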

Chris Douglas
added a comment - 12/Jun/14 20:14 - edited Haohui Mai, can you be more specific about the code you've found difficult to review and audit, and what could be done to improve its readability? A treatise on buffer overflows and memory leaks will not eliminate them; if Colin were to produce it, how would it help you? Comparing a hypothetical implementation against a partial one by arguing about language features is very indirect. As an example, Binglin Chang's questions showed that he had made an effort to understand the code, and Colin's answers provided some insight into the approach. If the code is so impenetrable that review is truly impossible, please consider demonstrating the virtues of C++11 in a competing branch.
Colin P. McCabe, I looked at the patch attached to this JIRA. Some of it was familiar and/or requires cursory review (makefiles, a queue impl from 1994 *BSD, vint utility methods), but other files (common/string.*, a splay tree impl from...?, protoc-gen-hrpc.cc) are less obvious, and you didn't explain them until asked. If you can help others follow your progress in the branch, they won't need to ask the "right" questions to understand your approach. This can be in design docs[1], thorough comments[2], and calling out which files are new code (requiring the rigorous audit that Haohui Mai rightly expects), and what has been working for 20+ years. You can solicit better questions.
[1] Is there a design doc for this? I looked on HADOOP-10388, but didn't find one.
[2] For example, one of the few comments in this patch was for hrpc_conn_read_alloc, which explained esoteric details of the interaction with libuv.

Colin P. McCabe
added a comment - 13/Jun/14 22:59 [1] Is there a design doc for this? I looked on HADOOP-10388 , but didn't find one.
I posted a design doc today on HADOOP-10388 . Check it out: https://issues.apache.org/jira/secure/attachment/12650406/2014-06-13_HADOOP-10388_design.pdf
Colin Patrick McCabe, I looked at the patch attached to this JIRA. Some of it was familiar and/or requires cursory review (makefiles, a queue impl from 1994 *BSD, vint utility methods), but other files (common/string.*, a splay tree impl from...?, protoc-gen-hrpc.cc) are less obvious, and you didn't explain them until asked. If you can help others follow your progress in the branch, they won't need to ask the "right" questions to understand your approach. This can be in design docs[1], thorough comments[2], and calling out which files are new code (requiring the rigorous audit that Haohui Mai rightly expects), and what has been working for 20+ years. You can solicit better questions.
protoc-gen-hrpc.cc generates the thread-safe RPC wrappers. It plugs into protobuf's code generation subsystem. I'd be happy to answer any questions about that. I agree that it's helpful to call out what's new and what's moved (I tried to do this in the design doc; hopefully it clears some things up). Let me know if there are also areas that could be better commented... things are still a work in progress. I am using doxygen (it looks a lot like JavaDoc), so hopefully that will provide some context.

Haohui Mai
added a comment - 16/Jun/14 18:06 - edited If the code is so impenetrable that review is truly impossible, please consider demonstrating the virtues of C++11 in a competing branch.
As a weekend project, I've uploaded a proof-of-concept patch in C++ for this jira. It implements a synchronous RPC engine and the exists() call for libhdfs.
The main point is not (and never was) to demonstrate that c++ is superior to c, but to show that it is practical to use current technology to improve code quality and to cut maintenance costs down the road. In my personal opinion it is important to ensure that the code can be easily maintained by the community. The patch demonstrates a viable way to approach this goal.
Just a couple points to highlight:
Simpler implementation. Thanks to the standard libraries, there is no need to put things like RB trees, hash tables, and linked lists into the patch and ask for reviews. Though not fully equivalent, the patch is an order of magnitude smaller in size.
Automatic resource management. Explicit resource management (e.g., malloc() / free(), and closing sockets) is no longer required. The lifetime of a resource matches the scope of the code. It is a systematic way to avoid resource leaks.
Stronger claims from the type system. A modern language allows the code to be written in a mostly type-safe way, where the type system is able to show that the majority of the code is safe. There is only one cast required in the code, compared to many (one per linked-list use) in the old implementation. New constructs like std::array also allow integrating bounds checks into the code to help prevent buffer overflows.
Just to reiterate, by no means am I claiming that the current patch is perfect and free of bugs. People, myself included, make mistakes all the time. A modern tool, however, can help people catch mistakes at the beginning of the development cycle and avoid them. Thoughts?

Colin P. McCabe
added a comment - 17/Jun/14 03:47 To me this is not that interesting. There are a lot of clients already out there in various languages... C, C++, even golang. Your code is much smaller because you didn't implement generation of type-safe RPC wrappers (you just wrote out everything manually in HdfsImpl::Exists), didn't integrate the existing JNIFS code, are not parsing the config XMLs, don't handle errors, etc. etc. boost::asio doesn't provide equivalent functionality to libuv. It does wrap a lot of network stuff, but doesn't handle things like condition variables, signals, etc. etc. It uses exceptions, which are problematic. Yes, you can use boost for some of that stuff, but the boost versions are often not that good. For example, the boost condition variable actually can let your thread wake up without recovering the lock (it then has to sleep again). Boost in general has a versioning problem which is really bad for libraries. If your library links against version X of boost, any application linking to you needs to use the same version... otherwise bad stuff happens. You could link it statically, but it might be too big for that to work (I've never tried).
If you're interested in something even smaller and more minimal, you can check out this: https://github.com/toddlipcon/libhrpc/ Though I think that is for an out-of-date version of hadoop RPC at this point.

Binglin Chang
added a comment - 17/Jun/14 06:36 I'd like to add more inputs on this:
I wrote a c++ rpc/hdfs/yarn client (https://github.com/decster/libhadoopclient). It uses c++11, so it does not need boost (although many people use boost, they use it only for header-only libraries, and the public headers do not include boost, so there are no version issues).
c++'s main concern is abi compatibility; this can be resolved by using c or simple c++ class public headers, hiding the real implementation.
I think some issues with using c++ / points in favor of using c are:
1. centos does not have enough support for c++11; c++11 is not generally available yet
2. to remain compatible with libhdfs; since libhdfs is written in c, we might just continue using c as well
There are also some concerns about using C:
1. The protobuf-c library is just not as reliable as the official protobuf library, which is maintained and verified by Google and many other companies/projects. I read some of the protobuf-c code; it uses a reflection-style implementation for serializing/deserializing, so performance, security, and compatibility may all be at risk. I see https://github.com/protobuf-c/protobuf-c only has 92 stars.
2. malloc/free/memset can easily produce buggy code and need additional care and checks; I have seen many bugs of that kind recently (HDFS-6534, HADOOP-10640, HADOOP-10706). It is OK to use C, but we may need more care and effort.
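The kind of manual-memory bug alluded to here can be illustrated with a small hypothetical example: the `memset` argument order is easy to get wrong, and a failed `malloc` is easy to leave unchecked.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical allocator illustrating the usual C pitfalls.
// A buggy version might look like:
//
//   char *buf = (char *)malloc(len);
//   memset(buf, len, 0);   // BUG: zeroes 0 bytes, and malloc was unchecked
//
// Careful version: check the allocation and use the (ptr, value, size) order.
char *alloc_zeroed(size_t len) {
    char *buf = static_cast<char *>(malloc(len));
    if (!buf) {
        return nullptr;       // propagate the allocation failure to the caller
    }
    memset(buf, 0, len);      // correct argument order: (ptr, value, size)
    return buf;
}
```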
About JNIFS: why do we need JNIFS if we already have nativefs? Using dlopen/dlsym to replace the JNI APIs is not trivial if both the compile-time and runtime dependencies need to be removed.

Haohui Mai
added a comment - 17/Jun/14 23:03 - edited didn't integrate the existing JNIFS code, are not parsing the config XMLs, don't handle errors, etc. etc.
Correct me if I'm wrong – based on my understanding, the original patch in HADOOP-10369 implements none of the above. Both implementations can benefit from other jiras (e.g. HADOOP-10446).
..., because you didn't implement generation of type-safe RPC wrappers (you just wrote out everything manually in HdfsImpl::Exists),
This is implemented as C++ code in the original patch. It can be brought in if it is required.
For what it's worth, do you agree that the code implements comparable functionality with a much smaller code base?
boost::asio doesn't provide equivalent functionality to libuv. It does wrap a lot of network stuff, but doesn't handle things like condition variables, signals, etc. etc., ..., For example, the boost condition variable actually can let your thread wake up without reacquiring the lock (it then has to sleep again).
I'm unsure why this is relevant, as I don't see the current code using condition variables or signals. My guess is that you might need them to implement RPCs with timeouts. I'm yet to be convinced that this is required in the C++ implementation, as std::future provides a much better abstraction for that.
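The std::future-based abstraction suggested here might look like the following sketch (the function names are hypothetical, not from the patch): the reactor thread would fulfill a promise when the RPC response arrives, and the caller gets timeouts via wait_for with no condition variables in the client-facing API.

```cpp
#include <chrono>
#include <future>
#include <string>

// Hypothetical async RPC call: a real client would complete the promise
// from the reactor thread when the response arrives on the wire.
std::future<std::string> async_get_listing(std::promise<std::string> &p) {
    return p.get_future();
}

// The caller gets an RPC timeout "for free" via std::future::wait_for;
// returns true only if the response arrived within the deadline.
bool wait_with_timeout(std::future<std::string> &f, int millis) {
    return f.wait_for(std::chrono::milliseconds(millis)) ==
           std::future_status::ready;
}
```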
It uses exceptions, which are problematic.
This is not a fundamental part of the API. The current implementation uses the exception-free variants of the library calls whenever possible. The std::system_error object carries the error conditions.
The current implementation of the rpc-send path, however, might throw bad_alloc. It can be fixed by limiting the maximum size of the outgoing buffer.
I say it is safe to define BOOST_NO_EXCEPTIONS and compile the code with -fno-exceptions.
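The exception-free style described above (error conditions carried in error-code objects rather than thrown) might look like this sketch, using std::error_code in place of boost::system::error_code; the function and its behavior are hypothetical illustrations, not code from the patch.

```cpp
#include <cerrno>
#include <system_error>

// Hypothetical connect wrapper: instead of throwing std::system_error,
// the error condition is reported through an out-parameter, so the code
// can be built with -fno-exceptions.
void rpc_connect(const char *host, std::error_code &ec) {
    if (host == nullptr || host[0] == '\0') {
        // report the failure; the caller decides what to do with it
        ec = std::error_code(EINVAL, std::generic_category());
        return;
    }
    ec.clear();    // success: leave the error code empty
    // ... actual socket setup would go here ...
}
```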
Yes, you can use boost for some of that stuff, but the boost versions are often not that good.
Boost in general has a versioning problem which is really bad for libraries. If your library links against version X of boost, any application linking to you needs to use the same version... otherwise bad stuff happens. You could link it statically, but it might be too big for that to work (I've never tried).
The only thing required is boost::system::error, which has a fairly small footprint (~70K on Mac OS X). The size is comparable to the size of protobuf-c (~75K on Mac OS X).
To me this is not that interesting. There are a lot of clients already out there in various languages... C, C++, even golang.
In my opinion the same argument applies to any client; none of them is superior to the others. Everybody can write their own client. Contributions are welcome, but for the community to take over the code and maintain it, it is important to ensure that the code can be easily changed, reviewed, and maintained by other contributors.
This key point, however, remains unaddressed in the current branch.
If you're interested in something even smaller and more minimal, you can check out this: https://github.com/toddlipcon/libhrpc/ Though I think it is for an out-of-date version of Hadoop RPC at this point.
Given that many Hadoop RPC clients happily sit on GitHub and the protocol remains pretty stable, would it be more appropriate to host this project on GitHub if you don't have time to address the comments?

Haohui Mai
added a comment - 17/Jun/14 23:20 Thanks Binglin Chang for the input. This could be another direction to go. Some of my thoughts:
1. CentOS does not have enough support for C++11; C++11 is not generally available yet.
The POC patch uses a few pieces of C++11 syntactic sugar, which can be easily implemented in C++ with TR1 (note that the current native code is already using std::shared_ptr from TR1). The tougher part is that the private APIs pass std::error_code objects for error conditions; this can be fixed by passing int or boost::system::error_code instead.
2. Remaining compatible with libhdfs: since libhdfs is written in C, we might just continue using C as well.
If the ultimate goal is to implement the libhdfs interface, there should be no C++ symbols exported in the library, so there is really no compatibility issue in this regard. As a result, C++ remains an implementation detail only.
1. The protobuf-c library is just not as reliable as the official protobuf library, which is maintained and verified by Google and many other companies/projects. I read some of the protobuf-c code; it uses a reflection-style implementation for serializing/deserializing, so performance, security, and compatibility may all be at risk. I see https://github.com/protobuf-c/protobuf-c only has 92 stars.
I think this is a legit concern. It might be better to depend only on widely deployed, well-tested libraries.

Allen Wittenauer
added a comment - 19/Jun/14 19:18 Boost in general has a versioning problem which is really bad for libraries. If your library links against version X of boost, any application linking to you needs to use the same version... otherwise bad stuff happens.
FWIW, I agree wholeheartedly. Boost's major, blocker-level drawback is its compatibility with itself. Enough of one that I'm more than willing to -1 any patch that uses it. The operational issues of trying to support multiple versions of Boost on a machine are too high of a wall to climb. Been there, done that, didn't even get a t-shirt.

Haohui Mai
added a comment - 19/Jun/14 21:33 Boost's major, blocker-level drawback is its compatibility with itself.
Just to make sure we're on the same page: I think the Boost library will be linked statically and no symbols will be exported. The user of the library should not be aware of the existence of Boost. Therefore I don't quite understand where the compatibility issue is coming from. I played around locally and it worked for me. I can update the patch if needed.
Does that address your concern? If you don't like Boost, I have no problem using an in-house implementation like what Binglin Chang did, or libuv.

Allen Wittenauer
added a comment - 20/Jun/14 00:22 Not really, no, because it becomes a code maintenance nightmare whenever the API changes. "Oh, we have to compile with this specific version of boost." (Ironically, I was just told a story about someone who had this exact problem with Impala just today.) Historically, the Hadoop project has been trying to jettison those types of dependencies. (Hai forrest.)
As to libuv, I'm still looking at Colin's patch. I have some concerns about libuv's stability as well (esp. given http://upstream-tracker.org/versions/libuv.html ) since I lack experience with it... unlike Boost.

Haohui Mai
added a comment - 20/Jun/14 00:50 "Oh, we have to compile with this specific version of boost."
I might be overly optimistic about this one – the main dependency is boost::asio, which is being pushed into C++ TR2. The APIs are indeed quite stable right now.
I'm still leaning towards boost::asio for now since it is widely deployed and well tested. I'm also open to using other libraries, though we need to deal with the API compatibility of those libraries as well. Rolling our own library would be the last choice, but I'm open to that as well.