Hadoop uses a general-purpose RPC mechanism. The main idea is to define a single interface, shared by the server and the client. The client uses the java.lang.reflect dynamic proxy pattern to generate an implementation of the RPC interface. See Java theory and practice: Decorating with dynamic proxies for more details.
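As a rough sketch (PingProtocol and its ping method are our own illustration, not part of Hadoop, and the exact RPC.getProxy overload varies a little between Hadoop versions), the shared interface and the client-side proxy creation might look like this:

import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.RPC;

public class PingClientSketch {
    // The interface shared by client and server; only the server implements it.
    public interface PingProtocol {
        long versionID = 1L;
        String ping(String msg);
    }

    public static void main(String[] args) throws Exception {
        // RPC.getProxy returns a dynamic proxy that implements PingProtocol
        // and forwards every method call to the remote server.
        PingProtocol proxy = RPC.getProxy(
                PingProtocol.class, PingProtocol.versionID,
                new InetSocketAddress("localhost", 5121), new Configuration());
        System.out.println(proxy.ping("hello"));
    }
}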

When the client calls a method, say, String ping(String msg), in the proxy class, it would:

1. Serialize the arguments for the method (msg in our example).
2. Connect to the RPC server.
3. Tell the server to execute the method with its arguments (in our example, we'll tell the server to execute the method ping, with a single argument msg).
4. Deserialize the server's response and return it to the caller.
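Conceptually, the handler behind the proxy does something like the following. This is only an illustration of the dynamic proxy pattern, reusing the PingProtocol interface from the sketch above; it is not Hadoop's actual Invoker implementation:

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Illustrative only: a handler that turns method calls into fake "RPC requests".
class SketchInvoker implements InvocationHandler {
    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        // 1. Serialize the arguments.
        // 2. Connect to the RPC server.
        // 3. Tell the server to execute the method with its arguments.
        // 4. Deserialize the response and return it to the caller.
        System.out.println("would send: " + method.getName() + "(" + args[0] + ")");
        return "pong"; // a real handler returns the deserialized server reply
    }

    // Usage: wrap the handler in a java.lang.reflect.Proxy for PingProtocol.
    static PingClientSketch.PingProtocol newProxy() {
        return (PingClientSketch.PingProtocol) Proxy.newProxyInstance(
                PingClientSketch.PingProtocol.class.getClassLoader(),
                new Class<?>[] { PingClientSketch.PingProtocol.class },
                new SketchInvoker());
    }
}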

Note that currently, a production Hadoop RPC interface must extend and implement the VersionedProtocol interface. It's used by ProtocolProxy.fetchServerMethods to make sure the client and server protocol versions match. We neglected this detail for the sake of simplicity.
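In production the interface would therefore look more like the sketch below. VersionedPingProtocol is still our made-up example; the two version-query methods are shown as comments because they belong to the server-side implementation, and the ProtocolSignature helper is the way Hadoop's own servers typically answer them:

import java.io.IOException;
import org.apache.hadoop.ipc.VersionedProtocol;

public interface VersionedPingProtocol extends VersionedProtocol {
    long versionID = 2L;
    String ping(String msg) throws IOException;
}

// The server-side implementation must also implement the two methods declared
// by VersionedProtocol, which is what ProtocolProxy uses to compare client and
// server versions:
//
//   public long getProtocolVersion(String protocol, long clientVersion) {
//       return VersionedPingProtocol.versionID;
//   }
//
//   public ProtocolSignature getProtocolSignature(String protocol,
//           long clientVersion, int clientMethodsHash) throws IOException {
//       return ProtocolSignature.getProtocolSignature(this, protocol,
//               clientVersion, clientMethodsHash);
//   }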

We now understand how to implement a basic Hadoop RPC interface, and even how to add methods to it while maintaining backwards compatibility. But what happens under the hood? What goes over the wire when the proxy issues ping?

The logic behind the RPC mechanism is not very easy to follow. There are two main tasks to perform when issuing an RPC call: serializing and deserializing the Java objects being sent, and transporting them over the wire with some protocol.

The wire protocol is defined in the org.apache.hadoop.ipc.Client and org.apache.hadoop.ipc.Server classes. A client call starts its life in the Client.call method. There we create a new connection (or pull one from the connection pool) and begin an RPC handshake.

Version is probably the RPC protocol version, and is currently 9. ServiceClass defaults to 0. The AuthProtocol determines whether or not to use SASL authentication before starting actual RPC calls. SASL will be used if hadoop.security.authentication is set to anything other than simple authentication, or if the client explicitly supplied an authentication token when creating the proxy.
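A minimal sketch of the handshake bytes the client writes, mirroring what Client.writeConnectionHeader does in the Hadoop version discussed here: a short "hrpc" magic preamble, then the three single-byte fields described above. The exact constants may differ in other versions:

import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class ConnectionHeaderSketch {
    // Writes the fixed header that opens every Hadoop RPC connection:
    // the "hrpc" magic, the RPC version, the service class, and the
    // auth-protocol byte (0 means no SASL negotiation).
    static void writeConnectionHeader(OutputStream rawOut) throws IOException {
        DataOutputStream out = new DataOutputStream(rawOut);
        out.write("hrpc".getBytes(StandardCharsets.UTF_8)); // magic preamble
        out.writeByte(9);  // RPC protocol version, currently 9
        out.writeByte(0);  // service class, defaults to 0
        out.writeByte(0);  // AuthProtocol: 0 selects plain (non-SASL) RPC
        out.flush();
    }
}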

Next, we'll set the connection context. The connection context is a Protocol Buffers message, IpcConnectionContextProto, generated by makeIpcConnectionContext. The context object defines the destination RPC protocol name (we defined it as "ping", using Java annotations in the example above) and the user who calls the protocol. We'll say more about this object when we discuss RPC security. Note that unlike other Protocol Buffers objects, the context is delimited by a 4-byte int, not by a varint.
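To make that framing concrete, here is a hedged sketch of writing a length-prefixed Protocol Buffers message the way the text describes, as opposed to protobuf's usual varint-delimited writeDelimitedTo. It works for any generated message, including IpcConnectionContextProto:

import java.io.DataOutputStream;
import java.io.IOException;
import com.google.protobuf.Message;

class IntDelimitedFramingSketch {
    // Writes a protobuf message prefixed by a 4-byte big-endian length,
    // instead of the varint length that msg.writeDelimitedTo(out) would use.
    static void writeIntDelimited(DataOutputStream out, Message msg) throws IOException {
        byte[] bytes = msg.toByteArray();
        out.writeInt(bytes.length); // plain 4-byte int, not a varint
        out.write(bytes);
        out.flush();
    }
}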

The payload header, RpcRequestHeaderProto, is a Google Protocol Buffers object, hence serializable by Protocol Buffers. The main contents of this header are the callId, which identifies the RPC request within the connection, and the clientId, which is a UUID generated for the client. Negative callIds are reserved for meta-RPC purposes, e.g., the SASL handshake.
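A sketch of building that header with the generated protobuf classes; the field and enum names follow Hadoop's RpcHeader.proto, but treat the exact builder calls as an approximation rather than a copy of Client.java:

import java.nio.ByteBuffer;
import java.util.UUID;
import com.google.protobuf.ByteString;
import org.apache.hadoop.ipc.protobuf.RpcHeaderProtos.RpcKindProto;
import org.apache.hadoop.ipc.protobuf.RpcHeaderProtos.RpcRequestHeaderProto;

class RequestHeaderSketch {
    // Builds the per-call header: which engine decodes the payload (rpcKind),
    // the call id within this connection, and the client's 16-byte UUID.
    static RpcRequestHeaderProto makeHeader(int callId, UUID clientUuid) {
        byte[] uuidBytes = ByteBuffer.allocate(16)
                .putLong(clientUuid.getMostSignificantBits())
                .putLong(clientUuid.getLeastSignificantBits())
                .array();
        return RpcRequestHeaderProto.newBuilder()
                .setRpcKind(RpcKindProto.RPC_WRITABLE) // payload is a Writable
                .setRpcOp(RpcRequestHeaderProto.OperationProto.RPC_FINAL_PACKET)
                .setCallId(callId)                     // negative ids are reserved
                .setClientId(ByteString.copyFrom(uuidBytes))
                .build();
    }
}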

The payload itself is a serialized representation of a method call. Serializing and deserializing the objects is done by an RpcEngine; the default one is WritableRpcEngine, which uses Hadoop's native Writable serialization for Java objects. A Protocol Buffers based engine can be selected by setting rpc.engine.<protocol_name>=ProtobufRpcEngine, but we'll concentrate on the default implementation.
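If you do want the protobuf engine, you can also select it from code; RPC.setProtocolEngine sets that same rpc.engine.<protocol_name> key on the Configuration (PingProtocol is still our made-up protocol from the sketch above):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.ipc.ProtobufRpcEngine;
import org.apache.hadoop.ipc.RPC;

class EngineSelectionSketch {
    static Configuration protobufConf() {
        Configuration conf = new Configuration();
        // Equivalent to setting rpc.engine.<protocol_name>=ProtobufRpcEngine
        // in the configuration files.
        RPC.setProtocolEngine(conf, PingClientSketch.PingProtocol.class,
                ProtobufRpcEngine.class);
        return conf;
    }
}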

Naturally, WritableRpcEngine writes the payload as a Writable object, whose serialization is defined by the Invocation object's write method, WritableRpcEngine.Invocation.write. The weird mix of Protocol Buffers and Writables is probably due to the ongoing migration to Protocol Buffers based RPC.
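Paraphrased very roughly (this is not a copy of Hadoop's method; the real Invocation.write also records the protocol name, client version and a hash of the client's methods), the Writable payload boils down to the method name followed by each argument written with ObjectWritable:

import java.io.DataOutput;
import java.io.IOException;
import java.lang.reflect.Method;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.ObjectWritable;
import org.apache.hadoop.io.Text;

class InvocationWriteSketch {
    // A stripped-down imitation of WritableRpcEngine.Invocation.write:
    // the method name, the argument count, and each argument as an ObjectWritable.
    static void write(DataOutput out, Method method, Object[] args, Configuration conf)
            throws IOException {
        Text.writeString(out, method.getName());
        out.writeInt(args.length);
        for (int i = 0; i < args.length; i++) {
            ObjectWritable.writeObject(out, args[i], method.getParameterTypes()[i], conf);
        }
    }
}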

The RpcResponseHeaderProto contains the callId, used to match the response to the pending call, an indicator of whether or not the call was successful, and, in case it wasn't, information about the error.
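As a hedged sketch, reading and checking that header on the client side could look like this, assuming the usual varint-delimited protobuf framing for the header itself (field names follow RpcHeader.proto); the engine-specific return value that follows it is left out:

import java.io.DataInputStream;
import java.io.IOException;
import org.apache.hadoop.ipc.protobuf.RpcHeaderProtos.RpcResponseHeaderProto;

class ResponseHeaderSketch {
    // Reads the response header and throws if the server reported an error.
    static RpcResponseHeaderProto readHeader(DataInputStream in) throws IOException {
        RpcResponseHeaderProto header = RpcResponseHeaderProto.parseDelimitedFrom(in);
        if (header.getStatus() != RpcResponseHeaderProto.RpcStatusProto.SUCCESS) {
            throw new IOException("call " + header.getCallId() + " failed: "
                    + header.getErrorMsg());
        }
        return header; // the engine-specific return value follows on the stream
    }
}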

The last detail of the RPC protocol is the ping message. If the RPC client times out waiting for input, it will send 0xFF_FF_FF_FF, probably to keep the connection alive. The server knows to ignore such headers. This behaviour can be disabled by setting ipc.client.ping=false, and is controlled by the PingInputStream class.

Let's have a look at a real example, using the simple ping RPC server from our example git repository, hadoop_rpc_walktrhough.

First, we'll set up a fake server that records client traffic, and then we'll clone and run the simple RPC client.

Since the nc -l 5121 server does not respond, the client will eventually send a ping message.

// ping request
0xff, 0xff, 0xff, 0xff,

How would we get the server's message? Let's use the client request file we recorded earlier and feed it to nc, which will impersonate a Hadoop RPC client. Note that we need to wait for input, hence the sleep before closing nc's input stream.

These are the essential details of a Hadoop RPC server. You can see example Go code that parses Hadoop RPC in the main_test.go test, which walks through an example static binary output, or in a very simple ping RPC client.