The ZeroMQ Message Transport Protocol (ZMTP) is a transport layer protocol for exchanging messages between two peers over a connected transport layer such as TCP. This document describes ZMTP 3.0. The major change in this version is the addition of security mechanisms and the removal of hard-coded connection metadata (socket type and identity) from the greeting.

Preamble

Copyright (c) 2009-2014 iMatix Corporation & Contributors

This Specification is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This Specification is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, see <http://www.gnu.org/licenses>.

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Goals

The ZeroMQ Message Transport Protocol (ZMTP) is a transport layer protocol for exchanging messages between two peers over a connected transport layer such as TCP. This document describes version 3.0 of ZMTP. ZMTP solves a number of problems we face when using TCP carry messages:

TCP carries a stream of octets with no delimiters, but we want to send and receive discrete messages. Thus, ZMTP reads and writes frames consisting of a size and a body.

We need to carry metadata on each frame (such as, whether the frame is part of a multipart message, or not). ZMTP provides a flags field in each frame for metadata.

We need to be able to talk to older implementations, so that our framing can evolve without breaking existing implementations. ZMTP defines a greeting that announces the implemented version number, and specifies a method for version negotiation.

We need security so that peers can be sure of the identity of the peers they talk to, and so that messages cannot be tampered with, nor inspected, by third parties. ZMTP defines a security handshake that allows peers to create a secure connection.

We need a range of security protocols, from clear text (no security, but fast) to fully authenticated and encrypted (secure, but slow). Further, we need the freedom to add new security protocols over time. ZMTP defines a way for peers to agree on an extensible security mechanism.

We need a way to carry metadata about the connection, after the security handshake. ZMTP defines a standard set of metadata properties (socket type, identity, etc.) that peers exchange after the security mechanism.

We need to write down these solutions in a way that is easy for teams to implement on any platform and in any language. ZMTP is thus specified as a formal protocol (this document) and made available to teams under a free license.

We need guarantees that people will not create private forks of ZMTP, thus breaking interoperability. ZMTP is thus licensed under the GPLv3, so that any derived versions must also be made available to users of software that implements it.

Changes from ZMTP 2.0

The changes in version 3.0 of the specification are a simpler version numbering scheme, the addition of security mechanisms, the removal of hard-coded connection metadata (socket type and identity) from the greeting, the addition of connection metadata, and the addition of commands.

Implementation

Overall Behavior

The two peers agree on the version and security mechanism of the connection by sending each other data and either continuing the discussion, or closing the connection.

The two peers handshake the security mechanism by exchanging zero or more commands. If the security handshake is successful, the peers continue the discussion, otherwise one or both peers closes the connection.

Each peer then sends the other metadata about the connection as a final command. The peers may check the metadata and each peer decides either to continue, or to close the connection.

Each peer is then able to send the other messages. Either peer may at any moment close the connection.

Version Negotiation

ZMTP provides asymmetric version negotiation. A ZMTP peer MAY attempt to detect and work with older versions of the protocol. It MAY also demand ZMTP capabilities from its peers.

In the first case, after making or receiving a connection, the peer SHALL send to the other a partial greeting sufficient to trigger version detection. This is the first 11 octets of the greeting (signature and major version number). The peer SHALL then read the first eleven octets of greeting sent by the other peer, and determine whether to downgrade or not. The specific heuristics for each older ZMTP version are explained in the section "Backwards Interoperability". In this case the peer MAY use the padding field for older protocol detection (we explain the specific known case below).

In the second case, after making or receiving a connection, the peer SHALL send its entire greeting (64 octets) and SHALL expect a matching 64 octet greeting. In this case the peer SHOULD set the padding field to binary zeros.

In either case, note that:

A peer SHALL NOT assign any significance to the padding field and MUST NOT validate this nor interpret it in any way whatsoever.

A peer MUST accept higher protocol versions as valid. That is, a ZMTP peer MUST accept protocol versions greater or equal to 3.0. This allows future implementations to safely interoperate with current implementations.

A peer SHALL always use its own protocol (including framing) when talking to an equal or higher protocol peer.

A peer MAY downgrade its protocol to talk to a lower protocol peer.

If a peer cannot downgrade its protocol to match its peer, it MUST close the connection.

Topology

ZMTP is by default a peer-to-peer protocol that makes no distinction between the clients and servers.

However, the security mechanisms (which are extension protocols, and explained below) MAY define distinct roles for client peers and server peers. This difference reflects the general model of concentrating authentication on servers.

Traditionally in TCP topologies, the "server" is the peer which binds, and the "client" is the peer that connects. ZMTP allows this but also the opposite topology in which the client binds, and the server connects. The actual server role is therefore not defined by bind/connect direction, but specified in the greeting as-server field, which is set to 1 for server peers, and 0 for client peers.

If the chosen security mechanism does not specify a client and server topology, the as-server field SHALL have no significance, and SHOULD be zero for all peers.

Authentication and Confidentiality

ZMTP provides extensible authentication and confidentiality through the use of a negotiated security mechanism that is loosely based on the IETF Simple Authentication and Security Layer (SASL). The peer MAY support any or all of the following mechanisms:

"NULL", specified later in this document, which implements no authentication, and no confidentiality.

The security mechanism is an ASCII string, null-padded as needed to fit 20 octets. Implementations MAY define their own mechanisms for experimentation and internal use. All mechanisms intended for public interoperability SHALL be defined as 0MQ RFCs. Mechanism names SHALL be assigned on a first-come first-served basis. Mechanism names SHALL consist only of uppercase letters A to Z, digits, and embedded hyphens or underscores.

A peer announces precisely one security mechanism, unlike SASL, which lets servers announce multiple security mechanisms. Security in ZMTP is assertive in that all peers on a given socket have the same, required level of security. This prevents downgrade attacks and simplifies implementations.

Each security mechanism defines a protocol consisting of zero or more commands sent by either peer to the other until the handshaking is complete or either of the peers refuses to continue, and closes the connection.

Commands are single frames consisting of a size field, and a body. If the body is 0-255 octets, the command SHOULD use a short size field (%x04 followed by 1 octet). If the body is 256 or more octets, the command SHOULD use a long size field (%x06 followed by 8 octets).

The size field is always in clear text. The body may be partially or fully encrypted. ZMTP does not define the syntax nor semantics of commands. These are fully defined by the security mechanism protocol.

The supported mechanism is not considered sensitive information. A peer that reads a full greeting, including mechanism, MUST also send a full greeting including mechanism. This avoids deadlocks in which two peers each wait for the other to divulge the remainder of their greeting.

If the mechanism that the peer received does not exactly match the mechanism it sent, it MUST close the connection.

Error Handling

ZMTP allows an explicit fatal error response during the mechanism handshake, using the ERROR command. The peer SHALL treat an incoming ERROR command as fatal, and act by closing the connection, and not re-connecting using the same security credentials.

An implementation SHOULD signal any other error, e.g. overloaded, temporarily refusing connections, etc. by closing the connection. The peer SHALL treat an unexpected connection close as a temporary error, and SHOULD reconnect.

To avoid connection storms, peers should reconnect after a short and possibly randomized interval. Further, if a peer reconnects more than once, it should increase the delay between reconnects. Various strategies are possible.

Framing

Following the greeting, which has a fixed size of 64 octets, all further data is sent as frames, which carry commands or messages. Frames consist of a size field, a flags fields, and a body. The frame design is meant to be efficient for small frames but capable of handling extremely large data as well.

A frame consists of a flags field (1 octet), followed by a size field (one octet or eight octets) and a frame body of size octets. The size does not include the flags field, nor itself, so an empty frame has a size of zero.

A short frame has a body of 0 to 255 octets. A long frame has a body of 0 to 2^63-1 octets.

The flags field consists of a single octet containing various control flags. Bit 0 is the least significant bit (rightmost bit):

Bits 7-3: Reserved. Bits 7-3 are reserved for future use and MUST be zero.

Bit 2 (COMMAND): Command frame. A value of 1 indicates that the frame is a command frame. A value of 0 indicates that the frame is a message frame.

Bit 1 (LONG): Long frame. A value of 0 indicates that the frame size is encoded as a single octet. A value of 1 indicates that the frame size is encoded as a 64-bit unsigned integer in network byte order.

Bit 0 (MORE): More frames to follow. A value of 0 indicates that there are no more frames to follow. A value of 1 indicates that more frames will follow. This bit SHALL be zero on command frames.

Commands

Commands are used by the ZMTP implementation and not generally visible to the application except in some cases. Commands always consist of one frame, containing a printable command name, a null octet separator, and data.

ZMTP supports extensible security mechanisms and these may define their own commands. Security mechanisms MAY use any command names they need to.

Messages

Messages carry application data and are not generally created, modified, or filtered by the ZMTP implementation except in some cases. Messages consist of one or more frames and an implementation SHALL always send and deliver messages atomically, that is, all the frames of a message, or none of them.

The NULL Security Mechanism

The NULL mechanism implements no authentication and no confidentiality. The NULL mechanism SHOULD NOT be used on public infrastructure without transport-level security (e.g. over a VPN).

To complete a NULL security handshake, both peers SHALL send each other a READY command. Both peers SHOULD then parse, and MAY validate the READY command. Either or both peer MAY then choose to close the connection if validation failed. A peer MAY start to send messages immediately after sending its metadata.

When a peer uses the NULL security mechanism, the as-server field MUST be zero. The following ABNF grammar defines the NULL security handshake:

The message and command-size are defined by the ZMTP grammar previously explained.

The body of the READY command consists of a list of properties consisting of name and value as size-specified strings.

The name SHALL be 1 to 255 characters. Zero-sized names are not valid. The case (upper or lower) of names SHALL NOT be significant.

The value SHALL be 0 to 2,147,483,647 (2^31-1 or INT32_MAX in C/C++) octets of opaque binary data. Zero-sized values are allowed. The semantics of the value depend on the property. The value size field SHALL be four octets, in network byte order. Note that this size field will mostly not be aligned in memory.

Note that to avoid deadlocks, each peer MUST send its READY command before attempting to receive a READY from the other peer. In the NULL mechanism, peers are symmetric. Here is the command flow for two peers, P and R:

P:greeting R:greeting
P:ready R:ready
P:message... R:message...

The body of the ERROR command contains an error reason that can be logged. It has no defined semantic value.

Connection Metadata

The security mechanism SHALL provide a way for peers to exchange metadata in the form of a key-value dictionary. The precise encoding of the metadata depends on the mechanism.

Metadata names SHALL be case-insensitive.

These metadata properties are defined:

"Socket-Type", which specifies the sender's socket type. See the section "The Socket Type Property" below. The sender SHOULD specify the Socket-Type.

"Identity", which specifies the sender's socket identity. See the section "The Identity Property" below. The sender MAY specify an Identity.

The implementation MAY provide other metadata properties such as implementation name, platform name, and so on. For interoperability, metadata names and semantics MAY be defined as RFCs.

When a peer validates the socket type, it SHOULD handle errors by returning an ERROR command, and then disconnecting the peer.

The Identity Property

A REQ, DEALER, or ROUTER peer connecting to a ROUTER MAY announce its identity, which is used as an addressing mechanism by the ROUTER socket. For all other socket types, the Identity property shall be ignored.

The Identity SHALL match this grammar:

identity = 0*255OCTET

Socket Semantics

In order to ensure interoperability, we aim to define both the socket-specific semantics of the API to calling applications, and the behavior of peers talking to each other over the network.

These rules apply to all sockets:

All sockets SHALL accept connections (binding to an address) and make connections.

All sockets SHALL establish connections opportunistically, that is: they connect to an endpoint asynchronously, and if the connection is broken, SHOULD reconnect after a suitable delay.

A message SHALL be sent or received atomically; that is, all frames or none. On sending, the peer SHALL queue all frames of a message in memory until the final frame is sent.

To cancel a subscription, the SUB or XSUB peer SHALL send an unsubscribe message, which has this grammar:

subscribe = %x00 short-size %d0 subscription

The subscription is a binary string that specifies what messages the subscriber wants. A subscription of "A" SHALL match all messages starting with "A". An empty subscription SHALL match all messages.

Subscriptions SHALL be additive and SHALL NOT be idempotent. That is, subscribing to "A" and "" is the same as subscribing to "" alone. Subscribing to "A" and "A" counts as two subscriptions, and would require two unsubscribe messages to undo.

Detecting ZMTP 1.0 and 2.0 Peers

ZMTP 1.0 did not have any version information. To detect and interoperate with a ZMTP 1.0 and 2.0 peers, an implementation MAY use this strategy:

Send a 10-octet pseudo-signature consisting of "%xFF size %x7F" where 'size' is the number of octets in the sender's identity (0 or greater) plus 1. The size SHALL be 8 octets in network byte order and occupies the padding field.

Read the first octet from the other peer.

If this first octet is not %FF, then the other peer is using ZMTP 1.0, and has sent us a short frame length for its identity. We read that many octets.

If this first octet is %FF, then we read nine further octets, and inspect the last octet (the 10th in total). If the least significant bit is 0, then the other peer is using ZMTP 1.0 and has sent us a long length for its identity. We read that many octets.

If the least significant bit is 1, the peer is using ZMTP 2.0 or later and has sent us the ZMTP signature. We read a further octet, which indicates the ZMTP version. If this is 1 or 2, we have a ZMTP 2.0 peer. If this is 3, we have a ZMTP 3.0 peer.

If the implementation's security mechanism is anything other than NULL, and it detects a ZMTP 1.0 or 2.0 peer, it MUST immediately close the connection. A ZMTP 1.0 or 2.0 peer can only request NULL security.

When we have detected a ZMTP 1.0 peer, we have already sent 10 octets, which the other peer interprets as the start of an identity frame. We continue by sending the body of the identity frame (zero or more octets). From then, we encode and decode all frames on that connection using the ZMTP 1.0 framing syntax.

When we have detected a ZMTP 2.0 peer, we continue with version negotiation as explained above, by sending our version number and then socket type and identity as defined by the ZMTP 2.0 specification.

When we have detected a ZMTP 3.0 peer, we continue by sending the remainder of the greeting (octets 10-64) and then continue with the security handshake as normal.

Worked Example

A DEALER client connects to a ROUTER server. Both client and server are running ZMTP and the implementation has backwards compatibility detection. The peers will use the NULL security mechanism to talk to each other.

The client sends a partial greeting (11 octets) greeting to the server, and at the same time (before receiving anything from the client), the server also sends a partial greeting:

As soon as the server has sent its own READY command, it may also send messages to the client. As soon as the client has received the READY command from the server, it may send messages to the server.

Wire Analysis of the Connection

A ZMTP connection is either clear text, or encrypted, according to the security mechanism. On a clear text connection, message data SHALL be sent as message frames. All clear text mechanisms SHALL use the message framing defined in this specification, so that wire analysis does not require knowledge of the specific mechanism.

On an encrypted connection, message data SHALL be encoded as commands so that wire analysis is not possible.

On all ZMTP connections, command names SHALL be visible and command frames SHALL be printable.

Security Considerations

An implementation MUST guard against downgrade attacks, which may take the form of crafting a ZMTP 1.0 or 2.0 protocol header, or asking for a less secure mechanism than the application has requested.

Attackers may overwhelm a peer by repeated connection attempts. Thus, a peer MAY log failed accesses, and MAY detect and block repeated failed connections from specific originating IP addresses.

Attackers may try to cause memory exhaustion in a peer by holding open connections. Thus, a peer MAY allocate memory to a connection only after the security handshake is complete, and MAY limit the number and cost of in-progress handshakes.

Attackers may try to make small requests that generate large responses back to faked originating addresses (the real target of the attack). This is called an "amplification attack". Thus, peers MAY limit the number of in-progress handshakes per originating IP address.

Attackers may try to uncover vulnerable versions of an implementation from its metadata. Thus, ZMTP sends metadata after security handshaking.

Attackers can use the size of messages (even without breaking encryption) to collect information about the meaning. Thus, on an encrypted connection, message data SHOULD be padded to a randomized minimum size.

Attackers can use the simple presence or absence of traffic to collect information about the peer ("Person X has come on line"). Thus, on an encrypted connection, the peer SHOULD send random garbage data ("noise") when there is no other traffic. To fully mask traffic, a peer MAY throttle messages so there is no visible difference between noise and real data.

Note on use of the GPL: we use this license to cover the specification text itself, not implementations. You can license your own code under any license you wish. If you make derived versions of this specification, you must share them under the GPL. The goal of this is to prevent private extensions of our specifications.