Preamble

Many situations need simple secure datagram communications.
For fixed links from point to point, a shared secret
approach is often preferred over more complicated
forms, especially if there are additional costs
in using certificates.
Alternatively, secure communications may already be
established via other means, in which case, all that
is lacking is a simple datagram delivery mechanism
to take advantage of an already shared secret.

Further, most existing protocols (e.g., SSL and SSH)
are intended for connection-oriented
communications, and are ill-suited to the datagram requirements
of transaction protocols charged with
delivering end-to-end reliability.

SDP1 is designed to fill this gap. SDP1 provides a way to
secure a single datagram or packet travelling from one
node to another. It is assumed that the two nodes have
previously established a set of shared secrets, which
we call a context
(or, two contexts for full duplex communications).
The context itself identifies SDP1 as the packet protocol,
which allows drop-in replacement by a future SDP2, and similar.

In this document, it is assumed that a key exchange
protocol such as that employed by SOX
exists as a higher layer feature in order to
establish the shared secrets needed.

Authors and Documentation

Ian Grigg and
Bryce 'Zooko' Wilcox-O'Hearn
wrote this protocol.
Thanks for comments go to many.

General Features

This protocol is stateless, imposing no demands or
relationships between subsequent packets or replies.
Packets may be delivered zero, one or many times,
and no guarantees on delivery are made.
(In this context, it is questionable whether this
design is a protocol, but its relationship to the context
makes it of wider scope than a packet layout.)

If a received packet MACs and decrypts
correctly, then it is understood to be
securely delivered.

The algorithms and modes chosen provide strong
protection against eavesdropping. DOS attacks
are no easier against this protocol than against
open traffic.
Little consideration is taken of traffic analysis.
No consideration is taken of ordering, selective
deletion, insertion or repeats.

Relationship to other layers

SDP1 creates a way to send a single datagram.
It demands a context within which to exist,
sometimes known as a session. This context
provides secret keys, IV values, and MAC secrets.

The context needs to be negotiated by a higher layer
protocol, which is out of scope of this document.
What is specified here is what the context must
provide to SDP1 in order to deal with each packet.
Such a context negotiation could be achieved within
a sophisticated protocol, or could be organised by
a set of manually transmitted and stored secrets.

The resultant datagram is neutral to the transmission
protocol. It could be sent over UDP, layered over
a TCP/IP connection, or conceivably be ascii-armoured
or MIME'd and sent in email.

Algorithms

SDP1 specifies the use of these algorithms:

Encryption.
AES with a 128 bit key length in CBC mode.

MAC.
HMAC-SHA1 in encrypt-then-MAC mode.

SDP1 depends on good random inputs from the context,
but is designed not to rely on good random inputs in
the calculation of each packet.
There are no public key algorithms required.
No external padding mode is needed, as described below.

The Operating Environment of SDP1

The context
is the operating environment of the Secure Datagram Protocol.
The SDP1 context primarily requires these parameters for the algorithms:
MAC secrets, the bulk encryption keys and IVs for
each end-point, in both the read and the write directions.

A context for a given end point is identified by a
context token.
This is a small number (array of bytes). Its value is
implementation dependent, save that it is unique for
the receiving end-point.
The sender places the token in the outer (unencrypted)
packet, and it tells the receiver which context
to use.
A packet with an unknown context token can be ignored.
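As a rough illustration of that lookup, the sketch below reads the leading Context Token from a received datagram and selects a context, dropping packets whose token is unknown. It assumes the token is laid out as a Byte Array (a length prefix followed by the token bytes) and, to stay short, handles only tokens under 128 bytes, where the length prefix is a single byte. The `contexts` table and the name `select_context` are hypothetical, not part of SDP1.

```python
from typing import Dict, Optional, Tuple


def select_context(contexts: Dict[bytes, object],
                   datagram: bytes) -> Optional[Tuple[object, bytes]]:
    """Return (context, rest_of_datagram), or None if the packet is ignored.

    Assumption: the token is shorter than 128 bytes, so its Byte Array
    length prefix is one byte with the top bit clear.
    """
    if not datagram:
        return None
    token_len = datagram[0]
    if token_len & 0x80:
        return None  # multi-byte length prefix: not handled in this sketch
    if len(datagram) < 1 + token_len:
        return None  # truncated packet
    token = datagram[1:1 + token_len]
    context = contexts.get(token)
    if context is None:
        return None  # unknown context token: ignore the packet
    return context, datagram[1 + token_len:]
```

A zero-length token (a single length byte of zero) is handled the same way, by registering the empty byte string as the key of the only context.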

The context also maintains as secondary parameters a time base
and sequence numbers for packets.

Note. Other protocols have called the context the session.
That name implies distinct and separate relationships between the
packets, such as ordering and concatenation, whereas in
SDP1 (and similar) the context is the only relationship between
the packets. See R2.

Logical Contexts

Logically, there are always two contexts: the read and write states,
each with their own context token.
One of the nodes will generally
initiate new contexts; where necessary, this node is called the client.
An implementation's context may specify which is the client
but for the most part, it is sufficient to deal here with
remote and local end points.

Context Negotiation

Initialisation and exchange of context
is out of scope of this protocol, see
(undocumented) SOXKEX
for an example.
SDP1 itself is also specified in the context (as number 1),
and is not self-determinable as SDP1.
Thus, context switching might also include a switch from
SDP1 to (by way of hypothetical example) SDP2.

Parameters demanded by SDP1

All of the below are required by the protocol.
In each case, there are two elements required,
one for each direction.
Where the term random is used, the data
should be strongly random, as the security of the
protocol depends on it.

Secret Key

AES requires a 128 bit random key.

CIV

The context provides an initial value (IV) fixed over the
entire context. This is called the Context Initial Value (CIV).

The CIV is a 128 bit random number.
It is exclusive-ored in CBC mode with the first block
of the plaintext. (The Pad, described below,
generates a unique IV for each datagram.)

MAC Secret

The MAC secret is 160 bits of random data.

Context Token

The context will supply a token that identifies the remote
end point. This permits the receiving end point
to select the right context for encryption and
application duties, such as onwards delivery.

The Context Token is a Byte Array (see Appendix 3)
which is the first element in the open network packet.
Its length is undefined. If there is only one context
possible (perhaps due to port mapping) then the length
could be zero, but the zero length Byte Array must still be present.

The token is guaranteed by the node to be unique within its
space of accepted contexts. Its contents are undefined,
but in practical scenarios might be expected to be an
increasing integer in order to ensure uniqueness over
software restarts.

MAC calculation

The MAC is calculated over the full Context Token
and ciphertext blocks (encrypt-then-MAC mode)
[2].
These are concatenated together in their
two Byte Arrays and then MACed using the
HMAC-SHA1 algorithm defined in
RFC 2104[3].
The full 160 bit MAC (untruncated) is then placed in
its own Byte Array to form the MAC tag which is then
appended to its input data to form the complete Datagram.
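The calculation above can be sketched with Python's standard `hmac` module. This is a minimal sketch, assuming the caller already holds the Context Token and the ciphertext encoded as Byte Arrays; the function name `seal_datagram` and its parameters are hypothetical. Because the MAC is 20 bytes, its Byte Array length prefix fits in a single byte.

```python
import hashlib
import hmac


def seal_datagram(mac_secret: bytes, token_ba: bytes, ciphertext_ba: bytes) -> bytes:
    """Append the HMAC-SHA1 tag to the token and ciphertext Byte Arrays.

    token_ba and ciphertext_ba are assumed to be complete Byte Arrays
    (length prefix included), as they appear on the wire.
    """
    mac_input = token_ba + ciphertext_ba
    tag = hmac.new(mac_secret, mac_input, hashlib.sha1).digest()  # 20 bytes
    # 20 < 128, so the length prefix of the MAC Byte Array is one byte.
    mac_ba = bytes([len(tag)]) + tag
    return mac_input + mac_ba
```

The resulting datagram always ends in 21 bytes of MAC Byte Array, which lets a receiver locate the tag without parsing the rest of the packet.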

Inner Cleartext Layout

To create the ciphertext,
SDP1 uses CBC mode over the plaintext.
The
initialization vector (IV) required for CBC mode
is taken from the context and is called the
context IV (CIV).

The inner cleartext layout
is a concatenation of the following elements.

Pad.
This is a Byte Array that includes a unique number.

Payload.
This is a Byte Array that contains the
application data.

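+-----+--------------------------------------------+
| Pad | . . . Payload . . . clear . . . text . . . |
+-----+--------------------------------------------+

Figure 3 - Inner Cleartext Layout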

The concatenation of these two Byte Arrays forms the plaintext.
The Pad is from 16 to 31 bytes long,
and its length is calculated to set the total length of the inner block
to a multiple of the encryption algorithm's block size.

The Payload is at least 1 byte long: an empty Payload is a
ByteArray consisting of a single length byte of zero.
An empty Payload forces a Pad of 31 bytes, which
has the side-effect of hiding an empty application packet
amongst those that are up to 15 bytes long.

Encryption

The encryption key is supplied by the context,
as is the context initial value (CIV).
Both are 128 bit (16 byte) quantities.

Encryption is done in CBC mode, with the CIV being
exclusive-ored with the first block of the plaintext.
No padding mode is used as the plaintext is already
extended by the Pad to the block size.
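The chaining described here can be sketched as follows. This is an illustration of CBC mode only: `encrypt_block` is a hypothetical stand-in for single-block AES-128 encryption under the context's secret key, which a real implementation would take from an AES library. The names `cbc_encrypt` and `xor_block` are likewise assumptions.

```python
from typing import Callable

BLOCK_SIZE = 16  # AES block size in bytes


def xor_block(a: bytes, b: bytes) -> bytes:
    """Exclusive-or two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_encrypt(encrypt_block: Callable[[bytes], bytes],
                civ: bytes, plaintext: bytes) -> bytes:
    """CBC-encrypt plaintext, using the secret CIV as the initial chaining value.

    No padding is applied: the Pad must already have extended the
    plaintext to a multiple of the block size.
    """
    if len(civ) != BLOCK_SIZE:
        raise ValueError("CIV must be 16 bytes")
    if len(plaintext) == 0 or len(plaintext) % BLOCK_SIZE != 0:
        raise ValueError("plaintext must be a non-empty multiple of 16 bytes")
    previous = civ
    ciphertext = bytearray()
    for i in range(0, len(plaintext), BLOCK_SIZE):
        block = plaintext[i:i + BLOCK_SIZE]
        # The first block (the start of the Pad) is mixed with the CIV;
        # each later block is mixed with the preceding ciphertext block.
        previous = encrypt_block(xor_block(previous, block))
        ciphertext += previous
    return bytes(ciphertext)
```

Since the first plaintext block is the start of the Pad, a Pad that differs for every datagram makes every ciphertext differ, even though the CIV stays fixed for the whole context.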

Pad

The Pad is a Byte Array that is prefixed to
set the inner plaintext total length to a multiple of the
encryption algorithm's block size (16 bytes).
It MUST be between 16 and 31 bytes long.

The contents of the Pad
are designed to create the effect of
an IV for this specific datagram.
In this context, the datagram IV ("DIV")
refers to the first 16 bytes
of the Pad, in its effect as an IV. There is
no separate or actual IV element, the DIV
is a notional element only.

The DIV MUST contain a
unique number for each separate datagram
within the context space.
On reception, the Pad SHOULD be read as a Byte Array
and disposed of.
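On the receiving side, discarding the Pad might look like the sketch below. It assumes the plaintext has already been decrypted, and relies on the Pad being at most 31 bytes, so that its Byte Array length prefix is a single byte. The function name `strip_pad` is hypothetical. The Payload is returned still encoded as a Byte Array.

```python
BLOCK_SIZE = 16


def strip_pad(plaintext: bytes) -> bytes:
    """Discard the leading Pad and return the Payload Byte Array.

    Raises ValueError if the plaintext cannot be a valid inner layout.
    """
    # Smallest valid layout: Pad (16..31 bytes) plus a 1-byte empty
    # Payload, rounded to the block size, which is 32 bytes.
    if len(plaintext) < 2 * BLOCK_SIZE or len(plaintext) % BLOCK_SIZE != 0:
        raise ValueError("plaintext is not a valid SDP1 inner layout")
    pad_len = 1 + plaintext[0]  # length byte plus the bytes it counts
    if not 16 <= pad_len <= 31:
        raise ValueError("Pad must be between 16 and 31 bytes long")
    # Everything after the Pad is the Payload Byte Array.
    return plaintext[pad_len:]
```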

Calculating the Length of the Pad

A Byte Array will always be at
least 1 byte long, being the leading length
(a Compact Integer containing from 0 to 127).
The Pad is from 16 bytes to 31 bytes long,
including its Compact Integer (always 1 byte at these lengths).

To calculate the length of the Pad, do the following:

Calculate lenPayload,
being the total length of the Payload,
including its Byte Array length.

Calculate frac,
the fractional remainder of lenPayload, as

frac = lenPayload % 16 ;

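Calculate the length of the padding required to take
the payload up to a 16 byte boundary as:

paddingNeeded = ( 16 - frac ) % 16 ;

The final reduction modulo 16 gives zero padding when the
Payload already ends on a boundary, which keeps the Pad
within its 16 to 31 byte range.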

The length of the Pad is then the sum of the DIV's
required size and the padding required:

lenPad = 16 + paddingNeeded ;

The length of the Pad includes the single byte
needed for its ByteArray length.

When laid out, the total should block to 16 bytes:

assert ( ( lenPad + lenPayload ) % 16 == 0 ) ;
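The steps above can be collected into one small function. This is a sketch, with the padding reduced modulo 16 so that the result stays within the 16 to 31 byte range required of the Pad; the name `pad_length` is hypothetical.

```python
BLOCK_SIZE = 16  # AES block size in bytes; also the size of the DIV


def pad_length(len_payload: int) -> int:
    """Return the total Pad length (length byte included) for a Payload.

    len_payload is the full length of the Payload Byte Array,
    including its own length prefix, so it is at least 1.
    """
    if len_payload < 1:
        raise ValueError("a Payload Byte Array is at least 1 byte long")
    frac = len_payload % BLOCK_SIZE
    # Zero extra padding when the Payload already ends on a boundary.
    padding_needed = (BLOCK_SIZE - frac) % BLOCK_SIZE
    len_pad = BLOCK_SIZE + padding_needed
    assert (len_pad + len_payload) % BLOCK_SIZE == 0
    return len_pad
```

For an empty Payload (lenPayload of 1) this gives a Pad of 31 bytes, and for a Payload of exactly 16 bytes it gives the minimum Pad of 16 bytes.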

To Be Confirmed...

To calculate the number of random bytes needed in the padding,
do the following:

Calculate the byte lengths of the Sequence Number and Time
Compact Integers, t1, t2.

The length of the Pad Byte Array length is 1.

Calculate the length of the padding required to take this
to the length of the Pad:

lenPadding = lenPad - 1 - t1 - t2;

To Be Confirmed...

Payload

The Payload may be any length, up to the limit
that can be expressed in a ByteArray.

Security issues are discussed throughout this memo.
As SDP1 fits within a wider security application,
wider issues are discussed here.

General Environment of the Inspiring Application

SDP1 is designed for small numbers of small datagrams.
In orders of magnitude, the inspiring application
called for maybe 1000 datagrams of 1000 bytes each
(see R7).
It envisages frequent and relatively painless context
changes, so the Datagram Protocol layer chooses to be
casual about threats to large amounts of
data, in comparison to smaller amounts of data.

Re-keying

Notwithstanding the above, with care, the protocol should be suitable
for much higher orders of magnitude of data and packets.
As of this writing, AES used in CBC mode is good for 2^64 * 128 bits
of encryption
[6].
By way of comparison, triple DES is good for only 2^32 * 64 bits,
which is well within the size range of modern disks and mere
minutes of fast ethernet usage. Its small block size makes it vulnerable.

It remains to be seen whether SDP1 delivers the full
protection implied by the above theoretical comments.

Threats

The inspiring application is highly susceptible to
traffic analysis. That is, basic information about
the contents of the packets is easily determinable
by analysing size, frequency, IP numbers, and other patterns.
SDP1 makes no effort to hide this information.

In contrast, SDP1 takes some effort to protect against
attacks of these types:

Eavesdropping of content (beyond traffic analysis)

Denial of service (DOS) attacks

Poor key choice or deliberate leakage by one of the nodes.

Poor or incomplete software implementations.

The Security of SDP1

The security of SDP1 rests almost entirely on these
factors:

The encryption algorithm, AES.

The secret material shared in the context
(secret key and CIV).

The quality of the DIV.

The secret keys should be strongly randomised by both nodes.
Care needs to be taken that the DIV is unique, and it is
strongly suggested for security reasons that methods
similar to those in
"Suggested Layout of the Pad"
be used.

In SDP1, little attention is paid to data integrity as the
higher layers are expected to send signed and
uniquely identified packets. The higher layers are
expected to be idempotent.
However, signing (or more properly verification)
is expensive, so in order to reduce the potential
for DOS attacks, the datagrams are MACed.
Such is not intended to be a replacement for
proper higher layer checks for packet integrity,
authentication, ordering, repeats or uniqueness.

Payload

The Payload may be any length, up to the limit that can be expressed
in a ByteArray. The algorithm and mode employed
is generally thought to be secure up to very long lengths.

However, resetting the context on a regular basis is
advised. Where possible, user protocols MAY monitor
the number of packets sent and the total amount of data
sent.
Both of these numbers could be usefully employed
as watermarks to trip a requirement to re-establish a
new context.
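One way a user protocol might keep such watermarks is sketched below. The class name, field names and thresholds are all hypothetical; SDP1 does not specify limits, so the values are left to the caller.

```python
from dataclasses import dataclass


@dataclass
class ContextUsage:
    """Tracks traffic sent under one context against two watermarks."""
    max_packets: int      # hypothetical limit on datagrams per context
    max_bytes: int        # hypothetical limit on total bytes per context
    packets: int = 0
    total_bytes: int = 0

    def record(self, datagram_len: int) -> bool:
        """Count one sent datagram; return True if a new context is due."""
        self.packets += 1
        self.total_bytes += datagram_len
        return self.needs_new_context()

    def needs_new_context(self) -> bool:
        # Either watermark alone is enough to trip the requirement.
        return (self.packets >= self.max_packets
                or self.total_bytes >= self.max_bytes)
```

A sender would call `record()` after each datagram and ask the higher layer to negotiate a fresh context once it returns True.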

In calculating the entire quantity of data
sent using the context's single key, the sum of
all packets' data needs to be taken into account.
As the datagram is encrypted using a context that shares
its use of key and CIV secrets across many datagrams,
the total number of packets encrypted within that context
puts stress on the DIV to be unique.

The Context Token

The Context Token is sent in the clear. This enables an
attacker to group together packets and analyse the traffic
within each context. Such an attack is not covered by SDP1.

Enhancements

Any weaknesses can be addressed if desired by
designing SDP2, etc. Indeed, that is to be expected
and is the design intention.

The following desiderata are the requirements placed
on the design of the protocol. As all good requirements
go, they are sometimes in conflict; interested readers
are encouraged to determine how well the designers met
these challenges.

Requirement

The protocol must protect a datagram from one end-point
to another, at the application level.

Reason

Datagrams are the core atomic unit of transmission
in most secure protocols. Because of the laws of
computer science, a secure protocol must work at
the level of datagrams.

Discussion

The coordination problem means that there
is no way for a protocol to prove that data has
arrived, unless the recipient confirms its presence
[7].
This is exploited in the acknowledgement
feature of protocols, whereby the recipient tells
the sender the name of the packet just received.

Connection-oriented protocols (like TCP) attempt to
provide a complete service that overcomes these limitations,
by implementing sliding window features to deliver packets
reliably. But, they only deal with the ramifications of
the coordination problem within their own domain.
Outside their domain, they are powerless.

In practice, this means that even though TCP is a reliable
protocol, it has trouble exporting that reliability to the
application. Two applications (by this, we mean
agents that use TCP) can use TCP and within TCP packets
are delivered reliably. But there are use cases where
the applications themselves will not receive data, or
where the data is delivered duplicatively or out of
sequence, once outside TCP.

The consequences of this are that a reliable application
must do its own datagram delivery. Normally, designers
try and avoid the consequences by inserting connection
oriented protocols, but this results in shifting the
burden around. For example, HTTP was designed as a
datagram based request-response model. Yet, when it
was layered over TCP and then over SSL, its ability
to reliably deliver packets was lost.

The upshot of this is that there is a requirement for
a secure datagram protocol, one which directly gives
the secure application a datagram delivery mechanism
over which to create its reliable protocol.

Reason

A reliable protocol requires that each and every
action be atomic in isolation of other events.

Discussion

This means that each packet is independent of
every other one, the only relationship being
their common use of a context defined by the
secret key (R3).

Notes

This requirement also eliminates checking for
resends, repeats, re-orderings and tampering
in general. These
are more properly the requirements of the
secure application, and should not be provided
in the wire protocol. However, see Rx, DOS.

As a result of the Stateless requirement, and of R1 Datagram,
this means that Compression is dropped as a feature.

Requirement

The protocol is to assume, and require, that whatever
information required for efficient secret key operation
is already shared between the two end points in a context.

Reason

All good protocols are made in two parts,
the second of which starts with a totally
trusted key. SDP1 is the second part of
a complete secure datagram protocol.

Discussion

The protocol works with another, separately defined
key exchange protocol.

This protocol demands and needs to specify what secret
key information (including MACs, IVs, etc) needs to
be shared, and what strengths are required.
Notes need to be made on vulnerabilities (quality
of randomness, interrelation between the elements
of the secret key demanded) but the essential
task of creating and providing the shared secrets
is assumed by this protocol as a given.

Reason

Discussion

Notes

Requirement

The protocol should be simple to implement for developers
who have no deep understanding of security and cryptography
issues.

Reason

A protocol that is not easy for an implementer to
understand and code up is not likely to help much.
It either bogs down in implementation of innumerable
marginal features, or it becomes insecure through
rushed coding or internal issues. Deployment falters,
and traffic goes unprotected because appropriate protocols
are not available at low cost.

Vulnerability follows, often on a much larger scale
than the protocol designers envisaged.

Discussion

Too often, companies use difficult and complex
protocols as an excuse to build barriers to entry.
Once complexity is rigorously poured through a
design, a mantra of "security is too hard for any
but the professionals" pervades. This guild-like
approach holds back the securing of general purpose
traffic, thus contributing to the spread of viruses,
spam, DOS and identity theft.

This is of course a balance. Specifically, this
protocol eschews the following.

No negotiation of secret key algorithms, modes, key strengths or MACs.
The protocol is fixed in its use of all algorithms.

Most protocols permit various negotiations, but over time,
these negotiations settle out into one favoured set, and
the rest become detritus. Yet, the choices offered are
often quite poorly thought out. They relate to marginal
factors that seemed good on paper, but lacked any understanding
of their impact in the implementation phases and the market place.
Negotiations bring in complexities and insecurities
that rarely pay off.

No compression. Because compression is
difficult to benefit from in small datagrams,
and because it is a stateless protocol (R2),
we drop any compression and leave that to a
higher layer that will understand the state better.

No defence against traffic analysis.
This is a very difficult thing to defend against,
and the inspiring applications are actually quite
weak here. We choose to defer traffic analysis in
detail to a later generation (SDP2).
See also R2, stateless.

This protocol goes for simplicity over completeness.
If there is any weakness or shortfall in the protocol,
it should be totally rewritten and replaced.
The correct place to select this protocol or any replacement
such as a future SDP2 is in the
key exchange protocol, a higher layer that will specify
this protocol, or a complete alternative.

Notes

In practice, R5 is the most important objective
of the efficiency goals. Speed, efficiency, and
elegance all are subsidiary to the efficiency of
development and deployment, and for that reason,
the former goals are not even stated as requirements.

Requirement

The protocol should not introduce any weaknesses
that are found under normal predictable adverse conditions.

Reason

If a secure protocol loses its security through some
mischance or difficulty experienced in the field,
it cannot be said to be secure. The protocol should
incorporate experience from systems engineering and
the real world of the net.

Discussion

R6.1 No weaker under DOS
If a secure protocol slows down under DOS
more than the
same activity would without a secured protocol,
then the user is encouraged to switch off security.
This results in an easy security attack:
start a DOS and wait for the user to turn off the protocol.

DOS is now an ever present feature, it seems
[8].
A protocol should make sure that it opens no
holes for a DOS attack that encourage the
protocol to be switched off.

In this case, it probably means that a datagram
should be detectable as authenticated as early as possible.
This in practice suggests the use of a MAC, over the
ciphertext.
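A receiver can apply this by checking the MAC before any decryption is attempted. The sketch below assumes the datagram ends with the MAC as a Byte Array of 21 bytes (one length byte of 20, then the 160 bit HMAC-SHA1 tag), computed over everything that precedes it. The function name `verify_mac` is hypothetical.

```python
import hashlib
import hmac
from typing import Optional

MAC_LEN = 20               # HMAC-SHA1 output, untruncated
MAC_BA_LEN = 1 + MAC_LEN   # one length byte plus the tag


def verify_mac(mac_secret: bytes, datagram: bytes) -> Optional[bytes]:
    """Return the authenticated body (token + ciphertext), or None.

    A None result means the datagram should be dropped without
    spending any effort on decryption.
    """
    if len(datagram) < MAC_BA_LEN or datagram[-MAC_BA_LEN] != MAC_LEN:
        return None
    body = datagram[:-MAC_BA_LEN]
    tag = datagram[-MAC_LEN:]
    expected = hmac.new(mac_secret, body, hashlib.sha1).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(tag, expected):
        return None
    return body
```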

R6.2 Does not rely internally on a good random number source.
Delivering good random numbers is a very hard task.
There are theoretical and practical solutions,
but experience shows that deployment and availability
of good random numbers is still not robust.

Therefore, this protocol itself should not rely on entropy,
but should punt the problem upstairs, and should specify
strong initial secret key material, and techniques within the
protocol to deliver strength without additional help.

Note that this only shifts the burden. It is still
required that the KEX deliver strong keys. But, it eases
the implementation and deployment issues, by concentrating
the difficult requirement in one place.

Note that this does not mean that the protocol should
not use good randomness where it is available. More
strength is always good, as long as the base case is
solid.

R6.3 Is not vulnerable to restart artifacts.
Restarts are troublesome to protocols, especially ones
with no expectation of state.
Specifically this means that a counting nonce is hard
to do, as there is no way to guarantee that each nonce
is securely saved over crashes. Neither is it reasonable
to require transactional logic for a nonce, in practical
software engineering terms.

R6.4 Is not vulnerable to time artifacts.
Specifically, changes in time should not break the
protocol, nor should it be vulnerable to guessing
of time.

R6.5 Is resistant to counter-party attacks.
The protocol should still perform well if one or other
of the nodes is poorly implemented or is deliberately
aggressive. For example, opportunities to leak keys
should be reduced (CIV is secret, DIV is encrypted).

Notes

Byte

A Byte is an octet of 8 bits in network order.

Compact Integer

A Compact Integer is one that expands according to
the size of the unsigned integer it carries.
Compact integers are generally used for lengths of data
arrays, as described below, but this is not essential.

A Compact Integer is formed from one to five bytes
in sequence.
If the leading (sign) bit in a byte is set,
then additional bytes follow. If a byte
has the sign bit reset (0) then this is the last byte.
The unsigned integer is constructed by concatenating
the lower order 7 bits in each byte.

A one byte Compact Integer holds an integer of 0 to 127
in the 7 lower order bits, with the sign bit reset to zero.
A two byte Compact Integer can describe from
128 to 16383 (14 bits).
The leading byte has the sign bit set (1) and the trailing
byte has the sign bit reset (0).

The largest Compact Integer generally defined is one of five bytes.
It is generally defined to hold an unsigned integer of 31 bits,
being the largest normally expressible safely in a 4 byte space.
A five byte Compact Integer carries 35 bits, so 4 bits remain; their use is undefined.

(There is a Compact Long defined for 63 bits but it is unused
in SDP1, and not widely agreed.)
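The encoding can be sketched as below. The text does not state the order in which the 7 bit groups are concatenated; this sketch assumes the most significant group comes first, in keeping with network order. The function names are hypothetical.

```python
from typing import Tuple

MAX_COMPACT_INT = 2**31 - 1  # 31 bit unsigned limit given in the text


def encode_compact_int(value: int) -> bytes:
    """Encode an unsigned integer as a one to five byte Compact Integer."""
    if not 0 <= value <= MAX_COMPACT_INT:
        raise ValueError("Compact Integer must fit in 31 bits")
    groups = [value & 0x7F]          # last byte: sign bit reset
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)  # earlier bytes: sign bit set
        value >>= 7
    # Assumption: most significant 7 bit group is sent first.
    return bytes(reversed(groups))


def decode_compact_int(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode a Compact Integer; return (value, offset of the next byte)."""
    value = 0
    for count in range(5):
        if offset + count >= len(data):
            raise ValueError("truncated Compact Integer")
        byte = data[offset + count]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, offset + count + 1
    raise ValueError("Compact Integer longer than five bytes")
```

Under that assumption, 127 encodes as the single byte 0x7F and 128 as the two bytes 0x81 0x00.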

Byte Sequence

A Byte Sequence is a series of bytes laid out in network order.

The Byte Sequence would normally be preceded by a Compact Integer
that determines the number of bytes in the sequence, thus
making up a Byte Array.

Byte Array

A Byte Array is a Compact Integer followed by a Byte Sequence;
the Compact Integer holds the length in bytes of the
following Byte Sequence.
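A Byte Array can then be written and read as in the sketch below. It repeats a compact form of the Compact Integer helpers so that it stands alone, and carries the same assumption that the most significant 7 bit group is sent first. All function names are hypothetical.

```python
from typing import Tuple


def encode_compact_int(value: int) -> bytes:
    """Compact Integer, most significant 7 bit group first (assumed order)."""
    if not 0 <= value < 2**31:
        raise ValueError("Compact Integer must fit in 31 bits")
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append((value & 0x7F) | 0x80)
        value >>= 7
    return bytes(reversed(groups))


def decode_compact_int(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (value, offset of the byte after the Compact Integer)."""
    value = 0
    for count in range(5):
        if offset + count >= len(data):
            raise ValueError("truncated Compact Integer")
        byte = data[offset + count]
        value = (value << 7) | (byte & 0x7F)
        if not byte & 0x80:
            return value, offset + count + 1
    raise ValueError("Compact Integer longer than five bytes")


def encode_byte_array(seq: bytes) -> bytes:
    """Prefix a Byte Sequence with its length to form a Byte Array."""
    return encode_compact_int(len(seq)) + seq


def decode_byte_array(data: bytes, offset: int = 0) -> Tuple[bytes, int]:
    """Return (Byte Sequence, offset of the byte after the Byte Array)."""
    length, start = decode_compact_int(data, offset)
    end = start + length
    if end > len(data):
        raise ValueError("truncated Byte Array")
    return data[start:end], end
```

Because `decode_byte_array` returns the offset of the next element, consecutive Byte Arrays such as the Pad and the Payload can be read one after the other from the same buffer.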

Compact Long

A compact long form has recently been defined.
This form extends the Compact Integer to use
up to 9 bytes. All bits are allocated, to form
a number of 64 bits.
It is not currently in use within SDP1 directly,
but may find its way into a Pad.

AES-128 is the current de facto standard
for a secure cipher.
The alternate of AES-256 is not a pareto-secure improvement,
and is less available for implementations. Other algorithms
are similarly less attractive for unavailability, age or
novelty.

The choice of HMAC-SHA1 is dictated by availability in code
and reasonable speed.

A more convenient choice such as a combined mac/encryption
mode is not readily available (either in code or patent free)
as yet. The development of combined mac/encryption modes is
relatively new, dating from both the invention of the MAC and
the AES standardisation process, both relatively recent events.
This has resulted in a burst of activity in designing new
combined modes.

However to date all proposals suffer one or more of the
following: patents, lack of speed, complications in coding,
lack of analysis, and lack of widespread code.

Only HMAC-SHA1 overcomes the bulk of those issues.
This would however predict that SDP1 would have
a lifetime of only a few years, until the mode
wars settle down. That is not an issue, as the
protocol envisages that context negotiation also
settles on the packet protocol.

Classically, the IV for a cipher mode
such as CBC is often sent in the clear in the packet.
In SDP1, we moved the IV into the secret context
(CIV for context IV) and supplemented it with a notional
secondary IV (DIV for datagram IV) built into the Pad.
The DIV is the first block to be encrypted by the CBC mode.

This design closes off any weaknesses deriving from a public IV,
and also allows us to re-use the shared secret CIV
effectively over many datagrams within the context.

One such weakness is that an aggressive node can leak
key material
[9].
If the IV is a random
value, then there is no way for the other party to be
aware that the randoms so chosen are actually a channel
to an eavesdropper.

SDP1 protects against this, as the datagram DIV has already been
encrypted. In effect, if any secret leakage were to be
done, it must be done in the secret sharing phase (thus
putting pressure on the secret sharing KEX to also address
this issue).

Mode

CBC was chosen as the old standby.

Counter mode has been suggested, but its requirements for
uniqueness in the counter place demands on the implementations
that are not easy to deal with.