This document describes Mimix version 1, an anonymous message transfer protocol that delivers messages using electronic mail and is designed to resist traffic analysis.

Mimix is heavily based on Mixmaster, which in turn is based on Dr. David Chaum's mix-net concept. A mix (remailer) is a service that forwards messages, using public key cryptography to hide the correlation between its inputs and outputs. Sending messages through sequences of remailers achieves anonymity and unobservability of communications against a powerful adversary.

Mixmaster uses SMTP as a transport, both for inter-remailer communication and final delivery. Mimix uses a very similar packet structure but employs HTTP for inter-remailer communication. This enables middle remailers to operate anonymously, as Location Hidden Services within the Tor network.

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

Copyright (c) 2013 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

This document describes a mail transfer protocol designed to protect electronic mail against traffic analysis. Most e-mail security protocols only protect the message body, leaving useful information such as the identities of the conversing parties, sizes of messages and frequency of message exchange open to adversaries.

Message transmission can be protected against traffic analysis by the mix-net protocol. A mix (remailer) is a service that forwards messages, using public key cryptography to hide the correlation between its inputs and outputs. If a message is sent through a sequence of mixes, one trusted mix is sufficient to provide anonymity and unobservability of communications against a powerful adversary. Mixmaster is a mix-net implementation for electronic mail.

The mix-net protocol [Chaum81] allows one to send messages while hiding the relation of sender and recipient from observers (unobservability). It also allows the sender of a message to remain anonymous to the recipient (sender anonymity). If anonymity is not desired, authenticity and unobservability can be achieved at the same time by transmitting digitally signed messages.

This section gives an overview of the protocol and messaging pattern. The mixing algorithm is specified in Section 3, and the message format is specified in Section 4.

Viewed from a high level, Mixmaster is like a packet network, where each node in the network is known as a "remailer." The original content is split into pieces, and an independent path is determined for each piece, with the only requirement that all paths must end at the same remailer. Each piece is multiply encrypted so that any intermediate remailer can only decrypt enough information to determine the next hop in the path. When all pieces have arrived at the final remailer, the original content is re-created and sent to its final destination.

In this section the terms "sender" and "user agent" are used informally.

The user agent splits the original content into chunks of 10236 bytes; if the last chunk is shorter, random padding is added. Each chunk has a four-byte length prepended, and the result is called the packet body. If sender anonymity is desired, care should be taken to not include any identifying information (such as headers or unique content from the original plaintext message) in the packets. The content may be compressed before splitting.

The sender next chooses a chain of up to 20 remailers for each packet. Each path is independent, and can be of a different length, but all paths must end at the same remailer. This final remailer is responsible for detecting and discarding duplicate packets, reconstructing the message, and doing the final delivery.

Each packet is next prepared as follows (the full details are in Section 4.3.1). For a chain of "n" remailers, headers "n + 1" through 20 are filled with random data. For headers "n" down to one, the sender generates a symmetric encryption key. This key is used to encrypt the packet body and all the following headers. The key, and other control information, is then encrypted with the public key of the "n"'th remailer in the chain.

The process is repeated, working backward through the chain until the first packet has header information encrypted for the first remailer, and the packet body has been encrypted "n" times. The packet is then sent to the first remailer on its chain.

When a remailer receives a message, it uses its private key to decrypt the first header section. The Packet ID (see Section 4.3.1) can be used to detect duplicates. The integrity of the message is verified by checking the packet length and verifying the message digest in the packet header.

All header sections, as well as the packet body, are decrypted with the symmetric key found in the header. This reveals a public key-encrypted header section for the next remailer.

The first header section is now removed, the others are shifted up, and the last section is replaced with random bytes. Transport encoding is applied to the new message as described in Section 4.4.
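
The header shift described above can be sketched as follows. This is a minimal illustration, not the reference implementation; the function name shift_headers is hypothetical, and the sizes (20 sections of 512 bytes) are taken from Section 4.3.

```python
import os

HEADER_SECTION_SIZE = 512   # bytes per header section
HEADER_SECTIONS = 20        # header sections per packet


def shift_headers(decrypted_header: bytes) -> bytes:
    """Drop the first header section, shift the rest up, and
    fill the vacated last section with random bytes."""
    total = HEADER_SECTION_SIZE * HEADER_SECTIONS
    if len(decrypted_header) != total:
        raise ValueError("header must be exactly %d bytes" % total)
    remaining = decrypted_header[HEADER_SECTION_SIZE:]
    return remaining + os.urandom(HEADER_SECTION_SIZE)
```

Because the last section is always replaced with fresh random bytes, the header keeps its constant size at every hop.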

In order to prevent an adversary from determining the relationship between incoming and outgoing messages (i.e., traffic analysis), the remailer must collect several encrypted messages before sending the message it has just created; see Section 3.1.

When a packet is sent to the final remailer, it contains an indication that the chain ends at that remailer, and whether the packet contains the complete message or is part of a multi-part message. If the packet contains the entire message, the packet body is decrypted and, once messages have been reordered, the plain text is delivered to the recipient. For partial messages, a message ID is used to identify the other parts as they arrive. When all parts have arrived, the message is reassembled, decompressed if necessary, and delivered. A final remailer may discard partial messages if all packets have not been received within a local time limit.

Note that only the final remailer can determine whether packets are part of a specific message. To all other remailers, the packets appear to be completely independent.

To obfuscate the link between incoming and outgoing messages, Mixmaster uses a pooling scheme. Messages to be forwarded are stored in a pool. At regular intervals the remailer sends some random messages from the pool to either the next hop or their final recipients.

The pooling scheme is a "Timed Dynamic Pool Mix" [trickle02], which has the following three parameters:

   Name   Description
   ----   ----------------------------------------------
   t      Mixing interval
   min    Minimum number of messages in the pool
   rate   Percentage of messages to be sent in one round

The following steps are implemented every "t" seconds:

Let "n" be the number of messages currently in the pool.

Let "count" be the smaller of "n - min" and "n * rate", or zero if "n - min" is negative.

Select "count" messages from the pool at random and send them.

In its default configuration, Mixmaster has a mixing interval of 15 minutes, a minimum pool size of 45 messages, and permits a maximum of 65% of the pool to be sent in one round.
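
One round of the mixing steps above might look like the following sketch. The function and parameter names are illustrative, and rounding "n * rate" down to a whole message is an assumption; the text does not say how fractions are handled. The rate is passed as an integer percentage to avoid floating-point rounding.

```python
import random


def select_for_sending(pool, min_pool=45, rate_percent=65, rng=random):
    """Return the messages to send in one mixing round.

    pool         -- list of pooled messages
    min_pool     -- the "min" parameter (default 45)
    rate_percent -- the "rate" parameter as a percentage (default 65)
    """
    n = len(pool)
    # Smaller of "n - min" and "n * rate", or zero if "n - min" is negative.
    count = max(0, min(n - min_pool, n * rate_percent // 100))
    # Pick that many distinct messages uniformly at random.
    return rng.sample(pool, count)
```

With the default parameters, a pool of 100 messages releases 55 (limited by "n - min"), while a pool of 200 releases 130 (limited by "n * rate").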

Dummy messages (see Section 4.1) are multi-hop messages with four randomly selected remailers as the chain. The chain must be selected such that no remailer will appear twice unless two other remailers separate them.
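
A chain satisfying this constraint can be built hop by hop, excluding the two most recently chosen remailers at each step. The sketch below is one possible approach; the function name and the representation of remailers as strings are assumptions.

```python
import random


def pick_dummy_chain(remailers, length=4, rng=random):
    """Choose a random chain in which a remailer may reappear only
    when at least two other remailers separate the occurrences."""
    known = list(dict.fromkeys(remailers))  # drop duplicates, keep order
    if len(known) < 3:
        raise ValueError("need at least three distinct remailers")
    chain = []
    for _ in range(length):
        recent = chain[-2:]  # the two previous hops are not eligible
        candidates = [r for r in known if r not in recent]
        chain.append(rng.choice(candidates))
    return chain
```

For a four-hop chain this means the only permitted repetition is the same remailer appearing as both the first and the last hop.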

Every time a message is placed in the pool, the remailer chooses a random number from a geometric distribution and creates that many dummy messages which are also placed in the pool.

Similarly, prior to each execution of the mixing algorithm described in Section 3.1, the remailer selects a random number from a different geometric distribution and adds that many dummy messages to the pool as well.

The parameters should be chosen so that on average the remailer creates one dummy for every 32 inbound messages and one every nine mixing rounds.
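
As an illustration of how such parameters could be derived, the sketch below samples from a geometric distribution that counts failures before the first success. That parameterization is an assumption, since the text only specifies the averages. With success probability p, the mean is (1 - p) / p, so a target mean m gives p = 1 / (1 + m).

```python
import random


def geometric_dummy_count(mean, rng=random):
    """Sample the number of dummies to create (0, 1, 2, ...) from a
    geometric distribution with the requested mean."""
    p_stop = 1.0 / (1.0 + mean)  # probability of stopping at each step
    count = 0
    while rng.random() >= p_stop:
        count += 1
    return count


# Averages suggested in the text (names are illustrative only).
DUMMIES_PER_INBOUND_MESSAGE = 1 / 32
DUMMIES_PER_MIXING_ROUND = 1 / 9
```

A remailer would call geometric_dummy_count(DUMMIES_PER_INBOUND_MESSAGE) for each message placed in the pool and geometric_dummy_count(DUMMIES_PER_MIXING_ROUND) before each mixing round.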

Each destination field consists of a string of up to 80 ASCII characters, padded with null-bytes to a total size of 80 bytes. The following strings are defined:

null:

Dummy message. The remailer will discard the message.

post:

Usenet message. The remailer will post the message to Usenet.

post: [newsgroup]

Usenet message. The remailer will add a "Newsgroups" header with the specified content, and post the message to Usenet.

[address]

E-mail message. The remailer will add a "To" header with the specified content, and send the message as e-mail.

If no destination field is given, the payload is an e-mail message.

Message headers can be specified in header line fields. Each header line field consists of a string of up to 80 ASCII characters, padded with null-bytes to a total size of 80 bytes.
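
Destination fields and header line fields share the same fixed-width encoding. A minimal sketch of packing and unpacking such a field follows; the helper names are hypothetical.

```python
FIELD_SIZE = 80  # bytes per destination or header line field


def encode_field(text: str) -> bytes:
    """Encode an ASCII string as an 80-byte, null-padded field."""
    raw = text.encode("ascii")
    if len(raw) > FIELD_SIZE:
        raise ValueError("field is limited to %d characters" % FIELD_SIZE)
    return raw.ljust(FIELD_SIZE, b"\x00")


def decode_field(field: bytes) -> str:
    """Strip the null padding from an 80-byte field."""
    if len(field) != FIELD_SIZE:
        raise ValueError("field must be exactly %d bytes" % FIELD_SIZE)
    return field.rstrip(b"\x00").decode("ascii")
```

For example, encode_field("post: alt.test") yields the 14 ASCII characters followed by 66 null bytes.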

There are three types of user data sections:

A compressed user data section begins with the GZIP identification header (31, 139). This header contains an additional user data section. The data are compressed using GZIP [RFC 1952]. The GZIP operating system field must be set to Unix, and file names must not be given. Compression may be used if the capabilities attribute of the final remailer contains the flag "C".

An RFC 2822 user data section begins with the three bytes "##[CR]" (35, 35, 13). It contains an e-mail message or a Usenet message.

A user data section not beginning with one of the above identification strings contains only the body of the message. When this type of user data section is used, the message header fields must be included in destination and header line fields.
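
The three section types can be told apart by their leading bytes. The following sketch shows one way to do so; the function name and the returned labels are illustrative, not part of the protocol.

```python
GZIP_MAGIC = bytes([31, 139])     # GZIP identification header
RFC2822_MAGIC = bytes([35, 35, 13])  # "##" followed by CR


def classify_user_data(section: bytes) -> str:
    """Return which kind of user data section this is."""
    if section.startswith(GZIP_MAGIC):
        return "compressed"
    if section.startswith(RFC2822_MAGIC):
        return "rfc2822"
    # Anything else carries only the message body.
    return "body-only"
```

A compressed section contains a further user data section, so a receiver would decompress it and classify the result again.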

The payload is limited to a maximum size of 2610180 bytes. Individual remailers may use a smaller limit.

Remailer operators can choose to remove header fields supplied by the sender and insert additional header fields, according to local policy; see Section 5.

The packet header consists of 20 header sections (specified in Section 4.3.1) of 512 bytes each, resulting in a total header size of 10240 bytes. The header sections (except for the first one) and the packet body are encrypted with symmetric session keys specified in the first header section.
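
The fixed sizes quoted throughout this document are consistent with one another, as the short check below shows. The constant names are illustrative. The observation that the maximum payload equals exactly 255 chunks is arithmetic only; the text does not state a chunk-count limit.

```python
HEADER_SECTIONS = 20
HEADER_SECTION_SIZE = 512
CHUNK_SIZE = 10236        # payload bytes per packet body
LENGTH_PREFIX = 4         # four-byte chunk length
MAX_PAYLOAD = 2610180     # maximum payload size in bytes

header_size = HEADER_SECTIONS * HEADER_SECTION_SIZE   # 10240
body_size = LENGTH_PREFIX + CHUNK_SIZE                # 10240
packet_size = header_size + body_size                 # 20480
max_chunks, remainder = divmod(MAX_PAYLOAD, CHUNK_SIZE)  # (255, 0)
```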

To generate the RSA-encrypted session key, a 32-byte AES key is encrypted with RSAES-PKCS1-OAEP, resulting in up to 512 bytes (4096 bits) of encrypted data. This AES key and the initialization vector provided in clear are used to decrypt the encrypted header part. They are not used at other stages of message processing.

Encrypted header part:

Header components unlocked with the AES session key and IV. Details of these components are described later in this section.

The 384 bytes of data encrypted to form the encrypted header part are as follows:

The fields are defined as follows:

Packet ID:

Randomly generated packet identifier, used to prevent replay attacks. A remailer maintains a log of processed Packet IDs and will not process the same packet twice.

AES key:

Used to encrypt the following header sections and the packet body.

Packet type identifier:

The type identifiers are: ADD PACKET TYPES

Timestamp:

The timestamp defines the number of days since January 1, 1970 (00:00 UTC), in little-endian byte order. A random number between one and three, inclusive, may be subtracted from the number of days in order to obscure the origin of the message.

Header digest:

SHA2-512 digest computed over the preceding elements of the encrypted header part.
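
The timestamp field described above could be produced as in the following sketch. The width of the field is not given in this section, so the two-byte unsigned encoding used here is an assumption, as are the function names.

```python
import random
import struct
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def encode_timestamp(now=None, obscure=True, rng=random):
    """Encode days since 1970-01-01 00:00 UTC in little-endian order.

    ASSUMPTION: the value is stored as a 16-bit unsigned integer.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    days = int(now.timestamp()) // SECONDS_PER_DAY
    if obscure:
        # Optionally subtract one to three days to obscure the origin.
        days -= rng.randint(1, 3)
    return struct.pack("<H", days)


def decode_timestamp(field: bytes) -> int:
    """Return the number of days encoded in the timestamp field."""
    return struct.unpack("<H", field)[0]
```

A receiving remailer therefore has to allow for the packet being up to three days younger than the decoded value suggests.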

The packet information depends on the packet type identifier, as follows:

Initialization vectors:

For packet types 1 and 2, the IV is used to symmetrically encrypt the packet body. For packet type 0, there is one IV for each of the 19 following header sections. The IV for the last header section is also used for the packet body.

Remailer address:

E-mail address of next hop.

Message ID:

Identifier unique to (all chunks of) this message.

Chunk number:

Sequence number used in multi-part messages, starting with one.

Number of chunks:

Total number of chunks.

In the case of packet type zero, header sections two through twenty, and the packet body, each are decrypted separately using the respective initialization vectors. In the case of packet types one and two, header sections two through twenty are ignored, and the packet body is decrypted using the given initialization vector.

The message payload (Section 4.1) is split into chunks of 10236 bytes. Random padding is added to the last chunk if necessary. The length of each chunk (not counting the padding) is prepended to the chunk as a four-byte little-endian number. This forms the body of a Mixmaster packet.
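
The splitting and padding rules can be sketched as follows, before any encryption is applied. The function names are hypothetical, and emitting a single all-padding body for an empty payload is an assumption.

```python
import os
import struct

CHUNK_SIZE = 10236  # payload bytes carried by each packet body


def split_payload(payload: bytes):
    """Split a payload into packet bodies of 4 + 10236 bytes each."""
    bodies = []
    for offset in range(0, max(len(payload), 1), CHUNK_SIZE):
        chunk = payload[offset:offset + CHUNK_SIZE]
        # Only the last chunk can be short; pad it with random bytes.
        padding = os.urandom(CHUNK_SIZE - len(chunk))
        # The length prefix counts the chunk without its padding.
        bodies.append(struct.pack("<I", len(chunk)) + chunk + padding)
    return bodies


def join_bodies(bodies):
    """Recover the payload from bodies listed in chunk order."""
    parts = []
    for body in bodies:
        (length,) = struct.unpack("<I", body[:4])
        parts.append(body[4:4 + length])
    return b"".join(parts)
```

Every body produced this way is 10240 bytes long, regardless of how much real payload it carries.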

Mixmaster packets are sent as standard email messages [RFC2822]. The message body has the following format:

The length field always contains the decimal number "20480", since the size of Mixmaster packets is constant. An MD5 message digest [RFC1321] of the packet prior to Base-64 encoding is encoded in Base-64.

The packet itself is encoded in Base-64 encoding [RFC1421], with line-breaks every 40 characters.
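
The digest and Base-64 steps could be implemented as in the sketch below. Only the encoding is shown; the surrounding lines of the message body are not reproduced here, and the function name is an assumption.

```python
import base64
import hashlib

LINE_LENGTH = 40  # characters per Base-64 line


def encode_packet(packet: bytes):
    """Return (digest, lines): the Base-64 MD5 digest of the raw
    packet and the Base-64 packet broken into 40-character lines."""
    digest = base64.b64encode(hashlib.md5(packet).digest()).decode("ascii")
    encoded = base64.b64encode(packet).decode("ascii")
    lines = [encoded[i:i + LINE_LENGTH]
             for i in range(0, len(encoded), LINE_LENGTH)]
    return digest, lines
```

The digest is computed over the packet before Base-64 encoding, so a receiver decodes the lines first and then compares digests.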

Remailer public key files consist of a list of attributes and a public RSA key:

The attributes are listed in one line separated by spaces. Individual attributes must not contain whitespace, and are defined as follows:

identifier:

A human readable string identifying the remailer

address:

The remailer's Internet mail address

key ID:

Public key ID

version:

Software version number

capabilities:

Flags indicating additional remailer capabilities

validity date:

Date from which the key is valid

expiration date:

Date of the key's expiration

The identifier consists of lowercase alphanumeric characters, beginning with an alphabetic character. The identifier should be no more than eight characters in length.

The key ID is the MD5 message digest of the representation of the RSA public key (not including the length bytes). It is encoded as a hexadecimal string.

The version field consists of the protocol version number followed by a colon and the software version information, limited to the ASCII alphanumeric characters, plus dot (.) and dash (-). All implementations of the protocol specified here should prepend the software version with "2:". Existing implementations lacking a protocol version number imply protocol version 2.

The capabilities field is optional. It is a list of flags represented by a string of ASCII characters. Clients should ignore unknown flags. The following flags are defined:

The date fields are optional. They are ASCII date stamps in the format YYYY-MM-DD. The first date indicates the date from which the key is first valid; the second date indicates its expiration. If only one date is present, it is treated as the key creation date. (The date stamp implies 00:00 UTC).

The version, capabilities, and date fields must each be no longer than 125 characters.

The encoded key part consists of two bytes specifying the key length (1024 bits) in little-endian byte order, and of the RSA modulus and the public exponent in big-endian form using 128 bytes each, with preceding null bytes for the exponent if necessary. The packet is encoded in Base-64 [RFC1421], with line-breaks every 40 characters. Its length (258 bytes) is given as a decimal number.
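
The binary layout of the encoded key part, and the key ID derived from it, might be computed as in the following sketch. It follows the 1024-bit layout described in this section and assumes the key ID covers the modulus followed by the exponent; the function name is hypothetical.

```python
import hashlib
import struct

KEY_BITS = 1024
FIELD_BYTES = 128  # bytes used for each of modulus and exponent


def encode_public_key(modulus: int, exponent: int):
    """Return (blob, key_id) for an RSA public key.

    blob   -- 2 length bytes + modulus + exponent (258 bytes)
    key_id -- hexadecimal MD5 of the blob without the length bytes
    """
    # to_bytes left-pads with null bytes, as required for the exponent.
    n_bytes = modulus.to_bytes(FIELD_BYTES, "big")
    e_bytes = exponent.to_bytes(FIELD_BYTES, "big")
    blob = struct.pack("<H", KEY_BITS) + n_bytes + e_bytes
    key_id = hashlib.md5(n_bytes + e_bytes).hexdigest()
    return blob, key_id
```

The blob would then be Base-64 encoded with line breaks every 40 characters, as for packets.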

Digital signatures [RFC2440] should be used to ensure the authenticity of the key files.

Some remailers may understand multiple remailer protocols. In the interest of creating a unified anonymity set, remailers which speak multiple remailer protocols should attempt to remix messages that use the older protocols whenever possible.

When a remailer receives a message in the older protocol format, it should determine if the message destination is another remailer which also speaks the Mixmaster protocol. If the remailer knows the Mixmaster public key for the next hop, it should process the message normally, but instead of sending the message to its next hop, treat the processed message as opaque data which will comprise the body of a Mixmaster message. The remailer should then create a Mixmaster message with this body to be delivered to the next hop remailer.

Ensuring that a remailer's keyring contains up to date copies of the public keys for other remailers is the responsibility of the given remailer's operator. Utilities such as Echolot [Palfrader03] can be used to assist in automating this task.

If the remailer receives a Mixmaster message that, when decrypted, contains a message in an alternate protocol supported by the remailer, it should process the message as though it had initially been delivered in the alternate protocol format.

The existing remailer software understands a number of specific administrative commands. These commands are sent via the Subject: line of an e-mail to the email address of the remailer:

remailer-help:

Returns information about using the remailer. The remailer may support a suffix consisting of a dash and a two-letter ISO 639 language code. For example, remailer-help-ar will return a help file in Arabic, if available. Supported languages should be listed at the beginning of the "remailer-help" response.

remailer-key:

Returns the remailer's public key as described in Section 5. It may also return the keys and attributes of other remailers it knows about.

remailer-stats:

Returns information about the number of messages the remailer has processed per day (again, a day starts at 00:00 UTC).

remailer-conf:

Returns local configuration information such as software version, supported protocols, filtered headers, blocked newsgroups and domains, and the attribute strings for other remailers the remailer knows about.

Older versions of Mixmaster (2.0.4 through 2.9.0) allowed for the creation of dummy message cover traffic, but provided no automated means for introducing this dummy traffic into the system. Beginning in version 3.0, Mixmaster employs an internal dummy policy.

Beginning with version 3.0, Mixmaster offers automatic key rotation. Care must be taken to minimize the possibility for partitioning attacks during the key rotation window.

Keys are generated with a validity date and an expiration date. User agents should only display valid keys which have not expired.

Keys are valid for a 13 month period. A remailer must generate a new key when the existing key's expiration date is one month or less in the future. When queried, a remailer must report the most recently generated key as its key, effectively giving each key a 12 month service period.

Remailers must continue to decrypt and process mail encrypted to expired keys for one week past the expiration date on the key. One week after expiration, an expired remailer key should be securely destroyed.
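
The rotation rules above reduce to a few date comparisons. In this sketch "one month" is approximated as 30 days, which is an assumption, and the function names are illustrative.

```python
from datetime import date, timedelta

ROTATION_WINDOW = timedelta(days=30)  # ASSUMPTION: "one month" = 30 days
GRACE_PERIOD = timedelta(days=7)      # one week past expiration


def needs_new_key(expires: date, today: date) -> bool:
    """True once the current key expires in one month or less."""
    return expires - today <= ROTATION_WINDOW


def may_decrypt(expires: date, today: date) -> bool:
    """True while mail encrypted to this key must still be processed."""
    return today <= expires + GRACE_PERIOD


def should_destroy(expires: date, today: date) -> bool:
    """True once the grace period has passed and the key can be destroyed."""
    return today > expires + GRACE_PERIOD
```

During the final month two keys are live: the newer one is advertised, while the older one remains usable for decryption until a week after it expires.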

When anonymous messages are forwarded to third parties, remailer operators should be aware that senders might try to supply header fields that indicate a false identity or to send unauthorized Usenet control messages. This is a problem because many news servers accept control messages automatically without any authentication.

For these reasons, remailer software should allow the operator to disable certain types of message headers, and to insert headers automatically.

Remailers usually add a "From:" field containing an address controlled by the remailer operator to anonymous messages. Using the word "Anonymous" in the name field allows recipients to apply scoring mechanisms and filters to anonymous messages. Appropriate additional information about the origin of the message can be inserted in the "Comments:" header field of the anonymous messages.

Anonymous remailers are sometimes used to send harassing e-mail. To prevent this abuse, remailer software should allow operators to block destination addresses on request. Real-life abuse and attacks on anonymous remailers are discussed in [Mazieres98].

The security of the mix-net relies on the assumption that the underlying cryptographic primitives are secure. In addition, specific attacks on the mix-net need to be considered; [Moeller98] contains a more detailed analysis of these attacks.

Passive adversaries can observe some or all of the messages sent to mixes. The users' anonymity comes from the fact that a large number of messages are collected and sent in random order. For that reason remailers should collect as many messages as possible while keeping the delay acceptable.

Statistical traffic analysis is possible even if single messages are anonymized in a perfectly secure way: an eavesdropper may correlate the times of Mixmaster packets being sent and anonymized messages being received. This is a powerful attack if several anonymous messages can be linked together (by their contents or because they are sent under a pseudonym). To protect themselves, senders must mail Mixmaster packets stochastically independent of the actual messages they want to send. This can be done by sending packets at regular intervals, using a dummy message whenever appropriate. To avoid leaking information, the intervals should not be smaller than the randomness in the delay caused by trusted remailers.

There is no anonymity if all remailers in a given chain collude with the adversary, or if they are compromised during the lifetime of their keys. Using a longer chain increases the assurance that the user's privacy will be preserved, but at the same time causes lower reliability and higher latency. Sending redundant copies of a message increases reliability but may also facilitate attacks. An optimum must be found according to the individual security needs and trust in the remailers.

Active adversaries can also create, suppress or modify messages. Remailers must check the packet IDs to prevent replay attacks. To minimize the number of packet IDs that the remailer must retain, packets which bear a timestamp more than a reasonable number of days in the past may be discarded. Implementors should consider that packets may be up to three days younger than indicated by the timestamp, and select an expiration value which allows sufficient time for legitimate messages to pass through the network. The number of packet IDs that the remailer must retain can be further minimized by discarding packet IDs for packets encrypted to a key which has expired more than a week in the past.
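
A replay log with timestamp-based expiry could be structured as in the sketch below. The class name, the in-memory dictionary, and the default of seven days are assumptions; the text leaves the expiration value to the implementor.

```python
class PacketIdLog:
    """Remember processed packet IDs and reject replays."""

    def __init__(self, max_age_days=7):
        # ASSUMPTION: seven days; must leave room for the up to
        # three days that senders may subtract from the timestamp.
        self.max_age_days = max_age_days
        self._seen = {}  # packet ID -> timestamp (days since epoch)

    def accept(self, packet_id, timestamp_days, today_days):
        """Return True if the packet should be processed."""
        if today_days - timestamp_days > self.max_age_days:
            return False  # too old; its ID may already be forgotten
        if packet_id in self._seen:
            return False  # replay
        self._seen[packet_id] = timestamp_days
        return True

    def expire(self, today_days):
        """Forget IDs of packets that would now be rejected as too old."""
        stale = [pid for pid, day in self._seen.items()
                 if today_days - day > self.max_age_days]
        for pid in stale:
            del self._seen[pid]

    def __len__(self):
        return len(self._seen)
```

Forgetting an ID is safe only because any packet carrying the same timestamp would already fail the age check.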

The use of a link-level encryption protocol with an ephemeral key, such as STARTTLS with SMTP [RFC2487], provides for forward secrecy and further aids against replay attacks. Remailer operators should be encouraged to deploy such solutions at the MTA level whenever possible.

Early implementations of Mixmaster did not generate a timestamp packet. Implementors should be aware of the partitioning attack implications if they choose to permit processing of packets without timestamps. Mixmaster versions 2.0.5 and greater in the 2.0.x tree as well as Mixmaster 3.0 in the 3.x tree do not permit processing of such packets.

Message integrity must be verified to prevent the adversary from performing chosen ciphertext attacks or replay attacks with modified packet IDs, and from encoding information in an intercepted message in a way not affected by decryption (e.g. by modifying the message length or inducing errors). This version of the protocol does not provide integrity for the packet body. Because the padding for header sections is random, in this version of the protocol it is impossible for a remailer to check the integrity of the encrypted header sections that will be decrypted by the following remailers. Chosen ciphertext attacks and replay attacks are detected by verifying the message digest included in the header section.

The adversary can trace a message if he knows the decryption of all other messages that pass through the remailer at the same time. To make it less practical for an attacker to flood a mix with known messages, remailers can store received messages in a reordering pool that grows in size while more messages than average are received, and periodically choose at random a fixed fraction of the messages in the pool for processing. There is no complete protection against flooding attacks in an open system, but if the number of messages required is high, an attack is less likely to go unnoticed. Additional work has been done in the field of active flooding attack protection; future mix-net protocols may wish to take advantage of this work [Danezis03].

If the adversary suppresses all Mixmaster messages from one particular sender and observes that anonymous messages of a certain kind are discontinued at the same time, that sender's anonymity is compromised with high probability. There is no practical cryptographic protection against this attack in large-scale networks. The effect of a more powerful attack that combines suppressing messages and re-injecting them at a later time is reduced by using timestamps.

Manipulation of the distribution of remailer keys, capabilities, and statistics can lead to powerful attacks against a remailer network. Sensitive information such as this should be distributed in a secure manner.

The lack of accountability that comes with anonymity may have implications for the security of a network. For example, many news servers accept control messages automatically without any cryptographic authentication. Possible countermeasures are discussed in Section 6.5.