Network Working Group Y. Nir
Internet-Draft Check Point
Intended status: Informational A. Langley
Expires: August 24, 2015 Google Inc
February 20, 2015
ChaCha20 and Poly1305 for IETF protocols
draft-irtf-cfrg-chacha20-poly1305-10
Abstract
This document defines the ChaCha20 stream cipher, as well as the use
of the Poly1305 authenticator, both as stand-alone algorithms, and as
a "combined mode", or Authenticated Encryption with Associated Data
(AEAD) algorithm.
This document does not introduce any new crypto, but is meant to
serve as a stable reference and an implementation guide. It is a
product of the Crypto Forum Research Group (CFRG).
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on August 24, 2015.
Copyright Notice
Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
Nir & Langley Expires August 24, 2015 [Page 1]

Internet-Draft ChaCha20 & Poly1305 February 2015
1. Introduction
The Advanced Encryption Standard (AES - [FIPS-197]) has become the
gold standard in encryption. Its efficient design, widespread
implementation, and hardware support allow for high performance in
many areas. On most modern platforms, AES is anywhere from 4x to 10x
as fast as the previous most-used cipher, 3-key Data Encryption
Standard (3DES - [SP800-67]), which makes it not only the best
choice, but the only practical choice.
There are several problems with this. If future advances in
cryptanalysis reveal a weakness in AES, users will be in an
unenviable position. With the only other widely supported cipher
being the much slower 3DES, it is not feasible to re-configure
deployments to use 3DES. [Standby-Cipher] describes this issue and
the need for a standby cipher in greater detail. Another problem is
that while AES is very fast on dedicated hardware, its performance on
platforms that lack such hardware is considerably lower. Yet another
problem is that many AES implementations are vulnerable to cache-
collision timing attacks ([cache-collisions]).
This document provides a definition and implementation guide for
three algorithms:
1. The ChaCha20 cipher. This is a high-speed cipher first described
in [ChaCha]. It is considerably faster than AES in software-only
implementations, making it around three times as fast on
platforms that lack specialized AES hardware. See Appendix B for
some hard numbers. ChaCha20 is also not sensitive to timing
attacks (see the security considerations in Section 4). This
algorithm is described in Section 2.4.
2. The Poly1305 authenticator. This is a high-speed message
authentication code. Implementation is also straightforward and
easy to get right. The algorithm is described in Section 2.5.
3. The CHACHA20-POLY1305 Authenticated Encryption with Associated
Data (AEAD) construction, described in Section 2.8.
This document does not introduce these algorithms for the first
time. They have been defined in scientific papers by D. J.
Bernstein, which are referenced by this document. The purpose of
this document is to serve as a stable reference for IETF documents
making use of these algorithms.
These algorithms have undergone rigorous analysis. Several papers
discuss the security of Salsa and ChaCha ([LatinDances],
[LatinDances2], [Zhenqing2012]).
This document represents the consensus of the Crypto Forum Research
Group (CFRG).
1.1. Conventions Used in This Document
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
The description of the ChaCha algorithm will at various times refer to
the ChaCha state as a "vector" or as a "matrix". This follows the
use of these terms in Prof. Bernstein's paper. The matrix notation
is more visually convenient, and gives a better notion as to why some
rounds are called "column rounds" while others are called "diagonal
rounds". Here's a diagram of how the matrices relate to vectors
(using the C language convention of zero being the index origin).
0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
The elements in this vector or matrix are 32-bit unsigned integers.
The algorithm name is "ChaCha". "ChaCha20" is a specific instance
where 20 "rounds" (or 80 quarter rounds - see Section 2.1) are used.
Other variations are defined, with 8 or 12 rounds, but in this
document we only describe the 20-round ChaCha, so the names "ChaCha"
and "ChaCha20" will be used interchangeably.
2. The Algorithms
The subsections below describe the algorithms used and the AEAD
construction.
2.1. The ChaCha Quarter Round
The basic operation of the ChaCha algorithm is the quarter round. It
operates on four 32-bit unsigned integers, denoted a, b, c, and d.
The operation is as follows (in C-like notation):
1. a += b; d ^= a; d <<<= 16;
2. c += d; b ^= c; b <<<= 12;
3. a += b; d ^= a; d <<<= 8;
4. c += d; b ^= c; b <<<= 7;
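As an illustration only, the four steps above can be sketched in Python. The names rotl32 and quarter_round are chosen for this sketch and are not part of the specification. The sketch assumes that "+=" wraps modulo 2^32 and that "<<<=" is a left rotation of a 32-bit word.

```python
MASK32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate the 32-bit word x left by n bits (0 < n < 32)."""
    return ((x << n) & MASK32) | (x >> (32 - n))

def quarter_round(a, b, c, d):
    """Apply the four quarter-round steps to four 32-bit words."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d
```

Returning a new tuple rather than mutating the inputs keeps the sketch easy to test in isolation.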
Note that this run of quarter round is part of what is called a
"column round".
2.2.1. Test Vector for the Quarter Round on the ChaCha state
For a test vector, we will use a ChaCha state that was generated
randomly:
Sample ChaCha State
879531e0 c5ecf37d 516461b1 c9a62f8a
44c20ef3 3390af7f d9fc690b 2a5f714c
53372767 b00a5631 974c541a 359e9963
5c971061 3d631689 2098d9d6 91dbd320
We will apply the QUARTERROUND(2,7,8,13) operation to this state.
For obvious reasons, this one is part of what is called a "diagonal
round":
After applying QUARTERROUND(2,7,8,13)
879531e0 c5ecf37d *bdb886dc c9a62f8a
44c20ef3 3390af7f d9fc690b *cfacafd2
*e46bea80 b00a5631 974c541a 359e9963
5c971061 *ccc07c79 2098d9d6 91dbd320
Note that only the numbers in positions 2, 7, 8, and 13 changed.
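The indexed form QUARTERROUND(x,y,z,w) can be sketched as an in-place update of a 16-word list. This is a minimal Python illustration applied to the sample state above; the function name quarterround and the variable names are assumptions of the sketch.

```python
def rotl32(x, n):
    """Rotate the 32-bit word x left by n bits."""
    return ((x << n) & 0xFFFFFFFF) | (x >> (32 - n))

def quarterround(state, x, y, z, w):
    """Run the quarter round in place on state[x], state[y], state[z], state[w]."""
    a, b, c, d = state[x], state[y], state[z], state[w]
    a = (a + b) & 0xFFFFFFFF; d = rotl32(d ^ a, 16)
    c = (c + d) & 0xFFFFFFFF; b = rotl32(b ^ c, 12)
    a = (a + b) & 0xFFFFFFFF; d = rotl32(d ^ a, 8)
    c = (c + d) & 0xFFFFFFFF; b = rotl32(b ^ c, 7)
    state[x], state[y], state[z], state[w] = a, b, c, d

# The sample ChaCha state from this section, in vector (row-major) order.
state = [0x879531e0, 0xc5ecf37d, 0x516461b1, 0xc9a62f8a,
         0x44c20ef3, 0x3390af7f, 0xd9fc690b, 0x2a5f714c,
         0x53372767, 0xb00a5631, 0x974c541a, 0x359e9963,
         0x5c971061, 0x3d631689, 0x2098d9d6, 0x91dbd320]
before = list(state)
quarterround(state, 2, 7, 8, 13)

# Indices whose value differs from the original state.
changed = [i for i in range(16) if state[i] != before[i]]
```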
2.3. The ChaCha20 block Function
The ChaCha block function transforms a ChaCha state by running
multiple quarter rounds.
The inputs to ChaCha20 are:
o A 256-bit key, treated as a concatenation of 8 32-bit little-
endian integers.
o A 96-bit nonce, treated as a concatenation of 3 32-bit little-
endian integers.
o A 32-bit block count parameter, treated as a 32-bit little-endian
integer.
The output is 64 random-looking bytes.
The ChaCha algorithm described here uses a 256-bit key. The original
algorithm also specified 128-bit keys and 8- and 12-round variants,
but these are out of scope for this document. In this section we
describe the ChaCha block function.
Note also that the original ChaCha had a 64-bit nonce and 64-bit
block count. We have modified this here to be more consistent with
recommendations in section 3.2 of [RFC5116]. This limits the use of
a single (key,nonce) combination to 2^32 blocks, or 256 GB, but that
is enough for most uses. In cases where a single key is used by
multiple senders, it is important to make sure that they don't use
the same nonces. This can be assured by partitioning the nonce space
so that the first 32 bits are unique per sender, while the other 64
bits come from a counter.
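One possible way to partition the nonce space as described is sketched below in Python. The function name make_nonce, the parameter names, and the little-endian encoding of both fields are assumptions of this sketch; a protocol using this scheme would have to fix the encoding itself.

```python
import struct

def make_nonce(sender_id, message_counter):
    """Build a 96-bit nonce: a 32-bit sender identifier, then a 64-bit counter."""
    if not 0 <= sender_id < 2**32:
        raise ValueError("sender_id must fit in 32 bits")
    if not 0 <= message_counter < 2**64:
        raise ValueError("counter exhausted; a new key is needed")
    # "<" selects little-endian with no padding: 4 + 8 = 12 bytes.
    return struct.pack("<IQ", sender_id, message_counter)

nonce_a = make_nonce(1, 0)   # first message from sender 1
nonce_b = make_nonce(2, 0)   # first message from sender 2
```

Because the sender identifier occupies its own field, two senders never produce the same nonce even when their counters hold the same value.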
The ChaCha20 state is initialized as follows:
o The first 4 words (0-3) are constants: 0x61707865, 0x3320646e,
0x79622d32, 0x6b206574.
o The next 8 words (4-11) are taken from the 256-bit key by reading
the bytes in little-endian order, in 4-byte chunks.
o Word 12 is a block counter. Since each block is 64 bytes, a 32-bit
word is enough for 256 gigabytes of data.
o Words 13-15 are a nonce, which should not be repeated for the same
key. The 13th word is the first 32 bits of the input nonce taken
as a little-endian integer, while the 15th word is the last 32
bits.
cccccccc cccccccc cccccccc cccccccc
kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk
kkkkkkkk kkkkkkkk kkkkkkkk kkkkkkkk
bbbbbbbb nnnnnnnn nnnnnnnn nnnnnnnn
c=constant k=key b=blockcount n=nonce
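The state layout above can be sketched in Python as follows. The helper name init_state and the example inputs (a key of bytes 00 through 1f, block count 1, and an arbitrary nonce) are assumptions of this sketch.

```python
import struct

CONSTANTS = (0x61707865, 0x3320646e, 0x79622d32, 0x6b206574)

def init_state(key, counter, nonce):
    """Return the initial ChaCha20 state as a list of 16 32-bit words."""
    if len(key) != 32 or len(nonce) != 12:
        raise ValueError("need a 32-byte key and a 12-byte nonce")
    return (list(CONSTANTS)                       # words 0-3
            + list(struct.unpack("<8I", key))     # words 4-11, little-endian
            + [counter & 0xFFFFFFFF]              # word 12
            + list(struct.unpack("<3I", nonce)))  # words 13-15, little-endian

key = bytes(range(32))
nonce = bytes.fromhex("000000090000004a00000000")
state = init_state(key, 1, nonce)

# Format the state as four rows of four words, matching the matrix view.
rows = [" ".join(f"{w:08x}" for w in state[i:i + 4]) for i in range(0, 16, 4)]
```

Serialized in little-endian order, the four constant words are the ASCII string "expand 32-byte k".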
ChaCha20 runs 20 rounds, alternating between "column" and "diagonal"
rounds. Each round is 4 quarter-rounds, and they are run as follows.
Quarter-rounds 1-4 are part of a "column" round, while 5-8 are part
of a "diagonal" round:
1. QUARTERROUND ( 0, 4, 8,12)
2. QUARTERROUND ( 1, 5, 9,13)
3. QUARTERROUND ( 2, 6,10,14)
4. QUARTERROUND ( 3, 7,11,15)
5. QUARTERROUND ( 0, 5,10,15)
6. QUARTERROUND ( 1, 6,11,12)
7. QUARTERROUND ( 2, 7, 8,13)
8. QUARTERROUND ( 3, 4, 9,14)
At the end of 20 rounds (or 10 iterations of the above list), we add
the original input words to the output words, and serialize the
result by sequencing the words one-by-one in little-endian order.
Note: "addition" in the above paragraph is done modulo 2^32. In some
machine languages this is called carryless addition on a 32-bit word.
2.3.1. The ChaCha20 Block Function in Pseudo-Code
Note: This section and a few others contain pseudo-code for the
algorithm explained in a previous section. Every effort was made for
the pseudo-code to accurately reflect the algorithm as described in
the preceding section. If a conflict is still present, the textual
explanation and the test vectors are normative.
inner_block (state):
   Qround(state, 0, 4, 8,12)
   Qround(state, 1, 5, 9,13)
   Qround(state, 2, 6,10,14)
   Qround(state, 3, 7,11,15)
   Qround(state, 0, 5,10,15)
   Qround(state, 1, 6,11,12)
   Qround(state, 2, 7, 8,13)
   Qround(state, 3, 4, 9,14)
   end

chacha20_block(key, counter, nonce):
   state = constants | key | counter | nonce
   working_state = state
   for i=1 upto 10
      inner_block(working_state)
      end
   state += working_state
   return serialize(state)
   end
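For readers who prefer runnable code, the pseudo-code above might be rendered in Python as follows. This is a sketch, not a reference implementation: the helper names are chosen here, and no attempt is made at performance or side-channel hardening.

```python
import struct

M32 = 0xFFFFFFFF

def rotl32(x, n):
    """Rotate the 32-bit word x left by n bits."""
    return ((x << n) & M32) | (x >> (32 - n))

def qround(s, a, b, c, d):
    """Quarter round, in place, on the words of s at indices a, b, c, d."""
    s[a] = (s[a] + s[b]) & M32; s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & M32; s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & M32; s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & M32; s[b] = rotl32(s[b] ^ s[c], 7)

def inner_block(s):
    """One column round followed by one diagonal round."""
    qround(s, 0, 4, 8, 12)
    qround(s, 1, 5, 9, 13)
    qround(s, 2, 6, 10, 14)
    qround(s, 3, 7, 11, 15)
    qround(s, 0, 5, 10, 15)
    qround(s, 1, 6, 11, 12)
    qround(s, 2, 7, 8, 13)
    qround(s, 3, 4, 9, 14)

def chacha20_block(key, counter, nonce):
    """Return the 64-byte block for a 32-byte key, block count, 12-byte nonce."""
    state = [0x61707865, 0x3320646e, 0x79622d32, 0x6b206574,
             *struct.unpack("<8I", key), counter & M32,
             *struct.unpack("<3I", nonce)]
    working_state = list(state)
    for _ in range(10):
        inner_block(working_state)
    # Add the original input words to the output words, modulo 2^32.
    out = [(x + y) & M32 for x, y in zip(state, working_state)]
    return struct.pack("<16I", *out)   # serialize little-endian

block = chacha20_block(bytes(range(32)), 1,
                       bytes.fromhex("000000090000004a00000000"))
```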
2.3.2. Test Vector for the ChaCha20 Block Function
For a test vector, we will use the following inputs to the ChaCha20
block function:
o Key = 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f:10:11:12:13:
14:15:16:17:18:19:1a:1b:1c:1d:1e:1f. The key is a sequence of
octets with no particular structure before we copy it into the
ChaCha state.
o Nonce = (00:00:00:09:00:00:00:4a:00:00:00:00)
o Block Count = 1.
After setting up the ChaCha state, it looks like this:
next block, saving some memory. There is no requirement for the
plaintext to be an integral multiple of 512 bits. If there is extra
keystream from the last block, it is discarded. Specific protocols
MAY require that the plaintext and ciphertext have a certain length.
Such protocols need to specify how the plaintext is padded, and how
much padding it receives.
The inputs to ChaCha20 are:
o A 256-bit key
o A 32-bit initial counter. This can be set to any number, but will
usually be zero or one. It makes sense to use 1 if we use the
zero block for something else, such as generating a one-time
authenticator key as part of an AEAD algorithm.
o A 96-bit nonce. In some protocols, this is known as the
Initialization Vector.
o An arbitrary-length plaintext
The output is an encrypted message, or "ciphertext" of the same
length.
Decryption is done in the same way. The ChaCha20 block function is
used to expand the key into a key stream, which is XOR-ed with the
ciphertext giving back the plaintext.
2.4.1. The ChaCha20 encryption algorithm in Pseudo-Code
chacha20_encrypt(key, counter, nonce, plaintext):
   for j = 0 upto floor(len(plaintext)/64)-1
      key_stream = chacha20_block(key, counter+j, nonce)
      block = plaintext[(j*64)..(j*64+63)]
      encrypted_message += block ^ key_stream
      end
   if ((len(plaintext) % 64) != 0)
      j = floor(len(plaintext)/64)
      key_stream = chacha20_block(key, counter+j, nonce)
      block = plaintext[(j*64)..len(plaintext)-1]
      encrypted_message += (block^key_stream)[0..(len(plaintext)%64)-1]
      end
   return encrypted_message
   end
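A runnable Python sketch of the encryption function follows. It embeds a compact version of the block function so that it is self-contained. The names QR_INDICES and M32 are chosen for this sketch, and the overflow check on the block counter is an addition of the sketch rather than something the pseudo-code states.

```python
import struct

M32 = 0xFFFFFFFF
QR_INDICES = [(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
              (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)]

def chacha20_block(key, counter, nonce):
    init = [0x61707865, 0x3320646e, 0x79622d32, 0x6b206574,
            *struct.unpack("<8I", key), counter & M32,
            *struct.unpack("<3I", nonce)]
    s = list(init)
    for _ in range(10):
        for a, b, c, d in QR_INDICES:
            # Each step is: s[x] += s[y]; s[z] ^= s[x]; s[z] <<<= rot
            for x, y, z, rot in ((a, b, d, 16), (c, d, b, 12),
                                 (a, b, d, 8), (c, d, b, 7)):
                s[x] = (s[x] + s[y]) & M32
                v = s[z] ^ s[x]
                s[z] = ((v << rot) & M32) | (v >> (32 - rot))
    return struct.pack("<16I", *((v + w) & M32 for v, w in zip(s, init)))

def chacha20_encrypt(key, counter, nonce, plaintext):
    """XOR plaintext with the keystream; the same call also decrypts."""
    n_blocks = (len(plaintext) + 63) // 64
    if counter + n_blocks > 2**32:
        raise ValueError("message would overflow the 32-bit block counter")
    out = bytearray()
    for j in range(n_blocks):
        key_stream = chacha20_block(key, counter + j, nonce)
        block = plaintext[j * 64:(j + 1) * 64]
        # zip() stops at the shorter input, so extra keystream is discarded.
        out += bytes(p ^ k for p, k in zip(block, key_stream))
    return bytes(out)
```

Because encryption is a plain XOR with the keystream, calling chacha20_encrypt on a ciphertext with the same key, counter, and nonce returns the plaintext.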
2.4.2. Example and Test Vector for the ChaCha20 Cipher
For a test vector, we will use the following inputs to the ChaCha20
block function:
o Key = 00:01:02:03:04:05:06:07:08:09:0a:0b:0c:0d:0e:0f:10:11:12:13:
14:15:16:17:18:19:1a:1b:1c:1d:1e:1f.
o r[3], r[7], r[11], and r[15] are required to have their top four
bits clear (be smaller than 16)
o r[4], r[8], and r[12] are required to have their bottom two bits
clear (be divisible by 4)
The following sample code clamps "r" to be appropriate:
/*
Adapted from poly1305aes_test_clamp.c version 20050207
D. J. Bernstein
Public domain.
*/

#include "poly1305aes_test.h"

void poly1305aes_test_clamp(unsigned char r[16])
{
  r[3] &= 15;
  r[7] &= 15;
  r[11] &= 15;
  r[15] &= 15;
  r[4] &= 252;
  r[8] &= 252;
  r[12] &= 252;
}
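When "r" is handled as a single little-endian integer, the same clamping can be expressed as one mask. The Python sketch below shows both forms side by side; the function names are chosen here, and the sample octets are arbitrary.

```python
CLAMP_MASK = 0x0ffffffc0ffffffc0ffffffc0fffffff

def clamp_bytes(r):
    """Clamp 16 octets one at a time, mirroring the C sample above."""
    r = bytearray(r)
    for i in (3, 7, 11, 15):
        r[i] &= 15    # clear the top four bits
    for i in (4, 8, 12):
        r[i] &= 252   # clear the bottom two bits
    return bytes(r)

def clamp_int(r):
    """Clamp r given as a 128-bit integer read in little-endian order."""
    return r & CLAMP_MASK

raw = bytes.fromhex("85d6be7857556d337f4452fe42d506a8")
via_bytes = int.from_bytes(clamp_bytes(raw), "little")
via_mask = clamp_int(int.from_bytes(raw, "little"))
```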
The "s" should be unpredictable, but it is perfectly acceptable to
generate both "r" and "s" uniquely each time. Because each of them
is 128-bit, pseudo-randomly generating them (see Section 2.6) is also
acceptable.
The inputs to Poly1305 are:
o A 256-bit one-time key
o An arbitrary length message
The output is a 128-bit tag.
First, the "r" value should be clamped.
Next, set the constant prime "P" to 2^130-5:
3fffffffffffffffffffffffffffffffb. Also set a variable "accumulator"
to zero.
Next, divide the message into 16-byte blocks. The last one might be
shorter:
o Read the block as a little-endian number.
o Add one bit beyond the number of octets. For a 16-byte block this
is equivalent to adding 2^128 to the number. For a shorter
block it can be 2^120, 2^112, or any power of two whose exponent is
a multiple of 8, all the way down to 2^8.
o If the block is not 17 bytes long (the last block), pad it with
zeros. This is meaningless if you are treating the blocks as
numbers.
o Add this number to the accumulator.
o Multiply by "r".
o Set the accumulator to the result modulo P. To summarize: Acc =
((Acc+block)*r) % P.
Finally, the value of the secret key "s" is added to the accumulator,
and the 128 least significant bits are serialized in little-endian
order to form the tag.
2.5.1. The Poly1305 Algorithms in Pseudo-Code
clamp(r): r &= 0x0ffffffc0ffffffc0ffffffc0fffffff

poly1305_mac(msg, key):
   r = le_bytes_to_num(key[0..15])
   clamp(r)
   s = le_bytes_to_num(key[16..31])
   a = 0  /* a is the accumulator */
   p = (1<<130)-5
   for i=1 upto ceil(msg length in bytes / 16)
      n = le_bytes_to_num(msg[((i-1)*16)..(i*16)-1] | [0x01])
      a += n
      a = (r * a) % p
      end
   a += s
   return num_to_16_le_bytes(a)
   end
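The pseudo-code translates almost line for line into Python, whose integers are arbitrary precision. The sketch below is illustrative only: Python's big-integer arithmetic is not constant time, so it is unsuitable wherever timing side channels matter.

```python
P = (1 << 130) - 5
CLAMP_MASK = 0x0ffffffc0ffffffc0ffffffc0fffffff

def poly1305_mac(msg, key):
    """Return the 16-byte Poly1305 tag of msg under a 32-byte one-time key."""
    if len(key) != 32:
        raise ValueError("Poly1305 needs a 32-byte one-time key")
    r = int.from_bytes(key[:16], "little") & CLAMP_MASK   # clamped r
    s = int.from_bytes(key[16:], "little")
    acc = 0
    for i in range(0, len(msg), 16):
        # Appending the octet 0x01 sets one bit just beyond the block,
        # which is 2^128 for a full block and less for a short last block.
        n = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = ((acc + n) * r) % P
    # Add s and keep only the 128 least significant bits.
    return ((acc + s) & ((1 << 128) - 1)).to_bytes(16, "little")
```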
2.5.2. Poly1305 Example and Test Vector
For our example, we will dispense with generating the one-time key
using AES, and assume that we got the following keying material:
o Key Material: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8:
01:03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b
o s as an octet string:
01:03:80:8a:fb:0d:b2:fd:4a:bf:f6:af:41:49:f5:1b
o s as a 128-bit number: 1bf54941aff6bf4afdb20dfb8a800301
o r before clamping: 85:d6:be:78:57:55:6d:33:7f:44:52:fe:42:d5:06:a8
o Clamped r as a number: 806d5400e52447c036d555408bed685.
For our message, we'll use a short text:
2.6. Generating the Poly1305 key using ChaCha20
As said in Section 2.5, it is acceptable to generate the one-time
Poly1305 key pseudo-randomly. This section proposes such a method.
To generate such a key pair (r,s), we will use the ChaCha20 block
function described in Section 2.3. This assumes that we have a
256-bit session key for the MAC function, such as SK_ai and SK_ar in
IKEv2 ([RFC7296]), the integrity key in ESP and AH, or the
client_write_MAC_key and server_write_MAC_key in TLS. Any document
that specifies the use of Poly1305 as a MAC algorithm for some
protocol must specify that 256 bits are allocated for the integrity
key. Note that in the AEAD construction defined in Section 2.8, the
same key is used for encryption and key generation, so the use of
SK_a* or *_write_MAC_key is only for stand-alone Poly1305.
The method is to call the block function with the following
parameters:
o The 256-bit session integrity key is used as the ChaCha20 key.
o The block counter is set to zero.
o The protocol will specify a 96-bit or 64-bit nonce. This MUST be
unique per invocation with the same key, so it MUST NOT be
randomly generated. A counter is a good way to implement this,
but other methods, such as a Linear Feedback Shift Register (LFSR)
are also acceptable. ChaCha20 as specified here requires a 96-bit
nonce, so if the provided nonce is only 64 bits long, the first 32
bits of the nonce will be set to a constant number. This will
usually be zero. For protocols with multiple senders it may be
different for each sender, but it should be the same for all
invocations of the function with the same key by a particular
sender.
After running the block function, we have a 512-bit state. We take
the first 256 bits of the serialized state, and use those as the one-
time Poly1305 key: The first 128 bits are clamped, and form "r",
while the next 128 bits become "s". The other 256 bits are
discarded.
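The key generation procedure can be sketched in Python as follows. The block function is repeated in compact form so the example is self-contained. The name poly1305_key_gen and the choice of four zero octets as the constant prefix for 64-bit nonces are assumptions of this sketch. Clamping of "r" is left to the Poly1305 function itself.

```python
import struct

M32 = 0xFFFFFFFF
QR_INDICES = [(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
              (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)]

def chacha20_block(key, counter, nonce):
    init = [0x61707865, 0x3320646e, 0x79622d32, 0x6b206574,
            *struct.unpack("<8I", key), counter & M32,
            *struct.unpack("<3I", nonce)]
    s = list(init)
    for _ in range(10):
        for a, b, c, d in QR_INDICES:
            # Each step is: s[x] += s[y]; s[z] ^= s[x]; s[z] <<<= rot
            for x, y, z, rot in ((a, b, d, 16), (c, d, b, 12),
                                 (a, b, d, 8), (c, d, b, 7)):
                s[x] = (s[x] + s[y]) & M32
                v = s[z] ^ s[x]
                s[z] = ((v << rot) & M32) | (v >> (32 - rot))
    return struct.pack("<16I", *((v + w) & M32 for v, w in zip(s, init)))

def poly1305_key_gen(key, nonce, prefix=b"\x00\x00\x00\x00"):
    """Derive the 32-byte one-time Poly1305 key (r | s) for this nonce."""
    if len(nonce) == 8:
        nonce = prefix + nonce        # widen a 64-bit nonce to 96 bits
    if len(nonce) != 12:
        raise ValueError("nonce must be 64 or 96 bits long")
    block = chacha20_block(key, 0, nonce)   # block counter is zero
    return block[:32]                       # the other 256 bits are discarded
```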
Note that while many protocols have provisions for a nonce for
encryption algorithms (often called Initialization Vectors, or IVs),
they usually don't have such a provision for the MAC function. In
that case the per-invocation nonce will have to come from somewhere
else, such as a message counter.
Poly1305 is not a suitable choice for a PRF. Poly1305 prohibits
using the same key twice, whereas the PRF in IKEv2 is used multiple
times with the same key. Additionally, unlike HMAC, Poly1305 is
biased, so using it for key derivation would reduce the security of
the symmetric encryption.
ChaCha20 could be used as a key-derivation function, by generating an
arbitrarily long keystream. However, that is not what protocols such
as IKEv2 require.
For this reason, this document does not specify a PRF, and recommends
that crypto suites use some other PRF such as PRF_HMAC_SHA2_256
(section 2.1.2 of [RFC4868]).
2.8. AEAD Construction
AEAD_CHACHA20_POLY1305 is an authenticated encryption with associated
data algorithm. The inputs to AEAD_CHACHA20_POLY1305 are:
o A 256-bit key
o A 96-bit nonce - different for each invocation with the same key.
o An arbitrary length plaintext
o Arbitrary length additional authenticated data (AAD)
Some protocols may have unique per-invocation inputs that are not
96 bits in length. For example, IPsec may specify a 64-bit nonce. In
such a case, it is up to the protocol document to define how to
transform the protocol nonce into a 96-bit nonce, for example by
concatenating a constant value.
The ChaCha20 and Poly1305 primitives are combined into an AEAD that
takes a 256-bit key and 96-bit nonce as follows:
o First, a Poly1305 one-time key is generated from the 256-bit key
and nonce using the procedure described in Section 2.6.
o Next, the ChaCha20 encryption function is called to encrypt the
plaintext, using the same key and nonce, and with the initial
counter set to 1.
o Finally, the Poly1305 function is called with the Poly1305 key
calculated above, and a message constructed as a concatenation of
the following:
* The AAD
* padding1 - the padding is up to 15 zero bytes, and it brings
the total length so far to an integral multiple of 16. If the
length of the AAD was already an integral multiple of 16 bytes,
this field is zero-length.
* The ciphertext
* padding2 - the padding is up to 15 zero bytes, and it brings
the total length so far to an integral multiple of 16. If the
length of the ciphertext was already an integral multiple of 16
bytes, this field is zero-length.
* The length of the additional data in octets (as a 64-bit
little-endian integer).
* The length of the ciphertext in octets (as a 64-bit little-
endian integer).
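The message passed to Poly1305 can be assembled as in the following Python sketch. The function names pad16 and build_mac_data are chosen here for illustration, and the example AAD and ciphertext are arbitrary placeholder octets.

```python
import struct

def pad16(data):
    """Return the zero bytes that bring len(data) to a multiple of 16."""
    return b"\x00" * (-len(data) % 16)

def build_mac_data(aad, ciphertext):
    """Concatenate AAD, padding1, ciphertext, padding2 and both lengths."""
    return (aad + pad16(aad)
            + ciphertext + pad16(ciphertext)
            + struct.pack("<Q", len(aad))          # 64-bit little-endian
            + struct.pack("<Q", len(ciphertext)))  # 64-bit little-endian

mac_data = build_mac_data(b"\x50" * 12, b"\xaa" * 114)
```

With a 12-byte AAD and a 114-byte ciphertext, padding1 is 4 bytes and padding2 is 14 bytes, so the buffer is 16 + 128 + 16 = 160 bytes long.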
The output from the AEAD is twofold:
o A ciphertext of the same length as the plaintext.
o A 128-bit tag, which is the output of the Poly1305 function.
Decryption is similar with the following differences:
o The roles of ciphertext and plaintext are reversed, so the
ChaCha20 encryption function is applied to the ciphertext,
producing the plaintext.
o The Poly1305 function is still run on the AAD and the ciphertext,
not the plaintext.
o The calculated tag is bitwise compared to the received tag. The
message is authenticated if and only if the tags match.
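Putting the pieces together, the construction might be sketched in Python as below. All function names are chosen for this sketch. Checking the tag before decrypting is a design choice of the sketch, not something this section mandates. The sketch relies on Python big integers and is therefore not constant time in the Poly1305 arithmetic.

```python
import hmac
import struct

M32 = 0xFFFFFFFF
QR_INDICES = [(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15),
              (0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)]

def chacha20_block(key, counter, nonce):
    init = [0x61707865, 0x3320646e, 0x79622d32, 0x6b206574,
            *struct.unpack("<8I", key), counter & M32,
            *struct.unpack("<3I", nonce)]
    s = list(init)
    for _ in range(10):
        for a, b, c, d in QR_INDICES:
            for x, y, z, rot in ((a, b, d, 16), (c, d, b, 12),
                                 (a, b, d, 8), (c, d, b, 7)):
                s[x] = (s[x] + s[y]) & M32
                v = s[z] ^ s[x]
                s[z] = ((v << rot) & M32) | (v >> (32 - rot))
    return struct.pack("<16I", *((v + w) & M32 for v, w in zip(s, init)))

def chacha20_xor(key, counter, nonce, data):
    out = bytearray()
    for j in range(0, len(data), 64):
        ks = chacha20_block(key, counter + j // 64, nonce)
        out += bytes(p ^ k for p, k in zip(data[j:j + 64], ks))
    return bytes(out)

def poly1305_mac(msg, key):
    r = int.from_bytes(key[:16], "little") & 0x0ffffffc0ffffffc0ffffffc0fffffff
    s = int.from_bytes(key[16:32], "little")
    acc = 0
    for i in range(0, len(msg), 16):
        n = int.from_bytes(msg[i:i + 16] + b"\x01", "little")
        acc = ((acc + n) * r) % ((1 << 130) - 5)
    return ((acc + s) & ((1 << 128) - 1)).to_bytes(16, "little")

def mac_data(aad, ct):
    pad = lambda b: b"\x00" * (-len(b) % 16)
    return aad + pad(aad) + ct + pad(ct) + struct.pack("<QQ", len(aad), len(ct))

def aead_encrypt(key, nonce, plaintext, aad):
    """Return (ciphertext, tag)."""
    otk = chacha20_block(key, 0, nonce)[:32]      # one-time Poly1305 key
    ct = chacha20_xor(key, 1, nonce, plaintext)   # initial counter is 1
    return ct, poly1305_mac(mac_data(aad, ct), otk)

def aead_decrypt(key, nonce, ciphertext, aad, tag):
    """Return the plaintext, or raise ValueError if the tag does not match."""
    otk = chacha20_block(key, 0, nonce)[:32]
    expected = poly1305_mac(mac_data(aad, ciphertext), otk)
    if not hmac.compare_digest(expected, tag):    # constant-time comparison
        raise ValueError("authentication failed")
    return chacha20_xor(key, 1, nonce, ciphertext)
```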
A few notes about this design:
1. The amount of encrypted data possible in a single invocation is
2^32-1 blocks of 64 bytes each, because of the size of the block
counter field in the ChaCha20 block function. This gives a total
of 274,877,906,880 bytes, or nearly 256 GB. This should be
enough for traffic protocols such as IPsec and TLS, but may be
too small for file and/or disk encryption. For such uses, we can
return to the original design, reduce the nonce to 64 bits, and
use the integer at position 13 as the top 32 bits of a 64-bit
block counter, increasing the total message size to over a
million petabytes (1,180,591,620,717,411,303,360 bytes to be
exact).
2. Despite the previous item, the ciphertext length field in the
construction of the buffer on which Poly1305 runs limits the
ciphertext (and hence, the plaintext) size to 2^64 bytes, or
sixteen thousand petabytes (18,446,744,073,709,551,616 bytes to
be exact).
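The size limits in the two notes above follow directly from the field widths. The short Python calculation below reproduces them; the variable names are chosen for this sketch.

```python
BLOCK_BYTES = 64

# 32-bit block counter, with block zero reserved for the Poly1305 key.
limit_32bit_counter = (2**32 - 1) * BLOCK_BYTES

# Original design with a 64-bit block counter, computed the same way.
limit_64bit_counter = (2**64 - 1) * BLOCK_BYTES

# 64-bit ciphertext length field in the Poly1305 input.
limit_length_field = 2**64

for name, value in (("32-bit counter", limit_32bit_counter),
                    ("64-bit counter", limit_64bit_counter),
                    ("length field", limit_length_field)):
    print(f"{name}: {value:,} bytes")
```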
The AEAD construction in this section is a novel composition of
ChaCha20 and Poly1305. A security analysis of this composition is
given in [Procter].
state, but only to increment the block counter. This saves
approximately 5.5% of the cycles.
It is not recommended to use a generic big number library such as the
one in OpenSSL for the arithmetic operations in Poly1305. Such
libraries use dynamic allocation to be able to handle any-sized
integer, but that flexibility comes at the expense of performance as
well as side-channel security. More efficient implementations that
run in constant time are available, one of them in D. J.
Bernstein's own library, NaCl ([NaCl]). A constant-time but not
optimal approach would be to naively implement the arithmetic
operations for 288-bit integers, because even a naive
implementation will not exceed 2^288 in the multiplication of
(acc+block) and r. An efficient constant-time implementation can be
found in the public domain library poly1305-donna ([poly1305_donna]).
4. Security Considerations
The ChaCha20 cipher is designed to provide 256-bit security.
The Poly1305 authenticator is designed to ensure that forged messages
are rejected with a probability of 1-(n/(2^102)) for a 16n-byte
message, even after sending 2^64 legitimate messages, so it is SUF-
CMA in the terminology of [AE].
Proving the security of either of these is beyond the scope of this
document. Such proofs are available in the referenced academic
papers ([ChaCha],[Poly1305],[LatinDances], [LatinDances2], and
[Zhenqing2012]).
The most important security consideration in implementing this draft
is the uniqueness of the nonce used in ChaCha20. Counters and LFSRs
are both acceptable ways of generating unique nonces, as is
encrypting a counter using a 64-bit cipher such as DES. Note that it
is not acceptable to use a truncation of a counter encrypted with a
128-bit or 256-bit cipher, because such a truncation may repeat after
a short time.
Consequences of repeating a nonce: If a nonce is repeated, then both
the one-time Poly1305 key and the key-stream are identical between
the messages. This reveals the XOR of the plaintexts, because the
XOR of the plaintexts is equal to the XOR of the ciphertexts.
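The leak can be demonstrated with any keystream, since it depends only on the XOR structure. In the Python sketch below, a fixed byte sequence stands in for the ChaCha20 keystream produced by a repeated (key, nonce) pair; the messages and names are invented for the illustration.

```python
def xor_bytes(a, b):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

# Stand-in for the keystream that a repeated (key, nonce) pair would yield.
key_stream = bytes((37 * i + 11) % 256 for i in range(32))

p1 = b"transfer 100 to alice"
p2 = b"transfer 999 to carol"
c1 = xor_bytes(p1, key_stream)
c2 = xor_bytes(p2, key_stream)

# The keystream cancels out: an eavesdropper learns p1 XOR p2.
leak = xor_bytes(c1, c2)

# Anyone who knows or guesses p1 then recovers p2 without the key.
recovered_p2 = xor_bytes(leak, p1)
```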
The Poly1305 key MUST be unpredictable to an attacker. Randomly
generating the key would fulfill this requirement, except that
Poly1305 is often used in communications protocols, so the receiver
should know the key. Pseudo-random number generation such as by
encrypting a counter is acceptable. Using ChaCha with a secret key
and a nonce is also acceptable.
The algorithms presented here were designed to be easy to implement
in constant time to avoid side-channel vulnerabilities. The
operations used in ChaCha20 are all additions, XORs, and fixed
rotations. All of these can and should be implemented in constant
time. Access to offsets into the ChaCha state and the number of
operations do not depend on any property of the key, eliminating the
chance of information about the key leaking through the timing of
cache misses.
For Poly1305, the operations are addition, multiplication and
modulus, all on >128-bit numbers. This can be done in constant time,
but a naive implementation (such as using some generic big number
library) will not be constant time. For example, if the
multiplication is performed as a separate operation from the modulus,
the result will sometimes be under 2^256 and sometimes be above
2^256. Implementers should be careful about timing side-channels for
Poly1305 by using the appropriate implementation of these operations.
Validating the authenticity of a message involves a bitwise
comparison of the calculated tag with the received tag. In most use
cases nonces and AAD contents are not "used up" until a valid message
is received. This allows an attacker to send multiple identical
messages with different tags until one passes the tag comparison.
This is hard if the attacker has to try all 2^128 possible tags one
by one. However, if the timing of the tag comparison operation
reveals how long a prefix of the calculated and received tags is
identical, the number of messages can be reduced significantly. For
this reason, with online protocols, implementations MUST use a
constant-time comparison function rather than relying on optimized
but insecure library functions such as the C language's memcmp().
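In Python, the standard library offers hmac.compare_digest for this purpose. The sketch below shows its use, together with a hand-written variant that illustrates the usual technique of accumulating differences instead of returning at the first mismatch. The function names are chosen here. The hand-written loop is shown for explanation only, since an interpreter gives no firm timing guarantees.

```python
import hmac

def tags_match(calculated_tag, received_tag):
    """Compare two tags using the standard library's timing-safe helper."""
    return hmac.compare_digest(calculated_tag, received_tag)

def ct_equal(a, b):
    """Illustrative variant: examine every byte pair before deciding."""
    if len(a) != len(b):
        return False          # tag length is public, so this leaks nothing
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y         # stays zero only if all bytes are equal
    return diff == 0
```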
5. IANA Considerations
IANA is requested to assign an entry in the "Authenticated Encryption
with Associated Data (AEAD) Parameters" registry with
"AEAD_CHACHA20_POLY1305" as the name and this document as reference.
6. Acknowledgements
ChaCha20 and Poly1305 were invented by Daniel J. Bernstein. The
AEAD construction and the method of creating the one-time Poly1305
key were invented by Adam Langley.
Thanks to Robert Ransom, Watson Ladd, Stefan Buhler, Dan Harkins, and
Kenny Paterson for their helpful comments and explanations. Thanks
to Niels Moeller for suggesting the more efficient AEAD construction
in this document. Special thanks to Ilari Liusvaara for providing
extra test vectors, helpful comments, and for being the first to
attempt an implementation from this draft. And thanks to Sean
Parkinson for suggesting improvements to the examples and the pseudo-
code. Thanks to David Ireland for pointing out a bug in the pseudo-
code, and to Stephen Farrell and Alyssa Rowan for pointing out
missing advice in the security considerations.
Special thanks go to Gordon Procter for performing a security
analysis of the composition and publishing [Procter].
7. Changes from Previous Versions
NOTE TO RFC EDITOR: PLEASE REMOVE THIS SECTION BEFORE PUBLICATION
7.1. Changes from version -01 to version -02
Added IANA considerations and a paragraph in the security
considerations detailing the consequences of repeating a nonce.
Added the pseudo-code.
Replaced the example of a quarterround in section 2.1.
7.2. Changes from version -00 to version -01
Added references to [LatinDances2] and [Procter].
Added this section.
7.3. Changes from draft-nir-cfrg to draft-irtf-cfrg
Added references to [Zhenqing2012] and [LatinDances].
Many clarifications and improved terminology.
More test vectors from Ilari.
8. References
8.1. Normative References
[ChaCha] Bernstein, D., "ChaCha, a variant of Salsa20", January
2008, <http://cr.yp.to/chacha/chacha-20080128.pdf>.