Beyond-birthday secure domain-preserving PRFs from a single permutation

Abstract

This paper revisits the fundamental cryptographic problem of building pseudorandom functions (PRFs) from pseudorandom permutations (PRPs). We prove that SUMPIP, i.e. \(P \oplus P^{-1}\), the sum of a PRP and its inverse, and EDMDSP, the single-permutation variant of the “dual” of the Encrypted Davies–Meyer scheme introduced by Mennink and Neves (CRYPTO 2017), are secure PRFs up to \(2^{2n/3}/n\) adversarial queries. To the best of our knowledge, SUMPIP is the first parallelizable, single-permutation-based, domain-preserving, beyond-birthday secure PRP-to-PRF conversion method.

Keywords

PRP-to-PRF, Beyond birthday bound, Domain preserving

Communicated by A. Winterhof.

Mathematics Subject Classification

Acknowledgements

We thank the reviewers of EUROCRYPT & CRYPTO 2018 for invaluable comments. Chun Guo is a postdoc in ICTEAM/ELEN/Crypto Group, Université Catholique de Louvain, and his work is funded in part by the ERC project 724725 (acronym SWORD). Many thanks to François-Xavier Standaert for the invaluable support. Yaobin Shen, Lei Wang and Dawu Gu are supported by National Natural Science Foundation of China (61602302, 61472250, 61672347), Natural Science Foundation of Shanghai (16ZR1416400), Shanghai Excellent Academic Leader Funds (16XD1401300), 13th five-year National Development Fund of Cryptography (MMJJ20170114).

Appendix A: Our alternative proof for EDMSP

In this section we present our security analysis of EDMSP.

Theorem 3

For any distinguisher D making at most q queries with \(q\ll N/6\), we have

The proof for this scheme is more complicated, since the evaluations of two distinct inputs may collide after the “first round”, i.e., \(P(x)\oplus x=P(x')\oplus x'\). Because P is a permutation, two queries (x, z) and \((x',z')\) in \(\tau \) form such a “colliding” pair if and only if the corresponding outputs collide, i.e. \(z=z'\). To simplify the analysis, we treat such “colliding” queries separately.
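The colliding-pair observation can be checked on a toy instance. The sketch below assumes that \(\textsf {EDMSP}^{P}(x)=P(P(x)\oplus x)\), which is how the equations \(P(x_l)=x_l\oplus y_l\) and \(P(y_l)=z_l\) used later in this appendix read. The table-based permutation and the helper names `random_permutation`, `edmsp` and `colliding_pairs` are illustrative assumptions, not part of the paper.

```python
import random


def random_permutation(n_bits, seed=0):
    """Toy stand-in for a PRP: a shuffled lookup table on {0, ..., 2^n - 1}."""
    rng = random.Random(seed)
    table = list(range(1 << n_bits))
    rng.shuffle(table)
    return table


def edmsp(P, x):
    """Assumed construction: EDMSP^P(x) = P(P(x) XOR x)."""
    y = P[x] ^ x   # intermediate value after the "first round"
    return P[y]    # second call to the same permutation


def colliding_pairs(P):
    """All input pairs (x, x') with x < x' whose first-round values coincide."""
    N = len(P)
    return [(x, xp) for x in range(N) for xp in range(x + 1, N)
            if P[x] ^ x == P[xp] ^ xp]


P = random_permutation(8, seed=1)
# P is a bijection, so outputs collide exactly when first-round values collide.
for x in range(len(P)):
    for xp in range(x + 1, len(P)):
        assert (edmsp(P, x) == edmsp(P, xp)) == ((P[x] ^ x) == (P[xp] ^ xp))
```

The exhaustive loop is only feasible for toy block sizes. It is meant to illustrate why colliding pairs can be identified from the outputs alone.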

In detail, for an attainable transcript \(\tau =\{(x_1,z_1),(x_2,z_2),\ldots ,(x_q,z_q)\}\), define a set

and let \(\mu =|{\mathcal {S}}{\mathcal {C}}{\mathcal {S}}|\). Then the bad transcripts are defined in the next subsection.

A.1 Bad transcript

Definition 3

(Bad transcripts for \(\textsf {EDMSP}\)) If one of the following conditions is fulfilled, we say an attainable transcript \(\tau =((x_1,z_1),(x_2,z_2),\ldots ,(x_q,z_q))\) is bad:

(B-1) there exist three distinct indices \(i,j,k\in \{1,\ldots ,q\}\) such that \(z_i=z_j=z_k\);

(B-2) \(\mu \ge {q^3}/{N}+q^{3/2}\sqrt{3n}\);

(B-3) \(|\tau _1|\ge \sqrt{q}\).

Otherwise we say that \(\tau \) is good. Denote by \({\mathcal {T}} _{\mathrm {bad}}\), resp. \({\mathcal {T}} _{\mathrm {good}}\) the set of bad, resp. good transcripts.

Clearly, \(\Pr [\text {(B-1)}]\le \binom{q}{3}\cdot \frac{1}{N^2}\le \frac{q^3}{N^2}\). On the other hand, \(\Pr [\text {(B-2)}]\le \frac{2}{N}\) follows immediately from Lemma 2. Finally, for (B-3), there are at most \(\binom{q}{2}\le q^2/2\) pairs \(((x_i,z_i),(x_j,z_j))\) of distinct records in \(\tau \). We note that \(\Pr [\lambda \ge \sqrt{q}]\) does not exceed the probability that the number of pairs \(((x_i,z_i),(x_j,z_j))\) with \(z_i=z_j\) exceeds \(\sqrt{q}/2\). For each such pair we have \(\Pr [z_i=z_j]=1/N\), so the expected number of colliding pairs is at most \(q^2/(2N)\), and thus \(\Pr [\text {(B-3)}]=\Pr [|\tau _1|\ge \sqrt{q}]\le \frac{q^2/2}{N\sqrt{q}/2}={q^{3/2}}/{N}\) by Markov’s inequality. By the union bound over (B-1), (B-2) and (B-3), a transcript is therefore bad with probability at most \(\frac{q^3}{N^2}+\frac{2}{N}+\frac{q^{3/2}}{N}\).

For the \(2\lambda \) queries in \(\tau _1\), we lower bound the number of sequences of distinct intermediate values \({\mathbf {Y}}=(y_1,y_2,\ldots ,y_{\lambda })\) such that, for each \(l=1,\ldots ,\lambda \), the pair of events \(\textsf {EDMSP}^{P}(x_l)=z_l\) and \(\textsf {EDMSP}^{P}(x_l')=z_l\) is equivalent to the three equations \(P(x_l)=x_l\oplus y_l\), \(P(x_l')=x_l'\oplus y_l\), and \(P(y_l)=z_l\), giving \(3\lambda \) distinct equations in total; cf. the notations in Fig. 1 (right). Formally, we lower bound the number \(N_Y\) of \({\mathbf {Y}}\) that satisfy:

once \(y_1\) is fixed, there are at least \(N-|\mathcal {X}|-2|\mathcal {Z}|-1-4\cdot 1=N-3q+2\lambda -5\) choices for \(y_2\), since \(y_2\ne y_1\), and since \(x_2\oplus y_2\) and \(x_2'\oplus y_2\) should be different from \(x_1\oplus y_1\) and \(x_1'\oplus y_1\);

once \(y_1,\ldots ,y_l\) are fixed, there are at least \(N-|\mathcal {X}|-2|\mathcal {Z}|-l-4l=N-3q+2\lambda -5l\) choices for \(y_{l+1}\), since \(y_{l+1}\ne y_1,\ldots ,y_l\), and since both \(x_{l+1}\oplus y_{l+1}\) and \(x_{l+1}'\oplus y_{l+1}\) should avoid 2l values (i.e. \(x_1\oplus y_1,x_1'\oplus y_1,\ldots ,x_l\oplus y_l,x_l'\oplus y_l\)).
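The per-step counts above can be evaluated numerically. The following sketch assumes \(|\mathcal {X}|=q\) and \(|\mathcal {Z}|=q-\lambda \), which is what the identity \(N-|\mathcal {X}|-2|\mathcal {Z}|-5l=N-3q+2\lambda -5l\) implies. The \(l=0\) factor for \(y_1\) is extrapolated from the same pattern, since the corresponding statement is not shown here. The function names are hypothetical.

```python
import math


def y_choices(N, q, lam, l):
    """Lower bound on the choices for y_{l+1} once y_1, ..., y_l are fixed."""
    size_X = q          # assumption: q distinct inputs
    size_Z = q - lam    # assumption: lam colliding pairs and no triple collision
    bound = N - size_X - 2 * size_Z - l - 4 * l
    # Same quantity in the closed form used in the text.
    assert bound == N - 3 * q + 2 * lam - 5 * l
    return bound


def count_Y_lower_bound(N, q, lam):
    """Product of the per-step bounds for l = 0, ..., lam - 1.

    The l = 0 factor (choices for y_1) is an extrapolation of the stated pattern.
    """
    return math.prod(y_choices(N, q, lam, l) for l in range(lam))


# Example with n = 16, q = 100 queries and lam = 3 colliding pairs.
print(y_choices(2**16, 100, 3, 1))
print(count_Y_lower_bound(2**16, 100, 3))
```

Each factor stays close to \(N\) as long as \(3q+5\lambda \) is small compared with \(N\).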

It is not hard to see that, given such a good \({\mathbf {Y}}\), the event \(\textsf {EDMSP}^{P}(x_l)=\textsf {EDMSP}^{P}(x_l')=z_l\) for \(l=1,\ldots ,\lambda \) is indeed equivalent to the desired \(3\lambda \) equations. By these,

We then proceed to derive the lower bound for \(\Pr [P\xleftarrow {\$}\mathcal {P}(n):\textsf {EDMSP} ^P\vdash \tau _2\mid \textsf {EDMSP} ^P\vdash \tau _1]\). To this end, we fix a good sequence \({\mathbf {Y}}\) as described, assume that a randomly picked P satisfies the \(3\lambda \) induced equations, and analyze the queries in \(\tau _2\).

A.2.2 Analyzing \(\tau _2\)

We lower bound the number of P’s that satisfy \(\textsf {EDMSP}^{P}(x_i)=z_i\) for \(i=2\lambda +1,\ldots ,q\). To this end, for a fixed \(t\in \{0,\ldots ,\frac{q-2\lambda }{2}\}\), we bound the number of permutations P such that

Note that this function slightly deviates from the tf functions used in Sects. 3 and 4.

In detail, we sequentially choose t pairs of indices \((i_1,j_1),(i_2,j_2),\ldots ,(i_t,j_t)\) from \(2\lambda +1,\ldots ,q\), such that each pair determines three equations on P, i.e., \(P(x_{i_l})=x_{i_l}\oplus x_{j_l}\), \(P(x_{j_l})=z_{i_l}\), and \(P(x_{j_l}\oplus z_{i_l})=z_{j_l}\) for \(l=1,\ldots ,t\). In order to make these equations “good”, that is, consistent with a permutation and not determining any other unfixed input-output pattern of P, we consider the sequences of indices that satisfy:

there are at least \(q-2\lambda \) choices for \(i_1\) and \(q-2\lambda -1\) choices for \(j_1\). However, among these \((q-2\lambda )(q-2\lambda -1)\) choices, there are at most \(2\mu \) bad ones that would violate the required condition (ii). Moreover, in a similar vein as the corresponding reasoning in Sect. 3.2, condition (iii) would exclude at most \(3\lambda q\) choices in total. Therefore, there are at least \((q-2\lambda )(q-2\lambda -1)-2\mu -3\lambda q\) choices for \((i_1,j_1)\);

For \(i_2\) and \(j_2\),

condition (ii) amounts to subtracting at most \(2\mu \) choices,

condition (iii) amounts to subtracting at most \(3\lambda q\) choices, and

condition (iv) amounts to subtracting at most 2q choices.

Therefore, there are at least \((q-2\lambda -2)(q-2\lambda -3)-2\mu -3\lambda q-2q\) choices for \((i_2,j_2)\);

\(\ldots \)

once \((i_1,j_1),\ldots ,(i_l,j_l)\) are fixed, to ensure distinctness, there are at least \((q-2\lambda -2l)(q-2\lambda -2l-1)-2\mu -3\lambda q-2lq\) choices for \((i_{l+1},j_{l+1})\).
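The general count for \((i_{l+1},j_{l+1})\) can be written as a small function and checked against the two explicit cases above. This is only a sketch of the arithmetic; the function name `pair_choices` and the sample parameters are illustrative assumptions.

```python
def pair_choices(q, lam, mu, l):
    """Lower bound on the choices for (i_{l+1}, j_{l+1}) once l pairs are fixed."""
    m = q - 2 * lam - 2 * l        # indices of tau_2 not yet used by earlier pairs
    ordered_pairs = m * (m - 1)    # ordered pairs of distinct remaining indices
    # Subtract the choices excluded by conditions (ii), (iii) and (iv).
    return ordered_pairs - 2 * mu - 3 * lam * q - 2 * l * q


q, lam, mu = 1000, 2, 5
# l = 0 reproduces the count for (i_1, j_1); condition (iv) excludes nothing yet.
assert pair_choices(q, lam, mu, 0) == (q - 2*lam) * (q - 2*lam - 1) - 2*mu - 3*lam*q
# l = 1 reproduces the count for (i_2, j_2).
assert pair_choices(q, lam, mu, 1) == (q - 2*lam - 2) * (q - 2*lam - 3) - 2*mu - 3*lam*q - 2*q
```

The count is dominated by the \(m(m-1)\) term as long as \(\mu \), \(\lambda q\) and \(lq\) are small compared with \(q^2\).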

It is not hard to see that, given such a sequence of good indices \((i_1,j_1),\ldots ,(i_t,j_t)\), for \(l=1,\ldots ,t\), the 3t equations \(P(x_{i_l})=x_{i_l}\oplus x_{j_l}\), \(P(x_{j_l})=z_{i_l}\), and \(P(x_{j_l} \oplus z_{i_l})=z_{j_l}\) are new and distinct. Having the redundant possibilities excluded, the number \(N_I\) of such 3t equations is thus at least

Finally, given a good choice \({\mathbf {Y}}\) and a good set of indices \(\{(i_1,j_1),\ldots ,(i_t,j_t)\}\), we choose a sequence of \(q-2\lambda -2t\) distinct intermediate values for the remaining \(q-2\lambda -2t\) queries in \(\tau _2\). For convenience, we rename the subscripts and write

there does not exist \(1\le j\le \lambda \) in \(\tau _1\) such that \(y_l=y_j\) or \(x_l\oplus y_l=x_j\oplus y_j\) or \(x_l\oplus y_l=x_j'\oplus y_j\) (\(y_j\) is given by \({\mathbf {Y}}\));

there does not exist a pair of indices \((\alpha ,\beta )\) (selected in the previous phase of the t pairs in \(\tau _2\)) such that either \(y_l=x_{\beta }\oplus z_{\alpha }\) or \(x_l\oplus y_l=x_{\alpha }\oplus x_{\beta }\).
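The two conditions listed above can be phrased as a predicate on a candidate intermediate value \(y_l\). The sketch below covers only these two conditions, and any further requirements on \(y_l\) would have to be checked separately. The data layout (tuples for the records of \(\tau _1\) and for the selected pairs) and the function name are assumptions made for illustration.

```python
def passes_listed_conditions(x_l, y_l, tau1, Y, selected):
    """Check a candidate y_l against the two conditions listed above.

    tau1:     list of (x_j, x_j_prime, z_j), one entry per colliding pair
    Y:        list of y_j, aligned with tau1
    selected: list of ((x_alpha, z_alpha), (x_beta, z_beta)) for the chosen index pairs
    """
    out_l = x_l ^ y_l  # the value that P(x_l) would be fixed to
    # No clash with the equations induced by tau_1.
    for (x_j, x_jp, _z_j), y_j in zip(tau1, Y):
        if y_l == y_j or out_l == x_j ^ y_j or out_l == x_jp ^ y_j:
            return False
    # No clash with the equations induced by the selected pairs of tau_2.
    for (x_a, z_a), (x_b, _z_b) in selected:
        if y_l == x_b ^ z_a or out_l == x_a ^ x_b:
            return False
    return True


tau1 = [(1, 2, 9)]
Y = [5]
selected = [((3, 7), (4, 8))]
print(passes_listed_conditions(6, 10, tau1, Y, selected))
```

Both loops ensure that the new equations \(P(x_l)=x_l\oplus y_l\) and \(P(y_l)=z_l\) reuse neither an input nor an output already fixed by the earlier equations.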

It can be seen that given a good choice of \({\mathbf {Y}}\), a good choice of a set of 2t indices, and a good choice of \({\mathbf {Y}}'\), the event \(T_{\mathrm {re}}=\tau \) is equivalent to P satisfying \(3\lambda +3t+2(q-2t-2\lambda )=2q-t-\lambda \) equations. Therefore,