Thursday, April 21, 2016

In the paper Robustness of the Learning with Errors Assumption, Goldwasser, Kalai, Peikert and Vaikuntanathan show that the standard LWE assumption implies the hardness of LWE even if the secret is drawn from an arbitrary distribution with sufficient min-entropy (and also given an arbitrary hard-to-invert function of the secret, but we won't focus on this point here since we didn't cover it during the study group). A noticeable application of this result is a symmetric-key encryption scheme which is provable secure in the presence of leakage but for which no maximum amount of leakage is foreseen in the parameters. In fact, leakage-resilient schemes usually guarantee security up to, say, $\lambda$ bits of leakage, hence using constructions to tolerate such a loss at the cost of some overhead. However, think of the extreme scenario where none of those $\lambda$ bits are actually leaked. This results in the overhead being still there, and hence in the scheme being less efficient, without the constructions adding any security, since the unprotected version was secure anyway.

We adopt the following conventions: even though they aren't the most general possible, they make the explanation of the work easier. The Decisional Learning With Errors problem of parameters $n$, $q$ and $\alpha$ ($DLWE_{n,q,\alpha}$) asks to distinguish between samples of the form $+x_i$ from random where $\mathbf{a}_i \leftarrow U(\mathbb{Z}_q^n)$ are known and randomly chosen vectors, $\mathbf{s} \leftarrow U(\mathbb{Z}_q^n)$ is a secret randomly chosen vector and $x_i \leftarrow \psi_\alpha$ is an error vector chosen from the Gaussian distribution over $\mathbb{Z}_q$ of mean 0 and standard deviation $\alpha$. If we collect $m$ of such samples, the above can be rewritten using matrices as:
$$
(\mathbf{A},\mathbf{A}\mathbf{s}+\mathbf{x}) \approx (\mathbf{A},\mathbf{u})
$$
where $\mathbf{A} \leftarrow U(\mathbb{Z}_q^{m\times n})$, $\mathbf{x} \leftarrow \psi_\alpha^m$ and $\mathbf{u}$ is random. We use the symbol $\approx$ to mean (statistical) indistinguishability. Finally, we denote by $DLWE_{n,q,\alpha}(\mathcal{D})$ the same problem with the exception that the secret $\mathbf{s}$ is randomly drawn from the distribution $\mathcal{D}$ instead of from the uniform one.

Then, the main theorem, stated using the above formalism, looks like
$$
DLWE_{\ell,q,\gamma} \Rightarrow DLWE_{n,q,\beta}(\mathcal{D})
$$
where $\mathcal{D}$ is a distribution over $\{0,1\}^n$ with min-entropy at least $k$, $\ell \leq (k-\omega(\log{n}))/\log{q}$ and $\gamma,\beta > 0$ are such that $\gamma/\beta$ is a negligible function of $n$. The requirement of $\mathcal{D}$ being over $\{0,1\}^n$ instead of over $\mathbb{Z}_q^n$ is made just for simplicity and can be dropped at the cost of a less neat proof. In essence, we want to prove that
$$
(\mathbf{A},\mathbf{A}\mathbf{s}+\mathbf{x}) \approx (\mathbf{A},\mathbf{u})
$$
assuming $DLWE_{\ell,q,\gamma}$ and where $\mathbf{A} \leftarrow U(\mathbb{Z}_q^{m\times n})$, $\mathbf{s} \leftarrow \mathcal{D}$ and $\mathbf{x} \leftarrow \psi_\beta^m$.

We just sketch the main idea of each step here. First of all, we draw $\mathbf{B} \leftarrow U(\mathbb{Z}_q^{m\times \ell})$, $\mathbf{C} \leftarrow U(\mathbb{Z}_q^{\ell\times n})$ and $\mathbf{Z} \leftarrow \psi_\gamma^{m\times n}$ and define $\mathbf{A}' = \mathbf{B}\mathbf{C} + \mathbf{Z}$. Each column of $\mathbf{A}'$ is of the form
$$
\mathbf{B}\mathbf{c}_i + \mathbf{z}_i
$$
which can be thought as a bunch of $DLWE_{\ell,q,\gamma}$ samples. Hence, by that assumption, we can assume $\mathbf{A}' \approx \mathbf{A}$ since $\mathbf{A}$ was an uniformly random matrix.

Next step is getting rid of the term $\mathbf{Z}\mathbf{s}+\mathbf{x}$ by noticing that since $\mathbf{Z}$ was drawn from a Gaussian distribution with standard deviation $\gamma$, $\mathbf{s}$ is a binary vector, $\mathbf{x}$ was drawn from a Gaussian distribution with standard deviation $\beta$ and $\gamma \ll \beta$, then the term $\mathbf{Z}\mathbf{s}$ has a negligible impact on $\mathbf{x}$. This means that there exists $\mathbf{x}' \leftarrow \psi_\beta^m$ such that
$$
(\mathbf{Z},\mathbf{s},\mathbf{Z}\mathbf{s}+\mathbf{x}) \approx (\mathbf{Z},\mathbf{s},\mathbf{x}').
$$
The proof is then reduced to show that
$$
(\mathbf{B},\mathbf{C},\mathbf{B}\mathbf{C}\mathbf{s}+\mathbf{x}') \approx (\mathbf{B},\mathbf{C},\mathbf{u}).
$$

You can probably notice that we are getting closer to a $DLWE$ instance again. Eventually we will be able to use our assumption. Before, we note that, for the leftover hash lemma
$$
(\mathbf{C},\mathbf{C}\mathbf{s}) \approx (\mathbf{C},\mathbf{t})
$$
because of the min-entropy of $\mathbf{s}$ and where $\mathbf{t} \leftarrow U(\mathbb{Z}_q^\ell)$ is a random vector. This lets us rewrite the statement we want to prove as
$$
(\mathbf{B},\mathbf{B}\mathbf{t}+\mathbf{x}') \approx (\mathbf{B},\mathbf{u}).
$$

We have finally obtained a proper $DLWE_{\ell,q,\beta}$ equation whose truth then follows from our $DLWE_{\ell,q,\gamma}$ assumption and by noting that $DLWE_{\ell,q,\gamma} \Rightarrow DLWE_{\ell,q,\beta}$. $\Box$

Thanks to the above result, the author define a symmetric-key encryption scheme whose parameters are just $q = q(n)$, $\beta = q/poly(n)$ and $m = poly(n)$. As you can see, nothing is said about any anticipated amount of leakage: the scheme does not introduce any overhead just to protect against leakage. However, it is proved secure under the $DLWE_{\ell,q,\gamma}$ assumption with respect to the following definition of security.

We say that a symmetric encryption scheme is CPA secure w.r.t. k(n)-weak keys if for any distribution $\{\mathcal{D}_n\}_{n\in \mathbb{N}}$ with min-entropy k(n), the scheme is CPA secure even if the secret key is chosen according to the distribution $\mathcal{D}_n$.

A couple of conclusive remarks.

Our discussion was informal and didn't go into too many details on purpose. For instance we didn't mention how this proof affects the parameters involved. For a more insight we refer to the paper itself. Instead, if you prefer a "less math, more ideas" approach, we refer to a post on the ECRYPT-EU blog;

I find really interesting that during the study group it was pointed out how saying "no overhead is introduced" is not correct. Some overhead is indeed introduced, but in the initial parameters (to compensate for a possible loss by means of leakage) rather than through specific constructions. This has the advantage that, if no leakage occurs, the scheme is simply more secure because the parameters have been raised.

In this blog post we will consider a slimmed-down version of the accomplishments of the paper for the sake of brevity. Moreover, while the idea presented is complicated, it is not particularly complex, and in such circumstances it is often better to favour concision over precision.

Introduction
Obfuscation is the idea of mixing up a program so that it becomes unintelligible but functionality is preserved. Essentially, a person's access to a function is replaced with access to an oracle which computes that function.

Common examples of its use are:

Crippleware: software can be obfuscated so that it is difficult to reverse engineer and so that premium parts of software can be packaged with non-premium parts by 'crippling' them with obfuscation, so that a product key can later be used to unlock the premium content.

Public key encryption from private key encryption: Given a private key encryption scheme, one can obfuscate an encryption circuit with the secret key embedded and release this publicly. This (roughly) gives a public key scheme.

What this paper achieves
In this paper, the authors build on the core of an older and well-known breakthrough obfuscation construction of 2013 by Sanjam Garg, Craig Gentry, Shai Halevi, Mariana Raykova, Amit Sahai and Brent Waters. In doing so, they present a new obfuscator; while the starting point of the new construction is exactly the same as the old, they depart from it when providing countermeasures against the (speculated) attacks against the scheme.

This construction is in the multi-linear model, which roughly translates to assuming the existence of Graded Encoding Schemes (GESs). (Note that intheliterature, the terms 'Multi-linear Maps' and 'Graded Encoding Schemes' are used interchangeably, though there are important differences. In this discussion, we are considering Graded Encoding Schemes.)

GESs provide a way of encoding (NB that it is not encrypting) elements of some plaintext space so that they have an associated level, which is any subset of $\{1,...,k\}$ for some integer $k$. Intuitively, they offer a good way of preventing an adversary from doing 'too much' with the data he is given (in particular, 'too many' multiplications). Importantly, they allow multi-linear operations, and therefore, in particular, matrix multiplication. The properties they have which are pertinent for this post are the following:

Encodings may be added only if they are encoded at the same level.

Encodings may be multiplied only if the levels at which the elements are encoded are disjoint. When a multiplication is performed, the resulting element is encoded at the union of these disjoint levels.

One can check if a top level encoding (i.e. an element encoded at level $\{1,...,k\}$) is an encoding of zero.

Since their inception, GESs have been rife with problems: no current construction is secure for all desired applications. Fortunately, for their use in obfuscation, all known attacks on GESs (at least for the CLT15 GES at the time of writing) require more information than an obfuscation of a function provides. With a relatively modest overhead (compared with the notorious time and space requirements for GESs), a recent paper propounded the use of random matrices to achieve a similar result to the paper we are discussing without the use of a GES.

Starting point
We start with a circuit which we want to obfuscate; i.e., a Boolean function $C:\{0,1\}^n\to\{0,1\}$. From this, we create a matrix branching program: this is a sequence of pairs of matrices where each pair has an associated input bit. (We will not cover here how exactly one does this, but a good starting point for more information would be the paper from 1986 by David Barrington.) We denote this sequence of pairs by $(B_{i,0},B_{i,1})_{\textsf{inp}(i)}$ where $\textsf{inp}(i)$ is a function describing which bit of the input $x$ the $i$th point in the branching program inspects. A single input bit can be inspected multiple times throughout the program. To evaluate the program on an input $x$, for each point in the program $i$, we check the value $b$ of the $\textsf{inp}(i)$th input bit of $x$ and select the matrix $B_{i,b}$. We then multiply all of these matrices in order and look at the result, $\prod_{i=1}^{l} B_{i,x_{\textsf{inp}(i)}}$ where $l$ is the length of the branching program and $x_{\textsf{inp}(i)}$ is the $\textsf{inp}(i)$th bit of input $x$. If the product is the identity matrix, we interpret this as $C(x)=1$, and otherwise $C(x)=0$.

The end goal is to 'mix up' matrices so that, if we give them away, they don't reveal information about the underlying circuit but they still allow evaluations of the original function.

We start by encoding (using the GES) the $i$th pair of matrices under the singleton level $\{i\}$. This is useful because, as previously noted, we can still perform multi-linear operations like matrix multiplication, but not do 'too many' operations with them (for example, we cannot compute powers of matrices). The product received from an evaluation of the branching program will be a top-level encoding, to which we can then apply our zero-testing procedure (which, recall, requires that an element be top-level in order to check whether or not it encodes zero).

Attacks and Preventative Measures
Attacks on an obfuscation scheme are when an adversary seeks to glean some auxiliary information about the structure of the branching program by seeing what happens when computations are done erroneously. Attacks fall into four categories, and we will see how this paper deals with each of them. The first two are covered in the same way as the original construction on which this paper builds.

Non-algebraic attacks: performing non-multi-linear operations on the matrix elements, or failing to respect the algebraic structure of the matrices as matrices over the plaintext ring. These attacks are largely dealt with by encoding with the GES.

Out of order attacks: taking one matrix from each pair from the branching program and finding the product without respecting their branching program order. This is actually very simple to deal with, and involves a trick by Joe Kilian which he described in his paper of 1988. We choose a set of random matrices $\{R_0,\ldots,R_{l-1}\}$ (where $l$ is the length of the branching program), set $R_l=R_0$, and then define:
\begin{eqnarray}
\widetilde{B_{i,0}}&=R_{i-1}B_{i,0}R_{i}^{-1}\\
\widetilde{B_{i,1}}&=R_{i-1}B_{i,1}R_{i}^{-1}
\end{eqnarray}
for $i=1,\ldots,l$. This randomised version of the branching program computes the same output as the original, but now products that don't respect the branching program order will be encodings of random elements.

Mixed Input attacks: it was stated that more than one point in the branching program may inspect the same input bit (i.e. $\textsf{inp}$ is not necessarily injective). This attack involves performing an unfaithful evaluation by switching from reading, say, the $j$th bit of the input as 0 at one point in the program and then as 1 at some later point. This attack is dealt with using the innovative idea of straddling set systems. Encoding matrices of a branching program at levels given by sets in these systems, instead of at singleton sets, effectively locks together matrices at different points in the branching program. This is exactly what we need in order to prevent this kind of attack (and also the next).

We provide an elucidating example rather than the technical definition (details can be found in the paper, linked above): Let $S=C\cup D$ where $C=\{\{1\},\{2,3\},\{4,5\}\}$ and $D=\{\{1,2\},\{3,4\},\{5\}\}$. We call $S$ a straddling set system for the universe $U=\{1,2,3,4,5\}$. Note that $C$ and $D$ are the only exact covers of $U$ in $S$.

Now, suppose that we had a branching program which considered input bit 1 at positions 1 and 3 in the branching program, and input bit 2 at positions 2 and 4. I.e., we have

$1$

$2$

$3$

$4$

$\textsf{inp}(1)=1$

$\textsf{inp}(2)=2$

$\textsf{inp}(3)=1$

$\textsf{inp}(4)=2$

$B_{1,0}$

$B_{2,0}$

$B_{3,0}$

$B_{4,0}$

$B_{1,1}$

$B_{2,1}$

$B_{3,1}$

$B_{4,1}$

We define a straddling set system for each input bit: $S_1$ is

$C_1=\{\{1\},\{2,3\}\}$ and $D_1=\{\{1,2\},\{3\}\}$

and $S_2$ is

$C_2=\{\{4\},\{5,6\}\}$ and $D_2=\{\{4,5\},\{6\}\}$

Then denoting by $(M,A)$ an encoding of the matrix $M$ at level $A\subset \{1,\ldots,k\}$, we encode as follows:

$1$

$2$

$3$

$4$

$\textsf{inp}(1)=1$

$\textsf{inp}(2)=2$

$\textsf{inp}(3)=1$

$\textsf{inp}(4)=2$

$(B_{1,0},\{1\})$

$(B_{2,0},\{4\})$

$(B_{3,0},\{2,3\})$

$(B_{4,0},\{5,6\})$

$(B_{1,1},\{1,2\})$

$(B_{2,1},\{4,5\})$

$(B_{3,1},\{3\})$

$(B_{4,1},\{6\})$

Since $C_1$ and $D_1$ are the only exact covers of $U_1$ and $C_2$ and $D_2$ are the only exact covers of $U_2$, if we want to obtain a top-level product at the end, choosing, for example, $B_{1,0}$ forces us also to choose $B_{3,0}$ (and we cannot choose $B_{3,1}$). This exactly prevents mixed-input attacks.

Partial evaluation attacks: computing subproducts of matrices and comparing them. These attacks are dealt with using more complicated straddling set systems, but the idea of 'locking' matrices together is essentially the same. The branching program is extended so that each point is dependent on two input bits, and several redundant matrices are inserted into the program so that for every pair of input bits, there is a point in the program which inspects both of these bits. The straddling sets can then be constructed so that each input bit affects the matrix the evaluator must choose at every point in the program. Thus it greatly restricts the operations that can be performed, and stops these partial evaluation attacks.

The paper concludes with a proof that this obfuscator achieves what is known as virtual black-box obfuscation in the ideal graded encoding model. The really interesting thing in this author's opinion is the aforementioned translation of the straddling set technique to the obfuscator (for which, admittedly, a security proof has yet to materialise) which doesn't require a GES.