§10.4: More on randomization/symmetrization

In Section 3 we collected a number of consequences of the General Hypercontractivity Theorem for functions $f \in L^2(\Omega^n, \pi^{\otimes n})$. All of these had a dependence on “$\lambda$”, the least probability of an outcome under $\pi$. This can sometimes be quite expensive; for example, the KKL Theorem and its consequence Theorem 28 are trivialized when $\lambda = 1/n^{\Theta(1)}$.

However, as mentioned in Section 2, when working with symmetric random variables $\boldsymbol{X}$, the “randomization” trick sometimes lets you reduce to the analysis of uniformly random $\pm 1$ bits (which have $\lambda = 1/2$). Further, Lemma 14 suggests a way of “symmetrizing” general mean-zero random variables (at least if we don’t mind applying $\mathrm{T}_{\frac12}$). In this section we will develop the randomization/symmetrization technique more thoroughly and see an application: bounding the $L^p \to L^p$ norm of the “low-degree projection” operator.

Informally, applying the randomization/symmetrization technique to $f \in L^2(\Omega^n,\pi^{\otimes n})$ means introducing $n$ independent uniformly random bits $\boldsymbol{r} = (\boldsymbol{r}_1, \dots, \boldsymbol{r}_n) \sim \{-1,1\}^n$ and then “multiplying the $i$th input to $f$ by $\boldsymbol{r}_i$”. Of course $\Omega$ is just an abstract set so this doesn’t quite make sense. What we really mean is “multiplying $\mathrm{L}_i f$, the $i$th part of $f$’s Fourier expansion (orthogonal decomposition), by $\boldsymbol{r}_i$”. Let’s see some examples:
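As a minimal illustration (a sketch only, assuming the usual orthogonal decomposition $f = f^{=\emptyset} + f^{=\{1\}}$ of a one-input function and writing $\widetilde{f}$ for the randomized/symmetrized function), take $n = 1$. The informal recipe above then reads
\[ \widetilde{f}(r, x) = f^{=\emptyset} + r \cdot f^{=\{1\}}(x), \qquad r \in \{-1,1\},\ x \in \Omega. \]
Since $f^{=\emptyset} = \mathop{\bf E}[f]$, we get $\widetilde{f}(1,x) = f(x)$, while $\widetilde{f}(-1,x) = 2\mathop{\bf E}[f] - f(x)$ is the reflection of $f(x)$ about its mean.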

Remark 32 Another way of defining $\widetilde{f}$ is to stipulate that for each $x \in \Omega^n$, the function $\widetilde{f}_{ \mid x} : \{-1,1\}^n \to {\mathbb R}$ is defined to be the boolean function whose Fourier coefficient on $S$ is $f^{=S}(x)$. (This is more evident from \eqref{eqn:randsymm-def} if you swap the positions of $\boldsymbol{r}^{S}$ and $f^{=S}({\boldsymbol{x}})$.)

In light of this remark, the basic Parseval formula for boolean functions implies that for all $x \in \Omega^n$, \[ \|\widetilde{f}_{ \mid x}\|_{2,\boldsymbol{r}}^2 = \sum_{S \subseteq [n]} f^{=S}(x)^2. \] (The notation $\|\cdot\|_{2,\boldsymbol{r}}$ emphasizes that the norm is computed with respect to the random inputs $\boldsymbol{r}$.) If we take the expectation of the above over ${\boldsymbol{x}} \sim \pi^{\otimes n}$, the left-hand side becomes $\|\widetilde{f}\|_{2,\boldsymbol{r},{\boldsymbol{x}}}^2$ and the right-hand side becomes $\|f\|_{2,{\boldsymbol{x}}}^2$, by Parseval’s formula for $L^2(\Omega^n,\pi^{\otimes n})$. Thus:
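In symbols, the conclusion of this calculation is $\|\widetilde{f}\|_{2,\boldsymbol{r},{\boldsymbol{x}}} = \|f\|_{2,{\boldsymbol{x}}}$. The following is a small sanity check of that identity in exact arithmetic, not part of the original development. The toy space `OMEGA`, the distribution `PI`, the test function `f`, and all helper names are hypothetical choices made for illustration; the parts $f^{=S}$ are computed by inclusion-exclusion from the averages of $f$ over the coordinates outside $J \subseteq S$, and $\widetilde{f}(r,x)$ is taken to be $\sum_S r^S f^{=S}(x)$ as in Remark 32.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy instance (not from the text): Omega = {0, 1, 2}, n = 2.
OMEGA = (0, 1, 2)
PI = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
N = 2
POINTS = list(product(OMEGA, repeat=N))
SUBSETS = [frozenset(i for i in range(N) if mask >> i & 1) for mask in range(1 << N)]

# An arbitrary test function f : Omega^n -> R, stored as a table.
f = {x: Fraction(x[0] ** 2 + 3 * x[1] - x[0] * x[1]) for x in POINTS}

def weight(x, coords):
    """Product of pi(x_i) over the given coordinates."""
    w = Fraction(1)
    for i in coords:
        w *= PI[x[i]]
    return w

def restricted_mean(J, x):
    """Average of f over the coordinates outside J, with the coordinates in J fixed to x."""
    outside = [i for i in range(N) if i not in J]
    return sum(weight(y, outside) * f[y]
               for y in POINTS if all(y[i] == x[i] for i in J))

def part(S, x):
    """f^{=S}(x), obtained from the restricted means by inclusion-exclusion."""
    return sum((-1) ** (len(S) - len(J)) * restricted_mean(J, x)
               for J in SUBSETS if J <= S)

def f_tilde(r, x):
    """Randomized/symmetrized function: sum over S of r^S * f^{=S}(x)."""
    total = Fraction(0)
    for S in SUBSETS:
        sign = 1
        for i in S:
            sign *= r[i]
        total += sign * part(S, x)
    return total

all_coords = range(N)
# Squared 2-norm of f under pi^{(x) n}.
norm_f_sq = sum(weight(x, all_coords) * f[x] ** 2 for x in POINTS)
# Squared 2-norm of f_tilde under uniform r and x ~ pi^{(x) n}.
norm_tilde_sq = sum(Fraction(1, 2 ** N) * weight(x, all_coords) * f_tilde(r, x) ** 2
                    for r in product((-1, 1), repeat=N) for x in POINTS)

print(norm_f_sq == norm_tilde_sq)
```

Because everything is done with `Fraction`, the comparison is exact rather than up to floating-point error. Setting all signs to $+1$ recovers $f$ itself, which is a quick way to check the decomposition.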

Thus randomization/symmetrization doesn’t change $2$-norms. What about $q$-norms for $q \neq 2$? As discussed in Examples 29, 30, if $f$ has the lucky property that its Fourier expansion only contains symmetric basis functions then $\widetilde{f}(\boldsymbol{r},{\boldsymbol{x}})$ and $f({\boldsymbol{x}})$ have identical distributions, so their $q$-norms are identical. The essential feature of the randomization/symmetrization technique is that even for general $f$ the $q$-norms don’t change much — if you are willing to apply $\mathrm{T}_\rho$ for some constant $\rho$:

The two inequalities in \eqref{eqn:rand-symm-thm1} are not too difficult to prove; for example, you might already correctly guess that the left-hand inequality follows from our first randomization/symmetrization Lemma 14 and an induction. We’ll give the proofs at the end of this section. But first, let’s illustrate how you might use them by solving the following basic problem concerning low-degree projections:

Question Let $k \in {\mathbb N}$, let $1 < q < \infty$, and let $f \in L^2(\Omega^n, \pi^{\otimes n})$. Can $\|f^{\leq k}\|_q$ be much larger than $\|f\|_q$? To put the question in reverse, suppose $g \in L^2(\Omega^n, \pi^{\otimes n})$ has degree at most $k$; is it possible to make the $q$-norm of $g$ much smaller by adding terms of degree exceeding $k$ to its Fourier expansion?

The question has a simple answer if $q = 2$: in this case we have $\|f^{\leq k}\|_2 \leq \|f\|_2$ always. This follows from Parseval: \begin{equation} \label{eqn:norm-proj2} \|f^{\leq k}\|_2^2 = \sum_{j = 0}^k \mathbf{W}^{j}[f] \leq \sum_{j = 0}^n \mathbf{W}^{j}[f] = \|f\|_2^2. \end{equation} When $q \neq 2$ things are not so simple, so let’s first consider the most familiar setting of $\Omega = \{-1,1\}$, $\pi = \pi_{1/2}$. In this case we can relate the $q$-norm and the $2$-norm via the Hypercontractivity Theorem:
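For concreteness, here is a sketch of how such a computation can go for $q = 4$ and $f : \{-1,1\}^n \to {\mathbb R}$. It assumes the standard consequence of hypercontractivity that any $g : \{-1,1\}^n \to {\mathbb R}$ of degree at most $k$ satisfies $\|g\|_4 \leq \sqrt{3}^k \|g\|_2$:
\[ \|f^{\leq k}\|_4 \leq \sqrt{3}^k \|f^{\leq k}\|_2 \leq \sqrt{3}^k \|f\|_2 \leq \sqrt{3}^k \|f\|_4. \]
The first step applies the assumed bound to $g = f^{\leq k}$, the second is \eqref{eqn:norm-proj2}, and the third is monotonicity of norms on a probability space.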

Now let’s consider functions $f \in L^2(\Omega^n, \pi^{\otimes n})$ on general product spaces; for simplicity, we’ll continue to focus on the case $q = 4$. One possibility is to repeat the above proof using the General Hypercontractivity Theorem (more specifically, Theorem 20). This would give us $\|f^{\leq k}\|_4 \leq \sqrt{3/\lambda}^k \|f\|_4$. However, we will see that it’s possible to get a bound completely independent of $\lambda$ (i.e., independent of $(\Omega, \pi)$) using randomization/symmetrization.

First, suppose we are in the lucky case described in Example 30 in which $f$’s Fourier spectrum only uses symmetric basis functions. In this case $f^{\leq k}({\boldsymbol{x}})$ and $\widetilde{f^{\leq k}}(\boldsymbol{r},{\boldsymbol{x}})$ have the same distribution for any $k$, and we can leverage the $L^2(\{-1,1\})$ bound \eqref{eqn:low-deg-proj-4-bounded} to get the same result for $f$: First, \[ \|f^{\leq k}\|_4 = \|\widetilde{f^{\leq k}}\|_4 = \left\|\ \|\widetilde{f^{\leq k}}_{ \mid {\boldsymbol{x}}}(\boldsymbol{r})\|_{4, \boldsymbol{r}}\ \right\|_{4, {\boldsymbol{x}}}. \] For each outcome ${\boldsymbol{x}} = x$, the inner function $g(r) = \widetilde{f^{\leq k}}_{ \mid x}(r)$ is a degree-$k$ function of $r \in \{-1,1\}^n$. Therefore we can apply \eqref{eqn:low-deg-proj-4-bounded} with this $g$ to deduce \[ \left\|\ \|\widetilde{f^{\leq k}}_{ \mid {\boldsymbol{x}}}(\boldsymbol{r})\|_{4, \boldsymbol{r}}\ \right\|_{4, {\boldsymbol{x}}} \leq \left\|\ \sqrt{3}^k \|\widetilde{f}_{ \mid {\boldsymbol{x}}}(\boldsymbol{r})\|_{4, \boldsymbol{r}}\ \right\|_{4, {\boldsymbol{x}}} = \sqrt{3}^k \|\widetilde{f}\|_4 = \sqrt{3}^k \|f\|_4. \] Thus we see that we can deduce \eqref{eqn:low-deg-proj-4-bounded} “automatically” for these luckily symmetric $f$, with no dependence on “$\lambda$”. We’ll now show that we can get something similar for a completely general $f$ using the randomization/symmetrization Theorem 34. This will cause us to lose a factor of $(2\cdot\tfrac52)^k$, due to application of $\mathrm{T}_2$ and $\mathrm{T}_{\frac52}$; to prepare for this, we first extend the calculation in \eqref{eqn:low-deg-proj-4-bounded} slightly.

The remainder of this section is devoted to the proof of Theorem 34, which lets us compare norms of a function and its randomization/symmetrization. It will help to view randomization/symmetrization from an operator perspective. To do this, we need to slightly extend our $\mathrm{T}_\rho$ notation, allowing for “different noise rates on different coordinates”.

These generalizations of the noise operator behave the way you would expect; you are referred to the exercises for some basic properties. Now comparing \eqref{eqn:single-T2} and \eqref{eqn:randsymm-def} reveals the connection to randomization/symmetrization:

In other words, randomization/symmetrization of $f$ means applying $\mathrm{T}_{(\pm 1, \pm 1, \dots, \pm 1)}$ to $f$ for a random choice of signs. We use this viewpoint to prove Theorem 34, which we do in two steps:

Proof: We will only prove the statement for $q = 4$; you are asked to establish the general case in the exercises. By homogeneity we may assume $a = 1$; then raising the inequality to the $4$th power we need to show \[ \mathop{\bf E}[(1 - c \boldsymbol{X})^4] \leq \mathop{\bf E}[(1 + \boldsymbol{X})^4] \] for small enough $c$. Expanding both sides and using $\mathop{\bf E}[\boldsymbol{X}] = 0$, this is equivalent to \begin{equation} \label{eqn:unsymm} \mathop{\bf E}[(1 - c^4) \boldsymbol{X}^4 + (4+4c^3)\boldsymbol{X}^3 + (6 - 6c^2)\boldsymbol{X}^2] \geq 0. \end{equation} It suffices to find $c$ such that \begin{equation} \label{eqn:unsymm-ex} (1 - c^4) x^2 + (4+4c^3)x + (6 - 6c^2) \geq 0 \quad \forall x \in {\mathbb R}; \end{equation} then we can multiply this inequality by $x^2$ and take expectations to obtain \eqref{eqn:unsymm}. This last problem is elementary, and in the exercises you’re asked to find the largest $c$ which works (the answer is $c \approx .435$). To see that $c = \frac25$ suffices, we use the fact that $x \geq -\frac29 x^2 - \frac98$ for all $x$ (because the difference of the left- and right-hand sides is $\frac{1}{72}(4x+9)^2$). Putting this into \eqref{eqn:unsymm-ex}, it remains to ensure \[ (\tfrac19-\tfrac89c^3-c^4)x^2 + (\tfrac32 -6c^2-\tfrac92c^3) \geq 0 \quad \forall x \in {\mathbb R}, \] and when $c = \frac25$ this is the trivially true statement $\frac{161}{5625}x^2+\frac{63}{250} \geq 0$. $\Box$
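The elementary problem in \eqref{eqn:unsymm-ex} can also be checked numerically. The sketch below is not part of the proof, and the function names `discriminant` and `largest_c` are hypothetical. It uses the fact that for $0 < c < 1$ the quadratic in $x$ has positive leading coefficient, so it is nonnegative for all real $x$ exactly when its discriminant is at most zero. Bisection then locates the threshold value of $c$, assuming the discriminant changes sign once on the search interval.

```python
def discriminant(c):
    """Discriminant of (1 - c^4) x^2 + (4 + 4c^3) x + (6 - 6c^2), viewed as a quadratic in x."""
    return (4 + 4 * c ** 3) ** 2 - 4 * (1 - c ** 4) * (6 - 6 * c ** 2)

def largest_c(lo=0.0, hi=0.9, iters=60):
    """Bisection for the sign change of the discriminant on [lo, hi].

    Assumes discriminant(lo) < 0 < discriminant(hi) with a single crossing in between.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2
        if discriminant(mid) < 0:
            lo = mid  # quadratic is still nonnegative everywhere; try a larger c
        else:
            hi = mid
    return lo

c_star = largest_c()
print(discriminant(0.4) < 0)  # whether c = 2/5 passes the discriminant test
print(c_star)
```

The discriminant is negative at $c = \frac25$, and the bisection settles near $0.4354$, consistent with the value $c \approx .435$ quoted in the proof.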

Proof: In fact, we can show that for every outcome $\boldsymbol{r} = r \in \{-1,1\}^n$ we have \[ \|\mathrm{T}_{c_q r} f({\boldsymbol{x}})\|_{q,{\boldsymbol{x}}} \leq \| f({\boldsymbol{x}})\|_{q, {\boldsymbol{x}}} \] for sufficiently small $c_q > 0$. Note that on the left-hand side we have \[ \|\mathrm{T}^1_{\pm c_q} \mathrm{T}^2_{\pm c_q} \cdots \mathrm{T}^n_{\pm c_q} f({\boldsymbol{x}})\|_{q, {\boldsymbol{x}}}. \] We know that $\mathrm{T}^i_\rho$ is a contraction in $L^q$ for any $0 \leq \rho \leq 1$ (exercise). Hence it suffices to show that $\mathrm{T}^i_{-c_q}$ is a contraction in $L^q$; i.e., that \begin{equation} \label{eqn:unsymm1} \|\mathrm{T}^i_{-c_q} g({\boldsymbol{x}})\|_{q,{\boldsymbol{x}}} \leq \|g({\boldsymbol{x}})\|_{q,{\boldsymbol{x}}} \end{equation} for all $g \in L^2(\Omega^n, \pi^{\otimes n})$. Similar to the proof of Theorem 40, it suffices to show \begin{equation} \label{eqn:negative-contract} \|\mathrm{T}_{-c_q} h\|_q \leq \|h\|_q \end{equation} for all one-input functions $h \in L^2(\Omega, \pi)$, because then \eqref{eqn:unsymm1} holds pointwise for all outcomes of ${\boldsymbol{x}}_1, \dots, {\boldsymbol{x}}_{i-1}, {\boldsymbol{x}}_{i+1}, \dots, {\boldsymbol{x}}_n$. By Proposition 9.19, if we prove \eqref{eqn:negative-contract} for some $q$ then the same constant $c_q$ works for the conjugate Hölder index $q'$; thus we may restrict attention to $q \geq 2$. Now the result follows from Lemma 41 by taking $a = h^{=\emptyset}$ and $\boldsymbol{X} = h^{=\{1\}}({\boldsymbol{x}})$. $\Box$