In a recent paper with Alex Arkhipov on "The Computational Complexity of Linear Optics," we needed to assume a reasonable-sounding probabilistic conjecture: namely, that the permanent of a matrix of i.i.d. Gaussian entries is "not too concentrated around 0." Here's a formal statement of our conjecture:

Conjecture (permanent anti-concentration). There exists a polynomial $p$ such that for all $n$ and all $\delta>0$,
$$\Pr_{X\sim\mathcal{N}\left(0,1\right)_{\mathbb{C}}^{n\times n}}\left[\left\vert\operatorname{Per}\left(X\right)\right\vert<\frac{\sqrt{n!}}{p\left(n,1/\delta\right)}\right]<\delta.$$

This conjecture seems interesting even apart from our application, so I wanted to bring it to people's attention -- maybe there's a simple/known proof that we're missing!

Here's what we do know:

The expectation of Per(X) is of course 0 (by symmetry), while the standard deviation is $\sqrt{n!}$. Thus, our conjecture basically says that "Per(X) is polynomially smaller than its standard deviation only a 1/poly(n) fraction of the time."

Recently, Terry Tao and Van Vu proved a wonderful anti-concentration bound for the permanents of Bernoulli matrices, which can be stated as follows: for all $\varepsilon > 0$ and sufficiently large n,
$$\Pr_{X\in\left\{-1,1\right\}^{n\times n}}\left[\left\vert\operatorname{Per}\left(X\right)\right\vert\leq\frac{\sqrt{n!}}{n^{\varepsilon n}}\right]\leq\frac{1}{n^{0.1}}.$$
Unfortunately, their result falls short of what we need in three respects. First, it's for Bernoulli matrices rather than Gaussian matrices. (Though of course, the Gaussian case might well be easier than the Bernoulli case, which is our main reason for optimism!) Second, and most importantly, Tao and Vu only prove that Per(X) is at least a $1/n^{\varepsilon n}$ fraction of its standard deviation with high probability, whereas we need that it's at least a $1/\mathrm{poly}(n)$ fraction. Third, they upper-bound the probability of the "bad event" by $1/n^{0.1}$, whereas we'd like to upper-bound it by $1/p(n)$ for any polynomial $p$.

We can prove that our conjecture holds with the determinant in place of the permanent. To do so, we use the fact that if X is Gaussian, then because of the rotational invariance of the Gaussian measure, there's an explicit formula for all the moments of Det(X) -- even the fractional and inverse moments.

One might wonder whether we could also compute the higher moments of Per(X), and use them to prove our conjecture. Indeed, we can show that
$$\operatorname*{E}_{X\sim\mathcal{N}\left(0,1\right)_{\mathbb{C}}^{n\times n}}\left[\left\vert\operatorname{Per}\left(X\right)\right\vert^{4}\right]=\left(n!\right)^{2}\left(n+1\right),$$
which then implies (via a Paley-Zygmund-type second-moment argument) the following weak anti-concentration bound: for all $\beta<1$,
$$\Pr_{X\sim\mathcal{N}\left(0,1\right)_{\mathbb{C}}^{n\times n}}\left[\left\vert\operatorname{Per}\left(X\right)\right\vert\geq\beta\sqrt{n!}\right]\geq\frac{\left(1-\beta^{2}\right)^{2}}{n+1}.$$
Unfortunately, computing the 6th, 8th, and higher moments seems difficult.
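As a quick numerical sanity check (a sketch added here, not part of the original exchange), the second- and fourth-moment identities are easy to test by Monte Carlo for small $n$, assuming the convention that each entry is a standard complex Gaussian (real and imaginary parts each of variance 1/2, so $\operatorname{E}|x_{ij}|^2=1$):

```python
import itertools
import math

import numpy as np


def permanent(A):
    """Permanent by direct summation over permutations (fine for small n)."""
    n = A.shape[0]
    return sum(
        np.prod([A[i, p[i]] for i in range(n)])
        for p in itertools.permutations(range(n))
    )


rng = np.random.default_rng(0)
n, trials = 3, 20000

second, fourth = 0.0, 0.0
for _ in range(trials):
    # Standard complex Gaussian entries: E|x_ij|^2 = 1.
    X = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    p = abs(permanent(X))
    second += p**2
    fourth += p**4

print("E|Per|^2 / n!      ~", second / trials / math.factorial(n))       # should be near 1
print("E|Per|^4 / (n!)^2  ~", fourth / trials / math.factorial(n) ** 2)  # should be near n+1 = 4
```

The direct-summation `permanent` is exponential-time, of course, but for $n=3$ this runs in seconds and reproduces both $\operatorname{E}|\operatorname{Per}(X)|^2 = n!$ and the fourth-moment formula to within sampling error.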

Short of proving our anti-concentration conjecture, here are two easier questions whose answers would also greatly interest us:

Can we at least reprove Tao and Vu's bound for Gaussian matrices rather than Bernoulli matrices? In their paper, Tao and Vu say their result holds for "virtually any (not too degenerate) discrete distribution." I don't think the Gaussian distribution would present serious new difficulties, but I'm not sure.

Does the pdf of Per(X) diverge at the origin? (We don't even know the answer to that question in the case of Det(X).) I don't know of any formal implications between this question and the anti-concentration question, but it would be great to answer anyway.

1 Answer

I did a preliminary feasibility analysis of our methods, and it appears that one may be able to tighten our $n^{\epsilon n}$ loss to something more like $\exp(\sqrt{n})$ in the Gaussian case, but this is still well short of what you want. The main obstacle is potential coupling between permanents of minors, which we were not able to fully avoid.

Here's the heuristic calculation. Suppose that a Gaussian $k \times k$ permanent has some distribution $P_k$. Then a Gaussian $(k+1) \times (k+1)$ permanent $P_{k+1}$, by cofactor expansion along a row, looks like

$$ P_{k+1} = \sum_{i=1}^{k+1} x_i P_k^{(i)},$$

where the $x_i$ are iid Gaussians and the $P_k^{(i)}$ are copies of $P_k$ corresponding to the various $k \times k$ minors of the $(k+1) \times (k+1)$ matrix. (Unlike for the determinant, the cofactor expansion of the permanent carries no signs.)
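This cofactor expansion is easy to verify numerically; here is a small check (added here as an illustration, not from the original answer), using real Gaussian entries and a brute-force permanent:

```python
import itertools

import numpy as np


def permanent(A):
    """Permanent by direct summation over permutations (fine for small n)."""
    n = A.shape[0]
    return sum(
        np.prod([A[i, p[i]] for i in range(n)])
        for p in itertools.permutations(range(n))
    )


rng = np.random.default_rng(42)
k = 4
M = rng.standard_normal((k + 1, k + 1))

# Expand along the first row: Per(M) = sum_i M[0, i] * Per(minor_i),
# where minor_i deletes row 0 and column i (no signs, unlike the determinant).
minors = [np.delete(np.delete(M, 0, axis=0), i, axis=1) for i in range(k + 1)]
expansion = sum(M[0, i] * permanent(minors[i]) for i in range(k + 1))

assert np.isclose(expansion, permanent(M))
```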

As the entries of the $(k+1) \times (k+1)$ matrix are iid, the $x_i$ are independent of the $P_k^{(i)}$, and so, conditioned on the values of the $P_k^{(i)}$, $P_{k+1}$ is distributed normally with mean zero and variance $\sum_{i=1}^{k+1} |P_k^{(i)}|^2$, which we can rewrite as

$$ P_{k+1} = \Big( \sum_{i=1}^{k+1} |P_k^{(i)}|^2 \Big)^{1/2} N_{k+1}, \qquad (1)$$

where $N_{k+1}$ is a standard normal random variable (independent of the $P_k^{(i)}$).

Now we come up against the key problem: the $P_k^{(i)}$ are identically distributed, but are not jointly independent, because there is a huge amount of overlap between the $k \times k$ minors. So, while heuristically one expects concentration of measure to kick in and make $(\sum_{i=1}^{k+1} |P_k^{(i)}|^2)^{1/2}$ more concentrated than any one of the $P_k^{(i)}$, we don't know how to prevent huge correlations from happening. In the worst case, all the $P_k^{(i)}$ are perfectly correlated to each other, and then (1) could become something more like

$$ P_{k+1} = (k+1)^{1/2} |P_k| \cdot N_{k+1}.$$

This multiplicative normal process would lead $P_n$ to concentrate between $\sqrt{n!} \exp(-O(\sqrt{n}))$ and $\sqrt{n!} \exp(O(\sqrt{n}))$, as can be seen by taking logs and applying the central limit theorem.
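To see the CLT step concretely, here is a small simulation (added as an illustration, not from the original answer). Taking logs of the multiplicative process gives $\log|P_n| = \frac{1}{2}\log n! + \sum_{k=1}^{n} \log|N_k|$, and since the $\log|N_k|$ are iid with finite variance ($\operatorname{Var}(\log|N|) = \pi^2/8$ for a standard normal), the fluctuation of the sum about its mean grows like $\sqrt{n}$:

```python
import numpy as np

rng = np.random.default_rng(1)
paths = 2000


def log_fluctuation(n):
    """Std of sum_{k=1}^n log|N_k| over many sample paths, N_k iid standard normal."""
    sums = np.log(np.abs(rng.standard_normal((paths, n)))).sum(axis=1)
    return sums.std()


s_400, s_1600 = log_fluctuation(400), log_fluctuation(1600)
# Theory predicts std ~ (pi / sqrt(8)) * sqrt(n), so quadrupling n
# should roughly double the fluctuation:
print(s_400, s_1600, s_1600 / s_400)  # ratio should be near 2
```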

But this worst case can't actually be the truth -- among other things, it contradicts the second moment calculation. So there should be some way to prevent this correlation. Unfortunately, my paper with Van completely evades this issue -- we try to get as far as we can just from the obvious fact that disjoint minors are independent of each other. This is why our bounds are significantly further than $\exp(O(\sqrt{n}))$ from the truth.

As you say, the situation is much better for the determinant of an iid Gaussian matrix. Here, we can use the base-times-height formula to express the absolute value of the determinant as a product $\prod_{i=1}^n \mathrm{dist}(X_i,V_i)$, where $X_i$ is the $i$th row and $V_i$ is the span of the first $i-1$ rows. With everything being Gaussian, each $\mathrm{dist}(X_i,V_i)^2$ is a chi-squared random variable (with $n-i+1$ degrees of freedom), independent across $i$, so the log-determinant becomes a sum of independent random variables; as a consequence one can get the determinant within $\exp(O(\sqrt{\log n}))$ of $\sqrt{(n-1)!}$, which would give you what you want. Unfortunately, there is nothing like the base-times-height formula for the permanent...
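The base-times-height factorization itself is easy to check numerically (a sketch added here as an illustration). Taking a QR decomposition of $X^{\mathsf{T}}$, the absolute values of the diagonal entries of $R$ are exactly the distances $\mathrm{dist}(X_i, V_i)$, and their product recovers $|\det X|$:

```python
import numpy as np

rng = np.random.default_rng(7)
n = 8
X = rng.standard_normal((n, n))

# QR of X^T: column i of X^T is row i of X, and |R[i, i]| is the distance
# from row i of X to the span V_i of rows 0..i-1 (Gram-Schmidt residual).
_, R = np.linalg.qr(X.T)
dists = np.abs(np.diag(R))

# Base-times-height: |det X| equals the product of the distances.
assert np.isclose(abs(np.linalg.det(X)), dists.prod())
```

Each `dists[i] ** 2` is distributed as chi-squared with $n-i$ degrees of freedom (0-indexed), which is what makes the log-determinant a sum of independent terms.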

Finally, I am fairly certain that, at the level of $n^{-\epsilon n}$ losses, one can transfer our arguments from the Bernoulli case to the Gaussian case. I don't think the $n^{-\epsilon n}$ loss gets much better in that case, but the $\frac{1}{n^{0.1}}$ bound should improve substantially.

Thanks so much, Terry; that's incredibly helpful! Improving the $n^{\varepsilon n}$ loss to $\exp(\sqrt{n})$ would be great progress. I agree that finding some way to control the coupling between minors is the key issue -- we also realized that, if (counterfactually) the minors were perfectly correlated, then the bound we're seeking would be false. Alas, since the permanent is #P-complete, it will presumably never have a geometric interpretation like the determinant's, which is what lets one sidestep the coupling issue there. Still, it would be great if we could somehow avoid "slogging it out minor by minor"! :-)
– Scott Aaronson, Nov 12 '10 at 18:30