The location of an individual eigenvalue $\lambda_i(M_n)$ is by now quite well understood. If we normalise the entries of the matrix to have mean zero and variance $1$, then in the asymptotic limit $n \rightarrow \infty$, the Wigner semicircle law tells us that with probability $1-o(1)$ one has

$\displaystyle \lambda_i(M_n) = \sqrt{n} ( \gamma_{i/n} + o(1) )$

where the classical location $\gamma_{i/n} \in [-2,2]$ of the eigenvalue is given by the formula

$\displaystyle \int_{-2}^{\gamma_{i/n}} \rho_{sc}(x)\ dx = \frac{i}{n}$

and the semicircular distribution $\rho_{sc}(x)\ dx$ is given by the formula

$\displaystyle \rho_{sc}(x) := \frac{1}{2\pi} (4 - x^2)_+^{1/2}.$
Actually, one can improve the error term here from $o(1)$ to $O(n^{-1+\varepsilon})$ for any $\varepsilon > 0$ (see this recent paper of Van and myself for more discussion of these sorts of estimates, sometimes known as eigenvalue rigidity estimates).

From the semicircle law (and the fundamental theorem of calculus), one expects the eigenvalue spacing $\lambda_{i+1}(M_n) - \lambda_i(M_n)$ to have an average size of $\frac{1}{\sqrt{n} \rho_{sc}(\gamma_{i/n})}$. It is thus natural to introduce the normalised eigenvalue spacing

$\displaystyle X_i := \sqrt{n} \rho_{sc}(\gamma_{i/n}) ( \lambda_{i+1}(M_n) - \lambda_i(M_n) )$

and ask what the distribution of $X_i$ is.
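As an illustration of these normalisations, here is a quick numerical sketch (not from the paper; the matrix size, the seed, and the use of the empirical location $\lambda_i(M_n)/\sqrt{n}$ as a proxy for the classical location $\gamma_{i/n}$ are all illustrative choices of mine): sampling a GUE matrix and forming the normalised spacings in the bulk, their average should be close to $1$.

```python
import numpy as np

def rho_sc(x):
    """Semicircular density: (1/(2*pi)) * (4 - x^2)_+^(1/2)."""
    return np.sqrt(np.maximum(4.0 - x**2, 0.0)) / (2.0 * np.pi)

def gue(n, rng):
    """Sample an n x n GUE matrix, normalised so entries have mean 0, variance 1."""
    g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
    return (g + g.conj().T) / np.sqrt(2)

rng = np.random.default_rng(0)
n = 400
lam = np.linalg.eigvalsh(gue(n, rng))      # sorted eigenvalues of M_n
bulk = slice(n // 4, 3 * n // 4)           # indices well inside the bulk
# normalised spacings, using lambda_i / sqrt(n) in place of gamma_{i/n}
X = (np.sqrt(n) * rho_sc(lam[:-1] / np.sqrt(n)) * np.diff(lam))[bulk]
```

With these conventions the spacings $X_i$ have unit mean density, so the bulk average of `X` should hover near $1$ already at moderate $n$.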

As mentioned previously, we will focus on the bulk case $\varepsilon n \leq i \leq (1-\varepsilon) n$, and begin with the model case when $M_n$ is drawn from GUE. (In the edge case when $i$ is close to $1$ or to $n$, the distribution is instead given by the famous Tracy-Widom law.) Here, the distribution was almost (but as we shall see, not quite) worked out by Gaudin and Mehta. By using the theory of determinantal processes, they were able to compute a quantity closely related to $X_i$, namely the probability

$\displaystyle {\bf P}( N_{[\sqrt{n} x, \sqrt{n} x + \frac{s}{\sqrt{n} \rho_{sc}(x)}]} = 0 ) \ \ \ \ \ (1)$

that an interval $[\sqrt{n} x, \sqrt{n} x + \frac{s}{\sqrt{n} \rho_{sc}(x)}]$ near $\sqrt{n} x$ of length comparable to the expected eigenvalue spacing $\frac{1}{\sqrt{n} \rho_{sc}(x)}$ is devoid of eigenvalues (here $N_I$ denotes the number of eigenvalues of $M_n$ in the interval $I$). For $x$ in the bulk $(-2+\varepsilon, 2-\varepsilon)$ and fixed $s$, they showed that this probability is equal to

$\displaystyle \det( 1 - 1_{[0,s]} P 1_{[0,s]} ) + o(1),$

where $P$ is the Dyson projection

$\displaystyle Pf(x) := \int_{\bf R} \frac{\sin(\pi(x-y))}{\pi(x-y)} f(y)\ dy$

to Fourier modes in $[-1/2,1/2]$, and $\det$ denotes the Fredholm determinant. As shown by Jimbo, Miwa, Môri, and Sato, this determinant can also be expressed in terms of a solution to a Painlevé V ODE, though we will not need this fact here. In view of this asymptotic and some standard integration by parts manipulations, it becomes plausible to propose that $X_i$ will be asymptotically distributed according to the Gaudin-Mehta distribution $p(s)\ ds$, where

$\displaystyle p(s) := \frac{d^2}{ds^2} \det( 1 - 1_{[0,s]} P 1_{[0,s]} ).$
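Numerically, Fredholm determinants of this type can be approximated very accurately by discretising the sine kernel with Gauss-Legendre quadrature, in the spirit of Bornemann's method for computing Fredholm determinants. The sketch below is purely illustrative (the function name and quadrature size $m$ are my own choices, not anything from the paper):

```python
import numpy as np

def gap_probability(s, m=40):
    """Nystrom approximation to det(1 - 1_[0,s] P 1_[0,s]), i.e. the limiting
    probability that the rescaled interval [0, s] contains no eigenvalues."""
    nodes, weights = np.polynomial.legendre.leggauss(m)
    x = 0.5 * s * (nodes + 1.0)            # quadrature nodes mapped to [0, s]
    w = 0.5 * s * weights
    sw = np.sqrt(w)
    K = np.sinc(x[:, None] - x[None, :])   # sine kernel sin(pi(x-y)) / (pi(x-y))
    return np.linalg.det(np.eye(m) - sw[:, None] * K * sw[None, :])
```

Since the kernel equals $1$ on the diagonal, the determinant behaves like $1 - s + O(s^4)$ for small $s$, and the Gaudin-Mehta density $p(s)$ can then be recovered by numerical second differentiation in $s$.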
A reasonably accurate approximation for $p$ is given by the Wigner surmise $p(s) \approx \frac{32}{\pi^2} s^2 e^{-4s^2/\pi}$, which was presciently proposed by Wigner as early as 1957; it is exact for $n=2$ but not in the asymptotic limit $n \rightarrow \infty$.
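As a quick sanity check (a numerical sketch of my own, not from the text): the GUE form of the surmise, $\frac{32}{\pi^2} s^2 e^{-4s^2/\pi}$, integrates to $1$ and has mean $1$, consistent with the normalisation of the spacings $X_i$.

```python
import numpy as np

# Wigner surmise (GUE form): p(s) = (32 / pi^2) * s^2 * exp(-4 s^2 / pi)
s = np.linspace(0.0, 10.0, 200001)
p = (32.0 / np.pi**2) * s**2 * np.exp(-4.0 * s**2 / np.pi)

ds = s[1] - s[0]
total = np.sum(p) * ds     # integral of the density (should be 1)
mean = np.sum(s * p) * ds  # mean normalised spacing (should also be 1)
```

Both identities can also be verified exactly with Gaussian integrals; the numerical check is just a convenient confirmation that the constants $\frac{32}{\pi^2}$ and $\frac{4}{\pi}$ are matched correctly.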

Unfortunately, when one tries to make this argument rigorous, one finds that the asymptotic for (1) does not control a single gap $X_i$, but rather an ensemble of gaps $X_i$, where $i$ is drawn from an interval of some moderate size $L$ (e.g. $L = \log^{O(1)} n$); see for instance this paper of Deift, Kriecherbauer, McLaughlin, Venakides, and Zhou for a more precise formalisation of this statement (which is phrased slightly differently, in which one samples all gaps inside a fixed window of spectrum, rather than inside a fixed range of eigenvalue indices $i$). (This result is stated for GUE, but can be extended to other Wigner ensembles by the Four Moment Theorem, at least if one assumes a moment matching condition; see this previous paper with Van Vu for details. The moment condition can in fact be removed, as was done in this subsequent paper with Erdős, Ramírez, Schlein, Vu, and Yau.)

The problem is that when one specifies a given window of spectrum such as $[\sqrt{n} x, \sqrt{n} x + \frac{s}{\sqrt{n} \rho_{sc}(x)}]$, one cannot quite pin down in advance which eigenvalues $\lambda_i(M_n)$ are going to lie to the left or right of this window; even with the strongest eigenvalue rigidity results available, there is a natural uncertainty of $\sqrt{\log n}$ or so in the index $i$ (as can be quantified quite precisely by this central limit theorem of Gustavsson).

The main difficulty here is that there could potentially be some strange coupling between the event (1) of an interval being devoid of eigenvalues, and the number $N_{(-\infty, \sqrt{n} x)}$ of eigenvalues to the left of that interval. For instance, one could conceive of a possible scenario in which the interval in (1) tends to have many eigenvalues when $N_{(-\infty, \sqrt{n} x)}$ is even, but very few when $N_{(-\infty, \sqrt{n} x)}$ is odd. In this sort of situation, the gaps $X_i$ may have different behaviour for even $i$ than for odd $i$, and such anomalies would not be picked up in the averaged statistics in which $i$ is allowed to range over some moderately large interval.

The main result of the current paper is that these anomalies do not actually occur, and that all of the eigenvalue gaps $X_i$ in the bulk are asymptotically governed by the Gaudin-Mehta law, without the need for averaging in the $i$ parameter. Again, this is shown first for GUE, and then extended to other Wigner matrices obeying a matching moment condition using the Four Moment Theorem. (It is likely that the moment matching condition can be removed here, but I was unable to achieve this, despite all the recent advances in establishing universality of local spectral statistics for Wigner matrices, mainly because the universality results in the literature are more focused on specific energy levels $u$ than on specific eigenvalue indices $i$. To make matters worse, in some cases universality is currently known only after an additional averaging in the energy parameter.)

The main task in the proof is to show that the random variable $N_{(-\infty, \sqrt{n} x)}$ is largely decoupled from the event in (1) when $M_n$ is drawn from GUE. To do this we use some of the theory of determinantal processes, and in particular the nice fact that when one conditions a determinantal process to the event that a certain spatial region (such as an interval) contains no points of the process, then one obtains a new determinantal process (with a kernel that is closely related to the original kernel). The remaining task is then to obtain sufficiently good control on the distance between the new determinantal kernel and the old one, which we do by some functional-analytic considerations involving the manipulation of norms of operators (specifically, the operator norm, Hilbert-Schmidt norm, and nuclear norm). Amusingly, the Fredholm alternative makes a key appearance, as I end up having to invert a compact perturbation of the identity at one point (specifically, I need to invert $1 - 1_J P 1_J$, where $P$ is the Dyson projection and $J$ is an interval). As such, the bounds in my paper become ineffective, though I am sure that with more work one can invert this particular perturbation of the identity by hand, without the need to invoke the Fredholm alternative.
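The invertibility issue can be seen concretely in a small numerical experiment (my own illustration, discretising the sine kernel with Gauss-Legendre quadrature; the sizes are arbitrary): the largest eigenvalue of $1_J P 1_J$ for $J = [0,s]$ is strictly less than $1$, so $1 - 1_J P 1_J$ is indeed invertible, but it creeps towards $1$ as $|J|$ grows, which is why a soft tool such as the Fredholm alternative is convenient here.

```python
import numpy as np

def top_eigenvalue(s, m=60):
    """Largest eigenvalue of a quadrature discretisation of 1_[0,s] P 1_[0,s],
    where P is the Dyson projection with kernel sin(pi(x-y)) / (pi(x-y))."""
    nodes, weights = np.polynomial.legendre.leggauss(m)
    x = 0.5 * s * (nodes + 1.0)   # nodes mapped to [0, s]
    w = 0.5 * s * weights
    sw = np.sqrt(w)
    # symmetrised Nystrom matrix, with the same spectrum as the discretised operator
    K = sw[:, None] * np.sinc(x[:, None] - x[None, :]) * sw[None, :]
    return float(np.linalg.eigvalsh(K)[-1])
```

On an interval of length $1$ the top eigenvalue is already a sizeable fraction of $1$, and on an interval of length $4$ it is extremely close to $1$ (these are the classical prolate spheroidal eigenvalues), so the inverse $(1 - 1_J P 1_J)^{-1}$ has large norm for long intervals.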

We can now turn attention to one of the centerpiece universality results in random matrix theory, namely the Wigner semi-circle law for Wigner matrices. Recall from previous notes that a Wigner Hermitian matrix ensemble is a random matrix ensemble $M_n = (\xi_{ij})_{1 \leq i,j \leq n}$ of Hermitian matrices (thus $\xi_{ij} = \overline{\xi_{ji}}$; this includes real symmetric matrices as an important special case), in which the upper-triangular entries $\xi_{ij}$, $i > j$ are iid complex random variables with mean zero and unit variance, and the diagonal entries $\xi_{ii}$ are iid real variables, independent of the upper-triangular entries, with bounded mean and variance. Particular special cases of interest include the Gaussian Orthogonal Ensemble (GOE), the symmetric random sign matrices (aka symmetric Bernoulli ensemble), and the Gaussian Unitary Ensemble (GUE).

In previous notes we saw that the operator norm of $M_n$ was typically of size $O(\sqrt{n})$, so it is natural to work with the normalised matrix $\frac{1}{\sqrt{n}} M_n$. Accordingly, given any $n \times n$ Hermitian matrix $M_n$, we can form the (normalised) empirical spectral distribution (or ESD for short)

$\displaystyle \mu_{\frac{1}{\sqrt{n}} M_n} := \frac{1}{n} \sum_{j=1}^n \delta_{\lambda_j(M_n)/\sqrt{n}}$

of $M_n$, where $\lambda_1(M_n) \leq \ldots \leq \lambda_n(M_n)$ are the (necessarily real) eigenvalues of $M_n$, counting multiplicity. The ESD is a probability measure, which can be viewed as a distribution of the normalised eigenvalues $\frac{1}{\sqrt{n}} \lambda_j(M_n)$ of $M_n$.
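In code, the ESD is just the uniform probability measure on the normalised eigenvalues, and integrating a test function against it is an average. A minimal sketch (the ensemble, dimension, and seed are arbitrary choices of mine), using the symmetric random sign ensemble mentioned above:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
# symmetric Bernoulli (random sign) Wigner matrix
a = rng.choice([-1.0, 1.0], size=(n, n))
M = np.triu(a, 1) + np.triu(a, 1).T + np.diag(np.diag(a))
lam = np.linalg.eigvalsh(M) / np.sqrt(n)   # atoms of the ESD of M / sqrt(n)

def esd_integral(phi):
    """Integrate a test function against the ESD: (1/n) * sum_j phi(lambda_j / sqrt(n))."""
    return float(np.mean(phi(lam)))

total_mass = esd_integral(np.ones_like)    # the ESD is a probability measure
```

Evaluating `esd_integral` on a family of test functions is exactly the pairing $\int_{\bf R} \varphi\ d\mu_{\frac{1}{\sqrt{n}} M_n}$ that the convergence notions below are phrased in terms of.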

When $M_n$ is a random matrix ensemble, then the ESD $\mu_{\frac{1}{\sqrt{n}} M_n}$ is now a random measure – i.e. a random variable taking values in the space $\Pr({\bf R})$ of probability measures on the real line. (Thus, the distribution of $\mu_{\frac{1}{\sqrt{n}} M_n}$ is a probability measure on probability measures!)

Now we consider the behaviour of the ESD of a sequence of Hermitian matrix ensembles $M_n$ as $n \rightarrow \infty$. Recall from Notes 0 that for any sequence of random variables in a $\sigma$-compact metrisable space, one can define notions of convergence in probability and convergence almost surely. Specialising these definitions to the case of random probability measures on ${\bf R}$, and to deterministic limits, we see that a sequence of random ESDs $\mu_{\frac{1}{\sqrt{n}} M_n}$ converge in probability (resp. converge almost surely) to a deterministic limit $\mu$ (which, confusingly enough, is a deterministic probability measure!) if, for every test function $\varphi \in C_c({\bf R})$, the quantities $\int_{\bf R} \varphi\ d\mu_{\frac{1}{\sqrt{n}} M_n}$ converge in probability (resp. converge almost surely) to $\int_{\bf R} \varphi\ d\mu$.

Remark 1 As usual, convergence almost surely implies convergence in probability, but not vice versa. In the special case of random probability measures, there is an even weaker notion of convergence, namely convergence in expectation, defined as follows. Given a random ESD $\mu_{\frac{1}{\sqrt{n}} M_n}$, one can form its expectation ${\bf E} \mu_{\frac{1}{\sqrt{n}} M_n} \in \Pr({\bf R})$, defined via duality (the Riesz representation theorem) as

$\displaystyle \int_{\bf R} \varphi\ d{\bf E} \mu_{\frac{1}{\sqrt{n}} M_n} := {\bf E} \int_{\bf R} \varphi\ d\mu_{\frac{1}{\sqrt{n}} M_n};$

this probability measure can be viewed as the law of a random eigenvalue $\frac{1}{\sqrt{n}} \lambda_j(M_n)$ drawn from a random matrix $M_n$ from the ensemble. We then say that the ESDs converge in expectation to a limit $\mu$ if ${\bf E} \mu_{\frac{1}{\sqrt{n}} M_n}$ converges in the vague topology to $\mu$, thus

$\displaystyle {\bf E} \int_{\bf R} \varphi\ d\mu_{\frac{1}{\sqrt{n}} M_n} \rightarrow \int_{\bf R} \varphi\ d\mu$

for all $\varphi \in C_c({\bf R})$.

In general, these notions of convergence are distinct from each other; but in practice, one often finds in random matrix theory that these notions are effectively equivalent to each other, thanks to the concentration of measure phenomenon.

Exercise 1 Let $M_n$ be a sequence of Hermitian matrix ensembles, and let $\mu$ be a continuous probability measure on ${\bf R}$.

Show that $\mu_{\frac{1}{\sqrt{n}} M_n}$ converges almost surely to $\mu$ if and only if $\mu_{\frac{1}{\sqrt{n}} M_n}(-\infty,\lambda)$ converges almost surely to $\mu(-\infty,\lambda)$ for all $\lambda \in {\bf R}$.

Show that $\mu_{\frac{1}{\sqrt{n}} M_n}$ converges in probability to $\mu$ if and only if $\mu_{\frac{1}{\sqrt{n}} M_n}(-\infty,\lambda)$ converges in probability to $\mu(-\infty,\lambda)$ for all $\lambda \in {\bf R}$.

Show that $\mu_{\frac{1}{\sqrt{n}} M_n}$ converges in expectation to $\mu$ if and only if ${\bf E} \mu_{\frac{1}{\sqrt{n}} M_n}(-\infty,\lambda)$ converges to $\mu(-\infty,\lambda)$ for all $\lambda \in {\bf R}$.

We can now state the Wigner semi-circular law.

Theorem 1 (Semicircular law) Let $M_n$ be the top left $n \times n$ minors of an infinite Wigner matrix $(\xi_{ij})_{i,j \geq 1}$. Then the ESDs $\mu_{\frac{1}{\sqrt{n}} M_n}$ converge almost surely (and hence also in probability and in expectation) to the Wigner semi-circular distribution

$\displaystyle \mu_{sc} := \frac{1}{2\pi} (4 - |x|^2)_+^{1/2}\ dx. \ \ \ \ \ (1)$
The semi-circular law nicely complements the upper Bai-Yin theorem from Notes 3, which asserts that (in the case when the entries have finite fourth moment, at least) the matrices $\frac{1}{\sqrt{n}} M_n$ almost surely have operator norm at most $2 + o(1)$. Note that the operator norm is the same thing as the largest magnitude of the eigenvalues. Because the semi-circular distribution (1) is supported on the interval $[-2,2]$ with positive density on the interior of this interval, Theorem 1 easily supplies the lower Bai-Yin theorem, that the operator norm of $\frac{1}{\sqrt{n}} M_n$ is almost surely at least $2 - o(1)$, and thus (in the finite fourth moment case) the norm is in fact equal to $2 + o(1)$. Indeed, we have just shown that the semi-circular law provides an alternate proof of the lower Bai-Yin bound (Proposition 11 of Notes 3).
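Both halves of this discussion are easy to observe numerically. In the sketch below (my own illustration; the dimension and seed are arbitrary), the extreme normalised eigenvalues of a GUE sample are already close to $\pm 2$, and the empirical CDF of the ESD is uniformly close to the semicircular CDF $F(x) = \frac{1}{2} + \frac{x \sqrt{4-x^2}}{4\pi} + \frac{1}{\pi} \arcsin(x/2)$ on $[-2,2]$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
g = (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))) / np.sqrt(2)
lam = np.linalg.eigvalsh((g + g.conj().T) / np.sqrt(2)) / np.sqrt(n)  # sorted, normalised

def sc_cdf(x):
    """CDF of the semicircular distribution, supported on [-2, 2]."""
    x = np.clip(x, -2.0, 2.0)
    return 0.5 + x * np.sqrt(4.0 - x * x) / (4.0 * np.pi) + np.arcsin(x / 2.0) / np.pi

op_norm = max(abs(lam[0]), abs(lam[-1]))   # operator norm of M_n / sqrt(n)
sup_dist = float(np.max(np.abs(np.arange(1, n + 1) / n - sc_cdf(lam))))
```

At $n = 1000$ one already expects `op_norm` within a few percent of $2$ (the Tracy-Widom fluctuations are of size $O(n^{-2/3})$) and `sup_dist` of size a few times $10^{-3}$, consistent with eigenvalue rigidity.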

As will hopefully become clearer in the next set of notes, the semi-circular law is the noncommutative (or free probability) analogue of the central limit theorem, with the semi-circular distribution (1) taking on the role of the normal distribution. Of course, there is a striking difference between the two distributions, in that the former is compactly supported while the latter is merely subgaussian. One reason for this is that the concentration of measure phenomenon is more powerful in the case of ESDs of Wigner matrices than it is for averages of iid variables; compare the concentration of measure results in Notes 3 with those in Notes 1.

There are several ways to prove (or at least to heuristically justify) the semi-circular law. In this set of notes we shall focus on the two most popular methods, the moment method and the Stieltjes transform method, together with a third (heuristic) method based on Dyson Brownian motion (Notes 3b). In the next set of notes we shall also study the free probability method, and in the set of notes after that we use the determinantal processes method (although this method is initially restricted to highly symmetric ensembles, such as GUE).

For commenters

To enter LaTeX in comments, use $latex <Your LaTeX code>$ (without the < and > signs, of course; in fact, these signs should be avoided as they can cause formatting errors). See the about page for details and for other commenting policies.