Thursday, August 19, 2004

First, thanks to Jeff Erickson for the lone comment on my previous post. For a second I was worried that I would have to post about Quantum Learning to bait Scott Aaronson into commenting. Fortunately, we're past that. I also realize now that being in Chicago I'm just a two hour drive away from you Jeff at UIUC. Maybe you, Ernie, and I should get some 3-D hotcakes at the local Waffle House-- on me.

Many top notch researchers come to visit the Toyota Technological Insitute here in Chicago. This fall alone Lenore Blum, Bruno Codenotti, Prahladh Harsha, and Jaikumar Radhakrishnan will be in residence along with TTI's regular faculty.

Yishay Mansour visited in August for about two weeks, and I had the chance to ask him about his work along with N. Linial and N. Nisan which pioneered the use of discrete Fourier analysis in computational learning theory. The paper Constant Depth Circuits, Fourier Transform, and Learnability gives a quasi-polynomial time learning algorithm for constant depth circuits with respect to the uniform distribution. More importantly, it pointed out a connection between the Fourier concentration of a Boolean function and its learnability.

If f is a Boolean function, we could imagine writing f as a multilinear polynomial in n variables mapping {-1,1}^n to {-1,1}. Every Boolean function can be written as such a polynomial whose total degree is at most n. Each coefficient of this polynomial measures the correlation of f with that monomial. These are the Fourier coefficients of f. Linial, Mansour, and Nisan showed that if the sum of the squares of the coefficients of monomials of degree d and larger is small, then there is an n^{O(d)} time algorithm for learning f with respect to the uniform distribution-- roughly speaking it is sufficient to estimate the coefficients of only the low degree monomials and output this polynomial.

What does any of this have to do with constant depth circuits? They also proved that for any circuit of depth d, the sum of the squares of the coefficients on the terms of degree (log n)^d and higher decays rapidly. From the above paragraph this gives us a quasi-polynomial time algorithm. For a rough intuition as to why this is true, consider one implication of Hastad's Switching Lemma, namely that parity cannot even be approximated by constant depth circuits. Thus every coefficient of a sufficiently large monomial (each monomial is a parity function) must be small.

As it turns out, according to Yishay, the work stemmed from an effort to prove circuit lower bounds, rather than a plan to develop new learning algorithms. The authors were inspired by Kahn, Kalai, and Linial's work on the influence of variables on Boolean functions and thought that discrete Fourier analysis might be the right tool for studying circuits. They happened to be correct-- in an unexpected way.