None of these papers got mentioned in the New York Times so the most notable paper of the year goes to Deolalikar's P ≠ NP. Many of you got upset that I didn't give this paper the respect it didn't deserve. I did appreciate the publicity the paper generated for our great open problem but the status of the P versus NP question remains: still open.

Last year we highlighted several new blogs. This year the trend is reversing as several theory bloggers have slowed down or stopped blogging. A few of our commenters got very ugly on our blog this year and we have finally given in to comment moderation, though we rarely block.

Wednesday, December 22, 2010

Yesterday the Census Bureau announced the new apportionment of the 435 representatives to states based on the 2010 census. Illinois loses one representative; Texas gains four. Not only do these changes affect the makeup of the House of Representatives but also the Electoral College that chooses the president.

Since 1940 the apportionment has been done not by a closed-form formula but by an algorithm.

Input: Pop, a population array for the 50 states.

Output: Rep, a representatives array for the 50 states.

Let Rep[i] = 1 for each state i.

For j = 51 to 435

Let i = arg max_k Pop[k]/sqrt(Rep[k]*(Rep[k]+1))

Rep[i] = Rep[i]+1

This algorithm, called the Huntington-Hill method or the Method of Equal Proportions, minimizes the relative difference between sizes of congressional districts.

Implemented naively the running time is O(rn) for n the number of states and r the number of representatives. I'll leave it to you readers to find better implementations. Can I compute how many representatives New Jersey gets from Pop in o(n) space?
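
One standard improvement: keep the priority values in a heap, which brings the naive O(rn) down to O(n + r log n). Here is a minimal Python sketch (the function name and array conventions are mine, not from any official census code):

```python
import heapq

def apportion(pop, seats=435):
    """Huntington-Hill: every state starts with 1 seat; each remaining seat
    goes to the state maximizing Pop[i]/sqrt(Rep[i]*(Rep[i]+1))."""
    n = len(pop)
    rep = [1] * n
    # heapq is a min-heap, so store negated priorities
    heap = [(-p / (1 * 2) ** 0.5, i) for i, p in enumerate(pop)]
    heapq.heapify(heap)
    for _ in range(seats - n):
        _, i = heapq.heappop(heap)
        rep[i] += 1
        r = rep[i]
        heapq.heappush(heap, (-pop[i] / (r * (r + 1)) ** 0.5, i))
    return rep
```

With equal populations every state ends up with the same number of seats, as one would hope.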

Of course with n=50 and r=435 the numbers don't get big enough to cause any problem at all for today's computers. I wonder how long it took in 1940?

Monday, December 20, 2010

We all recently heard about the breakthrough complexity result by Ryan
Williams on non-uniform circuit lower bounds. Here is a breakthrough from
the algorithms side.

Most of us have heard of Christofides' algorithm from 1976 for the metric
Traveling Salesman Problem (TSP): given an undirected graph G
with nonnegative edge weights satisfying the triangle
inequality, the goal is to find a shortest possible tour that visits each
vertex exactly once. The algorithm and its analysis are very simple. Create the
minimum spanning tree (MST) T of G, find a minimum-weight perfect matching
M (a T-join) in the complete graph over the vertices with odd degree in T,
combine the edges of M and T to form an Eulerian circuit, and shortcut the
circuit by skipping visited nodes. This simple algorithm is a
3/2-approximation algorithm for metric TSP since both the weight of T and
twice the weight of M are lower bounds for the optimum. You can write a
natural LP for the problem, known as the Held-Karp relaxation, for which one
can show an integrality gap of at most 3/2, essentially via Christofides'
algorithm. For lower bounds, we know that the integrality gap is at least
4/3 for this LP.
(ADDED BY BILL: According to Wikipedia it is known that you can never get
an approximation better than 220/219 = 1.00456... times opt, unless P = NP.
ADDED LATER:
Here is the link to the Wikipedia post and
here is the paper by Papadimitriou and Vempala that contains the result.
The post led me to find it.)
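
For concreteness, here is a sketch of the classic algorithm in Python (my own toy implementation: the MST via Prim's algorithm and the matching by brute force over all pairings, so it is only meant for tiny instances):

```python
def christofides(n, w):
    """Christofides on the complete graph K_n with metric weights w[i][j]."""
    # 1. Minimum spanning tree T (Prim's algorithm).
    in_tree, tree = {0}, []
    while len(in_tree) < n:
        i, j = min(((i, j) for i in in_tree for j in range(n) if j not in in_tree),
                   key=lambda e: w[e[0]][e[1]])
        tree.append((i, j))
        in_tree.add(j)
    # 2. Vertices of odd degree in T.
    deg = [0] * n
    for i, j in tree:
        deg[i] += 1
        deg[j] += 1
    odd = [v for v in range(n) if deg[v] % 2 == 1]
    # 3. Minimum-weight perfect matching M on the odd vertices
    #    (brute force over all pairings; exponential, for illustration only).
    def pairings(vs):
        if not vs:
            yield []
            return
        for k in range(1, len(vs)):
            for m in pairings(vs[1:k] + vs[k + 1:]):
                yield [(vs[0], vs[k])] + m
    matching = min(pairings(odd), key=lambda m: sum(w[a][b] for a, b in m))
    # 4. Eulerian circuit of the multigraph T + M (Hierholzer's algorithm).
    adj = {v: [] for v in range(n)}
    for i, j in tree + matching:
        adj[i].append(j)
        adj[j].append(i)
    stack, circuit = [0], []
    while stack:
        v = stack[-1]
        if adj[v]:
            u = adj[v].pop()
            adj[u].remove(v)
            stack.append(u)
        else:
            circuit.append(stack.pop())
    # 5. Shortcut repeated vertices (valid by the triangle inequality).
    seen, tour = set(), []
    for v in circuit:
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour
```

The 3/2 guarantee comes exactly from the two lower bounds in the analysis above: the circuit has cost w(T) + w(M), and shortcutting can only shrink it.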

Though Christofides' algorithm seems very simple and easy to improve, so
far there has been no progress in this regard despite consistent effort
since 1976. Today I'm happy to announce a breakthrough for this problem by
Shayan Oveis-Gharan, Amin Saberi and Mohit Singh.

They obtain a (3/2-c)-approximation algorithm, for some positive constant
c, for the graphical TSP where the metric is the shortest path distance in
an unweighted graph (a very natural case considered by many so far). Their
approach is also natural: they find the tour by sampling a random spanning
tree from a maximum-entropy distribution defined by the linear programming
relaxation of the problem (similar to the approach of Asadpour, Goemans,
Madry, Oveis-Gharan, and Saberi from SODA'10 on improved algorithms for
asymmetric TSP). Since, again, the cost of this tree is upper bounded by the
optimum solution, it suffices to show that the cost of its Eulerian
augmentation (or T-join) is strictly less than half of the optimum.
Of course the analysis is not simple at all. According to the authors:

``The first ingredient of the proof is a deeper study of random spanning
trees and their corresponding generating functions. We build on a
recent and very interesting study of real-stable polynomials and their
applications in discrete probability. In particular, we use these
techniques to bound the probability of events defined on the parity of
the number of edges chosen from a certain subset to be in the random
spanning tree. This is crucial for bounding the cost of the T-join.

The other essential ingredient of our proof is a more detailed
analysis of the near-minimum cuts of the LP solution. We study the
cross graph of (1+d)-near minimum cuts for some d < 1/100 and observe
several interesting properties. In particular, we show that the cross
graph corresponding to these cuts has a structure very similar to
cactuses. Our analysis generalizes to all graphs and it could be of
independent interest.''

Amin told me that they should post the paper on arXiv by the end of this month.

Thursday, December 16, 2010

If A is a set then A' (pronounced 'A jump') is
the halting set relative to A. Formally it is:

{ e | M_e^A(e) converges }

A set A is low if A' ≤_T HALT.

A set A is superlow if A' ≤_tt HALT.

There is a well known construction of an undecidable low c.e. set.
(What I call c.e., computably enumerable, used to be called r.e., recursively enumerable.
See Soare's article
for why the change makes sense.)
This is one way to get an
intermediate Turing degree.
However, it turns out that the set is not just low, it's superlow.
Consider the following definition of supersuperlow which I made up:

A set A is supersuperlow if A' ≤_btt HALT.

I wondered if there exists an undecidable supersuperlow set.
I asked four prominent computability theorists
(spellcheck wanted me to write computable theorists).
They all
(1) thought it was a good question,
(2) thought the answer was NO and KNOWN,
and
(3) didn't have a proof or reference.
I then asked Carl Jockusch and he (1) thought it was a good question,
(2) knew the answer was NO and KNOWN, and
(3) had both a proof and a reference.

Since they all thought it was a good question, and because
papers that contain the needed results are behind paywalls or unpublished, and hence lost
to humanity forever, I did an exposition, which you can get free online
here
where I present the following:

The classic proof that there exists an undecidable c.e. low set.

Comments on this classic proof that show how it really resulted in an undecidable c.e. superlow set.

A complete unpublished proof that if A is supersuperlow then A is decidable.

Should I add more to it and make it a real survey for a real journal?

PRO: The referees' comments may help improve it.

PRO: If the journal is free online then it will be better archived.

PRO: I will get to learn all of this material.

PRO: A paper on my resume.

CON: Referees can be a pain in the neck.

CON: The math arXiv is just as good as, perhaps better than, a journal for archiving.
(I will certainly polish it a bit and submit it to the math arXiv at some point.)

CON: I will have to learn all this stuff. Do I care beyond what I have already found out?
I stopped doing recursion theory back when it was still called recursion theory. Now I'd rather spend my time and energy
on Ramsey Theory and other things. (I could, of course, do a survey of Computable Ramsey Theory. Sample theorems
by Carl Jockusch: (1) If COL is a COMPUTABLE 2-coloring of the edges of the complete
graph on N then there exists a Π_2 homogeneous set.
(2) There exists a COMPUTABLE 2-coloring of the edges of the complete graph on N such
that no homogeneous set is Σ_2.
There is one problem with that: I already did! It's part of my
survey of recursive combinatorics.)

CAVEAT: Is having a survey on my resume that valuable? I ask this non-rhetorically which is why this is a
CAVEAT rather than a CON.

Monday, December 13, 2010

In the last month we have reported on NEW RESULTS by
Williams,
Katz and Guth,
Sanders,
and
Pinkerton and Setra.
For a change of pace let's look at some really OLD math from
a really OLD book- the Bible.
(NOTE- this has nothing to do with whether the Bible is true,
just that it's old.)

In Genesis 18 God wants to destroy Sodom and Gomorrah.
Abraham wants to argue against this.
Here is a paraphrase using modern terms.

GOD: If there are 50 righteous people in Sodom then I will spare the city.

ABRAHAM: What if there are 45? 45 is pretty close to 50.

GOD: Okay. If there are 45 righteous people then I will spare the city.

ABRAHAM: What if there are 40? 40 is pretty close to 45.

GOD: Okay. If there are 40 righteous people then I will spare the city.

ABRAHAM: You know, uh, 30 is pretty close to 40.

GOD: Okay. If there are 30 righteous people then I will spare the city.

ABRAHAM: 20 is only 10 less than 30 so how about....

GOD: Okay. If there are 20 righteous people then I will spare the city.

ABRAHAM: 10 is only 10 less than 20 so how about....

GOD: Okay. If there are 10 righteous people then I will spare the city.

ABRAHAM: Good. (He stops bargaining. Perhaps he shouldn't have--- they couldn't find
10 and the city was destroyed.)

I think Abraham should have used smaller increments as he went down
since going from 20 to 10 is cutting it in half which sounds like a lot.
But who am I to argue--- he got God down to 10 which is impressive
even though it didn't work.

This reminds me of two paradoxes:

The Small Number Paradox. All numbers are small. Proof by induction.
Clearly 0 is small. If n is small then adding 1 can't make it much bigger,
so n+1 is small. Hence all numbers are small.

The Sorites Paradox,
also called The Heap Paradox.
1,000,000 grains of sand makes a heap.
If n grains of sand make a heap then so do n-1 grains of sand.
Hence 0 grains of sand make a heap.

Is the passage in Genesis 18 really math?
In a very primitive form I would say yes.
Is there any older source for anything that resembles math?

Thursday, December 09, 2010

(CONGRATS to all the new
ACM fellows.
Among them are theorists Jennifer Chayes, Anne Condon, Phil Klein, S. Muthu, and Dan Spielman.)

In my post about
mentoring High School Students
I noted that for every $1000 my HS mentees win in math and science research competitions I get a free lunch.
Two of my students,
James Pinkerton and Rafael Setra entered the Team Siemens Competition and
won $23,000 each!
(The article says $20,000 but James and Rafael tell me it's $23,000.)
Looks like I will be eating well for a while.
If you visit UMCP and they are free for lunch, you can join us for a free
lunch on them.

Here is
the announcement of the winners. They came in third place in the team category.

Their project was on DUPLICATOR SPOILER games.
(Like most games defined in math papers
this game is not fun.)
Recall that these games provide a way to prove certain properties are not expressible in certain
languages (e.g., well-foundedness of linear orderings is not first-order expressible).
These games go for a FINITE number of moves that is specified before the game begins.
They looked at these games when the number of moves can be an ordinal.
Here is how it works: (1) at the beginning of the game there is a counter that holds the
ordinal α, and (2) after SPOILER makes his move he decrements the counter to some
ordinal β < α of his choice.
Here is one result: for any ordinal α there exist two linear orderings
L1 and L2 such that if the game is played with these two orderings
and both DUPLICATOR and SPOILER play perfectly
then SPOILER wins the game in exactly α moves.
This is not just an existence proof- they actually say what the orderings are.
They are natural in that they were not constructed just for the purpose of
having this property.

The money actually goes towards tuition and other college-related expenses.

What is the secret of my success, and theirs?
There are two things that are needed here (this applies to ugrad projects, Master's theses,
PhD theses, and, if properly generalized, to life):

Students who are hardworking, intelligent, and care about the material.
(James actually thinks Dup-SPOILER games are fun!)

A project that is doable.

Note that if a project doesn't go very far
it might be that the PROJECT wasn't good or that the STUDENTS weren't good.
This happens quite a bit, though ALWAYS something
can be salvaged for a high school project.

I usually get to re-use projects since some students don't do a good job
or give up. This was the FIRST TIME I had done the DUP-SPOILER GAMES WITH ORDINAL NUMBER OF MOVES
project. Now, alas, I can't use it again.

Monday, December 06, 2010

Non-uniform lower bounds establish impossibility results for computation in the physical world: it could be that P ≠ NP, yet NP-complete problems can still be efficiently solved using a “bloated” program with sufficiently many lines of code. Non-uniform circuit size lower bounds for NP would rule out this possibility.

The class P contains those languages L for which we have a single fixed program that efficiently solves L for all inputs at all lengths. L is in P/poly if for every length n there is a program of size polynomial in n that efficiently solves L for all inputs of length n. The programs for two different lengths may have no relation to each other.

Karp and Lipton showed that if NP is in P/poly then the polynomial-time hierarchy collapses so we generally don't believe that NP is in P/poly. Suppose though we lived in a world where NP is in P/poly but still P ≠ NP.

Even though we study complexity in asymptotics, we generally live at one input length: the size of the problems we want to solve. Once we find the program that works at this length then P = NP for us, even though P ≠ NP in general.

But the only constant over time is time itself. An hour now is the same as an hour in 1971, but the amount we can compute in that hour has grown dramatically. We don't care about fixed problem sizes, rather we care about the largest problem we can solve in a given amount of time. As our technology gets better, those input sizes also grow. For problems in P, like linear programming, the algorithms we have for smaller inputs also work on larger inputs by just increasing the computation time. However, if NP is in P/poly the code for NP-complete problems that worked a few years ago may fail miserably now as we may have to find completely different code for the larger inputs we care about now. If P ≠ NP we can't have an efficient process to find that code even though we know it exists.

Thursday, December 02, 2010

We use the following
terminology: [n] means the set {1,...,n}. k-AP means an arithmetic progression of length k.
A 3-free set is one with no 3-APs in it.
We use the following conventions: (1) all inequalities have a big-O or big-Ω which
we do not include, and (2) we put the pointers to the papers at the authors' names when possible.

For large n, for all A ⊆ [n], if |A| ≥ n(log log n)^5/log n
then A has a 3-AP.

I give the history and some misc information about the result.

In 1927 van der Waerden published the following theorem which is now known
as van der Waerden's theorem:

For all k, for all c, there exists W=W(k,c) such that for all c-colorings of
[W] there exists a monochromatic k-AP.

Online sources:
(1)
Wikipedia,
(2)blog by
Gilish Varma,
and
(3) preprint of a book by
Gasarch et al.
The upper bounds on W(k,c) from the original proof and the links (which are essentially the same) are BIG.
In particular they are not primitive recursive.
(NOTE- if you know of an online source for van der Waerden's original article and/or a
translation into English, let me know.)
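
The theorem is easy to check by machine for tiny parameters. The following Python snippet (my own brute force, unrelated to any proof of the theorem) verifies that W(3,2) = 9:

```python
from itertools import product

def has_mono_3ap(coloring):
    """coloring[a-1] is the color of a; look for a monochromatic 3-AP a, a+d, a+2d."""
    n = len(coloring)
    return any(coloring[a - 1] == coloring[a + d - 1] == coloring[a + 2 * d - 1]
               for d in range(1, n) for a in range(1, n - 2 * d + 1))

# every 2-coloring of [9] has a monochromatic 3-AP,
# but some 2-coloring of [8] has none, so W(3,2) = 9
assert all(has_mono_3ap(c) for c in product((0, 1), repeat=9))
assert any(not has_mono_3ap(c) for c in product((0, 1), repeat=8))
```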

(ER1)
For all k, for all ε > 0, for large enough n,
for all A ⊆ {1,...,n} such that |A| ≥ εn, A has a k-AP.

(ER2)
Let A be a set of natural numbers.
If Σ_{x ∈ A} 1/x diverges then A has arbitrarily long arithmetic progressions.

Both of these imply VDW's theorem. The hope was that they would provide new proofs
with better bounds.

Roth showed ER1 for k=3. In fact, he showed the following:
for large enough n, for all A ⊆ {1,...,n} such that |A| ≥ n/log log n, A has a 3-AP.
The proof used non-combinatorial methods.
For an online proof see either
Terry Tao's blog
or
Alex Iosevich's notes.

Szemeredi
proved ER1 for k=4 and then for general k.
The proof for general k used VDW's theorem and hence did not provide better bounds
for W(k,c).
Szemeredi's Regularity Lemma,
which he developed to prove it, has found
many, many applications.
His proof for k=4 was purely combinatorial. A scaled-down version of it for k=3
was presented in Graham-Rothschild-Spencer in about 2 pages.
For an online exposition of this see pages 121-130 of
Gasarch et al.
For an online exposition of the proof for the general case see this exposition by
Tao.

Furstenberg
obtained a different proof of ER1 using ergodic theory.
This proof was nonconstructive and hence yielded no bounds on the VDW numbers.
His technique was later used
by
Bergelson and Leibman
to prove the poly VDW theorem and
Poly HJ Theorem.
Later
Walters
found combinatorial proofs for both. See also the exposition by
Gasarch et al.

Gowers
proved ER1 in a way that led to much better bounds on the VDW numbers.

A purely combinatorial proof of the Density Hales-Jewett theorem was done by
the polymath group.
This is likely the easiest proof of ER1.

Green and Tao
showed that the set of primes contains arbitrarily long arithmetic progressions.
Aside from this, there seems to have been little progress on ER2.
Some people
think ER2 should replace the Poincaré conjecture on the list of Millennium Prizes.

That would seem to be the end of the story for now. ER1 was proven in a way that
led to better bounds on VDW numbers. But notice Roth's theorem: the density condition
that guarantees a 3-AP is better than the one in ER1.
There have been improvements on Roth's theorem.

Szemeredi
and Heath-Brown independently obtained the following result:
There exists a constant d such that,
for large enough n, for all A ⊆ [n], if |A| ≥ n/(log n)^d, then
A has a 3-AP.
There is a nice exposition by
Green.
(NOTE- if you know of an online source for the original papers let me know.)

Bourgain
obtained the following result:
for large enough n, for all A ⊆ [n], if |A| ≥ n((log log n)/log n)^(1/2), then A has a 3-AP.

OKAY- why is this important?
It breaks a barrier that seemed very stubborn and it also
leads to better bounds on W(3,c):

Let f be a function such that,
for large enough n, for all A ⊆ [n] with |A| ≥ f(n),
A has a 3-AP.
Assume [n] is c-colored.
Some color must appear n/c times.
If n/c ≥ f(n) then that color class has a 3-AP.
So if n/f(n) ≥ c then W(3,c) ≤ n.

Roth's result yields that W(3,c) ≤ 2^(2^(O(c))).
(Graham and Solymosi
obtained this result purely combinatorially.)

Szemeredi's and Heath-Brown's result yields W(3,c) ≤ 2^(c^(O(1))).

Bourgain's first result yields W(3,c) ≤ 2^(c^2 log c).

Bourgain's second result improves the exponent of c from 2 to 3/2-o(1).

Sanders's results improve the exponent of c from 3/2-o(1) to 4/3-o(1).

Sanders's NEW RESULT yields W(3,c) ≤ 2^(c(log c)^5).

The theorems above are of the form:
If A is big enough then A has a 3-AP. What about the other
side of the ledger: How large can 3-free sets be?
There has been an empirical study for small n by
Gasarch et al.
and an unpolished (not yet submitted) survey by
Gasarch et al.
(ADDED LATER- SOMEONE POSTED A WEBSITE WHERE YOU CAN FIND ACTUAL LARGE 3-FREE SETS.
Here it is.)
Behrend had the largest 3-free sets for about 50 years before
Elkin
obtained an improvement.
A shorter proof of Elkin's result was given by
Green and Wolf.
The current state of affairs on this was summarized in a nice blog entry by
Gil Kalai.
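
As a toy counterpoint to Behrend's construction, here is a Python sketch (my own code) of the greedy approach: scanning 1, 2, 3, ... and keeping every number that creates no 3-AP recovers the classical Erdos-Turan set of numbers whose base-3 representations use only digits 0 and 1, much smaller than Behrend's sets:

```python
def is_3_free(A):
    """True iff A contains no 3-term AP x < y < 2y - x."""
    s = set(A)
    srt = sorted(s)
    return not any(2 * y - x in s
                   for i, x in enumerate(srt) for y in srt[i + 1:])

def greedy_3_free(n):
    """Scan 1..n, adding x whenever it completes no 3-AP with earlier picks.
    Since x is always the largest element, it suffices to check, for each
    earlier pick b, whether 2b - x was also picked."""
    s, out = set(), []
    for x in range(1, n + 1):
        if not any(2 * b - x in s for b in out):
            s.add(x)
            out.append(x)
    return out
```

For example, greedy_3_free(27) returns [1, 2, 4, 5, 10, 11, 13, 14], which is 0, 1, 3, 4, 9, 10, 12, 13 (base-3 digits 0 and 1) shifted by one.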

Do these results help find lower bounds on W(3,c)?
YES!
Chandra-Furst-Lipton in
their paper on multiparty protocols (of all things!)
showed that if A ⊆ [n] and A is k-free then there is a coloring of [n] with (n log n)/|A| colors
without any k-APs. It's an easy probabilistic argument.
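
Here is that probabilistic argument in action, as a Python sketch (my own illustration, using a small 3-free set): color [n] by repeatedly giving a fresh color to the still-uncolored elements of a random translate of A. Each color class is a subset of a translate of A, so no class contains a 3-AP.

```python
import random
from collections import defaultdict

def cover_coloring(n, A, seed=1):
    """Color [1..n]: repeatedly pick a random translate A + t and give all
    still-uncolored elements of (A + t) intersected with [1..n] a fresh color.
    If A is 3-free, every color class (a subset of a translate of A) is 3-free."""
    rng = random.Random(seed)
    color, c = {}, 0
    lo, hi = 1 - max(A), n - min(A)  # translates that can touch [1..n]
    while len(color) < n:
        t = rng.randint(lo, hi)
        newly = [a + t for a in A if 1 <= a + t <= n and (a + t) not in color]
        if newly:
            for x in newly:
                color[x] = c
            c += 1
    return color

# a 3-free set (base-3 digits 0/1, shifted by one) inside [1..41]
A = [1, 2, 4, 5, 10, 11, 13, 14, 28, 29, 31, 32, 37, 38, 40, 41]
coloring = cover_coloring(41, A)
classes = defaultdict(set)
for x, c in coloring.items():
    classes[c].add(x)
# no color class contains a 3-AP
assert all(2 * y - x not in s
           for s in classes.values() for x in s for y in s if x < y)
```

In expectation this uses on the order of (n log n)/|A| colors, which is where the Chandra-Furst-Lipton bound comes from.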

SO- how far are we from showing ER2 in the case of k=3?
If Sanders's NEW result could be improved to |A| ≥ n/(log n)^(1+δ),
or even |A| ≥ n/(log n (log log n)^(1+δ)), or
any f(n) such that the series Σ f(n)/n^2 converges,
then ER2 for k=3 would be proven.
How far away is this result?
A few weeks ago I would have told you that getting NEXP not in ACC^0
was not likely for at least 20 years.
Ryan Williams
proved me WRONG!
Two different knowledgeable sources told me personally that the Erdos Distance Problem had reached an impasse
at n^0.864.
Guth and Katz
proved them WRONG and me WRONG for believing them.
Hence, for this one, I do not venture a guess.

What about k-AP's and k-free sets?

Gowers
has shown that there is a constant c such that
if |A| ≥ n/(log log n)^c then A has a 4-AP.
Gowers
also proved that there is a function c(k)
such that if
|A| ≥ n/(log log n)^(c(k)) then A has a k-AP.
In the paper cited he takes c(k) to be 2^(-2^(k+9));
however, I have heard this has been improved.

Green and Tao
have shown that there is a constant c such that if |A| ≥ n/(log n)^c then
A has a 4-AP.
Tao's website also indicates they have improved this and a paper is in
preparation.

Laba and Lacey
have a construction of k-free sets which, it turns out, was already known.

This post has 35 distinct links
(UPDATE- now it's 37)
and mentions four Fields medalists: Roth (1958), Bourgain (1994), Gowers (1998),
and Tao (2006). These are probably both complexityblog records.

For more information on this NEW result see also a nice blog posting by
Gil Kalai.