
Well, I finally finished my ICM paper. It’s only 30 pp, but it took many sleepless nights to write and maybe about 10 years to understand what exactly I want to say. The published version will be a bit shorter – I had to cut section 4 to satisfy their page limit.

Basically, I give a survey of various recent and not-so-recent results in Enumerative Combinatorics around three major questions:

(1) What is a formula?

(2) What is a good bijection?

(3) What is a combinatorial interpretation?

Not that I answer these questions, but rather explain how one could answer them from a computational complexity point of view. I tried to cover as much ground as I could without overwhelming the reader. Clearly, I had to make a lot of choices, and a great deal of beautiful mathematics had to be omitted, sometimes in favor of the Computational Combinatorics approach. Also, much of the survey surely reflects my own POV on the subject. I sincerely apologize to everyone I slighted and who disagrees with my opinion! Hope you still enjoy the reading.

Let me mention that I will wait for a bit before posting the paper on the arXiv. I very much welcome all comments and suggestions! Post them here or email privately.

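Recall the Fibonacci numbers $F_n$ given by 1,1,2,3,5,8,13,21… (indexed from $n = 0$). There is no need to define them. You all know. Now take the Euler numbers (OEIS) $E_n$: 1,1,1,2,5,16,61,272… This is the number of alternating permutations in $S_n$, with the exponential generating function $\sum_{n \ge 0} E_n \frac{t^n}{n!} = \tan t + \sec t$. Both sequences are incredibly famous. Less well known are the connections between them.

(1) Define the Fibonacci polytope $\Phi_n$ to be the convex hull of 0/1 points in $\mathbb{R}^n$ with no two 1s in a row. Then $\Phi_n$ has $F_{n+1}$ vertices and $\mathrm{vol}\, \Phi_n = E_n/n!$. This is a nice exercise.

(2) $F_n \cdot E_n \ge n!$ (by just a little). For example, $F_4 \cdot E_4 = 5 \cdot 5 = 25 > 24 = 4!$. This follows from the fact that

$$E_n \sim \frac{4}{\pi} \left(\frac{2}{\pi}\right)^n n!$$

and $F_n \sim \frac{1}{\sqrt{5}}\, \phi^{n+1}$, where $\phi = (1+\sqrt{5})/2$ is the golden ratio. Thus, the product $F_n \cdot E_n \sim c\, n! \left(\frac{2\phi}{\pi}\right)^n$ for a constant $c > 0$. Since $\pi \approx 3.14$ and $2\phi \approx 3.24$, the inequality is easy to see, but it is still a bit surprising that the numbers are so close.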
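The inequality in (2) is easy to check numerically. Below is a minimal sketch, assuming the indexing $F_0 = F_1 = 1$ for the Fibonacci numbers and computing the Euler numbers with the standard Seidel-Entringer triangle; the function names are illustrative and do not come from the note.

```python
from math import factorial


def fibonacci(n):
    """Return F_n with F_0 = F_1 = 1, i.e. the sequence 1, 1, 2, 3, 5, 8, ..."""
    a, b = 1, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def euler_numbers(n_max):
    """Return [E_0, ..., E_n_max], the numbers of alternating permutations.

    Uses the Seidel-Entringer recurrence
    E(n, k) = E(n, k - 1) + E(n - 1, n - k), with E(n, 0) = 0 for n >= 1,
    and E_n = E(n, n).
    """
    row = [1]
    result = [1]
    for n in range(1, n_max + 1):
        new_row = [0]
        for k in range(n):
            new_row.append(new_row[-1] + row[n - 1 - k])
        row = new_row
        result.append(row[-1])
    return result


def ratios(n_max):
    """Return the ratios F_n * E_n / n! for n = 0, ..., n_max."""
    euler = euler_numbers(n_max)
    return [fibonacci(n) * euler[n] / factorial(n) for n in range(n_max + 1)]


if __name__ == "__main__":
    for n, r in enumerate(ratios(12)):
        print(n, round(r, 4))
```

The printed ratios stay at or just above 1 for small $n$ and then creep upward slowly, which is the "by just a little" in the statement.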

Together with Greta Panova and Alejandro Morales we wrote a little note “Why is π < 2φ?” which gives a combinatorial proof of (2) via a direct surjection. Thus we obtain an indirect proof of the inequality in the title. The note is not a research article; rather, it is aimed at a general audience of college students. We will not be posting it on the arXiv, so I figure this blog is a good place to advertise it.

The note also explains that the inequality (2) follows from Sidorenko’s theorem on complementary posets. Let me briefly mention a connection between (1) and (2) which is not mentioned in the note. I will assume you just spent 5 min and read the note at this point. Following Stanley, the volume of is equal to the volume of the chain polytope (=stable set polytope), see Two Poset Polytopes. But the latter is exactly the polytope that Bollobás, Brightwell and Sidorenko used in their proof of the upper bound via polar duality.

What did we do?

Let $F \subset S_k$ be a finite set of permutations and let $C_n(F)$ denote the number of permutations $\sigma \in S_n$ avoiding the set of patterns $F$. The Noonan-Zeilberger conjecture (1996) states that the sequence $\{C_n(F)\}$ is always P-recursive. We disprove this conjecture. Roughly, we show that every Turing machine $T$ can be simulated by a set of patterns $F$, so that the number $a_n$ of paths of length $n$ accepted by $T$ is equal to $C_n(F)$ mod 2. I am oversimplifying things quite a bit, but that’s the gist.
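To make the definition concrete, here is a brute-force sketch that counts permutations in $S_n$ avoiding a given set of patterns. It is only an illustration of the definition (exponential in $n$ and usable for tiny $n$ only), not the method of the paper, and the function names are made up for this example.

```python
from itertools import combinations, permutations


def contains(perm, pattern):
    """Return True if perm has a subsequence order-isomorphic to pattern."""
    k = len(pattern)
    for positions in combinations(range(len(perm)), k):
        values = [perm[i] for i in positions]
        # Order-isomorphic means every pair compares the same way.
        if all((values[i] < values[j]) == (pattern[i] < pattern[j])
               for i in range(k) for j in range(i + 1, k)):
            return True
    return False


def count_avoiders(n, patterns):
    """Count permutations of {1, ..., n} that avoid every pattern in the set."""
    return sum(
        1
        for perm in permutations(range(1, n + 1))
        if not any(contains(perm, p) for p in patterns)
    )


if __name__ == "__main__":
    # A single pattern of length 3 gives the Catalan numbers.
    print([count_avoiders(n, [(1, 2, 3)]) for n in range(1, 7)])
    # The notorious pattern 1324.
    print([count_avoiders(n, [(1, 3, 2, 4)]) for n in range(1, 7)])
```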

What is left is to show how to construct a machine $T$ such that $\{a_n\}$ is not equal (mod 2) to any P-recursive sequence. We have done this in our previous paper, where we give a negative answer to a question by Kontsevich. There, we constructed a set of 19 generators of GL(4,Z), such that the sequence of return probabilities is not P-recursive.

When all things are put together, we obtain a set $F$ of about 30,000 permutations in $S_{80}$ for which $\{C_n(F)\}$ is non-P-recursive. Yes, the construction is huge, but so what? What’s a few thousand permutations between friends? In fact, perhaps a single pattern (1324) is already non-P-recursive. Let me explain the reasoning behind what we did and why our result is much stronger than it might seem.

Why we did what we did

First, a very brief history of the NZ-conjecture (see Kitaev’s book for a comprehensive history of the subject and vast references). Traditionally, pattern avoidance dealt with exact and asymptotic counting of pattern avoiding permutations for small sets of patterns. The subject was initiated by MacMahon (1915) and Knuth (1968) who showed that we get Catalan numbers for patterns of length 3. The resulting combinatorics is often so beautiful or at least plentiful, it’s hard to imagine how it can not be, thus the NZ-conjecture. It was clearly very strong, but resisted all challenges until now. Wilf reports that Richard Stanley disbelieved it (Richard confirmed this to me recently as well), but hundreds of papers seemed to confirm its validity in numerous special cases.

Curiously, the case of the (1324) pattern proved difficult early on. It remains unresolved whether $C_n(1324)$ is P-recursive or not. This pattern broke Doron Zeilberger’s belief in the conjecture, and he proclaimed that it’s probably non-P-recursive and thus the NZ-conjecture is probably false. When I visited Doron last September he told me he no longer has a strong belief in either direction and encouraged me to work on the problem. I took a train back to Manhattan looking over New Jersey’s famously scenic Amtrak route. Somewhere near Pulaski Skyway I called Scott to say he should drop everything and that we should start working on this problem.

You see, when it comes to pattern avoidance, things move from best to good to bad to awful. When they are bad, they are so bad, it can be really hard to prove that they are bad. But why bother – we can try to figure out something awful. The set of patterns that we constructed in our paper is so awful, that proving it is awful ain’t so bad.

Why is our result much stronger than it seems?

That’s because the proof extends to other results. Essentially, we are saying that everything bad you can do with Turing machines, you can do with pattern avoidance (mod 2). For example, why is (1324) so hard to analyze? That’s because it’s even hard to compute both theoretically and experimentally – the existing algorithms are recursive and exponential in $n$. Until our work, the existing hope for disproving the NZ-conjecture hinged on finding an appropriately bad set of patterns such that computing $\{C_n(F)\}$ is easy. Something like this sequence which has a nice recurrence, but is provably non-P-recursive. Maybe. But in our paper, we can do worse, a lot worse…

We can make a finite set of patterns $F$, such that computing $\{C_n(F) \bmod 2\}$ is “provably” non-polynomial (Th 1.4). Well, we use quotes because of the complexity theory assumptions we must have. The conclusion is much stronger than non-P-recursiveness, since every P-recursive sequence has a trivial polynomial-in-$n$ algorithm computing it. But wait, it gets worse!
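To see why P-recursive sequences are easy to compute, recall that such a sequence satisfies a linear recurrence with polynomial coefficients, so each new term costs a bounded number of arithmetic operations. Here is a minimal sketch using the Catalan numbers, which satisfy the standard recurrence $(n+2)\,c_{n+1} = (4n+2)\,c_n$; the function name is illustrative.

```python
def catalan_by_recurrence(n_max):
    """Return [c_0, ..., c_n_max] using (n + 2) * c_{n+1} = (4n + 2) * c_n.

    Each step is one multiplication and one exact division, so the whole
    list takes a number of arithmetic operations linear in n_max.
    """
    values = [1]
    for n in range(n_max):
        # The division is exact because the product equals (n + 2) * c_{n+1}.
        values.append(values[-1] * (4 * n + 2) // (n + 2))
    return values


if __name__ == "__main__":
    catalan = catalan_by_recurrence(10)
    print(catalan)
    # Parities are then immediate as well.
    print([c % 2 for c in catalan])
```

Compare this with brute-force enumeration of pattern-avoiding permutations, which takes time exponential in $n$.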

We prove that for two sets of patterns $F$ and $G$, the problem “$C_n(F) = C_n(G)$ mod 2 for all $n$” is undecidable (Th 1.3). This is already a disaster, which takes time to sink in. But then it gets even worse! Take a look at our Corollary 8.1. It says that there are two sets of patterns $F$ and $G$, such that you can never prove nor disprove that $C_n(F) = C_n(G)$ mod 2. Now that’s what I call truly awful.

What gives?

Well, the original intuition behind the NZ-conjecture was clearly wrong. Many nice examples are not good enough evidence. But the conjecture was so plausible! Where did the intuition fail? Well, I went to re-read Polya’s classic “Mathematics and Plausible Reasoning”, and it all seemed reasonable. That is, both Polya’s arguments and the NZ-conjecture (if you don’t feel like reading the whole book, at least read Barry Mazur’s interesting and short followup).

Now think about Polya’s arguments from the point of view of complexity and computability theory. Again, it sounds very “plausible” that large enough sets of patterns behave badly. Why wouldn’t they? Well, it’s complicated. Consider this example. If someone asks you if every 3-connected planar cubic graph has a Hamiltonian cycle, this sounds plausible (this is Tait’s conjecture). All small examples confirm this. Planar cubic graphs do have a very special structure. But if you think about the fact that even for planar graphs, Hamiltonicity is NP-complete, it doesn’t sound plausible anymore. The fact that Tutte found a counterexample is no longer surprising. In fact, the decision problem was recently proved to be NP-complete in this case. But then again, if you require 4-connectivity, then every planar graph has a Hamiltonian cycle. Confused enough?

Back to the patterns. Same story here. When you look at many small cases, everything is P-recursive (or yet to be determined). But compare this with Jacob Fox’s theorem that for a random single pattern $\pi$, the sequence $\{C_n(\pi)\}$ grows much faster than originally expected (cf. Arratia’s Conjecture). This suggests that small examples are not representative of the complexity of the problem. Time to think about disproving ALL conjectures based on that evidence.

If there is a moral in this story, it’s that what’s “plausible” is really hard to judge. The more you know, the better you get. Pay attention to small crumbs of evidence. And think negative!

What’s wrong with being negative?

Well, conjectures tend to be optimistic – they are wishful thinking by definition. Who would want to conjecture that for some large enough a,b,c and n, there exists a solution of $a^n + b^n = c^n$? However, being so positive has a drawback – sometimes you get things badly wrong. In fact, even polynomial Diophantine equations can be as complicated as one wishes. Unfortunately, there is a strong bias in Mathematics against counterexamples. For example, only two of the Clay Millennium Problems automatically pay $1 million for a counterexample. That’s a pity. I understand why they do this, just disagree with the reasoning. If anything, we should encourage thinking in the direction where there is not enough research, not in the direction where people are already super motivated to resolve the problem.

In general, it is always a good idea to keep an open mind. Forget all this “power of positive thinking”, it’s not for math. If you think a conjecture might be false, ignore everybody and just go for a disproof. Even if it’s one of those famous unsolved conjectures in mathematics. If you don’t end up disproving the conjecture, you might have a bit of trouble publishing computational evidence. There are some journals that do that, but not that many. Hopefully, this will change soon…

Happy ending

When we were working on our paper, I wrote to Doron Zeilberger to ask whether he had ever offered a reward for the NZ-conjecture, and whether it was for a disproof or for a proof only. He replied with an unusual award, for the proof and disproof in equal measure. When we finished the paper I emailed it to Doron. And he paid. Nice… 🙂

Say, you have written a paper. You want to submit it to a journal. But in what field? More often than not, the precise field/area designation for this paper is easy to determine, or at least it is easy to place the paper into some large category. Even if the paper is in between fields, this is often a well regarded and understood situation, nothing wrong with that. Say, the paper is resolving a problem in field X with tools from field Y. Submit to X-journal unless the application is routine and the crux of the innovation is in refining the tools. Then submit to Y-journal.

However, when it comes to CS, things are often less clear. This is in part because of the novelty of the subject, and in part due to the situation in CS theory, which is in constant flux and search for direction (a short Wikipedia article is rather vague and unhelpful, even more so than these generic WP articles tend to be).

The point of this post is to introduce/describe the area of “Computational Combinatorics”. Although Google returns 20K hits for this term (including experts, courses, textbooks), the meaning is either obscure or misleading. We want to clarify what we mean, critique everyone else, and stake a claim to the term!

1) What I want computational combinatorics to mean is “theoretical CS aspects of combinatorics” (and to a lesser extent “practical..”), which is essentially part of combinatorics but the tools and statements use computer science terminology (for a concise description of complexity aspects, see the dated but excellent survey by David Shmoys and Eva Tardos). I will give a recent example below, but basically if you want to prove a negative result in combinatorics (as in “one should not expect a nice formula for the number of 3-colorings or perfect matchings of a general graph”), then CS language (and basic tools) is the way to go. When people use “computational combinatorics” to mean “basic results in combinatorics that are useful for further studies of computer science”, they are being misleading. A proper name for such a course is “Introduction to Combinatorics” or “Combinatorics for Computer Scientists”, etc.

Tileability of a simply connected region in the plane with two types of rectangles can be decided in polynomial time.

First, we show that when the number of rectangles is sufficiently large (originally about $10^6$, later somewhat decreased), one should not expect such a result. Formally, we prove that tileability is NP-hard in this case. We then show that in 3-dim the topology of the region gives no advantage. Among other results, we prove that tileability of contractible regions with 2x2x1 slabs is NP-complete, and counting 2x1x1 domino tilings of contractible regions is #P-complete.

Now, the CS Theory point of view on these types of results changed drastically over time. Roughly, 30 years ago they were mainstream. About 20 years ago they were still of interest, but no longer important. Nowadays they are marginal at best – the field has moved on. My point is that the results are of interest in Combinatorics and Combinatorics only. Indeed, it has long been observed that applying combinatorial group theory to tilings (as done by Thurston, Rémila, etc.) is more of an art than a science. Although we believe that already for three general rectangles in the plane the problem is intractable, proving such a result is exceedingly difficult. Our various results solve weak versions of this problem.

4) So, why name a field at all, given the mess we have? That’s mostly because we really want to incorporate the CS aspects of combinatorics as a legitimate branch of mathematics. Theory CS is already over the top combinatorial (check out the number of people who believe that P=?NP will be resolved with combinatorics), but when a problem arises in combinatorics from within, this part of combinatorics needs a name to call home. I propose using the term computational combinatorics, in line with computational group theory, computational geometry, computational topology, etc., as a part of the loosely defined computational mathematics. I feel that the adjective “computational” is broad and flexible enough to incorporate both theoretical/complexity aspects as well as some experimental work, and combinatorial software development (as in WZ theory), compared to other adjectives, such as “algorithmic”, “computable”, “effective”, “computer-sciency”, etc. So, please, AMS, next time you revise your MSC, consider adding “Computational Combinatorics” as 05Fxx.

P.S. A well known petition asks for graph theory to have its own MSC code (specifically, 07), due to the heavy imbalance in the number of graph theory vs. the rest of combinatorics papers. Without venturing an opinion, let me mention that perhaps, adding a top level “computational combinatorics” subfield of combinatorics will remedy this as well – surely some papers will migrate there from graph theory. Just a thought…

One can argue whether some proofs are from the book, while others maybe not. Some such proofs are short but non-elementary, others are elementary but slightly tedious, yet others are short but mysterious, etc. (see here for these examples). BTW, can one result have two or more “proofs from the book”?

However, very occasionally you come across a proof that is short, elementary and completely straightforward. One would call such a proof trivial, if not for the fact that it’s brand new. I propose a new term for these – let’s call them lost proofs, loosely defined as proofs which should have been discovered decades or centuries ago, but evaded this fate for whatever accidental historical circumstances (as in lost world, get it?). And when you find such a proof you sort of can’t believe it. Really? This is true? This is new? Really? Really?!?

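The number of integer sequences $(a_1, \ldots, a_n)$ such that $1 \le a_1 \le 2$, and $1 \le a_{i+1} \le 2a_i$ for $1 \le i < n$, is equal to the total number of partitions of integers $N \in \{0, 1, \ldots, 2^n - 1\}$ into parts $1, 2, 4, \ldots, 2^{n-1}$.

For example, for $n = 2$ the first set, of sequences, is $\{(1,1), (1,2), (2,1), (2,2), (2,3), (2,4)\}$, while the second, of partitions, is $\{\varnothing,\ 1,\ 2,\ 1+1,\ 2+1,\ 1+1+1\}$, both with six elements.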

This result was discovered about 20-25 years too soon. In 1879-1882, while at Johns Hopkins University, Sylvester pioneered what he called a “constructive partition theory”, and had he seen his good friend’s older paper, he probably would have thought about finding a bijective proof. Apparently, he didn’t. In all fairness to everybody involved, Cayley had written over 900 papers.

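We are now ready for the “lost proof”, which is really a lost bijection. It’s given by just one formula, due to Matjaž Konvalinka and me:

$$\varphi : (a_1, a_2, \ldots, a_n) \mapsto (2 - a_1,\ 2a_1 - a_2,\ \ldots,\ 2a_{n-1} - a_n),$$

where the $i$-th entry on the right is the number of parts equal to $2^{n-i}$ in the partition.

For example, for $n = 2$ we get the following bijection:

$$(1,1) \mapsto 2+1, \quad (1,2) \mapsto 2, \quad (2,1) \mapsto 1+1+1, \quad (2,2) \mapsto 1+1, \quad (2,3) \mapsto 1, \quad (2,4) \mapsto \varnothing.$$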

Of course, once the bijection is found, the proof of Cayley’s theorem is completely straightforward. Also, once you have such an affine formula, many extensions become trivial.
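A brute-force check of the theorem and of an affine bijection is easy to write. The sketch below assumes the sequences $(a_1, \ldots, a_n)$ satisfy $1 \le a_1 \le 2$ and $1 \le a_{i+1} \le 2a_i$, the partitions are of integers at most $2^n - 1$ into parts $1, 2, \ldots, 2^{n-1}$, and the map records $2a_{i-1} - a_i$ (with $a_0 = 1$) as the multiplicity of the part $2^{n-i}$. The explicit form of the map and all function names are assumptions of this sketch rather than quotes from the paper.

```python
def cayley_compositions(n):
    """All sequences (a_1..a_n) with 1 <= a_1 <= 2 and 1 <= a_{i+1} <= 2*a_i."""
    seqs = [()]
    for _ in range(n):
        seqs = [
            s + (a,)
            for s in seqs
            for a in range(1, 2 * (s[-1] if s else 1) + 1)
        ]
    return seqs


def binary_partitions(n):
    """Multiplicity vectors (m_1..m_n) for parts 2^(n-1), ..., 2, 1.

    A vector is kept when the partition it encodes sums to at most 2^n - 1.
    """
    parts = [2 ** (n - i) for i in range(1, n + 1)]
    limit = 2 ** n - 1
    result = []

    def extend(prefix, total):
        i = len(prefix)
        if i == n:
            result.append(tuple(prefix))
            return
        m = 0
        while total + m * parts[i] <= limit:
            extend(prefix + [m], total + m * parts[i])
            m += 1

    extend([], 0)
    return result


def affine_map(seq):
    """Send (a_1..a_n) to multiplicities (2 - a_1, 2a_1 - a_2, ...)."""
    prev = 1  # plays the role of a_0 = 1
    out = []
    for a in seq:
        out.append(2 * prev - a)
        prev = a
    return tuple(out)


if __name__ == "__main__":
    for n in range(1, 6):
        images = sorted(affine_map(s) for s in cayley_compositions(n))
        print(n, len(images), images == sorted(binary_partitions(n)))
```

Since the map is affine with a triangular integer matrix, it is automatically injective; the check confirms that its image is exactly the set of partitions for small $n$.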

One wonders how one can come up with such a bijection. The answer is: simply compute it assuming there is an affine map. It tends to be unique. Also, we have done this before (for convex partitions and LR-coefficients). There is a reason why this bijection is so similar to Sylvie Corteel’s “brilliant human-generated one-line proof” in the words of Doron Zeilberger. So it’s just amazing that this simple proof has been “lost” for over 150 years, until now…

See our paper (Konvalinka and Pak, “Cayley compositions, partitions, polytopes, and geometric bijections”, 2012) for applications of this “lost bijection” and my survey (Pak, “Partition Bijections, a Survey”, 2006) for more on partition bijections.