DHJ — the triangle-removal approach

The post, A combinatorial approach to density Hales-Jewett, is about one specific idea for coming up with a new proof for the density Hales-Jewett theorem in the case of an alphabet of size 3. (That is, one is looking for a combinatorial line in a dense subset of $[3]^n$.) In brief, the idea is to imitate a well-known proof that uses the so-called triangle-removal lemma to prove that a dense subset of $[n]^2$ contains three points of the form $(x,y)$, $(x+d,y)$ and $(x,y+d)$. Such configurations are sometimes called corners, and we have been referring to the theorem that a dense subset of $[n]^2$ contains a corner as the corners theorem.
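To make the corners configuration concrete, here is a small brute-force search (my own illustration, not from the post) that finds all corners $(x,y),(x+d,y),(x,y+d)$ with $d\geq 1$ in a given set of grid points:

```python
# Hypothetical illustration: brute-force search for "corners" in a subset of
# the n x n grid, i.e. triples (x, y), (x + d, y), (x, y + d) with d >= 1.
from itertools import product

def find_corners(points, n):
    """Return all corners contained in the given set of grid points."""
    pts = set(points)
    corners = []
    for (x, y), d in product(pts, range(1, n)):
        if (x + d, y) in pts and (x, y + d) in pts:
            corners.append(((x, y), (x + d, y), (x, y + d)))
    return corners

# A small example: this 5-point set contains exactly one corner.
pts = [(0, 0), (2, 0), (0, 2), (1, 3), (3, 1)]
print(find_corners(pts, 4))  # -> [((0, 0), (2, 0), (0, 2))]
```

The corners theorem says that once the density of the point set is bounded below and $n$ is large enough, such a search can never come back empty.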

The purpose of this post is to summarize those parts of the ensuing discussion that are most directly related to this initial proposal. Two other posts, one hosted by Terence Tao on his blog, and the other here, take up alternative approaches that emerged during the discussion. (Terry’s one is about more combinatorial approaches, and the one here will be about obstructions to uniformity and density-increment strategies.) I would be very happy to add to this summary if anyone thinks there are important omissions (which there could well be — I have written it fast and from my own perspective). If you think there are, then either comment about it or send me an email.

Two suggestions, made by Terry in comment number 4 (the second originating with his student Le Thai Hoang), led to two of the main threads of the discussion relating to the triangle-removal approach. One was that since Sperner’s theorem is the density Hales-Jewett theorem for alphabets of size 2, and since there is no obvious way of generalizing the existing proofs of Sperner’s theorem to obtain the density Hales-Jewett theorem, it would make sense to look for other proofs of Sperner’s theorem. The other was that the existing methods of tackling density theorems tend to prove not just that you get one configuration of a desired type but a positive proportion of the number of configurations in the whole set. One can usually obtain a seemingly stronger result of this type by an easy averaging argument (a prototype of which was proved by Varnavides in the 1950s), but in the case of the density Hales-Jewett theorem the stronger statement is false: one can easily come up with dense subsets of $[3]^n$ with far fewer than $4^n-3^n$ combinatorial lines (the number that you get in $[3]^n$ itself).
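As a quick sanity check (mine, not from the post): every combinatorial line in $[3]^n$ is given by a template in $\{1,2,3,*\}^n$ with at least one wildcard $*$, so the number of lines in the whole of $[3]^n$ is $4^n-3^n$, which small cases confirm:

```python
# Enumerate the combinatorial lines of [3]^n and check the count 4**n - 3**n.
from itertools import product

def combinatorial_lines(n):
    """Enumerate all combinatorial lines of [3]^n as triples of points."""
    lines = []
    for template in product((1, 2, 3, '*'), repeat=n):
        if '*' not in template:
            continue  # a template with no wildcard is a single point, not a line
        line = tuple(tuple(v if t == '*' else t for t in template)
                     for v in (1, 2, 3))
        lines.append(line)
    return lines

for n in (1, 2, 3):
    assert len(combinatorial_lines(n)) == 4**n - 3**n
print(len(combinatorial_lines(3)))  # -> 37
```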

The Varnavides thread.

This led to a search for a suitable Varnavides-like statement, and that search seems to have been successful — though we have not checked this carefully. (See comments 61-64 and 66.) The basic idea is to restrict attention to the set of all sequences $x$ such that the numbers of 1s, 2s and 3s in $x$ are all within $m$ of $n/3$, where $m$ is a large integer that is much smaller than $n$. An easy averaging argument shows that if you can prove that a dense subset of this restricted set contains a combinatorial line, then you have the density Hales-Jewett theorem. (It seems that one can take $m$ to be anything from a very large constant that depends on the density only, to an integer that is a small multiple of $\sqrt{n}$.) The reason this helps is that it avoids the problem one has in $[3]^n$ that the points in a random combinatorial line are very atypical: with high probability they contain about $n/2$ of one value and about $n/4$ of the other two values.
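The atypicality can be checked exactly (setup mine): pick a line template uniformly from $\{1,2,3,*\}^n$; in the point obtained by setting the wildcards to a value $v$, each coordinate equals $v$ with probability $1/2$ (it was fixed to $v$ or was a wildcard) and equals each other value with probability $1/4$, so the expected counts are $n/2$ and $n/4$:

```python
# Exact expected value-counts in the v=1 point of a uniformly random template.
from itertools import product
from fractions import Fraction

def expected_counts(n):
    """Expected number of 1s, 2s, 3s in the v=1 point of a random line."""
    total = [Fraction(0)] * 3
    templates = list(product((1, 2, 3, '*'), repeat=n))
    for template in templates:
        point = [1 if t == '*' else t for t in template]  # set wildcards to 1
        for v in (1, 2, 3):
            total[v - 1] += point.count(v)
    return [t / len(templates) for t in total]

print(expected_counts(4))  # -> [Fraction(2, 1), Fraction(1, 1), Fraction(1, 1)]
```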

This seems to have cleared up the Varnavides problem, so it is something to hold in reserve and use if we get to the stage where we have a definite approach to try out on the problem. For now, one can either just pretend that the Varnavides problem doesn’t exist and think about $[3]^n$, or one can simply bear in mind that one is thinking about sequences with roughly equal numbers of 1s, 2s and 3s and combinatorial lines with small variable sets, or sets of wildcards as some of us are calling them.

The Sperner thread.

Some definite progress has been made on trying to find alternative proofs of Sperner’s theorem. One approach was initiated in comments 21 and 22. The idea here was to try to find a statement that would have the same relationship to Sperner’s theorem as the hoped-for (but not yet even precisely formulated) triangle-removal lemma had to density Hales-Jewett. An argument was given in comments 21 and 22 that the statement, if it existed, should have the following form.

Let $\mathcal{A}$ and $\mathcal{B}$ be collections of subsets of $[n]$ such that there are very few pairs $(A,B)$ with $A\in\mathcal{A}$, $B\in\mathcal{B}$ and $A\cap B=\emptyset$. Then it is possible to remove a small number of sets from $\mathcal{A}$ and $\mathcal{B}$ and end up with no such pairs.

We have been calling statements like this pair-removal lemmas, even though we have not yet proved any useful ones. A statement like this implies a weak form of Sperner’s theorem as follows. Suppose you have a dense set system $\mathcal{A}$ that contains no pair of distinct sets $A,A'$ with $A\subset A'$. Then let $\mathcal{B}$ consist of all complements of sets in $\mathcal{A}$. If we can find $A\in\mathcal{A}$ and $B\in\mathcal{B}$ with $A\cap B=\emptyset$, then $A\subset B^c$. Since $B^c\in\mathcal{A}$, this is a contradiction unless $A=B^c$. Therefore, the number of pairs $(A,B)$ with $A\in\mathcal{A}$, $B\in\mathcal{B}$ and $A\cap B=\emptyset$ is very small. The pair-removal statement then implies that one can remove small subsets from $\mathcal{A}$ and $\mathcal{B}$ in order to end up with no such pairs. But that contradicts the fact that $\mathcal{A}$ is dense and for every $A\in\mathcal{A}$ we have the pair $(A,A^c)$.
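A toy check of this reduction (example data mine): if $\mathcal{A}$ is an antichain and $\mathcal{B}$ consists of the complements of its members, then the only disjoint pairs are the pairs $(A,A^c)$, so there are exactly $|\mathcal{A}|$ of them:

```python
# The middle layer of the cube on [5] is an antichain; pairing it with the
# family of complements produces exactly one disjoint pair per set.
from itertools import combinations

n = 5
universe = frozenset(range(n))
antichain = [frozenset(c) for c in combinations(range(n), 2)]  # all 2-subsets
complements = [universe - s for s in antichain]

disjoint_pairs = [(a, b) for a in antichain for b in complements if not (a & b)]
assert len(disjoint_pairs) == len(antichain)
assert all(b == universe - a for a, b in disjoint_pairs)
print(len(disjoint_pairs))  # -> 10
```

Here disjointness forces $B=A^c$ because $|A|+|B|=n$, which is the extreme case; for an antichain in a single layer of size $m<n/2$ the count is still small relative to the maximum possible.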

After some difficulty in formulating a precise pair-removal lemma that wasn’t either trivial or false, we have arrived at the following test problem. (It’s not actually the pair-removal lemma we’re going to want, but it is cleaner, looks both true and non-trivial, and seems to involve all the right difficulties.)

Tentative conjecture. Let $m=n/2-k$, where $k$ tends slowly to infinity with $n$. Let $\mathcal{A}$ and $\mathcal{B}$ be subsets of $\binom{[n]}{m}$ such that the number of pairs $(A,B)$ with $A\in\mathcal{A}$, $B\in\mathcal{B}$ and $A\cap B=\emptyset$ is small. Then one can remove a small number of sets from each of $\mathcal{A}$ and $\mathcal{B}$ and end up with no such pairs.

Some promising suggestions have been made for how to go about proving this conjecture. Jozsef (in comment 124) has drawn attention to a theorem of Bollobás that implies that if each $A\in\mathcal{A}$ is disjoint from precisely one $B\in\mathcal{B}$, then $\mathcal{A}$ and $\mathcal{B}$ must both be small. And Ryan (in comments 82, 145, 150 and 152) has drawn attention to existing papers on very closely related subjects. Very recently (at the time of writing it is in the last two or three hours) Jozsef has given us what appears to be a proof of a pair-removal lemma. Assuming there aren’t any irritating surprises, such as it turning out not to be quite the statement we really want, the obvious thing to do here seems to be to play around with Jozsef’s argument and see whether it can be generalized from 01-sequences to 012-sequences.

A very quick word about what to expect of such a generalization. It would be “completing the square”, where vertex 00 is the trivial pair-removal lemma, vertex 01 is the pair-removal lemma in Kneser graphs, vertex 10 is the normal triangle-removal lemma (as described in the background-information post), and we are looking for vertex 11. Since getting from 00 to 10 is by no means a triviality, getting from 10 to 11 is unlikely to be a triviality either. But now we are in a stronger position than we were initially, because to get to 11 there are two potential routes. If we are really lucky, then we’ve got all the ingredients we need and just have to decide how to mix them together.

Other approaches to Sperner.

The above approach to re-proving Sperner has a certain naturalness to it, at least in the context of the main problem, but it is not the only one to have been discussed. If generalizing it runs into difficulties, then we have other ideas to explore that have definitely not yet been explored as far as they could have. They begin with Ryan’s comment 57 (in which he was responding to a related comment of Boris). This introduces the idea of what Boris calls “pushing shadows around”, which is apparently what Sperner himself did.

Other ideas that are still to be explored.

If we are going to follow the triangle-removal proof of the corners theorem quite closely, then at some point we will need to find an analogue of Szemerédi’s regularity lemma for subgraphs of the graph where you join two sets if they are disjoint. There are some obvious problems with doing this (discussed in the initial post in comment V, and comments EE to LL), and a very speculative suggestion for how a proof might look in comment 110. There would also be a need for a counting lemma: an offer to search for such a lemma in a systematic way has not yet been taken up.

[Remark added 7/2/09. It’s important to understand that this post is how the situation seemed to be when it was written yesterday. If you read the comments you will see that some of the assertions that looked plausible (such as the precise form of the tentative conjecture) have now been discovered to be wrong.]


This entry was posted on February 6, 2009 at 10:03 am and is filed under polymath1.

71 Responses to “DHJ — the triangle-removal approach”

300. One of the main reasons for this comment is to establish the numbering. Comments on Terry’s post will be numbered 200-299 (with a convention that from now on if a post receives 100 comments then it must be summarized and restarted), comments on this post will be 300-399, and comments on the obstructions-to-uniformity post, when it is written, will be 400-499.

To get the ball rolling for this post, I’d like to write out Jozsef’s proof of pair removal in full detail, mainly for my own benefit so that I understand it properly, but also because if, as I hope, a major subproject for this thread will be to see if we can generalize Jozsef’s argument, it will be convenient to have that argument here for easy referral.

The first thing to note is that the result Jozsef proves is not quite as strong as the pair-removal lemma we originally asked for, because what he shows is not that you can remove a small number of sets and end up with no disjoint pairs, but rather that you can remove a small number of sets and thereby get rid of almost all the disjoint pairs you started with. This kind of weakening of a removal lemma is good enough for many applications. For instance, if we applied it to the corners problem we would find that we could remove a small number of edges and get rid of almost all the degenerate triangles, which is just as much of a contradiction as the usual one. (In fact, even if we could show that we could get rid of one percent of the degenerate triangles we would have a contradiction.)

The Kneser graph $K(n,m)$ is the graph whose vertices are all subsets of $[n]$ of size $m$, with two sets $A$ and $B$ joined if and only if they are disjoint. We should think of $m$ as being $n/2-k$ for some fairly small but nonconstant $k$, but for now I will avoid trying to specify them.

Let $\mathcal{A}$ and $\mathcal{B}$ be collections of sets of size $m$ (i.e., subsets of the vertex set of $K(n,m)$). We can form a bipartite graph by joining $A\in\mathcal{A}$ to $B\in\mathcal{B}$ if and only if $A\cap B=\emptyset$: this is not quite a bipartite subgraph of $K(n,m)$ (since $\mathcal{A}$ and $\mathcal{B}$ are not required to be disjoint), but it is an induced subgraph of what one might call the bipartite version of $K(n,m)$ (where each vertex set contains all $m$-element sets and you join two such sets if they are disjoint). From now on I will blur the distinction between this bipartite graph and $K(n,m)$ itself.
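A small sanity check (mine) of the edge count used in the next paragraph: the bipartite version of the Kneser graph on $m$-subsets of $[n]$ has $\binom{n}{m}\binom{n-m}{m}$ edges, since an edge is a set $A$ together with an $m$-subset $B$ of its complement:

```python
# Build the bipartite Kneser graph for small n, m and count its edges.
from itertools import combinations
from math import comb

def bipartite_kneser_edges(n, m):
    """All ordered pairs (A, B) of disjoint m-subsets of [n]."""
    vertices = [frozenset(c) for c in combinations(range(n), m)]
    return [(a, b) for a in vertices for b in vertices if not (a & b)]

n, m = 7, 3
edges = bipartite_kneser_edges(n, m)
assert len(edges) == comb(n, m) * comb(n - m, m)
print(len(edges))  # -> 140
```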

The maximum number of edges that could possibly join $\mathcal{A}$ to $\mathcal{B}$ is the number of edges in $K(n,m)$ (bipartite version), which is $\binom{n}{m}\binom{n-m}{m}$. So we shall assume that the total number of edges is at most $\epsilon\binom{n}{m}\binom{n-m}{m}$ for some very small constant $\epsilon$. Our aim is to remove at most $\delta\binom{n}{m}$ vertices and get rid of at least half these edges (or any other fixed fraction you might care to go for), where $\delta$ tends to zero as $\epsilon$ tends to zero.

What Jozsef ends up showing is that there is a very simple way of deciding which vertices to remove: you just get rid of all vertices of large degree. It turns out that if you do this then you either get rid of a significant fraction of all the edges, or you had almost no vertices to start with (in which case you can just remove all of them).

Now all the time we are hoping to remove a number of edges that is large as a fraction of the total number of edges, so “large degree” means “significantly larger than the average degree”. Let us suppose, then, that the average degree in our graph is $d$. (Here I am averaging over all vertices in $K(n,m)$, including those that are not in $\mathcal{A}$ or $\mathcal{B}$.) It follows from Markov’s inequality that for any $C$ the number of vertices of degree at least $Cd$ is at most $2\binom{n}{m}/C$. So we will need to choose $C$ so that $1/C$ tends to zero as $\epsilon$ tends to zero.

Suppose that removing these vertices doesn’t get rid of half the edges. That means that after the removal we have a bipartite induced subgraph of $K(n,m)$ such that the average degree is at least $d/2$ and the maximal degree is at most $Cd$. In the next comment I’ll explain what Jozsef does in this case.

What he does is to use what he calls “the permutation trick of Lubell” to get an upper bound on the number of vertices that there can be in a graph with average degree at least $d/2$ and maximal degree at most $Cd$. Take a random permutation $\pi$ of $[n]$. Given two disjoint sets $A$ and $B$, we shall say that $\pi$ separates $A$ and $B$ if, in the ordering given by $\pi$, every element of $A$ is less than every element of $B$ or vice versa. What is the expected number of separated pairs $(A,B)$ with $A\in\mathcal{A}$ and $B\in\mathcal{B}$?

Let’s suppose that the total number of vertices in the bipartite graph is $N$. Then the total number of disjoint pairs is at least $Nd/4$. Each disjoint pair has a probability of $2\binom{2m}{m}^{-1}$ of being separated (since there are $\binom{2m}{m}$ ways of ordering $A\cup B$ of which precisely two separate them), so the expected number of separated pairs is at least $Nd\binom{2m}{m}^{-1}/2$.
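The separation probability can be checked by brute force (setup mine): a uniform random permutation of $[n]$ separates two fixed disjoint $m$-sets with probability $2\binom{2m}{m}^{-1}$, because only the relative order of $A\cup B$ matters and exactly two of the $\binom{2m}{m}$ interleavings are separated:

```python
# Exhaustively compute the probability that a random permutation separates
# two fixed disjoint m-sets, and compare with 2 / C(2m, m).
from itertools import permutations
from fractions import Fraction
from math import comb

def separation_probability(n, A, B):
    A, B = set(A), set(B)
    hits = 0
    perms = list(permutations(range(n)))
    for perm in perms:
        pos = {x: i for i, x in enumerate(perm)}  # position of each element
        if max(pos[a] for a in A) < min(pos[b] for b in B):
            hits += 1  # all of A before all of B
        elif max(pos[b] for b in B) < min(pos[a] for a in A):
            hits += 1  # all of B before all of A
    return Fraction(hits, len(perms))

m = 2
p = separation_probability(6, {0, 1}, {2, 3})
assert p == Fraction(2, comb(2 * m, m))
print(p)  # -> 1/3
```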

In particular, there exists a permutation that gives rise to at least $Nd\binom{2m}{m}^{-1}/2$ separated pairs. But how many separated pairs can there be for a single permutation? Well, let’s suppose first that that permutation is the identity (without loss of generality) and that we have separated pairs $(A_i,B_i)$, ordered in such a way that every element of $A_i$ is less than every element of $B_i$. The crucial observation here is that if $A_1$ is a set with the smallest maximal element out of any of the $A_i$ (again without loss of generality), then $(A_1,B_i)$ is a separated pair for all $i$. Therefore, there can be at most $Cd$ distinct sets $B_i$ (since we know that $A_1$ has degree at most $Cd$). Similarly, there can be at most $Cd$ distinct $A_i$, so the total number of separated pairs is at most $(Cd)^2$.

It follows that $Nd\binom{2m}{m}^{-1}/2\leq (Cd)^2$, and hence that $N\leq 2C^2d\binom{2m}{m}$. And having got to this point, I realize that I don’t understand why the right-hand side is small. Does anyone see it, or do we wait till Jozsef wakes up? (I also have trouble understanding the significance of the final inequality in his comment 155.)

Maybe the point is that the right-hand side doesn’t have to be small if all we know about $d$ is that it’s a small multiple of the maximum possible degree, but if we assume that $d$ is much smaller (it’s possible that for applications to Sperner we may even be able to take $d=1$) and $m$ is reasonably large, then the right-hand side is much smaller than $\binom{n}{m}$. So this would fit in with my comment 153, where I suggested that using a stronger hypothesis might be helpful to prove a pair-removal lemma.

Here is a possible toy problem to think about, as an initial contribution to the project of generalizing Jozsef’s proof. This time we let $n=3m+k$. Our basic objects will be pairs of disjoint $m$-element sets. We are given three collections $\mathcal{U}$, $\mathcal{V}$ and $\mathcal{W}$ of such pairs. A triangle is a triple of the form $(U,V,W)$ with $(V,W)\in\mathcal{U}$, $(U,W)\in\mathcal{V}$ and $(U,V)\in\mathcal{W}$. We would like to prove that if for each pair $(V,W)$ there is at most one $U$ such that $(U,W)\in\mathcal{V}$ and $(U,V)\in\mathcal{W}$, and similarly for the other two ways round, then it is possible to remove a small number of elements from $\mathcal{U}$, $\mathcal{V}$ and $\mathcal{W}$ and end up removing a significant fraction of all the triangles.

There are no degenerate triangles here, so this wouldn’t do DHJ straight off, but I think it would be a good way of trying to understand whether Jozsef’s methods generalize. An alternative line of investigation would be to generalize Jozsef’s methods to a statement about several central layers of the cube rather than just one.

An optimistic guess about how it might go: we just routinely try to generalize Jozsef’s argument, and at a certain point a difficulty arises, which we turn out to be able to solve using the usual triangle-removal lemma.

At the very least, let’s try to do the routine generalization and see what happens. So without any serious thought about what is sensible and what is not, let us say that a permutation $\pi$ separates the three sets $U$, $V$ and $W$ if every element of $U$ comes before every element of $V$, which in turn comes before every element of $W$, or any one of the other five such statements you get by permuting the three sets. We’ll also say that $\pi$ separates a triangle if it separates the three sets used to build it.

Given any two disjoint sets $V$ and $W$, let us define the $\mathcal{U}$-degree of $(V,W)$ to be the number of $U$ such that $(V,W)\in\mathcal{U}$, $(U,W)\in\mathcal{V}$ and $(U,V)\in\mathcal{W}$. (In this case, we say that $U$, $V$ and $W$ form a triangle.) We have similar definitions for the other two possible degrees.

Now let’s suppose that the average degree (over all three bipartite graphs) is $d$ and the maximal degree is $Cd$. What can we say?

Let’s think about the expected number of separated triangles. The total number of triangles is $Nd$ (up to some absolute constant depending on whether one orders the vertices etc.), and the probability that $\pi$ separates any given triangle is $6\binom{3m}{m,m,m}^{-1}$. So the expected number is $6Nd\binom{3m}{m,m,m}^{-1}$. Therefore, there must be some permutation that separates at least this number of triangles.

Now let’s think about how many triangles a permutation can separate. Our assumption is that no pair of sets forms a triangle with more than one other set. It’s at this point that, if our wildest dreams come true, we find that normal dense triangle removal is exactly what we need. But no reason to suppose that just yet. I’m going to continue in a new comment.

Let’s suppose that $(U_i,V_i,W_i)$ are separated triangles. We would like an upper bound on their number. If we choose $U_1$ such that the maximal element of $U_1$ is minimized, we see that the number of distinct sets $W_i$ can be at most $Cd$. Similarly, the number of distinct $U_i$ can be at most $Cd$. But each pair is allowed at most one set in the middle, so we seem to have at most $(Cd)^2$ triangles separated by $\pi$. This is slightly worrying as it came out a bit too simply for my liking.

If it’s correct, it shows that $Nd\binom{3m}{m,m,m}^{-1}\leq (Cd)^2$ up to constants, and therefore that $N\leq C^2d\binom{3m}{m,m,m}$, which is exponential in $m$. So for very small $d$, such as we might expect to have in DHJ, we get a contradiction.

I’ve got to go now, so all I’ll say at this point is that this argument feels too simple to be correct, but I can’t see what’s wrong with it. (If it’s right, then the explanation must simply be that it is not what we need.)

Suppose $\mathcal{A}=\mathcal{B}$ are the family of sets not including the last element $n$. Then $\mathcal{A}$ and $\mathcal{B}$ have density about $1/2$ within $\binom{[n]}{m}$. (We’re thinking $m\approx n/2$, here, right?) It seems that the fraction of Kneser graph edges which are in $\mathcal{A}\times\mathcal{B}$ is about $(n-2m)/n$, which is “negligible”. So we should be able to delete any small constant fraction of vertices from $\mathcal{A}$ and make it an intersecting family. But $\mathcal{A}$ is a copy of the Kneser graph $K(n-1,m)$, and doesn’t the largest intersecting family inside such a Kneser graph have density only around $m/(n-1)$?

Ryan, all I can say is that I can’t see a flaw in your reasoning. If it’s correct, then it shows something rather interesting. I couldn’t get Jozsef’s argument to work unless I assumed that degrees were very small rather than just small, and your example suggests that it’s actually false when you merely assume that the degrees are a negligible fraction of the maximum possible. In your example, the degree of a vertex in $\mathcal{A}$ is still large, even if it’s proportionately small, so there’s still some hope that Jozsef’s argument is correct and adaptable to DHJ.

Now I must stare at the argument that I produced in 308 and try to work out what’s going on.

This is idle wondering, but given a random (not quasi-random) tripartite graph, do we know the probability P of n triangles appearing given a density D? This seems like the sort of thing that ought to be just a textbook example.

I’m going to try to do two things at once. The first is express myself more clearly when it comes to arguments such as the one in 308, and the second is to try to develop an argument that involves more than one slice, in the hope that degenerate triangles will come into play somehow.

Let me begin by trying to define the nicest possible triangular grid of slices. Write $\Gamma_{a,b,c}$ for the set of sequences with $a$ 1s, $b$ 2s and $c$ 3s. I’ll define the grid to be the union of all $\Gamma_{a,b,c}$ such that $(a,b,c)$ is a triple of integers that belongs to the convex hull of … actually, cancel this. It seems to be more natural, if one is going for the nicest possible and most symmetrical set of slices, to go for a hexagon. So let’s take all $(a,b,c)$ that sum to $n$ and that satisfy $|a-b|,|a-c|,|b-c|\leq m$. Here I’m imagining that $n$ is a multiple of 3, not that it’s all that important. (This is a hexagon because there are six linear inequalities, and the symmetry of the situation makes it a regular hexagon.) Let’s write $H$ for the union of all these $\Gamma_{a,b,c}$. We would like to prove that any dense subset $\mathcal{A}$ of $H$ contains a combinatorial line.

Now I’m going to do roughly the same thing as before. I’ll assume that there are no combinatorial lines in $\mathcal{A}$. For each sequence $x$ in $H$ and each $j\in\{1,2,3\}$, let’s define the $j$-set of $x$ to be $\{i:x_i=j\}$, and we’ll denote the 1-, 2- and 3-sets by $U$, $V$ and $W$. As ever, we define $\mathcal{U}$ to be the set of all pairs $(V,W)$ with $x\in\mathcal{A}$, $\mathcal{V}$ to be the set of all $(U,W)$ and $\mathcal{W}$ to be the set of all $(U,V)$ (again, always with $x\in\mathcal{A}$).

The assumption that there are no combinatorial lines tells us that there are no triples $(U,V,W)$ of disjoint sets with $(V,W)\in\mathcal{U}$, $(U,W)\in\mathcal{V}$ and $(U,V)\in\mathcal{W}$, except in the degenerate case where $U\cup V\cup W=[n]$. In the next comment I’ll try to hit this with Jozsef’s permutation argument.

So now let’s choose a random permutation $\pi$ and see what we can say about the number of separated triangles. To do this, my first task is to say how many actual triangles there are. And the answer is $|\mathcal{A}|$: one degenerate triangle for each element of $\mathcal{A}$ and no others.

Next, I’d like some idea of the probability that a random permutation separates a given triangle. Let’s suppose that the point of $\mathcal{A}$ that gives rise to the triangle belongs to $\Gamma_{a,b,c}$. Then the probability that $\pi$ separates it is $6\binom{n}{a,b,c}^{-1}$. So the expected number of separated triangles is at least $6|\mathcal{A}|\binom{n}{n/3,n/3,n/3}^{-1}$.

Now suppose I just choose an arbitrary permutation — without loss of generality the identity. How many separated triangles can there be? I think this may be where the problems arise, but let’s see. Suppose that $(U_i,V_i,W_i)$ are separated triangles. We know that for each $i$ …

I may come back to this, but I’ve now seen where I went wrong in 308, so I should think about that first, as it’s a simpler problem.

I started 308 with the claim that if you choose $U_1$ so as to minimize the maximum element of $U_1$, then you could see immediately that there were at most $Cd$ distinct $W_i$. The thinking behind this was that each triple $(U_1,V_i,W_i)$ would form a triangle.

But that was wrong: there is no reason for $(U_1,W_i)$ to belong to $\mathcal{V}$ or for $(U_1,V_i)$ to belong to $\mathcal{W}$. So it’s back to the drawing board here. (But this, I should stress, is good news as I wanted there to be a complication to give triangle-removal a chance to creep in.)

Here is a new proof strategy. First I’ll copy the first part of Jozsef’s argument to get to the stage where each vertex is contained in roughly the same number of triangles. (If I can’t, I’ll remove a few vertices and get rid of a significant fraction of the triangles.) Next, I’ll argue as above that the expected number of separated triangles is reasonably large. Then I’ll try to bound the number of separated triangles if no vertex is in too many triangles and no edge is in more than one triangle. The first step of that will be to pick … actually, I still don’t see it.

I was going to say that one should pick the $U_i$ with smallest maximum, but I don’t see what that gives us.

What we can say is this. For each $i$ there will be some system of sets such that $(U_i,V_i,W_i)$ forms a triangle. If in this set system we choose $U_1$ such that the maximum of $U_1$ is minimized, then we can deduce that there are not too many distinct $W_i$. Except that I don’t know why I even say this so confidently: what makes me think that $(U_1,W_i)$ should lie in $\mathcal{V}$? OK I’m floundering around a bit here so I’d better stop. But I’ll leave these messy thoughts here in case anyone can get a clearer idea of what ought to be done. It feels as though it’s in promising territory somehow.

This is the kind of reason I find it promising. The graph where you join two sets if and only if one is completely to the left of the other is, under normal circumstances, much denser than the graph where you just join two sets if they are disjoint. So what Jozsef’s permutation argument is cleverly achieving seems to be to get us from a sparse context to a dense context. (I don’t yet have a fully precise formulation of this remark.)

Now let’s suppose we have our separated triangles $(U_i,V_i,W_i)$ above. Suppose also that we have managed to show somehow that many of the pairs $(U_i,V_i)$ belong to $\mathcal{W}$, and so on. We know that there won’t be many triangles, so this will imply, by the usual triangle-removal lemma, that we can remove a small number of edges and end up with no triangles. Could there be an argument along these lines that allows us to prove density Hales-Jewett by hitting it permutation by permutation? We’re in that familiar state where there’s a small probability of being close to a solution and a large probability of having written a load of nonsense.

Bad news – good news. Good morning! I hope that I didn’t derail the project with my last calculation yesterday – that the “removal lemma” works when . The main inequality is correct, and we see that it tells us something non-trivial if $m$ is small compared to $n$. As I see now, the average degree, $d$, grows as . I think that this is the range where we have a removal lemma so far. This is a removal lemma we get by simple double-counting, so I’m afraid that we can’t expect too much out of it, but it supports Tim’s conjecture.

Hi Jozsef. Two remarks. It seems to me (but I’d be reassured if I knew you agreed about this) that we have a removal lemma if the average degree is very small. And in interesting contexts it is. For example, if $\mathcal{A}$ is a counterexample to Sperner’s theorem and we let $\mathcal{B}$ consist of all complements of sets in $\mathcal{A}$, then each set in $\mathcal{A}$ is joined just to its complement, so the average degree is 1.

The second remark is that it’s not impossible that a simple double counting gives us something good. The reason is that when going from 2 to 3 in the dense case you go from trivial to interesting, so when going from 2 to 3 here, perhaps you go from simple to very interesting.

Now I feel a bit of a dummy; I wanted to check where the calculation went wrong yesterday in my last post, but I can’t see it. Tim, in 304 you asked why the right-hand side is small. By checking, what I see – again – is that it should be around . I probably badly overlooked the same thing again …
I’ll go to have a morning coffee.

Ryan, you may be ahead of me here and already see why the suggestion I’ve been making is a bad one, but what about strengthening the hypothesis of the Tentative Conjecture from the number of disjoint pairs being small to each set belonging to at most one disjoint pair? (With this second hypothesis one tries to prove that the set systems must be small.)

Actually, isn’t that just Bollobás’s theorem? I think it is because if I take in 303 then I get the correct bound of . Is that the standard proof of Bollobás’s theorem?

Jason, if $A_1$ has the smallest maximal element and $(A_i,B_i)$ is separated, then the largest element of $A_1$ is at most the largest element of $A_i$, which is less than the smallest element of $B_i$. If this doesn’t explain it then I don’t understand what you’re not understanding.

A quick technical question: I think it should be reasonably straightforward to answer. Suppose we have a dense set $\mathcal{A}$ in $[3]^n$. Suppose now that we take a random permutation $\pi$ and just take from $\mathcal{A}$ the sequences that start with 1s, continue with 2s and finish with 3s (after you’ve used $\pi$ to scramble the coordinates). And now suppose you form the tripartite graph from just those sequences. That is, you let $\mathcal{U}$ consist of all pairs $(V,W)$ that arise as the 2-set and 3-set of some such sequence, and so on. We can think of $\mathcal{U}$ as a bipartite graph, whose vertices are subsets of $[n]$, where we join $V$ to $W$ if $V$ finishes before $W$ starts. What I’d like to know is whether this graph tends to be dense. I think the straightforward answer is no, so what I really mean is whether it can be made dense if you remove a smallish number of vertices and edges, or something like that. The motivation for this question is that I’d very much like to be able to define a dense tripartite graph with few triangles and see what happens when we apply the usual triangle-removal lemma.

Re: 319, yes there was a typo, I meant n/2 there, thanks. But I still don’t see if anything is bad with my calculation; Please help me, is it correct that if then d should be at least or we have a removal lemma? After this I will try to digest Tim’s triangle removal technique.

Hi Tim, in #321, if each set is involved in at most one Kneser edge, then I think the least number of sets you need to delete in order to eliminate all Kneser edges is within a factor of two of the number of sets (vertices), no?

Jozsef in 325, my calculation (which I checked and I’m pretty sure it works out to be the same as yours) suggests that you get something interesting as long as is large and is a lot smaller than . Since is about , it follows that we need to be small compared with in order to get a removal lemma. I think this agrees with you, since you are taking and . So it seems correct to you and it seems correct to me, which means that there’s an chance that it’s incorrect.

329. Thank you Tim. I think I should apologize that, after reading 303, I didn’t pay much attention to later posts, from which I could have learned that things were OK with the calculation. In the next post I will try to understand and extend your toy version of a triangle removal lemma. Unfortunately I now have to go to the university and probably won’t read the posts for the next 5-6 hours.

Jozsef, I should make it clear that I don’t yet have an interesting new triangle removal lemma proved, so if you prove anything at all it counts as an extension of what I’ve done. But I might think just a little bit further about the vague idea of 324.

Suppose then that we have a dense set $\mathcal{A}$ of sequences in $[3]^n$, and for simplicity let’s assume that they’re all reasonably well balanced (meaning that they all have roughly equal numbers of 1s, 2s and 3s). How many times do we expect a given set to occur as the 1-set of $x$ for some $x\in\mathcal{A}$? The very rough answer is $\delta 2^{2n/3}$, where $\delta$ is the density of $\mathcal{A}$. Let’s be even more rough and call that $2^{2n/3}$. Now $\mathcal{U}$ has size around .

I’m going to interrupt this comment because I’ve had a different idea that I want to pursue urgently.

The proof I know of Sperner goes like this. You take a random sequence of sets of the form $\emptyset=A_0\subset A_1\subset\dots\subset A_n=[n]$, where $A_i$ has cardinality $i$. If $\mathcal{A}$ is an antichain, then it intersects each such chain in at most one set.

Now let’s imagine we choose a set from this chain at random. What is the probability that it lies in $\mathcal{A}$? Let’s write $\delta_k$ for the density of $\mathcal{A}$ inside layer $k$ (that is, the number of sets in $\mathcal{A}$ of size $k$ divided by $\binom{n}{k}$). Then the expected size of the intersection of $\mathcal{A}$ with our random chain is $\sum_k\delta_k$. It follows that the $\delta_k$ add up to at most 1 (since at least one random chain will have at least the expected number of elements of $\mathcal{A}$ in it). Since the middle layer(s) is (are) biggest, it’s clear that we maximize the size of $\mathcal{A}$ by choosing a middle layer.
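This is the LYM inequality, and it can be checked numerically on small examples (the examples are mine): for any antichain the layer densities $\delta_k$ sum to at most 1, with equality for a full middle layer.

```python
# Verify the LYM inequality sum_k delta_k <= 1 for two small antichains in 2^[4].
from itertools import combinations
from fractions import Fraction
from math import comb

def lym_sum(antichain, n):
    """Sum over the antichain of 1 / C(n, |A|), i.e. the sum of layer densities."""
    return sum(Fraction(1, comb(n, len(a))) for a in antichain)

n = 4
middle_layer = [frozenset(c) for c in combinations(range(n), 2)]
assert lym_sum(middle_layer, n) == 1  # a full middle layer attains equality

# A mixed-layer antichain: {0}, {1,2}, {1,3}, {2,3}; its LYM sum is below 1.
mixed = [frozenset({0}), frozenset({1, 2}), frozenset({1, 3}), frozenset({2, 3})]
assert lym_sum(mixed, n) == Fraction(1, 4) + 3 * Fraction(1, 6)
print(lym_sum(mixed, n))  # -> 3/4
```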

The idea that’s just occurred to me is that we’ve just deduced Sperner from the one-dimensional corners result (which states that if you have a dense subset of $[n]$ then you must be able to find $x$ and $x+d$ in it with $d\neq 0$). What happens if we try to deduce density Hales-Jewett from 2-dimensional corners?

Well, first we need to decide what replaces a random chain. For reasons of symmetry I’d like to think of a random chain as follows: you pick a random permutation $\pi$ of $[n]$ and then once you’ve scrambled the set you take all partitions of $[n]$ into two sets $A$ and $B$ such that every element of $A$ is less than every element of $B$.

At this point I’m just going to guess that it might be reasonable to choose for DHJ purposes a random permutation of $[n]$ followed by all partitions $(A,B,C)$ such that every element of $A$ comes before every element of $B$, which comes before every element of $C$. This is at least a two-dimensional structure, since it’s determined by the end points of $A$ and $B$. I will also think of this as the set of all sequences in $[3]^n$ that start with some 1s, continue with some 2s, and end with some 3s.

Next question: if contains no combinatorial line, can we deduce that the intersection of with is small? What I’d like to do is prove that it has no corners, in some useful sense of the word “corners”.

How would we get a combinatorial line in this set? We can of course encode each element of as a triple where . And at this point we have a problem: contains no combinatorial lines.

Well, I suppose if that had worked it would have been discovered by now. But it suggests an amusing problem. Let be the hexagonal portion of the triangular grid (a 2-dimensional subset of ) defined by . Let us call a function a corner isomorphism if it is an injection such that every aligned equilateral triangle in maps to a combinatorial line. (An aligned equilateral triangle is a set of the form .) If we could find a corner isomorphism for some fairly large , then we could probably move the image about and prove density Hales-Jewett using an averaging argument. So can anyone either find a big corner isomorphism or (more likely) prove that no such function exists? I don’t expect this problem to be very hard.

My last contribution for the day: I’ve just checked and there isn’t even a corner isomorphism from into . The proof goes like this: you take an arithmetic progression of three points in a line and map them to , and . (I think I may also have wanted to insist that lines go to sets where the appropriate -set is fixed.) Those three points determine a and their images determine what the images of the other three points under a corner isomorphism would have to be, but those three images don’t lie in a combinatorial line. (For what it’s worth, they are , and .)
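Anyone wanting to repeat or extend this check by computer needs a test for whether three given points of [3]^n form a combinatorial line in a given order; here is a sketch (the function name is mine):

```python
def is_combinatorial_line(x, y, z):
    """Do x, y, z (tuples over {1, 2, 3}) form a combinatorial line, in
    that order?  The coordinates where the three disagree must form a
    nonempty wildcard set on which they read 1, 2, 3 respectively; on
    all other coordinates they agree."""
    wildcard = [i for i in range(len(x)) if not (x[i] == y[i] == z[i])]
    return bool(wildcard) and all((x[i], y[i], z[i]) == (1, 2, 3)
                                  for i in wildcard)

# A line with wildcard in position 1 and fixed coordinates 1, 2, 3:
assert is_combinatorial_line((1, 1, 2, 3), (1, 2, 2, 3), (1, 3, 2, 3))
# Three points that differ, but not along a wildcard set:
assert not is_combinatorial_line((1, 1, 1), (2, 2, 2), (3, 3, 1))
```

With this in hand, a brute-force search for corner isomorphisms from a small grid just checks every aligned equilateral triangle against every candidate image.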

To understand the notations and arguments below, one should read 305-307-308 first. Tim’s proof in 307-308 works nicely if we change the initial conditions. Our base set is now just a collection of m-element subsets (where n=3m+k) instead of , and we are now considering every triangle formed by any three pairwise disjoint elements of . For this set of triangles we do have the removal lemma, as Tim has calculated. Let me add two technical remarks which might be needed for the complete proof. A triangle is called “normal” if none of its edges is an edge of more than Cd other triangles. The argument then gives a bound on the number of normal triangles. If most of the triangles aren’t normal, then we have a removal lemma. The conclusion is that if we have a small number of triangles in compared to the size of , then a small fraction of the edges covers most of the triangles. My second small remark is that when we bound the normal triangles separated by a permutation, we consider the sets to the right of the leftmost (having the smallest last point) middle element and the sets to the left of the rightmost (having the largest first point) element. These are both bounded by Cd and they span at most separated triples, as Tim said. I’m just emphasizing the selection of the left side here; it is only a technical detail.
This theorem is more restricted than Tim’s original statement, but I think it might be useful. In DHJ the dense subset is given and we will work with the induced substructure.

Jozsef, I think I have the answer to your last question. Let’s define the Kneser product of two set systems U and V to be the set of all pairs (A,B) such that A ∈ U, B ∈ V, and A ∩ B = ∅. Then the theorem would be that every Kneser product that is dense in the set of all disjoint pairs (of which there are 3^n) contains a corner: that is, a configuration of the form (A,B), (A∪D,B), (A,B∪D), with all three of A, B and D disjoint.

I’ve stated it like that to make it as similar as I can to Sperner. But a more symmetrical version is this. Suppose that U, V and W are set systems such that there are at least δ3^n ways of partitioning [n] as A ∪ B ∪ C with A ∈ U etc. Then the corresponding collection of sequences contains a combinatorial line.

To see that this is a natural extension of Sperner, consider the following silly way of stating (weak) Sperner. If U and V are two set systems such that there are at least δ2^n ways of partitioning [n] as A ∪ B with A ∈ U and B ∈ V, then the corresponding collection of 01-sequences contains a combinatorial line.

It looks to me as though there should be a two-dimensional family of density Hales-Jewett statements. That is, for each pair (j,k) with j < k there should be a DHJ(j,k), where Sperner is DHJ(1,2), density Hales-Jewett in [k]^n is DHJ(k-1,k), and the theorem just discussed (for which it looks as though we may have a proof) is DHJ(1,3). The idea would be that for DHJ(j,k), whether or not a sequence belongs to the set depends on a j-uniform hypergraph with [n] as its vertex set.

We should see if there is any relation between this and Terry’s DHJ(2.5) problem. I suspect there isn’t, in which case it’s amusing to have come up with a different way of “interpolating” between Sperner and DHJ.

A small observation. One could regard the previous comment as showing that at least one natural generalization of Sperner to [3]^n turns out not to be the statement we want. That is unfortunate, though by no means catastrophic. In this comment I want to point out that there is a different way of generalizing Sperner that leads (I think) to a deeper theorem, though it still isn’t DHJ.

Let’s go back to comment 331 and try to generalize the proof of Sperner and see what comes out, not worrying if it isn’t DHJ. To do this, we associate with any triple the three sets , and . Let’s call this the triple . Now let’s suppose we have a corner, by which in this context I mean a triple of triples of the form , and (with ). What combinatorial configuration do we get from the corresponding triples , and ? The answer is that we partition into sets , , , and . Then the sequence corresponding to is 1 on and , 2 on and , and 3 on . The sequence corresponding to is 1 on , 2 on , and , and 3 on . Finally, the sequence corresponding to is 1 on , 2 on and , and 3 on and .

Thus, we have three fixed sets, namely , and , and two variable sets, namely and . If we identify each set to a point, we can think of the three sequences as and . If we forget about the fixed sets, the restrictions to the variable sets are going . Also, and this is important, the two variable sets have the same size. (If they didn’t, then the resulting theorem would be much easier.)

Given a dense set of triples that start with 1s, continue with 2s and end with 3s, then the corners theorem will tell you that you must have a configuration of the above kind. I haven’t checked this, but I’m about 99% certain that if you imitate the proof of Sperner by averaging over a random permutation, then you will be able to deduce that a dense subset of will contain a “peculiar combinatorial line” of the above form. (Just to be clear, a peculiar combinatorial line is a triple of sequences that are fixed outside two sets and that have the same size, and inside and they go .)

The reason I call this a deeper theorem is that it relies on the corners theorem. Indeed, it implies the corners theorem if you restrict to sets that are unions of slices. (This is because we insist that and have the same size.)

Again, this is a slightly depressing observation because it shows that a reasonably natural generalization of the Sperner proof gives us the wrong theorem. However, again it doesn’t actually tell us that we can’t find a different way of generalizing Sperner that gives the right theorem.

A tiny observation, following on from 332. The main difficulty seems to be as follows. Let’s define the order profile of a sequence to be what you get when you collapse each block of equal coordinates to just one coordinate. For example, the order profile of 112231 is 1231. In [2]^n you can have a combinatorial line where every element has the same order profile: e.g. 112 and 122. But in [3]^n that is impossible. (Proof: if the last fixed coordinate before a variable string starts is a and the first fixed coordinate after it finishes is b, then for the order profile to be preserved the variable string can only take the values a and b, whereas a combinatorial line needs all three values.)
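The order profile is simple to compute; a sketch in Python (the function name is mine):

```python
from itertools import groupby

def order_profile(seq):
    """Collapse each maximal block of equal coordinates to a single
    coordinate, e.g. (1, 1, 2, 2, 3, 1) -> (1, 2, 3, 1)."""
    return tuple(value for value, _block in groupby(seq))

# A combinatorial line in [2]^3 whose two points share an order profile:
assert order_profile((1, 1, 2)) == order_profile((1, 2, 2)) == (1, 2)
```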

I’ve vaguely wondered about having a two-dimensional ground set, so that it’s possible for three sets to meet at a point, but I haven’t managed to define a nice two-parameter family of sequences.

Actually, I can’t resist just slightly expanding on the last paragraph of 337. Let’s take not [3]^n but the space of all functions from [n]^2 to {1,2,3}, i.e. 3-colourings of an n-by-n grid. I’d like to regard such a function as “simple” if you’ve basically got three big simply connected patches, of 1s, 2s and 3s, and if the boundaries between these patches are not too wiggly. A combinatorial line of functions like this would have the desirable property that it wouldn’t be obvious from just one of the functions what the variable set was: in some sense the “two-dimensional order profile” would be the same for all three points.

One could now ask whether the simpler structure here would make DHJ easier to prove. If so, then one could average up in the Sperner fashion. But it doesn’t look easy to say exactly what this simpler structure is.

Let’s take as our basic object the kind of thing one talks about in Sperner’s proof of the Brouwer fixed point theorem (surely a coincidence though). Chop out a large hexagonal piece from a tiling of the plane by hexagons and colour two of its edges (next to each other) red, two green and two blue, and extend these colourings from the boundary into the inside of the hexagon until you’ve coloured it all, and do so in such a way that you have three simply connected regions that meet at some point. Colourings of this kind are the basic objects that we’ll consider. Of course, “red”, “green” and “blue” is another way of saying 1, 2 and 3.

The question is, if you have a dense set of such colourings, must you have a combinatorial line? A combinatorial line in this context will be like what I described except that the three regions meet not at a point but round the boundary of a fourth region, which is the set of variable coordinates.

At the moment I know pretty well nothing about this question. I don’t know whether there’s an easy counterexample, I don’t know how to think about the measure on all those colourings (which feels like a fairly nasty object of a kind that the late and much lamented Oded Schramm would have been able to tell us about), and I don’t know whether a positive answer follows from density Hales-Jewett. The one thing I feel fairly confident of is that this, if true, would imply DHJ. Oh, and one other thing is that I don’t have any feel for whether this is more or less equal to DHJ in difficulty, or whether there is a chance that it is easier.

Actually, there’s another thing that I’m confident of — in fact, certain of — which is that it is a natural generalization of the trivial fact that underlies Sperner. (That is the fact that if you have a dense set of 2-colourings of a line of points, then two of those 2-colourings will form a combinatorial line.)

This conjecture is probably not to be taken too seriously, but I quite like the fact that it doesn’t follow easily from DHJ.

First I must admit that I haven’t read much in this thread, so I don’t know if this has been said before. I got this idea while I tried to find some low values of .
In 331 Gowers gives a proof that if $\mathcal{A}$ is an antichain, then $\sum_{A\in\mathcal{A}}\frac{1}{\binom n{|A|}}\leq 1$. Using the same idea it is possible to show that if $\mathcal{A}$ is a family that intersects every maximal chain, then $\sum_{A\in\mathcal{A}}\frac{1}{\binom n{|A|}}\geq 1$. Notice that when the elements of $\mathcal{A}$ are only allowed to have one of two given sizes, these two theorems are equivalent. The latter theorem seems to be easier to generalize to the k=3 case, and it might be useful, at least in the 200-thread.

I’m reading 335, and I’m trying to understand the conjecture. First, there is a colouring theorem for subsets which says that one can find three disjoint sets in the same colour class such that their unions are also in that colour class. It is a special case of the Finite Union Theorem (Folkman’s Theorem), which is equivalent to the finite sums theorem. That theorem states that any finite colouring of the natural numbers contains arbitrarily large monochromatic IP sets. The density statement is clearly false without extra conditions on the dense subset. It is obvious that the finite union theorem implies the finite sum theorem; however, the other direction is difficult: one can use Van der Waerden for the reduction. Maybe we can also prove some density theorem for disjoint set unions using Szemerédi’s theorem. It is still not Tim’s cross-product conjecture, but it seems to be quite promising. Let me also mention that the other proof of the finite union theorem uses HJ! About the details: a possible reference is Graham, Rothschild, Spencer, “Ramsey Theory”. Now it’s family lunch-time, so I will be away for a while, but I’m quite excited about this possible “density finite union theorem” so I will come back a.s.a.p.

Jozsef, What I wrote in 335 was meant to be more or less the same as what you suggested in 333 — a very slight generalization perhaps, but you can make it the same if you look at the special case . And I think that, as you say, it can be proved by the argument of 308. Or are we talking about different things?

Boris, I hadn’t actually worked it out, but I think I can. Basically, DHJ(j,k) is density Hales-Jewett in [k]^n but with a restriction placed on the complexity of the dense set. Let me give the case j = k-1 first. This is equivalent to the normal density Hales-Jewett theorem, but I’ll state it slightly differently.

Actually, I think I may as well go for the whole thing right from the start.

Suppose that for each subset of of size you have a set of functions from to the power set of such that the images of all the points in are disjoint sets. Let’s write for a typical element of . I now define a set of (ordered) partitions of into sets by taking all partitions such that for every of size we have that belongs to . Thus, in a certain sense the set of partitions (which are of course in one-to-one correspondence with sequences in ) depends only on what collections of sets there are in the partition. Then DHJ(j,k) is the claim that if your dense set has this special form, then it must contain a combinatorial line.

My suggestion, which I think I could justify completely if pushed to do so, is that if then one can prove DHJ(j,k) by a fairly simple adaptation of the proof of Sperner as given by Jozsef. Sperner, by the way, is DHJ(1,2), and the problem we have been considering is DHJ(2,3). In general, I suspect that DHJ(j,k) is of roughly equal difficulty to DHJ(j,j+1). It might be interesting to see if we can deduce DHJ(j,k) from DHJ(j,j+1), though at the moment my feeling is that we can prove DHJ(1,k) not by deducing it from DHJ(1,2) but by imitating the proof of DHJ(1,2), which is not quite the same thing.

I will go back to the “density finite union theorem” soon, as I think it is relevant to our project, but first here is a general remark about the removal lemma. I just want to make it clear that in a certain range we have a real removal lemma. Suppose we are given three families of sets A, B, and C. If the number of disjoint triples (one element from each) is smaller than X (something depending on the sizes |A|, |B|, and |C|), then a small fraction of the disjoint pairs covers EVERY triangle. To see that, we just have to iterate the counting of “normal” triangles (post 333).

346. I think I misunderstand comment 344. Consider DHJ(k-1,k). We have sets each of which gives rise to a subset of which I call . Then the subset of that corresponds to that is the intersection . However, nothing stops us from making each large, and their intersection empty.

347. Boris, I wasn’t clear enough in what I meant. Of course, you are right in what you say, but the statement is that if the intersection is dense then you get a combinatorial line, and not the weaker assumption that each individual is dense. Of course, in the case you can consider just a set system , just as in Sperner you can consider just the set of points where your sequence is 0.

My previous remark was about a possible DFU theorem. Let us see the simplest version. What would be a necessary density condition to guarantee a pair of disjoint sets A and B such that A, B, and A ∪ B are all in this “dense” set? Considering the arithmetic analogue: if we have a set S with cn elements a such that 2a is also in S, then there is a nontrivial solution of x+y=z.

The first density condition would be that the number of disjoint pairs is at least . The second condition should be something related to sets having (2/3)n ± k elements. I’m still not sure what could be a meaningful notion of density here.

A possible statement could be the following. Given a family F of subsets of [n], let us suppose that the number of disjoint pairs in F is at least . We are going to define a graph on F. The vertices are the sets in F, and two are connected iff they are disjoint and their union is also in F. Then either one can remove vertices to destroy all edges, or there are edges, where c’ depends on c only. I’m not sure that the statement is true, but something like this should be true.
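For small ground sets the graph in this statement can be built by brute force, which may help in hunting for counterexamples; a sketch (the function name and the tiny example family are mine):

```python
from itertools import combinations

def union_graph_edges(family):
    """Vertices are the sets of the family; two sets are joined iff they
    are disjoint and their union also belongs to the family."""
    fam = set(family)
    ordered = sorted(fam, key=sorted)  # deterministic vertex order
    return [(A, B) for A, B in combinations(ordered, 2)
            if not (A & B) and (A | B) in fam]

F = [frozenset({0}), frozenset({1}), frozenset({0, 1}), frozenset({2})]
edges = union_graph_edges(F)  # only {0}-{1} qualifies, since {0,1} is in F
```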

I’ve got to go to bed, but as a last comment, let me ask whether 350 is intended as a set-theoretic analogue of Ben Green’s theorem that if you have a set with few solutions of x+y=z then you can remove a small number of elements and end up with no solutions of x+y=z. It would be very nice to be able to prove something like that (and a generalization to larger finite unions).

If one were going to try to find a counterexample, one would have a large collection of sets of size around , and one would partition them into disjoint pairs, and one would add to the collection the set of unions of these pairs, which would turn them into a matching in your graph. The question would then be whether one could do this without creating further edges.

Here’s an attempt to do so. First we choose a random collection of sets of size , big enough that its lower shadow is all of . Next, we choose all sets of size at most . Then the number of disjoint pairs is certainly at least . Also, the number of edges is at most . Next, it seems to me that it will be very hard to remove all edges: the fact that is random and that every vertex is contained in an edge (and we could make slightly bigger so that each vertex is in several edges) ought to mean that we can choose a large matching. How big is ? Well, each set in covers small sets, so by my calculations the number of edges is , which is a lot smaller than . Sorry if some of this turns out to be nonsense — I must sleep now.


Let me state here a conjecture. It might not be directly connected to our project; however, I’ve been looking for the right formulation for a while and I think that this is the right one. First we have to find the proper notion of density, since simple density wouldn’t guarantee the result we are looking for. The “weighted density” D(S) of a set S is given by the sum of the (normalized) numbers of pairwise disjoint k-tuples in S. The maximum number of pairwise disjoint k-tuples is , so D(S) is a number between 0 and 1. The conjecture is that for any number m, if S contains no m pairwise disjoint elements together with all their possible nonempty unions, then it is sparse: D(S) goes to 0 as n goes to infinity.

In other words, if S is dense (and large enough) then there are m pairwise disjoint sets such that, for any nonempty index set, the corresponding union also lies in S.

Dear Jozsef, I’ve edited your last comment so that the formulae show up, and made a couple of other tiny changes (such as changing to ). Please check that I haven’t introduced any errors.

As it stands, I don’t understand your conjecture. A typical disjoint k-tuple has all its sets of size around n/k, so it seems to me that the set of all sets of size around n/k is dense in your sense. And there is of course no hope of getting finite unions in this set. Am I misunderstanding something?


Tim, there is a 1/(n-1) multiplier, so I thought that one layer (k) will contribute a small fraction only. In particular, I think that rich layers form arithmetic progressions and it might be useful in a possible proof.

361. I think Jozsef’s conjecture is true, but not for an interesting reason. I claim that for every fixed positive integer the condition implies that . Suppose not, let . Now let us generate the partition of into parts as follows. Let be the partition of into random parts. If with , then is positive, and is independent of . Had the parts been independent, it would have implied that is exponentially small. Clearly, they are not independent, but they are nearly independent. More precisely, let be the event that . Then . Hence, sampling random sets by choosing each element with probability is virtually indistinguishable from choosing a random -tuple of disjoint sets and then looking at the first of them. I believe I can write the appropriate string of inequalities to show this, if asked.

Since density of is nearly on each the existence of any given configuration follows by simple averaging.

Thanks Boris! It seems that this is still not the right formulation for a DFU conjecture. Actually, I should have noticed that a random selection of sets with probability 1/2 already gives density zero. So, I’m still looking for the right density definition. Boris, do you have any suggestion?

I had to struggle a bit to understand what DHJ(j,k) meant – I had not followed this thread for a while – but now that I finally get it, I can see how it fits nicely with the whole hypergraph regularity/hypergraph removal story, and so it does look quite promising. Anyway, I thought I’d try to restate it here in case anyone else also wants to know about it…

In [3]^n, we can define the notion of a “(basic) complexity 1 set” to be a set of strings whose 1-set lies in some fixed class , whose 2-set lies in some fixed class , and whose 3-set lies in some fixed class . A simple example here is ; in this case is the set of subsets of [n] of size a, and so forth. As I just recently understood over at the obstructions to uniformity thread, basic complexity 1 sets are obstructions to uniformity, and so we definitely need to figure out how to find lines in these sets.
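Concretely, the definition amounts to a membership test. In this sketch the classes U1, U2, U3 and all the names are my own notation for the fixed classes in Terry’s definition:

```python
def j_set(x, j):
    """The j-set of a string x over the alphabet {1, 2, 3}: the set of
    positions where x takes the value j."""
    return frozenset(i for i, v in enumerate(x) if v == j)

def in_basic_complexity_one(x, U1, U2, U3):
    """Membership in a basic complexity 1 set: the 1-set, 2-set and
    3-set of x must lie in the fixed classes U1, U2, U3 respectively."""
    return j_set(x, 1) in U1 and j_set(x, 2) in U2 and j_set(x, 3) in U3

# Example: strings of length 3 with exactly one 1, one 2 and one 3.
n = 3
U = [frozenset({i}) for i in range(n)]
assert in_basic_complexity_one((1, 2, 3), U, U, U)
assert not in_basic_complexity_one((1, 1, 3), U, U, U)
```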

In analogy with hypergraph regularity (and also with ergodic theory), the next more complicated type of set might be “non-basic complexity 1 sets” – sets which are unions of a bounded number of basic complexity 1 sets. (In graph theory, basic complexity 1 corresponds to a complete bipartite weighted graph between two cells with some constant weight, and non-basic complexity 1 corresponds to a Szemeredi-type weighted graph between a bounded number of cells, with any pair of cells having a constant weight on the edges connecting them.) But it seems fairly obvious that if dense basic complexity 1 sets have lines, then so do dense non-basic complexity 1 sets (though one may have to be careful with quantifiers, and use the usual Szemeredi double induction to organise the epsilons properly).

Then we have the basic complexity 2 sets in , which are those sets A in which the 12-profile of A (i.e. the pair consisting of the 1-set and 2-set of A; one can also identify this with an element of ) lies in some fixed class , the 23-profile of A lies in some fixed class , and the 31-profile of A lies in some fixed class . But of course when k=3, any one of these complexity 2 profiles determines the entire set A, so every set A is complexity 2, which is why DHJ(2,3) is the same as DHJ(3). [Incidentally, to answer a previous question, this hierarchy seems to be unrelated to my DHJ(2.5) – I was simplifying the lines rather than the sets.]

If we knew that sets A which were uniformly distributed wrt complexity 1 sets were quasirandom (a “counting lemma”), then we would be well on our way to reducing DHJ(2,3) to DHJ(1,3) by the usual regularity type arguments. (I’m sure Tim and the others here already know this, this is just me trying to get up to speed.)

There is now a definite convergence of this thread and the obstructions-to-uniformity thread, which perhaps illustrates the advantages of not splitting the discussion up. At any rate, this last comment of Terry’s is closely related to comment 411, and also to earlier comments on the 400-499 thread that attempt to establish some sort of counting lemma. I now get the feeling that the problem has been softened up, and that it is not ridiculous to hope that we might actually solve it. But it could be that certain difficulties that feel technical will turn out to be fundamental.

Unfortunately, I’ve got two big teaching days coming up, so my input is going to go right down for a while. But perhaps that’s in the spirit of polymath — we get involved, then we go off and do other things, then we come back and catch up with what’s going on, but all the while polymath continues churning away at the problem.

Metacomment: Ideally, I suppose we should be working in an environment in which comments from different threads can somehow overlap, coalesce, or otherwise interlink with each other. But, failing that, I think this division into subthreads has been the least worst solution (e.g. the thread about small values of n seems to have taken off once it was separated from the “mainstream” threads of the project). And revisiting the threads every 100 comments or so seems to be just about the right tempo, I think.

This is another comment that could go either here or on the obstructions-to-uniformity thread, but I think it is slightly more appropriate here. Terry, that’s a good point about the 100-comments-per-post rule, and if the 300s and 400s threads really do seem to be converging, then they can merge into the 500s thread.

My comment is that I have a variant of DHJ that ought to be equivalent to DHJ itself but has the potential to tidy up a lot of the annoying technicalities to do with slices being of different sizes. (I don’t know for certain that it will work, however.) It’s motivated by the proof of Sperner that I gave in 331. There one shows the stronger result that if the slice densities add up to more than 1 then there is a “combinatorial line”. What should the analogue be for DHJ?

My contention is the following: define the slice density of a set to be the average of its densities inside the individual slices. Then I claim that if the slice density is at least δ (for fixed positive δ and sufficiently large n), then the set contains a combinatorial line. This conjecture bears the same relation to corners that the strong form of Sperner bears to the trivial fact that if you have at least two points in [n] then you get a configuration of the form (x, x+d) with d > 0. Given that the averaging argument for Sperner naturally proves this stronger result, I think there is reason to hope that the numbers will all work out miraculously better if we go for this revised version of DHJ. I hope that way we can avoid boring details about slices being near the middle.
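The slice densities are easy to compute for small n. A sketch of the quantity I take the claim to involve, assuming the equal-slices average is the intended definition (the function name and the tiny example set are mine):

```python
from itertools import product
from collections import defaultdict

def slice_densities(A, n):
    """For each slice (a, b, c) with a + b + c = n (strings with a 1s,
    b 2s and c 3s), the fraction of that slice lying in A."""
    hits, sizes = defaultdict(int), defaultdict(int)
    for x in product((1, 2, 3), repeat=n):
        key = (x.count(1), x.count(2), x.count(3))
        sizes[key] += 1
        hits[key] += x in A
    return {k: hits[k] / sizes[k] for k in sizes}

n = 3
A = {x for x in product((1, 2, 3), repeat=n) if x.count(1) >= 2}
d = slice_densities(A, n)
equal_slices_density = sum(d.values()) / len(d)  # average over all C(n+2, 2) slices
```

Note how this weights tiny extreme slices (such as the all-1s string) as heavily as the big middle ones, which is exactly the point of the reweighting.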

I haven’t begun to check whether this miracle really does happen, but I think it’s definitely worth investigating.

Another way of looking at 365. All we’re doing is choosing a random point of [3]^n not with the uniform measure but by first choosing a slice and then choosing a random point in that slice. It seems to me that this should have huge advantages. For instance, and most notably, the distribution of a random point used to be radically different if you conditioned on it belonging to a typical combinatorial line. Now it’s different, but not that different. (With the new measure the situation is on a par with the situation for corners, where the difference between the distributions of a random point and a random vertex of a random corner is not a serious problem.) But I think there may be all sorts of arguments that will now work out exactly. Perhaps even the quasirandomness calculations I was trying to do will become clean and nice. No time to check though!

Hi everyone. While the Hales-Jewett theorem may be a little beyond my current level of mathematics, I did put something together that I believe the mathematicians that solved this problem would have interest in.

I created a fully playable 12 dimensional tic-tac-toe iPhone game.
It is available for free download on the app store.

Using the basic framework of the game, I do believe a computer-assisted proof would be possible. I do know that computer proofs are generally considered less elegant in the world of higher mathematics, but if anyone is interested, I would be willing to share my source code and collaborate on such a proof.