254A, Lecture 4: Multiple recurrence

In the previous lecture, we established single recurrence properties for both open sets and for sequences inside a topological dynamical system . In this lecture, we generalise these results to multiple recurrence. More precisely, we shall show

Theorem 1. (Multiple recurrence in open covers) Let be a topological dynamical system, and let be an open cover of X. Then there exists such that for every , we have for infinitely many r.

Note that this theorem includes Theorem 1 from the previous lecture as the special case . This theorem is also equivalent to the following well-known combinatorial result:

Theorem 2. (van der Waerden’s theorem) Suppose the integers are finitely coloured. Then one of the colour classes contains arbitrarily long arithmetic progressions.

Exercise 1. Show that Theorem 1 and Theorem 2 are equivalent.

Exercise 2. Show that Theorem 2 fails if “arbitrarily long” is replaced by “infinitely long”. Deduce that a similar strengthening of Theorem 1 also fails.

Exercise 3. Use Theorem 2 to deduce a finitary version: given any positive integers m and k, there exists an integer N such that whenever is coloured into m colour classes, one of the colour classes contains an arithmetic progression of length k. (Hint: use a “compactness and contradiction” argument, as in my article on hard and soft analysis.)

We also have a stronger version of Theorem 1:

Theorem 3. (Multiple Birkhoff recurrence theorem) Let be a topological dynamical system. Then for any there exists a point and a sequence of integers such that as for all .

These results already have some application to equidistribution of explicit sequences. Here is a simple example (which is also a consequence of Weyl’s equidistribution theorem):

Corollary 1. Let be a real number. Then there exists a sequence of integers such that as .

Proof. Consider the skew shift system with . By Theorem 3, there exists and a sequence such that and both convege to . If we then use the easily verified identity

(1)

we obtain the claim.

Exercise 4. Use Theorem 1 or Theorem 2 in place of Theorem 3 to give an alternate derivation of Corollary 1.

As in the previous lecture, we will give both a traditional topological proof and an ultrafilter-based proof of Theorem 1 and Theorem 3; the reader is invited to see how the various proofs are ultimately equivalent to each other.

— Topological proof of van der Waerden —

We begin by giving a topological proof of Theorem 1, due to Furstenberg and Weiss, which is secretly a translation of van der Waerden’s original “colour focusing” combinatorial proof of Theorem 2 into the dynamical setting.

To prove Theorem 1, it suffices to show the following slightly weaker statement:

Theorem 4. Let be a topological dynamical system, and let be an open cover of X. Then for every there exists an open set which contains an arithmetic progression for some and .

To see how Theorem 4 implies Theorem 1, first observe from compactness that we can take the open cover to be a finite cover. Then by the infinite pigeonhole principle, it suffices to establish Theorem 1 for each separately. For each such k, Theorem 4 gives a single arithmetic progression inside one of the . By replacing the system with the product system for some large N and replacing the open cover of X with the open cover of , one can make the spacing r in the arithmetic progression larger than any specified integer N. Thus by another application of the infinite pigeonhole principle, one of the contains arithmetic progressions with arbitrarily large step r, and the claim follows.

Now we need to prove Theorem 4. By Lemma 1 of Lecture 2 to establish this theorem for minimal dynamical systems. We will need to note that for minimal systems, Theorem 4 automatically implies the following stronger-looking statement:

Theorem 5. Let be a minimal topological dynamical system, let U be a non-empty open set in X, and let . Then U contains an arithmetic progression for some and .

Indeed, the deduction of Theorem 5 from Theorem 4 is immediate from the following useful fact (cf. Lemma 1 from Lecture 3):

Lemma 1. Let be a minimal topological dynamical system, and let U be a non-empty open set in X. Then X can be covered by a finite number of translates of U.

Proof. The set is a proper closed invariant subset of X, which must therefore be empty since X is minimal. The claim then follows from the compactness of X. .

(Of course, the claim is highly false for non-minimal systems; consider for instance the case when T is the identity. More generally, if X is non-minimal, consider an open set U which is the complement of a proper subsystem of X.)

Now we need to prove Theorem 4. We do this by induction on k. The case k=1 is trivial, so suppose and the claim has already been shown for k-1. By the above discussion, we see that Theorem 5 is also true for k-1.

Now fix a minimal system and an open cover , which we can take to be finite. We need to show that one of the contains an arithmetic progression of length k.

To do this, we first need an auxiliary construction.

Lemma 2. (Construction of colour focusing sequence) Let the notation and assumptions be as above. Then for any there exists a sequence of points in X, a sequence of sets in the open cover (not necessarily distinct), and a sequence of positive integers such that for all and .

Proof. We induct on J. The case J=0 is trivial. Now suppose inductively that , and that we have already constructed , , and with the required properties. Now let V be a suitably small neighbourhood of (depending on all the above data) to be chosen later. By Theorem 5 for k-1, V contains an arithmetic progression of length k-1. If one sets , and lets be an arbitrary set in the open cover containing , then we observe that

(2)

for all and . If V is a sufficiently small neighbourhood of , we thus see (from the continuity of the that we verify all the required properties needed to close the induction.

We apply the above lemma with J equal to the number of sets in the open cover. By the pigeonhole principle, we can thus find such that . If we then set and we obtain Theorem 4 as required.

It is instructive to compare the k=2 case of the above arguments with the proof of Theorem 1 from the previous lecture. (For a comparison of this type of proof with the more classical combinatorial proof, see my Montreal lecture notes.)

— Ultrafilter proof of van der Waerden —

We now give a translation of the above proof into the language of ultrafilters (or more precisely, the language of Stone-Čech compactifications). This language may look a little strange, but it will be convenient when we study more general colouring theorems in the next lecture. As before, we will prove Theorem 4 instead of Theorem 1 (thus we only need to find one progression, rather than infinitely many). The key proposition is

Proposition 1. (Ultrafilter version of van der Waerden) Let be a minimal element of . Then for any there exists such that

. (3)

Suppose for the moment that this proposition is true. Applying it with some minimal element p of (which must exist, thanks to Exercise 10 of the previous lecture), we obtain obeying (3). If we let for some arbitrary , we thus obtain

. (4)

If we let be an element of the open cover that contains x, we thus see that for all and all which lie in a sufficiently small neighbourhood of q. Since a LCH space is always dense in its Stone-Čech compactification, the space of all (n,r) with this property is non-empty, and Theorem 4 follows.

Proof of Proposition 1. We induct on k. The case k=1 is trivial (one could take e.g. , so suppose and that the claim has already been proven for k-1. Then we can find such that

(5)

for all .

Now consider the expression

(6)

for any and , where

. (7)

Applying (5) to the limit in (6), we obtain the recursion for all . Iterating this, we conclude that

(8)

for all . For i=0, (8) need not hold, but instead we have the easily verified identity

. (8′)

Now let be a non-principal ultrafilter and define . Observe from (6) that all the lie in the closed set , and so p’ does also. Since p is minimal, there must exist such that p = p” + p’. Expanding this out using (8) or (8′), we conclude that

(9)

for all . Applying (6), we conclude

(10)

where and . Now, define to be the limit

(11)

then we obtain Proposition 1 as desired.

Exercise 5. Strengthen Proposition 1 by adding the additional conclusion . Using this stronger version, deduce Theorem 1 directly without using the trick of multiplying X with a cyclic shift system that was used to deduce Theorem 1 from Theorem 4.

Theorem 1 can be generalised to multiple commuting shifts:

Theorem 6. (Multiple recurrence in open covers) Let be a compact topological space, and let be commuting homeomorphisms. Let be an open cover of X. Then there exists such that for infinitely many r.

Exercise 6. By adapting one of the above arguments, prove Theorem 6.

Exercise 7. Use Theorem 6 to establish the following the multidimensional van der Waerden theorem (due to Gallai): if a lattice is finitely coloured, and , then one of the colour classes contains a pattern of the form for some and some non-zero r.

Exercise 8. Show that Theorem 6 can fail, even for and , if the shift maps are not assumed to commute. (Hint: First show that in the free group on two generators , and any word and non-zero integer r, the three words cannot all begin with the same generator after reduction. This can be used to disprove a non-commutative multidimensional van der Waerden theorem, which can turn be used to disprove a non-commutative version of Theorem 6.)

— Proof of multiple Birkhoff —

We now use van der Waerden’s theorem and an additional Baire category argument to deduce Theorem 3 from Theorem 1. The key new ingredient is

Lemma 3. (Semicontinuous functions are usually continuous) Let be a metric space, and let be semicontinuous. Then the set of points x where F is discontinuous is a set of the first category (i.e. a countable union of nowhere dense sets). In particular, by the Baire category theorem, if X is complete and non-empty, then F is continuous at at least one point.

Proof. Without loss of generality we can take F to be upper semicontinuous. Suppose F is discontinuous at some point x. Then, by upper continuity, there exists a rational number q such that

. (3)

In other words, x lies in the boundary of the closed set . But boundaries of closed sets are always nowhere dense, and the claim follows.

Now we prove Theorem 3. Without loss of generality we can take X to be minimal. Let us place a metric d on the space X. Define the function by the formula

. (4)

It will suffice to show that F(x)=0 for at least one x (notice that if the infimum is actually attained at zero for some n, then x is a periodic point and the claim is obvious). Suppose for contradiction that F is always positive. Observe that F is upper semicontinuous, and so by Lemma 3 there exists a point of continuity of F. In particular there exists a non-empty open set U such that F is bounded away from zero.

By uniform continuity of , we see that if F is bounded away from zero on U, it is also bounded away from zero on for any n (though the bound from below depends on n). Applying Lemma 1, we conclude that F is bounded away from zero on all of X, thus there exists such that for all . But this contradicts Theorem 1 (or Theorem 4), using the balls of radius as the open cover. This contradiction completes the proof of Theorem 3.

Exercise 9. Generalise Theorem 3 to the case in which T is merely assumed to be continuous, rather than be a homeomorphism. (Hint: let denote the space of all sequences with for all n, with the topology induced from the product space . Use a limiting argument to show that is non-empty. Then turn into a topological dynamical system and apply Theorem 3.)

Exercise 12. Let be a minimal topological dynamical system, and let . Call a point x in X k-fold recurrent if there exists a sequence such that for all . Show that the set of k-fold recurrent points in X is residual (i.e. the complement is of the first category). In particular, the set of k-fold recurrent points is dense.

Exercise 13. In the boolean Bernoulli system , show that the set A consisting of all non-zero integers which are divisible by 2 an even number of times is almost periodic. Conclude that there exists a minimal topological dynamical system such that not every point in X is 3-fold recurrent (in the sense of the previous exercise). (Compare this with the arguments in the previous lecture, which imply that every point in X is 2-fold recurrent.)

Exercise 14. Suppose that a sequence of continuous functions on a metric space converges pointwise everywhere to another function . Show that f is continuous on a residual set.

Exercise 15. Let be a minimal topological dynamical system, and let be a function which is T-invariant, thus Tf = f. Show that if f is continuous at even one point , then it has to be constant. (Hint: is in the orbit closure of every point in X.)

How easy is it to make these results about dynamical systems quantitative? For example, in theorem 4 can you give an explicit bound for r in terms of
k and and some invariants of the dynamical system and open sets? Presumably any such bounds would have to be rather large, given that its not easy to find reasonable bounds in van der Waerden’s theorem.

You might be interested in one of Philipp Gerhardy’s papers (click on my name for a link to the PDF), in which he applies proof-theoretic methods to topological dynamics. Girard extracted bounds for van der Waerden numbers from the Furstenberg/Weiss proof of Multiple Birkhoff which are identical to van der Waerden’s original bounds. Gerhardy proves an effective version of Multiple Birkhoff, again via an analysis of Furstenberg/Weiss, and again obtains the same bounds on vdW numbers.

But Gerhardy indicates that the reason behind this is that the topological proof and the combinatorial proof are really the same, just in different languages. He suggests the possibility of carrying out work to obtain a topological proof that, rather than mirroring van der Waerden’s original, corresponds to Shelah’s combinatorial proof, and which would enjoy the same improved bounds on van der Waerden numbers.

One can indeed make the argument quantitative, yielding the same Ackermann-type bounds as in the classical finitary proof of van der Waerden, as in the papers of Gerhardy and Girard mentioned by Ed above.

To answer your specific question, to get Theorem 4 for minimal systems with a designated metric d, the argument above gives quite concrete bounds in terms of (a) the uniform continuity modulus of the shift T, and (b) a quantitative bound for Lemma 1, in particular how many translates are needed for a ball of a given radius to cover all of the space. The only real difficulty is in _reducing_ to the minimal case, using the lemma from the previous lecture: the argument I gave there uses Zorn’s lemma and is thus not immediately finitisable. But one can develop substitutes for that lemma which do have quantitative (albeit Ackermann-type) bounds. It’s easiest to describe this in terms of the systems that are of immediate interest to van der Waerden’s theorem, namely orbit closures of a single word . In this setting, c generates a minimal system if and only if every block in c appears syndetically, thus for every L there exists F(L) such that every block of length L that appears at least once in c, will in fact appear in every F(L)-block of c. The key issue with respect to creating quantitative bounds is to obtain some good bound on F (this is the counterpart of (b) above in this setting). This is not possible if we insist on c being infinite, but we can get somewhere by truncating c to an extremely long but finite length (in particular, much larger than any given function of F(L)). Indeed, by using the finite convergence principle as discussed in my earlier post

one can obtain a quantitative version of Lemma 1 from the previous lecture: given any L and any function G, there exists an N such that in any word c of length N there exists an M and a subword of c of length G(M) such that every block of length L in that subword in fact appears in every block of length M in the subword.

(One should think of the parameters being arranged in the order L << M << G(M) << N. The quantity M is playing the role of F(L) here.)

Basically, N is going to equal an iterated power of G applied to L, where the number of iterations is equal to the number of possible blocks of length L (i.e. ). The intuition here is that if an L-block fails to appear syndetically, one can pass to a long subword in which that block fails to appear at all. Iterating this at most times, one eventually reaches a long subword in which all surviving L-blocks appear syndetically.

(Actually, one needs to apply the above lemma for k different values of L in order to get van der Waerden, because minimality is invoked k times in the proof of Theorem 4 once one unpacks the induction. This is what leads to the Ackermann-type behaviour in k. One has to take some care in arranging the quantifiers (i.e. deciding what parameters are large with respect to what other parameters); this is the price we pay for not using the axiom of infinity, which conveniently offers sets (namely, infinite sets) that are guaranteed to be larger than every finite set one could hope to generate by any conceivable future means, thus removing a lot of the need for epsilon management or quantifier management.)

It’s a nice observation that van der Waerden can be deduced without introducing a metric. In preparing my course I’ve been following Furstenberg’s book on this point, and he proceeds by proving Birkhoff Multiple Recurrence, only then pausing to derive combinatorial consequences. One only needs BMR in the case that is a homeomorphism, but then the proof of BMR seems to require that too. As you point out in your exercises one can deduce the result for continuous maps by a clever kind of lifting trick.

By the way Benji Weiss’s book “Single Orbit Dynamics” has a discussion of how to get minimality without using Zorn’s Lemma, as does “Ergodic Theory via Joinings”. Furstenberg’s book has a more erudite approach (there is a uniquely ergodic -invariant measure on ) but this seems to be an unnecessarily large sledgehammer.

Thanks for the comments. I think (but haven’t fully checked) that one can also use Proposition 1 directly to imply Theorem 1, at least, for non-invertible T, simply by taking p to be a _positive_ minimal ultrafilter (thus ). Then n+ir+p will be positive too, and we can get Theorem 1 without ever having to apply a negative power of T. The derivation of Theorem 3 from Theorem 1 seems to not require invertibility of T either.

(still on the constructivity issue, Baire’s theorem turns out, on its turn,
to be equivalent to the Axiom of Dependent Choices, which is also
independent of the remaning axioms of Zermelo-Fraenkel set theory
and can be used as a “countable” substitute for the Axiom of Choice.
For instance, it’s equivalent to a countable version of Tychonoff’s
theorem)

I am trying to map quantities used in Theorem 1 to those in Theorem 2. In my understanding —
1. X ==> compactified version of Z.
2. Since X is compact, we can have a finite subcover. We will take a finite subcover of compactified Z. The sets in the subcover will be same as the colour classes.
3. k denotes the length of the arithmetic progression.
I think that r denotes the common difference between successive terms of the arithmetic progression but Theorem 2 has nothing to say anything on the common difference. To me it seems that Theorem 2 is weaker than Theorem 1.

One can rephrase Theorem 2 to resemble Theorem 1 more closely by replacing the phrase “contains arbitrarily long arithmetic progressions” by the equivalent phrase “contains infinitely many arithmetic progressions of length k for each k”. (Note that as a long arithmetic progression can always be chopped up into many shorter arithmetic progressions, the two statements imply each other.)

To deduce Theorem 2 from Theorem 1, one does not actually use the compactified integers (the colour classes do not cover the whole of unless one takes the closure, in which case they are no longer open) but instead looks at a subsystem of the Bernoulli system , where m is the number of colours used. The equivalence between Theorem 1 and Theorem 2 here is in fact very analogous to the equivalence between Theorem 1 of the previous lecture and the infinite pigeonhole principle, as discussed in Exercise 1 of that lecture.

[…] The Furstenberg correspondence principle can be extended to relate several other recurrence theorems to their combinatorial analogues. We give some representative examples here (without proofs). Firstly, there is a multidimensional version of Szemerédi’s theorem (compare with Exercise 7 from Lecture 4): […]

[…] dynamics, such as Proposition 3 from Lecture 6 or the Birkhoff multiple recurrence theorem from Lecture 4 (though to get the full strength of the results, one needs the syndetic van der Waerden theorem, […]

[…] this as a colouring of into colours. Applying van der Waerden’s theorem (Exercise 3 from Lecture 4), we can thus find (if K is sufficiently large depending on ) an arithmetic progression in for […]

Ex 10. According to the hint, we can construct the . However, to apply Theorem 3, we have to first prove Theorem 4. In its proof, we need the restricted (in ) to be invertible. This is not necessarily true for the restricted . But we seems able to re-prove Theorem 4 by the same procedure as in the note, only modifying the definition of by taking $x_J$ as the element $r_J$ step in front of $y$ in the orbit. I was wondering if this would be okay.

Ex 14. I spent more than 6 hours on it, I am really silly. I checked the proof given by Xiaochuan Liu in the above link, it seems his proof is not right. I was wondering if you could give a hint on how to prove it.

Dear Prof. Tao,
I’m sorry to bother you, but I’m working on a graduation thesis about topological dynamical systems, and I have some problems about Lemma 2 of this lecture. I will be grateful if you answer a couple of questions.

– First of all, you say about the existence of integers such that for all and , with . Now, if should I expect the existence of two integers or of just one? And in the case , how is it possible to consider , which is undefined?

– After having proved it, you apply Lemma 2 to the proof of Theorem 4, to close the induction. But if I set and , by the Lemma I obtain that contains for all , that is an arithmetic progression of lenght and not as I needed to complete the induction step. Where am I getting wrong?

For commenters

To enter in LaTeX in comments, use $latex <Your LaTeX code>$ (without the < and > signs, of course; in fact, these signs should be avoided as they can cause formatting errors). See the about page for details and for other commenting policy.