"A good stack of examples, as large as possible, is indispensable for a thorough understanding of any concept, and when I want to learn something new, I make it my first job to build one." – Paul Halmos

Recently, I have begun to appreciate the use of ultrafilters to clean up proofs in certain areas of mathematics. I’d like to talk a little about how this works, but first I’d like to give a hefty amount of motivation for the definition of an ultrafilter.

Terence Tao has already written a great introduction to ultrafilters with an eye towards nonstandard analysis. I’d like to introduce them from a different perspective. Some of the topics below are also covered in these posts by Todd Trimble.

In the previous post we described the following result characterizing the zeta distribution.

Theorem: Let $a_n = \mathbb{P}(X = n)$ be a probability distribution on $\mathbb{N}$. Suppose that the exponents in the prime factorization of $n$ are chosen independently and according to a geometric distribution, and further suppose that $a_n$ is monotonically decreasing. Then $a_n = \frac{1}{\zeta(s)} \cdot \frac{1}{n^s}$ for some real $s > 1$.

I have been thinking about the first condition, and I no longer like it. At least, I don’t like how I arrived at it. Here is a better way to conceptualize it: given that $n \mid X$, the probability distribution on $\frac{X}{n}$ should be the same as the original distribution on $X$. Writing $a_n = \mathbb{P}(X = n)$, by Bayes’ theorem this is equivalent to the condition that

$$\frac{a_{mn}}{a_n + a_{2n} + a_{3n} + \cdots} = \frac{a_m}{a_1 + a_2 + a_3 + \cdots}$$

which in turn is equivalent to the condition that

$$\frac{a_{mn}}{a_m} = \frac{a_n + a_{2n} + a_{3n} + \cdots}{a_1 + a_2 + a_3 + \cdots}.$$

(I am adopting the natural assumption that $a_n > 0$ for all $n$. No sense in excluding a positive integer from any reasonable probability distribution on $\mathbb{N}$.) In other words, $\frac{a_{mn}}{a_m}$ is independent of $m$. Setting $m = 1$ shows that it equals $\frac{a_n}{a_1}$, from which it follows that $a_{mn} = c \, a_m a_n$ for the constant $c = \frac{1}{a_1}$. From here it already follows that $a_n$ is determined by $a_p$ for $p$ prime and that the exponents in the prime factorization are chosen geometrically. And now the condition that $a_n$ is monotonically decreasing gives the zeta distribution as before. So I think we should use the following characterization theorem instead.

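Theorem: Let $a_n = \mathbb{P}(X = n)$ be a probability distribution on $\mathbb{N}$. Suppose that $a_{nm} = c \, a_n a_m$ for all $n, m \ge 1$ and some $c$, and further suppose that $a_n$ is monotonically decreasing. Then $a_n = \frac{1}{\zeta(s)} \cdot \frac{1}{n^s}$ for some real $s > 1$.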
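As a sanity check on the converse direction, here is a minimal Python sketch that tests numerically whether the zeta distribution $a_n = \frac{1}{\zeta(s)} \cdot \frac{1}{n^s}$ satisfies both the conditional-distribution condition and the relation $a_{nm} = c \, a_n a_m$ with $c = \frac{1}{a_1}$. The choice $s = 2$ (so that $\zeta(2) = \frac{\pi^2}{6}$ is available in closed form), the truncation length, and all function names are assumptions made for illustration only.

```python
import math

S = 2                       # illustrative exponent; any real s > 1 would do
ZETA_S = math.pi ** 2 / 6   # zeta(2), known in closed form


def a(n: int) -> float:
    """Zeta-distribution weight a_n = n^(-s) / zeta(s)."""
    return n ** (-S) / ZETA_S


def prob_divisible(n: int, terms: int = 200_000) -> float:
    """Truncated approximation to P(n | X) = a_n + a_{2n} + a_{3n} + ..."""
    return sum(a(k * n) for k in range(1, terms + 1))


def conditional(m: int, n: int, terms: int = 200_000) -> float:
    """Approximate P(X / n = m, given that n divides X)."""
    return a(m * n) / prob_divisible(n, terms)


# The constant in a_{nm} = c * a_n * a_m, obtained by setting m = 1.
C = 1 / a(1)

if __name__ == "__main__":
    for m, n in [(1, 2), (3, 5), (4, 6)]:
        # The first two printed numbers should agree up to truncation error.
        print(m, n, conditional(m, n), a(m))
        # Multiplicativity up to the constant c should hold to rounding error.
        print(math.isclose(a(m * n), C * a(m) * a(n), rel_tol=1e-12))
```

The conditional probabilities only match up to the error introduced by truncating the infinite sum, whereas the multiplicative relation holds up to floating-point rounding.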

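More generally, the following situation covers all the examples we have used so far. Let $M$ be a free commutative monoid on generators $p_1, p_2, \dots$, and let $\phi : M \to \mathbb{R}_{>0}$ be a homomorphism to the positive reals under multiplication. Let $a_m = \mathbb{P}(X = m)$ be a probability distribution on $M$. Suppose that $a_{nm} = c \, a_n a_m$ for all $n, m \in M$ and some $c$, and further suppose that if $\phi(n) \ge \phi(m)$ then $a_n \le a_m$. Then $a_m = \frac{1}{\zeta_M(s)} \cdot \frac{1}{\phi(m)^s}$ for some $s$ such that the zeta function

$$\zeta_M(s) = \sum_{m \in M} \frac{1}{\phi(m)^s}$$

converges. Moreover, $\zeta_M(s)$ has the Euler product

$$\zeta_M(s) = \prod_{i \ge 1} \frac{1}{1 - \phi(p_i)^{-s}}.$$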
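To see the Euler product in the most familiar special case, take the monoid to be the positive integers under multiplication, the generators to be the primes, and the homomorphism to be the identity, so that the zeta function is the Riemann zeta function. The following Python sketch compares a truncated sum with a truncated product at $s = 2$. The cutoffs and helper names are arbitrary choices made for illustration.

```python
import math


def primes_up_to(limit: int) -> list[int]:
    """Sieve of Eratosthenes: all primes p <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Cross out the multiples of p, starting at p * p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def zeta_partial_sum(s: float, n_max: int) -> float:
    """Sum of n^(-s) over 1 <= n <= n_max."""
    return sum(n ** (-s) for n in range(1, n_max + 1))


def zeta_partial_product(s: float, p_max: int) -> float:
    """Product of 1 / (1 - p^(-s)) over primes p <= p_max."""
    product = 1.0
    for p in primes_up_to(p_max):
        product /= 1.0 - p ** (-s)
    return product


if __name__ == "__main__":
    s = 2.0
    print("partial sum     :", zeta_partial_sum(s, 100_000))
    print("partial product :", zeta_partial_product(s, 100_000))
    print("pi^2 / 6        :", math.pi ** 2 / 6)
```

Both truncations approach $\frac{\pi^2}{6}$ from below, since every omitted term of the sum is positive and every omitted factor of the product is greater than $1$.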

Recall that in the statistical-mechanical interpretation, we are looking at a system whose states are finite collections of particles of types $p_1, p_2, \dots$ and whose energies are given by $\log \phi(p_i)$; then $\zeta_M(s)$ above is just the partition function. In the special case of the zeta function of a Dedekind abstract number ring $R$, $M$ is the commutative monoid of nonzero ideals of $R$ under multiplication, which is free on the prime ideals by unique factorization, and $\phi(I) = N(I) = |R/I|$. In the special case of the dynamical zeta function of an invertible map $f : X \to X$, $M$ is the free commutative monoid on orbits of $f$ (equivalently, the invariant submonoid of the free commutative monoid on $X$), and $\phi(P) = e^{|P|}$, where $|P|$ is the number of points in $P$.

An interesting result that demonstrates, among other things, the ubiquity of $\pi$ in mathematics is that the probability that two random positive integers are relatively prime is $\frac{6}{\pi^2}$. A more revealing way to write this number is $\frac{1}{\zeta(2)}$, where

$$\zeta(s) = \sum_{n \ge 1} \frac{1}{n^s}$$

is the Riemann zeta function. A few weeks ago this result came up on math.SE in the following form: if you are standing at the origin in $\mathbb{R}^2$ and there is an infinitely thin tree placed at every integer lattice point, then $\frac{6}{\pi^2}$ is the proportion of the lattice points that you can see. In this post I’d like to explain why this “should” be true. This will give me a chance to blog about some material from another math.SE answer of mine which I’ve been meaning to get to, and along the way we’ll reach several other interesting destinations.
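Before any explanation, the claim is easy to test empirically. Here is a minimal Python sketch that counts the coprime pairs in a finite box and compares the resulting fraction with $\frac{6}{\pi^2}$. Reading “random positive integers” as “uniform on $\{1, \dots, N\}$ with $N$ large” is an assumption made here, as are the box size and the function name.

```python
import math


def coprime_fraction(n: int) -> float:
    """Fraction of pairs (x, y) with 1 <= x, y <= n and gcd(x, y) = 1."""
    count = sum(
        1
        for x in range(1, n + 1)
        for y in range(1, n + 1)
        if math.gcd(x, y) == 1
    )
    return count / n ** 2


if __name__ == "__main__":
    n = 1000
    print("empirical fraction:", coprime_fraction(n))
    print("6 / pi^2          :", 6 / math.pi ** 2)
```

A lattice point $(x, y)$ with positive coordinates is visible from the origin exactly when $\gcd(x, y) = 1$, so the same count also estimates the proportion of visible trees in the first quadrant.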