"A good stock of examples, as large as possible, is indispensable for a thorough understanding of any concept, and when I want to learn something new, I make it my first job to build one." – Paul Halmos

Archive for the ‘physics.stat-mech’ Category

The principle of maximum entropy asserts that when trying to determine an unknown probability distribution (for example, the distribution of possible results that occur when you toss a possibly unfair die), you should pick the distribution with maximum entropy consistent with your knowledge.
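For the die example, here is a minimal sketch under one extra assumption that is not in the text above: the only thing we know about the die is the mean of a roll. In that case the entropy maximizer has the Gibbs form, with the probability of face k proportional to exp(lam * k), and the multiplier `lam` can be found by bisection because the mean is increasing in `lam`. The function names are purely illustrative.

```python
import math

def maxent_die(target_mean, faces=(1, 2, 3, 4, 5, 6), tol=1e-12):
    """Maximum-entropy distribution on `faces` with a prescribed mean.

    Assumes min(faces) < target_mean < max(faces).
    """
    def dist(lam):
        # Gibbs weights exp(lam * k), shifted by the max exponent for stability.
        exps = [lam * k for k in faces]
        shift = max(exps)
        weights = [math.exp(e - shift) for e in exps]
        z = sum(weights)
        return [w / z for w in weights]

    def mean(lam):
        return sum(k * p for k, p in zip(faces, dist(lam)))

    # The mean is an increasing function of lam, so bisection applies.
    lo, hi = -50.0, 50.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)

def entropy(p):
    """Shannon entropy in nats."""
    return -sum(x * math.log(x) for x in p if x > 0)
```

With `target_mean=3.5` the result is the uniform distribution, as one would hope for a die about which nothing suggests unfairness; any other mean tilts the weights exponentially toward the high or low faces and lowers the entropy.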

The goal of this post is to derive the principle of maximum entropy in the special case of probability distributions over finite sets from

The previous post on noncommutative probability was too long to leave much room for examples of random algebras. In this post we will describe all finite-dimensional random algebras with faithful states and all states on them. This will lead, in particular, to a derivation of the Born rule from statistical mechanics. We will then give a mathematical description of wave function collapse as taking a conditional expectation.

I put up a post over at the StackOverflow blog describing a little of what I’ve been up to this summer.

Curiously enough, the Zipf distribution which shows up in that post is the same as the zeta distribution that shows up when trying to motivate the definition of the Riemann zeta function. I’m sure there is a conceptual explanation of this connection somewhere, probably coming from statistical mechanics, but I don’t know it. I suppose the approximate scale invariance of the zeta distribution is relevant to its appearance in many real-life statistics, as described in Terence Tao’s blog post on the subject here.

In the previous post we described the following result characterizing the zeta distribution.

Theorem: Let $a_n = \mathbb{P}(X = n)$ be a probability distribution on $\mathbb{N}$. Suppose that the exponents in the prime factorization of $n$ are chosen independently and according to a geometric distribution, and further suppose that $a_n$ is monotonically decreasing. Then $a_n = \frac{1}{\zeta(s)} \cdot \frac{1}{n^s}$ for some real $s > 1$.

I have been thinking about the first condition, and I no longer like it. At least, I don’t like how I arrived at it. Here is a better way to conceptualize it: given that $n \mid X$, the probability distribution on $\frac{X}{n}$ should be the same as the original distribution on $X$. By Bayes’ theorem, this is equivalent to the condition that

$\displaystyle \frac{a_{mn}}{a_n + a_{2n} + a_{3n} + \cdots} = \frac{a_m}{a_1 + a_2 + a_3 + \cdots}$

which in turn is equivalent to the condition that

$\displaystyle \frac{a_{mn}}{a_m} = \frac{a_n + a_{2n} + a_{3n} + \cdots}{a_1 + a_2 + a_3 + \cdots}$.

(I am adopting the natural assumption that $a_n > 0$ for all $n$. No sense in excluding a positive integer from any reasonable probability distribution on $\mathbb{N}$.) In other words, $\frac{a_{mn}}{a_m}$ is independent of $m$, from which it follows that $a_{mn} = c a_m a_n$ for some constant $c$. From here it already follows that $a_n$ is determined by $a_p$ for $p$ prime and that the exponents in the prime factorization are chosen geometrically. And now the condition that $a_n$ is monotonically decreasing gives the zeta distribution as before. So I think we should use the following characterization theorem instead.

Theorem: Let $a_n = \mathbb{P}(X = n)$ be a probability distribution on $\mathbb{N}$. Suppose that $a_{mn} = c a_m a_n$ for all $m, n \ge 1$ and some $c$, and further suppose that $a_n$ is monotonically decreasing. Then $a_n = \frac{1}{\zeta(s)} \cdot \frac{1}{n^s}$ for some real $s > 1$.
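As a sanity check on the claim that the exponents in the prime factorization come out geometric, here is a rough numerical sketch. It truncates the zeta distribution at a cutoff `n_max` and tabulates the law of the exponent of a prime p; the value s = 2, the cutoff, and the function names are illustrative choices, not anything from the argument itself.

```python
def valuation(n, p):
    """Exponent of the prime p in the factorization of n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return k

def exponent_law(p, s=2.0, n_max=200_000, k_max=6):
    """Approximate P(v_p(X) = k) for k < k_max when P(X = n) is proportional to n^-s.

    The distribution is truncated at n_max, so the values are only
    accurate to roughly 1 / n_max.
    """
    z = sum(n ** -s for n in range(1, n_max + 1))  # truncated zeta(s)
    probs = [0.0] * k_max
    for n in range(1, n_max + 1):
        k = valuation(n, p)
        if k < k_max:
            probs[k] += n ** -s / z
    return probs

def geometric_law(p, s=2.0, k_max=6):
    """The geometric prediction (1 - p^-s) * p^(-k s)."""
    return [(1 - p ** -s) * p ** (-k * s) for k in range(k_max)]
```

Comparing `exponent_law(2)` with `geometric_law(2)`, or the same for `p = 3`, should show agreement up to the truncation error.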

More generally, the following situation covers all the examples we have used so far. Let $M$ be a free commutative monoid on generators $p_1, p_2, \ldots$, and let $\phi : M \to \mathbb{R}_{>0}$ be a homomorphism. Let $a_m = \mathbb{P}(X = m)$ be a probability distribution on $M$. Suppose that $a_{mn} = c a_m a_n$ for all $m, n \in M$ and some $c$, and further suppose that if $\phi(n) \ge \phi(m)$ then $a_n \le a_m$. Then $a_m = \frac{1}{\zeta_M(s)} \cdot \frac{1}{\phi(m)^s}$ for some $s$ such that the zeta function

$\displaystyle \zeta_M(s) = \sum_{m \in M} \frac{1}{\phi(m)^s}$

converges. Moreover, $\zeta_M(s)$ has the Euler product

$\displaystyle \zeta_M(s) = \prod_{i} \frac{1}{1 - \phi(p_i)^{-s}}$.

Recall that in the statistical-mechanical interpretation, we are looking at a system whose states are finite collections of particles of types $p_1, p_2, \ldots$ and whose energies are given by $\log \phi(p_i)$; then the above is just the partition function. In the special case of the zeta function of a Dedekind abstract number ring $R$, $M$ is the commutative monoid of nonzero ideals of $R$ under multiplication, which is free on the prime ideals by unique factorization, and $\phi(I) = N(I) = |R/I|$. In the special case of the dynamical zeta function of an invertible map $f : S \to S$, $M$ is the free commutative monoid on orbits of $f$ (equivalently, the invariant submonoid of the free commutative monoid on $S$), and $\phi(P) = e^{|P|}$, where $|P|$ is the number of points in $P$.
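The Euler product is easy to test numerically on a toy monoid with finitely many generators. The sketch below assumes three particle types with hypothetical values 2, 3 and 5 for the homomorphism on generators, sums over all states with each exponent below a cutoff, and compares the result with the product formula. The names `norms` and `max_exp` are illustrative.

```python
import itertools
import math

def partition_function_sum(norms, s, max_exp=40):
    """Sum of phi(m)^-s over states m, with every exponent below max_exp.

    A state is a tuple of exponents, one per generator; phi is
    multiplicative, so phi(m) is the product of norm^exponent.
    """
    total = 0.0
    for exps in itertools.product(range(max_exp), repeat=len(norms)):
        phi = math.prod(q ** e for q, e in zip(norms, exps))
        total += phi ** -s
    return total

def partition_function_product(norms, s):
    """Euler product: one geometric series 1 / (1 - q^-s) per generator."""
    return math.prod(1.0 / (1.0 - q ** -s) for q in norms)
```

For `norms = (2, 3, 5)` and `s = 2.0` the product is (4/3)(9/8)(25/24) = 25/16, and the truncated sum agrees with it up to rounding, since the terms dropped by the cutoff are far below machine precision.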

An interesting result that demonstrates, among other things, the ubiquity of $\pi$ in mathematics is that the probability that two random positive integers are relatively prime is $\frac{6}{\pi^2}$. A more revealing way to write this number is $\frac{1}{\zeta(2)}$, where

$\displaystyle \zeta(s) = \sum_{n \ge 1} \frac{1}{n^s}$

is the Riemann zeta function. A few weeks ago this result came up on math.SE in the following form: if you are standing at the origin in $\mathbb{R}^2$ and there is an infinitely thin tree placed at every integer lattice point, then $\frac{6}{\pi^2}$ is the proportion of the lattice points that you can see. In this post I’d like to explain why this “should” be true. This will give me a chance to blog about some material from another math.SE answer of mine which I’ve been meaning to get to, and along the way we’ll reach several other interesting destinations.
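A brute-force count makes the claim plausible before any explanation. The sketch below interprets "random positive integers" as uniform on a finite box of pairs (a, b) with both coordinates at most n, which is an assumption about how to make the statement precise, and counts the coprime pairs. A lattice point is visible from the origin exactly when its coordinates are coprime, so this is also the proportion of visible trees in that box.

```python
import math

def coprime_fraction(n):
    """Fraction of pairs (a, b) with 1 <= a, b <= n and gcd(a, b) = 1."""
    hits = sum(
        1
        for a in range(1, n + 1)
        for b in range(1, n + 1)
        if math.gcd(a, b) == 1
    )
    return hits / n ** 2

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, coprime_fraction(n), 6 / math.pi ** 2)
```

The printed fractions should settle toward 6/π² ≈ 0.6079 as n grows.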

I finally learned the solution to a little puzzle that’s been bothering me for a while.

The setup of the puzzle is as follows. Let $G$ be a weighted undirected graph, i.e. to each edge $i \leftrightarrow j$ is associated a non-negative real number $a_{ij}$, and let $A = (a_{ij})$ be the corresponding weighted adjacency matrix. If $A$ is stochastic, one can interpret the weights as transition probabilities between the vertices which describe a Markov chain. (The undirected condition then means that the transition probability between two states doesn’t depend on the order in which the transition occurs.) So one can talk about random walks on such a graph, and between any two vertices the most likely walk is the one which maximizes the product of the weights of the corresponding edges.

Suppose you don’t want to maximize a product associated to the edges, but a sum. For example, if the vertices of $G$ are locations to which you want to travel, then maybe you want the most likely random walk to also be the shortest one. If $d_{ij}$ is the distance between vertex $i$ and vertex $j$, then a natural way to do this is to set

$\displaystyle a_{ij} = e^{-\beta d_{ij}}$

where $\beta$ is some positive constant. Then the weight of a path is a monotonically decreasing function of its total length, and (fudging the stochastic constraint a bit) the most likely path between two vertices, at least if $\beta$ is sufficiently large, is going to be the shortest one. In fact, the larger $\beta$ is, the more likely you are to always be on the shortest path, since the contribution from any longer paths becomes vanishingly small. As $\beta \to \infty$, the ring in which the entries of the adjacency matrix live stops being $\mathbb{R}$ and becomes (a version of) the tropical semiring.
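To see the tropical limit concretely, here is a small sketch on a hypothetical three-vertex graph with made-up distances. It compares an entry of an ordinary matrix power of the exponential weights, rescaled by taking minus the logarithm and dividing by the constant, with the same entry of a min-plus matrix power of the distances. As in the text, the stochastic normalization is ignored, and all names are illustrative.

```python
import math

INF = math.inf  # no edge between the two vertices

def weights(dist, beta):
    """Entrywise exp(-beta * d), with weight 0 where there is no edge."""
    return [[math.exp(-beta * d) if d < INF else 0.0 for d in row] for row in dist]

def matmul(a, b):
    """Ordinary matrix product over the reals."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def minplus(a, b):
    """Tropical product: sums become minima, products become sums."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def power(m, k, mul):
    """k-th power of m (k >= 1) with respect to the product `mul`."""
    result = m
    for _ in range(k - 1):
        result = mul(result, m)
    return result

def soft_length(dist, beta, k, i, j):
    """-(1/beta) * log of the (i, j) entry of the k-th power of the weights."""
    entry = power(weights(dist, beta), k, matmul)[i][j]
    return -math.log(entry) / beta

# Hypothetical symmetric distances, no self-loops.
DIST = [
    [INF, 1, 4],
    [1, INF, 2],
    [4, 2, INF],
]
```

Here the three-step walks from vertex 0 to vertex 2 have lengths 6, 8 and 12, so `power(DIST, 3, minplus)[0][2]` is 6, while `soft_length(DIST, beta, 3, 0, 2)` sits slightly below 6 and approaches it as `beta` grows. Very large values of `beta` will underflow in floating point, so moderate values are the safe way to experiment.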

That’s pretty cool, but it’s not what’s been puzzling me. What’s been puzzling me is that matrix entries in powers of $A$ look an awful lot like partition functions in statistical mechanics, with the constant $\beta$ playing the role of the inverse temperature and the distances $d_{ij}$ playing the role of energies. So, for a while now, I’ve been wondering whether they actually are partition functions of systems I can construct starting from the matrix $A$. It turns out that the answer is yes: the corresponding systems are called one-dimensional vertex models, and in the literature the connection to matrix entries is called the transfer matrix method. I learned this from an expository article by Vaughan Jones, “In and around the origin of quantum groups,” and today I’d like to briefly explain how it works.
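The basic identity behind the transfer matrix method can be checked by brute force. The sketch below is a simplified chain model of my own choosing, not necessarily the formulation in Jones’s article: a chain of k + 1 sites, each in one of n states, with fixed end states, where neighboring sites in states u and v contribute energy `energy[u][v]`. Summing the Boltzmann weight over all configurations should reproduce the corresponding entry of the k-th power of the matrix of weights. All names are hypothetical.

```python
import itertools
import math

def boltzmann_matrix(energy, beta):
    """Transfer matrix with entries exp(-beta * energy[u][v])."""
    return [[math.exp(-beta * e) for e in row] for row in energy]

def matrix_power_entry(a, k, i, j):
    """(i, j) entry of the k-th power of a, via repeated row-vector products."""
    n = len(a)
    row = [1.0 if t == i else 0.0 for t in range(n)]
    for _ in range(k):
        row = [sum(row[u] * a[u][v] for u in range(n)) for v in range(n)]
    return row[j]

def partition_function(energy, beta, k, i, j):
    """Sum of exp(-beta * E) over all chains of k steps from state i to state j."""
    n = len(energy)
    total = 0.0
    for middle in itertools.product(range(n), repeat=k - 1):
        states = (i, *middle, j)
        e = sum(energy[u][v] for u, v in zip(states, states[1:]))
        total += math.exp(-beta * e)
    return total
```

For any small energy matrix, `partition_function(energy, beta, k, i, j)` and `matrix_power_entry(boltzmann_matrix(energy, beta), k, i, j)` should agree up to rounding. The brute-force sum has n^(k-1) terms, while the matrix power costs only about k * n^2 operations, which is the practical point of the method.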