Classical Mechanics versus Thermodynamics (Part 1)

It came as a bit of a shock last week when I realized that some of the equations I’d learned in thermodynamics were just the same as equations I’d learned in classical mechanics—with only the names of the variables changed, to protect the innocent.

Why didn’t anyone tell me?

For example: everybody loves Hamilton’s equations: there are just two, and they summarize the entire essence of classical mechanics. Most people hate the Maxwell relations in thermodynamics: there are lots, and they’re hard to remember.

But what I’d like to show you now is that Hamilton’s equations are Maxwell relations! They’re a special case, and you can derive them the same way. I hope this will make you like the Maxwell relations more, instead of liking Hamilton’s equations less.

First, let’s see what these equations look like. Then let’s see why Hamilton’s equations are a special case of the Maxwell relations. And then let’s talk about how this might help us unify different aspects of physics.

Hamilton’s equations

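Suppose you have a particle on the line whose position $q$ and momentum $p$ are functions of time, $t$. If the energy $H$ is a function of position and momentum, Hamilton’s equations say:

$$ \frac{dp}{dt} = -\frac{\partial H}{\partial q} $$

$$ \frac{dq}{dt} = \frac{\partial H}{\partial p} $$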

Maxwell’s relations

There are lots of Maxwell relations, and that’s one reason people hate them. But let’s just talk about two; most of the others work the same way.

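Suppose you have a physical system like a box of gas that has some volume $V$, pressure $P$, temperature $T$ and entropy $S$. Then the first and second Maxwell relations say:

$$ \left.\frac{\partial T}{\partial V}\right|_S = -\left.\frac{\partial P}{\partial S}\right|_V $$

$$ \left.\frac{\partial S}{\partial V}\right|_T = \left.\frac{\partial P}{\partial T}\right|_V $$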

Comparison

Clearly Hamilton’s equations resemble the Maxwell relations. Please check for yourself that the patterns of variables are exactly the same: only the names have been changed! So, apart from a key subtlety, Hamilton’s equations become the first and second Maxwell relations if we make these replacements:

$$ q \to S, \qquad p \to T, \qquad t \to V, \qquad H \to P $$

What’s the key subtlety? One reason people hate the Maxwell relations is that they have lots of little symbols like $|_V$ saying what to hold constant when we take our partial derivatives. Hamilton’s equations don’t have those.

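So, you probably won’t like this, but let’s see what we get if we write Hamilton’s equations so they exactly match the pattern of the Maxwell relations:

$$ \left.\frac{\partial p}{\partial t}\right|_q = -\left.\frac{\partial H}{\partial q}\right|_t $$

$$ \left.\frac{\partial q}{\partial t}\right|_p = \left.\frac{\partial H}{\partial p}\right|_t $$

This looks a bit weird, and it set me back a day. What does it mean to take the partial derivative of $q$ in the $t$ direction while holding $p$ constant, for example?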

I still think it’s weird. But I think it’s correct. To see this, let’s derive the Maxwell relations, and then derive Hamilton’s equations using the exact same reasoning, with only the names of variables changed.

Deriving the Maxwell relations

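The Maxwell relations are extremely general, so let’s derive them in a way that makes that painfully clear. Suppose we have any smooth function $U$ on the plane. Just for laughs, let’s call the coordinates of this plane $S$ and $V$. Then we have

$$ dU = T \, dS - P \, dV $$

for some functions $T$ and $P$. This equation is just a concise way of saying that

$$ T = \left.\frac{\partial U}{\partial S}\right|_V $$

and

$$ P = -\left.\frac{\partial U}{\partial V}\right|_S $$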

The minus sign here is unimportant: you can think of it as a whimsical joke. All the math would work just as well if we left it out.

(In reality, physicists call $U$ the internal energy of a system, regarded as a function of its entropy $S$ and volume $V$. They then call $T$ the temperature and $P$ the pressure. It just so happens that for lots of systems, the internal energy goes down as you increase the volume, so $P$ works out to be positive if we stick in this minus sign, and that’s what people did. But you don’t need to know any of this physics to follow the derivation of the Maxwell relations!)

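Now, mixed partial derivatives commute, so we have:

$$ \frac{\partial^2 U}{\partial V \, \partial S} = \frac{\partial^2 U}{\partial S \, \partial V} $$

Plugging in our definitions of $T$ and $P$, this says

$$ \left.\frac{\partial T}{\partial V}\right|_S = -\left.\frac{\partial P}{\partial S}\right|_V $$

And that’s the first Maxwell relation! So, there’s nothing to it: it’s just a sneaky way of saying that the mixed partial derivatives of the function $U$ commute.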

The second Maxwell relation works the same way. But seeing this takes a bit of thought, since we need to cook up a suitable function whose mixed partial derivatives are the two sides of this equation:

$$ \left.\frac{\partial S}{\partial V}\right|_T = \left.\frac{\partial P}{\partial T}\right|_V $$

There are different ways to do this, but for now let me use the time-honored method of ‘pulling the rabbit from the hat’.

Here’s the function we want:

$$ A = U - TS $$

(In thermodynamics this function is called the Helmholtz free energy. It’s sometimes denoted $F$, but the International Union of Pure and Applied Chemistry recommends calling it $A$, which stands for the German word ‘Arbeit’, meaning ‘work’.)

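Let’s check that this function does the trick:

$$ dA = dU - T \, dS - S \, dT = -S \, dT - P \, dV $$

If we restrict ourselves to any subset of the plane where $T$ and $V$ serve as coordinates, the above equation is just a concise way of saying

$$ S = -\left.\frac{\partial A}{\partial T}\right|_V $$

and

$$ P = -\left.\frac{\partial A}{\partial V}\right|_T $$

Then since mixed partial derivatives commute, we get:

$$ \frac{\partial^2 A}{\partial V \, \partial T} = \frac{\partial^2 A}{\partial T \, \partial V} $$

or in other words:

$$ \left.\frac{\partial S}{\partial V}\right|_T = \left.\frac{\partial P}{\partial T}\right|_V $$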

which is the second Maxwell relation.
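If you want to see both relations hold for a concrete system, here is a minimal numerical sketch in Python. It assumes a toy internal energy $U(S,V) = V^{-2/3} e^{2S/3}$ (a monatomic ideal gas with $Nk = 1$ and all other constants dropped). That choice, the function names and the sample point are my own illustrative assumptions, not part of the derivation above.

```python
import math

def U(S, V):
    # Toy internal energy (assumption): monatomic ideal gas with N*k = 1.
    return V ** (-2.0 / 3.0) * math.exp(2.0 * S / 3.0)

def T_of_SV(S, V):
    # T = dU/dS at constant V, worked out by hand for this U.
    return (2.0 / 3.0) * U(S, V)

def P_of_SV(S, V):
    # P = -dU/dV at constant S, worked out by hand for this U.
    return (2.0 / 3.0) * U(S, V) / V

def S_of_TV(T, V):
    # Solve T = (2/3) U(S, V) for S, so that (T, V) serve as coordinates.
    return 1.5 * math.log(1.5 * T) + math.log(V)

def P_of_TV(T, V):
    # The same pressure, now as a function of (T, V): the ideal gas law.
    return T / V

def d(f, x, h=1e-6):
    # Central-difference derivative of a one-variable function.
    return (f(x + h) - f(x - h)) / (2.0 * h)

S0, V0 = 0.7, 1.3          # arbitrary sample point
T0 = T_of_SV(S0, V0)

# First relation: dT/dV at constant S  =  -dP/dS at constant V
first_lhs = d(lambda V: T_of_SV(S0, V), V0)
first_rhs = -d(lambda S: P_of_SV(S, V0), S0)

# Second relation: dS/dV at constant T  =  dP/dT at constant V
second_lhs = d(lambda V: S_of_TV(T0, V), V0)
second_rhs = d(lambda T: P_of_TV(T, V0), T0)

print(first_lhs, first_rhs)
print(second_lhs, second_rhs)
```

The little subscripts saying what to hold constant show up here as the choice of which two-argument function we differentiate: `P_of_SV` and `P_of_TV` are the same pressure in different coordinates.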

We can keep playing this game using various pairs of the four functions $S, T, V, P$ as coordinates, and get more Maxwell relations: enough to give ourselves a headache! But we have better things to do today.

Hamilton’s equations as Maxwell relations

For example: let’s see how Hamilton’s equations fit into this game. Suppose we have a particle on the line. Consider smooth paths where it starts at some fixed position at some fixed time and ends at the point $q$ at the time $t$. Nature will choose a path with least action, or at least one that’s a stationary point of the action. Let’s assume there’s a unique such path, and that it depends smoothly on $q$ and $t$. For this to be true, we may need to restrict $q$ and $t$ to a subset of the plane, but that’s okay: go ahead and pick such a subset.

Given $q$ and $t$ in this set, nature will pick the path that’s a stationary point of the action; the action of this path is called Hamilton’s principal function and denoted $S(q,t)$. (Beware: this $S$ is not the same as entropy!)

Let’s assume $S$ is smooth. Then we can copy our derivation of the Maxwell relations line for line and get Hamilton’s equations! Let’s do it, skipping some steps but writing down the key results.

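For starters we have

$$ dS = p \, dq - H \, dt $$

for some functions $p$ and $H$ called the momentum and energy, which obey

$$ p = \left.\frac{\partial S}{\partial q}\right|_t $$

and

$$ H = -\left.\frac{\partial S}{\partial t}\right|_q $$

As far as I can tell it’s just a cute coincidence that we see a minus sign in the same place as before! Anyway, the fact that mixed partials commute gives us

$$ \left.\frac{\partial p}{\partial t}\right|_q = -\left.\frac{\partial H}{\partial q}\right|_t $$

which is the first of Hamilton’s equations. And now we see that all the funny $|_q$ and $|_t$ things are actually correct!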

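Next, we pull a rabbit out of our hat. We define this function:

$$ X = S - pq $$

and check that

$$ dX = -q \, dp - H \, dt $$

This function probably has a standard name, but I don’t know it. Do you?

Then, considering any subset of the plane where $p$ and $t$ serve as coordinates, we see that because mixed partials commute:

$$ \frac{\partial^2 X}{\partial t \, \partial p} = \frac{\partial^2 X}{\partial p \, \partial t} $$

we get

$$ \left.\frac{\partial q}{\partial t}\right|_p = \left.\frac{\partial H}{\partial p}\right|_t $$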

So, we’re done!
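To make the odd-looking partial derivatives concrete, here is a small Python sketch for the simplest case I can think of: a free particle of mass `M` that starts at position 0 at time 0. For that system Hamilton’s principal function is $m q^2 / 2t$. The choice of system, the mass value and all the function names are illustrative assumptions; the closed-form derivatives were worked out by hand.

```python
M = 2.0  # particle mass (arbitrary illustrative value)

def S(q, t):
    # Hamilton's principal function for a free particle that starts at
    # position 0 at time 0 and arrives at position q at time t.
    return M * q * q / (2.0 * t)

def p_of_qt(q, t):
    # p = dS/dq at constant t
    return M * q / t

def H_of_qt(q, t):
    # H = -dS/dt at constant q
    return M * q * q / (2.0 * t * t)

def q_of_pt(p, t):
    # Solve p = M q / t for q, so that (p, t) serve as coordinates.
    return p * t / M

def H_of_pt(p, t):
    # The same energy as a function of (p, t): the usual p^2 / 2m.
    return p * p / (2.0 * M)

def X(p, t):
    # The "rabbit": X = S - p q, regarded as a function of (p, t).
    q = q_of_pt(p, t)
    return S(q, t) - p * q

def d(f, x, h=1e-6):
    # Central-difference derivative of a one-variable function.
    return (f(x + h) - f(x - h)) / (2.0 * h)

q0, t0 = 1.5, 0.8          # arbitrary endpoint
p0 = p_of_qt(q0, t0)

# First Hamilton equation: dp/dt at constant q  =  -dH/dq at constant t
first_lhs = d(lambda t: p_of_qt(q0, t), t0)
first_rhs = -d(lambda q: H_of_qt(q, t0), q0)

# Second Hamilton equation: dq/dt at constant p  =  dH/dp at constant t
second_lhs = d(lambda t: q_of_pt(p0, t), t0)
second_rhs = d(lambda p: H_of_pt(p, t0), p0)

# dX = -q dp - H dt, read off one coefficient at a time
q_from_X = -d(lambda p: X(p, t0), p0)
H_from_X = -d(lambda t: X(p0, t), t0)

print(first_lhs, first_rhs)
print(second_lhs, second_rhs)
print(q_from_X, H_from_X)
```

Holding $p$ constant while varying $t$ here means sliding the endpoint along the family of free paths with the same momentum, which is why `second_lhs` comes out as the velocity `p0 / M`.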

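But you might be wondering how we pulled this rabbit out of the hat. More precisely, why did we suspect it was there in the first place? There’s a nice answer if you’re comfortable with differential forms. We start with what we know:

$$ dS = p \, dq - H \, dt $$

Next, we use this fundamental equation:

$$ d^2 = 0 $$

to note that:

$$ 0 = d^2 S = d(p \, dq - H \, dt) = dp \wedge dq - dH \wedge dt = -dq \wedge dp - dH \wedge dt = d(-q \, dp - H \, dt) $$

See? We’ve managed to switch the roles of $p$ and $q$ at the cost of an extra minus sign!

Then, if we restrict attention to any contractible open subset of the plane, the Poincaré Lemma says

$$ d\omega = 0 \implies \omega = d\mu \text{ for some } \mu $$

Since

$$ d(-q \, dp - H \, dt) = 0 $$

it follows that there’s a function $X$ with

$$ dX = -q \, dp - H \, dt $$

This is our rabbit. And if you ponder the difference between $-q \, dp$ and $p \, dq$, you’ll see it’s $-d(pq)$. So, it’s no surprise that

$$ X = S - pq $$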

The big picture

Now let’s step back and think about what’s going on.

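Lately I’ve been trying to unify a bunch of ‘extremal principles’, including:

1) the principle of least action
2) the principle of least energy
3) the principle of maximum entropy
4) Occam’s razor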

In my post on quantropy I explained how the first three principles fit into a single framework if we treat Planck’s constant as an imaginary temperature. The guiding principle of this framework is

maximize entropy subject to the constraints imposed by what you believe

And that’s nice, because E. T. Jaynes has made a powerful case for this principle.

However, when the temperature is imaginary, entropy is so different that it may deserve a new name: say, ‘quantropy’. In particular, it’s complex-valued, so instead of maximizing it we have to look for stationary points: places where its first derivative is zero. But this isn’t so bad. Indeed, a lot of minimum and maximum principles are really ‘stationary principles’ if you examine them carefully.

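What about the fourth principle: Occam’s razor? We can formalize this using algorithmic probability theory. Occam’s razor then becomes yet another special case of the same guiding principle:

maximize entropy subject to the constraints imposed by what you believe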

All of this deserves plenty of further thought and discussion—but not today!

Today I just want to point out that once we’ve formally unified classical mechanics and thermal statics (often misleadingly called ‘thermodynamics’), as sketched in the article on quantropy, we should be able to take any idea from one subject and transpose it to the other. And it’s true. I just showed you an example, but there are lots of others!

I guessed this should be possible after pondering three famous facts:

• In classical mechanics, if we fix the initial position of a particle, we can pick any position $q$ and time $t$ at which the particle’s path ends, and nature will seek the path to this endpoint that minimizes the action. This minimal action is Hamilton’s principal function $S(q,t)$, which obeys

$$ dS = p \, dq - H \, dt $$

In thermodynamics, if we fix the entropy $S$ and volume $V$ of a box of gas, nature will seek the probability distribution of microstates that minimizes the energy. This minimal energy is the internal energy $U(S,V)$, which obeys

$$ dU = T \, dS - P \, dV $$

• In classical mechanics we have canonically conjugate quantities, while in statistical mechanics we have conjugate variables. In classical mechanics the canonical conjugate of the position $q$ is the momentum $p$, while the canonical conjugate of time $t$ is energy $H$. In thermodynamics, the conjugate of entropy $S$ is temperature $T$, while the conjugate of volume $V$ is pressure $P$. All this fits in perfectly with the analogy we’ve been using today:

$$ q \to S, \qquad p \to T, \qquad t \to V, \qquad H \to P $$

• Something called the Legendre transformation plays a big role both in classical mechanics and thermodynamics. This transformation takes a function of some variable and turns it into a function of the conjugate variable. In our proof of the Maxwell relations, we secretly used a Legendre transformation to pass from the internal energy $U(S,V)$ to the Helmholtz free energy $A(T,V)$:

$$ A = U - TS $$

where we must solve for the entropy $S$ in terms of $T$ and $V$ to think of $A$ as a function of these two variables.

Similarly, in our proof of Hamilton’s equations, we passed from Hamilton’s principal function $S(q,t)$ to the function $X(p,t)$:

$$ X = S - pq $$

where we must solve for the position $q$ in terms of $p$ and $t$ to think of $X$ as a function of these two variables.

I hope you see that all this stuff fits together in a nice picture, and I hope to say a bit more about it soon. The most exciting thing for me will be to see how symplectic geometry, so important in classical mechanics, can be carried over to thermodynamics. Why? Because I’ve never seen anyone use symplectic geometry in thermodynamics. But maybe I just haven’t looked hard enough!

Indeed, it’s perfectly possible that some people already know what I’ve been saying today. Have you seen someone point out that Hamilton’s equations are a special case of the Maxwell relations? This would seem to be the first step towards importing all of symplectic geometry to thermodynamics.


59 Responses to Classical Mechanics versus Thermodynamics (Part 1)

(1) iirc the equations you here advertise as Maxwell’s I was taught as Boltzmann’s?

(2) pardon my French, but an imo good enough reason not to name thermodynamics thermostatics is that in French that name allows to stage the second law with a pun – thus reminding that entropy is a measure of e.g. punning micro-states. L’entropie met un terme aux dynamiques – entropy terminates the dynamics.

I’m surprised this paper isn’t better known! I wonder if it could be because the American Journal of Physics is a ‘teaching journal’.

By the way: I didn’t use the phrase ‘Hamilton–Jacobi’ in my blog article, because I wanted to keep the jargon down to a bare minimum, but of course the idea of deriving Hamilton’s equations by taking derivatives of Hamilton’s principal function is tightly connected to the Hamilton–Jacobi equation. It seems that for the isomorphism between classical mechanics and thermodynamics to become vivid, the Hamilton–Jacobi approach to Hamilton’s equations is nicer than the more common one focused on the phase space with coordinates $q$ and $p$. But once we know the isomorphism exists, we can use any approach we want!

Like Blake Stacey says a couple of comments down, any previous work on this correspondence probably stems from Caratheodory’s formulation of thermodynamics, which relies on differential forms… Incidentally, isn’t there a Hamiltonian formulation of geometrical optics? It’d be lovely to get optics into the mix too!

I was wondering why you didn’t mention the Hamilton-Jacobi equation, but now that I’ve had a chance to compare my notes with this blog entry, I agree that your approach is much neater!

The reference I have on Caratheodory’s formulation of thermodynamics is Frankel’s The Geometry of Physics (second edition, 2004), section 6.3, which I’ve never actually read (well, never that carefully). Historical background can be found at a slightly more elementary mathematical level in Max Born’s Natural Philosophy Of Cause And Chance (1949), chapter 5.

The reference I have on Caratheodory’s formulation of thermodynamics is Frankel’s The Geometry of Physics (second edition, 2004), section 6.3, which I’ve never actually read (well, never that carefully).

I don’t remember what it says about Caratheodory’s formulation of thermodynamics in section 6.3, but that is an awesome book, so you should read it!

In the discussion on Hamiltonian mechanics, the function X looks like a generating function of the second kind, while S is a generating function of the first kind. Not sure what X is routinely called, though…

You’re reawakening all those quarter glimpsed things I never fully got:

Something called the Legendre transformation plays a big role both in classical mechanics and thermodynamics.

Why did it also play a role in information geometry?

I guess we need to understand the essence of the Legendre transform. The Legendre transform shows up whenever we minimize or maximize something subject to constraints. That happens a lot.

Here’s how it goes. Suppose you have a function of two variables (it could be more or fewer, but two is a nice example). Purely for fun let’s call this function $U(S,V)$. Now suppose you want to find the minimum of $U$ subject to a constraint on one of the variables, say $S$. We can do this using the yoga of Lagrange multipliers, which I hope you know and love. But let me describe that yoga in a mildly nonstandard way.

First, we find the values of $S$ and $V$ that minimize

$$ U(S,V) - TS $$

Here $T$ is called the Lagrange multiplier. We don’t vary it while doing the minimization. Thus, the location of the minimum will depend on $T$. We figure out how. We then cleverly choose $T$ so that $S$ takes the value given by our constraint. We then read off $V$.

But after we get in the habit of this, we start to love

$$ U(S,V) - TS $$

and think of it in various new ways. It’s a function of three variables. But we can think of it as a function of just $T$ and $V$, at least if we’re lucky. How? Because if we fix $T$ and $V$, there will be a unique $S$ minimizing $U(S,V) - TS$, at least if we’re lucky. So, we use that $S$. Then $U(S,V) - TS$ depends on just $T$ and $V$.

This is what we mean when we write

$$ A(T,V) = U(S,V) - TS $$

We call $A$ the Legendre transform of our original function $U$.

Now we can approach our original task, locating the minimum of $U$ for a fixed value of $S$, in a slightly new way! Now we’re locating the minimum of $A$ for some fixed value of $T$.

It’s all so tautologous and simple that it sounds a bit confusing and pointless. We’re really not doing much! But it sounds grand—in fact it is grand—when we attach physically significant names to our variables.

Our original task was to find what a box of gas will do when it minimizes energy subject to a constraint on entropy.

We’ve changed this to an equivalent problem: find what a box of gas will do when it minimizes free energy subject to a constraint on temperature.

Here ‘find what it would do’ means find the value of the remaining variable $V$, which was a bystander in the above game. But there could be lots of such variables.

Anyway, I hope you get it. We’ve switched our focus from the originally constrained variable to the Lagrange multiplier. We call the Lagrange multiplier the conjugate of the original variable. And we call our new function of interest (here $A$) the Legendre transform of our original function (here $U$).
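The recipe above (fix the multiplier, minimize over the constrained variable, keep the minimum value) is easy to try numerically. Here is a rough Python sketch. It assumes a toy energy $U(S,V) = V^{-2/3} e^{2S/3}$, which is convex in $S$, so the minimizer is unique; the ternary search, the search interval and every name in the code are illustrative assumptions.

```python
import math

def U(S, V):
    # Toy energy function (assumption), convex in S.
    return V ** (-2.0 / 3.0) * math.exp(2.0 * S / 3.0)

def argmin_convex(f, lo, hi, iters=200):
    # Ternary search for the minimizer of a convex function on [lo, hi].
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2.0

def legendre(T, V):
    # Fix T and V, find the S minimizing U(S, V) - T*S, and return
    # both the minimum value A(T, V) and the minimizing S.
    s_star = argmin_convex(lambda S: U(S, V) - T * S, -20.0, 20.0)
    return U(s_star, V) - T * s_star, s_star

def legendre_exact(T, V):
    # Closed form for this particular U: the minimizer solves dU/dS = T.
    s_star = 1.5 * math.log(1.5 * T) + math.log(V)
    return 1.5 * T - T * s_star, s_star

a_num, s_num = legendre(1.0, 1.3)
a_ref, s_ref = legendre_exact(1.0, 1.3)
print(a_num, a_ref)
print(s_num, s_ref)
```

Note that the minimizing `s_num` is exactly the value of the constrained variable that corresponds to the chosen multiplier, which is the ‘cleverly choose’ step run in reverse.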

I gave my current favorite explanation of the ‘essence’ of the Legendre transformation, but I didn’t answer David’s real puzzle, which I take to be: if the Legendre transformation can be seen as a limit of the Laplace transform, why do we see both Laplace transforms and Legendre transformations showing up in thermal statics?

This is a great puzzle and I don’t think I’ve reached the bottom of it. But here’s a start.

In the $T \to 0$ limit, thermal statics reduces to classical statics, where the principle of least energy reigns supreme. Whenever we minimize things we expect to see Legendre transformations, so in classical statics we do.

As we go to $T > 0$ we’re doing thermal statics. Now, instead of choosing the one state of minimum energy we say all possible states occur with different probabilities, with a state of energy $E$ showing up with probability proportional to $\exp(-E/kT)$. So, our Legendre transforms turn into Laplace transforms.

But, this probability distribution on states still minimizes something: namely, the free energy! So, we still see Legendre transforms showing up in thermal statics!

So, we see both Laplace transforms and Legendre transforms in thermal statics.
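Here is a tiny Python sketch of the claim that the Boltzmann distribution still minimizes something. It takes a made-up three-state system, sets Boltzmann’s constant to 1, and compares the free energy (mean energy minus temperature times entropy) of the Boltzmann distribution with that of two other distributions. The energy levels, the temperature and the function names are all illustrative assumptions.

```python
import math

def boltzmann(energies, T):
    # Probabilities proportional to exp(-E/T), with k = 1 (assumption).
    # Subtracting the lowest energy first avoids underflow at small T.
    e_min = min(energies)
    weights = [math.exp(-(E - e_min) / T) for E in energies]
    Z = sum(weights)
    return [w / Z for w in weights]

def free_energy(probs, energies, T):
    # <E> - T * (Shannon entropy) of a probability distribution on states.
    mean_energy = sum(p * E for p, E in zip(probs, energies))
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return mean_energy - T * entropy

energies = [1.0, 1.5, 3.0]   # made-up energy levels
T = 0.5

p_star = boltzmann(energies, T)
f_star = free_energy(p_star, energies, T)
f_uniform = free_energy([1.0 / 3.0] * 3, energies, T)
f_ground = free_energy([1.0, 0.0, 0.0], energies, T)

# As T -> 0 the distribution piles up on the state of least energy.
p_cold = boltzmann(energies, 0.01)

print(f_star, f_uniform, f_ground)
print(p_cold)
```

The minimum value `f_star` agrees with $-T \ln Z$, where $Z$ is the sum of $\exp(-E/T)$ over states, and that sum over states is where the Laplace transform sneaks in.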

Ah, that makes sense. Something similar happens with probabilities (though in view of the association between energies and probabilities that was always likely). When you have a distribution over a space you can always lift it up to a point in the space of distributions. So the Legendre distribution in the latter case is shifting you between coordinates for the moment-determined subspaces and those of the corresponding exponential families.

We note that equations of state—by which we mean identical relations among the thermodynamic variables characterizing a system—are actually first‐order partial differential equations for a function which defines the thermodynamics of the system. Like the Hamilton‐Jacobi equation, such equations can be solved along trajectories given by Hamilton’s equations, the trajectories being quasistatic processes which obey the given equation of state. This gives rise to the notion of thermodynamic functions as infinitesimal generators of quasistatic processes, with a natural Poisson bracket formulation. This formulation of thermodynamic transformations is invariant under canonical coordinate transformations, just as classical mechanics is, which is to say that thermodynamics and classical mechanics have the same formal structure, namely a symplectic structure.

Here’s what Peterson says in his introduction:

The “true structure” of thermodynamics, in this sense of coordinate invariance, is easy to find when one has once wondered about it, and the result is the same as in mechanics: the Poisson bracket structure (what mathematicians call a symplectic structure). This fact is, I believe, well known to some people, but it was not known to me when I was worrying about it, so I offer it to those who may find themselves similarly placed. I still find it surprising that thermodynamics and classical mechanics, those two pillars of classical physics, are—in a sense to be described—formally isomorphic!

Well, actually his Poisson brackets differ by an overall sign from the usual ones in classical mechanics, if you use the analogy I suggest. But that’s not surprising: the overall sign of the Poisson brackets is somewhat a matter of convention (though the conventions interlock and you have to be careful of changing just one).

Great! As Mark Peterson points out in his paper, the analogy between classical mechanics and thermodynamics makes a lot of things clearer. It would be fun to incorporate that into a course, and from the comments here you’ll see a number of people are already doing it. I’d love to try it myself someday—maybe when I get back to U.C. Riverside.

For some time I’ve suspected that various physical theories somehow use the same mathematics, but don’t have sufficient background to test this out. Your example here would be one of them. A friend of mine, Rob Tyler, once told me that he had published an article somewhere showing that the equations of electromagnetism are the same as those of fluid dynamics (Maxwell’s equations correspond to the vorticity relations I think). I’m not sure where all this leads, but at least it suggested a cute possibility on that score: what if anyone who solved either Yang-Mills or Navier-Stokes was entitled to $2,000,000 from the Clay foundation since by solving one they’d essentially solved the other? I just wish I knew what I was talking about on the matter.

Qualitatively, I don’t see that as very surprising. It has been known for a long time by mathematicians that symplectic topology has very deep and rich applications in quantum mechanics. Also, as several of your former posts indicate, quantum mechanics itself shares many similarities with thermodynamics (this is especially true if you subscribe to the ensemble interpretation).

It is important to look a bit deeper into the physics behind the equations, though. Hamiltonian mechanics goes far beyond just Hamilton’s equations of motion. The richness of Hamiltonian mechanics raises a few questions about this article: What does the fact that being invariant with varying signify physically (analogous to the Poincaré/Liouville integral invariant)? Can you relate thermodynamic variables via canonical transformations? Is the symplectic theory of thermodynamics useful in applications or in theory (perhaps symplectic integration of thermodynamic quantities could prove useful for engineers)?

Darn, it is curiosities like these that make me regret my ignorance of thermodynamics!

One of the first things I learned to do with Hamilton’s equations of motion was to study the case where the initial condition is not exactly known: instead of saying we have at time $t_0$ a particle at $q_0$ with momentum $p_0$, or a set of particles with a well-specified position and momentum each, all we know is the probability density for where things are and what they’re doing, $\rho(q,p)$. If this is how we’ll gamble about what’s happening at time $t_0$, and if we accept that Newton’s laws apply, how should we gamble about what will happen at a later time $t$? The Liouville equation tells us

$$ \frac{\partial \rho}{\partial t} = -\{\rho, H\} $$

where $H$ is the Hamiltonian which encodes all the ways the particles can push and pull on one another.

So, by formal analogy, can we say something about “statistical thermal statics”, i.e., a situation where we don’t know exact values for the macroscopic state variables? I guess the formal analogue of the Liouville equation would read

$$ \frac{\partial \rho}{\partial V} = -\{\rho, P\} $$

relating the change in $\rho$ under an infinitesimal quasi-static change in volume to the Poisson bracket of $\rho$ with the pressure $P$.

So, by formal analogy, can we say something about “statistical thermal statics”, i.e., a situation where we don’t know exact values for the macroscopic state variables?

That’s an interesting idea! It’s sort of amusing, since in statistical mechanics, thermal statics is already statistical. But taking a probability distribution of probability distributions, and collapsing it down to a probability distribution, is a perfectly fine thing (formalized using the Giry monad, if you feel like showing off).

I think you’re exactly right about the analogue of the Liouville equation (though I don’t vouch for the minus sign); this should be a spinoff of M. J. Peterson’s comment:

Like the Hamilton‐Jacobi equation, such equations can be solved along trajectories given by Hamilton’s equations, the trajectories being quasistatic processes which obey the given equation of state.

The interesting thing about this monad is that it can lead to PDF’s that lack defined moments. For example the practice of applying an exponential distribution to an exponential distribution leads to a resultant PDF which lacks a mean but still has a median value. I have a feeling that this is a maximum entropy situation for a median-only constrained PDF, but I haven’t been able to set up the variational parameters correctly.

So my math puzzle is “Find the maximum entropy probability density function given a constraint of known median value”.

It’s sort of amusing, since in statistical mechanics, thermal statics is already statistical.

Right. On the other hand, a statement like “the temperature is 300 plus-or-minus 5 degrees Celsius” makes sense in an engineering context.

But taking a probability distribution of probability distributions, and collapsing it down to a probability distribution, is a perfectly fine thing (formalized using the Giry monad, if you feel like showing off).

Since WebHubTel and Blake are both interested in probability distributions of probability distributions, and the literature I’ve been able to find doesn’t look as readable as it should be, here’s a little mini-course:

Probability distributions aren’t nearly as flexible as probability measures, so we should really use those. Suppose $X$ is a space, and let $P(X)$ be the space of probability measures on $X$. Then we’d like to talk about $P(P(X))$, the space of probability measures on the space of probability measures on $X$. And there should be a map

$$ P(P(X)) \to P(X) $$

which collapses a probability measure on the space of probability measures on $X$ down to a probability measure on $X$.

For example, if there’s a 50% chance that the coin in my hand is fair (so it has a 50% chance of landing heads up), but there’s a 50% chance that it’s rigged so that it has a 90% chance of landing heads up, we say there’s a 70% chance of its landing heads up.
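For a finite set of outcomes the collapsing map is just a weighted average, and the coin example can be checked in a few lines of Python. This is only a sketch for finitely supported distributions, represented as dictionaries; the function name `collapse` and the data layout are illustrative assumptions, and none of the measure-theoretic subtleties show up in this setting.

```python
def collapse(mixture):
    # Flatten a finite distribution over distributions into one distribution.
    # `mixture` is a list of (weight, distribution) pairs; each distribution
    # is a dict mapping outcomes to probabilities.
    result = {}
    for weight, dist in mixture:
        for outcome, prob in dist.items():
            result[outcome] = result.get(outcome, 0.0) + weight * prob
    return result

fair = {"heads": 0.5, "tails": 0.5}
rigged = {"heads": 0.9, "tails": 0.1}

# 50% chance the coin is fair, 50% chance it is rigged.
coin = collapse([(0.5, fair), (0.5, rigged)])
print(coin)
```

The `heads` entry comes out as 0.5 × 0.5 + 0.5 × 0.9, matching the 70% figure above up to floating-point rounding.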

It’s not hard to write down a formula for how this map should work in general, so I’ll leave that as a little exercise. The challenging part is this:

When I say $X$ is ‘a space’ I’m being pretty vague. We need $X$ to be a measurable space to define measures on it. That’s straightforward enough… but then we need $P(X)$ to also be a measurable space, so we can define $P(P(X))$.

So, we need to find some class of measurable spaces $X$ such that $P(X)$ is again a measurable space. And ideally we’d like $P(X)$ to again be a measurable space in the same class! That would let us go hog-wild and define not just $P(P(X))$ but also $P(P(P(X)))$ and so on. Believe it or not, these things are useful too!

Mathematicians have found a nice answer to this puzzle, though perhaps not the ultimate ideal answer: it’s to use ‘standard Borel spaces’. This is a class of measurable spaces that includes all the ones you’d ever care about (unless you’re really insane!), and has the property that if $X$ is a standard Borel space, so is $P(X)$. Even better, there’s a nice complete classification of standard Borel spaces: two standard Borel spaces are isomorphic iff they have the same cardinality. This is a theorem due to Kuratowski.

So, it turns out we get a functor $P$ sending standard Borel spaces to standard Borel spaces, and a monad, meaning that for each standard Borel space $X$ we get a map

$$ P(P(X)) \to P(X) $$

as desired, and this map is incredibly well-behaved. Now is not the time for me to explain monads; I’ve done it before in This Week’s Finds. But anyway, this functor is sometimes called the Giry monad… though often people use that term to mean something very slightly different, where instead of probability measures we use measures whose integral is less than or equal to 1 (called sub-probability measures).

So, what’s a standard Borel space? I explained that in week272, but I’ll say it again:

For starters, it’s a kind of measurable space, meaning a space equipped with a collection of subsets that’s closed under countable intersections, countable unions and complement. Such a collection is called a sigma-algebra and we call the sets in the collection measurable. A measure on a measurable space assigns a number between 0 and +∞ to each measurable set, in such a way that for any countable disjoint union of measurable sets, the measure of their union is the sum of their measures.

A nice way to build a measurable space is to start with a topological space. Then you take its open sets and keep taking countable intersections, countable unions and complements until you get a sigma-algebra. This may take a long time, but if you believe in transfinite induction you’re bound to eventually succeed. The sets in this sigma-algebra are called Borel sets.

A basic result in real analysis is that if you put the usual topology on the real line, and use this to cook up a sigma-algebra as just described, there’s a unique measure on the resulting measurable space that assigns to each interval its usual length. This is called Lebesgue measure.

We don’t care about the metric in this game. So, we use the term Polish space for a topological space that’s homeomorphic to a complete separable metric space. It was the Poles, like Kuratowski, who first got into this stuff.

And often we don’t even care about the topology. So, we use the term standard Borel space for a measurable space whose measurable sets are the Borel sets for some topology making it into a Polish space.

In short: every complete separable metric space has a Polish space as its underlying topological space, and every Polish space has a standard Borel space as its underlying measurable space.

Now, it’s hopeless to classify complete separable metric spaces. It’s even hopeless to classify Polish spaces. But it’s not hopeless to classify standard Borel spaces! The reason is that metric spaces are like diamonds: you can’t bend or stretch them at all without breaking them entirely. But topological spaces are like rubber… and measurable spaces are like dust. So, it’s very hard for two metric spaces to be isomorphic, but it’s easier for their underlying topological spaces – and even easier for their underlying measurable spaces.

For example, the line and plane are isomorphic, if we use their usual sigma-algebras of Borel sets to make them into measurable spaces! And the plane is isomorphic to $\mathbb{R}^n$ for every $n \ge 1$, and all these are isomorphic to a separable Hilbert space! As measurable spaces, that is.

In fact, every standard Borel space is isomorphic to one of these:

• a countable set with its sigma-algebra of all subsets,

• the real line with its sigma-algebra of Borel subsets.

That’s pretty amazing. It means that standard Borel spaces are classified by just their cardinality, which can only be finite, countably infinite, or the cardinality of the continuum. The "continuum hypothesis" says there’s no cardinality between the countably infinite one and the cardinality of the continuum–but we don’t need the continuum hypothesis to prove this result.

I’d still like to better understand the appearance of the Legendre transform.

The Legendre transform is analogous to the Fourier transform. We use the Fourier transform when we want to write a function as a sum of pieces that transform by multiplication when translated (i.e. exponentials). We can use the Legendre transform when we want to write a convex function as the max of a bunch of functions that transform by addition when translated (i.e. linear functions). So why does it appear here?

In classical mechanics the Legendre transform appears as the limit of Fourier transforms that arise in quantum mechanics. I think, but I’m not sure, that the Legendre transforms in thermodynamics arise from the limit of Laplace transforms that come from convolving probability distributions (along the lines of something I G+’ed a while back).

Stephen Lavelle wrote:

Just goes to show how fundamentally unimaginative humans really are ;)

John Baez wrote:

Allen Knutson wrote:

“Have you looked into Erik Verlinde’s entropic forces paper? I don’t think it’s the same thing, but it must be mentioned here. http://arxiv.org/pdf/1003.1015v1”

I could never make much sense of that paper, but while writing my next blog article the phrase ‘entropic force’ did come to mind, because I’ll talk about the principle of maximum entropy S in thermostatics versus the principle of minimum potential energy V in classical statics, and dV = -F plays a similar role to the role played by dS in this blog article. So yeah, something may be going on here.

John Baez wrote:

Dan Piponi: I understand what people are talking about when they say the Legendre transform is ‘T → 0 limit’ of the Laplace transform; James Dolan explained this stuff to me, and I used to talk about it a lot with David Corfield on the n-Cafe.

In a nutshell, it all boils down to how the ‘tropical rig’, namely the numbers from 0 to +infinity with min as addition and + as multiplication, can be seen as a T → 0 limit of the usual ring of real numbers, where we conjugate the usual operations by a T-dependent function. We can formulate thermostatics as matrix mechanics using the real numbers; then classical statics arises in the T → 0 limit. Similarly, after a Wick rotation, we can see classical mechanics as an h → 0 limit of quantum mechanics. In thermostatics the Laplace transform reigns supreme; in quantum mechanics this becomes the Fourier transform, and in the T → 0 or h → 0 limit these reduce to the Legendre transform.
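The T → 0 limit sketched above can be tried numerically. The snippet below is only an illustration under stated assumptions: it conjugates ordinary addition by the map v → exp(−v/T), which is one common choice of T-dependent function, and the helper name `soft_min`, the grid, and the test function are hypothetical. It shows the deformed sum approaching min, and a discretized Laplace-type sum approaching a Legendre-type minimum.

```python
import math


def soft_min(values, T):
    """Return -T * log(sum(exp(-v / T) for v in values)).

    This is ordinary addition conjugated by v -> exp(-v / T). The minimum
    is factored out first so the exponentials cannot overflow.
    """
    m = min(values)
    return m - T * math.log(sum(math.exp(-(v - m) / T) for v in values))


# Deformed "addition" of 2 and 3 approaches min(2, 3) as T shrinks.
for T in (1.0, 0.1, 0.01):
    print(T, soft_min([2.0, 3.0], T))

# Deformed "multiplication" is exactly +: -T*log(exp(-a/T) * exp(-b/T)) = a + b.

# A discretized Laplace-type sum tending to a Legendre-type minimum.
# f(x) = x**2 / 2 and p = 1 are hypothetical choices; the minimum over x
# of f(x) - p*x is -p**2 / 2.
grid = [i / 100.0 for i in range(-500, 501)]
p = 1.0
values = [0.5 * x * x - p * x for x in grid]
for T in (1e-2, 1e-3, 1e-4):
    print(T, soft_min(values, T), min(values))
```

The gap between the two printed columns in the last loop shrinks with T, which is the numerical shadow of the Laplace transform reducing to the Legendre transform.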

I’m not really explaining anything, just sketching the ideas. However, this circle of ideas sits at an odd angle to what I’m thinking about now: the role of the Legendre transform in thermostatics! I think that’s what’s puzzling you, and I think that’s what’s puzzling David Corfield in his comment here:

John Baez: if that Wikipedia article is missing something important, why not add it? :)

Cliff Harvey wrote:

By the way, on the subject of the Legendre transform, here’s a nice document I came across a while back, which includes a discussion of the applications to thermodynamics (though it may be a bit basic if you’re a pro): http://arxiv.org/abs/0806.1147v2 “Making Sense of the Legendre Transform”.

Kazimierz Kurz wrote:

But what you are writing is a long-known fact. I read about it more than 20 years ago in Jamiołkowski, Ingarden, Mrugała “Fizyka statystyczna i termodynamika” (in translation from Polish, “Statistical Physics and Thermodynamics”) from 1990. They explain how to use Pfaff forms, differential geometry and contact forms to define thermodynamics in the same way as classical mechanics. They use differential forms, Legendre manifolds and even “thermodynamical brackets”, which correspond to Poisson brackets from classical Hamiltonian dynamics. It is quite a popular book in Poland, although explaining thermodynamics in this way is not very common. I do not know if this book has an English edition. Unfortunately Google Books shows only the cover of this book, http://books.google.pl/books/about/Fizyka_statystyczna_i_termodynamika.html?id=ADzZAQAACAAJ&redir_esc=y, and you cannot look inside.

I do not have a subscription to these sites, so I do not know what is inside, but the authors and titles may point to something interesting here.

Dan Piponi wrote:

Cliff Harvey: I’ve read that article, but it really only convinced me that there’s still a need for someone to write an article that makes sense of the appearance of the Legendre transform in physics. I’ve a hunch John Baez is the man for the job!

Ekaropolus Van Gor wrote:

There is a “new” formalism developed by Hernando Quevedo which uses a Riemannian metric and identifies thermodynamic interaction with curvature on a manifold. Here is the article:

If that Wikipedia article is missing something important, why not add it?

Maybe I will. I sometimes do. It could easily slip into a full-time job for me. If I could make duplicate copies of myself, one of them would definitely enjoy doing this all day.

John Baez wrote:

Kazimierz Kurz – Thanks! It indeed seemed likely that someone must have discovered this already – it’s too easy, and too important. That’s why I asked if someone knew about it. But I’d never encountered it in all my studies. If it were only available in an out-of-print book in Polish, that might explain this. But Blake Stacey pointed out that it’s also explained here:

Abstract: We note that equations of state—by which we mean identical relations among the thermodynamic variables characterizing a system—are actually first‐order partial differential equations for a function which defines the thermodynamics of the system. Like the Hamilton‐Jacobi equation, such equations can be solved along trajectories given by Hamilton’s equations, the trajectories being quasistatic processes which obey the given equation of state. This gives rise to the notion of thermodynamic functions as infinitesimal generators of quasistatic processes, with a natural Poisson bracket formulation. This formulation of thermodynamic transformations is invariant under canonical coordinate transformations, just as classical mechanics is, which is to say that thermodynamics and classical mechanics have the same formal structure, namely a symplectic structure.

I haven’t read this article yet, but I’ll be interested to see if it refers to Jamiołkowski, Ingarden, Mrugała’s Fizyka Statystyczna i Termodynamika or any of the other papers you mention.

John Baez wrote:

Cliff Harvey – thanks! I read and enjoyed “Making sense of the Legendre transform” and talked about it in “week289” back when I was starting to become obsessed with these issues:

If you read this you’ll see a big analogy between lots of different theories that have a pair of variables analogous to “p” and “q” in classical mechanics. And you’ll see this plea:

I’ve spent decades thinking about the Legendre transform in the context of classical mechanics, but not so much in thermodynamics. I think its appearance in both subjects should be a big part of the analogy I’m talking about here. But if anyone knows a clear, detailed treatment of the analogy between classical mechanics and thermodynamics, focusing on the Legendre transform, please let me know!

Unfortunately nobody pointed out a reference back then, so I had to figure it out myself! But it was fun to do, so I don’t really mind.

Bernard Beard wrote:

Ever study finance, John? It made a lot more sense to me once I understood that the net present value of a future cash flow is simply one particular value of the Laplace transform of that cash flow, the value corresponding to the selected discount rate.

John Baez wrote:

Bernard Beard – Yes, that’s a great example. I think my friend James Dolan pointed this out to me! For a long time the Laplace transform just seemed like a low-budget version of the Fourier transform for people who didn’t understand complex numbers (there are lots of problems you can solve using either one), but then I started meeting examples where it’s clearly the right thing to think about.
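The net-present-value remark can be checked in a few lines. This is a minimal sketch assuming a cash flow made of point payments at times measured in years and annual compounding; the function names and the sample numbers are hypothetical. The only real content is that discounting at annual rate r matches the Laplace transform evaluated at s = ln(1 + r).

```python
import math


def laplace_transform_of_cash_flow(cash_flows, s):
    """Sum of amount * exp(-s * t) over (t, amount) pairs.

    This is the Laplace transform, evaluated at s, of a cash flow
    consisting of point payments `amount` at times `t`.
    """
    return sum(amount * math.exp(-s * t) for t, amount in cash_flows)


def net_present_value(cash_flows, annual_rate):
    """Textbook NPV with annual compounding: sum of amount / (1 + r)**t."""
    return sum(amount / (1.0 + annual_rate) ** t for t, amount in cash_flows)


# Hypothetical example: pay 100 now, receive 60 after one and two years.
cash_flows = [(0.0, -100.0), (1.0, 60.0), (2.0, 60.0)]
rate = 0.05
s = math.log(1.0 + rate)  # continuously compounded equivalent of 5% per year

print(net_present_value(cash_flows, rate))
print(laplace_transform_of_cash_flow(cash_flows, s))
```

Changing the discount rate just moves the point s at which the same transform is evaluated.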

An upcoming topic there is Gell-Mann’s concept of complexity/simplicity referred to as plectics. This is interesting because it may fit into the Occam’s razor bucket of parsimony and is the root term in symplectic geometry.

The climate scientists are interested in this general topic because it could help make headway into complex general circulation models (GCM’s).

V.I. Arnold once wrote something along the lines of: the reason why thermodynamics is hard is that it is naturally formulated on an odd-dimensional phase space. I.e. contact geometry as opposed to symplectic geometry. This makes me a bit suspicious of such a direct analogy between thermodynamics and classical mechanics.

On the other hand, quantum statistical mechanics does seem to be relatable to classical mechanics. Since there is always an irrelevant overall complex phase in QM, it is natural to consider the projective Hilbert space. Brody and Hughston showed that the projective Hilbert space is a symplectic manifold. The paper is “Geometrization of statistical mechanics”, Proc. R. Soc. Lond. A (1999) 455, 1683–1715.

V.I. Arnold once wrote something along the lines of: the reason why thermodynamics is hard is that it is naturally formulated on an odd-dimensional phase space. I.e. contact geometry as opposed to symplectic geometry.

Well, I like contact geometry too, and I’d be very interested to see what Arnol’d was actually talking about. What’s the simplest example he gives of an odd-dimensional phase space in thermodynamics?

As you can see from my post, thermodynamics works quite nicely on an even-dimensional phase space. But that doesn’t necessarily conflict with the odd-dimensional approach. Indeed, classical mechanics can be formulated nicely on either an even-dimensional symplectic manifold or an odd-dimensional contact manifold. So, we can use the analogy between classical mechanics and thermodynamics, described here, to cook up an odd-dimensional phase space for thermodynamics. In the example I’m talking about in this article, I think that would have and as coordinates.

Here’s another nice paper emphasizing the fact that a projectivized Hilbert space is a symplectic manifold:

Abstract: States of a quantum mechanical system are represented by rays in a complex Hilbert space. The space of rays has, naturally, the structure of a Kähler manifold. This leads to a geometrical formulation of the postulates of quantum mechanics which, although equivalent to the standard algebraic formulation, has a very different appearance. In particular, states are now represented by points of a symplectic manifold (which happens to have, in addition, a compatible Riemannian metric), observables are represented by certain real-valued functions on this space and the Schrödinger evolution is captured by the symplectic flow generated by a Hamiltonian function. There is thus a remarkable similarity with the standard symplectic formulation of classical mechanics. Features, such as uncertainties and state vector reductions, which are specific to quantum mechanics can also be formulated geometrically but now refer to the Riemannian metric, a structure which is absent in classical mechanics. The geometrical formulation sheds considerable light on a number of issues such as the second quantization procedure, the role of coherent states in semi-classical considerations and the WKB approximation. More importantly, it suggests generalizations of quantum mechanics. The simplest among these are equivalent to the dynamical generalizations that have appeared in the literature. The geometrical reformulation provides a unified framework to discuss these and to correct a misconception. Finally, it also suggests directions in which more radical generalizations may be found.

Above we see Dan Piponi wondering about the Legendre transform in classical statics versus the Laplace transform in thermal statics, and me responding that the former is a limit of the latter. If we formally make the temperature imaginary we get quantum mechanics and the Fourier transform.

I doubt I’ve said anything original here. Classical mechanics is well known to be the limit of quantum mechanics as h → 0, and it’s well known that in this limit we find that occurrences of the semiring (ℝ, +, ×) are replaced by the semiring (ℝ, min, +). But I’ve never seen an article that attempts to describe classical mechanics in terms of repeated inf-convolution, even though this is close to Hamilton’s formulation, and I’ve never seen an article that shows the parallel with the Schrödinger equation in this way. I’m hoping someone will now be able to say to me “I’ve seen that before” and post a relevant link below.

It’s really in thermal statics that we can rigorously understand how the semiring (ℝ, min, +) arises from a 1-parameter deformation of the semiring (ℝ, +, ×). The parameter is temperature, T. A further ‘Wick rotation’, formally making the temperature imaginary, gives us quantum mechanics.

I’m not sure these free online books are exactly what Dan wants, but it’s probably buried in them somewhere:

These were the proceedings of the International Workshop on Idempotent and Tropical Mathematics and Problems of Mathematical Physics, held at the Independent University of Moscow, Russia in 2007. Litvinov has also written other good review articles, like this:

My understanding has always been that in thermodynamics the macroscopic quantities come in pairs — one extensive and one intensive — with one singleton extensive quantity left over. For example the pairs are typically taken to be (energy, temperature), (volume, pressure), (particle number, chemical potential), etc. In this description, entropy is the odd one out and the goal of thermodynamics is to write down .

(It is tempting to imagine that time is conjugate to entropy. Second law, anyone?! But as you point out, we are really doing thermostatics, not thermodynamics!)

So I guess I don’t see that odd-dimensional phase spaces need to be cooked up. On the other hand, I never knew that classical mechanics could be formulated on contact manifolds as well!

I took a course at UIUC that taught thermodynamics using differential forms. The course covered various applications of differential forms/differential geometry and thermodynamics was just one of many. I lost my notes and don’t remember the visiting professor’s name, but I remember how excited he was. The presentation was similar to what you have here.

I’ve seen lots of presentations of thermodynamics using differential forms. I’d never seen the analogy to classical mechanics exploited to bring symplectic geometry (or if you like, Poisson brackets) into play in thermodynamics. But you can find it in Peterson’s paper, and also maybe Jamiołkowski, Ingarden, and Mrugała’s book Fizyka Statystyczna i Termodynamika.

Did your professor talk about Poisson brackets in thermodynamics? I’m curious how well-known these are.

Yes. I believe he did, but it was long ago and this stuff was pretty new to me back then. The visiting professor was Eastern European (I believe) and I wouldn’t be surprised if his reference was the latter one you mention.

I showed you last time that in many branches of physics—including classical mechanics and thermodynamics—we can see our task as minimizing or maximizing some function. Today I want to show how we get from that task to symplectic geometry.

Hello Azimuth, I love your article on the deep connections between Hamilton’s equations of classical mechanics and the Maxwell relations of thermodynamics!! Is this in a book somewhere? (I don’t have a printer.) Best regards, Michael Martin, MD. (Just an MD, not a PhD!)

this is an interesting post—i saw it months ago. i collect papers more or less showing equivalences between classical mechanics and thermodynamics/ nonequilibrium stat mech. ( i hadnt even considered maxwells electrodynamics, except that there appear to be at least two lagrangians that will get you there).
there is the well known attempt via fisher information (reviewed critically by streater) and many other more current ones (some just qualitative—eg i noticed the ‘gravity law’ in economics was derived via stat mech by a wilson in the 60’s).
i guess i first saw this type of thing in the onsager-machlup functional—you can get a principle of least action/hamilton jacobi eqn for diffusion (ie a conservative process for what is usually thought of as a nonconservative one).
(there is alot of this stuff out there—it even has now gotten into proc royal society london by several authors, who often dont cite each other).
i cant really figure out if all the approaches are the same (one can even go into e nelson’s approach to quantum theory).
(my niece is in high school in singapore as an aside).

I was surprised to rediscover that the Maxwell relations in thermodynamics are formally identical to Hamilton’s equations in classical mechanics—though in retrospect it’s obvious. Thermodynamics obeys the principle of maximum entropy, while classical mechanics obeys the principle of least action. Wherever there’s an extremal principle, symplectic geometry, and equations like Hamilton’s equations, are sure to follow.

Variational principles are implicit in what I’m doing, even though they’re not required for the calculations you wrote down. All the mathematical structures in this post (and even more, as described in Part 2) show up whenever we try to extremize a smooth function of several variables. In classical mechanics we’re minimizing Hamilton’s principal function (basically the action); in thermodynamics we’re maximizing entropy.

Given this, the interesting puzzle—to me, anyway—is the relation between the principle of least action and the principle of maximum entropy. That’s what Blake Pollard and I investigated in our paper on quantropy. There’s still more to understand, though!

Given that the integral of Hamilton’s principal function over a path seems only to depend on the difference , what is the result of ‘minimising S’? It surely can’t be a path. Or perhaps I’m missing something crucial about how to integrate ?

I think what I don’t understand is what role the HPF plays in this recipe for mechanics. Is it:
1. Use some minimisation procedure to find path
2. Evaluate HPF for that path
3. Decide which represents time, label as , and hence derive Hamilton’s equations
?

In which case I think I was looking for step 1 somewhere in your post, and getting confused.

Incidentally, I find methods like this that put time and space coordinates on an equal footing to be very attractive — as far as I’m concerned, they only differ because they show up in different places in a Lorentzian metric. However, looking it up it seems that this is called ‘non-autonomous mechanics’ and is in general fairly tricky, e.g. I found http://arxiv.org/pdf/math-ph/0604063.pdf

I think what I don’t understand is what role the HPF plays in this recipe for mechanics. Is it:
1. Use some minimisation procedure to find path
2. Evaluate HPF for that path
3. Decide which represents time, label as and hence derive Hamilton’s equations
?

Yes, that’s about right.

In which case I think I was looking for step 1 somewhere in your post, and getting confused.

I explained it here:

Suppose we have a particle on the line. Consider smooth paths where it starts at some fixed position at some fixed time and ends at the point q at the time t. Nature will choose a path with least action, or at least one that’s a stationary point of the action. Let’s assume there’s a unique such path, and that it depends smoothly on q and t. For this to be true, we may need to restrict q and t to a subset of the plane, but that’s okay: go ahead and pick such a subset.

Given q and t in this set, nature will pick the path that’s a stationary point of action; the action of this path is called Hamilton’s principal function and denoted S(q,t). (Beware: this S is not the same as entropy!)

Of course this is just a special case of a more general procedure. For example we could consider a spacetime manifold and fix a point

Suppose we can define a quantity called the ‘action’ for any path . Suppose there is a unique action-minimizing path from to any point . Then given , let be the action of the action-minimizing path from to the point Then

is Hamilton’s principal function, and starting from here we can derive Hamilton’s equations as explained in my blog article.

We can work in arbitrary coordinates and write the coordinates of as We don’t need to choose one coordinate and call it ‘time’: the formalism doesn’t really require that. Hamilton’s equations are really just the trivial statement that

Here I’m assuming that everything is as differentiable as we want it to be, and just to be careful I’m assuming there’s a unique action-minimizing path. The second assumption is false when there are ‘caustics’.
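For a concrete case of this recipe, here is a small numerical sketch. It assumes a free particle of mass m = 1 starting at position 0 at time 0, for which the action of the straight-line path to (q, t) is m q² / (2 t); that formula, the symbol S, the helper names, and the finite-difference step are all illustrative assumptions rather than anything taken from the post. It checks that p = ∂S/∂q and H = −∂S/∂t satisfy H = p²/2m, and that equality of mixed partials gives ∂p/∂t = −∂H/∂q.

```python
def principal_function(q, t, m=1.0, q0=0.0, t0=0.0):
    """Action of the straight-line path of a free particle from (q0, t0) to (q, t)."""
    return m * (q - q0) ** 2 / (2.0 * (t - t0))


def partial(f, point, index, h=1e-4):
    """Central-difference estimate of the partial derivative of f at point."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def momentum(q, t):
    # p = dS/dq with t held fixed
    return partial(principal_function, (q, t), 0)


def energy(q, t):
    # H = -dS/dt with q held fixed
    return -partial(principal_function, (q, t), 1)


q, t = 3.0, 2.0
p = momentum(q, t)
H = energy(q, t)
print(p, H, p * p / 2.0)   # H should agree with p**2 / (2 m) for m = 1

# Equality of mixed partials of S gives dp/dt (fixed q) = -dH/dq (fixed t).
dp_dt = partial(momentum, (q, t), 1)
dH_dq = partial(energy, (q, t), 0)
print(dp_dt, -dH_dq)
```

Nothing here singles out t as special: both derived quantities come from differentiating one function of two variables.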

I suppose the awkward bit would be finding an expression for the action that is also agnostic with respect to the choice of time coordinate. A physical theory with action does seem to give a special role to , while it’s much more elegant to parametrise by arc length as you did and never have to mention time at all.

Sorry, no preview available here. The bug in your post was a subtle one (after you fixed the obvious ones). I fixed the problem by retyping the letters in your LaTeX comments that didn’t parse. Sometimes this happens when people cut-and-paste text that’s in a strange font.

Thanks! It’s interesting how they seem to be claiming to get a metric on their contact manifold using the Fisher information metric. At least that’s how I interpret the first sentence of their abstract:

It has been shown that contact geometry is the proper framework underlying classical thermodynamics and that thermodynamic fluctuations are captured by an additional metric structure related to Fisher’s Information Matrix. In this work we analyze several unaddressed aspects about the application of contact and metric geometry to thermodynamics. We consider here the Thermodynamic Phase Space and start by investigating the role of gauge transformations and Legendre symmetries for metric contact manifolds and their significance in thermodynamics. Then we present a novel mathematical characterization of first order phase transitions as equilibrium processes on the Thermodynamic Phase Space for which the Legendre symmetry is broken. Moreover, we use contact Hamiltonian dynamics to represent thermodynamic processes in a way that resembles the classical Hamiltonian formulation of conservative mechanics and we show that the relevant Hamiltonian coincides with the irreversible entropy production along thermodynamic processes. Therefore, we use such property to give a geometric definition of thermodynamically admissible fluctuations according to the Second Law of thermodynamics. Finally, we show that the length of a curve describing a thermodynamic process measures its entropy production.

1) We should keep in mind that classical mechanics is only an approximation to quantum mechanics, just as classical thermodynamics is only an approximation to statistical mechanics.

The symplectic structure is even more interesting and more useful when applied to the modern (post-classical) versions. Hamilton-Jacobi theory is amusing, but it’s less useful than out-and-out QM, and not significantly simpler.

2) Stat mech is basically the analytic continuation of QM, continued in the direction of imaginary time. This point and its ramifications are discussed in e.g. Feynman and Hibbs Quantum Mechanics and Path Integrals (1965). The classical limit of QM is obtained by the method of stationary phase, whereas the classical limit of stat mech is obtained by the method of steepest descent … so the two subjects are very nearly but not quite identical.

Again: If we’re going to make connections, it is even more interesting and more useful to connect the modern (post-classical) versions.

FWIW note that Planck invented QM as an outgrowth from stat mech … not directly from classical mechanics. So the connections are there, and have been since Day One.

3) Many (but not all) of the familiar formulas of thermodynamics can usefully be translated to the language of differential forms. In many cases all that is required is a re-interpretation of the symbols, leaving the form of the formula unchanged; for instance we interpret dE = T dS – P dV as a vector equation.

I say “not all” formulas because more than a few of the formulas you see in typical thermodynamics books are nonsense. This includes (almost) all expressions involving “dQ” or “dW”. Such things simply do not exist (except in trivial cases). Daniel Schroeder in An Introduction to Thermal Physics (1999) rightly calls them a crime against the laws of mathematics. With a modicum of self-discipline it is straightforward to do thermodynamics without committing such crimes.

Differential forms make thermodynamics simpler and more visually intuitive … and simultaneously more sophisticated, more powerful, and more correct.
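The content of dE = T dS – P dV can be tested numerically on a toy example. The sketch below assumes a monatomic ideal gas with units chosen so that E(S, V) = V^(−2/3) exp(2S/3); this energy function, the helper names, and the finite-difference step are assumptions made for illustration only. It reads off T and P as partial derivatives of E and checks the Maxwell relation that follows from equality of mixed partials.

```python
import math


def internal_energy(S, V):
    """Toy E(S, V) for a monatomic ideal gas, in units where constants drop out."""
    return V ** (-2.0 / 3.0) * math.exp(2.0 * S / 3.0)


def partial(f, point, index, h=1e-4):
    """Central-difference estimate of the partial derivative of f at point."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2.0 * h)


def temperature(S, V):
    # T = dE/dS at fixed V
    return partial(internal_energy, (S, V), 0)


def pressure(S, V):
    # P = -dE/dV at fixed S
    return -partial(internal_energy, (S, V), 1)


S, V = 1.0, 2.0
T = temperature(S, V)
P = pressure(S, V)
print(P * V, T)   # equal for this toy gas (ideal gas law with N k = 1)

# Maxwell relation: dT/dV at fixed S equals -dP/dS at fixed V.
dT_dV = partial(temperature, (S, V), 1)
dP_dS = partial(pressure, (S, V), 0)
print(dT_dV, -dP_dS)
```

The subscripts saying what is held constant show up here as the choice of which argument of `partial` is varied.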

Is the contrast between having QM derive from stat mech and having it from classical mechanics, not really stiffer or steeper than presented here? If you start from the stat mech side isn’t there a sense by which the other view misleads on the materiality or definiteness of the wave function outside any context of identically prepared systems?
