Network Theory (Part 11)

Noether proved lots of theorems, but when people talk about Noether’s theorem, they always seem to mean her result linking symmetries to conserved quantities. Her original result applied to classical mechanics, but today we’d like to present a version that applies to ‘stochastic mechanics’—or in other words, Markov processes.

What’s a Markov process? We’ll say more in a minute—but in plain English, it’s a physical system where something hops around randomly from state to state, where its probability of hopping anywhere depends only on where it is now, not its past history. Markov processes include, as a special case, the stochastic Petri nets we’ve been talking about.

Our stochastic version of Noether’s theorem is modeled on a well-known quantum version. It’s yet another example of how we can exploit the analogy between stochastic mechanics and quantum mechanics. But for now we’ll just present the stochastic version. Next time we’ll compare it to the quantum one.

Markov processes

We should and probably will be more general, but let’s start by considering a finite set of states, say $X$. To describe a Markov process we then need a matrix of real numbers $H = (H_{ij})_{i,j \in X}$. The idea is this: suppose right now our system is in the state $j$. Then the probability of being in some state $i$ changes as time goes by, and $H_{ij}$ is defined to be the time derivative of this probability right now.

So, if $\psi_i(t)$ is the probability of being in the state $i$ at time $t$, we want the master equation to hold:

$$ \frac{d}{dt} \psi_i(t) = \sum_{j \in X} H_{ij} \psi_j(t) $$

This motivates the definition of ‘infinitesimal stochastic’, which we recall from Part 5:

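Definition. Given a finite set $X$, a matrix of real numbers $H = (H_{ij})_{i,j \in X}$ is infinitesimal stochastic if

$$ i \ne j \implies H_{ij} \ge 0 $$

and

$$ \sum_{i \in X} H_{ij} = 0 $$

for all $j \in X$.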

The inequality says that if we start in the state $i$, the probability of being found in some other state, which starts at 0, can’t go down, at least initially. The equation says that the probability of being somewhere or other doesn’t change. Together, these facts imply that:

$$ H_{ii} \le 0 $$

That makes sense: the probability of being in the state $i$, which starts at 1, can’t go up, at least initially.
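The two conditions in the definition are easy to test mechanically. The following is a minimal sketch, not part of the original post: the function name `is_infinitesimal_stochastic` and the 3-state matrix are made up for illustration, and the matrix is stored as a list of rows with `H[i][j]` playing the role of $H_{ij}$.

```python
from fractions import Fraction

def is_infinitesimal_stochastic(H):
    """Check the two defining conditions for a square matrix H (list of rows).

    H[i][j] is read as the rate of hopping from state j to state i,
    so it is each *column* that must sum to zero.
    """
    n = len(H)
    off_diagonal_ok = all(H[i][j] >= 0
                          for i in range(n) for j in range(n) if i != j)
    columns_sum_to_zero = all(sum(H[i][j] for i in range(n)) == 0
                              for j in range(n))
    return off_diagonal_ok and columns_sum_to_zero

# Hypothetical 3-state example; exact fractions avoid rounding issues.
H = [[Fraction(-1), Fraction(1, 2), Fraction(0)],
     [Fraction(1),  Fraction(-1),   Fraction(0)],
     [Fraction(0),  Fraction(1, 2), Fraction(0)]]

print(is_infinitesimal_stochastic(H))             # True
print(all(H[i][i] <= 0 for i in range(len(H))))   # True: the diagonal entries are <= 0
```

The second check is the consequence noted above: the off-diagonal entries of each column are nonnegative and the column sums to zero, so the diagonal entry cannot be positive.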

Using the magic of matrix multiplication, we can rewrite the master equation as follows:

$$ \frac{d}{dt} \psi(t) = H \psi(t) $$

and we can solve it like this:

$$ \psi(t) = \exp(tH) \psi(0) $$

If $H$ is an infinitesimal stochastic operator, we will call $\exp(tH)$ a Markov process, and $H$ its Hamiltonian.

(Actually, most people call $\exp(tH)$ a Markov semigroup, and reserve the term Markov process for another way of looking at the same idea. So, be careful.)
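To see the solution of the master equation in action, here is a rough numerical sketch. It assumes a made-up two-state Hamiltonian and approximates the matrix exponential with a truncated Taylor series, which is adequate for tiny examples but is not how one would do it in serious numerical work. The helper names `mat_mul`, `mat_exp` and `evolve` are illustrative only.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=60):
    """Approximate exp(A) by a truncated Taylor series (fine for small examples)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]            # holds A^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def evolve(H, psi0, t):
    """Return psi(t) = exp(tH) psi(0)."""
    U = mat_exp([[t * h for h in row] for row in H])
    n = len(psi0)
    return [sum(U[i][j] * psi0[j] for j in range(n)) for i in range(n)]

# Hypothetical two-state Hamiltonian: columns sum to zero, off-diagonals >= 0.
H = [[-1.0, 2.0],
     [1.0, -2.0]]
psi0 = [1.0, 0.0]                                # start in state 0 with certainty

for t in (0.0, 0.5, 5.0):
    psi_t = evolve(H, psi0, t)
    print(t, psi_t, sum(psi_t))                  # total probability stays (numerically) 1
```

For this particular $H$ the probabilities settle down to $(2/3, 1/3)$ at large times, which is the distribution annihilated by $H$.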

Noether’s theorem is about ‘conserved quantities’, that is, observables whose expected values don’t change with time. To understand this theorem, you need to know a bit about observables. In stochastic mechanics an observable is simply a function assigning a number $O_i$ to each state $i \in X$.

However, in quantum mechanics we often think of observables as matrices, so it’s nice to do that here, too. It’s easy: we just create a matrix whose diagonal entries are the values of the function $O$. And just to confuse you, we’ll also call this matrix $O$. So:

$$ O_{ij} = \begin{cases} O_i & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases} $$

One advantage of this trick is that it lets us ask whether an observable commutes with the Hamiltonian. Remember, the commutator of matrices is defined by

$$ [O,H] = OH - HO $$

Noether’s theorem will say that $[O,H] = 0$ if and only if $O$ is ‘conserved’ in some sense. What sense? First, recall that a stochastic state is just our fancy name for a probability distribution $\psi$ on the set $X$. Second, the expected value of an observable $O$ in the stochastic state $\psi$ is defined to be

$$ \sum_{i \in X} O_i \psi_i $$

In Part 5 we introduced the notation

$$ \int \phi = \sum_{i \in X} \phi_i $$

for any function $\phi$ on $X$. The reason is that later, when we generalize $X$ from a finite set to a measure space, the sum at right will become an integral over $X$. Indeed, a sum is just a special sort of integral!

Using this notation and the magic of matrix multiplication, we can write the expected value of $O$ in the stochastic state $\psi$ as

$$ \int O \psi $$

We can calculate how this changes in time if $\psi$ obeys the master equation, and we can write the answer using the commutator $[O,H]$:

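Lemma. Suppose $H$ is an infinitesimal stochastic operator and $O$ is an observable. If $\psi(t)$ obeys the master equation, then

$$ \frac{d}{dt} \int O \psi(t) = \int [O,H] \psi(t) $$

Proof. Using the master equation we have

$$ \frac{d}{dt} \int O \psi(t) = \int O \frac{d}{dt} \psi(t) = \int O H \psi(t) \qquad (1) $$

But since $H$ is infinitesimal stochastic,

$$ \sum_{i \in X} H_{ij} = 0 $$

so for any function $\phi$ on $X$ we have

$$ \int H \phi = \sum_{i,j \in X} H_{ij} \phi_j = 0 $$

and in particular

$$ \int H O \psi(t) = 0 \qquad (2) $$

Since $[O,H] = OH - HO$, we conclude from (1) and (2) that

$$ \frac{d}{dt} \int O \psi(t) = \int [O,H] \psi(t) $$

as desired. █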

The commutator doesn’t look like it’s doing much here, since we also have

$$ \frac{d}{dt} \int O \psi(t) = \int O H \psi(t) $$

which is even simpler. But the commutator will become useful when we get to Noether’s theorem!
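The key step in the Lemma is purely algebraic, so it can be checked exactly. The sketch below uses a made-up 3-state Hamiltonian, observable and state (none of them from the post), and writes `integral(M, psi)` for the sum over all entries of the vector obtained by applying the matrix `M` to `psi`.

```python
from fractions import Fraction as Fr

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def integral(M, psi):
    """The 'integral' of M psi: the sum over i of (M psi)_i."""
    n = len(psi)
    return sum(M[i][j] * psi[j] for i in range(n) for j in range(n))

# Hypothetical data: an infinitesimal stochastic H, an observable O, a state psi.
H = [[Fr(-1), Fr(1, 2), Fr(0)],
     [Fr(1),  Fr(-1),   Fr(0)],
     [Fr(0),  Fr(1, 2), Fr(0)]]
O_values = [Fr(3), Fr(5), Fr(7)]
O = [[O_values[i] if i == j else Fr(0) for j in range(3)] for i in range(3)]
psi = [Fr(1, 2), Fr(1, 3), Fr(1, 6)]

OH, HO = mat_mul(O, H), mat_mul(H, O)
commutator = [[OH[i][j] - HO[i][j] for j in range(3)] for i in range(3)]

print(integral(HO, psi))                               # 0: the integral of H O psi vanishes
print(integral(commutator, psi) == integral(OH, psi))  # True
```

Because the columns of `H` sum to zero, the integral of `H` applied to anything vanishes, which is why the commutator and the plain product `OH` give the same rate of change.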

Noether’s theorem

Here’s a version of Noether’s theorem for Markov processes. It says an observable commutes with the Hamiltonian iff the expected values of that observable and its square don’t change as time passes:

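Theorem. Suppose $H$ is an infinitesimal stochastic operator and $O$ is an observable. Then

$$ [O,H] = 0 $$

if and only if

$$ \frac{d}{dt} \int O \psi(t) = 0 $$

and

$$ \frac{d}{dt} \int O^2 \psi(t) = 0 $$

for all $\psi(t)$ obeying the master equation.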

If you know Noether’s theorem from quantum mechanics, you might be surprised that in this version we need not only the observable but also its square to have an unchanging expected value! We’ll explain this, but first let’s prove the theorem.

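Proof. The easy part is showing that if $[O,H] = 0$ then $\frac{d}{dt} \int O \psi(t) = 0$ and $\frac{d}{dt} \int O^2 \psi(t) = 0$. In fact there’s nothing special about these two powers of $O$; we’ll show that

$$ \frac{d}{dt} \int O^n \psi(t) = 0 $$

for all $n$. The point is that since $H$ commutes with $O$, it commutes with all powers of $O$:

$$ [O^n, H] = 0 $$

So, applying the Lemma to the observable $O^n$, we see

$$ \frac{d}{dt} \int O^n \psi(t) = \int [O^n, H] \psi(t) = 0 $$

The backward direction is a bit trickier. We now assume that

$$ \frac{d}{dt} \int O \psi(t) = \frac{d}{dt} \int O^2 \psi(t) = 0 $$

for all solutions $\psi(t)$ of the master equation. This implies

$$ \int O H \psi(t) = \int O^2 H \psi(t) = 0 $$

or since this holds for all solutions,

$$ \sum_{i \in X} O_i H_{ij} = \sum_{i \in X} O_i^2 H_{ij} = 0 \qquad (3) $$

We wish to show that $[O,H] = 0$.

First, recall that we can think of $O$ as a diagonal matrix with:

$$ O_{ij} = \begin{cases} O_i & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases} $$

So, we have

$$ [O,H]_{ij} = \sum_{k \in X} (O_{ik} H_{kj} - H_{ik} O_{kj}) = O_i H_{ij} - H_{ij} O_j = (O_i - O_j) H_{ij} $$

To show this is zero for each pair of elements $i, j \in X$, it suffices to show that when $H_{ij} \ne 0$, then $O_j = O_i$. That is, we need to show that if the system can move from state $j$ to state $i$, then the observable takes the same value on these two states.

In fact, it’s enough to show that this sum is zero for any $j \in X$:

$$ \sum_{i \in X} (O_j - O_i)^2 H_{ij} $$

Why? When $i = j$, $O_j - O_i = 0$, so that term in the sum vanishes. But when $i \ne j$, $(O_j - O_i)^2$ and $H_{ij}$ are both non-negative, the latter because $H$ is infinitesimal stochastic. So if the terms sum to zero, they must each be individually zero. Thus for all $i \ne j$, we have $(O_j - O_i)^2 H_{ij} = 0$. But this means that either $O_i = O_j$ or $H_{ij} = 0$, which is what we need to show.

So, let’s take that sum and expand it:

$$ \sum_{i \in X} (O_j - O_i)^2 H_{ij} = \sum_{i \in X} \left( O_j^2 H_{ij} - 2 O_j O_i H_{ij} + O_i^2 H_{ij} \right) $$

which in turn equals

$$ O_j^2 \sum_{i \in X} H_{ij} - 2 O_j \sum_{i \in X} O_i H_{ij} + \sum_{i \in X} O_i^2 H_{ij} $$

The three terms here are each zero: the first because $H$ is infinitesimal stochastic, and the latter two by equation (3). So, we’re done! █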
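The two quantities that drive the proof, the commutator entries $(O_i - O_j) H_{ij}$ and the sums $\sum_i (O_j - O_i)^2 H_{ij}$, can be computed directly. Here is a small sketch with a made-up 4-state Hamiltonian in which states 0, 1 never mix with states 2, 3; the function names are illustrative, not from the post.

```python
def commutator_entries(O_values, H):
    """Entries of [O, H] for diagonal O: [O, H]_ij = (O_i - O_j) * H_ij."""
    n = len(H)
    return [[(O_values[i] - O_values[j]) * H[i][j] for j in range(n)]
            for i in range(n)]

def squared_difference_sums(O_values, H):
    """For each j, the sum over i of (O_j - O_i)^2 * H_ij used in the proof."""
    n = len(H)
    return [sum((O_values[j] - O_values[i]) ** 2 * H[i][j] for i in range(n))
            for j in range(n)]

def commutes(O_values, H):
    """True when every entry of [O, H] is zero."""
    return all(entry == 0
               for row in commutator_entries(O_values, H) for entry in row)

# Hypothetical 4-state Hamiltonian: states {0, 1} and {2, 3} never mix.
H = [[-1,  2,  0,  0],
     [ 1, -2,  0,  0],
     [ 0,  0, -3,  1],
     [ 0,  0,  3, -1]]

constant_on_blocks = [5, 5, 9, 9]    # same value wherever a hop is possible
not_constant = [5, 6, 9, 9]          # differs across the hop between states 0 and 1

print(commutes(constant_on_blocks, H), squared_difference_sums(constant_on_blocks, H))
# True [0, 0, 0, 0]
print(commutes(not_constant, H), squared_difference_sums(not_constant, H))
# False [1, 2, 0, 0]
```

In both cases the sums vanish exactly when the observable commutes with $H$, matching the argument above.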

Markov chains

So that’s the proof… but why do we need both $O$ and its square to have an expected value that doesn’t change with time to conclude $[O,H] = 0$? There’s an easy counterexample if we leave out the condition involving $O^2$. However, the underlying idea is clearer if we work with Markov chains instead of Markov processes.

In a Markov process, time passes by continuously. In a Markov chain, time comes in discrete steps! We get a Markov process by forming $\exp(tH)$ where $H$ is an infinitesimal stochastic operator. We get a Markov chain by forming the operators $U, U^2, U^3, \dots$ where $U$ is a ‘stochastic operator’. Remember:

Definition. Given a finite set $X$, a matrix of real numbers $U = (U_{ij})_{i,j \in X}$ is stochastic if

$$ U_{ij} \ge 0 $$

for all $i, j \in X$ and

$$ \sum_{i \in X} U_{ij} = 1 $$

for all $j \in X$.

The idea is that $U$ describes a random hop, with $U_{ij}$ being the probability of hopping to the state $i$ if you start at the state $j$. These probabilities are nonnegative and sum to 1.

Any stochastic operator gives rise to a Markov chain $U, U^2, U^3, \dots$. And in case it’s not clear, that’s how we’re defining a Markov chain: the sequence of powers of a stochastic operator. There are other definitions, but they’re equivalent.
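As a quick illustration of this definition, the following sketch builds the first few powers of a made-up two-state stochastic matrix and checks that each power is again stochastic. The names `is_stochastic` and `markov_chain` are hypothetical helpers, and exact fractions are used so the column sums can be compared with 1 exactly.

```python
from fractions import Fraction as Fr

def is_stochastic(U):
    """Entries nonnegative and every column sums to 1 (U[i][j]: hop from j to i)."""
    n = len(U)
    return (all(U[i][j] >= 0 for i in range(n) for j in range(n))
            and all(sum(U[i][j] for i in range(n)) == 1 for j in range(n)))

def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def markov_chain(U, steps):
    """Return the list [U, U^2, ..., U^steps]."""
    powers, current = [], U
    for _ in range(steps):
        powers.append(current)
        current = mat_mul(current, U)
    return powers

# Hypothetical two-state stochastic operator.
U = [[Fr(9, 10), Fr(1, 2)],
     [Fr(1, 10), Fr(1, 2)]]

chain = markov_chain(U, 3)
print(all(is_stochastic(P) for P in chain))   # True: powers of a stochastic matrix are stochastic
```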

We can draw a Markov chain by drawing a bunch of states and arrows labelled by transition probabilities, which are the matrix elements $U_{ij}$:

Here is Noether’s theorem for Markov chains:

Theorem. Suppose $U$ is a stochastic operator and $O$ is an observable. Then

$$ [O,U] = 0 $$

if and only if

$$ \int O U \psi = \int O \psi $$

and

$$ \int O^2 U \psi = \int O^2 \psi $$

for all stochastic states $\psi$.

In other words, an observable commutes with $U$ iff the expected values of that observable and its square don’t change when we evolve our state one time step using $U$.

You can probably prove this theorem by copying the proof for Markov processes:

Puzzle. Prove Noether’s theorem for Markov chains.

But let’s see why we need the condition on the square of the observable! That’s the intriguing part. Here’s a nice little Markov chain:

where we haven’t drawn arrows labelled by 0. So, state 1 has a 50% chance of hopping to state 0 and a 50% chance of hopping to state 2; the other two states just sit there. Now, consider the observable $O$ with

$$ O_i = i $$

It’s easy to check that the expected value of this observable doesn’t change with time:

$$ \int O U \psi = \int O \psi $$

for all $\psi$. The reason, in plain English, is this. Nothing at all happens if you start at states 0 or 2: you just sit there, so the expected value of $O$ doesn’t change. If you start at state 1, the observable equals 1. You then have a 50% chance of going to a state where the observable equals 0 and a 50% chance of going to a state where it equals 2, so its expected value doesn’t change: it still equals 1.

On the other hand, we do not have $[O,U] = 0$ in this example, because we can hop between states where $O$ takes different values. Furthermore,

$$ \int O^2 U \psi \ne \int O^2 \psi $$

After all, if you start at state 1, $O^2$ equals 1 there. You then have a 50% chance of going to a state where $O^2$ equals 0 and a 50% chance of going to a state where it equals 4, so its expected value changes!

So, that’s why $\int O U \psi = \int O \psi$ for all $\psi$ is not enough to guarantee $[O,U] = 0$. The same sort of counterexample works for Markov processes, too.
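The counterexample is small enough to run by hand, but here is a sketch that does the arithmetic exactly. It encodes the chain described in the text (state 1 hops to state 0 or state 2 with probability 1/2 each, while states 0 and 2 stay put) with `U[i][j]` as the probability of hopping to `i` from `j`. The helper names `step` and `expected` are made up for illustration.

```python
from fractions import Fraction as Fr

# U[i][j] = probability of hopping to state i from state j.
U = [[Fr(1), Fr(1, 2), Fr(0)],
     [Fr(0), Fr(0),    Fr(0)],
     [Fr(0), Fr(1, 2), Fr(1)]]
O_values = [0, 1, 2]                     # the observable with O_i = i

def step(U, psi):
    """Evolve the state by one time step: psi -> U psi."""
    n = len(psi)
    return [sum(U[i][j] * psi[j] for j in range(n)) for i in range(n)]

def expected(values, psi):
    """Expected value: the sum over i of values[i] * psi[i]."""
    return sum(v * p for v, p in zip(values, psi))

psi = [Fr(0), Fr(1), Fr(0)]              # start at state 1 with certainty
O_squared = [v * v for v in O_values]

print(expected(O_values, psi), expected(O_values, step(U, psi)))     # 1 1
print(expected(O_squared, psi), expected(O_squared, step(U, psi)))   # 1 2

# [O, U]_ij = (O_i - O_j) * U_ij is nonzero wherever a hop changes the value of O.
commutator = [[(O_values[i] - O_values[j]) * U[i][j] for j in range(3)]
              for i in range(3)]
print(commutator[0][1], commutator[2][1])                            # -1/2 1/2
```

The expected value of $O$ survives the step, the expected value of $O^2$ does not, and the commutator has nonzero entries exactly at the two hops out of state 1.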

Finally, we should add that there’s nothing terribly sacred about the square of the observable. For example, we have:

Theorem. Suppose $H$ is an infinitesimal stochastic operator and $O$ is an observable. Then

$$ [O,H] = 0 $$

if and only if

$$ \frac{d}{dt} \int f(O) \psi(t) = 0 $$

for all smooth $f : \mathbb{R} \to \mathbb{R}$ and all $\psi(t)$ obeying the master equation.

Theorem. Suppose $U$ is a stochastic operator and $O$ is an observable. Then

$$ [O,U] = 0 $$

if and only if

$$ \int f(O) U \psi = \int f(O) \psi $$

for all smooth $f : \mathbb{R} \to \mathbb{R}$ and all stochastic states $\psi$.

These make the ‘forward direction’ of Noether’s theorem stronger… and in fact, the forward direction, while easier, is probably more useful! However, if we ever use Noether’s theorem in the ‘reverse direction’, it might be easier to check a condition involving only $O$ and its square.

1) “the number $H_{ij}$ describes the probability per unit time of hopping from the state $j$ to the state $i$” is not quite correct. $H_{ij}$ is the rate at which the system hops from state $j$ to state $i$. In other words, for an infinitesimal time $dt$ the probability for jumping is $H_{ij}\,dt$. The difference between the two is similar to the difference between the interest rate and the AER (annual equivalent rate, for non-UK readers).

2) “Together, they imply that the probability of staying in the same place goes down: $H_{ii} \le 0$.” is not clear to me. What is this “probability of staying in the same place” and it is going down as what changes? I think it would be clearer to say that $-H_{ii}$ is the rate at which a system in state $i$ leaves that state.

3) “we call $\exp(tH)$ a Markov process”. The term “Markov process” already has a different well-defined mathematical meaning. The group of operators $\exp(tH)$ is often referred to as the “Markov semigroup”. I am happy with the term “Hamiltonian” for the generator of this semigroup.

In principle I like the approach taken here of giving physicists’s quantum mechanical names to probability theory concepts. It makes the theory more accessible to physicists. In that vein I would go even further and use Dirac’s bra-ket notation rather than the integral notation you employ. So, for example, instead of I would write where satisfies . I think the integral notation is unsatisfying to both mathematicians and physicists (physicists will be wondering where the dx went and mathematicians will want to know what measure is used).

“the number $H_{ij}$ describes the probability per unit time of hopping from the state j to the state i” is not quite correct. $H_{ij}$ is the rate at which the system hops from state j to state i. In other words, for an infinitesimal time $dt$ the probability for jumping is $H_{ij}\,dt$.

This issue comes up over and over when I write about these things. I feel I have trouble explaining this concept both accurately and very quickly.

I completely understand the problem with what I said: it might seem like the probability that the system hops from state j to state i in, say, one second. As you know, what I really mean is to take the probability that the system hops from state j to state i in $\Delta t$ seconds, divide it by $\Delta t$, and then take the limit as $\Delta t \to 0$. But that takes a while to say!
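The limiting procedure just described can be watched numerically. The sketch below is only an illustration: it uses a made-up two-state Hamiltonian, a truncated Taylor series for the matrix exponential, and a hypothetical helper `hop_probability` that returns the probability of being in state `i` a time `dt` after starting in state `j`.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(A, terms=40):
    """Approximate exp(A) by a truncated Taylor series (fine for small examples)."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]            # holds A^k / k!
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

# Hypothetical two-state Hamiltonian with H[1][0] = 1.
H = [[-1.0, 2.0],
     [1.0, -2.0]]

def hop_probability(H, i, j, dt):
    """Probability of being in state i after time dt, having started in state j."""
    return mat_exp([[dt * h for h in row] for row in H])[i][j]

for dt in (0.1, 0.01, 0.001):
    print(dt, hop_probability(H, 1, 0, dt) / dt)  # approaches H[1][0] = 1.0 as dt shrinks
```

For any finite `dt` the ratio falls short of the matrix entry, which is the gap between “probability per unit time” and a true instantaneous rate.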

I’m always afraid that calling this quantity “the rate at which the system hops from state j to state i” will confuse people, because this description doesn’t mention probabilities. This rate is a “probabilistic rate”, and the phrase “probabilistic rate” is not part of everyday English. I guess people say things like “on average, a bus comes by every 10 minutes”. But if you say “the rate at which buses come by is 1 per hour”, I bet they won’t guess you mean a probabilistic rate.

I think that leaving out the word “unit” would help: “the number $H_{ij}$ describes the probability per time of hopping from the state j to the state i.”

Or I could say “the number $H_{ij}$ describes the average rate at which the system hops from the state j to the state i.”

What do people think is clearest? This time I’ll add a lengthy precise description, but I don’t want to always have to give such a long description. I want a clear short description that nonexperts can understand.

John, this discussion has been very instructive for me. It opened my eyes to the problems one runs into when one wants to be precise and colloquial at the same time. In particular this is difficult if one wants a blog post to be readable to someone who has not read the previous blog posts in the series.

I would vote for “probabilistic rate”. You are right that this is not part of everyday English. Therefore I suspect it also does not carry any misleading connotations. Most readers will probably just swallow it and the curious ones will be tempted to read your earlier posts with the precise explanations. Initially I had felt that “stochastic rate” might work, but I now realise that the word “stochastic” might sound technical.

One of my books on probability uses “probability intensity of transition from state i to state j”. What you call the Hamiltonian I would call a “rate matrix”, or an “instantaneous rate matrix” if I thought the former was likely to confuse.

“Probability intensity” is an interesting phrase. I’m not sure most people would instantly understand it, but they could learn.

“Rate matrix” is certainly clearer than “Hamiltonian”, so I should mention that in the (dreamt-of) final polished version of these notes. “Hamiltonian” is mainly good for helping physicists see that all this stuff is a lot like quantum mechanics. In quantum mechanics we have

$$ \frac{d}{dt} \psi(t) = -iH \psi(t) $$

while here we have

$$ \frac{d}{dt} \psi(t) = H \psi(t) $$

While we’re comparing conventions, I should add that lots of people prefer

$$ \frac{d}{dt} \psi(t) = -H \psi(t) $$

and this indeed has advantages. But I thought that sticking in a minus sign would seem peculiar to beginners.

“Together, they imply that the probability of staying in the same place goes down: $H_{ii} \le 0$.” is not clear to me. What is this “probability of staying in the same place” and it is going down as what changes?

These comments are really useful, because while you and I both know what I really meant to say, I plan to turn these posts into a paper or book someday, and then it’s important that they be clear.

So, here’s what I mean. The probability of staying in some particular place, say place , is the matrix element

and this goes down as time passes:

But again, the problem comes when I try to say this very quickly and informally but still clearly.

I think it would be clearer to say that $-H_{ii}$ is the rate at which a system in state i leaves that state.

Again, I avoided saying this because this rate is a “probabilistic rate”, a rate of change of probabilities, and what you say here doesn’t make that clear. “The rate at which the probability of staying in the state diminishes” is perhaps more precise—but it sounds stilted, not conversational.

Also, the minus sign looks like it’s inserted ad hoc when we say things this way. What’s uniformly true is that

$$ \exp(tH)_{ij} $$

is the probability of hopping from state $j$ to state $i$ after time $t$ and

$$ \left. \frac{d}{dt} \exp(tH)_{ij} \right|_{t = 0} = H_{ij} $$

So $H_{ij}$ is the “instantaneous rate of change, at $t = 0$, of the probability of hopping from state $j$ to state $i$ after time $t$.”

But I’d like a way to say this that’s quick, informal, yet clear. Of course I need to explain this idea clearly and patiently somewhere. But then there will be times I need to remind people of it—and those reminders should be terse but not misleading.

Thank you John, that was helpful. While reading that sentence that gave me difficulties I had not realised that the probability you were talking about was given as a simple exponential and therefore I had not made the connection between the decrease in that probability and the sign of $H_{ii}$.

I know I am in a pedantic mood. But I think being pedantic is fun. So here I go again. There is a difference between two probabilities. $\exp(tH)_{ii}$ gives the probability of _being_ in state $i$ at time $t$ given that we start in state $i$ at time $0$. The probability of staying (in the sense of never leaving) in state $i$ until at least time $t$ is $\exp(tH_{ii})$. Luckily they both have the same derivative at $t = 0$, so they both go down at $t = 0$. The probability of staying has the added benefit of going down also at $t > 0$. The probability of being in the state on the other hand could conceivably go up again later. So it is good that you chose to talk of the probability of staying.
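To make the distinction between “being” and “staying” concrete, here is a small worked example with a two-state Hamiltonian chosen purely for illustration (it does not come from the discussion above). Take

$$ H = \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad \exp(tH) = \frac{1}{2} \begin{pmatrix} 1 + e^{-2t} & 1 - e^{-2t} \\ 1 - e^{-2t} & 1 + e^{-2t} \end{pmatrix} $$

Starting in the first state, the probability of being there at time $t$ is the matrix element

$$ \exp(tH)_{11} = \frac{1}{2} \left( 1 + e^{-2t} \right) $$

while the probability of never having left by time $t$ is

$$ \exp(t H_{11}) = e^{-t} $$

Both have derivative $-1 = H_{11}$ at $t = 0$, but they differ for $t > 0$:

$$ \exp(tH)_{11} - \exp(t H_{11}) = \frac{1}{2} \left( 1 - e^{-t} \right)^2 \ge 0 $$

The difference is the probability of having left the state and come back. In this particular example the probability of being in the state happens to decrease all the way down to $1/2$; with more states it need not be monotonic.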

“we call $\exp(tH)$ a Markov process”. The term “Markov process” already has a different well-defined mathematical meaning. The group of operators $\exp(tH)$ is often referred to as the “Markov semigroup”.

Hmm, when I read the definition of Markov process, it sounds like a long-winded way of describing a Markov semigroup. Isn’t there a one-to-one correspondence between Markov processes and Markov semigroups? If there is, I can just insert a little note saying that I’m abusing language a bit.

(I’m a bit of a radical, I’m afraid: I think the world needs people who try using terminology in new ways… as long as they define it. Such people are nuisances, I know. But they provide the lubrication needed to eventually find the optimal terminology: otherwise things get locked in place at suboptimal local maxima.)

I am happy with the term “Hamiltonian” for the generator of this semigroup.

Good, because I want you to be happy, and that’s what I’m going to use.

I would go even further and use Dirac’s bra-ket notation rather than the integral notation you employ.

I can’t do that, because as I explained in Part 5, the fundamental structure here is not the Hilbert space $L^2(X)$ but rather the space $L^1(X)$, which doesn’t have an inner product on it! This is the big philosophical point I’m trying to make throughout these notes. I’ll quote myself and then say a bit more:

Probability versus quantum theory

Suppose we have a system of any kind: physical, chemical, biological, economic, whatever. The system can be in different states. In the simplest sort of model, we say there’s some set $X$ of states, and say that at any moment in time the system is definitely in one of these states. But I want to compare two other options:

• In a probabilistic model, we may instead say that the system has a probability $\psi(x)$ of being in any state $x \in X$. These probabilities are nonnegative real numbers with

$$ \sum_{x \in X} \psi(x) = 1 $$

• In a quantum model, we may instead say that the system has an amplitude $\psi(x)$ of being in any state $x \in X$. These amplitudes are complex numbers with

$$ \sum_{x \in X} |\psi(x)|^2 = 1 $$

Probabilities and amplitudes are similar yet strangely different. Of course given an amplitude we can get a probability by taking its absolute value and squaring it. This is a vital bridge from quantum theory to probability theory. Today, however, I don’t want to focus on the bridges, but rather the parallels between these theories.

We often want to replace the sums above by integrals. For that we need to replace our set $X$ by a measure space, which is a set equipped with enough structure that you can integrate real or complex functions defined on it. Well, at least you can integrate so-called ‘integrable’ functions, but I’ll neglect all issues of analytical rigor here. Then:

• In a probabilistic model, the system has a probability distribution $\psi : X \to \mathbb{R}$, which obeys $\psi \ge 0$ and

$$ \int_X \psi(x) \, dx = 1 $$

• In a quantum model, the system has a wavefunction $\psi : X \to \mathbb{C}$, which obeys

$$ \int_X |\psi(x)|^2 \, dx = 1 $$

In probability theory, we integrate $\psi$ over a set $S \subset X$ to find out the probability that our system’s state is in this set. In quantum theory we integrate $|\psi|^2$ over the set to answer the same question.

We don’t need to think about sums over sets and integrals over measure spaces separately: there’s a way to make any set $X$ into a measure space such that by definition,

$$ \int_X \psi(x) \, dx = \sum_{x \in X} \psi(x) $$

In short, integrals are more general than sums! So, I’ll mainly talk about integrals, until the very end.

In probability theory, we want our probability distributions to be vectors in some vector space. Ditto for wave functions in quantum theory! So, we make up some vector spaces:

• In probability theory, the probability distribution $\psi$ is a vector in the space

$$ L^1(X) = \left\{ \psi : X \to \mathbb{C} \; : \; \int_X |\psi(x)| \, dx < \infty \right\} $$

• In quantum theory, the wavefunction $\psi$ is a vector in the space

$$ L^2(X) = \left\{ \psi : X \to \mathbb{C} \; : \; \int_X |\psi(x)|^2 \, dx < \infty \right\} $$

You may wonder why I defined $L^1(X)$ to consist of complex functions when probability distributions are real. I’m just struggling to make the analogy seem as strong as possible. In fact probability distributions are not just real but nonnegative. We need to say this somewhere… but we can, if we like, start by saying they’re complex-valued functions, but then whisper that they must in fact be nonnegative (and thus real). It’s not the most elegant solution, but that’s what I’ll do for now.

Now:

• The main thing we can do with elements of $L^1(X)$, besides what we can do with vectors in any vector space, is integrate one. This gives a linear map:

$$ \int : L^1(X) \to \mathbb{C} $$

• The main thing we can do with elements of $L^2(X)$, besides the things we can do with vectors in any vector space, is take the inner product of two:

$$ \langle \psi, \phi \rangle = \int_X \overline{\psi}(x) \phi(x) \, dx $$

This gives a map that’s linear in one slot and conjugate-linear in the other:

$$ \langle - , - \rangle : L^2(X) \times L^2(X) \to \mathbb{C} $$

First came probability theory with $L^1(X)$; then came quantum theory with $L^2(X)$. Naive extrapolation would say it’s about time for someone to invent an even more bizarre theory of reality based on $L^3(X)$. In this, you’d have to integrate the product of three wavefunctions to get a number! The math of $L^p$ spaces is already well-developed, so give it a try if you want. I’ll stick to $L^1$ and $L^2$ today.

Privately I often use angle brackets like this:

$$ \langle \psi \rangle $$

to denote the operation I’m publicly calling the integral

$$ \int \psi $$

This heightens the resemblance to Dirac’s bracket notation: quantum mechanics uses $\langle \psi, O \psi \rangle$ for the expected value of an observable, while stochastic mechanics uses $\langle O \psi \rangle$.

However, I’m sure that writing $\langle O \psi \rangle$ for the expected value of an observable $O$ in the state $\psi$ would annoy lots of people. For one thing, lots of people use $\langle O \rangle$, sweeping $\psi$ under the carpet. This is sort of stupid, but it’s completely entrenched.

So, for now I’m using $\int O \psi$ instead. And this has the advantage of having a fairly self-evident meaning: I’m integrating the function $O \psi$ over the space $X$.

Everyone has their own notation and nobody likes anyone else’s. I take that for granted as a condition of life. I don’t want to know why other people hate my notation; I don’t expect them to care why I hate theirs. I prefer to discuss more interesting things. So, this comment will be somewhat grumpy in tone.

I am trying to clarify and exploit the logical relation between probability theory and quantum theory. This involves noting the similarities but also respecting the differences.

I am not at all interested in ‘following the spirit of Dirac’, if that means ‘glossing over mathematical subtleties’. However, I don’t want to scare my readers by introducing too much formalism too soon—especially if I haven’t worked out the details!

So far in this series of posts, I’m pursuing the philosophy that quantum theory is about Hilbert spaces while probability theory is about vector spaces equipped with some other structure. This extra structure is something like that of an integration algebra… but that may not be quite right, so I’d rather not talk about it yet.

So, instead, I’m saying that quantum theory is about $L^2$ while probability theory is about $L^1$. This is easier for everyone to understand.

Given this, I want to write the integral of the function as , rather than trying to artificially force probability theory into looking like quantum theory by writing it as .

There is a certain quaint charm in using an integral sign to denote integration, after all.

But if someone held a gun to my head and forced me to use Dirac notation here, I would write , which at least makes some sense: as you note, we can say we’re pairing the element with the element .

But if we try to understand the relation between quantum theory and probability theory this way, I believe we’ll get quite confused.

Anyway, there are lots of interesting issues to discuss here, but I think it will be easiest if we decouple them from the question of what notation to use.

Do any of these people reflect out loud about what this approach means? It means something like: there’s a god-given ‘default state’ called 1, and the expectation value of an observable in the state is the transition amplitude . But actually it’s weirder than that, since if we have

then the $\psi$ will hardly ever count as a quantum state, since typically

and similarly, unless our measure space is a probability measure space the default state will be neither a stochastic state:

nor a quantum state:

So it’s all very weird. Basically, it ignores the fact that quantum states should have

$$ \int |\psi|^2 = 1 $$

while stochastic states are a very different beast, with

$$ \int \psi = 1 $$

We can get a stochastic state from a quantum state by forming $|\psi|^2$: we all learn about this in school, when people discuss the probability interpretation of the wavefunction.
Conversely (though I never hear anyone talk about this) we can get a quantum state from a stochastic state by forming . But in the approach where we talk about , it seems we are simply pretending a stochastic state is a quantum state, while neglecting all the problems this raises!

Believe me, I’d be fascinated if someone could tell a coherent story about this… I’m not trying to nip a nascent idea in the bud… but so far all my thoughts about this suggest it’s a wrong road.

Does it still count as a “nascent idea” if it’s been around since 1976? :-P

More seriously, I think the main issue is that most of the people involved just weren’t that concerned with quantum-to-classical transitions. If the smallest thing you’re considering is a rabbit, a sand grain or even a clump of cells in a human neocortex, going from a probability distribution to a quantum density matrix or vice versa isn’t a top priority. So, while being able to lift tools out of the quantum toolbox is nice, relating a stochastic description of a system to a quantum description of the same physical system isn’t a goal.

There are two immediately apparent differences from ordinary quantum field theory: first, there is no factor of in the Schrödinger equation (3) — but this is familiar from euclidean formulations of conventional quantum theories; second, the hamiltonian is not hermitian. In many cases it will turn out that, nevertheless, its eigenvalues are real. (Complex eigenvalues correspond to oscillating states which are known to occur in some chemical reactions.) However, the most important difference is one of interpretation: expectation values of observables are not given by , since this would be bilinear, rather than linear, in the probabilities . Instead, for an observable which is diagonal in the occupation number basis, its expectation value is of course

and it is straightforward to show that this may be expressed as

since the state is a left eigenstate of all the , with unit eigenvalue.

Second, I think that the people who study diffusion-limited reactions, active-to-absorbing phase transitions, directed percolation and the like are generally eager to skip past the first steps of defining the formalism and get to a Lagrangian they can play with. A better notation at the beginning may obviate the need for a few awkwardnesses further along (e.g., field redefinitions); I’ll have to look into that. The stuff they seem to spend the most time worrying over comes after they’ve a stochastic Hamiltonian in the coherent-state representation: renormalization, estimating critical exponents, etc.

Does it still count as a “nascent idea” if it’s been around since 1976?

I’d say the mathematical trick has been around since 1976. The nascent idea lurking in this trick is that we can think of a probability distribution as a quantum state if we normalize it in a nonstandard way and promise to only ask about its transition amplitudes to a certain ‘default’ state . Mathematical tricks often conceal ideas that are too strange for people to say in words.

More seriously, I think the main issue is that most of the people involved just weren’t that concerned with quantum-to-classical transitions.

Yes, that’s one part of it. But even if we don’t try to describe the same system both classically and quantumly, there’s also the question of the logical relation between the classical and quantum descriptions: that’s what I’m especially interested in. But this is not the sort of question that ‘practical’ people tend to enjoy—perhaps because they can’t imagine what one might do with the answer.

Second, I think that the people who study diffusion-limited reactions, active-to-absorbing phase transitions, directed percolation and the like are generally eager to skip past the first steps of defining the formalism and get to a Lagrangian they can play with.

Right. For me the murky beginning steps are the most interesting part, because they hint at a relation between quantum mechanics and probability theory that seems a bit different than the ‘obvious’ one, where $|\psi|^2$ rather than the wavefunction $\psi$ acts like a probability distribution. I’ve got a bunch of ideas about this that I’ll reveal as soon as I can.

With Google Chrome and IE, what I see is mathematically correct, but very messy. The slash through the equals sign is too far to the right, and the arrow is made of an equals sign and an arrow which don’t line up.

Thanks, guys! Does anyone see the ≠ as an equal sign? I now think it could be because on this computer I’ve downloaded fonts so that jsmath doesn’t need to grab them from somewhere else. I assume you guys are getting the little message on top, about jsmath?

It shows an extra ‘=’ prepended to the => sign (is this to make a long implication arrow?) In Opera itself the ‘=’ is at a slight angle but when I did a screen grab it came out straight, except that you can see the join, so I have some kind of optical illusion as well.

Anyhow, I am wondering if the extra ‘=’ is being moved around somehow.

Eric Forgy also lured me into thinking about the Schrödinger versus Heisenberg pictures in stochastic mechanics.

So far I’ve been using time-independent observables and letting states evolve in time via

$$ \psi(t) = \exp(tH) \psi $$

This is the Schrödinger picture. However, we may also use time-independent states and let observables evolve in time via

$$ O(t) = O \exp(tH) $$

This is the Heisenberg picture. These pictures are compatible in that we may use either one to compute the expected value of an observable $O$ measured in the state $\psi$ after waiting a time $t$, and we get the same answer:

$$ \int O(t) \psi = \int O \psi(t) $$

In the Schrödinger picture we have the master equation

$$ \frac{d}{dt} \psi(t) = H \psi(t) $$

while in the Heisenberg picture we have

$$ \frac{d}{dt} O(t) = O(t) H $$

This is amusingly different than quantum mechanics. In quantum mechanics we define a time-dependent version of either the state or the observable by setting

$$ \psi(t) = \exp(-itH) \psi \qquad \text{or} \qquad O(t) = \exp(itH) \, O \exp(-itH) $$

There’s nothing like posting something publicly to stir up thoughts that make that post seem ill-considered and rash! I’ve changed my mind a bit about the Heisenberg picture in stochastic mechanics. While nothing I said above seems mathematically incorrect, it’s upsetting that while the product of observables is an observable, we have

$(O O')(t) \ne O(t) O'(t)$

if, as above, we define

$O(t) = O \exp(t H)$

So, suppose $H$ is infinitesimal stochastic. Also suppose our set of states is finite, to avoid subtleties of analysis I’d rather postpone thinking about. Then $\exp(t H)$ is defined for negative as well as positive $t$, and while it’s usually not stochastic for negative times, we have

$\int \exp(t H) \psi = \int \psi$

for both positive and negative times.

Then, we can either use time-independent observables and let states evolve in time via

$\psi(t) = \exp(t H) \psi$

or use time-independent states and let observables evolve in time via

$O(t) = \exp(-t H) O \exp(t H)$

These pictures are compatible in that we may use either one to compute the expected value of the observable $O$ measured in the state $\psi$ after waiting a time $t$, and we get the same answer:

$\int O(t) \psi = \int O \psi(t)$
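The compatibility of the two pictures can be checked numerically. What follows is only a sketch under stated assumptions: the symbols are missing from the text above, so it assumes $\psi(t) = \exp(tH)\psi$ for states and $O(t) = \exp(-tH) O \exp(tH)$ for observables, with observables as diagonal matrices and $\int$ meaning a sum over states. The matrix `H`, the observables and the helper names are all hypothetical, and the matrix exponential is a plain truncated Taylor series rather than a library routine.

```python
# Pure-Python sketch: matrices are lists of rows, observables are diagonal matrices.

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def expm(h, t, terms=40):
    """Approximate exp(t*h) by a truncated Taylor series (fine for small matrices)."""
    n = len(h)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term_k = term_{k-1} * (t*h) / k
        term = matmul(term, [[t * x / k for x in row] for row in h])
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return result

def diag(values):
    n = len(values)
    return [[values[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def total(matrix, psi):
    """The 'integral' of matrix @ psi: the sum of the entries of that vector."""
    return sum(sum(row[j] * psi[j] for j in range(len(psi))) for row in matrix)

# Hypothetical infinitesimal stochastic H: off-diagonals >= 0, columns sum to zero.
H = [[-1.0, 2.0, 0.0],
     [1.0, -3.0, 0.5],
     [0.0, 1.0, -0.5]]
O = diag([1.0, 4.0, 9.0])
O2 = diag([0.0, 1.0, 2.0])
psi = [0.2, 0.5, 0.3]
t = 0.7

U_t, U_minus_t = expm(H, t), expm(H, -t)

def heisenberg_picture(A):
    """Assumed definition: A(t) = exp(-tH) A exp(tH)."""
    return matmul(U_minus_t, matmul(A, U_t))

schrodinger = total(matmul(O, U_t), psi)       # expected value via evolved state
heisenberg = total(heisenberg_picture(O), psi) # expected value via evolved observable
```

With the conjugated definition, `heisenberg_picture(matmul(O, O2))` also agrees with `matmul(heisenberg_picture(O), heisenberg_picture(O2))` up to rounding, which is exactly the product property that the one-sided definition $O \exp(tH)$ lacks.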

I don’t get one thing. You write “That is, we need to show that if the system can move from state j to state i, then the observable takes the same value on these two states.”

So, if I understand well, if the graph is connected, observable O will take the same value on all of the states. Otherwise, it will have different constant values in each component (but I do not consider disconnected graphs as really interesting, as each component is completely independent of another: each process happens on its own).

So, maybe what you proved is not “Noether’s theorem”, but the (still nice) result that

“If a time-independent observable’s average and variance do not vary in time, then the observable is uniform over the vertex set”.

– The fact that first and second moments play the crucial role for Markov processes resonates with the continuous-variable case, where the underlying stochastic processes have Gaussian distributions at each time – the Gaussian has only first and second nonvanishing moments – and something similar happens in Pawula’s theorem for the truncation of the Kramers-Moyal expansion after the second term.

So, if I understand well, if the graph is connected, the observable O will take the same value on all of the states. Otherwise, it will have different constant values in each component…

That’s right. By the way, for people who don’t understand what you said, let me add that you’re taking the points of our set $X$ as the vertices of a directed graph, and drawing an edge from $j$ to $i$ whenever $H_{ij}$ is nonzero.

(but I do not consider disconnected graphs as really interesting, as each component is completely independent of another: each process happens on its own).

Well, you may not consider it interesting, but that’s what a conserved quantity does: it splits the set $X$ into a disjoint union of subsets on which $O$ takes different constant values, and our Markov process then becomes a ‘disjoint union’ of Markov processes on these subsets. It’s exactly like in quantum mechanics, where a conserved quantity splits the Hilbert space up as a direct sum of eigenspaces, and time evolution separately preserves each eigenspace.

Personally I consider this very interesting: this is how conserved quantities let us simplify physics problems! And they arise quite often: for example, in the reversible reaction we considered last time:

the total number of particles of types 1 and 2 is conserved. This explains how from a single Poisson equilibrium state we were able to extract a lot of different equilibrium states in which that number took different values. I’ll work out this example in detail sometime, for people who need a bit of help.
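The splitting described above can be sketched in a few lines. This is an illustration under assumptions, not the construction from the post: it takes `H[i][j]` to be the rate of hopping from state `j` to state `i`, represents an observable by the list of its values on states, and uses a made-up 4-state `H`. All function names are hypothetical.

```python
from collections import defaultdict

def is_conserved(H, values):
    """True if every allowed hop j -> i (H[i][j] != 0) keeps the observable's value."""
    n = len(H)
    return all(H[i][j] == 0 or values[i] == values[j]
               for i in range(n) for j in range(n))

def split_by_value(values):
    """Group state indices by the value the observable takes on them."""
    blocks = defaultdict(list)
    for state, value in enumerate(values):
        blocks[value].append(state)
    return dict(blocks)

def restrict(H, states):
    """The sub-matrix of H describing the process on a subset of states."""
    return [[H[i][j] for j in states] for i in states]

# Hypothetical example: states {0, 1} and {2, 3} never exchange probability.
H = [[-1.0, 2.0, 0.0, 0.0],
     [1.0, -2.0, 0.0, 0.0],
     [0.0, 0.0, -3.0, 0.5],
     [0.0, 0.0, 3.0, -0.5]]
O_values = [5.0, 5.0, 7.0, 7.0]

blocks = split_by_value(O_values)
sub_processes = {value: restrict(H, states) for value, states in blocks.items()}
```

When `is_conserved(H, O_values)` holds, each restricted matrix still has columns summing to zero, so each block is a Markov process in its own right.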

OK, I buy it. But still I prefer the formulation “If a time-independent observable’s average and variance do not vary in time, then the observable is uniform over the vertex set (of a connected graph)”. I think it does have something deep in it related to the key role of the first and second moment for stochastic processes.

(Also because in QM you can build entangled wave functions over factorized subspaces, while here the superposition between probabilities, or populations, is always what one would call a “mixture” in the QM case.)

On the Schrödinger/Heisenberg picture: I’ve seen people use a sort of “interaction picture”, where the Hamiltonian is split into a waiting-time contribution and an interaction Hamiltonian, and these two pieces are then handled separately when exponentiating $H$. It’s very useful for guessing the correct path measure, for example. I myself had a complete discussion of this procedure in my master’s thesis, but it is in Italian. However, I’ve never seen it discussed in relation to the evolution of a conjugate observable $O$. It would be interesting to see what happens if one discharges part of the evolution (the “free” one) on an observable and part (the “interacting” one) on the probability measure itself. Maybe it would make calculations easier.

Hi! Great to see you here again! James Dolan had suggested to me the idea of using an interaction picture of precisely this sort. I’d never seen it before. What I find amusing is that the particle’s probability of staying where it is decays before the particle jumps somewhere else… as if it’s dreaming of the jump before it goes:

This is different than the interaction picture in quantum mechanics, where the free part of the Hamiltonian by itself is already self-adjoint, so that the free evolution is unitary between the ‘jumps’.

This is precisely what I had in mind. I can send you via email a couple of pages from my master’s thesis if you want: they are in Italian, but the formulas are quite clear. Funnily, I don’t have references… I wrote that chapter out of some personal notes of my professor, which didn’t have references either. In the field, it’s like everybody knows about it but nobody knows exactly where it comes from…

“as if it’s dreaming of the jump before it goes”: this is always the effect it has when we project statistical arguments onto the individuals, like when people play long-overdue numbers at the lotteries…

This is precisely what I had in mind. I can send you via email a couple of pages from my master’s thesis if you want: they are in Italian, but the formulas are quite clear.

If it mainly says what we’ve already discussed, I guess I won’t make you bother. I guess this is some sort of ‘folk wisdom’.

“as if it’s dreaming of the jump before it goes”: this is always the effect it has when we project statistical arguments onto the individuals, like when people play long-overdue numbers at the lotteries…

Puzzle. Suppose $U$ is a stochastic operator and $O$ is an observable. Show that $O$ commutes with $U$ iff the expected values of $O$ and its square don’t change when we evolve our state one time step using $U$. In other words, show that

$[O, U] = 0$

if and only if

$\int O U \psi = \int O \psi$

and

$\int O^2 U \psi = \int O^2 \psi$

for all stochastic states $\psi$.

Answer. One direction is easy: if $[O, U] = 0$ then $[O^n, U] = 0$ for all $n$, so

$\int O^n U \psi = \int U O^n \psi = \int O^n \psi$

where in the last step we use the fact that $U$ is stochastic.

For the converse direction we can use the same tricks that worked for Markov processes. Assume that

$\int O U \psi = \int O \psi$

and

$\int O^2 U \psi = \int O^2 \psi$

for all stochastic states $\psi$. These imply that

$\sum_i O_i U_{ij} = O_j$

and

$\sum_i O_i^2 U_{ij} = O_j^2$

We wish to show that $[O, U] = 0$. Note that

$[O, U]_{ij} = (O_i - O_j) U_{ij}$

To show this is always zero, we’ll show that when $U_{ij} \ne 0$, then $O_i = O_j$. This says that when our system can hop from one state to another, the observable must take the same value on these two states.

For this, in turn, it’s enough to show that the following sum vanishes for any $j$:

$\sum_i (O_i - O_j)^2 U_{ij}$

Why? The matrix elements $U_{ij}$ are nonnegative since $U$ is stochastic. Thus the sum can only vanish if each term vanishes, meaning that $O_i = O_j$ whenever $U_{ij} \ne 0$.
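Here is a small numerical sanity check of the puzzle, offered as a sketch rather than a proof. The symbols are not visible in the text above, so the code assumes a stochastic matrix `U` with `U[i][j]` the probability of hopping from state `j` to state `i` (nonnegative entries, columns summing to 1) and an observable given by its list of values `O[i]`. Under those assumptions, expanding the square gives $\sum_i (O_i - O_j)^2 U_{ij} = \sum_i O_i^2 U_{ij} - 2 O_j \sum_i O_i U_{ij} + O_j^2 \sum_i U_{ij}$, which is $O_j^2 - 2 O_j^2 + O_j^2 = 0$ whenever the first two moments are preserved. The matrices and function names below are hypothetical.

```python
def commutator_entry(O, U, i, j):
    """[O, U]_{ij} for a diagonal observable O: equals (O_i - O_j) * U_ij."""
    return (O[i] - O[j]) * U[i][j]

def spread(O, U, j):
    """sum_i (O_i - O_j)^2 U_ij: zero iff O is constant along every hop out of j."""
    return sum((O[i] - O[j]) ** 2 * U[i][j] for i in range(len(O)))

def moments_preserved(O, U, tol=1e-12):
    """Check sum_i O_i U_ij = O_j and sum_i O_i^2 U_ij = O_j^2 for every j."""
    n = len(O)
    return all(
        abs(sum(O[i] ** p * U[i][j] for i in range(n)) - O[j] ** p) < tol
        for j in range(n) for p in (1, 2))

def commutes(O, U, tol=1e-12):
    n = len(O)
    return all(abs(commutator_entry(O, U, i, j)) < tol
               for i in range(n) for j in range(n))

# Hypothetical stochastic matrix: nonnegative entries, each column sums to 1.
U = [[0.9, 0.4, 0.0],
     [0.1, 0.6, 0.0],
     [0.0, 0.0, 1.0]]
O_good = [2.0, 2.0, 5.0]   # constant on {0, 1}, the states that exchange probability
O_bad = [2.0, 3.0, 5.0]    # changes value along the hops between states 0 and 1
```

For `O_good` both `moments_preserved` and `commutes` hold and every `spread` is zero; for `O_bad` both fail, matching the "if and only if" in the puzzle.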

Word spreads fast! Here’s an announcement of a talk at the Oxford OASIS series. That stands for Oxford Advanced Seminar on Informatic Structures.

Dear all,

For this week’s OASIS seminar we have the pleasure of a talk by Harvey Brown, the professor in philosophy of physics at Oxford who is well-known for his work on the foundations of quantum mechanics, relativity theory, and the role of symmetry principles in physics, including several books. Moreover, he is a very clear and entertaining speaker! This Friday he will convince us that we need to take symmetries and their subtleties more seriously.

Time and place: This Friday, 2pm, Lecture Theatre B, Department of Computer Science.

Title: Noether’s famous 1918 symmetry theorem — what does it prove?

Abstract: Recently, Brendan Fong and John Baez have provided an analogue in stochastic mechanics to what they call Noether’s theorem in quantum mechanics. Noether’s original theorem, relating symmetries and conservation principles, was the first in a series of theorems she proved in 1918 within a program in the calculus of variations, inspired by interpretational problems related to conservation laws in general relativity. I will sketch the background to Noether’s work and give special emphasis to the form and meaning of her “first” theorem. An unusual application of the theorem to quantum mechanics will be exploited.

Philosophers of physics being as they are, the phrase “what they call” makes me afraid he’s planning to chide me for using the term “Noether’s theorem” in a very extended sense, not very close to that of her original 1918 paper. Physicists being as they are, such chiding wouldn’t stop me. But I’m curious to hear what he actually says. The talk will be videotaped and put on the OASIS website. Furthermore, Brendan is now at Oxford and can hear the talk in person!

The functional calculus allows you to apply any function to any self-adjoint matrix (and thus any self-adjoint operator on a finite-dimensional Hilbert space), or any holomorphic function to any matrix (and thus to any linear operator on a finite-dimensional space).

Also, when we have $f(O)$, does $f$ being smooth mean that it can be expanded as a power series in $O$?
