Tuesday, May 23, 2006

Quantum Mechanics and Probability Theory

I've been spending a bit of time thinking about quantum mechanics as an 'exotic' probability theory. It's actually very easy to view quantum mechanics this way and what interests me more is figuring out exactly how quantum mechanics differs from conventional probability theory.

Quantum mechanics looks just like probability theory with a slight twist. Instead of assigning probabilities to events you assign 'amplitudes', which are complex numbers. Computations with these amplitudes are carried out very similarly to computations with probabilities. If we use a() to mean 'amplitude' then a(A or B)=a(A)+a(B) and a(A and B)=a(A)a(B) under similar conditions to those in which P(A or B)=P(A)+P(B) and P(A and B)=P(A)P(B) in conventional probability theory. But these rules for a() only apply while the events A and B remain 'unobserved'. When we actually make observations we switch to a different rule: we now assign probabilities using P(A)=|a(A)|², and these probabilities can be given your favourite probabilistic interpretation.

One interesting thing we can do is recast conventional probability theory so that it looks more like quantum mechanics. For example, consider coin tossing with two outcomes, H and T. We can use these states as the basis for a 2D vector space and write them as |H> and |T>. The outcome of a fair coin toss can be written as 0.5|H>+0.5|T>. When we come to 'observe' the coin its state vector 'collapses' to |H> or |T>. Suppose we have two such coins. Then the joint state space is the tensor product of the state spaces for individual coins. The rule for combining two independent coins into a joint state is the ordinary tensor product. So combining two fair coins gives the state (0.5|H>+0.5|T>)⊗(0.5|H>+0.5|T>)=0.25|H>|H>+0.25|H>|T>+0.25|T>|H>+0.25|T>|T> which can be interpreted as giving the usual probability assignments for fair coin tossing. If we couple the coins together in some way then the joint state might not be a product of individual coin states, i.e. we might no longer have independence and their states would be 'entangled'.
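Here is a minimal sketch of the two-coin construction in Python, using only the standard library. The dict representation and the names tensor and is_product are my own choices for illustration, not part of any standard formalism. A joint state of two coins is a product state exactly when its 2x2 table of weights has zero determinant, which gives a quick test for classical 'entanglement'.

```python
from itertools import product

def tensor(u, v):
    """Tensor product of two coin states given as dicts {label: weight}."""
    return {a + b: u[a] * v[b] for a, b in product(u, v)}

def is_product(state, tol=1e-12):
    """True if a two-coin state factors into independent single-coin states.

    For a 2x2 table of weights this is equivalent to a zero determinant.
    """
    det = state["HH"] * state["TT"] - state["HT"] * state["TH"]
    return abs(det) < tol

fair = {"H": 0.5, "T": 0.5}

# Two independent fair coins: every joint outcome has weight 0.25.
joint = tensor(fair, fair)

# Two coins coupled so that they always agree: not a product state.
correlated = {"HH": 0.5, "HT": 0.0, "TH": 0.0, "TT": 0.5}

print(joint)
print(is_product(joint), is_product(correlated))
```

The correlated state is a perfectly ordinary joint distribution, yet it cannot be written as a tensor product of two single-coin states.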

Consider a physical process applied to the tossed coin. For example suppose we have a machine that always turns a head over to reveal a tail but only has a 50% chance of turning over a tail. We can think of this as a linear operator mapping |H>→|T> and |T>→0.5|H>+0.5|T>. In this formalism any physical process must be a linear operator given by a stochastic matrix.
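As a sketch, the coin-flipping machine can be written as a column-stochastic matrix acting on the vector of weights (ordered H, T). The helper names apply and is_stochastic are mine, chosen only for this example.

```python
# Columns are the images of |H> and |T>; rows are the weights on H and T.
# |H> -> |T>,  |T> -> 0.5|H> + 0.5|T>
M = [
    [0.0, 0.5],  # weight ending up on H
    [1.0, 0.5],  # weight ending up on T
]

def apply(matrix, state):
    """Apply a linear operator (list of rows) to a state vector."""
    return [sum(row[j] * state[j] for j in range(len(state))) for row in matrix]

def is_stochastic(matrix, tol=1e-12):
    """Every entry is non-negative and every column sums to one."""
    cols = list(zip(*matrix))
    return all(x >= 0 for col in cols for x in col) and all(
        abs(sum(col) - 1.0) < tol for col in cols
    )

fair = [0.5, 0.5]
after = apply(M, fair)  # 0.25 on H, 0.75 on T

print(is_stochastic(M), after)
```

Because each column sums to one, the machine maps probability vectors to probability vectors, which is the classical counterpart of unitarity.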

Anyone who has studied a little quantum mechanics will recognise many of the phenomena above: the tensorial nature of joint systems, the linearity, the entanglement, the 'collapse' of the state on observation and so on as being nearly identical to features of quantum mechanics. So contrary to popular opinion I believe that whatever is interesting about quantum mechanics has nothing to do with any of these features. You can even shoe-horn a variant of the many worlds 'interpretation' of quantum mechanics into probability theory by insisting on the primacy of the state vector and pointing out that when a coin is observed the observation is described by a linear map defined by |H>→|H>|Observer sees H> and |T>→|T>|Observer sees T>.

But almost everyone agrees that quantum mechanics is weird. So how does it really differ from probability theory?

The biggest difference is the feature called 'destructive interference'. In conventional probability theory we can combine two states together to make a new state. For example suppose process A generates state |A> and process B generates state |B>. Then we can choose to run A with probability p and B with probability 1-p, resulting in a new state |C>=p|A>+(1-p)|B>. Suppose A and B are coin states. Then the probability of getting a head when observing state |C> is bounded between the probabilities of getting a head in states A and B. This is a kind of convexity condition. If processes A and B are both roughly fair there's no way of choosing p to 'refine' these states to make another one that's more unfair. And here's the big difference from quantum mechanics. In QM we can choose p to be negative (or any other complex number) and escape from this convexity condition. If we have a quantum coin toss, say |A>=0.6|H>+0.8|T> and |B>=0.8|H>+0.6|T> (in QM the sums of the squares of the moduli are one, not the sums themselves, that's why I chose 0.6 and 0.8), then we can combine them into |C>=-15/7|A>+20/7|B> and get a pure |H> state. We have made the |T> terms cancel out. This is an example of destructive interference. (How do we actually carry out this linear map for arbitrary complex numbers? It's actually very easy - see below.)
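The cancellation is easy to check numerically. The sketch below solves the 2x2 linear system x|A>+y|B>=|H> by Cramer's rule; the function names combine_to_head and classical_head_probability are hypothetical helpers of mine. For the 0.6/0.8 states the coefficients come out in the ratio -3 : 4, and any common rescaling of them still gives a state proportional to |H>. For contrast, the classical mixture is also shown, using probabilities 0.36 and 0.64 as assumed classical analogues of the two coins.

```python
def combine_to_head(a, b):
    """Return (x, y) such that x*a + y*b == (1, 0) for 2-component states."""
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("states are parallel; no solution")
    # Cramer's rule for the system [[a0, b0], [a1, b1]] (x, y)^T = (1, 0)^T
    return b[1] / det, -a[1] / det

def classical_head_probability(p, head_a, head_b):
    """Head probability when running A with probability p and B otherwise."""
    return p * head_a + (1 - p) * head_b

A = (0.6, 0.8)  # amplitudes on |H>, |T>
B = (0.8, 0.6)

x, y = combine_to_head(A, B)
C = (x * A[0] + y * B[0], x * A[1] + y * B[1])
print(x, y, C)  # the |T> component of C cancels

# Classically, no p in [0, 1] escapes the interval between the two coins.
mixes = [classical_head_probability(k / 10, 0.36, 0.64) for k in range(11)]
print(min(mixes), max(mixes))
```

The quantum combination needs one negative coefficient, which is exactly what a stochastic mixture forbids.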

One very curious consequence of the above is that we can't ignore low amplitude outcomes. Classically, suppose |A>=(1-e)|H>+e|T> and |B>=(1-f)|H>+f|T>. If e and f are small enough then no matter what probabilistic mixture of |A> and |B> we use to make |C> we can choose to ignore the possibility that we have a tail. But in QM, suppose |A>=sqrt(1-|e|²)|H>+e|T> and |B>=sqrt(1-|f|²)|H>+f|T>. Then by carefully crafting |C> from |A> and |B> we can make the probability of getting tails as high as we like.
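A small numerical sketch of this, with e=0.01 and f=0.02 picked arbitrarily as the small weights on |T>. Classically the rare outcome stays rare under any mixture; with amplitudes we can cancel the |H> components completely and, after renormalising, the rare outcome becomes certain. The helper normalise is my own.

```python
import math

def normalise(v):
    """Rescale a vector of amplitudes so the squared moduli sum to one."""
    n = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / n for c in v]

e, f = 0.01, 0.02  # illustrative small values, not from any experiment

# Classical: weights are probabilities, mixtures use p in [0, 1].
classical_tail = [p / 100 * e + (1 - p / 100) * f for p in range(101)]
classical_max = max(classical_tail)  # never exceeds max(e, f)

# Quantum: weights are amplitudes, coefficients may be negative.
A = [math.sqrt(1 - e ** 2), e]
B = [math.sqrt(1 - f ** 2), f]
x, y = B[0], -A[0]  # chosen so the |H> components cancel
C = normalise([x * A[0] + y * B[0], x * A[1] + y * B[1]])
quantum_tail = abs(C[1]) ** 2

print(classical_max, quantum_tail)
```

The two input states differ only slightly, and it is precisely that small difference which survives the cancellation.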

It's destructive interference that underlies many of the interesting phenomena of QM. For example Shor's factorisation algorithm exploits destructive interference to remove those parts of the quantum computer state that give the wrong answer leaving us with an output that represents a factor.

Non-convexity also leads to other significant consequences. In conventional probability the basis elements |H> and |T> are 'special'. They lie at the extreme 'corners' of the space of possible states and can't be written as stochastic combinations of any other states. But for a quantum coin toss we've shown above that there is no special basis. The state |H> can be written as a linear combination of |A> and |B> just as easily as |A> can be written in terms of |H> and |T> and this is reflected in the fact that we can make a machine that combines |A> and |B> states to make an |H> state. In fact, this is straightforward to observe in the lab. Instead of considering |H> and |T> consider |+> and |-> which correspond to spin up or down states for an electron. Amazingly, if you rotate the |+> state through 90 degrees to make a "spin left" state you might get something like (|+>-i|->)/√2. (That answers the question I asked above, just rotate the system.) There is no conventional analogue of this. It is this that is the central problem with Schrödinger's cat. Many people think the problem is that the cat's state becomes something like (|alive>+|dead>)/√2 and can't figure out how to assign meaning to this. But this isn't an issue for QM at all, you can formalise conventional probability so that the cat is described by such a vector and nobody has any difficulty with that. The issue with the Cat is that when the cat is observed it appears to 'collapse' into either the state |dead> or |alive>. Why these two basis elements? In conventional probability these vectors are special. But in QM they are no more special than other vectors.
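To make the rotation concrete, here is a sketch assuming the standard spin-1/2 rotation operator about the x axis, cos(θ/2)·I - i·sin(θ/2)·σx. The choice of axis and sign convention is my assumption; which direction you call "spin left" depends on it. With this convention a 90 degree rotation of |+> gives exactly (|+>-i|->)/√2.

```python
import math

def rotate_about_x(theta, state):
    """Apply the spin-1/2 rotation exp(-i*theta*sigma_x/2) to (a_plus, a_minus)."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    # Matrix form: [[c, -i s], [-i s, c]]
    return [
        c * state[0] - 1j * s * state[1],
        -1j * s * state[0] + c * state[1],
    ]

up = [1, 0]  # the |+> state
rotated = rotate_about_x(math.pi / 2, up)
probabilities = [abs(a) ** 2 for a in rotated]

print(rotated)        # amplitudes of |+> and |->
print(probabilities)  # equal chance of up and down when observed
```

Varying theta continuously sweeps out a whole family of complex linear combinations of |+> and |->, which is why no single basis is singled out.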

So in summary - I think that entanglement, wave function collapse and a whole host of other phenomena that are presented as specifically quantum phenomena are nothing but irrelevant distractions and aren't specific to QM at all. The interest value in QM comes from non-convexity and the important problem to solve is the preferred basis problem. (There's also the possibly unanswerable philosophical question of why in Heaven's name the universe looks like a stochastic system with complex numbers for probabilities? And if you're a Bayesian, in what way can a complex number possibly be interpreted as a degree of uncertainty?)

8 Comments:

Well I see entanglement as exactly analogous to joint probability distributions of non-independent variables. Bell's theorem shows that those joint probabilities can't possibly obey the usual rules of conventional probability. So I don't see it as fundamentally about entanglement.

In more detail (and it's a long time since I worked through the proof of Bell's theorem so I may be slightly off) what Bell's theorem shows is that there is no way of interpreting the results of the EPR experiment as coming from a conventional classical setup with some hidden unknown variables described by some (conventional) probability distribution. If you reject anything other than this kind of approach then you are forced to accept some kind of spooky action at a distance.

If you look at the discussion here you can see how the 'paradox' comes about because we can craft states where the coefficient of a basis element can be negative. What Bell's theorem says is not that classical systems can't be entangled, but that there is a limit to how correlated such systems can be. By exploiting the fact that amplitudes can be complex you can make quantum systems much more correlated than classical ones. And that's why I see this kind of non-convexity as much more interesting than entanglement itself.
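One way to see the limit on classical correlation is the CHSH form of Bell's inequality, which I'm using here as a stand-in for the general theorem. The sketch below enumerates every deterministic classical assignment of ±1 outcomes (any hidden-variable model is a convex mixture of these) and compares it with the singlet state (|+>|->-|->|+>)/√2, whose minus sign is the kind of negative coefficient discussed above. The measurement angles are the usual textbook choice, and the function names are mine.

```python
import math
from itertools import product

def spin(theta):
    """Spin observable along angle theta in the x-z plane, as a 2x2 matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def correlation(a, b):
    """Expectation of spin(a) on particle 1 times spin(b) on particle 2."""
    r = 1 / math.sqrt(2)
    psi = [[0.0, r], [-r, 0.0]]  # singlet amplitudes psi[i][j], all real
    sa, sb = spin(a), spin(b)
    return sum(
        psi[i][j] * sa[i][k] * sb[j][l] * psi[k][l]
        for i, j, k, l in product(range(2), repeat=4)
    )

def chsh(e):
    """CHSH combination for a correlation function e at the standard angles."""
    a, a2, b, b2 = 0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4
    return e(a, b) - e(a, b2) + e(a2, b) + e(a2, b2)

# Classical bound: the best any deterministic assignment can do.
classical_bound = max(
    abs(A * B - A * B2 + A2 * B + A2 * B2)
    for A, A2, B, B2 in product((1, -1), repeat=4)
)

quantum_value = abs(chsh(correlation))
print(classical_bound, quantum_value)
```

The classical bound is 2, while the singlet reaches 2√2: entangled classical systems exist, but they cannot be this strongly correlated.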

Some things Oded says are wrong. The claim "For example, it says that you cannot make non-Unitary transformations, but this by itself does not mean that you can effect any Unitary transformation that you want." isn't quite correct. There are "universal" quantum logic gates that allow you to get as close as you like to any unitary transformation. However, I still agree with the general tenor of his comments. (Though this is mostly a different topic to what I posted on.)

I see the main difference as whether the coefficients are complex or real. With, in fact, a little bit more of a twist: you point out both this difference, and that in QM, as opposed to Classical, the coefs are squared when normalizing the states.

So, it seems (and I'm thinking out loud more than making any definite or well-defined claims) that we jump all the way from the rig of non-negative reals to the ring of complex numbers, and over each impose a natural "unit sphere" condition.

(In fact, there's also something projective going on: we don't, in QM, care about the total phase of our system.)

So, one natural question: what is the proper in-between "probability" theory allowing positive and negative, but only real, probabilities? What's the correct norm?

Going the other way: what about probability theories with coefficients in the Quaternions?

I'd have thought the |.|² norm would be fine for reals and quaternions.

I think you can get something like quaternion probabilities if you consider physical systems with SU(2) symmetry. Any time you have a particle whose internal symmetry carries the adjoint representation of SU(2) (which is basically the vector space of quaternions x acted on by the unit quaternions via x→qxq^-1) then we have a quaternion valued wave function. (You'd get the adjoint representation if the particle was a bound state of a pair of particles carrying the fundamental representation.) The components of the quaternion tell you about the particle's composition - but if you ignore the particle's inner state and just consider its distribution in space then you can think of it having a quaternion valued 'amplitude' distribution. (I think that's right, I did my particle physics courses nearly 20 years ago...)