Saturday, March 18, 2017

Particles' wave functions always spread superluminally

It's been almost a week since we discussed Jacques Distler's confusion about some basics of quantum field theory. He posts several blog posts a year, a quantum field theory course is probably the only one he teaches, and he was "driven up the wall" by a point that almost every good introductory textbook makes at the very beginning. I expected that within a day or two, he would post a detailed text with the derivations saying "Oops, I've been silly [for 50 years]".

It just didn't happen. He still insists that the one-particle truncation of a quantum field theory is perfectly consistent and causal. In particular, he repeated many times in his blog post (search for the word "superluminal") that the relativistically modified Schrödinger's equation for one particle (with a square root) guarantees that the wave packets never spread faster than the speed of light. Oops, it's just too bad.

With these comments, Jacques reveals that he is ignorant of many things that I (and my instructors) have considered basics of quantum field theory since I was an undergraduate, such as:

The special theory of relativity and quantum mechanics are consistent but their combination is constraining and has some unavoidable consequences – some basic general properties of quantum field theories.

Consistent relativistic quantum mechanical theories guarantee that objects capable of emitting a particle are necessarily able to absorb it as well, and vice versa.

For particles that are charged in any way, the existence of antiparticles becomes an unavoidable consequence of relativity and quantum mechanics.

Probabilities of processes (e.g. cross sections) that involve these antiparticles are guaranteed to be linked to probabilities involving the original particles via crossing symmetry or its generalizations.

The pair production of particles and antiparticles becomes certain when energy \(E\gg m\) is available or when fields are squeezed at distances \(\ell \ll 1/m\) (much) shorter than the Compton wavelength.

Only observables constructed from quantum fields may be attributed to regions of the Minkowski spacetime so that they're independent from each other at spacelike separations (because they commute or anticommute).

Wave functions that are functions of "positions of particles" unavoidably allow propagation that exceeds the speed of light and there can't be any equation that bans it. The causal propagation only applies to quantum fields (the observables), not to wave functions of particles' positions.

Equivalently, almost all trajectories of particles that contribute to the Feynman path integral are superluminal and non-differentiable almost everywhere and this fact can't be avoided by any relativistic version of the mathematical expressions. Causality is only obtained by a combination of emission and absorption, contributions from particles and antiparticles, and at the level of quantum fields (observables).

It's a lot of basic stuff that Jacques should know but instead, he doesn't know it and these insights drive him up the wall. Let's look at those things.

The most well-defined disagreement is about the "relativistically corrected" Schrödinger equation\[

i\hbar\frac{\partial \psi}{\partial t} = \sqrt{m^2 - \hbar^2\nabla^2}\,\psi.

\] You see that it's like the usual one-particle equation except that the non-relativistic formula for the kinetic energy, \(E=|\vec p|^2/2m\), is replaced by the relativistic one, \(E=\sqrt{|\vec p|^2+m^2}\), with the same Laplacian (times \(-\hbar^2\)) substituted for \(|\vec p|^2\).

Jacques believes that when you substitute a localized wave packet for \(\psi(x,y,z)\) at \(t=0\) and you wait for time \(t'\), it will only spread to the ball of radius \(t'\) away from the original region: it will never propagate superluminally. Search for "superluminally" in his blog post and comments. Oops, it's wrong and embarrassingly wrong.

I think that the simplest way to see why he's wrong is to realize that the equation above still has the usual non-relativistic limit. As long as you guarantee that \(|\vec p| \ll m\) in the \(c=\hbar=1\) units, the evolution of the wave packets must be well approximated by non-relativistic physics and the non-relativistic Schrödinger equation.

Consider an actual electron moving around a nucleus. In the hydrogen atom, the motion is basically non-relativistic. Consider an initial localized wave packet for the electron that has a uniform phase, is much larger than the Compton wavelength \(h/mc\approx 2.4\times 10^{-12}\,{\rm m}\) (up to the factor of \(2\pi\), it's simply \(1/m\) in the \(c=\hbar=1\) units) but still smaller than the radius of the atom. For example, the radius of the packet is \(10^{-11}\) meters. Outside a sphere of this radius, the wave function is zero.

Will this wave packet spread superluminally? You bet. By construction, the average speed is about an order of magnitude lower than the speed of light, which is reasonably non-relativistic. So with a 1% accuracy (in the squared speed), and aside from the irrelevant phase linked to the additional additive shift \(E_0=mc^2\) to the energy, the wave packet will spread as if it followed the non-relativistic Schrödinger equation\[

i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + V(x)\psi.

\] Let's set \(V(x)=0\). OK, how do the wave packets spread according to the ordinary Schrödinger equation? Let's ask Ron Maimon – every good autodidact is able to answer such questions. Well, it's simple: the Schrödinger equation is just a diffusion (or heat) equation whose main parameter is imaginary. If \(m\) above were imaginary, \(m=i\mu\), then the solution to the diffusion equation would be\[

\psi(x,t)\sim \frac{1}{\sqrt{t/\mu}}\,\exp\left(-\frac{\mu x^2}{2t}\right).

\] The width of the Gaussian packet goes like \(\Delta x\sim \sqrt{t/\mu}\). It's very simple.

If you know the graph of the square root, you must know that the speed is initially very high. The speed \(dx/dt\) scales like the derivative of the square root of time, i.e. as \(1/\sqrt{t\mu}\). For times shorter than \(1/\mu\), the speed with which the wave packet spreads unavoidably exceeds the speed of light. It's kosher that we're looking at timescales shorter than the "Compton time scale" of the electron. We only assumed that the spatial size of the wave packet is longer than the Compton wavelength. Whether an analogous scaling is obeyed by the dependence on time depends on the equation itself and the answer is clearly No. The asymmetric treatment of space and time in the equation (the square root is only used for the spatial derivatives) may be partly blamed for that asymmetry.
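The square-root-of-time scaling and the divergent early-time spreading speed are easy to check numerically. Here is a minimal Python sketch (the grid and the chosen times are my arbitrary illustrative values): it computes the RMS width of the Gaussian kernel above and its growth rate.

```python
import numpy as np

# RMS width of the diffusion kernel ~ exp(-mu*x^2/(2t)) is sqrt(t/mu); its
# growth rate d(width)/dt = 1/(2*sqrt(mu*t)) exceeds c = 1 once t < 1/(4*mu).
mu = 1.0
x = np.linspace(-50.0, 50.0, 200001)

def width(t):
    w = np.exp(-mu*x**2/(2.0*t))   # the (unnormalized) kernel at time t
    w = w/w.sum()
    return np.sqrt((w*x**2).sum())

speeds = {t: (width(1.001*t) - width(t))/(0.001*t) for t in (0.01, 0.04, 0.16)}
print({t: round(v, 2) for t, v in speeds.items()})   # all larger than 1
```

With \(\mu=1\), all three sample times are shorter than \(1/(4\mu)\), so each measured spreading speed comes out above the speed of light, in agreement with the scaling argument.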

Just to be sure, all the scalings are the same for the value of \(\mu=-im\) that is imaginary.

If you don't feel sure that our non-relativistic approximation was adequate for the question, I can give you a stronger weapon: the exact solution of the equation (Schrödinger's equation with the square root). What is it? Well, it's nothing else than the retarded Green's function – as taught in the context of the quantum Klein-Gordon field. Look e.g. at Page 7 of these lectures by Gonsalves in Buffalo.

The retarded function is the matrix element of the evolution operator for the one-particle Hilbert space\[

G_{\rm ret}(x-x') = \bra{x,y,z} \exp\left(-iH(t-t')\right) \ket{x',y',z'}.

When the particle is initially (a delta function) at the position \((x',y',z')\) at time \(t'\) and you wait for time \(t-t'\), i.e. you evolve it by the square-root-based Hamiltonian up to the moment \(t\), and you ask what the amplitude at the position \((x,y,z)\) will be, the answer is nothing else than the retarded Green's function of the difference between the two four-vectors.
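You don't even need the analytic form to see the acausality. The following pseudospectral Python sketch (grid size, packet width, and evolution time are my illustrative choices, not anything from the post) applies the square-root Hamiltonian in momentum space via the FFT and measures how much probability ends up outside the light cone:

```python
import numpy as np

# Evolve a localized packet with H = sqrt(p^2 + m^2) (c = hbar = 1) by
# multiplying by the phase exp(-i*E(p)*t) in momentum space, then measure
# the probability that leaks outside the light cone |x| <= t.
m = 1.0
N, L = 4096, 200.0                       # illustrative grid choices
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = x[1] - x[0]
p = 2*np.pi*np.fft.fftfreq(N, d=dx)

sigma = 1.0                              # initial width ~ one Compton wavelength
psi0 = np.exp(-x**2/(2*sigma**2)).astype(complex)
psi0 /= np.sqrt((abs(psi0)**2).sum()*dx)

t = 5.0
psit = np.fft.ifft(np.exp(-1j*np.sqrt(p**2 + m**2)*t)*np.fft.fft(psi0))

norm = (abs(psit)**2).sum()*dx
P_out = (abs(psit)[abs(x) > t + 6*sigma]**2).sum()*dx
print(norm, P_out)      # the norm stays one; P_out is tiny but nonzero
```

The evolution is a pure phase in momentum space, so the norm is conserved exactly; the probability outside the light cone is tiny – roughly \(e^{-m|x|}\)-suppressed amplitudes, as discussed below – but it is strictly nonzero.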

Can the retarded Green's functions be analytically calculated? As long as you include Bessel functions among your "analytically allowed tools", the answer is Yes. If we set the four-vector \(x'\) to zero, the retarded Green's function is simply\[

G_{\rm ret}(x) = \frac{\theta(t)}{2\pi}\left[\delta(x^\mu x_\mu) - \frac{m\,\theta(x^\mu x_\mu)}{2\sqrt{x^\mu x_\mu}}\,J_1\!\left(m\sqrt{x^\mu x_\mu}\right)\right].

\] For small and large arguments, the Bessel function of the first kind used in the expression is an odd function of the argument and asymptotically behaves as (the sign is OK for positive arguments)\[

J_1(z)\approx \frac{z}{2}\quad (z\to 0),\qquad J_1(z)\approx\sqrt{\frac{2}{\pi z}}\,\cos\left(z-\frac{3\pi}{4}\right)\quad (z\to\infty).

\] But another lesson of the calculation is that the Green's function is nonzero even for \(x^\mu x_\mu\) negative, i.e. spacelike separation – although it decreases roughly as \(\exp(-m|x|)\) over there if you redefine the normalization by the factor of \(2E\) in the momentum space (which is a non-local transformation in the position space). See the last displayed equation on page 2 of Gonsalves:

Relativistic Causality:

Quantum mechanics of a single relativistic free point particle is inconsistent with the principle of relativity that signals cannot travel faster than the speed of light. The probability amplitude for a particle of mass \(m\) to travel from position \({\bf r}_0\) to \({\bf r}\) in a time interval \(t\) is\[

U(t) = \bra{{\bf r}} e^{-it\sqrt{\hat{\bf p}^2+m^2}} \ket{{\bf r}_0} = \frac{1}{2\pi^2 |{\bf r}-{\bf r}_0|}\int_0^\infty dp\,p\,\sin\left(p|{\bf r}-{\bf r}_0|\right)\,e^{-it\sqrt{p^2+m^2}}.

\]
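The \(\exp(-m|x|)\) falloff in the spacelike region can be checked numerically as well. In 3+1 dimensions, the equal-time correlation at spacelike distance \(r\) is proportional to \(m K_1(mr)/r\) (a standard result, cf. Chapter 2 of Peskin-Schroeder); the short Python sketch below (my own quadrature choices) confirms the exponential decay rate:

```python
import numpy as np

# K_1(z) from its integral representation K_1(z) = int_0^inf e^{-z cosh u} cosh u du;
# then D(r) = m*K_1(m*r)/(4*pi^2*r), the equal-time amplitude at spacelike
# distance r, falls off roughly as exp(-m*r).
def K1(z):
    u = np.linspace(0.0, 10.0, 100001)
    f = np.exp(-z*np.cosh(u))*np.cosh(u)
    return (f[1:] + f[:-1]).sum()*0.5*(u[1] - u[0])   # trapezoid rule

m = 1.0
def D(r):
    return m*K1(m*r)/(4*np.pi**2*r)

slope = np.log(D(10.0)/D(11.0))   # decay rate per unit distance, close to m
print(slope)
```

The measured logarithmic slope is \(m\) up to a slowly varying power-law correction from the \(r^{-3/2}\) prefactor, i.e. the amplitude in the spacelike region is exponentially small but nonzero.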

Gonsalves also quotes "particle creation and annihilation" and "spin-statistics connection" as the other two unavoidable consequences of a consistent union of quantum mechanics and special relativity. He refers you to Chapter 2 of Peskin-Schroeder to learn these things from a well-known source.

OK, you might ask, what's the right modification of the wave equation for one particle that guarantees that the wave packet never spreads superluminally?

There is none. The condition that the packet never spreads superluminally would violate the uncertainty principle, a fundamental postulate of quantum mechanics.

Why is it so? I can give you a simple idea. If you compress the particle to a small region, \(\Delta x \ll 1/m\), much smaller than the Compton wavelength, the uncertainty principle unavoidably says \(\Delta p \gg m\), so the motion is ultrarelativistic. You could think that \(\Delta p\gg m\) or \(p\gg m\) is still consistent with \(v\leq 1\) but the evolved wave packets are unavoidably far from those that minimize the product of uncertainties and as the Bessel mathematics above shows, the piece in the spacelike region just can't exactly vanish, basically due to the non-local character of the operators.

Similar derivations could be made with the help of the Feynman path integral. The typical trajectories contributing to the Feynman propagator are superluminal and non-differentiable almost everywhere and this fact does hold even in the calculation of the propagators in quantum field theory, a relativistic theory. As I discussed in a blog post in 2012, the superluminal or non-differentiable nature of generic paths in the path integral is needed for Feynman's formalism to be compatible with the uncertainty principle. Recall that we have solved a paradox: the calculation of \(xp-px\) in the path integral should amount to the insertion of the classical integrand \(xp-px\) to the path integral but this classical insertion is zero. The paradox was resolved thanks to the generic paths' being non-differentiable: the time ordering of \(x(t)\) and \(p(t\pm \epsilon)\) mattered.
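The \(\sqrt{\Delta t}\) scaling of typical path increments is easy to illustrate with a discretized Brownian path – a stand-in for a generic path in the path integral (the seed, duration, and step counts below are arbitrary):

```python
import numpy as np

# A discretized Brownian path: a typical increment over a time step h scales
# as sqrt(h), so the coarse-grained "velocity" |dx|/h grows like 1/sqrt(h)
# and diverges as h -> 0: the path is non-differentiable almost everywhere.
rng = np.random.default_rng(0)
steps = 2**16
dt = 1.0/steps
x = np.cumsum(rng.normal(0.0, np.sqrt(dt), steps))   # one Brownian path

def mean_speed(k):          # resample the same path with time step k*dt
    return np.abs(np.diff(x[::k])).mean()/(k*dt)

v_fine, v_coarse = mean_speed(1), mean_speed(256)
print(v_fine, v_coarse)     # the finer the sampling, the faster the path
```

Refining the time step by a factor of 256 makes the apparent speed grow by a factor of about \(\sqrt{256}=16\), so below some resolution, every such path looks superluminal.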

So does quantum field theory prevent you from sending signals to spacelike-separated regions? And how is it achieved?

Yes, quantum field theory perfectly prohibits any propagation of signals superluminally or over spacelike separations. It does so by using the quantum fields. Quantum fields such as \(\Phi(x,y,z,t)\) and functions of them and their derivatives are associated with spacetime points and they commute or anticommute with each other when the separation is spacelike.

The zero commutator means that you may measure them simultaneously – that the decision to measure one doesn't influence the other or that the order of the two measurements is inconsequential. Just to be sure, the previous sentence doesn't say that these spacelike-separated measurements are never correlated. They may be correlated but correlation doesn't mean causation. They're only correlated if the correlation (mathematically described as entanglement within quantum mechanics) follows from the previous contact of the two subsystems that have evolved or moved to the spacelike-separated points.

The point is that the outcomes themselves may be correlated but the human decisions – e.g. which polarization is measured on one photon – do not influence the statistics for the other photon at all. The existence of the "collapse" associated with the first measurement doesn't change the odds for the second measurement – although if you know the result into which the first measurement "collapsed", you must refine your predictions for the outcome of the second measurement because a correlation/entanglement could have been present.

OK, how does this vanishing of the spacelike-separated commutators agree with the fact that the packets spread superluminally? On page 27 of Peskin-Schroeder, you may see that the "commutator Green's function" is a difference between two ordinary Green's functions and because those two are equal in the spacelike region, the value just cancels in the spacelike region.
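The cancellation is easy to see in a toy model. For a free scalar field on a periodic 1D lattice (my own discretization; the lattice slightly softens the light cone, so outside it the commutator is exponentially small rather than exactly zero), the commutator is a difference of oscillating mode sums that collapses outside the cone while staying of order one inside:

```python
import numpy as np

# Free scalar field on a periodic 1D lattice (spacing = 1, c = hbar = 1).
# Up to a factor of i, the field commutator is
#   [phi_n(t), phi_0(0)] = (1/N) sum_k sin(k*n - omega_k*t)/omega_k,
# the difference of the two Green's functions (one term per exponential).
m, N, t = 1.0, 4096, 10.0
k = 2*np.pi*np.fft.fftfreq(N)                  # lattice momenta
omega = np.sqrt(m**2 + 2.0*(1.0 - np.cos(k)))  # lattice dispersion relation

def comm(n):
    return np.sin(k*n - omega*t).dot(1.0/omega)/N

inside = max(abs(comm(n)) for n in range(6))   # points inside the light cone
outside = abs(comm(40))                        # a point far outside it
print(inside, outside)                         # order 0.1 versus essentially 0
```

Every single mode contributes a nonzero oscillating term at the spacelike-separated point; only the sum over all of them cancels, which is the lattice version of the statement below.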

But again, the Fourier transform of the ordinary propagator such as \(1/(p^2-m^2+i\epsilon)\) does not vanish in the spacelike regions of the 4-vector \(x^\mu\). It cannot vanish because this position space propagator knows about the correlation of fields at two points of space. And the fields in nearby, spacelike-separated points are correlated, of course (very likely to be almost equal), especially if they are closer than the Compton wavelength. You may view this correlation as a result of the escaping of high-momentum or high-energy quanta to infinity. Only low-momentum or low-energy quanta are left in the vacuum and its low-energy excitations – and because of the Fourier relationship of \(x\) and \(p\), this absence of high-energy quanta means that the quantum fields can't depend on the spatial coordinates too much.

You know, the message is that the ban on superluminal signals is compatible with quantum mechanics but the creation and annihilation of particles must be unavoidably allowed when you reconcile these two principles, special relativity and quantum mechanics. Jacques Distler believes that relativistic causality works even in "QFT truncated to the one-particle Hilbert space" which simply isn't right. He's really misunderstanding the key reason why quantum field theory was needed at all.

Try to calculate the expectation value of the commutator of two fields \(F(x)\) and \(G(y)\) at two spacelike-separated points \(x,y\). The fields \(F,G\) may be the Klein-Gordon \(\Phi\) itself or some bilinear constructed out of it, e.g. the component of a current \(J^0\) that Distler talks about at some point. Imagine that you're calculating this commutator. You first expand \(F,G\) in terms of \(\Phi\) and its derivatives. Then you insert the expansions of \(\Phi\) in terms of the creation and annihilation operators. And you know the expectation values of the type \(\bra 0 \Phi(x)\Phi(y) \ket 0\). When you time-order \(x,y\), it's just the usual propagator in the position space.

The precise calculation will depend on the operators you choose but a general point is true: There will be lots of individual terms that are nonzero for spacelike \(x-y\). Only if you sum all these terms – which pick creation operators from \(F\) and annihilation operators from \(G\) and vice versa etc. – can you achieve the cancellation.

In particular, if you consider the operators \(F,G \sim J^0\), those will contain terms of the type \(a^\dagger a\) as well as \(b^\dagger b\) for a field whose particles and antiparticles differ. Only if you include the matching correlators from both particles and antiparticles between the points \(x,y\) may you get a cancellation of the commutator (its expectation value).

In other words, the fact that a quantum field is capable of both creating a particle and annihilating an antiparticle (which is the same for "real" fields) is absolutely vital for its ability to commute with spacelike-separated colleagues!

This insight may be formulated in yet another equivalent way. You just can't construct a localized – relativistically causally well-behaved – field operator at a given point that would only contain terms of a given creation-annihilation schematic type, e.g. only \(a^\dagger a\) but no \(b^\dagger b\), only \(a^\dagger\) but no \(b\), and so on. Any operator that has a well-defined "number of particles of each type that it creates or annihilates" is unavoidably "non-local" and can't exactly commute with its spacelike-separated counterparts!

If you wanted to study the truncation of the quantum field theory to a one-particle Hilbert space where the number of particles is \(N=1\), and the number of antiparticles (and all other particle species) is zero, then all "first-quantized" operators on your Hilbert space correspond to some combination of operators of the \(a_k^\dagger a_m\) form. You annihilate one particle and create one particle. But no such combination of operators may be strictly confined to a region so that it would commute with itself at spacelike separations.

Students who have carefully done some basic calculations in quantum field theory know this fact from many "happy cancellations" that weren't obvious for some time. For example, consider the quantized electromagnetic field. Write the total energy as\[

H = \int d^3 x\,\frac{1}{2}\left(B^2+ E^2\right),

\] i.e. the integral of the electric and magnetic energy density. Substitute \(\vec A\) and its derivatives for \(\vec B,\vec E\), and write \(A\) and its derivatives in terms of creation and annihilation operators for photons. So you will get terms of the form \(a^\dagger a\), \(aa\), and \(a^\dagger a^\dagger\). At the end, the total Hamiltonian only contains the terms of the \(a^\dagger a\) "mixed" type but this simplified form is only obtained once you integrate over \(\int d^3 x\) which makes the terms \(a a\) and \(a^\dagger a^\dagger\) vanish because of their oscillating dependence on \(x\). If you only write the energy density itself, it will unavoidably contain the operators of the type \(aa\) and \(a^\dagger a^\dagger\) – annihilating or creating two photons – too. And the terms of all these forms are equally important for the quantum field to be well-behaved, especially for the vanishing of its commutators at spacelike separations.
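The mechanism that kills the \(aa\) and \(a^\dagger a^\dagger\) terms in the integrated Hamiltonian is just the orthogonality of Fourier modes. A tiny Python check on a periodic lattice (the lattice size is an arbitrary illustrative choice):

```python
import numpy as np

# Terms of the type a a and a^dagger a^dagger in the energy density carry
# factors e^{i(k+k')x}; summing over the lattice positions x kills them
# unless k' = -k (mod 2*pi), so they drop out of the integrated Hamiltonian.
N = 16
x = np.arange(N)
ks = 2*np.pi*np.arange(N)/N
survives = [(i, j) for i in range(N) for j in range(N)
            if abs(np.exp(1j*(ks[i] + ks[j])*x).sum())/N > 1e-9]
assert all((i + j) % N == 0 for i, j in survives)
print(len(survives))   # only N pairs with k' = -k survive, out of N^2 terms
```

Before the spatial integration, i.e. in the energy density itself, all \(N^2\) mode pairs contribute, exactly as the text says.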

The broader lesson is that important principles of physics are ultimately reconcilable but the reconciliation is often non-trivial and implies insights, principles, and processes that didn't seem to unavoidably follow from the principles separately. So the combination of relativity and quantum mechanics implies the basic phenomena of quantum field theory – antiparticles, pair production, the inseparability of creation and annihilation, spin-statistics relations, and a few other things.

In the same way, perhaps a more extreme one, the unification of quantum mechanics and general relativity is possible but any consistent theory obeying both principles has to respect some qualitative features we know from quantum gravity – as exemplified by string theory, probably the only possible precise definition of a consistent theory of quantum gravity. In particular, black holes must carry a finite entropy, be practically indistinguishable from heavy particle species, and such heavy particle species must exist. The processes around black holes and those involving elementary particles are unavoidably linked by some UV-IR relationships and string theory's modular invariance is the most explicit known example (or toy model?) of such relationships.

In combination, the known important principles of physics are far more constraining than the principles are separately and they imply that the "kind of a theory we need" or even "the precise theory" is basically unique. This strictness is ultimately good news. If it didn't exist, we would be drowning in the infinite field of possibilities. Because of the "bonus" strictness resulting from the combination of important principles of physics, we know that a theory combining quantum mechanics and special relativity must work like quantum field theory and a theory that also respects gravity as in general relativity has to be string/M-theory.