The upper photon is encouraged to land on a photographic plate, D0, where a single photon normally contributes to an interference pattern. The entangled lower partner, the idler photon, goes to some mirrors and undergoes another treatment. In the delayed choice quantum eraser experiment, it ultimately lands in one of the detectors D1, D2, D3, D4. The setup is designed in such a way that if the idler photon is detected in D3 or D4, the which-slit information can be extracted, so the interference pattern is gone for the upper photon as well (the slit is the same for both photons). If the idler photon lands in D1 or D2, the which-slit information cannot be extracted, and the upper photons in these cases do create an interference pattern in D0, but only if you treat the D1 and D2 cases separately – these two interference patterns are "complementary" to each other.

One of the questions that Ahmed basically asked was whether there would be an interference pattern if you replaced all the detectors D1, D2, D3, D4 for the idler with another photographic plate D0' (dee-zero-prime).

The answer is, of course, that if you only measure the upper photon and record all events, you won't see any interference pattern. The interference pattern can only be extracted from the correlations, i.e. when you also detect some information about the idler photon that knows about the relative phase.

However, if you watch the positions \(z,z'\) of both photons on the photographic plates D0 and D0', there will be some correlations in the distributions. What is the mathematical description and why does it work?

I will be using Feynman's path-integral derivation of the interference pattern. We will effectively sum over two trajectories only – those from one slit and those from the other slit.

Start with the simple experiment without any splitting of the photons. The state vector for that photon after it has gone through the slits is\[

\ket\psi = \frac{ \ket L + \ket R } {\sqrt{2}}

\] We will always use \(L,R\) (left-right) for the identification of the two slits (these labels correspond to "red" and "cyan" on the diagram at the top). Note that both slits contribute equally to the state. Also, the relative phase between \(L,R\) is known and it is one – a real, positive number. What interference pattern is created by this single photon on the photographic plate D0?

Well, the probability amplitude is a sum over histories and we may effectively replace them by straight trajectories from the two slits if we're only interested in the region of the plate roughly between the two slits.

So the photon goes from \(L\) or \(R\) to the point \(z\) (coordinate in the \(L\) to \(R\) direction) on the photographic plate. The action of the first trajectory is equal to\[

K(z-a)^2

\] where \(a\) is the \(z\) coordinate of the slit \(L\). Here, \(K\) is some coefficient and the quadratic dependence on \(z\), with the minimum at \(z=a\), results from the Taylor expansion of the Pythagorean distance between the slit and the point \(z\) at the photographic plate, OK? Similarly, the path from \(R\) to the position \(z\) has the action\[

K(z+a)^2

\] where \(z=-a\) is the position of the slit \(R\) along the relevant axis and the real positive coefficient \(K\) is the same. Only the relative phase between these two amplitudes will matter. The difference between these two actions is\[

K[(z+a)^2-(z-a)^2] = 4Ka \cdot z.

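\] Let me choose units in which \(2Ka=1\), just to simplify my life. You can rewrite the whole derivation with all the coefficients. OK, so after we drop the common term \(K(z^2+a^2)\), which only contributes an overall phase, the two trajectories have the actions \(\mp z\). Because the two slits enter symmetrically, the assignment of the signs is a pure convention (flip the orientation of the \(z\) axis if you wish), so the situation is the same as a situation in which the \(L\)-to-\(z\) trajectory has the action \(+z\) while \(R\)-to-\(z\) has the action of \(-z\). We also set \(\hbar=1\). So the total probability amplitude encoding the possibility that the particle coming from the slits lands at the point \(z\) is the sum of the two amplitudes from the two trajectories (corresponding to the two slits)\[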

\frac{\exp(+iz)+\exp(-iz)}{2} = \cos z.

\] The two phases added up to the cosine of \(z\). I added the normalization factor of \(1/2\) to talk about cosines – the normalization is always such that the total probability is one. Fine. So the probability that the photon lands at the point \(z\) is simply \(\cos^2 z\). That's the usual interference pattern. I hope that this was basically stuff you are familiar with even if this path-integral derivation (a sum over just two trajectories!) may have been more concise than anything you've heard before.
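As a sanity check of the two steps above (the quadratic Taylor expansion and the two-trajectory sum), here is a minimal Python sketch. It assumes, purely for illustration, that the action of a straight trajectory is the wave number `k` times its Pythagorean length and that the plate sits at a distance `D` from the slits. Neither `k` nor `D` appears in the text; with these assumptions \(K = k/(2D)\), and choosing `k = D / a` reproduces the units \(2Ka=1\).

```python
import cmath
import math

# Illustrative geometry (assumptions, not taken from the text):
a = 1.0        # slits sit at z = +a (L) and z = -a (R)
D = 1000.0     # distance from the slits to the plate, D >> a
k = D / a      # wave number chosen so that 2*K*a = 1 with K = k / (2*D)

def exact_amplitude(z):
    """Sum of the two phases using the exact straight-line path lengths."""
    r_L = math.hypot(D, z - a)   # length of the L-to-z trajectory
    r_R = math.hypot(D, z + a)   # length of the R-to-z trajectory
    return (cmath.exp(1j * k * r_L) + cmath.exp(1j * k * r_R)) / 2

def exact_probability(z):
    """Squared modulus of the exact two-trajectory amplitude."""
    return abs(exact_amplitude(z)) ** 2

def approx_probability(z):
    """The cos^2(z) pattern obtained from the quadratic approximation."""
    return math.cos(z) ** 2

for z in (0.0, 0.5, 1.0, math.pi / 2, 3.0):
    print(f"z={z:.3f}  exact={exact_probability(z):.5f}  "
          f"cos^2={approx_probability(z):.5f}")
```

The two columns should agree closely as long as \(z\) and \(a\) are much smaller than `D`, which is the regime in which the quadratic expansion is trustworthy.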

Fine, let's now split the photon into two. Each photon in the pair will actually be a photon of the "same kind" that we described above (not a photon of one-half of energy of it). What will change? Well, instead of the state \((\ket L+\ket R)/\sqrt{2}\), we will have the state\[

\ket\psi = \frac{ \ket{L_1}\ket{L_2} + \ket{R_1}\ket{R_2} } {\sqrt{2}}

\] The numbers \(1,2\) indicate the photons from the splitting procedure. This label is independent of the slits \(L,R\). For example, the photon \(1\) is the upper one while the photon number \(2\) is the idler photon.

Note that the state \(\ket\psi\) was obtained from the previous one by replacing \(L\) with \(L_1 L_2\), the tensor product, and the same thing was done with the \(R\) part of the state. The relative phase between the terms \(L_1 L_2\) and \(R_1 R_2\) is again one i.e. real positive, just like before. The relative phase could be different and then the predictions would be different. Relative phases always matter in quantum mechanics – they really encode all the novelties that quantum mechanics brings relative to just classical probability distributions.

However, what's important is that this statement doesn't allow you to say that the relative phase between \(L_1\) and \(R_1\) – which is relevant if you only decide to look at the photon \(1\) – is known and real positive. Instead, if you look at our state\[

\ket\psi = \frac{ \ket{L_1}\ket{L_2} + \ket{R_1}\ket{R_2} } {\sqrt{2}}

\] from the perspective of the photon \(1\) – with the purpose of predicting the behavior of this photon and ignoring the photon \(2\) – then you must realize that the factors \(\ket{L_2}\) and \(\ket{R_2}\) in the expression above play the role of some additional "phases". Don't get me wrong, they're not numbers. But they're two basis vectors of the second photon's Hilbert space. Their length is known – they're normalized vectors in that Hilbert space – but their "phase" is not known. In particular, the relative phase of \(\ket{L_2}\) and \(\ket{R_2}\) isn't known. It's not even well-defined because they're really not vectors that are proportional to each other.
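One standard way to make the "relative phase is not defined" statement quantitative, which the text itself does not use, is to trace out photon \(2\) and look at the reduced density matrix of photon \(1\). The sketch below is only an illustration under that assumption; the helper names `reduced_density_matrix` and `pattern_on_d0` are made up for this example, and the pattern formula uses the effective actions \(\pm z\) from the single-photon calculation.

```python
import cmath
import math

L, R = 0, 1  # slit labels used as list indices

def reduced_density_matrix(c):
    """Trace out photon 2: rho[i][k] = sum_j c[i][j] * conj(c[k][j]).

    c[i][j] is the amplitude for photon 1 at slit i and photon 2 at slit j.
    """
    return [[sum(c[i][j] * c[k][j].conjugate() for j in (L, R))
             for k in (L, R)] for i in (L, R)]

def pattern_on_d0(rho, z):
    """Distribution of photon 1 alone, using the phases exp(+iz), exp(-iz)."""
    coherence = rho[L][R] * cmath.exp(2j * z)
    return 0.5 * (rho[L][L] + rho[R][R] + 2 * coherence).real

s = complex(1 / math.sqrt(2))
entangled = [[s, 0j], [0j, s]]    # (|L1 L2> + |R1 R2>) / sqrt(2)
unentangled = [[s, 0j], [s, 0j]]  # (|L1> + |R1>) / sqrt(2) times |L2>

rho_entangled = reduced_density_matrix(entangled)
rho_unentangled = reduced_density_matrix(unentangled)

for z in (0.0, math.pi / 4, math.pi / 2):
    print(z, pattern_on_d0(rho_entangled, z), pattern_on_d0(rho_unentangled, z))
```

For the entangled state the off-diagonal element of the reduced matrix vanishes, so there is no relative phase to speak of and the pattern is flat, while the unentangled comparison state reproduces \(\cos^2 z\).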

But if you wanted to ignore the second photon altogether, you just can't replace \(\ket{L_2}\) and \(\ket{R_2}\) by two numbers "one". You have to replace them with general phases (the normalization is one, so it's just phases). So effectively, the state vector is\[

\ket\psi \to \frac{ e^{i\phi_L}\ket{L_1} + e^{i\phi_R}\ket{R_1} } {\sqrt{2}}

\] where the phases \(\phi_L\) and \(\phi_R\) – or the phase shifts, if you wish – must be remembered to be different in general. OK, what happens when you calculate the probability distribution on the first photon's photographic plate? You will get the amplitude\[

e^{i(\phi_L+\phi_R)/2}\cos\left(z+\frac{\phi_L-\phi_R}{2}\right)

\] which is shifted relative to what we had before – because the relative phase is more general. (The two terms \(e^{i\phi_L}e^{iz}\) and \(e^{i\phi_R}e^{-iz}\) were added and divided by two; the overall phase in front is irrelevant.) You may square the absolute value and the squared cosine of \(z+\Delta\phi\), where \(\Delta\phi = (\phi_L-\phi_R)/2\), is the probability distribution. It's an interference pattern but shifted by \(\Delta\phi\) and if you don't make any measurement on the second photon, you just don't know anything about the phase. So quantum mechanics will predict a pattern for the first photon only that is obtained by averaging over all possible values of the angle \(\Delta\phi\). And be sure that the averaging of\[

\cos^2 (z+\Delta \phi)

\] over all values of \(\Delta \phi\) between \(0\) and \(2\pi\) is simply \(1/2\). It's the averaging of a squared cosine over a whole period, and it's simply a constant, namely \(1/2\). You won't see any interference pattern – maxima and minima – if you only look at the first photon because the relative phase between the two slits is effectively averaged over.
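The averaging statement is easy to check numerically. The following is a small sketch, assuming a uniform average over equally spaced values of \(\Delta\phi\) in \([0,2\pi)\); the function names and the grid size `n` are arbitrary choices made for this illustration.

```python
import math

def shifted_pattern(z, dphi):
    """Interference pattern on D0 for one fixed phase shift dphi."""
    return math.cos(z + dphi) ** 2

def phase_averaged_pattern(z, n=1000):
    """Uniform average over n equally spaced phase shifts in [0, 2*pi)."""
    step = 2 * math.pi / n
    return sum(shifted_pattern(z, j * step) for j in range(n)) / n

for z in (0.0, 0.4, math.pi / 2, 2.5):
    print(f"z={z:.3f}  single shift: {shifted_pattern(z, 0.0):.4f}  "
          f"averaged: {phase_averaged_pattern(z):.4f}")
```

The single-shift column oscillates with \(z\) while the averaged column does not depend on \(z\) at all.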

What about the situation in which you measure the positions of both entangled photons and you restore the whole distribution \(\rho(z_1,z_2)\) for the two photons, including all the correlations?

In the path-integral approach, this is just a very modest modification of the simple double-slit calculation at the top. You must just be sure to do the modification properly.

If we have two slits followed by the splitter to a pair of photons, and measure their positions at two photographic plates as positions \(z_1,z_2\), the path-integral calculation is telling you to sum (the probability amplitudes) over two possible histories only. Either both photons came from the slit \(L\), or they came from the slit \(R\) (their shared parent went through \(L\) or \(R\), respectively). Now, the actions \(S_L,S_R\) appearing in the integrand \(\exp(iS)\) of the path integrals are obtained by summing the action from the photon \(1\) and from the photon \(2\). Actions are "additive" for separated systems – equivalently, the amplitudes (behaving as \(\exp(iS)\) or sums of such terms) are products.

At the top, I said that effectively, the action from the slit \(L\) was \(+z\) and from the slit \(R\), it was \(-z\). That's why the sum of these two contributions to the probability amplitude was\[

\exp(+iz) + \exp(-iz)

\] which is \(2\cos z\). Here, the individual terms must contain the \(z\) from both photons. So to calculate the probability amplitude that the photons \(1,2\) land at places \(z_1,z_2\), you have to replace the sum of the two exponentials above by\[

\exp(+iz_1+iz_2) + \exp(-iz_1-iz_2)

\] We just replaced a simple action \(S\) by the sum of two analogous actions \(S_1,S_2\) for the two photons. The which-slit information of the photons \(1,2\) is the same because the photon pair is created at one of the two places (and the two possibilities must be combined at the level of probability amplitudes). However, the positions \(z_1,z_2\) are independent variables.

So obviously, the probability amplitude (after we divide it by two again) will be\[

A = \cos(z_1+z_2)

\] and the probability distribution will be\[

\cos^2(z_1+z_2).

\] So the positions of the points where the two photons land will tend to be correlated in this way. If you learn about \(z_2\), for example, you may substitute it into the formula above and the probability prediction for the first photon will be \(\cos^2(z_1+\Delta \phi)\) where \(\Delta\phi=z_2\) is just a particular phase shift. Almost all the combinations of \(z_1\) and \(z_2\) will be allowed but \(z_1+z_2\) will be more likely to be close to an integer multiple of \(\pi\) than to a half-integer multiple of \(\pi\). (Note that the period of \(\cos^2 z\) is \(\pi\).)
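Here is a short Python sketch of the two-photon formula, in the same units as before. It evaluates the two-history sum, shows the fringes in \(z_1\) for a fixed \(z_2\), and averages over \(z_2\) to get the distribution of photon \(1\) alone. The assumption that \(z_2\) is spread uniformly over one full period \([0,\pi)\) is mine, made to keep the example simple.

```python
import cmath
import math

def pair_amplitude(z1, z2):
    """Sum over the two histories: both photons from L, or both from R."""
    both_from_L = cmath.exp(1j * (z1 + z2))
    both_from_R = cmath.exp(-1j * (z1 + z2))
    return (both_from_L + both_from_R) / 2

def pair_probability(z1, z2):
    """Joint distribution for the two landing points, cos^2(z1 + z2)."""
    return abs(pair_amplitude(z1, z2)) ** 2

def marginal_for_photon_1(z1, n=1000):
    """Average over z2 on a uniform grid covering one period [0, pi)."""
    step = math.pi / n
    return sum(pair_probability(z1, j * step) for j in range(n)) / n

z2_known = 0.3  # suppose the idler was found here
for z1 in (-0.3, 0.5, math.pi / 2 - 0.3):
    print(f"z1={z1:+.3f}  given z2: {pair_probability(z1, z2_known):.4f}  "
          f"ignoring z2: {marginal_for_photon_1(z1):.4f}")
```

With \(z_2\) known, the first column shows maxima where \(z_1+z_2\) is a multiple of \(\pi\); with \(z_2\) ignored, the second column is flat.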

Again, the two entangled photons are prepared so that their properties are typically correlated with each other. What the correlation looks like depends on the precise manipulation with the two photons and the measurements at the end. But this correlation is always a consequence of the two photons' common origin, not any communication.

And the interference patterns – everything that seems to know about some "relative phase" – may only be seen if you use some information from the other photon as well because this information is needed to learn something about the relative phase. If you simply collect all the photons \(1\) regardless of the properties of the photon \(2\), it's equivalent to superposing all the probability distributions i.e. to the averaging over all possible values of the relative phase. And the resulting picture will therefore be constant and contain no interference maxima or minima.
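To see this with simulated "events" rather than formulas, one can sample pairs \((z_1,z_2)\) from the distribution \(\cos^2(z_1+z_2)\) and histogram the photon \(1\) positions in two ways. This is only a toy Monte Carlo sketch: the restriction of both positions to \([0,\pi)\), the rejection sampling, the bin count, the seed, and the coincidence window `z2 < 0.2` are all illustrative assumptions.

```python
import math
import random

def sample_pairs(n_proposals, seed=0):
    """Rejection-sample (z1, z2) in [0, pi)^2 with density ~ cos^2(z1 + z2)."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_proposals):
        z1 = rng.uniform(0.0, math.pi)
        z2 = rng.uniform(0.0, math.pi)
        # Accept the proposal with probability cos^2(z1 + z2).
        if rng.random() < math.cos(z1 + z2) ** 2:
            pairs.append((z1, z2))
    return pairs

def histogram(values, n_bins=10):
    """Count values in n_bins equal bins covering [0, pi]."""
    counts = [0] * n_bins
    for v in values:
        counts[min(int(v / math.pi * n_bins), n_bins - 1)] += 1
    return counts

pairs = sample_pairs(200_000)

# All photons 1, regardless of what happened to photon 2.
all_photon_1 = histogram([z1 for z1, _ in pairs])

# Only photons 1 whose partner landed in a narrow window of z2.
coincident = histogram([z1 for z1, z2 in pairs if z2 < 0.2])

print("all events:       ", all_photon_1)
print("z2 < 0.2 selected:", coincident)
```

The first histogram should come out roughly flat, while the second, conditioned on the partner's position, should show clear maxima and minima.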

What happens in a particular experiment with some splitters (conversion), mirrors, half-transmitting splitters of one photon, photographic plates, and Yes/No detectors may be quite complex to calculate. But even if you don't immediately know what quantum mechanics predicts – I am in no way trained to quickly compute what happens in any experiment or its variation you tell me about, either – it's important to realize that quantum mechanics always allows you to calculate all the distributions. And when a particular choice of measured properties of the two photons is made, you may always interpret the result as a symptom of a correlation that has existed from the beginning.

Quantum entanglement is just the quantum description of a pure composite state that is "ready to display all the kinds of correlations that exist according to quantum mechanics". The entanglement existed from the beginning – when the two entangled photons were created. So the correlations in the predictions were there from the beginning, too. The only new thing in quantum mechanics is that physical systems may be probed by measurements of observables that usually don't commute with each other so you can't imagine that physical systems have all properties at the same moment (the uncertainty principle bans it). But once you know what you have measured, the procedure may be retroactively interpreted so that you can see that the correlations were there from the beginning.