It is generally agreed that quantum physics
(quantum mechanics and field
theory) is peculiar and unintuitive. Things just don’t happen the way we
expect. Why is this? There are several reasons, but here I want to focus
on one of the most important: how uncertainty is expressed in quantum
physics.

Now probability is a measure of uncertainty that is familiar to us, so let us
start with that. Before we can start thinking about probability, we need to
establish what it is. Probability theory assumes that there are a set number of
possible outcomes, and assigns a number to each of those outcomes which
in some way is related to the confidence we have of that outcome being
realised.

So we start with a set of states. For example, in a coin toss there are two states,
heads and tails. When rolling a six-sided dice, there are six states, numbered 1 to 6.
Each of these states represents one possible outcome. These are examples of discrete
states, but we can also expand probability theory to continuous states (for example,
any real number) by borrowing the concept of the infinitesimal from calculus.
The first axiom of probability is that we can describe the possible outcomes in
terms of such a set of states. The next axioms are the properties that
this set must have.

The basis states are irreducible, i.e. we can’t split them into smaller,
more fundamental, regions. For example, when rolling the dice, we
could combine the possible outcomes into three sets: {1,2},{3,4} and
{5,6}. We can perform computations using these sets, but they are
not irreducible because they can be split up into more fundamental
units, namely, {1},{2},{3},{4},{5},{6}. These outcomes are, however,
irreducible because they can’t be reduced any further.

The basis states are orthogonal, i.e. there is no overlap between them.
For the example of the dice, if we chose one set of results A as the outcomes
{1,2,3,4} and the set of results B as the outcomes {4,5,6}, then it would
be possible to be in both A and B at the same time (namely if we roll
a 4), i.e. the two sets overlap, so they are not orthogonal. In classical
probability, orthogonality means that you can't find yourself in two
different states after the same "roll".

The basis states are complete, i.e. every possible outcome is included.
The set of results {1},{2},{3},{4},{5} is not complete, because it is also
possible to roll a 6.

The basis is unique, i.e. there is only one natural way in which we can
describe the system. For example, in the coin toss the only way we can describe
the possible outcomes is to say that there is one state representing heads, and
the other representing tails. They are the only options.

We should ask how well this idea of a
limited number of possible outcomes and states fits in with different metaphysical
systems. After all, we want to eventually apply probability theory to make
physical predictions, so we are going to need to ask whether the assumptions
the physics is built on are consistent with the premises of probability.
These assumptions follow from our metaphysical thought.

Start with Aristotle. The idea of a set of basis states fits
very nicely with Aristotelian thought. Aristotle's philosophy is also built
around the idea that things can exist in several discrete states. These states
are called potentia (the two terms are not synonymous -- there is more to
potentia than just a listing of states -- but the potentia can be defined
so that they satisfy the properties we need for these basis states).
Each potentia implies the observable properties of the being in that state. If
you know the potentia, then you can compute the properties. Motion is movement
from one potentia to another.

Take the example of the dice roll. Each possible final resting place of the
dice is represented by one potentia. Each intermediate state after throwing it
is represented by another potentia. So, in our hand before we roll it, the
dice is in one potentia, which then moves through a sequence of different ones
as it passes through the air, until it comes to rest in its final state. The
motion of the dice through this sequence of potentia can be traced by
understanding the natural tendencies of each potentia (i.e. if it is in one
state, it has a tendency to move into another state in the next moment of
time), and also its interactions with the air and the surface of the table and
so on. Obviously in classical physics we would use Newton's laws to determine
which potentia will be occupied by the dice next; for the Aristotelian that is
just one possible way in which the details of actualisation can be described.

The Aristotelian can then take the various potentia that describe the dice at
rest, group them together according to the number shown, and those groups are
the basis states for the final outcome of the dice roll. He then would apply
standard probability theory to that basis, and perhaps derive the
probabilities based on his knowledge of how the dice was first thrown and how
it was held.

Aristotle requires these states to be orthogonal -- only one potentia can
be actualised at any given time, and they need to be complete, so every possible
motion is described by his system. He doesn't need them to be
irreducible or unique. For example, he allows for substances to be compound.
The potentia of a compound substance are not irreducible, because the substance
contains simpler parts, but neither is it just the sum of its parts. The form
of water is not just hydrogen and oxygen added together, but it has its own
distinct potentia and properties distinct from the two gases. Equally, I don't
see why Aristotle would need the potentia to be a unique basis (although
obviously Aristotle would not have thought about things in this way, so we can't
ask him). However, Aristotelian thought is not inconsistent with uniqueness and
irreducibility; it just doesn't require them.

What about more modern philosophies? Empiricism immediately has difficulties,
because these states are not directly observed from experiment. They are a
theoretical construct, and empiricism denies the possibility of theoretical
knowledge. Thus even if we construct such a basis of states as a book-keeping
convenience, we have no way to link them to actual properties of the real world.
But without that link, we can have no confidence that any predictions we make
would be applicable.

Nominalists are, I think, in even bigger trouble. To say that an object can
exist in different states is to say that all these potentia are related
together in some way. Together, they form a set. That set is a philosophical
universal, since it links different objects of the same type. But universals
are precisely what nominalists deny. Once again, the nominalist could say that
we just introduce these possible states as a convenience, but the same problem
of linking this convenience to the real world applies. Probability theory is
used to make predictions, and those predictions are of the form "How many outcomes
of a particular type will we find after we perform the experiment a large number
of times?" We are grouping together outcomes into types; those types are universals,
and that rules out nominalism.

The mechanical philosophy reduces the universe to just matter in motion.
Thus there are certain fundamental corpuscles of matter which can be described
by location, momentum, and maybe a few other properties. These corpuscles are
seen as being indestructible, and compound objects are simply the sum of their
parts. The laws of physics in mechanism are deterministic, and if one knew
precisely all the locations and momenta of all the particles in the universe
at one moment in time, then one could in principle predict where they are at
every other moment in time (however, since we don't know with certainty, there
is still room for seeking probabilistic answers). All of nature is described by
mechanical principles. Mechanism thus leads naturally to the axioms of completeness, uniqueness,
irreducibility, and orthogonality. Where it struggles is with the whole idea of
basis states altogether; a mechanist doesn't need to and therefore does not tend
to think in terms of these states. The state notation is not inconsistent with
the mechanical philosophy, but neither is it directly implied by it.

The idealist, who sees things perhaps as a giant computer simulation, is in the
opposite difficulty to the empiricist. He has no trouble drawing up the list of
states, but without some physical principle external to his intellect has
difficulty saying what it means for one of the states to be occupied (or the
potentia to be actualised). The intellectual system which the idealist says is
all there is can't be complete; there needs to be something beyond it to
make things actual.

So, of these five major philosophical systems, only the Aristotelian and the mechanical
seem consistent with the underlying assumptions of probability theory we have
discussed so far.
They directly imply some of the premises we have come up with so far, and don't
contradict the others.

In probability theory, we need to assign numbers to
each state which in some way represent our certainty or uncertainty of that particular
outcome coming to pass. So how do we assign a number to these states? This number is known as the
probability, and it is defined using four axioms.

The number is real (i.e. doesn’t have any contributions from the square
root of minus one in it).

The number assigned to each state is zero or greater.

The sum of the numbers assigned to each state is one.

The probability of achieving an outcome A or an outcome B, where A
and B are orthogonal states, is the probability of A plus the probability of
B.
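These four axioms can be checked mechanically. As a minimal sketch (the function name and tolerance below are illustrative choices, not from any standard library), here is how the first three axioms might be tested:

```python
# A sketch: check that an assignment of numbers to basis states satisfies
# the first three axioms of probability listed above. Function name and
# tolerance are illustrative assumptions.

def is_probability_distribution(assignment, tol=1e-9):
    """assignment maps each basis state to a number."""
    values = list(assignment.values())
    # Axiom 1: each number is real (no contribution from sqrt(-1)).
    if any(isinstance(v, complex) for v in values):
        return False
    # Axiom 2: each number is zero or greater.
    if any(v < 0 for v in values):
        return False
    # Axiom 3: the numbers sum to one.
    if abs(sum(values) - 1.0) > tol:
        return False
    return True

fair_dice = {face: 1 / 6 for face in range(1, 7)}
print(is_probability_distribution(fair_dice))               # True
print(is_probability_distribution({"H": 0.6, "T": 0.6}))    # False: sums to 1.2
```

The fourth axiom, additivity over orthogonal states, is automatic in this representation: the probability of a set of outcomes is simply the sum of the numbers assigned to its member states.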

These axioms lead to an intuitive result. Why? Because they are also satisfied by
a frequency distribution. Suppose that we roll the dice a large number of times, and
record how many times we get each outcome. At the end of the exercise, we divide
each number by the total number of rolls. This gives us a frequency distribution. The
frequency distribution ultimately arises from counting things, which is something we
naturally do.
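The exercise can be illustrated with a short simulation, assuming a fair six-sided dice modelled with Python's random module:

```python
# A sketch of the frequency distribution described above, assuming a fair
# six-sided dice simulated with the standard random module.
import random
from collections import Counter

random.seed(0)  # fixed seed so the run is repeatable
rolls = 60_000
counts = Counter(random.randint(1, 6) for _ in range(rolls))

# Divide each count by the total number of rolls.
frequencies = {face: counts[face] / rolls for face in range(1, 7)}

# The frequencies are real, non-negative, and sum to one -- they satisfy
# the same axioms as a probability distribution.
print(frequencies)
print(sum(frequencies.values()))
```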

The difference between probability and frequency is that probability is something
we compute while frequency is something we measure. Some people, as is well known,
confuse the two. They treat a probability distribution as the limit of a frequency
distribution after an infinite number of measurements. There are many reasons why
this is foolhardy, but to my mind the most important is that by confusing a
probability distribution with a frequency distribution, they lose the ability to
compute the distribution in advance. Frequentists have, by defining the
term probability to be a virtual synonym of frequency, denied
themselves the use of their reasoning abilities. We have two tools available to us:
reason and experience (alternatively theory and experiment); and frequentists have
basically said from the outset that it is impossible to obtain theoretical knowledge.
This is the philosophy of empiricism (or perhaps a particularly hard form of
empiricism), and it is refuted by the simple observation that theoretical
physicists have correctly calculated things before they were measured by
experiment.

Now, if probability is a form of computation, it is a form of reasoning. Every
chain of reasoning goes back to premises, axioms and definitions; and ultimately
these can’t be proven by reasoning alone. So every probability is conditional upon the
assumptions built into the model used for the calculation. Let us say we are tossing a
coin. If we make the assumption that the coin is balanced, we will assign a
probability of 0.5 to it coming up heads. If it is weighted in a particular way, we will
assign a probability of 0.75. Both of these calculations of the probability are perfectly
legitimate, even though they give different results. However, once we take a particular
coin, it is in practice weighted in a particular way. We can measure how it is
weighted, and calculate from this a probability for it coming up heads. We can
then test our calculation (and therefore the premises it was built on) by
performing the experiment a large number of times, and comparing our prediction
against the result. More precisely, we calculate the probability of the measured
result occurring under various different premises. Using Bayes' theorem,
we can then extract the probability of each set of premises about the
nature of the coin, given the observed set of measurements.
Induction thus cannot lead to certainty that one set of premises is correct, but
it can allow us to systematically express how certain we should be about
them all. This allows us to eliminate numerous possibilities for all practical
purposes.
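As a sketch of this procedure, suppose (hypothetically) that we observe 72 heads in 100 tosses and compare the two premises above, a balanced coin and a coin weighted to give heads three times in four, starting from equal prior confidence in each; the numbers are illustrative assumptions:

```python
# Bayes' theorem applied to two competing premises about a coin.
# Observed data (72 heads in 100 tosses) and the 50/50 prior are
# illustrative assumptions, not taken from the text.
from math import comb

def likelihood(p_heads, heads, tosses):
    # Probability of the observed number of heads under a given premise.
    return comb(tosses, heads) * p_heads**heads * (1 - p_heads)**(tosses - heads)

heads, tosses = 72, 100
premises = {"balanced": 0.5, "weighted": 0.75}
prior = {name: 0.5 for name in premises}

# Bayes' theorem: P(premise|data) is proportional to P(data|premise) P(premise).
unnormalised = {name: likelihood(p, heads, tosses) * prior[name]
                for name, p in premises.items()}
total = sum(unnormalised.values())
posterior = {name: u / total for name, u in unnormalised.items()}
print(posterior)
```

The posterior concentrates almost entirely on the weighted-coin premise. Neither premise is proven with certainty, but one of them is eliminated for all practical purposes, which is exactly the role of induction described above.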

Probability is thus a numerical measurement of uncertainty parametrised in such
a way that it follows the same rules as counting states. And since counting is
everybody’s first introduction to numbers and how most people naturally think about
numbers, probability is thus a very natural and intuitive way of expressing
uncertainty. Anything else would just be weird.

So let us consider the standard quantum experiment, the two-slit experiment. We
have a source S which emits a beam of particles, towards a barrier with two slits, A
and B. The particles then hit a screen at position X. Our understanding of the initial
conditions and the physics is denoted as p. It doesn’t matter for this example what
that physics is. It could be Newton's physics, Aristotle's, Einstein's, or something
entirely different; the only stipulation is that it accurately describes the paths of the
particles.

Now if the particle passes through slit A, there is a certain probability that it will
hit the screen at position X, which we denote as P(X|Ap). This is expressed as a
probability rather than a certainty; maybe we are uncertain about the initial
conditions or the mass of the particle or the forces acting on it, so we are not certain
about the final result X. Equally the probability that the particle will hit X
after passing through slit B is P(X|Bp). The probability that it will pass
through slit A is P(A|p) and the probability that it will pass through slit B is
P(B|p).

From Bayes’ theorem, we write

P(X|Ap)P(A|p) = P(A|Xp)P(X|p)

P(X|Bp)P(B|p) = P(B|Xp)P(X|p)

Add these two equations together, and we get,

P(X|Ap)P(A|p) + P(X|Bp)P(B|p) = [P(A|Xp) + P(B|Xp)]P(X|p)

We now use the definition of probability to write that,

P(A|Xp) + P(B|Xp) = P(A∪B|Xp)

where P(A∪B|Xp) denotes the probability that it passes through either slit A or slit
B if it hits the screen at X given the initial conditions and physics denoted by p.
But it must pass through either slit A or slit B, so this probability is just
1.

The end result of this is

P(X|p) = P(X|Ap)P(A|p) + P(X|Bp)P(B|p)        (1)

which basically states that the probability of the particle hitting the screen at X is
the sum of the probability of it hitting the screen at that location after passing
through slit A and the probability of it hitting the screen at that location after
passing through slit B.

Probabilities can be used to predict the frequency distribution. So suppose that
we are firing 1000 tennis balls out of our source. Our screen consists of a series of
buckets, and P(X) represents the probability that the ball goes into a particular
bucket. Suppose that our theory predicts that half of them pass through
slit A and half of them pass through slit B. We confirm these predictions
by performing the experiment and counting how many balls pass through
each slit. It comes close enough to a 50-50 split that we believe our theory.
Suppose further that the theory predicts that every 4 balls out of 500 that pass
through slit A hit the screen at position X and every 10 balls out of 500 that
pass through B hit the screen at position X. We confirm this by covering up
slit A, firing 1000 balls and counting how many hit the screen at position
X, and it comes close enough to the predicted number that we accept the
theory. We cover up slit B, do the same thing, and again everything looks
good.
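The predicted count is just equation (1) applied to the tennis-ball numbers above; a quick check of the arithmetic:

```python
# Equation (1) applied to the tennis-ball figures from the text.
p_A = 0.5               # half the balls pass through slit A
p_B = 0.5               # half the balls pass through slit B
p_X_given_A = 4 / 500   # 4 of every 500 balls through A land at X
p_X_given_B = 10 / 500  # 10 of every 500 balls through B land at X

# P(X|p) = P(X|Ap)P(A|p) + P(X|Bp)P(B|p)
p_X = p_X_given_A * p_A + p_X_given_B * p_B
print(p_X * 1000)  # expected hits at X per 1000 balls (approximately 14)
```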

Now we are ready to perform the actual experiment. Our theory predicts that on
average we should expect to get 14 balls hitting X. We leave both slits uncovered,
and get ready to do the experiment. Obviously, we don’t have that many samples
(1000 isn’t a big enough number for probabilities and frequencies to converge), so we
wouldn’t expect to get the predicted result if we just do the experiment once. So we
will set up to perform the experiment several thousand times and take the average.
That average should be close enough to the expected result of 14 that we confirm the
theory.

And if we performed this experiment with tennis balls, we would get a result near
enough to 14 that we believe the theory.

But now let us replace the tennis balls with electrons. Again, we predict and
confirm that half the electrons go through slit A and half through slit B; cover up slit
B and 8 out of a thousand electrons go into X, and so on. The setup works out in
exactly the same way as with the tennis balls.

So, when we perform the full experiment, we confidently expect to get the result
14.

Except we don’t. We find that the answer is 6.

Now this is weird. 4 fewer electrons hit X when both slits are open than when slit
B alone was open. It just doesn’t make sense to our intuition.

Could it be that there is some difference in the physics that describes electrons
and the physics that describes tennis balls? Some difference hidden within the
condition p that makes all the difference? Certainly the physics describing electrons is
very different from the physics describing tennis balls. But that is completely
irrelevant. We didn’t need to know what p was in the algebra above; and given that
we have individually confirmed each of the numbers on the right hand side of
equation (1) (to a good enough accuracy) no matter what p is, as long as P(A|p) = ½
and so on, our prediction for the fraction of electrons hitting the detector at X,
P(X|p) cannot be affected. That number just comes from the mathematical laws of
probability.

So what we have to conclude is that standard probability theory is wrong, or at
least that it doesn’t apply in quantum physics. At least one of the assumptions of
probability theory listed above is invalid. And this is why quantum physics is
weird.

Quantum physics is still a mathematical representation of reality. We still want to
represent our uncertainty for each possible outcome with a number. But
that number is not a probability. It violates the axioms of probability. It is
therefore not proportional to an expected frequency after a huge number of
measurements.
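To give a flavour of how a number that violates the probability axioms can produce the observed result, here is a minimal sketch using the standard quantum rule: each path is assigned a complex amplitude, the amplitudes for the two slits are added, and the squared magnitude gives the observed fraction. The phase below is an illustrative assumption chosen to reproduce 6 hits per 1000; the detailed discussion is deferred, as stated below.

```python
# Complex amplitudes in place of probabilities. The joint figures per
# particle come from the covered-slit runs in the text; the relative
# phase is an illustrative assumption tuned to give 6 hits per 1000.
import cmath
import math

p_through_A = 0.004   # fraction hitting X via slit A (4 per 1000)
p_through_B = 0.010   # fraction hitting X via slit B (10 per 1000)

amp_A = math.sqrt(p_through_A)
# Choose the relative phase so that the interference term gives 0.006.
phase = math.acos((0.006 - p_through_A - p_through_B)
                  / (2 * math.sqrt(p_through_A * p_through_B)))
amp_B = math.sqrt(p_through_B) * cmath.exp(1j * phase)

# Classical probability: the numbers add.
print(1000 * (p_through_A + p_through_B))   # approximately 14 per 1000
# Quantum rule: the amplitudes add, then we square the magnitude.
print(1000 * abs(amp_A + amp_B) ** 2)       # approximately 6 per 1000
```

The cross term between the two amplitudes is what has no counterpart in classical probability: it can be negative, so the combined result can be smaller than either contribution alone.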

Our intuition concerning uncertainty is based on counting discrete objects. We
can’t really imagine anything else. We can’t imagine quantum physics. But we can
still understand it intellectually, in the same way as always: we list premises and work
through to conclusions.

So which of the axioms of probability listed above are violated in quantum theory?
I’ll discuss that next time.
