One of the fundamental challenges in the philosophy of science is the problem
of induction. (Philosophical) induction is the process of trying to
uncover premises from conclusions. Deduction is the opposite, reasoning
from premise to conclusion. Deduction is secure; we can, with correct
logic, always say that if the premise is true, then so is the conclusion.
But the problem is that the opposite is not always the case: contradictory
premises can lead to the same conclusion. Thus, even if the conclusion
of the scientific argument is correct, one cannot be certain that the
induced premises are right. Some other set of ideas could lead to the same
predictions.

One of the goals of science is to come to an understanding of how the
universe evolves. The question of why is ultimately one of philosophy,
although scientists can guide and give hints to the philosophers. In short,
we want a theory that can make accurate predictions of how physical objects
will behave in certain circumstances. The primary tools available are
abstraction and experiment. We first of all abstract physical qualities,
and present them in a numerical form. We then make measurements under
controlled circumstances, and measure how those numbers change as the
circumstances change. Obviously, this means that science is restricted to
what can be abstracted numerically; but as Pythagoras, Plato and
Bradwardine originally hoped, and we can now confirm, that includes a
great deal of what we seek to understand. Not everything, but a large
proportion of it.

The early scientists worked by making careful measurements, plotting them
on a graph, and finding a mathematical function which fit the data. We
can think of things such as Boyle's law of gases, Snell's law of
refraction, Kepler's laws of the solar system, Hooke's law of extension
under tension, and so on. These mathematical functions are then used to
predict how similar materials will behave in other circumstances than in
the laboratory. This process is, of course, pure induction. It is a little
complicated because the experiments don't have perfect precision, and
therefore neither do the formulae used to fit to them, so there is always
some uncertainty in the final result and thus any predictions. However,
it is known how to deal with this uncertainty; instead of a definite
result, we express it as being within certain bounds, and use the
mathematics of probability to deal with the consequences. This is
important, but it doesn't undermine the overall approach.
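As a rough sketch of what fitting a law "within certain bounds" can look like, the snippet below fits Hooke's law, F = k x, to a handful of readings by least squares and reports the spring constant with a one-sigma uncertainty. The readings are made-up numbers chosen purely for illustration, and the function name `fit_hooke` is hypothetical.

```python
import math

def fit_hooke(extensions, forces):
    """Least-squares fit of F = k * x (a straight line through the origin).

    Returns the spring constant k and its one-sigma uncertainty.
    """
    sxx = sum(x * x for x in extensions)
    sxf = sum(x * f for x, f in zip(extensions, forces))
    k = sxf / sxx
    residuals = [f - k * x for x, f in zip(extensions, forces)]
    # One fitted parameter, so n - 1 degrees of freedom.
    variance = sum(r * r for r in residuals) / (len(extensions) - 1)
    return k, math.sqrt(variance / sxx)

# Synthetic, made-up readings (metres, newtons), purely for illustration.
extensions = [0.01, 0.02, 0.03, 0.04, 0.05]
forces = [0.52, 0.97, 1.55, 1.98, 2.51]

k, sigma_k = fit_hooke(extensions, forces)
x_new = 0.035  # an extension inside the measured range
print(f"k = {k:.1f} +/- {sigma_k:.1f} N/m")
print(f"predicted force at {x_new} m: {k * x_new:.2f} +/- {sigma_k * x_new:.2f} N")
```

The prediction comes out as a range rather than a single number, which is all the uncertainty changes about the method.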

But now we come to the problem. How do we know that we fit to the right
mathematical function? All we know is that the function runs through the
points we measured; it need not be the simplest form that does so.
Interpolation (guessing where the line runs between the measurements)
is usually reasonably safe, although not always. Extrapolation (guessing
the course of the line beyond the points), on the other hand, is
occasionally OK, but far more often very dangerous. Induction of this sort,
without a prior theory to guide the extrapolation (which, of course, if
we have no other principles than induction from experiment to guide us,
we don't have), should never be relied on. There is just too high a risk
of missing something important by selecting too simple a curve to explain
the data. If the data looks boring, it might be that nature is boring,
but more likely it is because we have missed something important.

Consider, for example, the plot above. This is a resonance plot, common
in particle physics. The experimental set-up is that a pion and a proton
are shot at each other with various different centre of mass energies. A
detector is set up pointing at the point of collision, but with the line connecting
it to the collision at a large angle from the directions of the beam.
So if the particles pass by each other without interacting, then nothing
will hit the detector. But they might "collide" with each other; and if
they do they can scatter at various angles, and some of these angles will
be just right to hit the detector. Of course, the theory that describes this
scattering is relativistic (Dirac) quantum mechanics rather than a Newtonian
picture, so we can expect things to be a little bit weird when we go down
to the details, but the underlying idea is similar.

The cross-section, on the
y-axis, counts how many particles hit the detector. The x-axis gives the
centre of mass energy of the incoming beam. The black dots are the
actual experimental measurements that were made in this real life example.
But, since I am using this as an example of the perils of induction,
suppose that the experimenters
were a little less meticulous and only measured at the energies where I have
superimposed the squares on the image.

The shape of the curve is of interest
because peaks in the height of the curve occur at just the right energies
for a new particle to be created. The energy of the incoming beam
corresponds to the mass of the new particle, in this case a Delta baryon.
Everywhere else, the pion and proton just bounce off each other, similar
to what we would expect in classical physics. Occasionally they are
deflected from their original path, and will hit the detector with a
certain frequency. That frequency has a small dependence on the energy.
But when they have just the right energy (or near enough), something
different can happen.
The pion and proton combine to form a Delta baryon. Because all the energy
goes into the mass of the Delta (and the total momentum is zero),
the new particle will be more or less stationary. Then, a short time later,
the Delta decays back into a pion
and proton. These can emerge from it over a wide range of directions, including large angles,
and therefore more
particles will hit the detector than in the case of the standard
collision. Thus we get a peak in the cross-section at the energy which
corresponds to the mass of the Delta.

Obviously the experimenters here were very meticulous and made a lot
of measurements. But suppose instead that they had only made measurements
corresponding to the blue squares, and had no theory to guide them. What would
they conclude, if they used the standard methods of induction?
The curve up to that point looks very smooth. They would
fit it with a quadratic equation or something similar, and come up
with a wonderful "law of scattering" which completely misses all of the
exciting stuff. They would conclude that there is no evidence for creation
of a Delta baryon. And that is the problem with extrapolation. A curve that
looks simple within the range of measurement need not stay simple outside it.
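The danger can be sketched numerically. The toy cross-section below is a gentle background plus a single resonance peak; the peak position, width and height are illustrative values, not a fit to the real data. A quadratic "law" is drawn exactly through three points sampled well away from the peak and then extrapolated back to the peak energy.

```python
def cross_section(energy):
    """Toy cross-section: smooth background plus one Breit-Wigner-style peak.

    The peak position (1.232), width (0.12) and height (180) are
    illustrative assumptions, not values fitted to any real data set.
    """
    background = 20.0 + 5.0 * energy
    mass, width = 1.232, 0.12
    peak = 180.0 * (width / 2) ** 2 / ((energy - mass) ** 2 + (width / 2) ** 2)
    return background + peak

def quadratic_through(points):
    """Return the unique quadratic passing through three (x, y) points."""
    (x0, y0), (x1, y1), (x2, y2) = points

    def curve(x):
        # Lagrange form of the interpolating polynomial.
        return (y0 * (x - x1) * (x - x2) / ((x0 - x1) * (x0 - x2))
                + y1 * (x - x0) * (x - x2) / ((x1 - x0) * (x1 - x2))
                + y2 * (x - x0) * (x - x1) / ((x2 - x0) * (x2 - x1)))

    return curve

# "Measurements" taken only at energies well away from the peak.
sampled = [(e, cross_section(e)) for e in (1.8, 2.0, 2.2)]
law = quadratic_through(sampled)

at_peak = 1.232
print(f"extrapolated: {law(at_peak):.1f}, actual: {cross_section(at_peak):.1f}")
```

The fitted curve agrees perfectly with every point it was given and is still badly wrong at the one energy that matters.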

Even interpolation doesn't always help. Suppose that they decide to take
a few more measurements at lower energies. Most of the time they will hit
the peak structure, which might give them a hint that something strange is
happening. But they could be unlucky, and measure at the energies
corresponding to the red squares. Again they will come to completely the
wrong conclusion.

Obviously in real life the experimenters took enough measurements to map
out the real shape of the curve. But, even so, there is always a small
distance between the data points; one can't measure at every possible energy.
How can they know that they haven't missed exciting
variations from the simple curve between those measurements?

And this is the problem of induction, and why experiment alone is never
enough. If you see nothing exciting, you can never be sure that you
haven't missed out on something crucial between the gaps of your
measurements. You can measure more, but it will never be enough to provide
the needed surety.

Of course, the situation is improved by the concept of falsification.
While you can never be sure that your proposed answer is right, you can
at least be certain that a large number of possible answers are wrong.
There is a process of elimination. And that is good, as far as it goes.
But it is still not enough to make us certain.

Even better is the method of prediction: you make predictions from your
model outside the range of your data, or filling in the gaps of your data.
If the predictions are wrong, you know your model is wrong. If they are
right, then you can be more confident. The important point with prediction
is that there is no possible way in which you can use the new data to
help shape your model; with old data there is always the possibility to
jig the model until it fits the data. With prediction, you at least get
your theory out in the open, and will either look a fool or a genius
depending on the result of the next experiment. But even this method can't
give us certainty. See the red squares above.

So induction is good, but not good enough. But fortunately, physicists
today don't need to rely on induction alone, because we have something
else to guide us. We have advanced enough to, in part, use deduction.

Before going further, I need to discuss the means by which physicists
abstract from the real world to their representation of it. I think the
best way of thinking about it is in terms of the mathematical process of
mapping. A map is just a connection between two lists of data. For example,
we can have a map between fruits and colours; on one hand we have a list
of fruit and on the other a list of colours, and the mapping connects
each fruit in the list with a colour. Some mappings are many to one,
others one to one. A one to one mapping is always reversible. One can also
map from a subset of set A to the entirety of set B. In this
case, every item in set B can be used to directly represent
something in set A, though some aspects of A are not
covered by the representation.
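A minimal sketch of the fruit example, using Python dictionaries as the maps. The particular fruits and the helper names `is_one_to_one` and `invert` are hypothetical, chosen only to show why a one to one map can be reversed and a many to one map cannot.

```python
# A many-to-one map: several fruits share a colour.
fruit_colour = {"banana": "yellow", "lemon": "yellow", "cherry": "red"}

# A one-to-one map: every fruit has its own colour.
fruit_colour_1to1 = {"banana": "yellow", "lime": "green", "cherry": "red"}

def is_one_to_one(mapping):
    """True if no two keys are sent to the same value."""
    return len(set(mapping.values())) == len(mapping)

def invert(mapping):
    """Reverse a one-to-one map; refuse if information would be lost."""
    if not is_one_to_one(mapping):
        raise ValueError("map is many to one, so it cannot be reversed")
    return {value: key for key, value in mapping.items()}

print(is_one_to_one(fruit_colour))
print(invert(fruit_colour_1to1)["green"])
```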

For example, in physics, we create a mapping between physical locations
and moments in time and points in a geometrical space. To be successful,
this assumes certain things about physical space/time, but the success
of physical theory suggests that the assumption is not unreasonable.
We could just work with the geometrical representation, but it is
easier to perform a second mapping between the geometrical space and an
algebraic space. It is easier to perform algebraic calculations than
working in pre-Cartesian geometry. This second mapping assigns a set of
numbers to every point in the geometrical space, and consequently these
numbers also represent physical space time. However, there is no unique
way of performing this second mapping. We have to create a rule relating
each point in space/time with a set of numbers. This rule is called
a coordinate system; it is an arbitrary choice we have to make in the
construction.

We don't only map points in space time to an algebraic representation;
we can identify the states or potentia of physical particles, and create
an algebraic representation of those states and the properties of
particles inhabiting those states.
With physical particles existing in space/time, there is clearly going
to be a relationship between the algebraic representation of the
particles and the algebraic representation of space time.

Once we have the algebraic representation, we can then crank the handle,
do a bit of mathematics, and calculate other qualities of the particles.
This calculation depends on certain laws and rules, which we call
the laws of physics. These laws connect one point in the representation
to another.
Maybe we calculate where the particle will be located at some future
time, or how it might interact with other particles, or some of its
observable properties, and so on. This is still all done in the algebraic
representation of reality.
But because there is a one to one mapping between the algebraic
representation and the physical reality, we can use the same mapping that
got us from reality to the premises of our calculation, to return from
the calculation's conclusions in the representation back to physical
reality. Every point in the abstract algebraic space corresponds to
something in the real world. That then enables us to compare against experiment.

The coordinate system is something we introduced as part of the
abstraction. It is not part of the real world. Therefore our final
results when we go back to the real world cannot depend in any way on
our choice of coordinate system. This limits what the laws of physics
could be.

Now there are two ways in which physical results could be coordinate
independent:

(1) The fundamental equations of the laws of physics appear different in
different coordinate systems. This gives a weak condition on the laws.

(2) The fundamental equations representing the laws of physics are the
same whatever coordinate system we choose. This gives a strong
condition on what the laws could be.

If we take the second option, that doesn't mean that everything will be
the same in every coordinate system. The locations of particles,
the numbers we give to distinguish their different states or potentia,
for example, would still differ from one coordinate system to another.
What it would mean would be that (for example) whatever equivalent we
have for Newton's laws of motion would be
the same no matter which coordinate system we choose.

For example, Newton's laws of motion are the same for coordinate systems
which are distinguished by a translation. If I choose one coordinate system,
and my friend another which only differs from mine in that his numbers are
always five greater than mine, then, if Newton's laws of motion describe
physics as I parametrise it, they would also describe physics as my
friend observes it. However, there are other possible ways in which the
two coordinate systems can differ where Newton's laws are not invariant.
For example, suppose my friend associates each point with a number that is
the cube of the number I use. Newton's second law cannot then
simultaneously be the right law of physics in both my and my friend's
coordinate systems. However, there are other
possibilities, such as Einstein's theory of general relativity, whose key
equations are independent of the choice of coordinate systems, and can
cope with weird things such as accelerating reference frames or cubed
coordinates.
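A small numerical check of the two cases, assuming a single force-free particle moving in one dimension (the starting point and speed are arbitrary choices). With no force, Newton's second law says the acceleration should be zero. That holds in my coordinates and in the shifted ones, but not in the cubed ones.

```python
def acceleration(position, t, dt=1e-3):
    """Second time derivative of position(t), by central differences."""
    return (position(t + dt) - 2.0 * position(t) + position(t - dt)) / dt ** 2

def mine(t):
    """A force-free particle in my coordinates: constant velocity."""
    return 1.0 + 2.0 * t

def shifted(t):
    """My friend's coordinate when his numbers are five greater than mine."""
    return mine(t) + 5.0

def cubed(t):
    """My friend's coordinate when his numbers are the cube of mine."""
    return mine(t) ** 3

for label, path in (("mine", mine), ("shifted", shifted), ("cubed", cubed)):
    print(label, round(acceleration(path, 1.0), 3))
```

In the cubed coordinates the same particle appears to accelerate with no force acting on it, so "force equals mass times acceleration" cannot be the law in both descriptions at once.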

Of course, Newton's laws are a bit passé amongst physicists. Today,
we tend to think of things in terms of principles of least action.
The action is a quantity that is a function of the positions and velocities
of all the particles in the universe, at every point in space and every
moment in time. If you move a particle from one location to another, then
the action will take a different value. The route by which particles
travel through the universe also affects the value of the action. The
principle of least action, used in pre-quantum physics, states that the
route that the particles take is the one that minimises the action.
In general the value of the action will also depend on which coordinate
system we choose.

Now we haven't yet specified what the mathematical form of the action is,
but that's OK, because we are still talking about general principles
rather than going into specifics. There is one mathematical form of the
action which
leads to a set of physical laws identical to Newton's laws of motion;
take another mathematical form, and minimising the action will
give you Einstein's equivalents. Basically, whatever classical
(deterministic) equations of physics you can conceive of, there is some
action whose minimisation within a given coordinate system will be equivalent to
those laws.
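To make this concrete, here is a minimal sketch assuming the simplest textbook case: a single free particle in one dimension, whose action is the kinetic energy summed over time. That choice of action, and the sine-shaped detour used to bend the path, are assumptions made for illustration. Every path starts and ends at the same places; only the route in between differs.

```python
import math

def free_action(path, total_time, mass=1.0):
    """Discretised action of a free particle: sum of kinetic energy * dt."""
    dt = total_time / (len(path) - 1)
    return sum(0.5 * mass * ((b - a) / dt) ** 2 * dt
               for a, b in zip(path, path[1:]))

def make_path(start, end, steps, wiggle):
    """Straight line from start to end plus a sine bump that vanishes at both ends."""
    return [start + (end - start) * i / steps
            + wiggle * math.sin(math.pi * i / steps)
            for i in range(steps + 1)]

straight = make_path(0.0, 1.0, 100, wiggle=0.0)
for wiggle in (-0.2, -0.1, 0.0, 0.1, 0.2):
    path = make_path(0.0, 1.0, 100, wiggle)
    print(f"wiggle {wiggle:+.1f}: action = {free_action(path, 1.0):.4f}")
```

The straight, constant-velocity route gives the smallest action of the family, which is Newton's first law recovered from a minimisation.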

The action also plays a role in quantum physics. Classical physics is
deterministic: each particle can only take one route from A to
B. In quantum physics, on the other hand, the particle can take
any of a large number of paths to get from one place to another. We don't
know which path it is going to take, so when considering how likely it is
that a particle starting at A will finish at B we need to
consider every possible path. However, the routes are not all of
equal likelihood. What I call the likelihood (and everyone else the
amplitude) for each path depends on a mathematical function, which is
related to the action (the weight is the complex exponential of the
action; in quantum physics this serves as the definition of the action).
So in quantum physics the action you get when you suppose
that a particle takes one particular path from
A to B is related to the likelihood
that the particle will take that path. It is most probable that the
particle will take a path close to the minimum of the action (the
likelihood of every path has the same magnitude, but the likelihoods of
paths far from the minimum tend to cancel each other out), which is
why classical physics is such a good approximation to the real thing.
But once we get down to a small enough length scale, where quantum
effects dominate, that is no longer the case.
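In symbols, and assuming the usual convention in which the action S is multiplied by i and divided by the reduced Planck constant before exponentiating (the text above says only "complex exponential"), the likelihood and the probability for a particle to go from A to B can be written as:

```latex
\[
  \mathcal{A}(A \to B) \;=\; \sum_{\text{paths } x(t)} e^{\, i S[x] / \hbar},
  \qquad
  P(A \to B) \;\propto\; \bigl| \mathcal{A}(A \to B) \bigr|^{2}.
\]
```

Every term in the sum has magnitude one. Only the phase differs from path to path, and the probability comes from the squared magnitude of the total.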

And now we come, at last, to symmetry.

Take a piece of paper, and draw grid lines on it representing Cartesian
coordinates. Place a square tin on the paper, and draw a line around
its base. Now rotate the tin around its centre through any angle that
isn't a right angle or a multiple of right angles. Draw another line
around its base. The two lines will not coincide.

Now repeat the process with a cylindrical tin. This time, as long as you
find the exact centre of the tin, you will find that the line you draw
around its circumference will not change after the rotation.

That's all fairly obvious and hardly seems revolutionary, but mathematicians have taken this idea,
and used the word symmetry to describe it. Something has a symmetry if,
after a particular non-trivial transformation such as
a rotation, it remains unchanged. The cylinder has a circular symmetry.
The square doesn't, but is only symmetric for rotations through right
angles.

Instead of rotating the cylinder, we can get the same effect by rotating
the paper in the opposite direction. Whether we rotate the paper or the
square tin, we will still get the same pattern of lines drawn on the paper
at the end of the day. The paper has grid-lines representing
our coordinate system. So a rotation of the paper represents a change of
the coordinate system. So we can equally say that the line representing
the circumference of the cylinder is unchanged under a rotation of the
coordinate system. So something has a symmetry if its mathematical
form is unchanged after a transformation of the coordinate system.
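The tin experiment can be sketched in a few lines. The outlines are written as equations (a circle and a square, both centred on the origin), a few points on each are rotated, and we check whether they still satisfy the same equation. The sample points, tolerances and function names are arbitrary choices made for illustration.

```python
import math

def rotate(point, angle):
    """Rotate a point about the origin (equivalently, turn the axes the other way)."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def on_circle(point, radius=1.0):
    """Outline of the cylindrical tin: x^2 + y^2 = r^2."""
    x, y = point
    return math.isclose(x * x + y * y, radius ** 2, abs_tol=1e-9)

def on_square(point, half_side=1.0):
    """Outline of the square tin: max(|x|, |y|) = half the side length."""
    x, y = point
    return math.isclose(max(abs(x), abs(y)), half_side, abs_tol=1e-9)

def is_symmetric(outline_points, on_outline, angle):
    """True if every rotated outline point still lies on the same outline."""
    return all(on_outline(rotate(p, angle)) for p in outline_points)

circle = [(math.cos(a), math.sin(a)) for a in (0.0, 0.7, 1.9, 4.0)]
square = [(1.0, 0.3), (-0.5, 1.0), (-1.0, -1.0), (0.2, -1.0)]

print(is_symmetric(circle, on_circle, math.radians(30)))
print(is_symmetric(square, on_square, math.radians(30)))
print(is_symmetric(square, on_square, math.radians(90)))
```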

Now in physics, the coordinate system is something we choose when we
construct the mapping from the physical system to the algebraic
representation. Which coordinate system we select is entirely arbitrary;
we only have to be consistent throughout the calculation.

But that isn't much use. Suppose that I select a coordinate system,
X. And my friend Bob selects a different coordinate system,
Y. If we are to communicate with each other, we need some means
of converting from one coordinate system to another. But that is always
possible, through some mathematical transformation; this transformation
could be a rotation, as in my example with the cylinder, or something
more complicated. But (barring certain assumptions related to the topology
and dimensions of the space; if we are both correctly mapping from reality
these caveats aren't an issue) this is always possible.

But Bob and I might find that some of our equations have the same
mathematical form despite our different coordinate systems. Something
unchanged under a change in coordinate systems is a symmetry, so in this
sense we can say that this part of the laws of physics is symmetric.

Now the most important thing in physics is the action, which as I said
is related to the likelihood that a particle will take a particular path
from A to B. Now the likelihood is related to the
probability, and the probability is a count of how many particles arrive
at B. This is something we measure in the real world; it transfers
directly from our abstraction to reality. But if the likelihood is
something we can (indirectly) measure, then it can't depend on our choice of
coordinate system. Which means that the action also can't depend on
our choice of coordinate system. Which means that the action must
be symmetric under a wide range of coordinate transformations.
And there are very few specific mathematical forms for the action
which satisfy these criteria. Thus we know that the true laws of physics must be
one of those handful of options. And the only experiment I needed to
perform to reach this conclusion was the one that told me that the world
is quantum rather than classical, and that uncertainty is parametrised by
likelihoods rather than probability.
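As a toy illustration of how the requirement prunes the options, the sketch below evaluates two candidate actions for one particle moving in a plane, first in one set of axes and then in axes rotated by 40 degrees. Both candidates, the path, and the angle are assumptions invented for the example; neither is claimed to be a real law of physics. The action built from a potential that depends only on distance from the origin gives the same number in both sets of axes. The one that singles out the x direction does not.

```python
import math

def rotate(point, angle):
    """Coordinates of the same point after turning the axes."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

def action(path, dt, potential, mass=1.0):
    """Discretised action: sum of (kinetic - potential) * dt along a 2D path."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(path, path[1:]):
        speed_sq = ((x1 - x0) ** 2 + (y1 - y0) ** 2) / dt ** 2
        mid = ((x0 + x1) / 2, (y0 + y1) / 2)
        total += (0.5 * mass * speed_sq - potential(mid)) * dt
    return total

def central(point):
    """Candidate potential that depends only on distance from the origin."""
    x, y = point
    return 0.5 * (x * x + y * y)

def lopsided(point):
    """Candidate potential that singles out the x direction."""
    x, _ = point
    return x

dt = 0.01
path = [(1.0 + 0.5 * i * dt, (i * dt) ** 2) for i in range(101)]
turned = [rotate(p, math.radians(40)) for p in path]

for name, potential in (("central", central), ("lopsided", lopsided)):
    print(name, action(path, dt, potential), action(turned, dt, potential))
```

If the axes are merely bookkeeping, the lopsided candidate is ruled out without any further experiment, since its value depends on how the axes were drawn.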

Now obviously I have simplified this argument for this post, and the full
story, once we go
into the mathematical details, has a few more complications than I have implied.
In particular,
likelihood is not the same as the probability we measure, and there are
transformations of the likelihood which leave probability untouched; which
means that the result is a tiny bit, but only a tiny bit, weaker than I
implied. Additionally,
we don't just have
to consider the likelihood we assign to each path, but the measure of the
integral we use to sum over those likelihoods. I also haven't discussed
the most important types of coordinate transformation (gauge transformations),
nor have I mentioned
the distinction between local and global transformations, which is
important and makes it harder to rule out option (1) of a coordinate
independent physics. But these complications can be overcome. The basic
principle remains: the action is one of the handful of options
which satisfy the symmetries.

And once we know what the action is, we have an almost complete theory
of physics. Everything else is just working out the consequences
implied by that
action, and checking them against experiment.

I have made certain assumptions along the way (such as that the laws of
physics are quantum rather than classical, the assumptions behind the
argument that the action must be symmetrical as I have described,
and that at least part of reality can be represented algebraically in the
manner described),
and this procedure is not
completely limiting. For example, it can't tell us how many types of
fundamental particle there are, or how strongly they interact with each
other. These things we have to measure. But what this tells us is if
these assumptions are true, then the laws of physics must be one of a
very limited number of options.

In summary, our knowledge of physics now, not entirely but to a large
extent, arises from arguing from a set of premises which almost seem
self-evident (or follow directly from qualitative observation) to a conclusion. In
short, theoretical physics has advanced sufficiently that deduction
rather than induction plays the leading role. And this is all because of
our understanding of the importance of symmetry.

I have only scratched the surface of the role that symmetry plays in
modern physics; constraining the possible actions (and Lagrangians) is
just one of its many roles. In both general relativity and
quantum field theory, almost everything boils down to some aspect of the
theory of symmetry or another. But I want to stop here, because this is
the particular aspect of symmetry I will need for subsequent posts.
