arXiv:0708.0052v1 [physics.gen-ph] 1 Aug 2007
An Introduction to Relativistic Quantum
Mechanics
I. From Relativity to Dirac Equation
M. De Sanctis (a, b)
(a) Departamento de Física, Universidad Nacional de Colombia, Bogotá D. C.,
Colombia.
(b) INFN sez. di Roma, P.le A. Moro 2, 00185 Roma, Italy.
e-mail: mdesanctis@unal.edu.co and maurizio.desanctis@roma1.infn.it
Abstract
By using the general concepts of special relativity and the require-
ments of quantum mechanics, Dirac equation is derived and studied.
Only elementary knowledge of spin and rotations in quantum me-
chanics and standard handlings of linear algebra are employed for the
development of the present work.
PACS number(s): 03.30.+p, 03.65.Pm
Contents
1 Introduction
1.1 Notations and Conventions
2 Relativity
2.1 Fundamental Aspects of Lorentz Transformations
2.2 Electromagnetism and Relativity
2.3 The Hyperbolic Parametrization of the Lorentz Transformations
2.4 Lorentz Transformations in an Arbitrary Direction
2.5 The Commutation Rules of the Boost Generators
3 Relativistic Quantum Wave Equations
3.1 Generalities and Spin 0 Equation
3.2 Spin 1/2 Dirac Equation
3.3 The Gamma Dirac Matrices and the Standard Representation
3.4 Parity Transformations and the Matrix $\gamma^5$
3.5 Plane Wave Solutions and the Conserved Dirac Current
4 Appendix. Properties of the Pauli Matrices
1 Introduction
According to the present knowledge of physics, the ultimate constituents of
matter are quarks and leptons. Both of them are particles of spin 1/2 that
interact by interchanging spin 1 particles, namely photons, gluons, $W^+$, $W^-$
and $Z^0$. The existence of the Higgs spin 0 particle is presently under
experimental investigation.
The issues of relativity and quantum mechanics, that are strictly necessary
to understand atomic and subatomic world, have favored the development of
local ﬁeld theories in which, as we said, the interactions are mediated by the
interchange of the (virtual) integer spin particles mentioned above. A general
feature of these theories is that, in the ﬁeld Lagrangian or Hamiltonian, the
interaction term is simply added to the term that represents the free motion
of the particles.
As for the free term of the matter, spin 1/2, particles, it gives rise to the
Dirac equation, that represents the relativistic, quantum mechanical wave
equation for these particles.
These arguments explain the great importance of Dirac equation for the
study of particle physics at fundamental level. However, it is also strictly
necessary to understand many important aspects of atomic physics, nuclear
physics and of the phenomenological models for hadronic particles.
An introduction to this equation represents the objective of the present work
that is mainly directed to students with good foundations in nonrelativistic
quantum mechanics and some knowledge of special relativity and classical
electrodynamics.
We shall not follow the historical development introduced by Dirac and
adopted by many textbooks. In that approach, the Lorentz transformation (boost)
of the Dirac spinors is performed only at a later stage, without sufficiently
clarifying the connection between the mathematics and the physical meaning
of that transformation.
In this paper the Dirac equation will be derived starting from the basic
principles of special relativity and quantum mechanics, analyzing the trans-
formation properties of the relativistic spinors.
This development will be carried out without entering into the mathematical
details of the Lorentz group theory, but keeping the discussion at a more
physical level only using the mathematical tools of linear vector algebra, as
row by column matrix product and vector handling.
In our opinion this introductory approach is highly recommendable in order
to stimulate the students to make independent investigations by using the
powerful concept of relativistic covariance.
In a subsequent work we shall analyze in more detail the properties of Dirac
equation and derive some relevant observable eﬀects. To that work we shall
also defer an introduction to the ﬁeld theory formalism that is needed to give
a complete physical description of subatomic world.
The subjects of the present work are examined in the following order.
In Subsection 1.1 we give some tedious but necessary explanations about the
adopted notation.
In Section 2 we study some relevant aspects of special relativity, focusing our
attention on the properties of the Lorentz transformations.
Their fundamental properties are recalled in Subsection 2.1.
We briefly analyze, in Subsection 2.2, classical electrodynamics as a relativistic
field theory.
In Subsection 2.3 we examine the hyperbolic parametrization of the Lorentz
transformations, introducing concepts and techniques that are widely applied
in relativistic quantum mechanics for the construction of the boost operators.
Lorentz transformations in an arbitrary direction are given in subsection 2.4.
A very important point of this work is studied in Subsection 2.5, where the
commutation rules of the Lorentz boost generators, rotation generators and
parity transformation are derived.
In Section 3 we make use of the concepts of relativity to lay the foundations
of relativistic quantum mechanics.
In Subsection 3.1 we discuss, as an example, the relativistic wave equation
for a spin 0 particle.
In Subsection 3.2 we introduce the (quantum-mechanical) Dirac equation for
spin 1/2 particles, starting from the commutation rules of the boost genera-
tors, rotation generators and parity transformation.
The properties of the Dirac Gamma matrices and their diﬀerent representa-
tions are examined in Subsection 3.3.
Some relevant matrix elements of Dirac operators, such as $\gamma^5$, are studied in
Subsection 3.4.
Finally, plane wave solutions and the corresponding conserved current are
found and discussed in Subsection 3.5.
The Appendix is devoted to study some useful properties of the Pauli matri-
ces.
1.1 Notations and Conventions
We suggest that the reader skim this Subsection and return to it whenever
the notation used in the other parts of the paper is unclear.
First of all, the space-time position of a particle is denoted as
$x^\mu = (x^0, \mathbf{r})$ with $x^0 = ct$ and $\mathbf{r} = (x^1, x^2, x^3)$.
To avoid confusion, we use this last notation instead of the standard one,
that is $(x, y, z)$.
Greek letters of the “middle” part of the alphabet, as µ, ν, ρ, σ, ... running
from 0 to 3, are used to denote four-vector components. On the other hand
the letters of the beginning of the Greek alphabet, as α, β, δ, ... running
from 1 to 3, denote three-vector components. This last notation with upper
indices will be used extensively even though the corresponding quantity does
not make part of a four-vector.
Repeated indices are always summed, unless otherwise explicitly stated.
For two three-vectors, say $\mathbf{a}$ and $\mathbf{b}$, the scalar product is denoted as
$$\mathbf{a}\mathbf{b} = a^\alpha b^\alpha$$
If one of the two vectors is a set of the three Pauli ($\sigma^\delta$) or Dirac
($\alpha^\delta$), ($\gamma^\delta$) matrices, we use the notation
$$(\sigma a) = \sigma^\delta a^\delta, \qquad (\alpha a) = \alpha^\delta a^\delta, \qquad (\gamma a) = \gamma^\delta a^\delta$$
Furthermore, the notation $\nabla$ collectively indicates the derivatives with
respect to the three components of the position vector $\mathbf{r}$.
Lower indices are only used for four-vectors and denote their covariant
components as explained just after eq.(2.2). The invariant product of two four-vectors
is introduced in eq.(2.3). For the unit vectors we use the standard notation
$$\hat{\mathbf{a}} = \frac{\mathbf{a}}{|\mathbf{a}|}$$
When a four-vector is used as an argument of a field or wave function, the
Lorentz index µ, ν, ρ, σ, ... is dropped and, more simply, we write
$$A^\mu(x), \qquad \psi(x)$$
where $x$ represents collectively all the components of the four-vector $x^\mu$.
In order to denote products of matrices and four-vectors, we arrange the
components of a four-vector, say $x^\mu$, in a column vector, denoted as $[x]$. The
corresponding transposed vector $[x]^T$ is a row vector. Standard Latin letters,
without indices, are used to denote matrices. See, for example, eq.(2.6). We
use this notation also for the set of the four Dirac matrices $\alpha^\mu$ at the end of
Subsection 3.2.
Four components Dirac spinors, introduced in Subsection 3.2, are handled
according to the same rules of vector algebra. They are denoted by a Latin
letter without parentheses.
We recall that the Hermitian conjugate of the Dirac spinor $u$ is a row spinor
defined as
$$u^\dagger = u^{*T}$$
For the commutator of two matrices (or operators), say $Q$, $R$, we use the
notation
$$[Q, R] = QR - RQ$$
For the anticommutator we use curly brackets
$$\{Q, R\} = QR + RQ$$
2 Relativity
The principle of relativity, that was found by Galilei and Newton, states that
it is possible to study physical phenomena from diﬀerent inertial reference
frames (RF) by means of the same physical laws. The hypothesis of an
absolute reference frame is not allowed in physics.
Obviously, one has to transform the result of a measurement performed in
a reference frame to another reference frame, primarily the measurements of
time and space.
Requiring the speed of light c to be independent of the speed of the reference
frame, as shown by the Michelson-Morley experiment, one obtains the Lorentz
transformations that represent the formal foundation of Einstein’s special
relativity. The reader can ﬁnd in ref.[1] a simple and satisfactory development
of this point.
2.1 Fundamental Aspects of Lorentz Transformations
Considering a RF $S'$ moving at velocity $v$ along the $x^1$-axis with respect to
$S$, one has the standard Lorentz transformations
$$
\begin{aligned}
x'^0 &= \gamma\left(x^0 - \frac{v}{c}\,x^1\right) \\
x'^1 &= \gamma\left(-\frac{v}{c}\,x^0 + x^1\right) \\
x'^2 &= x^2 \\
x'^3 &= x^3
\end{aligned}
\tag{2.1a}
$$
where $x^0 = ct$, $(x^1, x^2, x^3) = \mathbf{r}$ and $\gamma = [1 - (v/c)^2]^{-1/2}$.
A thorough study of the subject of this Subsection, that consists in general-
izing the previous equations, can be found in ref.[2]. In the present paper we
highlight some speciﬁc aspects that are relevant for a quantum-mechanical
description of elementary particles.
The Lorentz transformations of eq.(2.1a) can be written compactly as
$$x'^\mu = L^\mu{}_\nu(v)\,x^\nu \tag{2.1b}$$
where the indices µ, ν take the values 0, 1, 2, 3 and $x^\mu$ is called a
contravariant four-vector.
By introducing the Minkowski metric tensor
$$
g_{\mu\nu} = g^{\mu\nu} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\tag{2.2}
$$
one can construct covariant four-vectors $x_\mu = g_{\mu\nu}x^\nu$ and invariant quantities
as products of covariant and contravariant four-vectors. For example, given
two contravariant four-vectors, say $s^\mu = (s^0, \mathbf{s})$ and $l^\mu = (l^0, \mathbf{l})$, one can
construct their covariant counterparts $s_\mu = (s^0, -\mathbf{s})$, $l_\mu = (l^0, -\mathbf{l})$ and the
quantity
$$s^\mu l_\mu = s_\mu l^\mu = s^\mu g_{\mu\nu} l^\nu = s_\mu g^{\mu\nu} l_\nu = s^0 l^0 - \mathbf{s}\mathbf{l} \tag{2.3}$$
that is invariant under Lorentz transformation:
$$s_\mu l^\mu = s'_\mu l'^\mu \tag{2.4}$$
In particular, the Lorentz transformation of eq.(2.1a) is obtained [1,2] by
requiring the invariance of the propagation of a spherical light wave, that is
the invariance of $x^\mu x_\mu = 0$.
The invariance equation (2.4) requires
$$g_{\mu\rho}\,L^\rho{}_\nu(v)\,L^\mu{}_\sigma(v) = g_{\nu\sigma} \tag{2.5}$$
In many cases it is very useful to work with standard linear algebra notation.
Furthermore, at pedagogical level, this technique is very useful to introduce
standard handling of Dirac spinors.
Identifying a four-vector $x^\mu$ with the column vector $[x]$, the invariant product
of eq.(2.4) is written as
$$s^\mu g_{\mu\nu} l^\nu = [s]^T g\,[l] \tag{2.6}$$
where the upper symbol $T$ denotes the operation of transposition. By means
of this notation, eq.(2.5) reads
$$L(v)\,g\,L(v) = g \tag{2.7}$$
where we have used the important property, directly obtained from eq.(2.1a),
that $L^T(v) = L(v)$. Also, a covariant four-vector $x_\mu$ is $[x_c] = g[x]$. Its
transformation is
$$[x'_c] = g\,L(v)[x] = g\,L(v)\,g\,g\,[x] = g\,L(v)\,g\,[x_c] \tag{2.8}$$
Let us now multiply eq.(2.7) by $g$ from the right. Since $g^2 = 1$, we obtain
$$L(v)\,g\,L(v)\,g = 1 \tag{2.9}$$
and, in consequence,
$$g\,L(v)\,g = L^{-1}(v) \tag{2.10}$$
Comparing with eq.(2.8), this means that the covariant four-vectors transform with the
inverse Lorentz transformations. By means of direct calculation or by using
the principle of relativity one finds that
$$L^{-1}(v) = L(-v) \tag{2.11}$$
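The matrix relations (2.7), (2.10) and (2.11) can be checked numerically. The following is a minimal Python sketch, in units where c = 1; the helper names (`matmul`, `lorentz_x1`, `max_abs_diff`) and the test value `beta = 0.6` are illustrative assumptions and are not part of the original text.

```python
import math

def matmul(a, b):
    """Row-by-column product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lorentz_x1(beta):
    """Matrix L(v) of eq.(2.1a) for a boost along x^1, with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -beta * gamma, 0.0, 0.0],
            [-beta * gamma, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def max_abs_diff(a, b):
    """Largest absolute difference between the entries of two 4x4 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

# Minkowski metric of eq.(2.2) and the 4x4 identity
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

beta = 0.6  # arbitrary test velocity, v = 0.6 c
L = lorentz_x1(beta)

# eq.(2.7): L g L = g
err_2_7 = max_abs_diff(matmul(matmul(L, G), L), G)
# eqs.(2.10) and (2.11): g L(v) g = L(-v)
err_2_10 = max_abs_diff(matmul(matmul(G, L), G), lorentz_x1(-beta))
# L(v) L(-v) = 1, so that L(-v) is indeed the inverse
err_inverse = max_abs_diff(matmul(L, lorentz_x1(-beta)), IDENTITY)
```

All three error values should vanish up to floating-point rounding.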
We recall some relevant physical quantities that are represented by (i.e. transform
as) a four-vector. As previously discussed, we have the four-position
(in time and space) of a particle, denoted by $x^\mu$.
We now define the four-vector that represents the energy and momentum of
a particle.
First, we introduce the (invariant) rest mass of the particle. In the
present work this quantity will be simply denoted as the mass $m$. We shall
never make use of the so-called relativistic mass.
We also define the differential of the proper (invariant) time as
$$
d\tau = \frac{1}{c}\left[dx^\mu dx_\mu\right]^{1/2}
= \left[(dt)^2 - \frac{1}{c^2}(d\mathbf{r})^2\right]^{1/2}
= dt\left[1 - \left(\frac{v}{c}\right)^2\right]^{1/2}
= \frac{dt}{\gamma}
\tag{2.12}
$$
where the velocity
$$\mathbf{v} = \frac{d\mathbf{r}}{dt}$$
represents the standard physical velocity of the particle measured by an
observer in a given reference frame. Furthermore, the factor $\gamma$ is a function of
that velocity, of the form:
$$\gamma = \left[1 - \left(\frac{v}{c}\right)^2\right]^{-1/2}$$
The energy-momentum four-vector is obtained by differentiating the four-position
with respect to the proper time and multiplying the result by the mass $m$.
One has
$$p^\mu = \left(\frac{E}{c}, \mathbf{p}\right) = m\,\frac{dx^\mu}{d\tau} = (mc\gamma,\ m\mathbf{v}\gamma) \tag{2.13}$$
In the previous equation, $E$ represents the energy of the particle and $\mathbf{p}$ its
three-momentum. More explicitly, the energy is
$$E = mc^2\gamma$$
For small values of the velocity, $|\mathbf{v}| \ll c$, one recovers the nonrelativistic
limit, that is
$$E \simeq mc^2 + \frac{1}{2}mv^2 + ... \tag{2.14a}$$
$$\mathbf{p} \simeq m\mathbf{v} + ... \tag{2.14b}$$
Note that the energy and momentum of a particle belong to the four-vector
of eq.(2.13). In consequence, energy and momentum conservation can be
written in a manifestly covariant form. For example, in a collision process in
which one has a transition from an initial state (I) with $N_I$ particles, to a final
state (F) with $N_F$ particles, the total energy and momentum conservation is
written by means of the following four-vector equality
$$\sum_{i=1}^{N_I} p_i^\mu(I) = \sum_{i=1}^{N_F} p_i^\mu(F) \tag{2.15}$$
that holds in any reference frame. A complete discussion of the physical
consequences of that equation and related matters is given in ref.[3]. We only recall
that, at variance with nonrelativistic mechanics, mass is not conserved. In
general, mass-energy transformations are represented by processes of creation
and destruction of particles. As a special case, a scattering reaction is called
elastic if all the particles of the final state are the same (obviously, with
the same mass) as those of the initial state.
Four-momentum conservation of eq.(2.15) is a very simple example. In general,
a physical law written in a manifestly covariant form automatically
fulfills the principle of relativity introduced at the beginning of this section.
A physical law is written in a manifestly covariant form when it is written as
an equality between two relativistic tensors of the same rank: two Lorentz
invariants (scalars), two four-vectors, etc.
Going back to eq.(2.13) one can construct the following invariant
$$p^\mu p_\mu = \left(\frac{E}{c}\right)^2 - \mathbf{p}^2 = (mc)^2 \tag{2.16}$$
The second equality is obtained in the easiest way by calculating the invariant
in the rest frame of the particle, where $p^\mu = (mc, \mathbf{0})$.
From the previous equation one can construct the Hamiltonian of a particle,
that is the energy written as a function of the momentum
$$E = \left[(\mathbf{p}c)^2 + (mc^2)^2\right]^{1/2} \tag{2.17}$$
that in the nonrelativistic limit reduces to
$$E \simeq mc^2 + \frac{\mathbf{p}^2}{2m} + ...$$
Note that in eq.(2.17) we have taken only the positive value of the square
root. This choice is perfectly legitimate in a classical context, where the
energy changes its value in a continuous way. On the other hand, negative
energy solutions cannot be discarded when considering quantum-mechanical
equations.
From eqs.(2.13) and (2.17), the velocity of a particle is
$$\mathbf{v} = \frac{\mathbf{p}\,c^2}{E}, \qquad |\mathbf{v}| \le c$$
In the second relation, the equality is satisfied by massless particles. The
constraint on velocity has a more general validity, as we shall see when
reviewing electromagnetism: nothing that carries information can have a
velocity greater than the speed of light c.
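As a numerical illustration of eqs.(2.13), (2.16) and (2.17), the sketch below builds the four-momentum of a particle, evaluates the invariant and recovers the velocity from momentum and energy, with the factors of c written explicitly ($\mathbf{v} = \mathbf{p}c^2/E$, as follows from eq.(2.13)). The function names and the numerical inputs (an approximate electron mass and a velocity of 0.6 c) are assumptions chosen only for the example.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def four_momentum(mass, velocity):
    """p^mu = (m c gamma, m v gamma) of eq.(2.13); velocity is a 3-tuple in m/s."""
    v2 = sum(vi * vi for vi in velocity)
    gamma = 1.0 / math.sqrt(1.0 - v2 / C**2)
    return (mass * C * gamma,) + tuple(mass * vi * gamma for vi in velocity)

def invariant(p):
    """p^mu p_mu = (p^0)^2 - |p|^2, left-hand side of eq.(2.16)."""
    return p[0] ** 2 - sum(pi * pi for pi in p[1:])

def energy_from_momentum(mass, p3):
    """Positive root of eq.(2.17)."""
    p2 = sum(pi * pi for pi in p3)
    return math.sqrt(p2 * C**2 + (mass * C**2) ** 2)

m = 9.109e-31                 # roughly the electron mass in kg (illustrative)
v = (0.6 * C, 0.0, 0.0)       # illustrative velocity along x^1

p = four_momentum(m, v)
E = energy_from_momentum(m, p[1:])
# Velocity recovered from momentum and energy: v = p c^2 / E
v_back = tuple(pi * C**2 / E for pi in p[1:])
```

Within rounding errors, `invariant(p)` equals `(m * C) ** 2`, `E` equals `p[0] * C`, and `v_back` reproduces `v`.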
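For the study of both classical and quantum-mechanical (field) theories it is
very important to determine the transformation properties of the derivative
operator
$$
\frac{\partial}{\partial x^\mu}
= \left(\frac{1}{c}\frac{\partial}{\partial t},\ \frac{\partial}{\partial \mathbf{r}}\right)
= \left(\frac{1}{c}\frac{\partial}{\partial t},\ \nabla\right)
$$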
The reader is encouraged to derive them directly by using the chain rule. We
propose here a simpler proof. Let us consider the invariant
$x^\nu x_\nu = (x^0)^2 - \mathbf{r}^2$ and apply to it the derivative operator. One has
$$\frac{\partial}{\partial x^\mu}\,x^\nu x_\nu = 2x_\mu = 2(x^0, -\mathbf{r}) \tag{2.18a}$$
That is, the derivative with respect to the contravariant components gives,
and transforms as, a covariant four-vector ($2x_\mu$ in the previous equation).
Conversely, the derivative with respect to the covariant components transforms
as a contravariant four-vector:
$$\frac{\partial}{\partial x_\mu}\,x^\nu x_\nu = 2x^\mu = 2(x^0, \mathbf{r}) \tag{2.18b}$$
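For this reason the following notation is introduced
$$\frac{\partial}{\partial x^\mu} = \partial_\mu \tag{2.19a}$$
and
$$\frac{\partial}{\partial x_\mu} = \partial^\mu \tag{2.19b}$$
One straightforwardly verifies that
$$
\frac{\partial}{\partial x^\mu}\frac{\partial}{\partial x_\mu}
= \partial_\mu\partial^\mu
= \frac{1}{c^2}\frac{\partial^2}{\partial t^2} - \nabla^2
\tag{2.20}
$$
is an invariant operator.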
2.2 Electromagnetism and Relativity
The elements that have been developed in the preceding Subsection will help
us to understand the relativistic properties of classical electromagnetism.
In summary, electromagnetism is a local theory in which the interaction
between charged particles is carried by the electromagnetic ﬁeld, at light
speed c. A complete analysis of this theory can be found, for example, in
refs.[2,4].
With respect to interaction propagation, the reader should realize that Newton's
theory of gravitational interaction is not compatible with special relativity.
In fact the gravitational potential energy
$$V_g = -\frac{G m_1 m_2}{r}$$
depends instantaneously on the distance $r$ between the two bodies. If one
body, say #1, changes its position or state, the potential energy and,
in consequence, the force felt by body #2 change at the same instant,
implying a transmission of the interaction at infinite velocity.
Note that, on the other hand, the expression of the Coulomb potential energy,
that is formally analogous to Newton's gravitational one, holds exactly
only in the static case. According to classical electromagnetism, if the
interacting particles are in motion, it represents their interaction only
approximately. This approximation is considered good if their relative velocity
satisfies
$$|\mathbf{v}| \ll c$$
The fundamental quantity of electromagnetism is the vector potential field
$A^\mu = (A^0, \mathbf{A})$.
A field is, by definition, a function of the time-space position $x^\nu$. As done in
most textbooks, in the following we shall drop the index ν of the argument,
simply writing $A^\mu = A^\mu(x)$.
In brief, we recall that the Maxwell equations have the form
$$\partial_\nu\partial^\nu A^\mu = \frac{4\pi}{c}\,j^\mu \tag{2.21}$$
with the Lorentz invariant gauge condition
$$\partial_\mu A^\mu = 0 \tag{2.22}$$
where we have introduced the current density
$$j^\mu = (c\rho(x), \mathbf{j}(x)) \tag{2.23}$$
Applying the derivative operator $\partial_\mu$ to eq.(2.21) and using eq.(2.22), one
finds the current conservation equation, that is
$$\partial_\mu j^\mu = \frac{\partial}{\partial t}\rho(x) + \frac{\partial}{\partial \mathbf{r}}\mathbf{j}(x) = 0 \tag{2.24}$$
All the equations written above are manifestly covariant and the Lorentz
transformations can be easily performed. If a solution of eqs.(2.21) and (2.22)
is found in a reference frame $S$, it is not necessary to solve the equations in
the reference frame $S'$: one can simply transform the electromagnetic
field:
$$A'^\mu(x') = L^\mu{}_\nu(v)\,A^\nu(x(x')) \tag{2.25}$$
In more detail, one has
(i) to transform the field $A^\mu$, mixing its components by means of $L^\mu{}_\nu(v)$, that
is the first factor of the previous equation, but also
(ii) to express the argument $x$ of the frame $S$ as a function of $x'$ measured in
$S'$, that is, recalling eq.(2.11),
$$x^\nu = L^\nu{}_\rho(-v)\,x'^\rho$$
We briefly refer to the last operation as argument re-expression.
The reader should note that such a double transformation occurs in the same
way when a rotation is performed. In this case the space components $\mathbf{A}$
are mixed by the rotation matrix (for this reason the electromagnetic field
is defined as a vector field) and the argument $\mathbf{r}$ must be expressed in terms
of $\mathbf{r}'$ by means of the inverse rotation matrix. Under rotation, in the time
component $A^0$, one only has the argument re-expression of $\mathbf{r}$.
In principle it is possible to construct a scalar ﬁeld theory (even though
there is no evidence of such theories at macroscopic level). In this case the
ﬁeld is represented by a one-component function φ(x). Both the Lorentz
transformation and the rotations only aﬀect the argument x in the same way
as before, but no mixing can occur for the single component function φ. One
has only to perform the argument re-expression.
We shall now explain with a physically relevant example the use of the
transformation (2.25) for the electromagnetic field.
Let us consider a charged particle moving with velocity $u$ along the $x^1$-axis.
What is the field produced by this particle?
We introduce a reference frame $S$ in which the particle is at rest, while the
observer in $S'$ sees the particle moving with velocity $u$ along $x^1$. The velocity
of $S'$ with respect to $S$ is $v = -u$. The field in $S$ is purely electrostatic, that
is
$$A^0 = A^0(ct, \mathbf{r}) = \frac{q}{|\mathbf{r}|} \tag{2.26a}$$
$$\mathbf{A} = \mathbf{A}(ct, \mathbf{r}) = 0 \tag{2.26b}$$
where $q$ represents the charge of the particle. We find $A'^\mu$ by means of
eq.(2.25). First, one has
$$A'^0 = \gamma A^0 \tag{2.27a}$$
$$A'^1 = \frac{u}{c}\,\gamma A^0 \tag{2.27b}$$
$$A'^\alpha = A^\alpha = 0 \tag{2.27c}$$
with α = 2, 3 and $\gamma = [1 - (u/c)^2]^{-1/2}$.
Now we express $|\mathbf{r}|$ in terms of $(ct', \mathbf{r}')$, that is we perform the argument
re-expression.
By means of eqs.(2.1) and (2.11) one has
$$x^1 = \gamma(-ut' + x'^1)$$
$$x^\alpha = x'^\alpha$$
so that
$$|\mathbf{r}| = \left[\gamma^2(-ut' + x'^1)^2 + (x'^2)^2 + (x'^3)^2\right]^{1/2} \tag{2.28}$$
By means of the previous equation the final expression for the field of
eqs.(2.27a,b) is
$$A'^0(ct', \mathbf{r}') = q\gamma\left[\gamma^2(-ut' + x'^1)^2 + (x'^2)^2 + (x'^3)^2\right]^{-1/2} \tag{2.29a}$$
$$A'^1(ct', \mathbf{r}') = \frac{u}{c}\,A'^0(ct', \mathbf{r}') \tag{2.29b}$$
This example has been chosen to explain the procedure for transforming a
ﬁeld function.
The field of eqs.(2.29a,b) can be directly derived in the frame $S'$ by solving
the Maxwell equations (2.21), (2.22) as done in refs.[2,4]. The technique of the
Liénard-Wiechert potentials can be used. But, as the reader should check,
much more mathematical effort is required.
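The two-step procedure used in this example (mixing of the components and argument re-expression) can be written as a short Python sketch. The function name `boosted_coulomb_potential`, its signature and the test values are hypothetical and serve only to illustrate eqs.(2.26a) to (2.29b); units with c = 1 are used by default.

```python
import math

def boosted_coulomb_potential(q, u, t_prime, r_prime, c=1.0):
    """Return (A'^0, A'^1) of eqs.(2.29a,b) at the point (c t', r') of S'.

    q: charge; u: particle velocity along x^1 as seen from S' (|u| < c);
    r_prime: the three spatial coordinates (x'^1, x'^2, x'^3).
    """
    gamma = 1.0 / math.sqrt(1.0 - (u / c) ** 2)
    x1p, x2p, x3p = r_prime
    # Step (ii): argument re-expression, x^1 = gamma (x'^1 - u t'), eq.(2.28)
    x1 = gamma * (x1p - u * t_prime)
    r = math.sqrt(x1 * x1 + x2p * x2p + x3p * x3p)
    # Rest-frame electrostatic field, eq.(2.26a)
    a0 = q / r
    # Step (i): mixing of the components, eqs.(2.27a,b)
    return gamma * a0, (u / c) * gamma * a0

q, u = 1.0, 0.8  # illustrative charge and velocity (gamma = 1/0.6)
# Field point at distance 2 perpendicular to the motion, at t' = 0
a0_side, a1_side = boosted_coulomb_potential(q, u, 0.0, (0.0, 2.0, 0.0))
# Field point at distance 2 along the direction of motion, at t' = 0
a0_front, a1_front = boosted_coulomb_potential(q, u, 0.0, (2.0, 0.0, 0.0))
```

With these inputs the potential is enhanced by the factor $\gamma$ in the transverse direction (`a0_side` is $\gamma q/2$), while along the direction of motion the two factors of $\gamma$ cancel (`a0_front` is $q/2$).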
2.3 The Hyperbolic Parametrization of the Lorentz
Transformations
Going back to the Lorentz transformations of eq.(2.1) we note that the
coefficients of the transformation matrix $L(v)$ are
$$\gamma, \qquad -\frac{v}{c}\,\gamma$$
We square both terms (the minus sign disappears in the second one) and
take the difference, obtaining
$$\gamma^2 - \left(\frac{v}{c}\,\gamma\right)^2 = 1$$
Recalling that the hyperbolic functions satisfy the relation
$$\mathrm{ch}^2\,\omega - \mathrm{sh}^2\,\omega = 1$$
one can choose the following parametrization
$$\gamma = \mathrm{ch}\,\omega, \qquad \frac{v}{c}\,\gamma = \mathrm{sh}\,\omega \tag{2.30a}$$
with
$$\frac{v}{c} = \mathrm{th}\,\omega \tag{2.30b}$$
that connects the hyperbolic parameter $\omega$ with the standard velocity $v$. We
now show the reason why the parametrization $L(\omega)$ is very useful for the
following developments.
Let us consider two subsequent Lorentz transformations along the $x^1$ axis
with hyperbolic parameters $\eta$ and $\xi$. The total transformation is given by
the following product of Lorentz transformations, that, by using the vector
algebra notation, is written in the form
$$[x'] = L(\eta)L(\xi)[x] \tag{2.31}$$
The reader can calculate explicitly $L(\eta)L(\xi)$ by means of the standard rules for
row by column matrix product; then, recalling
$$\mathrm{sh}(\eta + \xi) = \mathrm{sh}\,\eta\ \mathrm{ch}\,\xi + \mathrm{sh}\,\xi\ \mathrm{ch}\,\eta$$
$$\mathrm{ch}(\eta + \xi) = \mathrm{ch}\,\eta\ \mathrm{ch}\,\xi + \mathrm{sh}\,\xi\ \mathrm{sh}\,\eta$$
one finds
$$L(\eta)L(\xi) = L(\eta + \xi) \tag{2.32}$$
Note that the previous result strictly depends on the chosen hyperbolic
parametrization. Due to the relativistic nonlinear composition of velocities,
considering two subsequent Lorentz transformations, with $v/c = \mathrm{th}\,\eta$
and $w/c = \mathrm{th}\,\xi$, one has, in contrast to eq.(2.32),
$$L(v)L(w) \neq L(v + w) \tag{2.33}$$
On the other hand, the composition of two Lorentz transformations with
hyperbolic parametrization, as given in eq.(2.32), has the same form as the
composition of two rotations around the same axis.
Eq.(2.32) is the clue for the following development.
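The contrast between eq.(2.32) and eq.(2.33) can be made concrete with a small numerical check. The sketch below works with the $(x^0, x^1)$ block of the boost matrix only; the helper names and the velocities 0.3 c and 0.4 c are illustrative assumptions.

```python
import math

def boost(omega):
    """(x^0, x^1) block of L(omega), with gamma = ch(omega), (v/c) gamma = sh(omega)."""
    ch, sh = math.cosh(omega), math.sinh(omega)
    return [[ch, -sh], [-sh, ch]]

def matmul2(a, b):
    """Row-by-column product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_abs_diff(a, b):
    """Largest absolute difference between the entries of two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

v_over_c, w_over_c = 0.3, 0.4                      # illustrative velocities
eta, xi = math.atanh(v_over_c), math.atanh(w_over_c)  # eq.(2.30b) inverted

product = matmul2(boost(eta), boost(xi))
# eq.(2.32): the hyperbolic parameters simply add
err_rapidity = max_abs_diff(product, boost(eta + xi))
# eq.(2.33): adding the velocities instead gives a different matrix
err_velocity = max_abs_diff(product, boost(math.atanh(v_over_c + w_over_c)))
# Velocity (in units of c) of the combined boost, th(eta + xi)
combined = math.tanh(eta + xi)
```

Here `err_rapidity` vanishes up to rounding, `err_velocity` does not, and `combined` equals $(0.3 + 0.4)/(1 + 0.3 \cdot 0.4) = 0.625$, which is the relativistic composition of the two velocities.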
By means of the hyperbolic parametrization, we can now express a finite
Lorentz transformation in terms of the corresponding infinitesimal
transformation.
Let us consider the case of small velocity, that is $v/c \ll 1$ or equivalently,
for the hyperbolic parameter, $\omega \simeq 0$ (see eq.(2.30b)).
In particular, at first order in $\omega$ or in $v/c$, one has
$$\mathrm{ch}\,\omega \simeq 1, \qquad \mathrm{sh}\,\omega \simeq \omega \simeq \frac{v}{c}$$
and, in consequence
$$L(\omega) \simeq 1 + \omega K^1 \tag{2.34}$$
where $1$ and $K^1$ respectively represent the identity matrix and the generator
of the Lorentz transformation matrix along the $x^1$ axis. This second term is
usually called the boost generator.
Explicitly, the matrix $K^1$ is easily obtained considering eqs.(2.1), (2.30a) and
the above Taylor expansions of the hyperbolic functions. It has the form
$$
K^1 =
\begin{pmatrix}
0 & -1 & 0 & 0 \\
-1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
= -\begin{pmatrix}
\sigma^1 & 0 \\
0 & 0
\end{pmatrix}
\tag{2.35}
$$
The second expression in the previous equation is given for pedagogical rea-
sons, that is to familiarize the reader with block matrices.
In fact, the 4×4 matrix $K^1$ is written as a block matrix, in which each block
is represented by a 2 × 2 matrix. In particular the upper left block is the
Pauli matrix $\sigma^1$; in the other "0" blocks the four entries of each block are all
vanishing. The properties of the Pauli matrices are studied in the Appendix.
For their definition see eq.(A.1).
The reader should note the following two points:
(i) there is no direct connection between $\sigma^1$ of the previous equations and
the quantum mechanical spin operator,
(ii) as for the row by column product of block matrices, the same rules of
standard matrices must be used.
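In order to reconstruct the finite boost $L(\omega)$ (with $\omega$ finite), we apply $N$ times,
with $N \to \infty$, the infinitesimal transformation of eq.(2.34), each time with
parameter $\omega/N$.
The linear boost composition law of eq.(2.32) allows us to derive the following
equation
$$
L(\omega) = \lim_{N\to\infty}\left(1 + \frac{\omega}{N}K^1\right)^N = \exp(\omega K^1)
\tag{2.36a}
$$
$$
= 1 + (\mathrm{ch}\,\omega - 1)
\begin{pmatrix}
1 & 0 \\
0 & 0
\end{pmatrix}
+ \mathrm{sh}\,\omega\ K^1
\tag{2.36b}
$$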
The second equality of eq.(2.36a) is obtained by comparing the series expansion
in powers of $\omega$ of the exponential with that of
$$\left(1 + \frac{\omega}{N}K^1\right)^N$$
for $N \to \infty$.
Eq.(2.36b) is derived working on that series expansion. One has the following
rules for the powers of $K^1$ (with $n \ge 1$ in the second relation)
$$
(K^1)^0 = 1, \qquad
(K^1)^{2n} =
\begin{pmatrix}
1 & 0 \\
0 & 0
\end{pmatrix}, \qquad
(K^1)^{2n+1} = K^1
$$
that, for example, can be derived from the corresponding properties of $\sigma^1$ by
means of eq.(A.4).
The coefficients that multiply $(K^1)^{2n}$ and $(K^1)^{2n+1}$ can be summed up, giving
$\mathrm{ch}\,\omega - 1$ and $\mathrm{sh}\,\omega$, respectively. One can straightforwardly check that
eq.(2.36b) is equal to eq.(2.1) with the hyperbolic parametrization of eq.(2.30a).
We have developed in some detail this example as a guide to construct the
ﬁnite boost transformations for Dirac spinors in eqs.(3.17a,b).
We remind the reader that all the relevant properties of the boost transfor-
mation are contained in the inﬁnitesimal form given in eq.(2.34) with the
matrix boost generator of eq.(2.35). The ﬁnite expression of the boost is
obtained by means of a standard mathematical procedure that does not add
new physical information.
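The limiting procedure of eq.(2.36a) can also be followed numerically. In the minimal sketch below, the $N$-fold product is taken with $N = 2^{20}$ and evaluated by squaring the matrix 20 times, which is just an efficient way of multiplying the same infinitesimal transformation $N$ times; the result is compared with the closed form of eq.(2.36b). The function names, the value of `omega` and the number of doublings are assumptions made for the example.

```python
import math

def matmul(a, b):
    """Row-by-column product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Boost generator along x^1, eq.(2.35)
K1 = [[0.0, -1.0, 0.0, 0.0],
      [-1.0, 0.0, 0.0, 0.0],
      [0.0, 0.0, 0.0, 0.0],
      [0.0, 0.0, 0.0, 0.0]]

def boost_from_limit(omega, doublings=20):
    """Approximate L(omega) as (1 + (omega/N) K^1)^N with N = 2**doublings."""
    n_steps = 2 ** doublings
    m = [[(1.0 if i == j else 0.0) + omega / n_steps * K1[i][j]
          for j in range(4)] for i in range(4)]
    for _ in range(doublings):
        m = matmul(m, m)  # each squaring doubles the number of factors
    return m

def boost_closed_form(omega):
    """Eq.(2.36b): 1 + (ch w - 1) diag(1, 1, 0, 0) + sh w K^1."""
    ch, sh = math.cosh(omega), math.sinh(omega)
    projector = [1.0, 1.0, 0.0, 0.0]
    return [[(1.0 if i == j else 0.0)
             + ((ch - 1.0) * projector[i] if i == j else 0.0)
             + sh * K1[i][j]
             for j in range(4)] for i in range(4)]

omega = 1.2  # illustrative finite hyperbolic parameter
approx = boost_from_limit(omega)
exact = boost_closed_form(omega)
max_error = max(abs(approx[i][j] - exact[i][j])
                for i in range(4) for j in range(4))
```

The residual `max_error` is of order $\omega^2/N$ and decreases as the number of doublings grows, in agreement with the limit in eq.(2.36a).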
2.4 Lorentz Transformations in an Arbitrary Direction
In the previous developments we have considered Lorentz transformations
along the $x^1$-axis. The transformations along the $x^2$- and $x^3$-axes are directly
obtained by interchanging the spatial variables. In this way (as done in ref.[4])
one obtains a sufficiently general treatment of relativistic problems. For
completeness and to help the reader with the analysis of some textbooks (as
for example ref.[2]) and research articles, we now study Lorentz transformations
with an arbitrary direction of the boost velocity $\mathbf{v}$. The comprehension of the
other Sections of this work does not depend on this point. In consequence,
the reader (if not interested) can go directly to eq.(2.41).
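For definiteness we consider the time-space four-vector $x^\mu = (x^0, \mathbf{r})$, but the
results hold for any four-vector.
The transformation equation (2.1) can be generalized in the following way
$$x'^0 = \gamma\left(x^0 - \frac{\mathbf{r}\mathbf{v}}{c}\right) \tag{2.37a}$$
$$\mathbf{r}'\hat{\mathbf{v}} = \gamma\left(-\frac{v}{c}\,x^0 + \mathbf{r}\hat{\mathbf{v}}\right) \tag{2.37b}$$
$$\mathbf{r}'_\perp = \mathbf{r}_\perp \tag{2.37c}$$
where the unit vector $\hat{\mathbf{v}}$ has been introduced so that $\mathbf{v} = v\hat{\mathbf{v}}$ with $v > 0$ and
the notation $\mathbf{r}_\perp$ denotes the spatial components of $\mathbf{r}$ perpendicular to $\mathbf{v}$.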
Eq.(2.37a) directly represents the Lorentz transformation of the time
component of a four-vector for an arbitrary direction of the boost velocity.
Some handling is necessary for the spatial components of the four-vector.
Starting from eqs.(2.37b,c) we now develop the transformation for $\mathbf{r}$. One
can parametrize this transformation according to the following hypothesis
$$\mathbf{r}' = \mathbf{r} + f(v)\,\frac{1}{c^2}\,(\mathbf{r}\mathbf{v})\,\mathbf{v} + g(v)\,\frac{1}{c}\,x^0\,\mathbf{v} \tag{2.38}$$
Note that it correctly reduces to the identity when $v = 0$ and automatically
gives eq.(2.37c) for the perpendicular components of the four-vector.
Multiplying the previous equation by $\hat{\mathbf{v}}$ and comparing with eq.(2.37b) one
finds
$$g(v) = -\gamma \tag{2.39a}$$
$$f(v) = \frac{\gamma - 1}{(v/c)^2} = \frac{\gamma^2}{\gamma + 1} \tag{2.39b}$$
where the last expression of eq.(2.39b) is obtained by using the standard
definition of the factor $\gamma$.
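Eqs.(2.37a), (2.38) and (2.39a,b) translate directly into a few lines of code. The sketch below is an illustration under stated assumptions: the function names, the sample four-vector and the boost velocities are hypothetical, and units with c = 1 are used by default.

```python
import math

def boost_four_vector(x, v, c=1.0):
    """Apply eqs.(2.37a) and (2.38) to x = (x^0, x^1, x^2, x^3).

    v is the boost velocity as a 3-tuple with |v| < c.
    """
    x0, r = x[0], x[1:]
    v2 = sum(vi * vi for vi in v)
    if v2 == 0.0:
        return tuple(x)  # identity transformation for v = 0
    gamma = 1.0 / math.sqrt(1.0 - v2 / c**2)
    f = gamma**2 / (gamma + 1.0)   # eq.(2.39b)
    g = -gamma                     # eq.(2.39a)
    rv = sum(ri * vi for ri, vi in zip(r, v))  # scalar product r v
    x0_new = gamma * (x0 - rv / c)             # eq.(2.37a)
    r_new = tuple(ri + f * rv * vi / c**2 + g * x0 * vi / c
                  for ri, vi in zip(r, v))     # eq.(2.38)
    return (x0_new,) + r_new

def interval(x):
    """Invariant x^mu x_mu = (x^0)^2 - r^2."""
    return x[0] ** 2 - sum(xi * xi for xi in x[1:])

x = (2.0, 1.0, -0.5, 0.7)                         # illustrative four-vector
along_x1 = boost_four_vector(x, (0.6, 0.0, 0.0))  # should reproduce eq.(2.1a)
oblique = boost_four_vector(x, (0.3, 0.4, 0.0))   # boost in the x^1-x^2 plane
round_trip = boost_four_vector(oblique, (-0.3, -0.4, 0.0))
```

For the boost along $x^1$ the result coincides with eq.(2.1a); for the oblique boost the invariant `interval` is unchanged, and boosting back with velocity $-\mathbf{v}$ returns the original four-vector, as expected from eq.(2.11).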
Analogously to eq.(2.36a), the Lorentz transformation for the four-vector
given by eqs.(2.37a) and (2.38) can be written in exponential form, as
$$L(\omega\hat{\mathbf{v}}) = \exp(\omega\,\hat{\mathbf{v}}\mathbf{K}) \tag{2.40}$$
with the same connection between $\omega$ and $v$ as in eq.(2.30b). The matrices of
the boost generator $\mathbf{K} = (K^1, K^2, K^3)$ are defined as
$$
K^1 =
\begin{pmatrix}
0 & -1 & 0 & 0 \\
-1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \quad
K^2 =
\begin{pmatrix}
0 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \quad
K^3 =
\begin{pmatrix}
0 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 0
\end{pmatrix}
\tag{2.41}
$$
Note that $K^1$ had been already derived in eq.(2.35). Furthermore, $K^2$ and $K^3$
can be directly obtained performing the Lorentz transformation analogously
to eq.(2.1) but along the $x^2$- and $x^3$-axes and repeating the procedure that
leads to eq.(2.35).
2.5 The Commutation Rules of the Boost Generators
The most important property of the boost generators or, more precisely, of
the matrices K given in eq.(2.41), is represented by their commutation rules.
Let us consider an illustrative example. For generality, we shall denote the
Lorentz transformation as boost, using the symbol B.
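In a first step, we perform a boost along $x^2$ with a small velocity. At first
order in the hyperbolic parameter $\omega^2$ one has (here and in the following the
upper indices of $\omega$, $B$ and $K$ label the axes and are not powers)
$$B^2 \simeq 1 + \omega^2 K^2$$
Analogously, in a second step, we make a boost along $x^1$, with hyperbolic
parameter $\omega^1$, that is
$$B^1 \simeq 1 + \omega^1 K^1$$
The total boost, up to order $\omega^1\omega^2$, is
$$B^{12} = B^1 B^2 \simeq 1 + \omega^1 K^1 + \omega^2 K^2 + \omega^1\omega^2\,K^1 K^2 \tag{2.42a}$$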
Note the important property that the product of two boosts is a Lorentz
transformation because it satisfies the invariance condition of eq.(2.5), as can be
directly verified.
We now repeat the previous procedure inverting the order of the two boosts,
obtaining
$$B^{21} = B^2 B^1 \simeq 1 + \omega^1 K^1 + \omega^2 K^2 + \omega^1\omega^2\,K^2 K^1 \tag{2.42b}$$
What is the difference between the two procedures? Subtraction of eqs.(2.42a,b)
gives
$$B^{12} - B^{21} \simeq \omega^1\omega^2\,[K^1, K^2] \tag{2.43}$$
where the standard notation for the commutator of the matrices $K^1$ and $K^2$
has been introduced. Explicit calculation gives
$$
[K^1, K^2] =
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\tag{2.44}
$$
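The explicit calculation leading to eq.(2.44) can be reproduced with a few lines of Python. The sketch below uses the matrices $K^1$ and $K^2$ of eq.(2.41); the helper names and the small parameters `w1`, `w2` are illustrative assumptions.

```python
def matmul(a, b):
    """Row-by-column product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def first_order_boost(omega, k):
    """Infinitesimal boost 1 + omega K."""
    return [[(1 if i == j else 0) + omega * k[i][j] for j in range(4)]
            for i in range(4)]

# Boost generators along x^1 and x^2, eq.(2.41)
K1 = [[0, -1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
K2 = [[0, 0, -1, 0], [0, 0, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 0]]

comm = commutator(K1, K2)

# Difference between the two orderings of the first-order boosts
w1, w2 = 0.01, 0.02  # small illustrative hyperbolic parameters
b1, b2 = first_order_boost(w1, K1), first_order_boost(w2, K2)
b12, b21 = matmul(b1, b2), matmul(b2, b1)
difference = [[b12[i][j] - b21[i][j] for j in range(4)] for i in range(4)]
```

The matrix `comm` reproduces eq.(2.44), and `difference` equals `w1 * w2 * comm` entry by entry, which is the content of eq.(2.43).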
At this point two (connected) questions are in order. What is the meaning
of the noncommutativity of the boost generators? Which physical quantity
is represented by the commutator of the last equation?
To answer these questions it is necessary to recall some properties of the
rotations.
They are initially defined in the three dimensional space. Let us rotate the
vector $\mathbf{r}$ counterclockwise, around the $x^3$ axis, by the angle $\theta^3$. For a small
angle, at first order in $\theta^3$, one obtains the rotated vector
$$\mathbf{r}' \simeq \mathbf{r} + \theta^3\,\hat{\mathbf{k}} \times \mathbf{r} \tag{2.45a}$$
where $\hat{\mathbf{k}}$ represents the unit vector of the $x^3$ axis. One can put $\mathbf{r}$ and $\mathbf{r}'$ in the
three component column vectors $[r]$ and $[r']$ so that the previous equation
can be written with the vector algebra notation as
$$[r'] \simeq \left(1 + \theta^3 s^3\right)[r] \tag{2.45b}$$
where $s^3$ (see the next equation) represents the three-dimensional generator
matrix of the rotations around the axis $x^3$. The same procedure can be
repeated for the rotations around the axes $x^1$ and $x^2$. The generator matrices
are
$$
s^1 =
\begin{pmatrix}
0 & 0 & 0 \\
0 & 0 & -1 \\
0 & 1 & 0
\end{pmatrix}, \quad
s^2 =
\begin{pmatrix}
0 & 0 & 1 \\
0 & 0 & 0 \\
-1 & 0 & 0
\end{pmatrix}, \quad
s^3 =
\begin{pmatrix}
0 & -1 & 0 \\
1 & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
\tag{2.46}
$$
As is well known, the rotation generator matrices given above do not commute:
$$[s^\alpha, s^\beta] = \epsilon^{\alpha\beta\delta} s^\delta \tag{2.47}$$
where we have introduced the Levi-Civita antisymmetric tensor $\epsilon^{\alpha\beta\delta}$.
As for the noncommutativity, this situation is partially similar to the case
of the boost generators shown in eq.(2.43) but, for the rotations, eq.(2.47)
shows that, given two generator matrices, their commutator is proportional
to the third matrix, while we have not yet identiﬁed the physical meaning of
the matrix in the r.h.s. of eq.(2.44).
Pay attention! In quantum mechanics, from eqs.(2.46), (2.47) we can introduce the spin 1 operators as
$$j^\alpha_{1} = i\hbar\, s^\alpha$$
satisfying the standard angular momentum commutation rules. Our $s^\alpha$ do not directly represent the three spin operators.
It is very important to note that the physical laws must be invariant under rotations. To do physics we assume that space is isotropic. Just as the hypothesis of an absolute reference frame must be rejected, the idea of a preferential direction in space is not allowed by the conceptual foundations of physics.
Obviously, rotational invariance must be compatible with relativity. This fact is immediately evident recalling that the rotations mix the spatial components of a vector without changing the scalar product of two three-vectors, say $\mathbf{a}$ and $\mathbf{b}$: $\mathbf{a}'\cdot\mathbf{b}' = \mathbf{a}\cdot\mathbf{b}$.
The time components of the corresponding four-vectors also remain unaltered: $a'^0 = a^0$ and $b'^0 = b^0$. Consider, as two relevant examples, the time and energy that represent the zero components of the position and momentum four-vectors, respectively.
It means that rotations satisfy the invariance equation (2.3) and, in consequence, they are fully compatible with relativity. In terms of $4\times4$ matrices, eq.(2.45b) is generalized as
$$[x'] \simeq \left(1 + \theta_3 S^3\right)[x] \tag{2.48}$$
The $4\times4$ generator matrices are defined in terms of the $3\times3$ matrices $s^\alpha$ as
$$S^\alpha = \left(\begin{array}{c|ccc} 0 & 0 & 0 & 0 \\ \hline 0 & & & \\ 0 & & s^\alpha & \\ 0 & & & \end{array}\right) \tag{2.49}$$
As will be written in eq.(2.50b), the matrices $S^\alpha$ obviously satisfy the same commutation rules as in eq.(2.47).
We can now verify that the r.h.s. of eq.(2.44) represents $-S^3$.
In general one has the following commutation rules
$$[K^\alpha, K^\beta] = -\epsilon^{\alpha\beta\delta} S^\delta \tag{2.50a}$$
For completeness, we also give
$$[S^\alpha, S^\beta] = \epsilon^{\alpha\beta\delta} S^\delta \tag{2.50b}$$
and
$$[S^\alpha, K^\beta] = \epsilon^{\alpha\beta\delta} K^\delta \tag{2.50c}$$
where the last equation means that the boost generator $\mathbf{K}$ transforms as a vector under rotations.
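The commutation rules (2.50a-c) can be checked by direct matrix multiplication. The following Python sketch does so for all index pairs. The explicit form of the boost generators used here (entries $-1$ connecting the time row and the spatial row $\alpha$) is an assumption, because eq.(2.41) is not reproduced in this passage; the overall sign of $K^\alpha$ does not affect any of the three rules. All function names are illustrative.

```python
# Check of the Lorentz algebra, eqs.(2.50a-c), with explicit 4x4 matrices.

def matmul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def eps(a, b, d):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    return (a - b) * (b - d) * (d - a) // 2

def boost_generator(a):
    """K^a (assumed form): -1 in the (0, a) and (a, 0) entries, zero elsewhere."""
    k = [[0] * 4 for _ in range(4)]
    k[0][a] = k[a][0] = -1
    return k

def rotation_generator(a):
    """S^a of eq.(2.49): spatial block (s^a)_{bd} = -eps(a, b, d), as in eq.(2.46)."""
    s = [[0] * 4 for _ in range(4)]
    for b in (1, 2, 3):
        for d in (1, 2, 3):
            s[b][d] = -eps(a, b, d)
    return s

K = {a: boost_generator(a) for a in (1, 2, 3)}
S = {a: rotation_generator(a) for a in (1, 2, 3)}

def eps_sum(gen, a, b, sign=1):
    """Return sign * sum over d of eps(a, b, d) * gen[d]."""
    return [[sign * sum(eps(a, b, d) * gen[d][i][j] for d in (1, 2, 3))
             for j in range(4)] for i in range(4)]

def algebra_holds():
    """True if eqs.(2.50a), (2.50b) and (2.50c) hold for all index pairs."""
    pairs = [(a, b) for a in (1, 2, 3) for b in (1, 2, 3)]
    return all(
        comm(K[a], K[b]) == eps_sum(S, a, b, -1)      # eq.(2.50a)
        and comm(S[a], S[b]) == eps_sum(S, a, b)      # eq.(2.50b)
        and comm(S[a], K[b]) == eps_sum(K, a, b)      # eq.(2.50c)
        for a, b in pairs
    )

print("Lorentz algebra satisfied:", algebra_holds())
print("[K1, K2] == -S3:", comm(K[1], K[2]) == eps_sum(S, 1, 2, -1))
```

Integer arithmetic is used throughout, so the comparisons are exact and no numerical tolerance is needed.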
As for the derivation of the Dirac equation that will be performed in the next section, we anticipate here that a set of $K^\alpha$ and $S^\alpha$ matrices (different from eqs.(2.41) and (2.46),(2.49)) will be found that satisfy the same commutation rules as eqs.(2.50a-c). In mathematical terms, these new matrices are a different representation of the Lorentz group, which guarantees the relativistic invariance of the theory.
For the study of the Dirac equation, it is also necessary to introduce another invariance property related to a new, discrete, space-time transformation. It is the parity transformation, or spatial inversion, that changes the position three-vector $\mathbf{r}$ into $-\mathbf{r}$, leaving the time component unaltered. This definition shows that spatial inversion does not change the invariant product of two four-vectors and, in consequence, is compatible with relativity.
Parity is a discrete transformation that does not depend on any parameter. On the other hand, recall that rotations are continuous transformations that depend continuously on the rotation angle. Obviously, spatial inversion cannot be accomplished by means of rotations.
Note that, under parity transformation, ordinary or polar vectors, such as the momentum $\mathbf{p}$, change sign in the same way as the position $\mathbf{r}$, while axial vectors, such as the orbital angular momentum $\mathbf{l} = \mathbf{r}\times\mathbf{p}$, do not change sign. On the other hand, they transform in the standard way under rotations.
Using the definition given above, parity transformation on the space-time position,
$$[x'] = \Pi[x]$$
is accomplished by means of the diagonal Minkowski matrix. We can write
$$\Pi = g$$
which holds for the spatial inversion of all four-vectors.
From the previous definition, one can easily verify the following anticommutation rule with the boost generators
$$\{\Pi, K^\alpha\} = 0 \tag{2.51a}$$
or equivalently
$$\Pi K^\alpha \Pi = -K^\alpha \tag{2.51b}$$
where we have used the standard property $\Pi^2 = 1$.
Furthermore
$$[\Pi, S^\alpha] = 0 \tag{2.52a}$$
or equivalently
$$\Pi S^\alpha \Pi = S^\alpha \tag{2.52b}$$
It shows that the rotation generators do not change sign under spatial inversion, that is, they behave as an axial vector.
The determinant of Lorentz boosts and rotations is equal to $+1$, while for spatial inversion it is $-1$.
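As a quick illustration, the parity relations (2.51b) and (2.52b) and the value of the determinant can be verified with the four-vector matrices. The sketch below takes $\Pi = g = \mathrm{diag}(1,-1,-1,-1)$ from the text; the form of the boost generators is again an assumption (eq.(2.41) is not shown in this passage), and the helper names are illustrative.

```python
# Check of the parity relations, eqs.(2.51b) and (2.52b), for four-vectors.

def matmul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def eps(a, b, d):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    return (a - b) * (b - d) * (d - a) // 2

def boost_generator(a):
    """K^a (assumed form): -1 in the (0, a) and (a, 0) entries."""
    k = [[0] * 4 for _ in range(4)]
    k[0][a] = k[a][0] = -1
    return k

def rotation_generator(a):
    """S^a of eq.(2.49), with spatial block (s^a)_{bd} = -eps(a, b, d)."""
    s = [[0] * 4 for _ in range(4)]
    for b in (1, 2, 3):
        for d in (1, 2, 3):
            s[b][d] = -eps(a, b, d)
    return s

# Parity matrix: Pi = g = diag(1, -1, -1, -1).
PI = [[(1 if i == 0 else -1) if i == j else 0 for j in range(4)] for i in range(4)]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def conjugate_by_parity(m):
    """Return Pi m Pi."""
    return matmul(matmul(PI, m), PI)

def negate(m):
    return [[-x for x in row] for row in m]

boosts_flip = all(
    conjugate_by_parity(boost_generator(a)) == negate(boost_generator(a))
    for a in (1, 2, 3))                                   # eq.(2.51b)
rotations_keep = all(
    conjugate_by_parity(rotation_generator(a)) == rotation_generator(a)
    for a in (1, 2, 3))                                   # eq.(2.52b)

det_parity = 1
for i in range(4):
    det_parity *= PI[i][i]       # determinant of a diagonal matrix

print(boosts_flip, rotations_keep, det_parity)
```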
Note that eqs.(2.51a)-(2.52b) represent general properties of the parity trans-
formation that do not depend on the tensor to which it is applied. They are
derived, and hold, in the case of four-vectors, but they are also assumed to
hold for the Dirac spinors. But, in this case, the following critical discussion
is necessary.
In fact, after these formal developments, we can ask: being parity compatible
with relativity, are the physical laws of nature really invariant under spatial
inversion?
The situation is diﬀerent with respect to rotations, that represent a necessary
invariance for our understanding of nature.
Initially, parity was considered an invariance of physics, but in the ﬁfties
the situation changed. In fact, some experiments on beta decay showed that
weak interactions are not invariant under spatial inversion. On the other
hand, gravitational, electromagnetic and strong (or nuclear) interactions are
parity invariant.
When deriving the Dirac equation, we shall require the fulﬁllment of parity
invariance, having in mind the study of electromagnetic and strong interac-
tions. In a following work we shall discuss the weakly-interacting neutrino
equations, that are not invariant under parity transformation.
We conclude this section mentioning another discrete transformation, called time reversal, that consists in changing the sign of time: $t' = -t$. Classical laws of physics are invariant with respect to this change of the direction of time. The action of time reversal on the space-time four-vector is represented by the matrix $T = -\Pi = -g$.
At microscopic level, time reversal invariance is exact in strong and electro-
magnetic processes, but not in weak interactions. However, this violation is
of diﬀerent kind with respect to that of parity transformation.
We conclude pointing out that in the formalism of ﬁeld theories the product of
the three transformations : C (Charge Conjugation), P (Parity) and T (Time
Reversal ), is an exact invariance, as conﬁrmed by the available experimental
data.
3 Relativistic Quantum Wave Equations
In this Section we shall study the procedure to implement the principles of
special relativity in the formalism of quantum mechanics in order to introduce
the fundamental Dirac equation.
As a preliminary step, in Subsection 3.1 we shall analyze the general properties of the four-momentum operator in quantum mechanics and discuss at a pedagogical level the Klein-Gordon equation for spinless particles.
3.1 Generalities and Spin 0 Equation
Let us first recall the Schrödinger equation for a free particle. In the coordinate representation it has the form
$$i\hbar \frac{\partial \psi(t,\mathbf{r})}{\partial t} = -\frac{\hbar^2}{2m} \nabla^2 \psi(t,\mathbf{r}) \tag{3.1}$$
It can be obtained by means of the following eqs.(3.2a-c), by translating into differential operators acting on the wave function $\psi(t,\mathbf{r})$ the standard nonrelativistic expression
$$E = \frac{\mathbf{p}^2}{2m}$$
It clearly shows that the Schrödinger equation (3.1) is essentially nonrelativistic or, in other words, not compatible with Lorentz transformations.
As discussed in refs.[5,6], the fundamental relation that is used for the study of (relativistic) quantum mechanics associates the four-momentum of a particle to a space-time differential operator in the following form
$$p^\mu = i\hbar\, \partial^\mu \tag{3.2a}$$
that, as explained in Subsection 2.1, means
$$p^0 c = E = i\hbar \frac{\partial}{\partial t} \tag{3.2b}$$
and
$$\mathbf{p} = -i\hbar \nabla = -i\hbar \frac{\partial}{\partial \mathbf{r}} \tag{3.2c}$$
The reader may be surprised that at relativistic level the same relations hold as in nonrelativistic quantum mechanics. As a matter of fact, eqs.(3.2a-c) express general experimental properties of quantum waves, as given by the De Broglie hypothesis.
Furthermore, the connection with relativity is possible because $i\hbar\,\partial^\mu$ is a contravariant four-vector operator.
The easiest choice to write a relativistic wave equation consists in translating eq.(2.16) (instead of the nonrelativistic expression!) in terms of the space-time differential operators given by the previous equations. One has
$$-\hbar^2 \partial_\mu \partial^\mu \psi(t,\mathbf{r}) = (mc)^2 \psi(t,\mathbf{r}) \tag{3.3a}$$
or, more explicitly, multiplying by $c^2$
$$-(\hbar c)^2 \left( \frac{1}{c^2} \frac{\partial^2}{\partial t^2} - \nabla^2 \right) \psi(t,\mathbf{r}) = m^2 c^4\, \psi(t,\mathbf{r}) \tag{3.3b}$$
Exactly as done for the electromagnetic field equations in Subsection 2.3, recalling the invariance of $\partial_\mu \partial^\mu$, one realizes that the previous equation is manifestly covariant.
In order to make explicit calculations in atomic, nuclear and subnuclear physics, it is necessary to remember some numerical values (and the corresponding units!). We start considering the following quantity that appears in eq.(3.3b):
$$\hbar c = 197.327\ \mathrm{MeV\,fm}$$
that is the Planck constant $\hbar$ multiplied by the speed of light $c$, expressed as an energy multiplied by a length. The energy is measured in MeV
$$1\ \mathrm{MeV} = 10^{6}\ \mathrm{eV} = 1.6022 \times 10^{-13}\ \mathrm{J}$$
and the length in fm (femtometers or fermis)
$$1\ \mathrm{fm} = 10^{-15}\ \mathrm{m} = 10^{-13}\ \mathrm{cm}$$
Furthermore, the particle masses are conveniently expressed in terms of their rest energies. We give a few relevant examples:
$$m_e c^2 = 0.511\ \mathrm{MeV}$$
for the electron,
$$m_p c^2 = 938.27\ \mathrm{MeV}$$
for the proton, and
$$m_n c^2 = 939.57\ \mathrm{MeV}$$
for the neutron.
Also note that the operator $\partial_\mu$ is, dimensionally, a length$^{-1}$, that in our units gives fm$^{-1}$.
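To get a feeling for these units, one can compute the length $\hbar c/(mc^2)$, that is the inverse of the quantity $mc/\hbar$ obtained when eq.(3.3a) is divided by $\hbar^2$. The short Python sketch below uses only the numerical values quoted above; the choice of this particular length as an example, and the function name, are ours and serve only as an illustration.

```python
# Natural length scale hbar*c / (m*c^2) in fm, from the values quoted in the text.

HBARC_MEV_FM = 197.327          # hbar * c in MeV fm

REST_ENERGY_MEV = {             # rest energies m * c^2 in MeV
    "electron": 0.511,
    "proton": 938.27,
    "neutron": 939.57,
}

def length_scale_fm(rest_energy_mev):
    """Return hbar*c / (m*c^2) in fm for a rest energy given in MeV."""
    return HBARC_MEV_FM / rest_energy_mev

for name, rest_energy in REST_ENERGY_MEV.items():
    print(f"{name:8s}  m c^2 = {rest_energy:8.3f} MeV   "
          f"hbar c / (m c^2) = {length_scale_fm(rest_energy):.4g} fm")
```

The electron length scale turns out to be hundreds of fm, while for the nucleons it is a fraction of a fm, which is why MeV and fm are convenient units in nuclear physics.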
Going back to the formal aspects of eqs.(3.3a,b), usually called the Klein-Gordon equation, we note the following two aspects:
(i) Being based on the relativistic relation among energy, momentum and mass of eq.(2.16), together with the De Broglie hypothesis of eqs.(3.2a-c), the manifestly covariant Klein-Gordon equation has a general validity, in the sense that the wave functions of all relativistic free particles must satisfy that equation. As for the Dirac equation for spin 1/2 particles, see eq.(3.41) and the following discussion.
(ii) The particle spin does not appear in the Klein-Gordon equation. Equivalently, the function $\psi(t,\mathbf{r})$ is a one-component, or scalar, field function that describes a spin 0 particle, as happens in nonrelativistic quantum mechanics when spin is not included.
The effects of rotations and Lorentz boosts only consist in the argument re-expression discussed in Subsection 2.2. As explained in textbooks of quantum mechanics, see for example ref.[7], the (infinitesimal) rotations are performed by using the orbital angular momentum operator as generator.
The Klein-Gordon equation admits plane wave solutions, corresponding to eigenstates of the four-momentum $p^\mu = (E/c, \mathbf{p})$, in the form
$$\psi_p(t,\mathbf{r}) = N \exp\left[ \frac{i}{\hbar} (-Et + \mathbf{p}\cdot\mathbf{r}) \right] \tag{3.4a}$$
$$= N \exp\left[ -\frac{i}{\hbar}\, p_\mu x^\mu \right] = N \exp\left[ -\frac{i}{\hbar}\, [p]^T g [x] \right] \tag{3.4b}$$
where $N$ represents a normalization constant. The expression (3.4b) has been written using explicitly the Lorentz covariant notation.
The most relevant point here is that the energy eigenvalue $E$ can assume both positive and negative values (we shall see that this holds true also for the Dirac equation!). We have
$$E = p^0 c = \lambda\, \epsilon(\mathbf{p}) \tag{3.5a}$$
where
$$\epsilon(\mathbf{p}) = \left[ (\mathbf{p}c)^2 + (mc^2)^2 \right]^{1/2} \tag{3.5b}$$
and the energy sign $\lambda = \pm 1$ have been introduced.
In quantum mechanics the $\lambda = -1$ solutions cannot be eliminated. They are strictly necessary to have a complete set of solutions of the wave equation. They can be correctly interpreted by means of charge conjugation in the framework of field theory, as done in most textbooks. Historically, starting from the work by Dirac, negative energy solutions led to the very important discovery of the antiparticles, that have the same mass (and spin) but opposite charge with respect to the corresponding particles.
We shall not analyze this problem here but postpone it to a subsequent work.
As for the positive energy solutions, one can immediately check that in the nonrelativistic regime ($|\mathbf{p}|c \ll mc^2$) the Schrödinger limit is obtained.
As an illustrative exercise, it may be useful to perform a Lorentz boost in eq.(3.4b). Given that we are considering a scalar field, we only have to make the argument re-expression.
In this connection recall that, for positive energy, the wave function of eqs.(3.4a,b) represents a particle state such that an observer in $S$ measures the particle four-momentum $p^\mu$.
In the reference frame $S'$, for the space-time position one must use
$$[x] = L^{-1}[x']$$
(both the velocity and the hyperbolic parametrizations can be adopted and, for simplicity, no argument has been written in $L^{-1}$) and replace it in eq.(3.4b). In the argument of the plane wave exponential one has
$$[p]^T g [x] = [p]^T g L^{-1} [x'] = [p]^T L g [x'] = [p']^T g [x']$$
where we have used $gL^{-1} = Lg$ from eq.(2.10) and also $[p]^T L = [p']^T$. In the previous result we recognize the invariance equation that, in standard notation, reads
$$p_\mu x^\mu = p'_\mu x'^\mu$$
Physically, it means that an observer in $S'$ measures the particle transformed four-momentum $p'^\mu$.
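The invariance of the plane wave phase can also be checked with numbers. The Python sketch below boosts an arbitrary four-momentum and an arbitrary space-time point along $x^1$ and compares $p_\mu x^\mu$ in the two frames. The sign convention of the boost matrix (hyperbolic parametrization with $-\mathrm{sh}\,\omega$ off the diagonal) is assumed here, since the corresponding equations of Section 2 are not reproduced in this passage; the numerical components are arbitrary illustrative values.

```python
import math

G = (1, -1, -1, -1)   # diagonal of the metric matrix g

def boost_x1(omega):
    """Boost along x^1 in the hyperbolic parametrization (assumed sign convention)."""
    ch, sh = math.cosh(omega), math.sinh(omega)
    return [[ch, -sh, 0, 0], [-sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def apply(matrix, vector):
    """Matrix times column vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def minkowski_product(a, b):
    """Invariant product a_mu b^mu = [a]^T g [b]."""
    return sum(g * x * y for g, x, y in zip(G, a, b))

p = [5.0, 3.0, 1.0, -2.0]     # contravariant components of [p] (illustrative)
x = [2.0, -1.0, 0.5, 4.0]     # contravariant components of [x] (illustrative)

L = boost_x1(0.7)
p_prime, x_prime = apply(L, p), apply(L, x)

phase_S = minkowski_product(p, x)
phase_S_prime = minkowski_product(p_prime, x_prime)
print(phase_S, phase_S_prime)
```

The two printed numbers agree up to rounding errors, so the observer in $S'$ describes the same plane wave with the transformed four-momentum.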
The Klein-Gordon equation admits a conserved current. We shall consider the form, related to a transition process, that is used in perturbation theory to calculate the corresponding probability amplitude.
To derive the conserved current one has to take the following three steps.
(i) Take eq.(3.3b) with a plane wave solution $\psi_{p_I}(t,\mathbf{r})$ for an initial state of four-momentum $p_I$.
(ii) Take eq.(3.3b) with a plane wave solution $\psi_{p_F}(t,\mathbf{r})$ for a final state of four-momentum $p_F$ and take the complex conjugate.
(iii) Multiply the equation of step (i) by the complex conjugate $\psi^*_{p_F}(t,\mathbf{r})$ and the equation of step (ii) by $\psi_{p_I}(t,\mathbf{r})$. Then subtract these two equations, obtaining
$$\left[ \partial_\mu \partial^\mu \psi^*_{p_F}(t,\mathbf{r}) \right] \psi_{p_I}(t,\mathbf{r}) - \psi^*_{p_F}(t,\mathbf{r})\, \partial_\mu \partial^\mu \psi_{p_I}(t,\mathbf{r}) = 0 \tag{3.6}$$
Note that the mass term has disappeared. The previous equation can be equivalently written as a conservation equation in the form
$$\partial_\mu J^\mu_{FI}(t,\mathbf{r}) = 0 \tag{3.7}$$
where the conserved current is defined as (multiplying by the conventional factor $i\hbar$)
$$J^\mu_{FI}(t,\mathbf{r}) = i\hbar \left[ \psi^*_{p_F}(t,\mathbf{r})\, \partial^\mu \psi_{p_I}(t,\mathbf{r}) - \left( \partial^\mu \psi^*_{p_F}(t,\mathbf{r}) \right) \psi_{p_I}(t,\mathbf{r}) \right] \tag{3.8a}$$
$$= (p^\mu_I + p^\mu_F)\, N_I N_F \exp\left[ \frac{i}{\hbar}\, q_\mu x^\mu \right] \tag{3.8b}$$
In the last equation the four-momentum transfer $q^\mu = p^\mu_F - p^\mu_I$ of the transition process has been introduced.
The conserved current $J^\mu_{FI}(t,\mathbf{r})$ is manifestly a four-vector.
Eq.(3.8b), which is obtained by explicit use of the wave functions, is very interesting. The first factor $(p^\mu_I + p^\mu_F)$ represents the so-called four-vector vertex factor.
Applying the derivative operator $\partial_\mu$ to eq.(3.8b), one verifies that current conservation relies on the following kinematic property of the vertex factor
$$q_\mu (p^\mu_I + p^\mu_F) = p_{F\mu}\, p^\mu_F - p_{I\mu}\, p^\mu_I = 0 \tag{3.9}$$
that is automatically satisfied because the mass of the particle remains the same in the initial and final state.
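The kinematic property (3.9) is easy to test numerically. In the following Python sketch two on-shell four-momenta with the same mass are built from arbitrary three-momenta by means of eq.(3.5b) with $\lambda = +1$, and the contraction of the momentum transfer with the vertex factor is evaluated. The momentum values are illustrative, all quantities are energies in MeV (momenta are multiplied by $c$), and the function names are ours.

```python
import math

G = (1, -1, -1, -1)   # diagonal of the metric matrix g

def on_shell_momentum(rest_energy, pc):
    """Four-momentum times c for a free particle, positive energy branch of eq.(3.5b)."""
    energy = math.sqrt(sum(x * x for x in pc) + rest_energy ** 2)
    return [energy, *pc]

def minkowski_product(a, b):
    """Invariant product a_mu b^mu."""
    return sum(g * x * y for g, x, y in zip(G, a, b))

REST_ENERGY = 938.27   # proton rest energy in MeV, as quoted in the text

p_initial = on_shell_momentum(REST_ENERGY, [120.0, -40.0, 300.0])   # illustrative values
p_final = on_shell_momentum(REST_ENERGY, [-75.0, 210.0, 55.0])      # illustrative values

q = [f - i for f, i in zip(p_final, p_initial)]         # four-momentum transfer
vertex = [f + i for f, i in zip(p_final, p_initial)]    # vertex factor of eq.(3.8b)

contraction = minkowski_product(q, vertex)
print(contraction)
```

The printed value vanishes up to rounding errors; choosing different masses for the initial and final state would instead give the difference of the squared rest energies.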
As for the general properties of the current given in eqs.(3.8a,b), we find that in the static case, i.e. $p_F = p_I$, the time component $J^0_{II}$ is negative if negative energy states ($\lambda = -1$) are considered. It means that one cannot attach to $J^0_{II}$ the meaning of probability density as was done with the Schrödinger equation. For this reason we do not discuss in more detail the plane wave normalization constant $N$.
Again, a complete interpretation of the Klein-Gordon equation and of its current is obtained in the context of field theory.
3.2 Spin 1/2 Dirac Equation
In nonrelativistic quantum mechanics a spin 1/2 particle is described by a two-component spinor $\phi$. The spinor rotation is performed by mixing its components. At first order in the rotation angle $\theta_\alpha$, one has
$$\phi' \simeq \left( 1 - \frac{i}{2}\, \theta_\alpha \sigma^\alpha \right) \phi \tag{3.10}$$
where the three Pauli matrices $\sigma^\alpha$ have been introduced. Their properties are studied in the Appendix.
What is important to note here is that the matrix operators $S^\alpha_{[2]} = -\frac{i}{2}\sigma^\alpha$ play the same rôle in realizing the rotations as the matrices $S^\alpha$ defined in eq.(2.49). For this reason, their commutation rules are the same as those given in eq.(2.50b). Also, the spin or intrinsic angular momentum operator is defined [7] multiplying the Pauli matrices $\sigma^\alpha$ by $\hbar/2$.
Formally, we have introduced the two-dimensional representation of the rotation group (the three-dimensional representation corresponds to spin 1, etc.).
Finally, the spatial argument of the spinor $\phi$ (not written explicitly in eq.(3.10)) is rotated with the same rules previously discussed for the arguments of the field functions, that is, one has to perform the argument re-expression.
In quantum mechanics, the generator of these rotations is the orbital angular momentum operator $\mathbf{l} = \mathbf{r} \times \mathbf{p}$, so that the total angular momentum is given by the three generators of the total rotation (on the spinor and on the argument), in the form
$$j^\alpha = l^\alpha + \frac{\hbar}{2}\, \sigma^\alpha$$
We can now try to introduce relativity. We shall follow a strategy similar to that of refs.[5,8], but avoiding many mathematical details that are unessential at this level.
First, we note that for a particle at rest, the relativistic theory must coincide
with the previous nonrelativistic treatment.
Second, we ask the following question: can we find a set of three $2\times2$ boost matrices (acting on the two-component spinors) that satisfy, with the $S^\alpha_{[2]} = -\frac{i}{2}\sigma^\alpha$ replacing the $S^\alpha$, the same commutation rules as the $K^\alpha$ in eqs.(2.50a-c)?
The answer is yes. A simple inspection of eqs.(2.50a-c) and use of the standard property of the Pauli matrices given in eq.(A.2) show that the matrices $K^\alpha_{[2]} = \frac{\tau}{2}\sigma^\alpha$ satisfy those commutation rules.
Eq.(2.50a) requires $\tau^2 = 1$, while eq.(2.50c) does not give any new constraint on the parameter $\tau$, which, in consequence, can be chosen equivalently as $\tau = \pm 1$.
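The statement about the parameter $\tau$ can be tested directly. The Python sketch below builds $S^\alpha_{[2]} = -\frac{i}{2}\sigma^\alpha$ and $K^\alpha_{[2]} = \frac{\tau}{2}\sigma^\alpha$ from the standard Pauli matrices and checks eqs.(2.50a-c) for $\tau = +1$, $\tau = -1$ and, as a counterexample, $\tau = 2$. The function names are illustrative.

```python
# 2x2 generators S = -(i/2) sigma and K = (tau/2) sigma, tested against eqs.(2.50a-c).

SIGMA = {
    1: [[0, 1], [1, 0]],
    2: [[0, -1j], [1j, 0]],
    3: [[1, 0], [0, -1]],
}

def matmul(a, b):
    """Matrix product of two square matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

def eps(a, b, d):
    """Levi-Civita symbol for indices taken from 1, 2, 3."""
    return (a - b) * (b - d) * (d - a) // 2

def eps_sum(gen, a, b, sign=1):
    """Return sign * sum over d of eps(a, b, d) * gen[d]."""
    return [[sign * sum(eps(a, b, d) * gen[d][i][j] for d in (1, 2, 3))
             for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    """Entry-by-entry comparison of two matrices."""
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))

def satisfies_lorentz_algebra(tau):
    """True if the 2x2 matrices obey eqs.(2.50a-c) for the given tau."""
    S = {a: scale(-0.5j, SIGMA[a]) for a in (1, 2, 3)}
    K = {a: scale(0.5 * tau, SIGMA[a]) for a in (1, 2, 3)}
    pairs = [(a, b) for a in (1, 2, 3) for b in (1, 2, 3)]
    return all(
        close(comm(K[a], K[b]), eps_sum(S, a, b, -1))     # eq.(2.50a)
        and close(comm(S[a], S[b]), eps_sum(S, a, b))     # eq.(2.50b)
        and close(comm(S[a], K[b]), eps_sum(K, a, b))     # eq.(2.50c)
        for a, b in pairs
    )

for tau in (+1, -1, 2):
    print("tau =", tau, "->", satisfies_lorentz_algebra(tau))
```

For $\tau = 2$ only eq.(2.50a) fails, in agreement with the fact that this is the only rule that is quadratic in $K$.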
However, a serious problem arises when trying to introduce the parity transformation matrix. It must satisfy both the anticommutation rule with the boost generators as in eq.(2.51a) and the commutation rule with the rotation generators as in eq.(2.52a). In our $2\times2$ case, boost generators and rotation generators are both proportional to the Pauli matrices, so there is no matrix that satisfies the two rules at the same time [8].
In consequence, we can construct a two-dimensional theory for spin 1/2 particles that is invariant under Lorentz transformations but not under parity transformations.
On the other hand, the ﬁrst objective that we want to reach is the study
of the electromagnetic interactions of the electrons in atomic physics and in
scattering processes. To this aim we need an equation that is invariant under
spatial inversion.
A parity noninvariant equation for spin 1/2 particles, based on the transfor-
mation properties outlined above, will be used for the study of the neutrinos
that are created, destroyed and in general interact only by means of weak
interactions that are not invariant under spatial inversions.
In order to construct a set of matrices for spin 1/2 particles satisfying both
Lorentz and parity commutation rules, we make the two following steps:
(i) we consider matrices with larger dimension;
(ii) we exploit the sign ambiguity of τ in the boost generator.
More precisely, it is sufficient to introduce the following $4\times4$ block matrices
$$K^\delta_{[D]} = \frac{1}{2} \begin{pmatrix} \sigma^\delta & 0 \\ 0 & -\sigma^\delta \end{pmatrix} = \frac{1}{2}\, \alpha^\delta \tag{3.11}$$
where we have taken $\tau = +1$ and $\tau = -1$ in the upper and lower diagonal block, respectively.
Important note: the previous equation represents the definition of the three matrices $\alpha^\delta$. We use the Greek letter $\delta$ (instead of $\alpha$) as spatial index to avoid confusion between the indices and the matrices.
With no difficulty, for the spinor rotations we introduce
$$\Sigma^\delta = \begin{pmatrix} \sigma^\delta & 0 \\ 0 & \sigma^\delta \end{pmatrix} \tag{3.12a}$$
so that
$$S^\delta_{[D]} = -\frac{i}{2}\, \Sigma^\delta \tag{3.12b}$$
Note that, taking into account the discussion for the transformation of the two-dimensional spinors with $S^\delta_{[2]}$ and $K^\delta_{[2]}$, the commutation rules of eqs.(2.50a-c) for Lorentz transformations and rotations are automatically satisfied by the block diagonal matrices $K^\delta_{[D]}$, $S^\delta_{[D]}$ introduced above.
For the spatial inversion, we find the $4\times4$ block matrix
$$\Pi_{[D]} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} = \beta \tag{3.13}$$
with the property
$$\Pi_{[D]} = \Pi^\dagger_{[D]} = \Pi^{-1}_{[D]} \tag{3.14}$$
It satisfies the anticommutation with the boost generators of eqs.(2.51a,b), that means
$$\{\Pi_{[D]}, K^\delta_{[D]}\} = \{\beta, \alpha^\delta\} = 0 \tag{3.15}$$
The specific form of $\Pi_{[D]}$ straightforwardly satisfies also the rules (2.52a,b).
In technical words, we have obtained a representation of the Lorentz group, including parity, for spin 1/2 particles.
Introducing explicitly the four-component Dirac spinor $u$, its boost transformation is written in the form
$$u' = B_{[D]}(\omega)\, u \tag{3.16}$$
The (infinitesimal) form of $B_{[D]}(\omega)$ at first order in $\omega$ is
$$B_{[D]}(\omega) \simeq 1 - \frac{1}{2}\, \omega\, (\boldsymbol{\alpha}\cdot\hat{\mathbf{v}}) \tag{3.17a}$$
where, as usual, $\hat{\mathbf{v}}$ represents the unit vector of the boost velocity. (For simplicity, we do not write it explicitly in $B_{[D]}(\omega)$.)
The finite transformation is obtained in the same way as in eqs.(2.36a,b) and (2.40) but using the properties of the Pauli matrices, as is shown in detail in eqs.(A.17),(A.18) and in the following discussion in the Appendix. One has
$$B_{[D]}(\omega) = \exp\left[ -\frac{\omega}{2} (\boldsymbol{\alpha}\cdot\hat{\mathbf{v}}) \right] = \mathrm{ch}\left(\frac{\omega}{2}\right) - (\boldsymbol{\alpha}\cdot\hat{\mathbf{v}})\, \mathrm{sh}\left(\frac{\omega}{2}\right) \tag{3.17b}$$
On the other hand, the spinor rotations are obtained by replacing the $\sigma^\delta$ with the $4\times4$ matrices $\Sigma^\delta$ in eq.(3.10).
Furthermore, when changing the reference frame, one always has to perform the argument re-expression in the Dirac spinors $u$.
We note that, while the rotations are represented by a unitary operator, the Lorentz boosts are not. More precisely, $B_{[D]}(\omega)$ is a Hermitian operator, that is
$$B^\dagger_{[D]}(\omega) = B_{[D]}(\omega) \tag{3.18}$$
A unitary, but infinite dimensional (or nonlocal) representation of the boost for spin 1/2 particles can be obtained. This problem will be studied in a different work.
The next task is to construct matrix elements (in the sense of vector algebra and not of quantum mechanics, because no spatial integration is performed) of the form $u_b^\dagger M u_a$ that, when boosting $u_a$ and $u_b$, transform as Lorentz scalars and Lorentz four-vectors. The case of pseudoscalars and axial vectors will be studied in Subsection 3.4.
We shall keep using the term matrix elements throughout this work, but in most textbooks they are commonly denoted as Dirac covariant bilinear quantities.
Given a generic $4\times4$ matrix $M$, by means of eq.(3.17a) the transformation of the matrix element, up to first order in $\omega$, is
$$u_b'^\dagger M u_a' \simeq u_b^\dagger M u_a - \frac{1}{2}\, \omega\, \hat{v}^\delta\, u_b^\dagger \{\alpha^\delta, M\} u_a \tag{3.19}$$
The Lorentz scalar matrix element is easily determined by means of a matrix $M_s$ that anticommutes with the $\alpha^\delta$, so that the second term in the r.h.s. of eq.(3.19) vanishes. Simply recalling eqs.(3.15) and (3.13) one has
$$M_s = \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{3.20}$$
where we are using the definition of the $\beta$ Dirac matrix given in eq.(3.13).
As for the four-vector matrix element, one needs four matrices $M^\mu_v$. To find their form in a simple way, let us consider a boost along the $x^1$-axis, that in eq.(3.19) means $\hat{\mathbf{v}} = (1, 0, 0)$. By means of eq.(3.19), to recover the four-vector Lorentz transformation (see eqs.(2.1) and (2.34)), one needs
$$\frac{1}{2} \{\alpha^1, M^0_v\} = M^1_v \tag{3.21}$$
for the transformation of $M^0_v$, and
$$\frac{1}{2} \{\alpha^1, M^1_v\} = M^0_v \tag{3.22}$$
for the transformation of $M^1_v$.
The solution is easily found calculating the anticommutators of the Dirac matrices $\alpha^\delta$ by means of the anticommutators of the Pauli matrices of eq.(A.3). One has
$$M^0_v = \alpha^0 = 1, \quad M^1_v = \alpha^1 \tag{3.23a}$$
and the solution for all the components is
$$M^\mu_v = \alpha^\mu = (1, \alpha^1, \alpha^2, \alpha^3) \tag{3.23b}$$
Pay attention: $\alpha^0 = 1$ is not introduced in most textbooks.
We can summarize the previous equations, also for finite Lorentz boosts, as
$$B_{[D]}(\omega)\, \beta\, B_{[D]}(\omega) = \beta \tag{3.24a}$$
or, equivalently
$$B_{[D]}(\omega)\, \beta = \beta\, B^{-1}_{[D]}(\omega) \tag{3.24b}$$
for the scalar matrix elements, and
$$B_{[D]}(\omega)\, \alpha^\mu\, B_{[D]}(\omega) = L^\mu{}_\nu(\omega)\, \alpha^\nu \tag{3.25}$$
for the four-vector ones.
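The finite-boost relations (3.24a) and (3.25), together with the property $B^\dagger_{[D]} = B_{[D]}$ of eq.(3.18), can be verified numerically for a boost along $x^1$. The Python sketch below uses the matrices $\alpha^\delta$ and $\beta$ of eqs.(3.11) and (3.13) and the closed form of eq.(3.17b). The explicit four-vector boost matrix $L(\omega)$ (with $-\mathrm{sh}\,\omega$ off the diagonal) is an assumption, because the corresponding equation of Section 2 is not reproduced here; the value of $\omega$ and the helper names are illustrative.

```python
import math

SIGMA = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def lincomb(*terms):
    """Sum of c * M over (c, M) pairs of equal-size square matrices."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]

def negate(m):
    return [[-x for x in row] for row in m]

def block_matrix(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def dagger(m):
    """Hermitian conjugate."""
    n = len(m)
    return [[complex(m[j][i]).conjugate() for j in range(n)] for i in range(n)]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for r1, r2 in zip(a, b) for x, y in zip(r1, r2))

BETA = block_matrix(ZERO2, ONE2, ONE2, ZERO2)                          # eq.(3.13)
ALPHA = {0: block_matrix(ONE2, ZERO2, ZERO2, ONE2)}                    # alpha^0 = 1
for d in (1, 2, 3):
    ALPHA[d] = block_matrix(SIGMA[d], ZERO2, ZERO2, negate(SIGMA[d]))  # eq.(3.11)

def dirac_boost_x1(omega):
    """B_[D](omega) of eq.(3.17b) for a boost along x^1."""
    return lincomb((math.cosh(omega / 2), ALPHA[0]), (-math.sinh(omega / 2), ALPHA[1]))

def lorentz_boost_x1(omega):
    """Four-vector boost along x^1 (assumed sign convention)."""
    ch, sh = math.cosh(omega), math.sinh(omega)
    return [[ch, -sh, 0, 0], [-sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

omega = 0.9
B, L = dirac_boost_x1(omega), lorentz_boost_x1(omega)

is_hermitian = close(dagger(B), B)                                     # eq.(3.18)
scalar_ok = close(matmul(matmul(B, BETA), B), BETA)                    # eq.(3.24a)
vector_ok = all(
    close(matmul(matmul(B, ALPHA[mu]), B),
          lincomb(*[(L[mu][nu], ALPHA[nu]) for nu in range(4)]))       # eq.(3.25)
    for mu in range(4)
)
print(is_hermitian, scalar_ok, vector_ok)
```

Note that the half angle $\omega/2$ in the spinor boost reproduces the full parameter $\omega$ in the four-vector transformation, because the boost matrix appears twice in the matrix element.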
The previous developments, recalling the expression of the four-momentum operator given in eqs.(3.2a-c), allow us to write a linear covariant wave equation in the form
$$i\hbar c\, \partial_\mu \alpha^\mu \psi(x) = mc^2 \beta\, \psi(x) \tag{3.26}$$
that is the Dirac equation, where $m$ is the particle mass and $\psi(x) = \psi(t,\mathbf{r})$ is a four-component Dirac spinor representing the particle wave function.
Intuitively, the covariance of the Dirac equation can be proven multiplying the previous equation from the left by a generic Hermitian conjugate Dirac spinor. In the l.h.s. one has a Lorentz scalar given by the product of the (contravariant) four-vector matrix element of $\alpha^\mu$ with the (covariant) operator $i\hbar c\, \partial_\mu$. In the r.h.s. one has the Lorentz scalar directly given by the matrix element of $\beta$.
More formally, we can prove the covariance of the Dirac equation in the following way. We write the same equation in $S'$ and show that it is equivalent to the (original) equation in $S$. We have
$$i\hbar c\, \partial'_\mu \alpha^\mu \psi'(x') = mc^2 \beta\, \psi'(x') \tag{3.27}$$
The spinor in $S'$ is related to the spinor in $S$ by means of eq.(3.16):
$$\psi'(x') = B_{[D]}(\omega)\, \psi(x'(x)) \tag{3.28}$$
We replace the last expression in eq.(3.27) and multiply that equation from the left by $B_{[D]}(\omega)$. In the r.h.s., by means of eq.(3.24a) one directly obtains $\beta\psi$. In the l.h.s., one has to consider eq.(3.25), transforming the equation into the form
$$i\hbar c\, \partial'_\mu L^\mu{}_\nu(\omega)\, \alpha^\nu \psi(x'(x)) = mc^2 \beta\, \psi(x'(x))$$
We can use the more synthetic vector algebra notation, writing
$$\partial'_\mu L^\mu{}_\nu\, \alpha^\nu = [\partial']^T g L [\alpha] = [\partial]^T g [\alpha] = \partial_\mu \alpha^\mu \tag{3.29}$$
where in the second equality we have used $gL = L^{-1}g$.
In this way we have shown the equivalence of eq.(3.27), written in $S'$, with the original equation (3.26), written in $S$.
3.3 The Gamma Dirac Matrices and the Standard Representation
The physical content of the Dirac equation is completely contained in eq.(3.26)
and in the related transformation properties. However, to work in a more
direct way with Dirac equation and its applications, some more developments
are necessary.
First, we introduce the Dirac adjoint spinor that is preferably used (instead of the Hermitian conjugate) to calculate matrix elements. It is defined as
$$\bar{u} = u^\dagger \beta \tag{3.30}$$
Its transformation law is straightforwardly obtained in the form
$$\bar{u}' = u'^\dagger \beta = u^\dagger B_{[D]}(\omega)\, \beta = \bar{u}\, B^{-1}_{[D]}(\omega) \tag{3.31}$$
where eq.(3.24b) has been used. As it must be for a representation of the Lorentz boost, $B^{-1}_{[D]}(\omega)$ is obtained inverting the direction of the boost velocity
$$B^{-1}_{[D]}(\omega) = \mathrm{ch}\left(\frac{\omega}{2}\right) + (\boldsymbol{\alpha}\cdot\hat{\mathbf{v}})\, \mathrm{sh}\left(\frac{\omega}{2}\right) \simeq 1 + \frac{1}{2}\, \omega\, (\boldsymbol{\alpha}\cdot\hat{\mathbf{v}}) \tag{3.32}$$
As an exercise, the reader can check that $B_{[D]}(\omega) B^{-1}_{[D]}(\omega) = 1$ by using the properties of the $\alpha^\delta$ matrices.
Note that in the previous results there is no new physical content. We can represent the Lorentz scalar (invariant) as
$$u_b^\dagger \beta\, u_a = \bar{u}_b u_a \tag{3.33}$$
In fact, we have learned in eq.(3.31) that $\bar{u}$ transforms with $B^{-1}_{[D]}(\omega)$.
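We introduce the Dirac matrices $\gamma^\mu$ defined as
$$\gamma^\mu = \beta \alpha^\mu \tag{3.34a}$$
Recalling that $\beta^2 = 1$, one has
$$\alpha^\mu = \beta \gamma^\mu \tag{3.34b}$$
The four-vector matrix element can be written as
$$u_b^\dagger \alpha^\mu u_a = \bar{u}_b \gamma^\mu u_a \tag{3.35}$$
and the Dirac equation (3.26) takes the usual form
$$i\hbar c\, \partial_\mu \gamma^\mu \psi(x) = mc^2 \psi(x) \tag{3.36}$$
For clarity we give the explicit expression of the $\gamma^\mu$:
$$\gamma^0 = \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \gamma^\delta = \begin{pmatrix} 0 & -\sigma^\delta \\ \sigma^\delta & 0 \end{pmatrix} \tag{3.37}$$
As will be discussed in the following, this is the so-called spinorial representation of the Dirac matrices.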
Starting from the anticommutation rules of the $\alpha^\mu$ one finds the following fundamental (!) anticommutation rules of the $\gamma^\mu$
$$\{\gamma^\mu, \gamma^\nu\} = 2 g^{\mu\nu} \tag{3.38}$$
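The anticommutation rules (3.38) can be verified by explicit multiplication of the matrices of eq.(3.37). The following Python sketch does this for all sixteen index pairs, with $g = \mathrm{diag}(1,-1,-1,-1)$; the helper names are illustrative.

```python
# Check of {gamma^mu, gamma^nu} = 2 g^{mu nu} in the spinorial representation, eq.(3.37).

SIGMA = {1: [[0, 1], [1, 0]], 2: [[0, -1j], [1j, 0]], 3: [[1, 0], [0, -1]]}
ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def negate(m):
    return [[-x for x in row] for row in m]

def block_matrix(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [ra + rb for ra, rb in zip(a, b)] + [rc + rd for rc, rd in zip(c, d)]

def anticomm(a, b):
    """Anticommutator {a, b} = ab + ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(ab, ba)]

GAMMA = {0: block_matrix(ZERO2, ONE2, ONE2, ZERO2)}
for d in (1, 2, 3):
    GAMMA[d] = block_matrix(ZERO2, negate(SIGMA[d]), SIGMA[d], ZERO2)

METRIC = [1, -1, -1, -1]   # diagonal entries g^{mu mu}

def clifford_algebra_holds():
    """True if eq.(3.38) holds for every pair (mu, nu)."""
    for mu in range(4):
        for nu in range(4):
            g_munu = METRIC[mu] if mu == nu else 0
            expected = [[2 * g_munu if i == j else 0 for j in range(4)]
                        for i in range(4)]
            if anticomm(GAMMA[mu], GAMMA[nu]) != expected:
                return False
    return True

print(clifford_algebra_holds())
```

All entries are integers or complex numbers with integer parts, so the equality test is exact.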
Furthermore, one easily verifies that $\gamma^0$ is Hermitian while the $\gamma^\delta$ are anti-Hermitian:
$$\gamma^{\mu\dagger} = g^{\mu\mu} \gamma^\mu = \gamma^0 \gamma^\mu \gamma^0 \tag{3.39}$$
Note that in $g^{\mu\mu}$ the index $\mu$ is not summed; the last equality is obtained by standard use of eq.(3.38). Furthermore, the previous equation also holds in the standard representation of the Dirac matrices that will be introduced in the following.
We can now easily examine the usual procedure that is adopted to introduce the Dirac equation. Consider, for example, refs.[6,9]. The differential wave equation for a spin 1/2 particle is assumed to be linear with respect to the four-momentum operator introduced in eqs.(3.2a-c) and to the particle mass. According to this hypothesis, the equation is written as
$$i\hbar c\, \partial_\mu \Gamma^\mu \psi(x) = mc^2 \psi(x) \tag{3.40}$$
where the $\Gamma^\mu$ are four dimensionless matrices to be determined.
Then, one multiplies by $i\hbar c\, \partial_\nu \Gamma^\nu$ and, by using the same eq.(3.40), obtains in the r.h.s. another factor $mc^2$. The equation takes the form
$$-(\hbar c)^2\, \partial_\nu \Gamma^\nu \partial_\mu \Gamma^\mu \psi(x) = (mc^2)^2 \psi(x) \tag{3.41}$$
As we said in Subsection 3.1, the wave function of any relativistic particle must satisfy the Klein-Gordon equation (3.3a,b). This property must be verified also in our case. To this aim, we make the following algebraic manipulation
$$\partial_\nu \Gamma^\nu \partial_\mu \Gamma^\mu = \frac{1}{2}\, \partial_\mu \partial_\nu \left( \Gamma^\nu \Gamma^\mu + \Gamma^\mu \Gamma^\nu \right)$$
It shows that the $\Gamma^\mu$ must satisfy the anticommutation rules of eq.(3.38). The lowest dimension for which this is possible is 4 and we can identify the $\Gamma^\mu$ with the $\gamma^\mu$ of eq.(3.37) that have been derived by means of relativistic transformation properties.
In any case, (we repeat) the previous development is useful to show that the solutions of the Dirac equation are also solutions of the Klein-Gordon one. We can expect that the Dirac equation also admits negative energy solutions.
We now face a different problem. In Subsection 3.2 we have seen that the relevant point for the covariance of the Dirac equation is represented by the anticommutation rules of the $\alpha^\delta$ and $\beta$ matrices. The same is true for the $\gamma^\mu$. In other words, their specific form is not important, provided that the anticommutation rules are fulfilled. We now look for another representation, different from eq.(3.37), and more useful for practical calculations. We construct this new representation starting from a specific solution of the Dirac equation (3.26) or (3.36).
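Let us consider a particle at rest, that is, in a three-momentum eigenstate with $\mathbf{p} = 0$. The spatial components $\partial/\partial\mathbf{r}$ of the derivative operator, when applied to the corresponding wave function, give zero. The Dirac equation reduces to
$$i\hbar \frac{\partial \psi(x)}{\partial t} = mc^2 \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \psi(x) \tag{3.42a}$$
We can split the Dirac spinor into two two-component spinors
$$\psi = \begin{pmatrix} \eta \\ \xi \end{pmatrix}$$
so that eq.(3.42a) is written as a system of coupled equations:
$$i\hbar \frac{\partial \eta}{\partial t} = mc^2\, \xi, \qquad i\hbar \frac{\partial \xi}{\partial t} = mc^2\, \eta \tag{3.42b}$$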
We can sum and subtract these two equations introducing the new two-component spinors
$$\varphi = \frac{1}{\sqrt{2}} (\xi + \eta), \qquad \chi = \frac{1}{\sqrt{2}} (\xi - \eta) \tag{3.43}$$
(the factor $1/\sqrt{2}$ guarantees that the normalization of the new Dirac spinor is not changed). One finds
$$i\hbar \frac{\partial \varphi}{\partial t} = mc^2\, \varphi, \qquad i\hbar \frac{\partial \chi}{\partial t} = -mc^2\, \chi \tag{3.44}$$
These equations are equivalent to eq.(3.42b) but they are decoupled. Technically, we have diagonalized the rest frame Hamiltonian in the r.h.s. of eq.(3.42a).
The solutions are easily found:
$$\psi_+ = \begin{pmatrix} \varphi \\ 0 \end{pmatrix}$$
with positive energy $E = +mc^2$,
$$\psi_- = \begin{pmatrix} 0 \\ \chi \end{pmatrix}$$
with negative energy $E = -mc^2$. The presence of two energy values represents a general property of relativistic wave equations.
The advantage of the solutions $\psi_{\pm}$ of eq.(3.44) is that only one two-component spinor is nonvanishing while the other is zero. In the positive energy case, the nonvanishing spinor can be identified with the nonrelativistic one. Furthermore, when considering a positive energy particle with small (nonrelativistic) velocity, we can expect the lower components of $\psi_+$ to be (not zero but) small with respect to the upper ones.
For these reasons we apply that transformation to a generic Dirac spinor, not only in the case $\mathbf{p} = 0$.
More formally, we perform the transformation of eq.(3.43) by introducing the following matrix
$$ U = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \tag{3.45} $$
that satisfies
$$ U^\dagger = U^{-1} = U $$
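The two statements above can be checked numerically. The following is a minimal sketch in plain Python, assuming that each entry of the 2×2 forms in eqs.(3.42a) and (3.45) multiplies a 2×2 identity block; the names `matmul`, `is_close`, `U4` and `H_rest` are illustrative and do not appear in the text.

```python
from math import sqrt

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_close(A, B, tol=1e-12):
    """Entry-wise comparison of two matrices of equal shape."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 1 / sqrt(2)
# U of eq.(3.45); each entry of the 2x2 form multiplies a 2x2 identity block.
U4 = [[s, 0, s, 0],
      [0, s, 0, s],
      [s, 0, -s, 0],
      [0, s, 0, -s]]
# Matrix on the r.h.s. of eq.(3.42a), again with 2x2 identity blocks.
H_rest = [[0, 0, 1, 0],
          [0, 0, 0, 1],
          [1, 0, 0, 0],
          [0, 1, 0, 0]]
identity4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
diag_expected = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

UU = matmul(U4, U4)                       # U U, expected to be the identity
H_diag = matmul(matmul(U4, H_rest), U4)   # U H U, expected to be diagonal
```

Since $U$ is real and symmetric, $UU = 1$ is enough to confirm $U^\dagger = U^{-1} = U$; the diagonal form of `H_diag` reproduces the energy signs of eq.(3.44).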
We multiply the Dirac equation (3.36) from the left by $U$ and insert $UU = 1$ between the $\gamma^\mu$ and $\psi$. In this way we transform the Dirac wave function and, at the same time, the Dirac matrices, obtaining
$$ \gamma^\mu_{st} = U \gamma^\mu U \tag{3.46} $$
where the $\gamma^\mu_{st}$ are the Dirac matrices in the standard representation, while the $\gamma^\mu$ of eq.(3.37) have been given in the so-called spinorial representation. In most physical problems (especially if a connection with nonrelativistic physics is wanted) the standard representation is adopted. Generally the index “st” is not explicitly written. In the rest of the present work we shall also adopt this convention.
Note that, due to the property of $U$ given above, if two matrices in the spinorial representation satisfy an (anti)commutation rule, the corresponding matrices in the standard representation also satisfy the same rule.
In particular, this property holds for the anticommutation rule of eq.(3.38) of the $\gamma^\mu$. In the standard representation they have the form
$$ \gamma^0 = \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} , \qquad \gamma^\delta = \begin{pmatrix} 0 & \sigma^\delta \\ -\sigma^\delta & 0 \end{pmatrix} \tag{3.47} $$
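As an illustration, the matrices of eq.(3.47) can be built explicitly and tested. The sketch below assumes that the anticommutation rule of eq.(3.38) has the usual form $\{\gamma^\mu, \gamma^\nu\} = 2g^{\mu\nu}$ with $g = \mathrm{diag}(1,-1,-1,-1)$, and that the Hermitian conjugation rule reads $\gamma^{\mu\dagger} = \gamma^0\gamma^\mu\gamma^0$; both forms, and all function names, are assumptions made for this example.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def neg(A):
    return [[-x for x in row] for row in A]

def dagger(A):
    """Hermitian conjugate (transpose and complex conjugate)."""
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

one = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

# Standard representation, eq.(3.47): gamma[0] = gamma^0, gamma[1..3] = gamma^delta.
gamma = [block(one, zero, zero, neg(one))] + [block(zero, s, neg(s), zero) for s in sigma]
metric = [1, -1, -1, -1]  # assumed diagonal entries of the metric tensor

def diag4(c):
    return [[c if i == j else 0 for j in range(4)] for i in range(4)]

def clifford_ok():
    """Check {gamma^mu, gamma^nu} = 2 g^{mu nu} for all index pairs."""
    for mu in range(4):
        for nu in range(4):
            expected = diag4(2 * metric[mu] if mu == nu else 0)
            found = add(matmul(gamma[mu], gamma[nu]), matmul(gamma[nu], gamma[mu]))
            if found != expected:
                return False
    return True

def hermiticity_ok():
    """Check gamma^mu dagger = gamma^0 gamma^mu gamma^0 for all mu."""
    return all(dagger(g) == matmul(matmul(gamma[0], g), gamma[0]) for g in gamma)
```

All entries are small integers (possibly complex), so exact equality can be used instead of a numerical tolerance.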
The Hermitian conjugates satisfy the same eq.(3.39). As for the $\alpha^\mu$, by using eq.(3.34b), one has
$$ \alpha^\mu_{st} = U \alpha^\mu U = U \beta U \, U \gamma^\mu U = \gamma^0_{st} \gamma^\mu_{st} \tag{3.48} $$
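Explicitly, without writing the index “st”, they are
$$ \alpha^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} , \qquad \alpha^\delta = \begin{pmatrix} 0 & \sigma^\delta \\ \sigma^\delta & 0 \end{pmatrix} \tag{3.49} $$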
Note that the spin matrices $\Sigma^\delta$ of eq.(3.12a) keep the same form in the spinorial and standard representations.
In consequence, one can define $K^\delta_{[D]} = \frac{1}{2}\alpha^\delta$ and $S^\delta_{[D]} = -\frac{i}{2}\Sigma^\delta$ by using the standard representation for the $\alpha^\delta$ (and the $\Sigma^\delta$): the commutation rules of the boost and rotation generators are equivalently fulfilled. Furthermore, the expression of the boost operator is the same as in eq.(3.17a,b), with the $\alpha^\delta$ written in the standard representation.
3.4 Parity Transformations and the Matrix $\gamma^5$
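There is a fifth matrix that anticommutes with the other $\gamma^\mu$. It is $\gamma^5$:
$$ \{\gamma^\mu, \gamma^5\} = 0 \tag{3.50} $$
In the spinorial and standard representations, one has, respectively
$$ \gamma^5_{sp} = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} , \qquad \gamma^5_{st} = \begin{pmatrix} 0 & -1 \\ -1 & 0 \end{pmatrix} \tag{3.51} $$
Note that $\gamma^{5\dagger} = \gamma^5$ and $(\gamma^5)^2 = 1$.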
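The properties of $\gamma^5$ quoted above can be verified with a few lines of code. This is a minimal sketch in plain Python, using the standard representation of eq.(3.47) for the $\gamma^\mu$ and the two forms of eq.(3.51); the helper names are illustrative only.

```python
from math import sqrt

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def is_close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

one = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
minus_one = scale(-1, one)
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
gamma = [block(one, zero, zero, minus_one)] + \
        [block(zero, s, scale(-1, s), zero) for s in sigma]   # eq.(3.47)

gamma5_sp = block(minus_one, zero, zero, one)        # spinorial form, eq.(3.51)
gamma5_st = block(zero, minus_one, minus_one, zero)  # standard form, eq.(3.51)
zero4 = [[0] * 4 for _ in range(4)]
identity4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# Eq.(3.50) and the two properties quoted in the text (exact integer arithmetic).
anticommutes = all(add(matmul(g, gamma5_st), matmul(gamma5_st, g)) == zero4
                   for g in gamma)
hermitian = dagger(gamma5_st) == gamma5_st
squares_to_one = matmul(gamma5_st, gamma5_st) == identity4

# Change of representation as in eq.(3.46): U gamma5_sp U should give gamma5_st.
s = 1 / sqrt(2)
U = block(scale(s, one), scale(s, one), scale(s, one), scale(-s, one))
transformed = matmul(matmul(U, gamma5_sp), U)
```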
Furthermore, we use the definition of ref.[5], but, as done in many textbooks, $\gamma^5$ can be defined multiplying eq.(3.51) by $-1$. All its properties remain unchanged. Pay attention to which definition is used!
To understand the physical meaning of the matrix elements of $\gamma^5$, it is useful to go back to the Dirac spinor parity transformation. As shown in eq.(3.13), this transformation is $u' = \beta u$, with $\beta = \gamma^0$. Let us consider the parity transformation for Lorentz scalar and four-vector matrix elements. Standard use of the $\gamma^\mu$ anticommutation rule (3.38) gives
$$ \bar{u}_b'\, u_a' = \bar{u}_b\, u_a \tag{3.52a} $$
and
$$ \bar{u}_b'\, \gamma^0 u_a' = \bar{u}_b\, \gamma^0 u_a \tag{3.52b} $$
$$ \bar{u}_b'\, \gamma^\delta u_a' = -\bar{u}_b\, \gamma^\delta u_a \tag{3.52c} $$
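The intermediate steps behind eqs.(3.52a-c) can be written out explicitly. This is a short worked derivation, assuming the adjoint spinor $\bar{u} = u^\dagger\gamma^0$, the hermiticity of $\gamma^0$, $(\gamma^0)^2 = 1$ and $\gamma^0\gamma^\delta = -\gamma^\delta\gamma^0$:

```latex
\begin{align*}
\bar{u}' &= (\gamma^0 u)^\dagger \gamma^0 = u^\dagger \gamma^0 \gamma^0 = \bar{u}\,\gamma^0 , \\
\bar{u}_b'\, u_a' &= \bar{u}_b\, \gamma^0 \gamma^0\, u_a = \bar{u}_b\, u_a , \\
\bar{u}_b'\, \gamma^0 u_a' &= \bar{u}_b\, \gamma^0 \gamma^0 \gamma^0\, u_a = \bar{u}_b\, \gamma^0 u_a , \\
\bar{u}_b'\, \gamma^\delta u_a' &= \bar{u}_b\, \gamma^0 \gamma^\delta \gamma^0\, u_a
  = -\bar{u}_b\, \gamma^\delta \gamma^0 \gamma^0\, u_a = -\bar{u}_b\, \gamma^\delta u_a .
\end{align*}
```

In each case the parity behaviour of $\bar{u}_b M u_a$ is fixed by the sign obtained when $M$ is replaced by $\gamma^0 M \gamma^0$.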
These results have an easy physical interpretation: a Lorentz scalar and a
time component of a four-vector (for example a charge density) do not change
sign under spatial inversion, while the spatial components of a four-vector
(for example a current density) do change sign.
Let us now consider the following matrix element
$$ \bar{u}_b\, \gamma^5 u_a $$
The Lorentz boosts are studied by means of eq.(3.19) taking $M_{ps} = \gamma^0\gamma^5$. Standard use of eqs.(3.50), (3.38) and (3.34a,b) shows that
$$ \{\alpha^\delta, \gamma^0\gamma^5\} = 0 \tag{3.53} $$
so that we can conclude that our matrix element is invariant under Lorentz boosts. The same can be shown for rotations using the generators of eq.(3.12a,b).
But what happens with spatial inversion? We have
$$ \bar{u}_b'\, \gamma^5 u_a' = \bar{u}_b\, \gamma^0\gamma^5\gamma^0\, u_a = -\bar{u}_b\, \gamma^5 u_a \tag{3.54} $$
This means that our matrix element changes sign under parity transformation. It is a pseudo-scalar quantity.
In terms of elementary quantities, a pseudo-scalar is given by the product of an axial vector (see the discussion of Subsection 2.5) with a standard vector, for example the spin with the three-momentum: $\mathbf{s}\,\mathbf{p}$. (It is not possible to use the orbital angular momentum instead of spin because one has $\mathbf{l}\,\mathbf{p} = 0$, identically).
We now consider the following matrix element
$$ \bar{u}_b\, \gamma^5\gamma^\mu u_a $$
Standard handling (that is left as an exercise) of the $\gamma^\mu$ and $\gamma^5$ shows that, under Lorentz boosts and rotations, it transforms as a four-vector, but, under spatial inversion, one has
$$ \bar{u}_b'\, \gamma^5\gamma^0 u_a' = \bar{u}_b\, \gamma^0\gamma^5\gamma^0\gamma^0\, u_a = -\bar{u}_b\, \gamma^5\gamma^0 u_a \tag{3.55a} $$
and
$$ \bar{u}_b'\, \gamma^5\gamma^\delta u_a' = \bar{u}_b\, \gamma^0\gamma^5\gamma^\delta\gamma^0\, u_a = +\bar{u}_b\, \gamma^5\gamma^\delta u_a \tag{3.55b} $$
We have an axial four-vector. Its time component changes sign, while the
space components do not.
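The parity signs found in this Subsection can be collected in a single check. The sketch below, in plain Python, assumes the standard representation of eqs.(3.47) and (3.51) and uses the fact that, with $u' = \gamma^0 u$, the matrix element $\bar{u}_b M u_a$ goes into $\bar{u}_b\, \gamma^0 M \gamma^0 u_a$; the function name `parity_sign` is hypothetical.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def neg(A):
    return [[-x for x in row] for row in A]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

one = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]

gamma0 = block(one, zero, zero, neg(one))                     # eq.(3.47)
gamma_space = [block(zero, s, neg(s), zero) for s in sigma]   # eq.(3.47)
gamma5 = block(zero, neg(one), neg(one), zero)                # eq.(3.51), standard form
identity4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def parity_sign(M):
    """Return +1 if gamma0 M gamma0 == M, -1 if it equals -M, otherwise None."""
    transformed = matmul(matmul(gamma0, M), gamma0)
    if transformed == M:
        return +1
    if transformed == neg(M):
        return -1
    return None

signs = {
    "scalar": parity_sign(identity4),                                    # eq.(3.52a)
    "vector_time": parity_sign(gamma0),                                  # eq.(3.52b)
    "vector_space": [parity_sign(g) for g in gamma_space],               # eq.(3.52c)
    "pseudoscalar": parity_sign(gamma5),                                 # eq.(3.54)
    "axial_time": parity_sign(matmul(gamma5, gamma0)),                   # eq.(3.55a)
    "axial_space": [parity_sign(matmul(gamma5, g)) for g in gamma_space] # eq.(3.55b)
}
```

The resulting pattern of signs is the one described in the text: scalar and time component of a vector even, space components of a vector odd, and the opposite behaviour for the pseudo-scalar and the axial four-vector.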
3.5 Plane Wave Solutions and the Conserved Dirac Current
In this last Subsection we shall ﬁnd the plane wave solutions of the Dirac
equation for a noninteracting particle, and, as in the case of the Klein-Gordon
equation, we shall determine the conserved current.
At this point the equations become very large and it is necessary to find a strategy to simplify the calculations without losing the physical meaning of the developments. For this reason, most textbooks adopt the system of units in which
$$ \hbar = c = 1 $$
At any point of the calculations one can go back to the standard units by recalling the following dimensional equalities
$$ [\hbar] = [E]\,[T], \qquad [c] = [L]\,[T]^{-1} $$
and using the numerical values given in Subsection 3.1.
In this way, the Dirac equation (3.36) is written in the form
$$ [i\partial_\mu \gamma^\mu - m]\, \psi(x) = 0 \tag{3.56} $$
Let us make the hypothesis that the wave function $\psi(x)$ can be factorized into a plane wave exponential, identical to that of the Klein-Gordon equation given in eqs.(3.4a,b), and a Dirac spinor not depending on the four-vector $x$. Also using eq.(3.5a) for positive and negative energy, with $\lambda$ being the energy sign, we can write
$$ \psi_{\lambda\mathbf{p}\sigma}(x) = u(\lambda, \mathbf{p}, \sigma) \exp[i(-\lambda\epsilon(\mathbf{p})\, t + \mathbf{p}\,\mathbf{r})] \tag{3.57} $$
The spin label σ of the Dirac spinor (not to be confused with the Pauli
matrices) will be discussed in the following.
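Applying the space-time derivative operator to the previous equation one has
$$ i\partial_\mu \psi_{\lambda\mathbf{p}\sigma}(x) = (\lambda\epsilon(\mathbf{p}), -\mathbf{p})\, u(\lambda, \mathbf{p}, \sigma) \exp[i(-\lambda\epsilon(\mathbf{p})\, t + \mathbf{p}\,\mathbf{r})] \tag{3.58} $$
where the minus sign in $-\mathbf{p}$ is due to the use of covariant components of the operator $i\partial_\mu$.
We insert the last result in the Dirac equation (3.56). Cancelling the exponential factor, there remains the following matrix equation for the Dirac spinor:
$$ [\lambda\epsilon(\mathbf{p})\gamma^0 - (\mathbf{p}\boldsymbol{\gamma}) - m]\, u(\lambda, \mathbf{p}, \sigma) = 0 \tag{3.59} $$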
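As in eq.(3.43), we write the four-component Dirac spinor in terms of two two-component ones:
$$ u(\lambda, \mathbf{p}, \sigma) = \begin{pmatrix} \varphi \\ \chi \end{pmatrix} \tag{3.60} $$
where $\varphi$, $\chi$ are respectively defined as upper and lower components of the spinor. For brevity we do not write the indices $\lambda$, $\mathbf{p}$, $\sigma$ in $\varphi$ and $\chi$.
Using the $\gamma^\mu$ in the standard representation of eq.(3.47), we can write eq.(3.59) in the form:
$$ (\lambda\epsilon(\mathbf{p}) - m)\,\varphi - (\mathbf{p}\boldsymbol{\sigma})\,\chi = 0 \tag{3.61a} $$
$$ (\lambda\epsilon(\mathbf{p}) + m)\,\chi - (\mathbf{p}\boldsymbol{\sigma})\,\varphi = 0 \tag{3.61b} $$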
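Considering positive energy states, that is $\lambda = +1$, we obtain the lower components $\chi_+$ in terms of $\varphi_+$ by means of eq.(3.61b):
$$ \chi_+ = \frac{(\mathbf{p}\boldsymbol{\sigma})}{\epsilon(\mathbf{p}) + m}\, \varphi_+ \tag{3.62a} $$
In this case it is not possible to write $\varphi_+$ in terms of $\chi_+$ using eq.(3.61a) because, with $\lambda = +1$, the factor $\lambda\epsilon(\mathbf{p}) - m$ vanishes for $\mathbf{p} = 0$.
Conversely, for negative energy states, that is $\lambda = -1$, from eq.(3.61a) we obtain the upper components:
$$ \varphi_- = -\frac{(\mathbf{p}\boldsymbol{\sigma})}{\epsilon(\mathbf{p}) + m}\, \chi_- \tag{3.62b} $$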
In this way we have found the plane wave solutions of the Dirac equation for a noninteracting particle. The two-component spinors $\varphi_+$, $\chi_-$ can be chosen (but it is not the only possible choice) as those of the nonrelativistic theory. Denoting them as $w_\sigma$, with the property $w^\dagger_{\sigma'} w_\sigma = \delta_{\sigma'\sigma}$, one has explicitly
$$ w_+ = \begin{pmatrix} 1 \\ 0 \end{pmatrix} , \qquad w_- = \begin{pmatrix} 0 \\ 1 \end{pmatrix} $$
for spin up and down, respectively.
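In consequence the Dirac spinors $u(\lambda, \mathbf{p}, \sigma)$ can be put in the form
$$ u(+1, \mathbf{p}, \sigma) = N \begin{pmatrix} w_\sigma \\ \frac{(\mathbf{p}\boldsymbol{\sigma})}{\epsilon(\mathbf{p})+m}\, w_\sigma \end{pmatrix} \tag{3.63a} $$
and
$$ u(-1, \mathbf{p}, \sigma) = N \begin{pmatrix} -\frac{(\mathbf{p}\boldsymbol{\sigma})}{\epsilon(\mathbf{p})+m}\, w_\sigma \\ w_\sigma \end{pmatrix} \tag{3.63b} $$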
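As a numerical illustration, one can check that the spinors of eqs.(3.63a,b) are annihilated by the matrix of eq.(3.59). The sketch below uses plain Python, units $\hbar = c = 1$, the standard representation of eq.(3.47), the relation $\epsilon(\mathbf{p}) = \sqrt{\mathbf{p}^2 + m^2}$, and $N = 1$ (the normalization does not matter here); the momentum and mass passed to `residual` are arbitrary sample values and the function names are not part of the text.

```python
from math import sqrt

def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def add(*Ms):
    """Entry-wise sum of any number of matrices of equal shape."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

one = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
gamma = [block(one, zero, zero, scale(-1, one))] + \
        [block(zero, s, scale(-1, s), zero) for s in sigma]   # eq.(3.47)
identity4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def energy(p, m):
    """epsilon(p) = sqrt(p^2 + m^2), in units hbar = c = 1."""
    return sqrt(m * m + sum(x * x for x in p))

def sigma_dot(p):
    return add(*[scale(p[k], sigma[k]) for k in range(3)])

def spinor(lam, p, w, m, N=1.0):
    """Column spinor of eq.(3.63a) for lam = +1, of eq.(3.63b) for lam = -1."""
    small = scale(1.0 / (energy(p, m) + m), matmul(sigma_dot(p), w))
    column = w + small if lam == +1 else scale(-1, small) + w
    return scale(N, column)

def residual(lam, p, w, m):
    """Largest |entry| of [lam*eps*gamma^0 - (p gamma) - m] u, cf. eq.(3.59)."""
    op = add(scale(lam * energy(p, m), gamma[0]),
             *[scale(-p[k], gamma[k + 1]) for k in range(3)],
             scale(-m, identity4))
    return max(abs(row[0]) for row in matmul(op, spinor(lam, p, w, m)))
```

For instance, `residual(+1, [0.3, -0.4, 1.2], [[1], [0]], 1.0)` should vanish up to rounding errors, and the same holds for the other energy sign and spin label.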
We point out that, in general, the spin label $\sigma$ of $w_\sigma$ does not represent the spin eigenvalue in a fixed direction, for example the $x^3$ axis. This property holds true only for a particle at rest. In this case the previous solutions coincide with the solutions of eq.(3.44).
General properties of spin and angular momentum for Dirac equation will be
studied in a subsequent work.
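The Dirac spinors of eqs.(3.63a,b) can be also conveniently written as
$$ u(\lambda, \mathbf{p}, \sigma) = u(\lambda, \mathbf{p})\, w_\sigma \tag{3.64} $$
with
$$ u(+1, \mathbf{p}) = N \begin{pmatrix} 1 \\ \frac{(\mathbf{p}\boldsymbol{\sigma})}{\epsilon(\mathbf{p})+m} \end{pmatrix} \tag{3.65a} $$
and
$$ u(-1, \mathbf{p}) = N \begin{pmatrix} -\frac{(\mathbf{p}\boldsymbol{\sigma})}{\epsilon(\mathbf{p})+m} \\ 1 \end{pmatrix} \tag{3.65b} $$
where the $u(\lambda, \mathbf{p})$ represent $4 \times 2$ matrices that already include the normalization factor $N$. They must be applied to the two-component (column) spinors $w_\sigma$, giving as a result the four-component Dirac (column) spinors of eqs.(3.63a,b).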
Note that, in contrast to the nonrelativistic case, the Dirac spinors depend
on the momentum of the particle.
We now discuss the normalization factor $N$. In nonrelativistic theory, the plane wave of a spin 1/2 particle is “normalized” as
$$ \psi_{\mathbf{p}\sigma}(x) = \frac{1}{\sqrt{V}}\, w_\sigma \exp[i(-Et + \mathbf{p}\,\mathbf{r})] $$
where $V$ represents the (macroscopic) volume in which the particle is contained. The probability of finding the particle in this volume is set equal to one. However, $V$ is a fictitious quantity that always disappears when physical (observable) quantities are calculated. In consequence, for the sake of simplicity, one can put $V = 1$. In this way, one has
$$ \psi^\dagger_{\mathbf{p}\sigma'}(x)\, \psi_{\mathbf{p}\sigma}(x) = \delta_{\sigma\sigma'} $$
A similar result can be obtained for the Dirac equation plane waves, putting in eqs.(3.63a)-(3.65b)
$$ N = N_{nc} = \sqrt{\frac{\epsilon(\mathbf{p}) + m}{2\epsilon(\mathbf{p})}} \tag{3.66} $$
where nc stands for “not covariant”. In fact this normalization cannot be directly used for the calculation of covariant amplitudes. With this noncovariant normalization, the Dirac wave function satisfies the following normalization equation that is analogous to the nonrelativistic one
$$ \psi^\dagger_{\lambda'\mathbf{p}\sigma'}(x)\, \psi_{\lambda\mathbf{p}\sigma}(x) = \delta_{\lambda\lambda'}\, \delta_{\sigma\sigma'} \tag{3.67} $$
As an exercise, verify this result and that of eq.(3.69), by using eq.(A.7) for the products of $(\boldsymbol{\sigma}\mathbf{p})$. Also use the identity
$$ \mathbf{p}^2 = [\epsilon(\mathbf{p})]^2 - m^2 = (\epsilon(\mathbf{p}) + m)(\epsilon(\mathbf{p}) - m) $$
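The covariant normalization is obtained taking
$$ N = N_{cov} = \sqrt{\frac{\epsilon(\mathbf{p}) + m}{2m}} = \sqrt{\frac{\epsilon(\mathbf{p})}{m}}\, N_{nc} \tag{3.68} $$
By using this normalization one has
$$ \bar{u}(\lambda', \mathbf{p}, \sigma')\, u(\lambda, \mathbf{p}, \sigma) = \lambda\, \delta_{\lambda\lambda'}\, \delta_{\sigma\sigma'} \tag{3.69} $$
that, also recalling eq.(3.52a), represents an explicitly Lorentz invariant condition.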
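The two normalizations can be compared numerically. The following sketch, in plain Python with $\hbar = c = 1$, builds the spinors of eqs.(3.63a,b) in the standard representation and evaluates $u^\dagger u$ with $N_{nc}$ and $\bar{u}u$ with $N_{cov}$; only the entries with $\lambda' = \lambda$ are considered for $\bar{u}u$, and they come out as $+1$ for $\lambda = +1$ and $-1$ for $\lambda = -1$. The helper names are assumptions of this example.

```python
from math import sqrt

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def add(*Ms):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
gamma0 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]  # eq.(3.47)
spin_up, spin_down = [[1], [0]], [[0], [1]]

def energy(p, m):
    return sqrt(m * m + sum(x * x for x in p))

def spinor(lam, p, w, m, N):
    """Column spinor of eq.(3.63a) for lam = +1, of eq.(3.63b) for lam = -1."""
    p_sigma = add(*[scale(p[k], sigma[k]) for k in range(3)])
    small = scale(1.0 / (energy(p, m) + m), matmul(p_sigma, w))
    return scale(N, w + small if lam == +1 else scale(-1, small) + w)

def overlap(u_left, u_right, use_gamma0):
    """u_left^dagger u_right, or ubar_left u_right when use_gamma0 is True."""
    left = dagger(u_left)
    if use_gamma0:
        left = matmul(left, gamma0)
    return matmul(left, u_right)[0][0]

def density_table(p, m):
    """4x4 table of u'^dagger u with N = N_nc over the labels (lambda, sigma)."""
    eps = energy(p, m)
    N = sqrt((eps + m) / (2 * eps))                    # eq.(3.66)
    states = [spinor(lam, p, w, m, N)
              for lam in (+1, -1) for w in (spin_up, spin_down)]
    return [[overlap(a, b, False) for b in states] for a in states]

def scalar_table(lam, p, m):
    """2x2 table of ubar(lam, p, sigma') u(lam, p, sigma) with N = N_cov."""
    N = sqrt((energy(p, m) + m) / (2 * m))             # eq.(3.68)
    states = [spinor(lam, p, w, m, N) for w in (spin_up, spin_down)]
    return [[overlap(a, b, True) for b in states] for a in states]

def max_deviation(table, expected):
    return max(abs(a - b) for ra, rb in zip(table, expected) for a, b in zip(ra, rb))
```

With any sample momentum, `density_table` should reproduce the 4×4 identity of eq.(3.67), while `scalar_table(+1, ...)` and `scalar_table(-1, ...)` should give plus and minus the 2×2 identity.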
In many textbooks a slightly different covariant normalization is used, that is
$$ N_{cov}' = N_{cov} \sqrt{2m} $$
so that a factor $2m$ appears in the r.h.s. of eq.(3.69).
When reading a book or an article for the study of a specific problem, pay attention to which normalization is really used!
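For further developments we also introduce the spinor corresponding to negative energy, negative momentum $-\mathbf{p}$ (and spin label $\sigma$). From eq.(3.63b) or (3.65b) one has
$$ u(-1, -\mathbf{p}) = N \begin{pmatrix} \frac{(\mathbf{p}\boldsymbol{\sigma})}{\epsilon(\mathbf{p})+m} \\ 1 \end{pmatrix} \tag{3.70} $$
Note that
$$ u(-1, -\mathbf{p}) = -\gamma^5 u(+1, \mathbf{p}) \tag{3.71} $$
That spinor is applied to $w_\sigma$ in the standard way, as in eqs.(3.63b) and (3.64).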
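Eq.(3.71) can be checked directly in block form. This is a short worked step, assuming the standard-representation $\gamma^5$ of eq.(3.51) and reading each matrix entry as a 2×2 block:

```latex
-\gamma^5\, u(+1,\mathbf{p})
= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
  N \begin{pmatrix} 1 \\ \frac{(\mathbf{p}\boldsymbol{\sigma})}{\epsilon(\mathbf{p})+m} \end{pmatrix}
= N \begin{pmatrix} \frac{(\mathbf{p}\boldsymbol{\sigma})}{\epsilon(\mathbf{p})+m} \\ 1 \end{pmatrix}
= u(-1,-\mathbf{p})
```

In this representation $-\gamma^5$ simply exchanges the upper and lower blocks of the spinor.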
We conclude this section studying the transition current associated to the
Dirac equation in the same way as we studied that of the Klein-Gordon
equation in eqs.(3.6)-(3.8b).
First, one has to write the Dirac equation for the adjoint wave function
$$ \bar{\psi}(x) = \psi^\dagger(x)\gamma^0 $$
To this aim, take the Dirac equation (3.56) and calculate the Hermitian conjugate. By using eq.(3.39), one finds
$$ -i\partial_\mu \psi^\dagger(x)\, \gamma^0\gamma^\mu\gamma^0 - \psi^\dagger(x)\, m = 0 \tag{3.72a} $$
Multiplying this equation from the right by $-\gamma^0$ one obtains
$$ i\partial_\mu \bar{\psi}(x)\gamma^\mu + \bar{\psi}(x)\, m = 0 \tag{3.72b} $$
which is the desired equation.
As done for the Klein-Gordon equation we obtain the conserved current by
means of the following three steps.
(i) Take eq.(3.56) with a plane wave, initial state, solution $\psi_I(x)$ corresponding to energy sign $\lambda_I$, three-momentum $\mathbf{p}_I$ and spin label $\sigma_I$.
(ii) Analogously, take eq.(3.72b) with a plane wave, final state, solution $\bar{\psi}_F(x)$.
(iii) Multiply the equation of step (i) by $\bar{\psi}_F(x)$ and the equation of step (ii) by $\psi_I(x)$. Then sum these two equations (note that the scalar mass term disappears), obtaining
$$ \partial_\mu J^\mu_{FI}(x) = 0 \tag{3.73} $$
where the Dirac conserved current is
$$ J^\mu_{FI}(x) = \bar{\psi}_F(x)\gamma^\mu\psi_I(x) \tag{3.74a} $$
$$ \phantom{J^\mu_{FI}(x)} = \bar{u}(\lambda_F, \mathbf{p}_F, \sigma_F)\gamma^\mu u(\lambda_I, \mathbf{p}_I, \sigma_I) \exp(i q^\mu x_\mu) \tag{3.74b} $$
with the four-momentum transfer $q^\mu = p^\mu_F - p^\mu_I$. The four-vector character of the Dirac current is manifestly shown by the previous equation.
The Dirac four-vector vertex is
$$ \bar{u}_F \gamma^\mu u_I = \bar{u}(\lambda_F, \mathbf{p}_F, \sigma_F)\gamma^\mu u(\lambda_I, \mathbf{p}_I, \sigma_I) \tag{3.75} $$
Due to current conservation it satisfies, analogously to eq.(3.9),
$$ q_\mu\, \bar{u}_F \gamma^\mu u_I = 0 \tag{3.76} $$
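Eq.(3.76) lends itself to a direct numerical test. The sketch below, in plain Python with $\hbar = c = 1$, contracts the vertex of eq.(3.75) with $q_\mu$, writing $q_\mu\gamma^\mu = q^0\gamma^0 - (\mathbf{q}\boldsymbol{\gamma})$ and taking the four-momenta as $p^\mu = (\lambda\epsilon(\mathbf{p}), \mathbf{p})$; the normalization is set to $N = 1$ since it only gives an overall factor. The function name `contracted_vertex` and the sample momenta in the usage note are hypothetical.

```python
from math import sqrt

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def add(*Ms):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

one = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
gamma = [block(one, zero, zero, scale(-1, one))] + \
        [block(zero, s, scale(-1, s), zero) for s in sigma]   # eq.(3.47)

def energy(p, m):
    return sqrt(m * m + sum(x * x for x in p))

def spinor(lam, p, w, m):
    """Unnormalized (N = 1) spinor of eqs.(3.63a,b)."""
    p_sigma = add(*[scale(p[k], sigma[k]) for k in range(3)])
    small = scale(1.0 / (energy(p, m) + m), matmul(p_sigma, w))
    return w + small if lam == +1 else scale(-1, small) + w

def contracted_vertex(lam_F, p_F, w_F, lam_I, p_I, w_I, m=1.0):
    """Return q_mu (ubar_F gamma^mu u_I), with q = p_F - p_I as four-momenta."""
    u_I = spinor(lam_I, p_I, w_I, m)
    ubar_F = matmul(dagger(spinor(lam_F, p_F, w_F, m)), gamma[0])
    q0 = lam_F * energy(p_F, m) - lam_I * energy(p_I, m)
    q = [f - i for f, i in zip(p_F, p_I)]
    # q_mu gamma^mu = q^0 gamma^0 - (q gamma): covariant space components change sign.
    q_slash = add(scale(q0, gamma[0]), *[scale(-q[k], gamma[k + 1]) for k in range(3)])
    return matmul(matmul(ubar_F, q_slash), u_I)[0][0]
```

For example, `contracted_vertex(+1, [0.3, -0.4, 1.2], [[1], [0]], +1, [-0.7, 0.2, 0.5], [[0], [1]])` should vanish up to rounding errors, for any choice of energy signs and spin labels.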
Note that in the static case the current density (differently from the case of the Klein-Gordon equation) is a positive quantity both for positive and negative energy states, as shown explicitly by the second equality of the following equation:
$$ J^0_{II} = \bar{\psi}_I(x)\gamma^0\psi_I(x) = \psi^\dagger_I(x)\psi_I(x) > 0 \tag{3.77} $$
This property allows one to attach (for some specific problems) a probabilistic interpretation to that quantity and to consider $\psi(x)$ as a wave function in the same sense as in nonrelativistic quantum mechanics. However, the presence of negative energy solutions requires, in general, the introduction of the field theory formalism.
The vertex of eq.(3.75) at first glance looks very different from that of the Klein-Gordon equation, $(p^\mu_F + p^\mu_I)$, given in eq.(3.8b). The so-called Gordon decomposition, with some algebra on the Dirac matrices, shows that it can be written in a form that is more similar to the Klein-Gordon one. This procedure will be analyzed in a subsequent work.
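For the moment, using the properties of the Pauli matrices, the reader can show that
$$ \bar{u}(\lambda, \mathbf{p}, \sigma')\gamma^\mu u(\lambda, \mathbf{p}, \sigma) = \frac{p^\mu}{m}\, \delta_{\sigma\sigma'} \tag{3.78} $$
with $p^\mu = (\epsilon(\mathbf{p}), \mathbf{p})$. The covariant normalization of eq.(3.68) has been used.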
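A numerical check of eq.(3.78) can be sketched as follows, restricted here to the positive energy case $\lambda = +1$. The code is plain Python with $\hbar = c = 1$, the standard representation of eq.(3.47) and the covariant normalization of eq.(3.68); the function names and the sample momentum in the usage note are illustrative assumptions.

```python
from math import sqrt

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def scale(c, A):
    return [[c * x for x in row] for row in A]

def add(*Ms):
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

one = [[1, 0], [0, 1]]
zero = [[0, 0], [0, 0]]
sigma = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
gamma = [block(one, zero, zero, scale(-1, one))] + \
        [block(zero, s, scale(-1, s), zero) for s in sigma]   # eq.(3.47)

def energy(p, m):
    return sqrt(m * m + sum(x * x for x in p))

def positive_spinor(p, w, m):
    """Spinor of eq.(3.63a) with the covariant normalization of eq.(3.68)."""
    eps = energy(p, m)
    N = sqrt((eps + m) / (2 * m))
    p_sigma = add(*[scale(p[k], sigma[k]) for k in range(3)])
    return scale(N, w + scale(1.0 / (eps + m), matmul(p_sigma, w)))

def diagonal_vertex(p, w_out, w_in, m):
    """Components mu = 0..3 of ubar(+1, p, sigma') gamma^mu u(+1, p, sigma)."""
    ubar = matmul(dagger(positive_spinor(p, w_out, m)), gamma[0])
    u = positive_spinor(p, w_in, m)
    return [matmul(matmul(ubar, g), u)[0][0] for g in gamma]

def four_momentum_over_m(p, m):
    return [energy(p, m) / m] + [x / m for x in p]
```

For equal spin labels, `diagonal_vertex(p, w, w, m)` should agree with `four_momentum_over_m(p, m)` component by component; for different spin labels all four components should vanish.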
We conclude this work noting that, at this point, the reader should be able
to use the main tools related to Dirac equation, being also familiarized with
the issues of relativity in quantum mechanical theories.
More formal details and calculations of physical observables can be found in
many textbooks and will be studied in a subsequent work.
4 Appendix. Properties of the Pauli Matrices
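The three Pauli matrices are defined as follows
$$ \sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \qquad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \qquad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{A.1} $$
They are $2 \times 2$, traceless, Hermitian ($\sigma^{\alpha\dagger} = \sigma^\alpha$) matrices. The Pauli matrices fulfill the following commutation rules
$$ [\sigma^\alpha, \sigma^\beta] = 2i\,\epsilon^{\alpha\beta\gamma}\sigma^\gamma \tag{A.2} $$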
One defines the spin, that is the intrinsic angular momentum operator, multiplying the $\sigma^\alpha$ by $\hbar/2$.
By means of this definition, the spin satisfies the standard angular momentum commutation rules, that are
$$ [j^\alpha, j^\beta] = i\hbar\,\epsilon^{\alpha\beta\gamma} j^\gamma $$
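Independently, the Pauli matrices fulfill the anticommutation rules
$$ \{\sigma^\alpha, \sigma^\beta\} = 2\delta^{\alpha\beta} \tag{A.3} $$
Summing up eqs.(A.2) and (A.3) and dividing by two, one obtains the very useful relation
$$ \sigma^\alpha\sigma^\beta = \delta^{\alpha\beta} + i\,\epsilon^{\alpha\beta\gamma}\sigma^\gamma \tag{A.4} $$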
Obviously only two of eqs.(A.2), (A.3) and (A.4) are independent.
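Given the three-vectors $\mathbf{a}$ and $\mathbf{b}$, let us multiply the previous expression by $a^\alpha$ and $b^\beta$, summing over the components. One obtains
$$ (\boldsymbol{\sigma}\mathbf{a})(\boldsymbol{\sigma}\mathbf{b}) = \mathbf{a}\,\mathbf{b} + i(\boldsymbol{\sigma}\,\mathbf{a}\times\mathbf{b}) \tag{A.5} $$
Note that $(\boldsymbol{\sigma}\mathbf{a})$ represents the following matrix
$$ (\boldsymbol{\sigma}\mathbf{a}) = \begin{pmatrix} a^3 & a^1 - ia^2 \\ a^1 + ia^2 & -a^3 \end{pmatrix} \tag{A.6} $$
and analogously for $(\boldsymbol{\sigma}\mathbf{b})$ and $(\boldsymbol{\sigma}\,\mathbf{a}\times\mathbf{b})$.
In eq.(A.5), if $\mathbf{b} = \mathbf{a}$, the vector product vanishes, so that one has
$$ (\boldsymbol{\sigma}\mathbf{a})^2 = \mathbf{a}^2 \tag{A.7} $$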
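Eqs.(A.5) and (A.7) are easy to test on explicit vectors. The sketch below, in plain Python, builds $(\boldsymbol{\sigma}\mathbf{a})$ directly from eq.(A.6); integer components are chosen so that the comparison is exact, and the function names are illustrative.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def sigma_dot(a):
    """The 2x2 matrix (sigma a) of eq.(A.6) for a three-vector a = [a1, a2, a3]."""
    return [[a[2], a[0] - 1j * a[1]],
            [a[0] + 1j * a[1], -a[2]]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def lhs(a, b):
    """Left-hand side of eq.(A.5): (sigma a)(sigma b)."""
    return matmul(sigma_dot(a), sigma_dot(b))

def rhs(a, b):
    """Right-hand side of eq.(A.5): (a b) times the identity plus i (sigma a x b)."""
    s = sigma_dot(cross(a, b))
    d = dot(a, b)
    return [[d + 1j * s[0][0], 1j * s[0][1]],
            [1j * s[1][0], d + 1j * s[1][1]]]

a = [1, 2, 3]
b = [-2, 0, 5]
identity_holds = lhs(a, b) == rhs(a, b)          # eq.(A.5)
square = lhs(a, a)                               # eq.(A.7): a^2 times the identity
```

With these sample vectors $\mathbf{a}^2 = 14$, so `square` is 14 times the 2×2 identity matrix.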
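Starting from this equality we can calculate a function $f(\boldsymbol{\sigma}\mathbf{a})$ of the matrix $(\boldsymbol{\sigma}\mathbf{a})$.
To this aim we recall that, if a function $f(x)$ of a standard variable $x$ has the Taylor expansion
$$ f(x) = \sum_{n=0}^{\infty} c_n x^n \tag{A.8} $$
then the function of the matrix is defined by the same expansion:
$$ f(\boldsymbol{\sigma}\mathbf{a}) = \sum_{n=0}^{\infty} c_n (\boldsymbol{\sigma}\mathbf{a})^n \tag{A.9} $$
The result is obviously a $2 \times 2$ matrix.
Incidentally, the previous definition, that makes use of the Taylor expansion in powers of the argument matrix, is a general one: it holds not only for $(\boldsymbol{\sigma}\mathbf{a})$ but also if the argument of the function is a matrix of any dimension or if it is a linear operator. In the present case, the powers $(\boldsymbol{\sigma}\mathbf{a})^n$ in eq.(A.9) can be calculated by means of eq.(A.7). We also use $(\boldsymbol{\sigma}\mathbf{a})^0 = 1$.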
We make here some algebraic developments to obtain a “closed” expression
for eq.(A.9).
First, let us write separately the even and the odd powers in the expansion
(A.8):
f(x) =
∞