This is an introduction to the concepts and procedures of tensor analysis. It uses
the more familiar methods and notation of matrices to do so. First it is worthwhile
to review the concept of a vector space and the space of linear functionals on a vector space.

A vector space is a set of elements V and a number of associated operations. There is an addition
operation defined such that for any two elements u and v in V there is an element w=u+v. Furthermore there
is an element of V, call it the zero vector 0, such that for any element u of V, u+0=u.
And for any element u of V there is an element of V, say v, such that u+v=0. The element v is
called the additive inverse of u and is denoted as −u. There is a set of scalars Λ
for a vector space such that for any element v of V and any element λ of Λ there is
an element of V, denoted as λv and called the scalar multiple of v by λ. Usually the
set of scalars is the real numbers or the complex numbers.

For a vector space there is a set of elements of V, called a basis, such that any vector in V can be
represented as a linear combination of the basis elements. The linear combination is given by the
scalar coefficients for the basis elements. Thus any vector in V can be represented by an ordered
array of scalar elements, which are called the components of the vector. Such an ordered array can be
considered a column vector of the scalar elements.
The basis of a vector space may or may not be finite.

Linear functionals on the vector space are linear functions which map the elements of V into the set of
scalars Λ. Addition and scalar multiplication on the set of linear functionals are defined, and
likewise a zero element and additive inverses. Thus the linear functionals over the set of
vectors V form a vector space, called the dual space of V.

If the vector space for V is of finite dimension n then so is its dual space and thus the dual space
has the same structure as that for V. If the vector space for V is not of finite dimension then
the dual space is not necessarily of the same structure.

For the vector space of n-dimensional column vectors the dual space may be thought of as also n-dimensional
vectors. Then the linear functionals may be considered as matrix products; i.e., if X is an element
of V and F is an element of the dual space then the linear functional is the matrix product of the transpose
of F with X; i.e.,

z = F^T X

Now consider a change in the coordinate system for V such that the vector of components X gets changed
into the vector of components Y by multiplication by an invertible matrix M; i.e.,

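Y = MX

Then X = M^{-1}Y, so the same scalar z is given by

z = F^T M^{-1} Y

If G denotes the components of the linear functional in the new coordinate system, so that z = G^T Y, then
G^T = F^T M^{-1} and therefore
G = (M^{-1})^T F = (M^T)^{-1} F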

The dual vectors get transformed by the inverse of the transpose of the transformation of the vectors.
Thus there are some vectors which get transformed by one rule and other vectors which get transformed
by an associated alternate rule. A similar arrangement occurs in tensor analysis in which some
tensors are called covariant and transform according to one rule and others are called contravariant
and transform according to an alternate but related rule.
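
The invariance of the scalar z under the two transformation rules can be checked numerically. The following is a minimal sketch in plain Python; the 2×2 matrix M and the component values for X and F are arbitrary illustrative choices, not values from the text.

```python
# Hypothetical 2x2 example: M, X and F are arbitrary illustrative values.

def matvec(A, v):
    """Multiply a 2x2 matrix (list of rows) by a length-2 vector."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

def transpose(A):
    """Transpose of a 2x2 matrix."""
    return [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]

def inverse(A):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def dot(u, v):
    """Matrix product u^T v for two length-2 column vectors."""
    return u[0] * v[0] + u[1] * v[1]

M = [[2.0, 1.0], [0.0, 3.0]]   # invertible change of coordinates
X = [1.0, 2.0]                 # components of a vector
F = [4.0, -1.0]                # components of a linear functional

Y = matvec(M, X)                        # vectors transform as Y = M X
G = matvec(inverse(transpose(M)), F)    # functionals transform as G = (M^T)^{-1} F

z_old = dot(F, X)   # z = F^T X in the original coordinates
z_new = dot(G, Y)   # z = G^T Y in the new coordinates
print(z_old, z_new)
```

The two printed values agree up to rounding, because the factor M applied to X is cancelled by the factor (M^T)^{-1} applied to F.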

It is very tempting to identify the vectors of the dual space as row vectors. This would illustrate
how the dual space could have the same mathematical structure as the primal (original) space yet be distinct.
However there is no provision in tensor analysis for identifying one vector as a column vector and another
one as a row vector.

A tensor is an entity which is represented in any coordinate system by an array
of numbers called its components. The components change from coordinate system to coordinate system in a systematic
way described by rules. The arrays of numbers are not the tensor; they are only the representation of
the tensor in a particular coordinate system. A vector is a special case of a tensor. A vector
is an entity which has direction and magnitude and is represented by a one dimensional array of numbers.
Unfortunately it is common to consider any one dimensional array of numbers as a vector.

A more general notion of a vector involves not only its direction and magnitude but also its point of
application. Thus a vector can be represented by two points; one point representing its point of application
and the line between the points giving its direction and magnitude. The points in n-dimensional space can be considered
invariant but their representations in a coordinate system are given by two n-dimensional arrays of numbers,
say X1 and X2. If the coordinate system is changed by shifting the origin (translation),
rotating the axes and/or stretching the axes (dilation) then the components of the representation of the
points change. The translation of the origin is given by an n-dimensional array, say A, and the rotation
and dilation by a matrix, say B. The new representation of a point P is Y where

Y = A + BX

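More generally the new coordinates may be nonlinear functions of the old ones. A transformation is allowable only if it is invertible; that is, if
y_i = f_i(x_1, x_2, …, x_n) for i=1,2,…,n
then there exists a set of functions {g_i: i=1,…,n} such that
x_i = g_i(y_1, y_2, …, y_n) for i=1,2,…,n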

The conditions for the existence of an inverse transformation can be stated in terms of the
Jacobian determinant for the transformation. Let J_{ij}=∂y_i/∂x_j and
let J be the matrix formed from the J_{ij}'s. The condition for the existence
of an inverse transformation within some region R is that det(J)≠0 within that region.

If the x-coordinates are transformed to y-coordinates and the y-coordinates are then transformed to
z-coordinates then the matrix for the transformation of the x-coordinates to the z-coordinates is just
the product of the matrices for the two intermediate transformations; i.e., if J is the matrix for the
transformation from the x's to the y's and K is the matrix of the transformation from the y's to the z's
then, by the chain rule, the matrix of the transformation from the x's to the z's is KJ. Thus the Jacobian for the transformation
from the x's to the z's is det(K)det(J) and hence if det(J) and det(K) are both nonzero in some region R
then so is det(KJ). Thus if both of the intermediate transformations are allowable then so is their
composition.
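
The composition property can be illustrated numerically. The sketch below, in plain Python, estimates Jacobian matrices by central differences for two hypothetical two-dimensional maps (named x_to_y and y_to_z here purely for illustration) and compares the Jacobian of the composed map with the product of the two intermediate Jacobians. The step size h is an assumption suited to these smooth maps.

```python
import math

def jacobian(f, p, h=1e-6):
    """Central-difference Jacobian of f: R^2 -> R^2 at p; entry [i][j] is d f_i / d p_j."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        hi, lo = list(p), list(p)
        hi[j] += h
        lo[j] -= h
        f_hi, f_lo = f(hi), f(lo)
        for i in range(2):
            J[i][j] = (f_hi[i] - f_lo[i]) / (2 * h)
    return J

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Two hypothetical coordinate transformations, chosen only for illustration.
def x_to_y(x):
    return [x[0] * math.cos(x[1]), x[0] * math.sin(x[1])]

def y_to_z(y):
    return [2.0 * y[0] + y[1], 3.0 * y[1]]

def x_to_z(x):
    return y_to_z(x_to_y(x))

x0 = [1.5, 0.4]
J = jacobian(x_to_y, x0)            # x's -> y's
K = jacobian(y_to_z, x_to_y(x0))    # y's -> z's, evaluated at the image point
KJ = matmul(K, J)                   # chain rule: the y -> z matrix stands on the left
J_total = jacobian(x_to_z, x0)      # direct Jacobian of the composed map

print(det2(J_total), det2(K) * det2(J))
```

The determinant of a product does not depend on the order of the factors, so the conclusion about nonzero Jacobians holds either way; the order matters only for the matrix itself.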

There is the trivial transformation yi=xi, called the identity transformation. The
Jacobian of this transformation is equal to unity.

Since the composition of a transformation with its inverse is just the identity transformation it follows
that the product of the Jacobian of a transformation with the Jacobian of the inverse transformation is
equal to unity and hence det(J^{-1}) = (det J)^{-1}.

All of this adds up to the mathematically interesting proposition that the set of allowable transformations
of coordinate systems forms a group.

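The transformation from cylindrical coordinates to Euclidean coordinates will be used for the illustration. First these
coordinates will be given in their more familiar form and then converted to the subscripted notation. The cylindrical coordinates
are (r, θ, z) and the Euclidean coordinates
are (x, y, z) where

x = r cos θ
y = r sin θ
z = z

In the subscripted notation (x_1, x_2, x_3) = (r, θ, z) and (y_1, y_2, y_3) = (x, y, z). The Jacobian matrix of this
transformation, with rows indexed by the y_i and columns by the x_j, is

| cos θ   −r sin θ   0 |
| sin θ    r cos θ   0 |
|   0         0      1 |

Its determinant is r, so the transformation is invertible wherever r≠0.

The Jacobian matrix of the inverse transformation, the inverse of the above matrix, is then

|  cos θ        sin θ       0 |
| −(sin θ)/r   (cos θ)/r    0 |
|    0            0         1 |

Its determinant is 1/r, which of course is the reciprocal of the determinant of the Jacobian matrix of the original transformation.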
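
As a check on the cylindrical example, the following plain-Python sketch writes out the Jacobian matrix of the map x = r cos θ, y = r sin θ, z = z and that of its inverse, then verifies that their product is the identity and that the determinants are r and 1/r. The function names and the sample point are illustrative assumptions.

```python
import math

def jacobian_cyl_to_euclid(r, theta):
    """Partial derivatives of (x, y, z) with respect to (r, theta, z)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s, 0.0],
            [s,  r * c, 0.0],
            [0.0, 0.0, 1.0]]

def jacobian_euclid_to_cyl(r, theta):
    """Partial derivatives of (r, theta, z) with respect to (x, y, z), written in terms of r and theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s, 0.0],
            [-s / r, c / r, 0.0],
            [0.0, 0.0, 1.0]]

def det3(A):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def matmul3(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

r, theta = 2.0, 0.7                      # arbitrary sample point with r != 0
J = jacobian_cyl_to_euclid(r, theta)
J_inv = jacobian_euclid_to_cyl(r, theta)
product = matmul3(J_inv, J)              # should be the identity matrix

print(det3(J), det3(J_inv))
```

At r = 0 the second function divides by zero, which reflects the failure of the condition det(J)≠0 on the z-axis.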


Let X be a representation of a vector in n-dimensional space and let h=f(X) be a scalar function of X.
If the X representation is transformed to another coordinate system by the transformation T(), such that
Y=T(X), where T is a vector-valued function, then the scalar function f(X) gets transformed into
g(Y) where g(Y)=f(T^{-1}(Y)). For simplicity let the inverse transformation T^{-1}(Y) be
represented as S(Y). Thus g(Y)=f(S(Y)).

Consider the vector whose components are {∂f/∂x_i, i=1,…,n}.
This is called the gradient of the scalar function f. Now consider the gradient of the
scalar function g(Y); i.e., {∂g/∂y_j, j=1,…,n}. How are the
components of the gradient of g related to the components of the gradient of f? Calculus
gives the answer as

∂g/∂y_j = Σ_i (∂f/∂x_i)(∂x_i/∂y_j)

Summations arise almost always when there is a repeated index such as i in the above
equation. Albert Einstein promoted the convention that since a repeated index almost
always means summation over that index the summation sign Σ can be dispensed with. Therefore in tensor
notation the above equation is

∂g/∂y_j = (∂f/∂x_i)(∂x_i/∂y_j)

Let (∂x_i/∂y_j) be denoted as the element m_{i,j} of a matrix M and
let (∂f/∂x_i) and (∂g/∂y_j) be the components of the
column vectors ∇_x f and ∇_y f, respectively. Because the summation runs over the first index of m_{i,j}, in matrix notation the
above transformation of the gradient vector is represented as

∇_y f = M^T ∇_x f

Any entity whose components transform in this way is called a covariant tensor of
rank 1. There are other entities which transform in a similar but different way.
Consider the differentials dy_j and dx_i for i and j ranging from 1 to n.
From calculus

dy_j = Σ_i (∂y_j/∂x_i) dx_i
or, in tensor notation,
dy_j = (∂y_j/∂x_i) dx_i

The quantities (∂y_j/∂x_i) are the elements of what previously had been
referred to as the Jacobian matrix of the transformation. Denoting the Jacobian matrix as J, the
transformation of the vector of infinitesimals dX to dY is represented as

dY = JdX

This transformation is called contravariant and a vector transforming in this way
would be called a contravariant tensor of rank 1.

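What was denoted above as the matrix M is the Jacobian matrix of the inverse transformation, J^{-1}.
Since the gradient rule sums over the first index of (∂x_i/∂y_j), the matrix acts on the column vector of gradient
components through its transpose, and the covariant transformation of the gradient vectors expressed previously could be represented as

∇_y f = (J^{-1})^T ∇_x f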
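
Both rules can be tested on a concrete case. The sketch below is a minimal plain-Python check, assuming that the X's are Euclidean coordinates of the plane, the Y's are polar coordinates (r, θ), and f is a hypothetical scalar function chosen for illustration. The covariant rule is applied as the index sum Σ_i (∂f/∂x_i)(∂x_i/∂y_j) and compared with a finite-difference gradient of g; the contravariant rule dy_j = (∂y_j/∂x_i)dx_i is compared with the actual change in the polar coordinates under a small displacement.

```python
import math

def S(y):
    """Inverse transformation: polar (r, theta) -> Euclidean (x1, x2)."""
    return [y[0] * math.cos(y[1]), y[0] * math.sin(y[1])]

def T(x):
    """Forward transformation: Euclidean (x1, x2) -> polar (r, theta)."""
    return [math.hypot(x[0], x[1]), math.atan2(x[1], x[0])]

def f(x):
    """Hypothetical scalar function of the Euclidean coordinates."""
    return x[0] ** 2 + 3.0 * x[1]

def grad_f(x):
    return [2.0 * x[0], 3.0]

def g(y):
    """The same scalar expressed in the new coordinates, g(Y) = f(S(Y))."""
    return f(S(y))

def inverse2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

y0 = [2.0, 0.5]
x0 = S(y0)
r, th = y0
M = [[math.cos(th), -r * math.sin(th)],     # M[i][j] = d x_i / d y_j
     [math.sin(th),  r * math.cos(th)]]
J = inverse2(M)                             # J[j][i] = d y_j / d x_i

# Covariant rule: dg/dy_j = sum_i (df/dx_i)(dx_i/dy_j)
gx = grad_f(x0)
covariant = [sum(gx[i] * M[i][j] for i in range(2)) for j in range(2)]

# Finite-difference gradient of g for comparison
h = 1e-6
numeric = []
for j in range(2):
    hi, lo = list(y0), list(y0)
    hi[j] += h
    lo[j] -= h
    numeric.append((g(hi) - g(lo)) / (2 * h))

# Contravariant rule: dy_j = sum_i (dy_j/dx_i) dx_i
dX = [1e-6, -2e-6]
dY_rule = [sum(J[j][i] * dX[i] for i in range(2)) for j in range(2)]
y1 = T([x0[0] + dX[0], x0[1] + dX[1]])
dY_actual = [y1[0] - y0[0], y1[1] - y0[1]]

print(covariant, numeric)
print(dY_rule, dY_actual)
```

The agreement in the second comparison is only to first order in dX, which is why the displacement is taken to be very small.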

These two rules for transformation, covariant and contravariant, are the defining characteristic of tensors. They can be
extended to multiple indices. The indices for covariance are conventionally denoted as subscripts and contravariance as
superscripts.

The Metric Tensor

One scalar quantity of importance in geometry is distance. In Euclidean coordinates for an n-dimensional space the
formula for the length ds of an infinitesimal line segment is

(ds)² = Σ_i (dx_i)²
or, using the summation convention,
(ds)² = dx_i dx_i

If dX represents the column vector of dx_i's then the formula for the line length could be expressed as

(ds)² = (dX)^T (dX)

If the coordinate system is changed from the Euclidean X's to some coordinate system of Y's then

dx_i = (∂x_i/∂y_j) dy_j

In matrix notation this may be expressed as

dX = MdY

where M is the matrix whose m_{i,j} element is (∂x_i/∂y_j). Thus
(dX)^T = (dY)^T M^T
and hence

(ds)² = (dY)^T M^T M dY = (dY)^T G dY

where the matrix G is M^T M. The elements of G are given by

g_{i,k} = (∂x_j/∂y_i)(∂x_j/∂y_k)

The two dimensional array of the g_{i,k}'s is called the metric tensor. That it is in fact a tensor and
a covariant one at that is something that needs to be proven. It is the metric tensor for the coordinate system Y. The metric
tensor for the Euclidean coordinate system is such that g_{i,k}=δ_{i,k}, where δ_{i,k}=0 if
i≠k and =1 if i=k.

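For cylindrical coordinates the inverse of the transformation from Euclidean to cylindrical, i.e., the transformation from cylindrical to Euclidean, is
in familiar notation,

x = r cos θ
y = r sin θ
z = z

The matrix M of the derivatives of (x, y, z) with respect to (r, θ, z) is

| cos θ   −r sin θ   0 |
| sin θ    r cos θ   0 |
|   0         0      1 |

and the metric tensor G = M^T M is

| 1   0    0 |
| 0   r²   0 |
| 0   0    1 |

Thus

(ds)² = (dr)² + (r dθ)² + (dz)²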
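
The cylindrical line element can be reproduced from the matrix formula. The following plain-Python sketch assumes M is the matrix of derivatives of (x, y, z) with respect to (r, θ, z), forms G = M^T M, and compares the quadratic form (dY)^T G dY with the familiar expression for (ds)². The sample point and the displacement dY are arbitrary illustrative values.

```python
import math

def M_cylindrical(r, theta):
    """M[i][j] = d x_i / d y_j with x = (x, y, z) and y = (r, theta, z)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -r * s, 0.0],
            [s,  r * c, 0.0],
            [0.0, 0.0, 1.0]]

def metric(M):
    """G = M^T M, i.e. g_ik = sum over j of M[j][i] * M[j][k]."""
    n = len(M)
    return [[sum(M[j][i] * M[j][k] for j in range(n)) for k in range(n)]
            for i in range(n)]

r, theta = 2.0, 0.7
G = metric(M_cylindrical(r, theta))

dY = [1e-3, 2e-3, -1e-3]   # (dr, dtheta, dz), arbitrary small displacement
ds2_metric = sum(G[i][k] * dY[i] * dY[k] for i in range(3) for k in range(3))
ds2_formula = dY[0] ** 2 + (r * dY[1]) ** 2 + dY[2] ** 2

print(G)
print(ds2_metric, ds2_formula)
```

Up to rounding, G comes out diagonal with entries 1, r² and 1, so the off-diagonal terms drop out of the line element.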

The inverse of the metric tensor is significant. In the case of the cylindrical coordinate system
this inverse G^{-1} is

| 1    0     0 |
| 0   1/r²   0 |
| 0    0     1 |

In tensor analysis the metric tensor is denoted as g_{ij} and its inverse is denoted as
g^{ij}. This latter notation suggests that the inverse has something to do with contravariance.
Since dX = M dY, a column vector X of contravariant components in the Euclidean coordinate system has components
Y = M^{-1}X in the other coordinate system, whereas a column vector F of covariant components, such as a gradient,
has components H = M^T F. Now consider G^{-1}H. Since G = M^T M,

G^{-1} = M^{-1} (M^T)^{-1} = M^{-1} (M^{-1})^T

and therefore

G^{-1} H = M^{-1} (M^T)^{-1} M^T F = M^{-1} F

This is the transformation rule for a contravariant tensor. Thus multiplication of a covariant tensor
by the contravariant metric tensor creates a contravariant tensor. The operation also works in the
other direction. Multiplication of a contravariant tensor by the metric tensor produces a covariant
tensor, since G Y = M^T M M^{-1} X = M^T X.

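The same result can be stated in the notation of the Introduction. Suppose the original coordinate system is Euclidean and
the transformation of the coordinate system is such that Y=MX; then the linear functional H
in the new coordinate system corresponding to F in the original coordinate system is given by

H = (M^T)^{-1} F

In this notation the matrix of the derivatives (∂x_i/∂y_j) is M^{-1}, so the metric tensor of the new coordinate system is
G = (M^{-1})^T M^{-1} and its inverse is G^{-1} = M M^T. Now consider G^{-1}H.
Then

G^{-1} H = M M^T (M^T)^{-1} F
which reduces to
G^{-1} H = M F

This is the transformation rule for a contravariant tensor, the same rule as Y=MX.
Thus multiplication of a covariant tensor by the inverse metric tensor produces a contravariant tensor.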
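
The raising and lowering of indices by the metric tensor can be checked numerically. The sketch below is a plain-Python illustration under explicitly assumed conventions: the original coordinates are Euclidean, the new coordinates are polar, M is the matrix of the derivatives (∂x_i/∂y_j), and G = M^T M. Under these assumptions covariant components transform as M^T F and contravariant components as M^{-1}F. The component values in F are arbitrary.

```python
import math

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def inverse2(A):
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

r, th = 2.0, 0.5                             # arbitrary sample point
M = [[math.cos(th), -r * math.sin(th)],      # M[i][j] = d x_i / d y_j
     [math.sin(th),  r * math.cos(th)]]

G = matmul(transpose(M), M)                  # metric tensor of the polar system
G_inv = inverse2(G)                          # inverse (contravariant) metric tensor

# Euclidean components; in Euclidean coordinates the covariant and
# contravariant components of a vector coincide.
F = [1.0, 2.0]
covariant = matvec(transpose(M), F)          # covariant rule:     M^T F
contravariant = matvec(inverse2(M), F)       # contravariant rule: M^{-1} F

raised = matvec(G_inv, covariant)            # G^{-1} applied to covariant components
lowered = matvec(G, contravariant)           # G applied to contravariant components

print(raised, contravariant)
print(lowered, covariant)
```

Each printed pair agrees up to rounding, which is the matrix form of the statement that the metric tensor and its inverse convert between the two kinds of components.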

The Christoffel Symbols

The Christoffel 3-index symbol of the first kind is defined as

[ij,k] = ½[∂g_{ik}/∂x_j + ∂g_{jk}/∂x_i − ∂g_{ij}/∂x_k]

A property of these symbols obvious from the definition is that

[ij,k] = [ji,k]

When the Christoffel symbols of the first kind are multiplied by the elements of the inverse of the
metric tensor and the results summed over the third index a different set of symbols is generated which are
called the Christoffel 3-index symbols of the second kind. There are different notations used for these
symbols, but the typographically most convenient is, in analogy with the symbols of the first kind, {ij,k}.
I. S. Sokolnikoff in his Tensor Analysis uses a two-level symbol impossible to create in HTML.
The Princeton school uses a symbol of the form Γ^k_{ij} but the superscripts and
subscripts cannot be properly lined up using HTML. Therefore {ij,k} will be used here. The definition
is

{ij,k} = g^{kp}[ij,p]
where, of course, there is a summation over the index p.

It follows from the symmetry of [ij,k] with respect to the interchange of the first two indices that

{ij,k} = {ji,k}
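
The definitions can be tried out on a simple metric. The sketch below is a plain-Python illustration assuming the polar-coordinate metric of the plane, g = diag(1, r²), with the derivatives of the metric estimated by central differences; the step size h and the function names are assumptions of this example. Indices run from 0 in the code, so first[i][j][k] holds [ij,k] with each index shifted down by one.

```python
def metric(p):
    """Polar-coordinate metric g_ij at the point p = (r, theta)."""
    r = p[0]
    return [[1.0, 0.0], [0.0, r * r]]

def metric_inverse(p):
    """Inverse metric g^ij at the point p = (r, theta)."""
    r = p[0]
    return [[1.0, 0.0], [0.0, 1.0 / (r * r)]]

def d_metric(p, m, h=1e-5):
    """Central-difference derivative of every g_ij with respect to coordinate m."""
    hi, lo = list(p), list(p)
    hi[m] += h
    lo[m] -= h
    g_hi, g_lo = metric(hi), metric(lo)
    return [[(g_hi[i][j] - g_lo[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

def first_kind(p):
    """[ij,k] = 1/2 (d g_ik/dx_j + d g_jk/dx_i - d g_ij/dx_k), stored as [i][j][k]."""
    dg = [d_metric(p, m) for m in range(2)]   # dg[m][a][b] = d g_ab / d x_m
    return [[[0.5 * (dg[j][i][k] + dg[i][j][k] - dg[k][i][j])
              for k in range(2)] for j in range(2)] for i in range(2)]

def second_kind(p):
    """{ij,k} = g^{kq} [ij,q], summed over q, stored as [i][j][k]."""
    first = first_kind(p)
    g_inv = metric_inverse(p)
    return [[[sum(g_inv[k][q] * first[i][j][q] for q in range(2))
              for k in range(2)] for j in range(2)] for i in range(2)]

p = [2.0, 0.3]            # arbitrary sample point (r, theta)
first = first_kind(p)
second = second_kind(p)

print(first[1][1][0], first[0][1][1])     # [22,1] and [12,2]
print(second[1][1][0], second[0][1][1])   # {22,1} and {12,2}
```

For this metric the only nonzero symbols of the first kind are [22,1] = −r and [12,2] = [21,2] = r, and those of the second kind are {22,1} = −r and {12,2} = {21,2} = 1/r, which the numerical values reproduce up to rounding.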

The Christoffel symbols are essential in defining the derivative of tensors. However the Christoffel
symbols themselves do not represent a tensor. That is to say they do not transform upon a change
in coordinate system according to either the covariant or contravariant rules of transformation.

Differentiation of Tensors

Previously it was noted that the gradient vector for a scalar function is a covariant tensor. The gradient
is the set of partial derivatives of a scalar function, say f(x_1, …, x_n), i.e.,
(∂f/∂x_i) for i=1 to n.

Consider now the second derivatives of the scalar function

∂²f/∂x_j∂x_i

When the coordinate system is changed from X's to Y's the gradient components transform according to
the tensor formula

∂f/∂y_i = (∂f/∂x_p)(∂x_p/∂y_i)

However the formula for the transformation of the second derivatives is