at least in the beginning of the linear algebra
unit, we'll be following the Axler textbook closely enough that
supplementary lecture notes should not be needed. Some important
extensions/modifications to the treatment in Axler:

[cf. Axler, p.3]
Unless noted otherwise, F may be an arbitrary
field, not only R or C. The
most important fields other than those of real and complex numbers
are the field Q of rational numbers, and the
finite fields Z/pZ
(p prime). Other examples are the field
Q(i) of complex numbers with rational
real and imaginary parts; more generally,
Q(d1/2) for any nonsquare
rational number d; the “p-adic numbers”
Qp (p prime),
of which we'll say more when we study topology next term;
and more exotic finite fields such as the 9-element field
(Z/3Z)(i).
Here's a review
of the axioms for fields, vector spaces, and related mathematical
structures
[corrected 7.ix.10: I somehow forgot to include
the associative property among the vector space axioms )-:]

[cf. Axler, p.22] We define the span of an arbitrary subset
S of (or tuple in) a vector space V as follows:
it is the set of all (finite) linear combinations
a1v1 + … +
anvn
with each vi in S and each
ai in F. This is still the smallest
vector subspace of V containing S. In particular,
if S is empty, its span is by definition {0}. We do
not require that S be finite.

Unlike Axler, we discuss “quotient vector spaces”
(and even quotient modules) explicitly and introduce them early.
If N is a submodule of the module M
over some ring A,
the quotient moduleM/N consists of all
equivalence classes in M, where we define
the equivalence relation “∼” to mean that
m∼m' if and only if
m−m' is contained in N.
One must then check that the module operations inherited
from M do give M/N the structure of a
well-defined A-module.
[“Well-defined” means that the operations
do not depend on the choice of representative
from an equivalence class.]
A familiar example is the cyclic finite group
Z/nZ
(recall that abelian group=Z-module).
In the case of vector spaces, the dimension of a quotient vector space
V/U is called the codimension in V
of its subspace U; if V is finite-dimensional,
this codimension equals dim(V) − dim(U).

Warning: in general the space F[X]
(a.k.a. P(F))
of polynomials in X, and its subspaces
Pn(F)
of polynomials of degree at most n,
might not be naturally identified
with a subspace of the space FF of
functions from F to itself. The problem is that
two different polynomials may yield the same function.
For example if F is the field of 2 elements
the polynomial X2-X
gives rise to the zero function. In general different polynomials
can represent the same function if and only if
F is finite — do you see why?

In Axler's Theorem 2.10, The hypothesis that the spanning set
be finite (implicit in Axler's use of “lists”)
is not necessary as long as the vector space V
is finite dimensional. Indeed let S be an arbitrary
spanning set. Since V is finite dimensional,
it has a finite spanning set S'. Write each element
of S' as a linear combination of elements of S.
These linear combinations involve only finitely many elements
of S, so S has a finite subset S0
that spans S' and thus spans V. Now apply Axler's
argument to find a subset of S0, and thus
a fortiori of S, that is a basis of V.

Here's an extreme example of how basic theorems about
finite-dimensional vector spaces can become utterly false for
finitely-generated modules: a module generated by just one element
can have a submodule that is not finitely generated. Indeed,
for any field F let A be the ring of polynomials
in infinitely many variables Xj. As usual
we can regard A as a module over itself, with a single
generator 1. A submodule is then just an ideal of the ring.
Choose the ideal I generated by all the Xj,
which consists of all polynomials with constant coefficient equal 0.
Then if there are infinitely many indices j then I
is infinitely generated; indeed any generating set must be
at least as large as the index set of j’s,
so for every cardinal ℵ we can make a ring A with a
singly-generated module (namely A itself) and with a submodule
that cannot be generated by fewer than ℵ elements.
For a subtler example, consider the ring we might call
“F[X1/2∞]”,
consisting of F-linear combinations of monomials
Xn/2k
for arbitrary nonnegative integers n and k.
Again let I be the ideal generated by the nonconstant monomials,
which is not finitely generated, though there are generating sets
that are “only” countably infinite.
The new behavior involves the countable generating set
{X1/2k | k≥0}:
there is no minimal generating subset, because each
X1/2k
is a multiple of X1/2k'
for any k'>k. Likewise for the ring generated
by all monomials Xr
with r a nonnegative rational number (or even all
Xr
with r a nonnegative real number).

Suppose T : V → W
is a linear transformation.
Axler's notation for the image of T
was already becoming rather old-fashioned when he wrote his book;
these days simply T(V) is common
(and likewise for any function at all).
The terminology “null space” (whether one or two words)
for T -1(0) is also somewhat quaint;
we usually say “kernel” and write
“ker(T)” [and (LA)TeX already provides
the command \ker to typeset this properly].

Axler also unaccountably soft-pedals the important notion of
duality.
[corrected 18.ix.10:
If W is the direct sum of U and V
then the copy of U*
in W* is the “annihilator”
in W* of V, not of U.]

About “Lemma 3.?”:
some notes and warnings about the behavior of
Hom(V,W) under (finite or infinite) direct sums.

Here's a brief preview of
abstract nonsense (a.k.a. diagram-chasing),
and a diagram-chasing interpretation of quotients and duality.

Axler proves the Fundamental Theorem of Algebra using
complex analysis, which cannot be assumed in Math 55a
(we'll get to it at the end of 55b).
Here's a proof using the topological tools
we'll develop at the start of 55b.
(Axler gives the complex-analytic proof on page 67.)
[corrected 24.ix.10 to fix a minor typo:
when f-tilde is introduced, it is f-tilde that vanishes at zero, not f]

We shall need some “eigenstuff” also in an infinite-dimensional
setting, so will not assume that any vector space is (nonzero) finite
dimensional unless we really must.

If T is a linear operator on a vector space V,
and U is an invariant subspace, then the quotient space
V/U inherits an action of T.
Moreover, the annihilator of U in V*
is an invariant subspace for the action of the adjoint operator
T* on V*.
(Make sure you understand why both these claims hold.)

Triangular matrices are intimately related with “flags”.
A (complete) flag in a finite dimensional vector space V
is a sequence of subspaces {0}=V0,
V1, V2, …,
Vn=V, with each
Vi of dimension i and containing
Vi-1. A basis
v1, v2, …,
vn
determines a flag: Vi is the span of the first
i basis vectors. Another basis
w1, w2, …,
wn
determines the same flag if and only if each
wi is a linear combination of
v1, v2, …,
vi
(necessarily with nonzero vi coefficient).
The standard flag in Fn is the flag
obtained in this way from the standard basis of unit vectors
e1, e2, …,
en.
The punchline is that, just as a diagonal matrix is one that respects
the standard basis (equivalently, the associated decomposition of
V as a direct sum of 1-dimensional subspaces),
an upper-triangular matrix is one that respects the standard
flag.
Note that the i-th diagonal entry of a triangular matrix
gives the action on the one-dimensional quotient space
Vi/Vi−1
(each i=1,…,n).

Less surprising than the absence of quotients and duality
in Axler is the lack of tensor algebra.
That won't stop us in Math 55, though.
Here's an introduction
[As you might guess from \oplus,
the TeXism for the tensor-product symbol is \otimes.]
corrected 30.ix.10 and 1.x.10
to fix a couple of minor typos and improve the presentation of
V ⊗FF'

One of many applications is the trace of
an operator on a finite dimensional F-vector space V.
This is a linear map from Hom(V,V) to F.
We can define it simply as the composition of two maps:
our identification of Hom(V,V) with
the tensor product of V* and V,
and the natural map from this tensor product to F
coming from the bilinear map taking (v*,v)
to v*(v). We'll see that this is the same
as the classical definition: the trace of T is the sum of the
diagonal entries of the matrix of T with respect to any basis.
The coordinate-independent construction via tensor algebra
explains why the trace does not change under change of basis.
(The invariance can also be proved by checking explicitly that
AB and BA have the same trace for any square matrices
A, B of the same size.)

Here are some basic definitions and facts about general
norms
on real and complex vector spaces.

Just as we can study bilinear symmetric forms
on a vector space over any field, not just R,
we can study sesquilinear conjugate-symmetric forms
on a vector space over any field with a conjugation,
not just C.
Here a “conjugation” on a field F is a field automorphism
σ:F→F
such that σ is not the identity
but σ2 is (that is, σ is an involution).
Given a basis {vi} for F,
a sesquilinear form 〈.,.〉 on F
is determined by the field elements
ai,j
= 〈vi,vj〉,
and is conjugate-symmetric if and only if
aj,i = σ(ai,j)
for all i,j.
Note that the “diagonal entries” ai,i
— and more generally 〈v,v〉
for any v in V — must be elements
of the subfield of F fixed by σ.

“Sylvester's Law of Inertia” states that
for a nondegenerate pairing on a finite-dimensional vector space
V/F, where either
F=R and the pairing is
bilinear and symmetric, or
F=C and the pairing is
sesquilinear and conjugate-symmetric,
the counts of positive and negative inner products
for an orthogonal basis constitute an invariant of the pairing
and do not depend on the choice of orthogonal basis.
(This invariant is known as the “signature” of the pairing.)
The key trick in proving this result is as follows.
Suppose V is the orthogonal direct sum of subspaces
U1, U2
for which the pairing is positive definite on U1
and negative definite on U2.
Then any subspace W of V
on which the pairing is positive definite
has dimension no greater than dim(U1).
Proof: On the intersection of W with U2,
the pairing is both positive and negative definite;
hence that subspace is {0}. The claim follows by a dimension count,
and we quickly deduce Sylvester's Law.

Over any field not of characteristic 2,
we know that for any non-degenerate
symmetric pairing on a finite-dimensional vector space
there is an orthogonal basis, or equivalently
a choice of basis such that the pairing is
(x,y)=sumi(ai xi yi)
for some nonzero scalars ai.
But in general it can be quite hard to decide whether
two different collections of ai
yield isomorphic pairings. Even over Q
the answer is already tricky in dimensions 2 and 3,
and I don't think it's known in a vector space of arbitrary dimension.
Over a finite field of odd size there are always exactly two
possibilities, as we'll see in a few weeks.

A regular graph of degree d is a Moore graph of girth 5
if any two different vertices are linked by a unique path of length
at most 2. Such a graph necessarily has
n = 1 + d + d(d−1)
= d2 + 1 vertices.
Let A be the adjacency matrix, and
1 the all-ones vector. Then (because each vertex
has degree d) 1 is a
d-eigenvector of A.
We have
(1 + A + A2) v =
d v + 〈v, 1〉 1 for all v
(proof: check on unit vectors and use linearity).
Thus A takes the orthogonal complement
of R·1
to itself and satisfies
(1 + A + A2) = d
on that space. Since this quadratic equation has distinct roots
m and −1−m
for some m > 0 (namely the positive root of
(1 + m + m2) = d),
it follows that the orthogonal complement
of R·1
is the direct sum of the corresponding eigenspaces.
Let d1 and d2
be their dimensions. These sum to
n−1 = d2, and satisfy
md1 + (−1−m)d2 + d = 0
because the matrix A has trace zero. This lets us solve
for d1 and d2.
in particular we find that their difference is
(2d − d2) / (2m+1).
Since that's an integer, either d=2
(giving the pentagon graph) or m is an integer.
Substituting m2 + m + 1
for d, we find that
16(d1 − d2)
is an integer plus 15/(2m+1), whence
m is one of 0, 1, 2, or 7. The first of these is impossible,
and the others give d = 3, 7, or 57
as claimed.

It is clear that the pentagon is the unique Moore graph of girth 5,
and degree 2, and it is not hard to check that the Petersen graph
is the unique example with d = 3. For
d = 7 there is again a unique graph up to
an interesting group of automorphisms; see this
proof outline, which also lets one count
the automorphisms of this graph. There is no spectral graph theory
there, but this approach is suggested by the inequality you'll prove
in the fifth problem of problem set 7:
if the 50 vertices are divided into two equal subsets, what's
the smallest possible number of edges between them?

For more of this kind of application of linear algebra to combinatorics,
see for example Cameron and van Lint's text
Designs, Graphs, Codes and their Links
(London Math. Society, 1991, reprinted 1996).

All of Chapter 8 works over an arbitrary algebraically closed field,
not only over C
(except for the minor point about extracting square roots,
which breaks down in characteristic 2); and
the first section (“Generalized Eigenvalues”) works over any field.

We don't stop at Corollary 8.8: let T be any operator
on a vector space V over a field F, not assumed
algebraically closed. If V is finite-dimensional, then
The Following Are Equivalent:
(1) There exists a nonnegative integer k
such that Tk=0;
(2) For any vector v, there exists a nonnegative integer
k such that Tkv=0;
(3) Tn=0, where n=dim(V).
Note that (1) and (2) make no mention of the dimension,
but are still not equivalent for operators on infinite-dimensional spaces.
We readily deduce the further equivalent conditions:
(4) There exists a basis for V for which
T has an upper-triangular matrix
with every diagonal entry equal zero;
(5) Every upper-triangular matrix for T
has zeros on the diagonal,
and there exists at least one upper-triangular matrix for T.
Recall that the second part of (5) is automatic
if F is algebraically closed.

The space of generalized 0-eigenvectors
(the maximal subspace on which T is nilpotent)
is sometimes called the nilspace of T.
It is an invariant subspace. When V is finite dimensional,
V is the direct sum of the nilspace and another invariant
subspace V', consisting of the intersection of the subspaces
Tk(V) as k ranges over all
positive integers. See Exercise 8.11;
this can be used to quickly prove Theorem 8.23
and consequences such as Cayley-Hamilton (Theorem 8.20).

The dimension of the space of generalized c-eigenvalues
(i.e., of the nilspace of T−cI) is usually called the
algebraic multiplicity of c
(since it's the multiplicity of c
as a root of the characteristic polynomial of T),
to distinguish it from the “geometric multiplicity”
which is the dimension of ker(T−cI).

We'll not cover Chapter 9, which seems to exist mainly to prove that
it is possible to construct the characteristic polynomial
over R without introducing determinants.
We'll proceed to Chapter 10, which covers determinants, but…

We'll define the determinant of an operator T
on a finite dimensional space V as follows:
T induces a linear operator T'
on the top exterior power of V;
this exterior power is one-dimensional,
so an operator on it is multiplication by some scalar;
det(T) is by definition the scalar
corresponding to T'.
The “top exterior power”
is a subspace of the “exterior algebra” of V,
which is the quotient of the tensor algebra by
the ideal generated by {v*v: v in V}.
We'll still have to construct the sign homomorphism from the
symmetry group of order dim(V) to {1,−1}
to make sure that this exterior algebra is as large
as we expect it to be, and that in particular that
the (dim(V))-th exterior power has dimension 1
rather than zero.

Interlude: normal subgroups;
short exact sequences in the context of groups:
A subgroup H of G is normal
(satisfies H = g−1Hg
for all g in G) iffH is the kernel of some group homomorphism from Giff the injection H → G
fits into a short exact sequence
{1}
→ H
→ G
→ Q
→ {1},
in which case Q is the quotient groupG/H.
[The notation {1} for the one-element (“trivial”) group
is usually abbreviated to plain 1, as in
1
→ H
→ G
→ Q
→ 1.]
This is not in Axler but can be found in any introductory text in
abstract algebra; see for instance Artin, Chapter 2, section 10.
Examples:
1
→ An
→ Sn
→ {±1}
→ 1;
also, the determinant homomorphism
GLn(F) → F*
gives the short exact sesquence
1
→ SLn(F)
→ GLn(F)
→ F*
→ 1,
and this works even if F is just a commutative ring with unit
as long as F* is understood as
the group of invertible elements of F — for example,
Z* = {±1}.

Some more tidbits about exterior algebra:

If w, w' are elements of the
m-th and m'-th exterior powers of V,
then ww'=(−1)mm'w'w;
that is, w and w' commute
unless m and m' are both odd
in which case they anticommute.

If m+m'=n=dim(V)
then the natural pairing from the
m-th and m'-th exterior powers
to the n-th is nondegenerate,
and so identifies the m'-th exterior power
canonically with the dual of the m-thtensored with the top (n-th) exterior power.

In particular, if m=1,
and T is any invertible operator on V,
then we find that the induced action of T
on the (n−1)st exterior power
is the same as its action on V*
multiplied by det(T).
This yields the formula connecting the inverse and cofactor matrix
of an invertible matrix (a formula which you may also know
in the guise of “Cramer's rule”).

For each m there is a natural non-degenerate pairing
between the m-th exterior powers of
V and V*,
which identifies these exterior powers with each other's dual.

More will be said about exterior algebra when differential forms
appear in Math 55b.

We'll also show that a symmetric (or Hermitian) matrix
is positive definite iff all its eigenvalues are positive
iff it has positive principal minors
(the “principal minors” are the determinants
of the square submatrices of all orders containing the (1,1) entry).
More generally we'll show that the eigenvalue signs determine
the signature, as does the sequence of signs of principal minors
if they are all nonzero. More precisely: an invertible
symmetric/Hermitian matrix has signature
(r,s) where
r is the number of positive eigenvalues and
s is the number of negative eigenvalues;
if its principal minors are all nonzero then
r is the number of j in
{1,2,…,n} such that the
j-th and (j−1)-st minors
have the same sign, and
s is the number of j in that range such that the
j-th and (j−1)-st minors
have opposite sign [for j=1 we always count
the “zeroth minor” as being the positive number 1].
This follows inductively from the fact that the determinant has sign
(−1)s and the signature
(r',s') of the restriction of a pairing
to a subspace has
r' ≤ r and
s' ≤ s.

For positive definiteness, we have the two further
equivalent conditions: the symmetric (or Hermitian) matrix
A=(aij) is positive definite
iff there is a basis (vi)
of Fn such that
aij = 〈vi,vj〉
for all i,j, and
iff there is an invertible matrix B such that
A=B*B.
For example, the matrix with entries 1/(i+j−1)
(“Hilbert matrix”) is positive-definite, because it is
the matrix of inner products (integrals on [0,1]) of the basis
1,x,x2,…,xn−1
for the polynomials of degree <n.
See the 10th problem set for a calculus-free proof of the
positivity of the Hilbert matrix, and an evaluation of its determinant.
For any matrix B (not necessarily invertible or even square)
with columns vi,
the matrix B*B with entries
aij = 〈vi,vj〉
is known as the Gram matrix of the columns of B.
It is invertible iff those columns are linearly independent.
If we add an (n+1)-st vector, the determinant of
the Gram matrix increases by a factor equal to the squared distance
between this vector and the span of the columns of B.
You'll use this too in PS10.

Our source for representation theory of finite groups
(on finite-dimensional vector spaces over C)
will be Artin's Algebra, Chapter 9. We'll omit
sections 3 and 10 (which require not just topology and
calculus but also, at least for §3, some material beyond 55b
to do properly, namely the construction of Haar measures);
also we won't spend much time on §7, which works out in detail
the representation theory of a specific group that Artin
calls I (the icosahedral group,
a.k.a. A5).
There are many other sources for this material,
some of which take a somewhat different point of view via the
“group algebra” C[G]
of a finite group G (a.k.a.\ the algebra of
functions on G under convolution).
See for instance Chapter 1 of Representation Theory
by Fulton and Harris (mentioned in class). A canonical treatment
of representations of finite groups is
Serre's Linear Representations of Finite Groups,
which is the only entry for this chapter in the list of
“Suggestions for Further Reading” at the end of Artin's book
(see p.604).

While we'll work almost exclusively over C,
most of the results work equally well (though with somewhat different
proofs) over any field F that contains the roots of unity
of order #(G), as long as the characteristic
of F is not a factor of #(G).
Without roots of unity, many more results are different,
but there is still a reasonably satisfactory theory available.
Dropping the characteristic condition leads to much trickier territory,
e.g. even Maschke's theorem (every finite-dimensional
representation is a direct sum of irreducibles) fails;
some natural problems are still unsolved!

Here's an alternative viewpoint on representations of a finite group
G (not in Artin, though you can find it elsewhere, e.g.
Fulton-Harris pages 36ff.): a representation of G over
a field F is equivalent to a module for the
group ringF[G].
The group ring is an associative F-algebra
(commutative iff G is commutative) that consists of the
formal F-linear combinations of group elements.
This means that F[G]
is FG as an
F-vector space,
and the algebra structure is defined by setting
eg1eg2
= eg1g2
for all g1, g2
in G, together with the
F-bilinearity of the product.
This means that if we identify elements of the group ring with
functions G→F then the
multiplication rule is
(f1 * f2)(g)
= Σg1g2=gf1(g1) f2(g2)
— yes, it's convolution again. To identify an
F[G]-module with a representation,
use the action of F to define the vector space structure,
and let ρ(g) act by multipliction by
the unit vector eg. In particular,
the regular representation is F[G]
regarded in the usual way as a module over itself. If we identify
the image of this representation with certain permutation matrices
of order #(G), we get an explicit model of
F[G] as a subalgebra of the
algebra of square matrices the same order. For example, if
G = Z/nZ
we recover the algebra of circulant matrices of order n.

Just as the regular representation is
F[G] as a module over itself,
a subrepresentation of the regular representation is an ideal
of F[G]. The fundamental theorem
of representation theory can be stated as an isomorphism between
F[G] and the direct sum of algebras
End(V) with V ranging over all the
irreducible representations of G. Comparing dimensions
gives the formula
#(G) = ΣV dim(V)2.
The F-linear isomorphism from
F[G] and the direct sum of the
End(V) is a generalization of the discrete
Fourier transform (DFT) to noncommutative groups [check that if G
is commutative we recover the DFT of
Problem Set 10].

In case we haven't officially said it yet:
the unitary group

Un that
figures prominently in Artin's treatment (see the start of 9.2, p.310)
is the subgroup of GLn(C)
that acts by isometries on
Cn
with its usual inner-product norm;
that is, matrices whose inverse equals their own Hermitian transpose.
The unitary group
U(V) = U(V, 〈.,.〉)
of a complex inner-product space
V = (V, 〈.,.〉)
is defined in much the same way; if V is finite-dimensional
then any choice of orthonormal basis identifies
U(V) with
Un.

Artin derives the diagonalizability of ρ(g)
as Corollary 2.3 (p.310) of unitarity plus the Spectral Theorem for
normal operators. This is not necessary because we already know that
a linear transformation that satisfies a polynomial equation with
distinct roots is diagonalizable, and ρ(g)
satisfies such an equation, namely
Xn−1=0
where n is the order of g.
Note that this argument applies more generally to a representation
over any field in which
Xn−1=0
has n distinct roots. Again this leads naturally
to the condition that the characteristic not be a factor
of #(G), i.e. that
#(G) is nonzero in F:
this condition guarantees that also n≠0
in F, because n is always
a factor of #(G) [a corollary of Lagrange's
theorem that the order of any subgroup of G
is a factor of #(G), see e.g.
Artin, (6.11) and (6.12) in Chapter 2, p.58].

Let T be the character table of a group of N elements,
and let T* be its Hermitian transpose.
The first part two parts of Theorem 5.9 in Chapter 9 state in effect
the identity TDT* = NI
where D is the diagonal matrix whose diagonal entries
are the sizes of the conjugacy classes of G.
But then NI is also
DT*T.
This means that the columns of T
are also orthogonal,
and the column indexed by conjugacy class C
has inner product N/#(C) with itself.
[This is an adaptation of the familiar but still remarkable fact that
the rows of a square matrix are orthonormal iff the columns are orthonormal.]
That is, for any group elements g, h we have
Σχ χ(g)χ(h) = 0
(the sum extending over the irrducible characters χ),
unless g and h are conjugate, in which case
χ(g) = χ(h) for all χ
and we get
Σχ|χ(g)|2 = N/#(C)
where C is the conjugacy class of g.
In particular, taking g = id we get
C = {id} and thus recover the fact that
N is the sum of the squares of the dimensions of
the irreducible representations.

Let G be a group of N elements, and
(V,ρ) be
any finite-dimensional representation over a field
in which N is nonzero. Let P be the operator
N−1 Σg∈G ρ(g).
This is a G-endomorphism of V,
i.e. it commutes with each ρ(g), whence
it follows that P is an idempotent: it satisfies
P2 = P. We already know
(PS9 1(i)) that this makes V the direct sum of the
eigenspaces V0 and
V1 of P,
which are respectively the kernel and image of P.
It follows that P has
trace dim(V1).
This is what lets us begin to link characters with irreducibles:
on the one hand this trace is
N−1 Σg∈G χ(g) = 〈χ, 1&rang
where χ is the character of V.
On the other hand V1
is the dimension of the G-invariant subspace
VG of V
(which we can also canonically identify with
HomG(F,V)
where F is regarded as the trivial representation).

To get from this to the dimension of
HomG(V,V’)
for any representations V,V’,
recall the identification of
Hom(V,V’)
(ignoring the G-action)
with the tensor product
V*⊗V’.
Artin doesn't describe the action of Gon V*.
We know how to get from each ρ(g)
a linear operator (ρ(g))*on V*,
but these do not compose in the right order
(unless G is abelian): we have
(ρ(g))*(ρ(h))*
= (ρ(hg))*.
So to get a representation
ρ* of Gon V* we must define
ρ*(g)
= (ρ(g−1))*
for all g. This yields the dual representation
(sometimes called the “contragredient representation”)
(V*,ρ*)of (V,ρ); its character
χ* is the complex conjugate of χ.
This gives Hom(V,V’)
= V*⊗V’
the structure of a representation of G,
and we can check that the invariant subspace is precisely
HomG(V,V’)
[which also justifies Artin's term
“G-invariant homomorphism”
for an element
of HomG(V,V’)].
We then deduce that the dimension of this space is
〈χ*χ', 1〉 = 〈χ', χ〉,
and since that's a nonnegative integer
— and thus a fortiori real —
it is also equal to 〈χ, χ'〉
as claimed.

NB Artin's definition of the inner product on class functions,
or indeed on any functions on G (Chapter 9, (5.8), page 318),
makes this inner product semilinear in the first variable.
This is consistent with his treatment elsewhere in the book
(see Chapter 4, (4.4), page 250), and is occasionally more convenient
— e.g. here we wouldn't have had to finish by observing that
the inner product is real and thus its own conjugate — but is
rarer than the convention that we (following Axler) have been using.
We have to tweak our development accordingly.

The operators introduced by Artin towards the end of his proof of
(most of) Theorem 5.9 have further uses when φ is the
character of an irreducible representation V.
Indeed consider
Pφ := N−1 Σg∈G φ*(g) ρ(g)
acting on any representation W. Since φ is a class
function we again find that Pφ
is a G-endomorphism of W.
Thus (by Schur) if W is irreducible then
Pφ is a multiple of the identity.
The correct multiple can then be identified by computing
the trace of Pφ,
and that's 〈χ, φ〉
where χ is the character of W
(whether or not W is irreducible).
Using the orthogonality relations we deduce that if
W is irreducible then
Pφ = 0,
unless W is isomorphic with V, in which case
Pφ = 1 / dim(V).
It follows that if W is the direct sum of irreducibles
Vi then
dim(V) · Pφ
is the projection to the direct sub-sum of those
Vi that are isomorphic
with V. In particular, this sub-sum is independent
of the decomposition; it is often known as the
V-isotypic subspace of W.
Examples: if V is the trivial representation then
the V-isotypic subspace is the fixed subspace
we called VG, and indeed in this case
Pφ is the average of
ρ(g) over group elements g.
If G is the group of two elements and V its
nontrivial representation then the isotypic subspace is the
(−1)-eigenspace for the non-identity
element g of G, and indeed the associated
projection Pφ is the familiar
(1−ρ(g))/2, just as projection to the
(+1)-eigenspace is the averaging operator
(1+ρ(g))/2.

If you know about integral closures and related ideas (see for instance
the beginning of the summary of field algebra)
then you can also use Pφ
to demonstrate the last part of Theorem 5.9, which Artin punts on
proving: the dimension of every irreducible representation V
is a factor of N = #(G).
Proof: N/dim(V)
is an eigenvalue of the action on the regular representation of
N·Pφ
= Σg∈G φ*(g) ρ(g);
since each of the numbers φ*(g)
is an algebraic integer (it's the sum of roots of unity), it follows
that so is N/dim(V), and since that
quotient is rational we conclude that
N / dim(V) ∈ Z
as claimed.

Our final topic for Math 55a is the
Sylow theorems, covered by Artin in
Chapter 6, Section 4 (pages 205–208).
Corollary 4.3 (a finite group G contains
an element of order p for every prime factor
pof |G|) is
a theorem of Cauchy that predates Sylow;
we'll start by giving its combinatorial proof,
which will introduce the technique that we
(following Artin) will use to prove the Sylow theorems.
Note that Artin's statement of the “second Sylow theorem”
((4.6) on page 206) is not the usual terminology, which regards
Corollary 4.7 as the second Sylow theorem and
(4.6) as a step in one proof of this theorem.

The proof Artin gives for Sylow I can be extended to give the
“1 mod p” part of Sylow III.
Artin shows that the combinatorial coefficient
bin(pem, pe)
he calls N is not a multiple
of p; extending the same argument shows that
N is congruent to
m mod p
(the first factor is m mod p,
and each of the other factors is 1 mod p).
[Simpler yet: work in charcteristic p,
and apply e times the identity
(1+X)p = 1 + Xp
to get
(1+X)pem
= (1 + Xpe)m;
now look at the Xpe
coefficient.]
Since m is also the number of orbits of each Sylow subgroup
on the pem-element subsets
of G, it follows that the number of Sylow subgroups
is congruent mod p to m/m = 1.

Artin's treatment of Sylow II may be easier to read if you just replace
each instance of s by H
(recall that s was introduced as the coset
1H of H).

Third problem set / Linear Algebra III:
Countable vs. uncountable dimension of vector spaces;
linear transformations and duality
NB: The “trickiness” of 2ii has nothing to do with
subtleties of infinite cardinals, the Axiom of Choice, etc.;
all you should need about countability (beyond the definition) is that
a countable union of countable sets is countable (as in Cantor's proof
of the countability of the rationals).

Fifth problem set / Linear Algebra V:
Tensors, eigenstuff cont'd, and a bit on inner products
Problem 5i can be less confusing if you start with compositions of maps
U → V → W
involving three different vector spaces,
and only then set all of them equal V.
Thus if A is in Hom(V,W)
then the map taking X to AX goes from
Hom(U,V)
to Hom(U,W),
i.e. from U*⊗V
to U*⊗W,
so it had better be I⊗A.
Likewise for B is in Hom(U,V)
the map taking X to XB goes from
V*⊗W
to U*⊗W,
so it ought to be B*⊗I
— not B⊗I
because we need a map from V*
to U*. (Fortunately B
and its adjoint operator B* have the same spectrum,
so even if you made this mistake you could still get the rest of
the problem right.)
Of course, once you've surmised the answer
I⊗A + B*⊗I,
you still have to prove it; by linearity, it's enough to check that
the formulas agree on pure tensors, or in convenient coordinates
if you prefer — which is indeed a special case because the usual
coordinates on U*⊗V
come from a basis of pure tensors, namely tensors of vectors in the dual
basis for U* with the basis for V.

Seventh problem set / Linear Algebra VII:
More about the spectrum, including spectral graph theory;
symplectic structures
For 2i, changing each coordinate of a λ-eigenvector v
to its absolute value preserves
〈v, v〉
and does not decrease
〈Av, v〉.
Thus by problem 1 the new vector is also a
λ-eigenvector, and we get such an eigenvector
with nonnegative entries. If μ is a negative eigenvalue,
the same construction shows that the largest eigenvalue is at least
|μ|.
•
For 2ii, show that if vi > 0 but
vj = 0 and
aij > 0
then we can increase the “Rayleigh quotient”
〈Av, v〉
/ 〈v, v〉
by making vj
positive but sufficiently small.
So if not all vi are positive we can let
I be the set of indices i of positive coordinates,
and J its complement, giving a disconnection of A
(note that I cannot be empty because then v is the
zero vector). In the connected case, dimension 1 follows because
else some eigenvector would have both positive and negative entries.
•
Finally, for 2iii if there is a (−λ)-eigenvectorv then changing each coordinate to its absolute value
produces a (+λ)-eigenvector; then if we let
I be the set of indices of positive coordinates of v
then for every positive aij
exactly one of i,j is in I, and conversely
if there is such a subset I of
{1,2,…,n} then changing
each coordinate of a (+λ)-eigenvector
to its negative produces a (−λ)-eigenvector.
•
NB The term “connected”, and the connection with
Problem 5 (see the next paragraph), should suggest
forming a graph with n vertices and an edge for each
i,j such that aij > 0.
The graph is connected iff the matrix is “connected”,
and then the condition of part iii is that this graph be
bipartite. This generalizes the corresponding results for the
adjacency matrix of the same graph.
Note that indeed the hypercube graph of Problem 5
is bipartite and its adjacency matrix has maximal eigenvalue n
and minimal eigenvalue −n.
•
There is an analogous result for matrices with nonnegative entries
that are not required to be symmetric: there is a real eigenvalue
λ such that |μ| ≤ λ for all
eigenvalues μ, even complex ones (which may exist in this generality).
There are also equality conditions in terms of connectedness of an
associated graph, which in this setting must be taken to be a directed graph.
I don't know as slick a proof of this result, at least not without
using more of the topological tools we'll develop in Math 55b.
For 5ii, let x be the vector whose v coordinate is
+1 if v is in S and −1 if not. Then
x is orthogonal to the all-1 vector,
which generates the eigenspace for eigenvalue n.
Since the second-largest eigenvalue is n−2,
it follows that
〈Ax, x〉 ≤
(n−2) 〈x, x〉
with equality iff
Ax = (n−2) x.
This yields the desired bound, with equality iff each vertex
in S has exactly one neighbor outside S,
which in turn is soon seen to be equivalent to the condition that
S is one of the 2n subsets on which one of the
n coordinates is constant.

Ninth problem set / Linear Algebra IX:
More on exterior algebra and determinants
corrected to replace duplicated Problem 4 (ouch)
and fix the Axler quote
corrected again (ouch-prime)
to fix Problem 2:
〈ω,ω〉 = 0, not〈ω,ω'〉 = 0
For problem 2, certainly if
ω = v1 ^ v2
then
ω ^ ω =
v1 ^ v2 ^ v1 ^ v2
= − v1 ^ v1 ^ v2 ^ v2
= 0.
The reverse implication can be proved by direct calculation, but it's
easier to use the fact that if ω is not a pure tensor then it is
of the form
v1 ^ v2 + v3 ^ v4
for some basis
{v1, v2, v3, v4},
and then
ω ^ ω = 2
v1 ^ v2 ^ v3 ^ v4
is nonzero.
For problem 3, choose any basis B for V;
let ψ be the wedge product of all 4n basis vectors, and
use for W a basis of pure exterior products of
2n-element subsets of B.
Then
〈ω,ω'〉 = 0
for all basis elements ω,ω'
unless ω and ω' come from complementary
subsets of B, in which case
〈ω,&omega'〉 = ±1.
So W is the orthogonal direct sum of
r := dim(W)/2two-dimensional subspaces, each with signature (1,1).
Hence W has signature (r, r).
In each subsapce, ω + ω' and
ω − ω' constitute an orthogonal basis,
and the basis vectors have inner products 2 and −2
with themselves. So we can take for W+
the span of the r positive basis vectors, and for
W− the span of the
r negative ones.

Tenth problem set / Linear Algebra X:
Signatures and real polynomials; determinants and distances;
Pontrjagin duality
corrected: in 3(ii), each
xi and yj
must be positive, not just greater than −1/2
[I was thinking of the application to Müntz for which
the matrix entries are of the form
〈tx,ty〉
= 1/(x+y+1),
not 1/(x+y)]
also, corrected 11.xi: in 1(i) the
Pj must be pairwise coprime,
not merely distinct (and then there's no need to require P
and each Pj to be monic)
and, corrected 23.xi (sorry!): in problem 3,
the xi must already be assumed distinct
in part (ii), else the matrix is only positive semidefinite.
⊕ For 1(iii), the point is to find
a polynomial you can compute easily from the coefficients
of P; it's no fair to just write something like
“Πi≤r(x−i)
Πj≤s(x+j), where
P has r positive and s negative roots”:
that's begging the question of determining r and s.
Hint: the polynomial you construct need not have
the same degree as P.
For 8(ii), you're looking for a way to efficiently combine black boxes
that compute DFT's on H and G/H
into a technique for computing DFT's on G
(so that one can recursively construct such a box if G is (say)
a cyclic group of order 2r from DFT's for
tiny groups like the 2-element group).
This should be straightforward if G is simply the
direct product of H and G/H.
But what if (as suggested by the introductory paragraph)
G is a cyclic group of order 2r
and H is its subgroup of order 2, or the subgroup of
order 2r−1 and index 2?

Eleventh and final problem set:
Representations of finite groups
corrected overnight: in problem 3,
Sk is not the same as
kS…
Problem 3 shows one approach and a few typical applications
for a result often known as “Burnside's Lemma”
[though it was already well-known in Burnside's day,
going back at least to Cauchy (1845) according to
this Wikipage]. I thank Lewis Stiller
for pointing out the connection with representation theory
some years back.
The construction illustrated by Problem 4(iii) is the
”Molien series” or ”Hilbert-Molien series”
of the ring of invariants of a representation of a finite group.
(See for example
this Wikipage; as of this writing,
the example there happens to be the same as the one in this problem.)
In general the denominator can always be taken to be a product of
factors 1-td, but it's rare for
the numerator to be simple, let alone as simple as just 1
when we're dealing with the very special case of a
complex reflection group.
For problem 7(iii), there are some alternative solutions for this
particular case, but the intended solution was to show more generally:
if V is a nonzero finite-dimensional representation,
over some field subfield F of C,
of a finite group G, then V is irreducible
if and only if EndG(V)
is a division algebra. Note that this is easy for
F=C by Schur's Lemma:
if V is the direct sum over j of isotypics
Vjmj, then
EndG(V)
is the direct sum of matrix algebras
GLmj(C),
which is a division algebra if and only if
mj=1 for one j and
mj=0 for all other j.
To do it in general, note that if some operator in
EndG(V)
is invertible then its inverse is also a
G-endomorphism of V.
Hence if EndG(V)
is not a division algebra, there is some nonzero
G-endomorphism T of V,
and then the kernel and image of T are subrepresentations
of V other than {0} and V itself.
Conversely, if V is not reducible then it is a direct sum
V1⊕V2
of two nonzero representations, and the projection to either factor
is a nonzero G-endomorphism of V
that is not invertible.

The FINAL EXAM is now on my office door.
Correction (argh...):
The n-by-n matrix A
of the introductory paragraph of the first problem
should have been called An.
Then in part&nbsp(i) take n=2m
to get A2m
[NB notA2n
as I wrote earlier].
Another typo correction:
in 3(iii),
“G has a representation (V,ρ)
whose associated character ρ takes the values…&rdquo
should of course be
“whose associated character χ takes&rdquo etc.