Edit: the original poster is Menny, but the question is CW; the first-person pronoun refers to Menny, not to the most recent editor.

I'm doing an introductory talk on linear algebra with the following aim: I want to give the students a concrete example through which they will be able to see how many notions arise "naturally". Notions such as vector spaces, the zero vector, span, linear dependence and independence, basis, dimension, "good" bases, solving linear equations, and even linear maps and eigenvectors. A related MO question is Linear algebra proofs in combinatorics.

The aim of this post is to find some more "concrete", "real" and "natural" examples in this spirit that can interest everyone who loves what we do (and give them motivation to learn new definitions and formalisms). So if you have some ideas - please post them! Thanks,
Menny

I have started a discussion over on meta that mentions this question, although it is not currently focused solely on this one: tea.mathoverflow.net/discussion/566 The short version: as this question is CW, any user may edit it to improve it, and I think that the question would be improved if OP's original suggestion were moved to an answer. So I will do that.
–
Theo Johnson-Freyd Jul 30 '10 at 23:40

21 Answers

An example that my last class loved was lossy image compression using the singular value decomposition.

The SVD says that the transformation corresponding to any real matrix (not necessarily square) can be decomposed into three steps: a rotation that forgets some dimensions, a stretch along the coordinate axes, and finally a rotation. In other words, every matrix can be written in the form HDA, with the rows of A being orthonormal, the columns of H being orthonormal, and D being a square diagonal matrix with nonnegative nonincreasing entries on the diagonal.

Consider a photograph that is a $768\times 1024$ array of $(red,green,blue)$ triples, which we can just as well store as 3 matrices $R$, $G$, and $B$ of real numbers. Now even though the matrix $R$ has nothing to do with transforming space, we can consider it as such, and using SVD write $R=HDA$. Call the numbers on the diagonal of $D$ by $\lambda_1\geq \lambda_2 \geq \cdots \geq \lambda_s\geq 0$, and let $D_k'$ be $diag(\lambda_1,\dots,\lambda_k,0,0,\dots)$, an $s\times s$ diagonal matrix, and let $D_k$ be $diag(\lambda_1,\dots,\lambda_k)$. Let $H_k$ be the $768\times k$ matrix formed from the first $k$ columns of $H$, and similarly let $A_k$ be the $k\times 1024$ matrix formed from the first $k$ rows of $A$. Then
$$R = HDA \approx HD_k' A = H_k D_k A_k,$$
where the $\approx$ holds by continuity and is a good approximation provided the $\lambda$'s that were replaced with zeros were small.

Now for the punch-line. We need $3\cdot 768 \cdot 1024$ real numbers (about 2 megabytes) initially to store the photograph. But to store $H_k$, $D_k$, and $A_k$ for each of the three colors, we need only $3(768\cdot k+k+k\cdot 1024)=5379 k$ real numbers. With $k=25$, that gives a compression ratio of about 18: the file size drops from around 2 MB to around 130 KB, so it is roughly 18 times faster to transmit the three matrices $H_{25}, D_{25}, A_{25}$ than it is to transmit their product.

SVD is fast enough to compute that you can do this instantly (using Mathematica, say) with a picture of the audience, and they can marvel at their own blurry (but quite recognizable) faces. Also, showing the actual file sizes on disk of the original bitmap and the compressed image is quite impressive. At least, it is if the actual sizes come out close to your calculations.
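If you want to run the demo yourself, here is a rough NumPy sketch of the computation, with a random matrix standing in for one color channel (in class you would load the actual photo, since random noise compresses far worse than a real image):

```python
import numpy as np

# Stand-in for one color channel of the photo; in class, load the real image.
R = np.random.rand(768, 1024)

H, d, A = np.linalg.svd(R, full_matrices=False)    # R = H @ np.diag(d) @ A
k = 25
H_k, D_k, A_k = H[:, :k], np.diag(d[:k]), A[:k, :]
R_k = H_k @ D_k @ A_k                              # the rank-k approximation

stored_original   = R.size                         # 768 * 1024 numbers
stored_compressed = H_k.size + k + A_k.size        # 768*k + k + k*1024 numbers
print("compression ratio:", stored_original / stored_compressed)   # about 17.5
print("relative error:", np.linalg.norm(R - R_k) / np.linalg.norm(R))
```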

What's really impressive about this example (to me, at least) is that the matrix that we start with is just a table of data and not a transformation. But by considering it as if it were a transform, we gain power over it anyway. This is great motivation for linear algebra; students find it much easier to imagine encountering a table of data than a linear transformation.

This of course works with any orthogonal transform. Adaptive transforms like the SVD have the disadvantage that you either need to transmit the basis separately from the coefficients or you need to multiply everything out as you did, in the process destroying much of the sparsity introduced by the zero truncations. The discrete cosine transform is surprisingly close to the KLT in practice and is based on a fixed basis, so you only need to transmit the coefficients; this is what is used in JPEG.
–
Per Vognsen Jul 31 '10 at 8:20

@Per: I don't understand what you mean by needing to transmit the basis. In this case, the matrix starts as a table of numbers, so the `obvious' basis is the correct one. I've rewritten some of what's above to make this more transparent. One thing to be cautious of with arbitrary transforms is that you may need to handle complex numbers, which can take up space and lecture time (and be hard to motivate, depending on the audience). On the up side, if you get to say, "and this is how JPEG works," then you have your finale written for you.
–
Kevin O'Bryant Jul 31 '10 at 16:52

He is referring to the basis of $\mathbb R^{n^2}$, not the basis of $\mathbb R^n$. The basis, in your case, consists of rank $1$ matrices of the form [column of first matrix]*[row of second].
–
Will Sawin Nov 14 '11 at 21:44

My favorite elementary application of linear algebra is proving that the partial fraction decomposition of rational functions used in calculus actually works.

Start with a polynomial $Q(x)=(x-r_1)(x-r_2)\cdots(x-r_n)$ with distinct roots $r_1,\dots,r_n$. Then the space of $P(x)/Q(x)$ with $\deg P < \deg Q$ is $n$-dimensional since it has the basis {$\frac{1}{Q(x)}, \frac{x}{Q(x)}, \frac{x^2}{Q(x)}, \dots, \frac{x^{n-1}}{Q(x)}$}. But {$\frac{1}{(x-r_1)},\frac{1}{(x-r_2)},\dots,\frac{1}{(x-r_n)}$} are $n$ linearly independent vectors in this space and thus also form a basis.

Hence, $\frac{P(x)}{Q(x)}=\frac{A_1}{(x-r_1)}+\frac{A_2}{(x-r_2)}+\dots+\frac{A_n}{(x-r_n)}$ for some constants {$A_1,\dots,A_n$}, which we can then find by taking the limit of $(x-r_i)\frac{P(x)}{Q(x)}$ as $x$ goes to $r_i$.
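If you want to see that limit formula in action, here is a small NumPy sketch (the helper function is just for illustration and assumes distinct roots):

```python
import numpy as np

def partial_fraction_coefficients(P, roots):
    """A_i = lim_{x -> r_i} (x - r_i) P(x)/Q(x) = P(r_i) / prod_{j != i} (r_i - r_j),
    assuming Q(x) = (x - r_1)...(x - r_n) has distinct roots."""
    return [np.polyval(P, r) / np.prod([r - s for s in roots if s != r])
            for r in roots]

# (3x + 5) / ((x - 1)(x - 2)) = -8/(x - 1) + 11/(x - 2)
print(partial_fraction_coefficients([3, 5], [1, 2]))   # [-8.0, 11.0]
```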

Along similar lines, the Gram-Schmidt algorithm applied to the standard polynomial basis $1, x, x^2, \ldots$ with respect to the $\int_{[-1,1]} w(x) p(x) {\bar q}(x) dx$ inner product constructs the family of orthogonal polynomials defined by the weight function $w(x)$.
–
Per Vognsen Jul 31 '10 at 8:27
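For instance, here is a quick SymPy sketch of that construction with weight $w(x)=1$, which reproduces the Legendre family up to scaling:

```python
import sympy as sp

x = sp.symbols('x')
inner = lambda p, q: sp.integrate(p * q, (x, -1, 1))   # weight w(x) = 1 here

# Gram-Schmidt on 1, x, x^2, x^3 with that inner product.
orthogonal = []
for p in [sp.Integer(1), x, x**2, x**3]:
    for q in orthogonal:
        p = p - inner(p, q) / inner(q, q) * q
    orthogonal.append(sp.expand(p))

print(orthogonal)   # [1, x, x**2 - 1/3, x**3 - 3*x/5]: Legendre polynomials up to scaling
```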

@Vladimir, Per: Hey, those are two great and simple ones! I've got to remember those for when I teach linear algebra for the first time!
–
Andrew L Jul 31 '10 at 20:11

This of course also works for polynomials with repeated roots, where you just throw in $1/(x-r_i)^j$ for $1 \leq j \leq k$, with $k$ the multiplicity of the root. Asking students to extend the case of $Q$ with distinct roots to arbitrary $Q$ might be a fun exercise.
–
Vladimir Sotirov Aug 1 '10 at 20:24

Since this question was on the front page anyway, I took the liberty to texify this answer. I hope no one minds.
–
David White Nov 14 '11 at 14:10

Menny's original version of the above question included the following example, which is better placed in an answer, so that it can be voted up and down. Like all answers to any CW question, this one is community wiki. The remainder of this post, unless someone else edits it, consists of Menny's writing, so the first-person pronoun refers to Menny, not Theo.

My example - the Fibonacci sequence! I'll write it in the way I intend to present it; I hope it won't bore you, and that it gives you an idea of the type of example I'm looking for.

Start by defining it. Get them to wonder what the general term is.

Define $F_{a,b}$ to be the Fibonacci sequence that starts with $(a, b, a+b, \dots)$. Emphasize that knowing the first two terms determines all the other terms (but not explicitly! (yet)).

Ask them if there exists a sequence that they "really know", i.e., one whose general term they can give me. (Someone will come up with the zero sequence (the zero vector!).)

Tell them: if I give you the general term of, say, $F_{2,3}$, can you use this information to find the general terms of other sequences? (Hopefully, we will discover that we can multiply by scalars.)

Emphasize this great discovery - a scalar multiple of a Fibonacci sequence is another Fibonacci sequence!

Well, assuming they are given $F_{2,3}$ explicitly, can they get to any other sequence by scalar multiples?

No? OK, so I'll give you another sequence; which one do you want? (Linear dependence...)

Get to the fact that you can also add them!!!

Take $F_{2,4}$. Is this enough? Yes? Well, how do you get to $F_{0,1}$? And to $F_{\sqrt{2},1.5}$? (Solving linear equations!!!)

Well, these $F_{2,4}$, $F_{2,3}$ must be special; if we work hard and find their general terms, we could find the general term of any given Fibonacci sequence!!!!!

What are their main properties? You can't get to one from the other, and with both you can get to every sequence (this is almost the definition of a basis...!).

Can we find three sequences like the last two with similar properties? How would we phrase the property "you can't get to one from the other" for three sequences?

Well, let them show / give as an exercise / show it yourself that this cannot be done.

Ask: do any two sequences with the property that you can't get to one from the other also have the property that you can get to everything with them (using scalar multiplication and addition)?

Ask the reverse question.

Summarize: We've seen a vector space, the fact that one vector cannot span a 2-dimensional space, the fact that 3 vectors are linearly dependent, the fact that 2 linearly independent vectors span and vice versa... and (I didn't write) that the zero vector does not help to span and you can always get to it.

BUT..... this is becoming boring! We didn't find any general term yet and we are just assuming we did. BUT we can now state a very glorious aim: find two linearly independent sequences together with their general terms!

We only "know" two kinds of sequences from high school - let's try arithmetic progressions. Well... it doesn't work.

Let's try a geometric sequence! ...work it out... It works, with $q$ satisfying $q^2=q+1$. At last, a "real" motivation for solving a quadratic equation!

Find a "basis".

Give the formula for $F_{0,1}$!

Summarize - this time dividing the board into a "formal part", which will have words like vector space etc., and a part with the sequences and the "bad definitions" as above.

If you also wish to talk about eigenvectors and give a more "natural" reason for using geometric sequences, tell them that there is another "symmetry"/operation on the sequences: the shift map. Well, a sequence is geometric if and only if it is an eigenvector of the shift map.
Also, you can talk about larger recurrence laws, e.g. $a_n=a_{n-1}+a_{n-2}+a_{n-3}$, and get a 3-dimensional space...
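If you want to check the punchline numerically, here is a rough NumPy sketch: the shift map, written as a matrix on the window $(a_{n-1}, a_n)$, has the roots of $q^2=q+1$ as its eigenvalues, and the resulting basis of geometric sequences gives the closed form for $F_{0,1}$ (Binet's formula):

```python
import numpy as np

# Companion matrix of the recurrence a_{n+1} = a_n + a_{n-1}:
# it shifts the window (a_{n-1}, a_n) to (a_n, a_{n+1}).
S = np.array([[0, 1],
              [1, 1]])

q = np.linalg.eigvals(S)             # roots of q^2 = q + 1
print(np.sort(q))                    # approximately [-0.618  1.618]

# Writing (0, 1, 1, 2, ...) in the basis of the two geometric sequences (q^n)
# gives the closed form for F_{0,1}.
phi, psi = (1 + np.sqrt(5)) / 2, (1 - np.sqrt(5)) / 2
binet = lambda n: (phi**n - psi**n) / np.sqrt(5)
print([round(binet(n)) for n in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```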

I also found such examples in the first chapter of Newman's Analytic number theory book (which is amazing leisure-time reading!!)

This is indeed a very good example. I taught Linear Algebra to Informatics students for a couple of years in the past and one of the tutorial questions was about the page rank algorithm and I dare say it was the most popular.
–
José Figueroa-O'Farrill Jul 30 '10 at 14:15

Well, "linear algebra makes you rich" is a great punchline for a linear algebra class :)
–
Mariano Suárez-Alvarez♦ Jul 30 '10 at 14:28

I'm not sure how much that article simplifies things, but last time I checked, PageRank was based on algorithms involving (spectral) graph theory. This does of course relate to linear algebra, but is a bit more involved.
–
Noldorin Jul 30 '10 at 15:00

My favorite application of linear algebra, as introduced to me by Fan Chung, is Oddtown (which I learned about from a manuscript of Lovasz, but may not be due to him).

The $n$ residents of Oddtown love to form clubs; call the family of these $\mathcal{F}$. If $F_1$ and $F_2$ ($F_1 \neq F_2$) are in $\mathcal{F}$, then $|F_1|$ must be odd (this is Oddtown!) and $|F_1 \cap F_2|$ must be even ($\scriptsize{go\;Oddtown?}$). The question is, how many clubs may these $n$ people form?

Yes, but whether we prefer n singleton clubs or (for n even) n clubs each omitting one person, depends upon how odd the residents are. I would like to see the minutes from a meeting of a club with only one person. Particularly any objections.
–
Eric Tressler Jul 31 '10 at 16:49

The theory of error correcting codes is a very nice and elementary context for introducing linear algebra, assuming the students know $\mathbb{F}_2$. The notion that each message of bit-length $n$ can be encoded as a "vector" over $\mathbb{F}_2$ of dimension $m > n$, using some linear conditions ("linear subspace") so as to provide easy error-checking conditions, should be quite motivating. Concepts such as "linear transforms" (matrices) and "null-spaces" show up naturally when considering the parity check matrix of the code. Etc., etc.
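As a concrete illustration (taking, for example, the classic $[7,4]$ Hamming code), here is a short NumPy sketch of encoding, a single bit error, and syndrome decoding over $\mathbb{F}_2$; the code itself is exactly the null-space of the parity check matrix:

```python
import numpy as np

# [7,4] Hamming code in systematic form: G = [I | P], H = [P^T | I].
P = np.array([[1, 1, 0],
              [1, 0, 1],
              [0, 1, 1],
              [1, 1, 1]])
G = np.hstack([np.eye(4, dtype=int), P])     # generator matrix, 4 x 7
H = np.hstack([P.T, np.eye(3, dtype=int)])   # parity check matrix, 3 x 7

message  = np.array([1, 0, 1, 1])
codeword = message @ G % 2                   # a vector in the row space of G (= nullspace of H)

received = codeword.copy()
received[5] ^= 1                             # flip one bit in transit

syndrome = H @ received % 2                  # nonzero syndrome: received is not in the code
error_position = int(np.where((H.T == syndrome).all(axis=1))[0][0])
print("error at position", error_position)   # 5: the syndrome matches the 6th column of H
```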

I like using the example of magic squares when first going over linear algebra, usually starting with $3\times 3$ squares. They're a nice recreational maths thing that everyone has seen before, but usually not thought about.

When asked for an example, most students come up with something like $\pmatrix{6&1&8\cr 7&5&3\cr 2&9&4}$, remembering a construction from before. When prodded for a second example, someone might suggest rotating or reflecting this example. Once it's suggested that we just want the rows, columns and diagonals to sum to the same thing, and that the numbers don't have to be distinct, someone usually thinks of $\pmatrix{1&1&1\cr 1&1&1\cr 1&1&1}$.

It then usually becomes clear that linear combinations of what we have so far will also work, and this leads naturally into asking how many squares we need in a basis, and so on. (I then ask them to work out the dimension of the space of $n\times n$ magic squares as homework.)
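Here is a rough NumPy sketch of that homework check in the $3\times 3$ case: encode the "all eight line sums are equal" conditions as a linear system and compute the dimension of its solution space.

```python
import numpy as np

# Index the square's entries a[0..8] row by row; each constraint says
# "this line's sum minus the first row's sum is zero".
lines = [[3, 4, 5], [6, 7, 8],             # rows 2 and 3
         [0, 3, 6], [1, 4, 7], [2, 5, 8],  # columns
         [0, 4, 8], [2, 4, 6]]             # diagonals
first_row = [0, 1, 2]

C = np.zeros((len(lines), 9))
for i, line in enumerate(lines):
    C[i, line] += 1
    C[i, first_row] -= 1

print(9 - np.linalg.matrix_rank(C))   # 3: the all-ones square plus two independent "zero-sum" squares
```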

Another "unexpected" use of linear algebra is when they're asked to prove that things like $\sqrt2+\sqrt3$ or $\sqrt2 + \sqrt[3]2$ are algebraic. Many fiddle around until they chance upon an arrangement that works, but they all like it when we show that it's sufficient to take a few powers and say "oh, some combination of those will do". This usually goes down well, as people often like playing with numbers.
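For instance, here is a small SymPy sketch of that trick for $\sqrt2+\sqrt3$: write the powers $1, \alpha, \dots, \alpha^4$ in the basis $1, \sqrt2, \sqrt3, \sqrt6$ and look for a linear dependence among them.

```python
import sympy as sp

alpha = sp.sqrt(2) + sp.sqrt(3)
basis = [sp.Integer(1), sp.sqrt(2), sp.sqrt(3), sp.sqrt(6)]

# Coordinates of an element of Q(sqrt2, sqrt3) in the basis {1, sqrt2, sqrt3, sqrt6}.
def coords(expr):
    d = sp.expand(expr).as_coefficients_dict()
    return [d[b] for b in basis]

M = sp.Matrix([coords(alpha**k) for k in range(5)]).T   # 4 x 5: five powers, four coordinates
dependence = M.nullspace()[0]                           # coefficients c_k with sum c_k alpha^k = 0
print(dependence.T)   # proportional to [1, 0, -10, 0, 1]: alpha^4 - 10*alpha^2 + 1 = 0
```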

Only yesterday I learned about Fisher's inequality, and I think it is a good example to show an application of rank calculations.

The problem is the following:

Fisher, a population geneticist and statistician, was concerned with the design of experiments studying the differences among several different varieties of plants, under each of a number of different growing conditions, called "blocks".

Let:

$v$ be the number of varieties of plants;
$b$ be the number of blocks.

It was required that:

1. $k$ different varieties are in each block, $k < v$; no variety occurs twice in any one block;
2. any two varieties occur together in exactly $\lambda$ blocks;
3. each variety occurs in exactly $r$ blocks.

Fisher's inequality states simply that
$v \leq b$.

And its proof (given below) involves basic linear algebra.

Let the incidence matrix $M$ be a $v\times b$ matrix defined so that $M_{i,j}$ is $1$ if element $i$ is in block $j$ and $0$ otherwise. Then $B=MM^T$ is a $v\times v$ matrix such that $B_{i,i}=r$ and $B_{i,j}=\lambda$ for $i \neq j$. Since $r\neq \lambda$, we get $\det(B)=(r-\lambda)^{v-1}(r+(v-1)\lambda) \neq 0$, so $\operatorname{rank}(B) = v$; on the other hand, $\operatorname{rank}(B) \leq \operatorname{rank}(M) \leq b$, so $v \leq b$.
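As a quick sanity check of the rank argument, here is a NumPy sketch using the Fano plane ($v=b=7$, $k=r=3$, $\lambda=1$) as a stand-in design:

```python
import numpy as np

# Incidence matrix of the Fano plane: 7 varieties, 7 blocks, k = r = 3, lambda = 1.
blocks = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]
M = np.zeros((7, 7), dtype=int)
for j, block in enumerate(blocks):
    M[list(block), j] = 1

B = M @ M.T                           # B[i,i] = r and B[i,j] = lambda for i != j
print(np.linalg.matrix_rank(B))       # 7 = v, and rank(B) <= rank(M) <= b
```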

I have not been able to link directly to the Wikipedia page, so I had to paste the question and answer here. Apologies for that.

Actually assumption 3 is not necessary; if we drop it we get $B_{ii}=r_i$ non-constant, as it was in the MO question in your link. (The conclusion is the same; check my answer there, and the subsequent comments).
–
Pietro Majer Jul 30 '10 at 20:22

Seconded. Determinants and minors can be treated very effectively and intuitively from the dual combinatorial and geometric perspective of exterior algebra. It's a shame this isn't done more often in introductory courses.
–
Per Vognsen Jul 31 '10 at 12:54

I think that interpolation is a great application to present!
–
Victor Protsak Jul 31 '10 at 21:42

For interpolation, one great and underappreciated method in linear algebra is polarization of forms. The algorithm of de Casteljau for Bezier interpolation and de Boor's algorithm for B-spline interpolation are special cases. It gives you a numerically robust and geometrically intuitive evaluation algorithm that proceeds directly from the control points by iterated linear interpolation.
–
Per Vognsen Aug 1 '10 at 4:42

An abstract but still elementary application is that every field is a vector space over any of its subfields. In particular, every finite field $F$ is a vector space over its prime field, and so $|F| = p^n$ for some $n$ where $p$ is the characteristic of $F$. The same style of reasoning applied to finite extensions of $\mathbb{Q}$ gives negative solutions to the ancient problems of duplicating the cube and trisecting the angle with ruler and compass.

Galois theory has plenty of deeper applications of linear algebra to the study of field extensions. But the ones I mentioned are easily enough accessible that they could serve as motivation for the abstract approach to linear algebra.

I recommend the use of examples from linear geometry applied to computer graphics. All the basic notions of linear algebra can be easily visualized (in fact I recommend starting with Euclidean affine geometry). See the series "Graphics Gems" for specific examples: http://books.google.es/books?id=fvA7zLEFWZgC (there are also a lot of textbooks). In my opinion it is the best option to see linear algebra in action.

A beautiful example of an application of linear algebra in linear PDEs is the theory of harmonic functions.
With linear algebra and very little analysis one completely characterizes, e.g., the space of "spherical harmonics", the eigenfunctions of the spherical Laplacian, which are the $n$-dimensional analogue of the trigonometric functions sin and cos.

A more elementary application along these lines is the use of linear algebra to express sin and cos of multiple angles as trigonometric polynomials (this can be further adjusted depending on the sophistication level of the audience).
–
Victor Protsak Jul 31 '10 at 21:44

Here's a fun problem from a recent linear algebra exam at my university.

While at university, all students are either in class, in the library, or at the bar. Detailed research by university management has shown that if a student is in class one minute, then after five minutes the student has a 60% chance of still being in class, a 20% chance of being in the library, and a 20% chance of being at the bar. Similarly, if the student is in the library at a certain time, then he or she has a 30% chance of being in class in five minutes' time, a 40% chance of remaining in the library, and a 30% chance of being in the bar. Finally, if the student is in the bar, then there is a 10% chance that he or she will be in class in five minutes' time, a 10% chance of being in the library, and an 80% chance of staying in the bar. What percentage of students do you expect to be in the bar after a long time?

So it's a Markov chain problem, which can be used to motivate matrices, vectors, matrix multiplication, eigenvalues and eigenvectors.
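For the record, here is a rough NumPy solution via the eigenvector for eigenvalue $1$; the long-run answer comes out to about $6/11 \approx 55\%$ of students in the bar.

```python
import numpy as np

# Transition matrix: rows are the current location (class, library, bar),
# columns the location five minutes later.
P = np.array([[0.6, 0.2, 0.2],
              [0.3, 0.4, 0.3],
              [0.1, 0.1, 0.8]])

# The long-run distribution is the left eigenvector of P for eigenvalue 1.
eigenvalues, eigenvectors = np.linalg.eig(P.T)
stationary = np.real(eigenvectors[:, np.argmax(np.real(eigenvalues))])
stationary /= stationary.sum()
print(stationary)   # approximately [0.273, 0.182, 0.545], i.e. 3/11, 2/11, 6/11
```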

The only reservation I have about this example is that the Rubik's clock puzzle is unfortunately nowhere near as fun as Rubik's cube. Not only is it obscure, but it's basically impossible to look at both sides of the clock at once. Also, the specimens I've seen are not very well constructed and it's hard to turn the wheels.

Despite all that, I personally enjoyed solving Rubik's clock a lot, and a significant part of the fun was discovering that it was a linear algebra problem. (This was back when the puzzle first came out; I was still an undergraduate, and linear algebra was still relatively new to me.)

This article gives a nice connection between linear algebra and calculus, i.e. it explains how the fundamental calculus operations of differentiation and integration can be understood instead as linear transformations; it should be easy to follow and gives some fascinating insights.
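As a small illustration of that idea (my own toy example, not taken from the article), here is differentiation on polynomials of degree at most $3$, written as a matrix in the basis $1, x, x^2, x^3$:

```python
import numpy as np

# d/dx on polynomials of degree <= 3, in the basis 1, x, x^2, x^3:
# it sends x^k to k x^(k-1), so the matrix has 1, 2, 3 on the superdiagonal.
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3],
              [0, 0, 0, 0]])

p = np.array([5, 0, -1, 2])   # coefficients of 5 - x^2 + 2x^3
print(D @ p)                  # [ 0 -2  6  0], i.e. -2x + 6x^2
```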

It's already totally obvious by Cauchy's theorem. I usually hear this problem stated differently, although I can't quite remember exactly what is supposed to be shown.
–
Qiaochu Yuan Aug 1 '10 at 1:41

Zen, how exactly is this example obvious to a general audience of beginners with only the barest essentials of linear algebra? Particularly one that hasn't seen group theory yet? Mathematicians often have trouble remembering what it was like to be a rank beginner without many tools yet in their box. I got the idea those were the kinds of audiences Menny originally had in mind, and this one will go way over their heads.
–
Andrew L Aug 1 '10 at 4:41

One of my favorite elementary applications is the classification of projective conics by invoking the spectral theorem on a polarized quadratic form. It makes short work of what would at first glance seem like a messy problem.

You write the real projective conic via a homogeneous quadratic form and polarize it to get a $3 \times 3$ symmetric matrix. The spectral theorem puts that into the form $Q^T\ D\ Q$ for an orthogonal matrix $Q$ and diagonal matrix $D$. You then absorb the square root of the nonzero diagonal magnitudes into the $Q$ factors (which generally makes them non-orthogonal but still invertible). That leaves diagonal entries that are either -1, 0 or +1. Now you just have to analyze the possible combinations of signs.
–
Per Vognsen Jul 31 '10 at 18:22
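For instance, taking the (arbitrary) conic $x^2+4xy+y^2-z^2=0$, here is a quick NumPy sketch of those steps; the signs of the eigenvalues of the polarized matrix already give the projective type:

```python
import numpy as np

# Polarized matrix of the quadratic form x^2 + 4xy + y^2 - z^2:
# Q(v) = v^T A v with A symmetric (off-diagonal entries are half the cross coefficients).
A = np.array([[1, 2,  0],
              [2, 1,  0],
              [0, 0, -1]])

eigenvalues, Q = np.linalg.eigh(A)   # spectral theorem: A = Q diag(eigenvalues) Q^T
print(np.sign(eigenvalues))          # [-1. -1.  1.]: one positive, two negative eigenvalues,
                                     # so the conic is a nondegenerate real conic
# Rescaling the columns of Q by 1/sqrt(|eigenvalue|) reduces the diagonal entries to -1, 0 or +1.
```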

As important as the Fibonacci sequence and its related sequences are in the overall hierarchy of functions, it just doesn't come naturally to most beginners, and the depth of its many interconnections with diverse areas of mathematics will be lost on most of them. Your examples are clever, but unless your audience is made up of very strong undergraduates with math competition experience, and therefore quite a bit of familiarity with counting problems, I seriously doubt they will be met with anything but the sound of chirping crickets...

Geometry and physics are much more familiar to a general mathematical audience, and linear algebra has so many connections with these topics that it's really much more natural to start with those. These are my favorite examples to give. Describe planes through the origin in $\mathbb{R}^3$ as linear subspaces of $\mathbb{R}^3$, add vectors displaying their parallel lines, give isometries as examples of linear transformations and then construct matrices for them with respect to several possible bases. Show the fundamental theorem of systems of linear equations geometrically (i.e. that the corresponding lines can be parallel, intersecting or coincident). And then discuss similarities and their corresponding row vectors as eigenvectors of the corresponding eigenspaces. And then you can solve systems of differential equations as your last magic trick.

To prepare for the lecture, I'd look at Linear Algebra Through Geometry by Thomas Banchoff and John Wermer, as well as the classic Linear Algebra and Its Applications by Gilbert Strang. There are lots of good ideas and examples in these books to guide you in preparing this talk.

If you want a lot of very nice specific examples to use in your talk, there's a terrific discussion and application of convergent sequences of diagonalizable stochastic matrices, used to solve problems such as the likelihood of graduation of students at a community college and the proportion at any given time of city and rural dwellers in a populated area undergoing mass migration, in Section 5.3 of the 4th edition of Steven H. Friedberg, Arnold J. Insel and Lawrence E. Spence's Linear Algebra.

1. I don't see how on earth you deduced that someone who used the Fibonacci sequence as an example was into combinatorics. 2. The examples you listed are the standard trivial examples you would teach in a beginning undergraduate course in linear algebra. They definitely are not the sorts of fun, novel examples the author is looking for. 3. I don't see how it is useful to list a couple of standard textbooks in linear algebra.
–
Andy Putman Jul 30 '10 at 20:42

I downvoted the response because it led with the first paragraph, which is all unjustified opinion, much of it dismissive of the OP. What followed didn't change my mind. Moreover (since you asked), I find all the spelling and punctuation mistakes unhelpful, and I wonder why you can't take a little extra time to fix these.
–
Pete L. Clark Jul 31 '10 at 1:04

@Andrew: Why don't you fix the spelling and punctuation mistakes this time? By not doing so, you are creating the impression that you don't take the process very seriously.
–
Pete L. Clark Jul 31 '10 at 5:57

@Andrew : Many students see the Fibonacci sequence in high school. I certainly did, and I went to a pretty lousy public school in Georgia. I don't know why you think it is an advanced topic...
–
Andy Putman Jul 31 '10 at 6:36

@Victor P: I had edited a lot of Andrew L's answers to fix his spelling and punctuation. I was disappointed that he was not putting in effort. The authors of other questions with poor writing habits stop after very few questions, and such occasional instances are not worth the effort of reform.
–
Anweshi Jul 31 '10 at 20:17

Question: How can we determine the components of the graph using this matrix?

Note that the vector $\begin{bmatrix} 1 & 1 & 1 & 1 & 0 & 0 & 0 \end{bmatrix}^T$ is in the nullspace of $L$ and this vector corresponds to the first component. Can you find a second vector in the nullspace? In general, these vectors associated with the components form a basis for the nullspace (and this isn't difficult to prove). So if you find the basis for $N(L)$, you've found the components of the original graph.
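Here is a NumPy sketch with a small stand-in graph on 7 vertices whose components are $\{1,2,3,4\}$ and $\{5,6,7\}$ (any graph with these components works the same way):

```python
import numpy as np

# Stand-in graph on 7 vertices (0-indexed) with components {0,1,2,3} and {4,5,6}.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5), (5, 6)]
A = np.zeros((7, 7))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A        # graph Laplacian: degrees on the diagonal minus adjacency

ones_first = np.array([1., 1, 1, 1, 0, 0, 0])
print(L @ ones_first)                 # the zero vector: the indicator of a component lies in N(L)
print(7 - np.linalg.matrix_rank(L))   # dim N(L) = 2 = number of components
```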

In real life, graphs aren't as simple as the one pictured above. In fact, the graph may consist of one giant component with tightly clustered "approximate components" embedded within. (See any of the images in this search.) And if the graph does have a lot of components, there are more computationally efficient methods of finding them. So why introduce the graph Laplacian? It turns out that the graph Laplacian is a basic object in the field of spectral clustering, which has numerous "real life" applications. (I get to tell the students that I actually used the technique at a previous job while analyzing a large dataset.) This discussion can lead to spectral graph theory.