January 2011

Sarah Flannery became famous at the young age of 16 when her cryptography algorithm was speculated to be a far better alternative to the widely adopted RSA algorithm. She presented the algorithm at the Young Scientist Competition with no fanfare, and she won. This book would not have been written but for one reason: the competition result got picked up by the “London Times”, and an article appeared on the front page with a nerdy picture and a catchy title, “Sarah Flannery, 16, who baffled the judges with her grasp of cryptography”. Subsequently “Reuters” published the story, which put Sarah, her family and her life till then under the public eye. She was inundated with calls from journalists, TV shows, firms offering her employment, VCs wanting to give seed capital, universities asking her to give talks to their students, etc.

This book is Sarah's attempt to shed light, for a far wider audience, on her mathematical journey and the various factors that helped her create the algorithm. The book is co-authored by her father, David Flannery.

Sarah Flannery hails from the village of Blarney, County Cork, Ireland, where she attended a local girls' school, a very ordinary school by the usual standards. Her father was a math lecturer at Cork Institute of Technology. What brought her to the world of math? Sarah credits her father for introducing her to it in a unique way, “using puzzles”. David Flannery believed that math needs to be enjoyed for one to be creative. One way to enjoy math is to solve puzzles which bring out mathematical concepts. So he cultivated this puzzle culture at home.

At a blackboard in the kitchen, father and daughter would discuss math puzzles and, like a good coach, Sarah's father merely provided directions and subtle cues and almost never gave the entire solution at once. This culture of solving puzzles and engaging one's mind over them was the main reason Sarah developed a liking for math. It made math more tangible and interesting. Also, Sarah's father never let go of an opportunity to extend a puzzle once it was solved. He would turn the solved puzzle into another, slightly more difficult one, and scribble it on the blackboard. What he was doing through this culture at home was basically trying to inculcate the axiomatic method of mathematics, where you build up theories over a nice structure of axioms, theorems and lemmas. For Sarah, learning through puzzles was one of the fundamental ways to understand math.

In the book she says

Puzzles, like humour , have an universal appeal and know no boundaries – cultural , educational or otherwise. People of all ages and levels of education are attracted to the puzzle as they are to the joke. In a sense, there is an affinity between the two in that a vital ingredient of both is the element of surprise. No problem is worthy of the name “puzzle” if its solution is obvious, just as the joke whose punch line is easily anticipated is soon forgotten. The true puzzle should be accessible to all, its solution should require no specialist knowledge other than, at times, the rudiments of arithmetic and algebra. It is perhaps, the unconscious feeling that we all start out equal that gives puzzles their charm

Transition Year

This book talks about a specific aspect of the Irish educational system that I found very interesting. After the Junior Cycle, students in Ireland can opt for a one-year Transition Year Programme before moving on to the two-year Leaving Certificate Programme. The purpose of the transition year is described as follows on an Irish school site.

The overall mission of the Transition Year Programme is to promote the personal, social, educational and vocational development of students and to prepare them for working life. The course is designed to cater for the needs and capacities of students within a framework of broad general education which would have a substantial academic base. The course also includes a wide range of practical applications, activity-based learning, team and group work and work experience.

The Transition Year Programme acts as a "bridge" from the highly structured environment of the Junior Cycle and to the much more independent, responsible and mature attitude towards work, study, school and interpersonal relationships of the Leaving Certificate Programme.

Almost every parent in India is anxious at the time his/her kid moves from school to college. From a highly structured and regulated environment, the kid moves to a college environment where the learning is typically self-regulated and semi-structured. Most kids cannot handle this sudden burst of freedom and mostly fail to use it constructively. This transition year concept sounds very appealing to me and it is definitely an idea that can be explored in the context of the Indian education system. Imagine a scenario where the kid, after 10th class, has one year to explore his/her interests, read up on varied fields, take up mini community projects, explore some programming skills, experiment with stuff, etc., with no pressure of performance. It would be immensely beneficial to parents too, as they can get an idea of what their kids truly enjoy. They can then guide the kids in that direction. A one-year break at the right time in a person's life might do wonders and at least increases the chance of finding interest areas. However, I see resistance to this idea in the Indian educational system, where everybody is in a hurry to graduate out of some place or the other :). I hope at least a few parents encourage their kids to take a year off and explore stuff.

It comes as no surprise that Sarah found her calling during the transition year, where she developed a liking for number theory and the Mathematica package. Sarah spent the entire transition year learning about prime numbers, their distribution, their properties, etc. By the end of the transition year, she had pretty much decided that she would explore the field of cryptography alongside her college education.

Sarah then covers the essential math needed to understand her improved algo. The details need no prerequisite knowledge. She goes over the following stuff:

1. How were enciphering and deciphering systems created in the olden times ?

The basic math behind enciphering and deciphering text depends on the modulus function. It starts off with the Caesar system, which can be summarized as “C = (P + s) mod 26, P = (C - s) mod 26”. Here P is a plaintext letter and C is the corresponding ciphertext letter, each coded as a number from 0 to 25 for the 26 letters of the alphabet. This is essentially a single key system, meaning the only thing needed to crack the system is the key s.
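To make the shift formula concrete, here is a minimal Python sketch of my own (it is not taken from the book). It assumes the letters A to Z are coded as 0 to 25, and the function names are just illustrative.

```python
def caesar_encipher(text, s):
    """Apply C = (P + s) mod 26 to every letter A-Z; leave other characters alone."""
    out = []
    for ch in text.upper():
        if "A" <= ch <= "Z":
            p = ord(ch) - ord("A")                    # letter -> number 0..25
            out.append(chr((p + s) % 26 + ord("A")))  # shifted number -> letter
        else:
            out.append(ch)
    return "".join(out)


def caesar_decipher(text, s):
    """Apply P = (C - s) mod 26, i.e. shift back by the same key."""
    return caesar_encipher(text, -s)


message = "ATTACK AT DAWN"
secret = caesar_encipher(message, 3)
print(secret)                      # DWWDFN DW GDZQ
print(caesar_decipher(secret, 3))  # ATTACK AT DAWN
```

With only 26 possible values of s, trying every key by hand is enough to break it, which is why the book moves on to richer systems.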

The above system can be refined as follows: “C = (m1*P) mod 26, P = (m2*C) mod 26”. This is a two key system where, instead of an additive shift, there is a multiplicative shift m1, and m2 is the multiplicative inverse of m1 mod 26. Even for such a simple system one needs to know how to find that inverse, which exists only when m1 shares no common factor with 26. The appendix provides Euclid's algorithm for generating the multiplicative inverse of m1 mod 26.
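Here is a small sketch of how the inverse can be found with the extended form of Euclid's algorithm. This is my own illustration and not the appendix's listing; the key m1 = 7 is an arbitrary choice that happens to be coprime to 26.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def mod_inverse(m, n):
    """Return m2 with (m * m2) % n == 1, or raise if no inverse exists."""
    g, x, _ = extended_gcd(m % n, n)
    if g != 1:
        raise ValueError(f"{m} has no inverse mod {n}")
    return x % n


m1 = 7                     # illustrative key, coprime to 26
m2 = mod_inverse(m1, 26)
print(m2)                  # 15, because 7 * 15 = 105 = 4 * 26 + 1

# Every letter code survives the round trip P -> C -> P.
round_trip_ok = all((m2 * ((m1 * p) % 26)) % 26 == p for p in range(26))
print(round_trip_ok)
```

A key like m1 = 13 fails here because it shares the factor 13 with 26, so two different letters would encipher to the same letter.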

The above system can again be refined as follows: “C = (m1*P + s) mod 26, P = m2*(C - s) mod 26”. This is a three key system with both an additive shift s and a multiplicative shift m1, where m2 is again the multiplicative inverse of m1 mod 26.

The above system can again be refined as follows: “C = (m1*P + s) mod 17576, P = m2*(C - s) mod 17576”. Here the key space is enlarged by enciphering trigrams, since there are 26*26*26 = 17576 of them. At a broader level, there is a one-to-one function mapping P to C, and hence the components of such a system can be summarized as keys, a key space and an invertible function.
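A rough sketch of the trigram version, written with decryption as P = m2*(C - s) mod 17576 so that the shift is undone before the multiplication. The keys 4321 and 1234 are hypothetical values I picked; the only requirement is that m1 has no common factor with 17576 = 2^3 * 13^3. The three-argument `pow` with a negative exponent needs Python 3.8 or later.

```python
N = 26 ** 3  # 17576 possible trigrams


def trigram_to_number(trigram):
    """Read a three-letter block AAA..ZZZ as a base-26 number 0..17575."""
    a, b, c = (ord(ch) - ord("A") for ch in trigram)
    return a * 676 + b * 26 + c


def number_to_trigram(number):
    a, rest = divmod(number, 676)
    b, c = divmod(rest, 26)
    return "".join(chr(v + ord("A")) for v in (a, b, c))


m1, s = 4321, 1234    # hypothetical keys; m1 must be coprime to 17576
m2 = pow(m1, -1, N)   # multiplicative inverse of m1 mod N


def encipher(trigram):
    return number_to_trigram((m1 * trigram_to_number(trigram) + s) % N)


def decipher(trigram):
    return number_to_trigram((m2 * (trigram_to_number(trigram) - s)) % N)


secret = encipher("CAT")
print(secret, decipher(secret))
```

The structure is exactly what the paragraph above describes: keys (m1, s), a key space, and a function that can be inverted only if you know the keys.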

2. What sort of Arithmetic is used in Cryptography ?

3. Why are one-way functions & trap-door functions important ?

4. What is Public Key Cryptography and What is the concept behind RSA algorithm ?

With this background provided in the book, Sarah resumes her story. While working as an apprentice at Baltimore Technologies, Sarah meets Dr. William White, who shows her a paper by his colleague Dr. Michael Purser. Sarah picks up a lot of stuff from the paper and then tries to come up with an alternative algo to RSA, an algo which is 30 times faster than RSA and which exploits the non-commutative property of matrices. Basically, RSA uses exponentiation whereas Sarah's algo uses multiplication operations.

The last part of the book shows the various phases of preparation that Sarah goes through before winning the Young Scientist award. She decides not to patent the algo and opens it up for peer review. Very soon she finds that there are cracks in the algo and that all the publicity she got was unwarranted. She realizes that the media will jump on anything saucy and does not actually care about the veracity of the news item. Meanwhile, the cracks in her algorithm bring her back to sanity and to a life where she can do further research in math. Her decision not to patent the algo was actually pivotal, as it helped her see the chinks in her math and thus provided the much needed breathing space to continue research in cryptography.
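To see the two kinds of arithmetic side by side, here is a toy sketch. The first half is textbook RSA with tiny primes (61 and 53, a standard classroom example and far too small for real use). The second half only shows that 2x2 matrix multiplication mod n is non-commutative; it is NOT Sarah's algorithm, whose details I am not reproducing here. It needs Python 3.8+ for `pow(e, -1, phi)`.

```python
# Toy RSA: everything rests on modular exponentiation.
p, q = 61, 53
n = p * q                      # public modulus, 3233
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent, 2753

message = 65
cipher = pow(message, e, n)    # encipher: message^e mod n
recovered = pow(cipher, d, n)  # decipher: cipher^d mod n
print(cipher, recovered)


# Matrix arithmetic mod n: cheap multiplications, and order matters.
def matmul_mod(A, B, modulus):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % modulus
             for j in range(2)] for i in range(2)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
print(matmul_mod(A, B, n))     # [[2, 1], [4, 3]]
print(matmul_mod(B, A, n))     # [[3, 4], [1, 2]]
```

The exponentiations are the expensive part of RSA, which gives a feel for why a scheme built on a handful of matrix multiplications could be much faster.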

Takeaway :

This book is a great story of a youngster who uses her time to learn and implement an algorithm to challenge the prevalent RSA algo, which was originally designed by 3 MIT guys. A school kid having the guts to challenge the RSA algo definitely makes for an inspiring read. It goes on to show that if you have the right attitude towards math, you can churn out kick-ass stuff, irrespective of your age.

There is a need to visualize higher dimensional spaces in various applied math problems, though one cannot give any physical meaning to such higher dimensional spaces. Our inability to see anything more than 3 dimensions does not mean we cannot visualize and understand multidimensional space. For a superb account of ways to visualize higher dimensional space, one can read “Flatland”, the classic book about a 2 dimensional world, which describes the experiences of a 2 dimensional square trying to come to terms with a 3 dimensional world. After reading “Flatland”, one becomes more open to understanding multi-dimensional spaces. Developing abstract thinking is the key to doing math/modeling, and linear algebra, being the study of the multi-dimensional world, becomes very important for enhancing abstract thinking.

One typically comes across matrices at the high school level or in some basic math course. Some even go through a linear algebra course from a computational angle, i.e., given a problem, one knows how to solve it, be it addition, multiplication or inversion of matrices, or using QR/SVD/LU decompositions. In such an approach, though one develops analytic skills, the crucial thinking skills are lost. What do I mean by this? Well, for example, given two matrices A and B, a high school kid or college student can probably multiply the matrices the row-times-column way and produce C. He goes on to solve a lot of such multiplication problems and pats himself on the back for being able to solve any matrix multiplication problem in a jiffy. With computers today, he does not even need to know the odd row-times-column thing that is the crux of matrix multiplication. He punches the matrices into a calculator or computer, and the computer throws out C, the product of A and B.

Ask the same student, “Why does matrix multiplication involve row times column operations? Doesn't it seem odd? What is the motivation?” In all probability he is going to draw a blank. Many times when instructors come to the point of talking about matrix multiplication, there is some murmur in the class about its odd behaviour. Not many ask the reason for such a rule. They take it for granted and go through the grind of doing tons of exercises. Despite solving a lot of exercise problems and cracking a few exams, what is lacking is an understanding of the principles behind something as simple as matrix multiplication. This is the sad story of most of the math education that is happening the world over. It is definitely the state of the Indian education system, as the students are being trained to be workhorses and they eventually become better at solving questions than at posing interesting questions and then solving them. As one of my friends quips, “We are good at solving a question paper set by others. We are seldom good at setting our own question paper and solving it.” Very true!

Books such as these should be read by teachers the world over, starting from the high school level. The insights from such books should be used in class so that a student can see the motivation behind the various things learnt. Once an instructor provides motivation for the subject, the teaching material becomes a supplement to a specific direction. The thing that is actually taught doesn't really matter, as the student is likely to figure out stuff by himself. All the student needs is the right motivation and direction.

Anyways, coming back to my motivation to read this book. I was trying to understand the usage of Hilbert spaces in the context of the Lebesgue integral and felt a great need to understand inner product spaces thoroughly. They are explained very well in this book. Instead of merely talking about the way inner product spaces are dealt with in this book, I will attempt to summarize the entire book, for two reasons. First, this is THE BEST MATH book that I have read till date. Secondly, it deserves a full summary rather than a piecemeal, highlighter-style summary. Let me get started.

Chapter 1 : Vector Spaces

The chapter defines vector spaces and spells out their properties. A vector space is a set V along with an addition on V and a scalar multiplication on V such that the following properties hold: commutativity, associativity, additive identity, additive inverse, multiplicative identity and distributive properties.

Properties of Vector Spaces

A vector space has a unique additive identity

Every element in the vector space has unique additive inverse

0 v = 0 for every v belonging to vector space

a0 = 0 for every scalar a

(-1 )v = -v for every v belonging to vector space

Vector spaces by themselves could be very large and hence the concept of a subspace is introduced. A subspace U of V is a subset of V that contains the additive identity and is closed under addition and scalar multiplication.

The concept of direct sums is introduced subsequently. Suppose that U1, U2, …, Un are subspaces of V. Then V is the direct sum of U1, U2, …, Un if every element of V can be written as a sum u1 + u2 + … + un with each uj in Uj, and the only way to write 0 as such a sum is to take every uj equal to 0.

Chapter 2 : Finite Dimensional Vector Spaces

This chapter introduces the concepts of span, linear independence, basis and dimension. The span of a list of vectors is the set of all their linear combinations, and a list spans V if every vector in V can be produced this way. A list is linearly independent if the only way to combine its vectors to produce the 0 vector is to take every coefficient equal to 0. A spanning list is always at least as long as a linearly independent list. A list that is both spanning and linearly independent is a basis of the vector space. The length of a basis is termed the dimension of the vector space.
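A small sketch of these definitions for vectors in R^3, using exact fractions and Gaussian elimination. The book itself works abstractly and does not compute things this way; the helper names are mine. A list is independent when its rank equals its length, spans R^3 when its rank is 3, and is a basis when both hold.

```python
from fractions import Fraction


def rank(vectors):
    """Rank of a list of vectors, by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def is_linearly_independent(vectors):
    return rank(vectors) == len(vectors)


def spans(vectors, dim):
    return rank(vectors) == dim


def is_basis(vectors, dim):
    return is_linearly_independent(vectors) and spans(vectors, dim)


too_short = [(1, 0, 0), (0, 1, 0)]                      # independent, does not span
too_long = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]  # spans, not independent
just_right = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]           # a basis of R^3

for name, vs in [("too_short", too_short), ("too_long", too_long), ("just_right", just_right)]:
    print(name, is_linearly_independent(vs), spans(vs, 3), is_basis(vs, 3))
```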

Chapter 3 : Linear Maps

One does not usually find the term “Linear Map” in books on linear algebra. It is refreshing to see the concept of operators being introduced using linear maps. Basically, one can call anything that maps from a space V to a space W a linear map if it satisfies the additivity and homogeneity properties. The chapter subsequently gets into the vector space of linear maps by defining addition and scalar multiplication of maps.

The null space and range of a linear map are defined along with various properties. An invertible map is then shown to be one which is both injective and surjective. Books often invoke the argument of the determinant being 0 for non-invertibility; the approach using linear maps is more elegant. The highlight of this chapter is that it shows the motivation behind matrix multiplication. “Why do we multiply two matrices the way we multiply them?” is neatly answered by showing that the rule is a by-product of composing two linear maps. The chapter then goes on to define an operator, which is basically a linear map from a space to itself. This chapter is the key to understanding everything in the book.
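A tiny numerical check of that idea, in plain Python with made-up matrices. If T maps R^2 to R^3 and S maps R^3 to R^2, then applying T and then S to a vector gives the same result as applying the single matrix ST obtained by the row-times-column rule. Put differently, column j of ST is just S applied to column j of T.

```python
def apply(M, v):
    """Apply the linear map with matrix M to the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def matmul(A, B):
    """Row-times-column product of A and B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


T = [[1, 2],
     [0, 1],
     [3, 0]]        # matrix of a map from R^2 to R^3 (illustrative numbers)
S = [[1, 0, 2],
     [0, 1, 1]]     # matrix of a map from R^3 to R^2

v = [5, 7]
ST = matmul(S, T)

print(apply(S, apply(T, v)))   # first T, then S: [49, 22]
print(apply(ST, v))            # one step with the product matrix: [49, 22]

# Column j of ST equals S applied to column j of T.
columns_of_T = [list(col) for col in zip(*T)]
columns_of_ST = [list(col) for col in zip(*ST)]
print(columns_of_ST == [apply(S, col) for col in columns_of_T])
```

So the odd-looking rule is simply the bookkeeping that makes "matrix of the composition" equal "product of the matrices".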

Chapter 4 : Polynomials

This chapter does not contain any linear algebra specific stuff but contains some basic theorems about real and complex polynomials which are useful to the study of linear maps from a space to itself.

Chapter 5: Eigen Values and Eigen Vectors

Vector spaces, subspaces, span, linear independence, dimension and linear maps are all concepts that serve as a prelude to “eigen vectors and eigen values”, a fascinating topic of linear algebra. They arise in the context of invariant subspaces, meaning subspaces that an operator maps into themselves. The book makes a simple but profound statement, with which any undergrad student, math PhD or math faculty member would concur.

A central goal of linear algebra is to show that, given an operator T on V, there exists a basis of V with respect to which T has a reasonably simple matrix. To make it more concrete, the goal is to find matrices of linear maps with as many 0's as possible.

When one looks at the matrix representation of a linear map, everything in the world of matrices starts becoming clear. This chapter makes the connection between matrices and eigen values. Well, why should the matrix representation of a linear map have any connection with the eigen values? Why should all of the diagonal elements of an upper-triangular matrix be non zero for the map to be invertible? What happens when the matrix of the map is diagonal? What can you say about the basis of the vector space if the matrix is diagonal? All these questions are carefully answered in this chapter.

The highlight of this chapter is the proof of the statement that every operator on a finite dimensional non zero complex vector space has an eigen value. The proof does not mention determinants, the characteristic equation, blah blah. Instead it uses the concept of linear independence to prove the theorem. It is an absolute charm when compared to what one finds in other books, where intuition is lost after reading through pages and pages of proof.
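Here is a quick sanity check of two of those questions on a made-up upper-triangular matrix. Its eigen values are exactly the diagonal entries 2, 3 and 0, and because one diagonal entry is 0 there is a non zero vector that gets sent to 0, so the map cannot be inverted. The eigen vectors below were worked out by hand for this particular matrix.

```python
def apply(M, v):
    """Apply the linear map with matrix M to the vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


M = [[2, 1, 0],
     [0, 3, 5],
     [0, 0, 0]]   # illustrative upper-triangular matrix

# (eigen value, eigen vector) pairs; the eigen values are the diagonal entries.
eigenpairs = [
    (2, [1, 0, 0]),
    (3, [1, 1, 0]),
    (0, [5, -10, 6]),
]

checks = [apply(M, v) == [lam * x for x in v] for lam, v in eigenpairs]
print(checks)   # each pair satisfies M v = lambda v

# The zero on the diagonal shows up as a non zero vector in the null space,
# which is exactly what stops M from being invertible.
print(apply(M, [5, -10, 6]))
```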

Chapter 6: Inner Product Spaces

The norm of a vector is an alternative name for the length of the vector in the space. However, the basic problem with the norm is that it is not linear. To bring linearity into the discussion, the inner product is introduced in the chapter. The inner product is nothing but a generalization of the dot product to multidimensional space.

An inner product on V is a function that takes each ordered pair (u, v) of elements of V to a number and has the following properties: positivity, definiteness, additivity, homogeneity and conjugate symmetry. Basically, an inner product space is a vector space V along with an inner product on V. One can define an inner product in several ways as long as the definition has these properties. One usually comes across the Euclidean inner product, but that is not the only type of inner product that can be defined. The chapter then defines the norm in terms of the inner product. Various inequalities such as the Cauchy-Schwarz inequality and the triangle inequality are proved using the properties of inner products and norms.

The Gram-Schmidt procedure is then introduced, which takes a list of linearly independent vectors and turns it into a list of orthonormal vectors. When we hear the words orthonormal vectors, one thing that must be kept in mind is that we are talking about a pre-defined inner product. The orthogonal complement is defined using the inner product, and it is then used to define orthogonal projection. This orthogonal projection of a vector in V onto a subspace U is used in a multitude of applications. So all the concepts are interlinked: inner product, norm, orthonormal basis, Gram-Schmidt, orthogonal complement of U, orthogonal projection. This connection is shown with the help of an example, a polynomial expansion of sin(x). Personally, I found this example to be the BEST. It's a simple but extremely insightful example which shows the connection between all the ideas in the chapter.
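To illustrate that the Euclidean dot product is not the only choice, here is a sketch with a weighted inner product on R^3. The weights are hypothetical positive numbers I picked. The Cauchy-Schwarz and triangle inequalities hold for it too, and two vectors that are orthogonal under the ordinary dot product stop being orthogonal under the weighted one.

```python
import math

WEIGHTS = [1.0, 2.0, 5.0]   # hypothetical positive weights


def inner(u, v):
    """Weighted inner product <u, v> = sum of w_i * u_i * v_i."""
    return sum(w * a * b for w, a, b in zip(WEIGHTS, u, v))


def norm(v):
    """Norm induced by the inner product: ||v|| = sqrt(<v, v>)."""
    return math.sqrt(inner(v, v))


u = [1.0, -2.0, 3.0]
v = [4.0, 0.5, -1.0]
u_plus_v = [a + b for a, b in zip(u, v)]

print(abs(inner(u, v)) <= norm(u) * norm(v))   # Cauchy-Schwarz
print(norm(u_plus_v) <= norm(u) + norm(v))     # triangle inequality

# Orthogonality depends on which inner product you have fixed.
a, b = [1.0, 1.0, 0.0], [1.0, -1.0, 0.0]
print(sum(x * y for x, y in zip(a, b)))        # ordinary dot product: 0.0
print(inner(a, b))                             # weighted inner product: -1.0
```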

How do you represent sin(x) in polynomial form? The first thing that comes to anyone's mind is the Taylor series expansion, which every undergrad learns at some point or the other. This chapter covers a beautiful procedure involving Gram-Schmidt. Gram-Schmidt is basically a procedure to generate orthonormal lists from linearly independent lists. So, if you can come up with a linearly independent list in the space you are interested in, you can easily generate an orthonormal list.

For example, if you want to approximate sin(x) using the list (1, x, x^2, x^3, x^4, x^5), use Gram-Schmidt to turn it into an orthonormal basis of the space it spans. Once you have the orthonormal list (e1, e2, e3, e4, e5, e6), you can use it to project the function sin(x) onto the space spanned by the list.

In the case of sin(x), the expansion using projection gives a better approximation over the whole interval than the calculus-based Taylor series with the same few terms. This example allows you to see the connections amongst linear independence, orthonormal vectors, inner products, projection and the representation of a function on a subspace using Gram-Schmidt. A memorable example which will make the reader remember these concepts forever.
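I tried to reproduce this example in plain Python. The sketch assumes the inner product of f and g is the integral of f(x)g(x) over [-pi, pi], represents polynomials as coefficient lists (constant term first), computes polynomial inner products exactly, and uses Simpson's rule for the inner products with sin. The helper names are mine, not the book's.

```python
import math

A, B = -math.pi, math.pi   # interval defining <f, g> = integral of f*g


def inner_poly(p, q):
    """Exact inner product of two polynomials given as coefficient lists."""
    total = 0.0
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            k = i + j + 1
            total += a * b * (B ** k - A ** k) / k
    return total


def evaluate(p, x):
    """Evaluate a coefficient list at x with Horner's rule."""
    result = 0.0
    for c in reversed(p):
        result = result * x + c
    return result


def inner_with_function(f, p, steps=2000):
    """Approximate <f, p> with the composite Simpson rule (steps must be even)."""
    h = (B - A) / steps
    total = f(A) * evaluate(p, A) + f(B) * evaluate(p, B)
    for i in range(1, steps):
        x = A + i * h
        total += (4 if i % 2 else 2) * f(x) * evaluate(p, x)
    return total * h / 3


def gram_schmidt(vectors):
    """Turn a linearly independent list of polynomials into an orthonormal list."""
    ortho = []
    for v in vectors:
        w = list(v)
        for e in ortho:
            c = inner_poly(w, e)
            w = [wi - c * ei for wi, ei in zip(w, e)]
        length = math.sqrt(inner_poly(w, w))
        ortho.append([wi / length for wi in w])
    return ortho


monomials = [[1.0 if i == k else 0.0 for i in range(6)] for k in range(6)]
ortho = gram_schmidt(monomials)            # e1, ..., e6

# Orthogonal projection of sin onto span(e1, ..., e6): sum of <sin, e> e.
projection = [0.0] * 6
for e in ortho:
    c = inner_with_function(math.sin, e)
    projection = [pc + c * ec for pc, ec in zip(projection, e)]

taylor = [0.0, 1.0, 0.0, -1 / 6, 0.0, 1 / 120]   # x - x^3/6 + x^5/120

grid = [A + (B - A) * i / 400 for i in range(401)]
worst_projection = max(abs(math.sin(x) - evaluate(projection, x)) for x in grid)
worst_taylor = max(abs(math.sin(x) - evaluate(taylor, x)) for x in grid)
print(projection)
print(worst_projection, worst_taylor)
```

The Taylor polynomial is excellent near 0 but drifts badly near the ends of the interval, while the projection spreads its error over the whole interval, which is what makes the comparison so striking.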

Chapter 7: Operators on Inner product Spaces

Once the inner product space is defined, there are various operators that can be defined on this space. This chapter introduces self-adjoint operators, normal operators, positive operators, isometries and square roots of operators. The characteristics of each of these operators are investigated. One thing that strikes you, if you have not taken the linear transformation approach before, is that all the analysis that is taught with respect to matrices comes alive when looked at through the lens of operators. The chapter ends with a discussion of singular value decomposition, the application of which can be found in tons of domains. Google search engine, for one, heavily uses SVD.

The highlight of this chapter is the spectral theorem, in its versions for complex and real vector spaces. The connection between these theorems and the operators is the key to understanding the rationale behind the existence of an orthonormal basis of eigen vectors in either kind of space.

By working through the various properties of operators, one can get a visual picture of how the operator classes relate. A self-adjoint operator is always a normal operator. An isometry is always a normal operator. A positive operator is by definition a self-adjoint operator satisfying an extra condition. Here is my naive representation of the operator structure :

What is the point in understanding these operators? Next time you see a matrix, you might pause and understand its structure rather than merely subjecting it to some algorithm for diagonalization or eigen value decomposition. By spending some time on the structure of the matrix, you get a better insight into the problem at hand.
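These relationships can be checked on small examples. The sketch below assumes real matrices with respect to the standard basis and the ordinary dot product, so that the adjoint is simply the transpose; the three matrices are my own picks.

```python
def transpose(M):
    return [list(row) for row in zip(*M)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def is_self_adjoint(M):
    """T equals its adjoint (here: its transpose)."""
    return M == transpose(M)


def is_normal(M):
    """T commutes with its adjoint."""
    return matmul(M, transpose(M)) == matmul(transpose(M), M)


def is_isometry(M):
    """T preserves norms, i.e. adjoint(T) * T is the identity."""
    n = len(M)
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(transpose(M), M) == identity


symmetric = [[2, 1], [1, 2]]    # self-adjoint, hence normal
rotation = [[0, -1], [1, 0]]    # isometry, hence normal, but not self-adjoint
shear = [[1, 1], [0, 1]]        # none of the above

for name, M in [("symmetric", symmetric), ("rotation", rotation), ("shear", shear)]:
    print(name, is_self_adjoint(M), is_normal(M), is_isometry(M))
```

The rotation is the useful counterexample: it is normal without being self-adjoint, so the class of normal operators is strictly bigger.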

Chapter 8 : Operators on Complex vector Spaces

The chapter starts off with generalized eigen vectors. The rationale for considering these vectors is that not all operators have ENOUGH eigen vectors to give rise to ENOUGH invariant subspaces. A lack of eigen vectors prevents writing the vector space as a direct sum of such subspaces.

By defining generalized eigen vectors, one can decompose the vector space as a direct sum of the generalized eigen spaces of the operator, one for each distinct eigen value.

The beauty of considering generalized eigen vectors is that such a decomposition exists for every operator on a complex vector space. Moreover, if T has an upper triangular matrix with respect to some basis of V, then each eigen value of T appears on the diagonal precisely as many times as the dimension of its generalized eigen space.

The above statement connects the following concepts: generalized eigen spaces, dimension, nilpotent operators and the upper triangular matrix of T. Thus the multiplicity of an eigen value has a good geometric meaning, i.e., it is the dimension of the corresponding generalized eigen space. This is a refreshing way to look at multiplicity, rather than merely looking at repeated roots of a characteristic equation obtained through the determinant approach. The minimal polynomial is then introduced in the chapter and the connection is made between the characteristic polynomial and the minimal polynomial. Jordan bases are then introduced, showing that there are alternatives to orthonormal bases that are equally powerful.
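The smallest example I know of "not enough eigen vectors" is a 2x2 Jordan block, sketched below with the eigen value 2 as an arbitrary choice. The eigen value appears twice on the diagonal, yet T - 2I kills only the direction e1, so there is just one independent eigen vector. Since (T - 2I)^2 is the zero matrix, every vector is a generalized eigen vector, and the generalized eigen space has dimension 2, matching the multiplicity.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def apply(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


T = [[2, 1],
     [0, 2]]                 # eigen value 2 appears twice on the diagonal

# N = T - 2I is nilpotent: non zero, but its square is zero.
N = [[T[i][j] - (2 if i == j else 0) for j in range(2)] for i in range(2)]
N_squared = matmul(N, N)

e1, e2 = [1, 0], [0, 1]
print(apply(N, e1))          # [0, 0]  -> e1 is an eigen vector
print(apply(N, e2))          # [1, 0]  -> e2 is not an eigen vector
print(apply(N_squared, e2))  # [0, 0]  -> but e2 is a generalized eigen vector
```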

Chapter 9 : Operators on Real Vector Spaces

For all the theorems applicable to complex finite dimensional vector spaces, an extension to real finite dimensional vector spaces is made in this chapter. Ironically, theorems in complex spaces are simpler, neater and easier to work with than those in real spaces. For almost all the theorems, some added complexity is needed for real vector spaces.

Chapter 10 : Trace and Determinant

This chapter is the main reason for the author dwelling on various theorems in the first 9 chapters of the book. The concepts of trace and determinant find their way into math courses sans any intuition or geometric interpretation. By treating finite dimensional vector spaces and various operators first, the author provides the reader with a specific kind of thinking with which to appreciate trace and determinant. Diagonalization is taught in every linear algebra course, but the reason for diagonalization is never given. After reading this chapter, you will develop a fresh perspective on the entire business of diagonalization.

Trace is introduced for an operator via the characteristic polynomial and is then extended to the matrix of the operator. Similarly, the determinant is defined for the operator via the characteristic polynomial and then extended to the matrix of the operator. The highlight of this chapter is the computation of the determinant using the finite dimensional vector space concepts. The chapter ends by deriving the change of variables formula using operator theory.
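A quick check of how the operator view and the matrix view agree, for 2x2 matrices only. The eigen values are the roots of the characteristic polynomial z^2 - (trace)z + (determinant); their sum is the sum of the diagonal entries and their product is ad - bc. The two matrices are made-up examples, and `cmath` is used so that complex eigen values work too.

```python
import cmath


def eigenvalues_2x2(M):
    """Roots of z^2 - (a + d) z + (a d - b c) for M = [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    t = a + d
    det = a * d - b * c
    disc = cmath.sqrt(t * t - 4 * det)
    return (t + disc) / 2, (t - disc) / 2


def trace(M):
    return M[0][0] + M[1][1]


def determinant(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


for M in ([[4, 1], [2, 3]], [[0, -1], [1, 0]]):
    lam1, lam2 = eigenvalues_2x2(M)
    print(M, "eigen values:", lam1, lam2)
    print("  sum of eigen values:", lam1 + lam2, " trace:", trace(M))
    print("  product of eigen values:", lam1 * lam2, " determinant:", determinant(M))
```

The second matrix is a rotation with no real eigen values at all, yet the sum and product of its complex eigen values still come out as the real trace and determinant.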

Takeaway:

This is THE BEST linear algebra book that I have read till date. By providing the rationale behind matrices using linear operators, this book opens the reader's eyes to “What lies behind every matrix?” and “What is going on behind the various decomposition algorithms?”.

My motivation to read this book was to understand inner product spaces, more specifically as preparation for working on Hilbert spaces, but this book gave me MUCH more than I had expected. This is one of the rare books which demonstrates quite vividly the principle: “Power of Math lies in Abstraction.”