Mathematics for the interested outsider

Given a point $p$ in an $n$-dimensional manifold $M$, we have the vector space $\mathcal{T}_pM$ of tangent vectors at $p$. Given a coordinate patch $(U,x)$ around $p$, we’ve constructed $n$ coordinate vectors at $p$, and shown that they’re linearly independent in $\mathcal{T}_pM$. I say that they also span the space, and thus constitute a basis.

To see this, we’ll need a couple of lemmas. First off, if $f$ is constant in a neighborhood of $p$, then $v(f)=0$ for any tangent vector $v$. Indeed, since all that matters is the germ of $f$, we may as well assume that $f$ is the constant function with value $c$. By linearity we know that $v(c)=cv(1)$. But now since $1=1\cdot1$ we use the derivation property to find

$$v(1)=v(1\cdot1)=1\cdot v(1)+v(1)\cdot1=2v(1)$$

and so we conclude that $v(1)=0$, and thus $v(c)=cv(1)=0$.

In a slightly more technical vein, let $U$ be a “star-shaped” neighborhood of $0\in\mathbb{R}^n$. That is, not only does $U$ contain $0$ itself, but for every point $x\in U$ it contains the whole segment of points $tx$ for $0\leq t\leq1$. An open ball around $0$, for example, is star-shaped, so you can just think of that to be a little simpler.

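Anyway, given such a $U$ and a differentiable function $f$ on it we can find $n$ functions $\psi_i$ with $\psi_i(0)=\frac{\partial f}{\partial u^i}(0)$, and such that we can write

$$f=f(0)+\sum\limits_{i=1}^nu^i\psi_i$$

where $u^i$ is the $i$th component function.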
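Here is a quick numerical sanity check of that statement, as a sketch rather than anything definitive. It assumes the usual choice $\psi_i(x)=\int_0^1\frac{\partial f}{\partial u^i}(tx)\,dt$, approximates the integral with a midpoint rule, and uses a made-up test function on $\mathbb{R}^2$; the names `f`, `grad_f`, `psi`, and `reconstruct` are all mine.

```python
import math

def f(x, y):
    # A made-up smooth test function on R^2
    return x * x * y + 3 * x + 2 + math.sin(y)

def grad_f(x, y):
    # Its two partial derivatives, worked out by hand
    return (2 * x * y + 3, x * x + math.cos(y))

def psi(i, point, steps=2000):
    # Midpoint-rule approximation of the integral over [0, 1]
    # of the i-th partial derivative of f along the segment t * point
    x, y = point
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) / steps
        total += grad_f(t * x, t * y)[i]
    return total / steps

def reconstruct(point):
    # f(0) + sum_i u^i(point) * psi_i(point)
    x, y = point
    return f(0.0, 0.0) + x * psi(0, point) + y * psi(1, point)

point = (0.7, -1.3)
error = abs(reconstruct(point) - f(*point))
```

Up to the quadrature error, `reconstruct(point)` agrees with `f(*point)`, and `psi(i, (0.0, 0.0))` returns the $i$th partial derivative of the test function at the origin.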

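If we pick a point $x\in U$ we can parameterize the segment $c(t)=tx$, and set $\phi=f\circ c$ to get a function on the unit interval $[0,1]$. This function is clearly differentiable, and we can calculate

$$\phi'(t)=\sum\limits_{i=1}^n\frac{\partial f}{\partial u^i}(tx)x^i$$

Integrating from $0$ to $1$ (assuming the partial derivatives are continuous) gives

$$f(x)-f(0)=\phi(1)-\phi(0)=\sum\limits_{i=1}^nx^i\int\limits_0^1\frac{\partial f}{\partial u^i}(tx)\,dt$$

so we can set $\psi_i(x)=\int_0^1\frac{\partial f}{\partial u^i}(tx)\,dt$, which takes the required value at $x=0$.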

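Now if we have a differentiable function $f$ defined on a neighborhood $U$ of a point $p\in M$, we can find a coordinate patch $(U,x)$, possibly by shrinking $U$, with $x(p)=0$ and $x(U)$ star-shaped. Then we can apply the previous lemma to $f\circ x^{-1}$ to get

$$f\circ x^{-1}=f(p)+\sum\limits_{i=1}^nu^i\psi_i$$

with $\psi_i(0)=\left[\frac{\partial}{\partial x^i}(p)\right](f)$. Moving the coordinate map to the other side we find

$$f=f(p)+\sum\limits_{i=1}^nx^i(\psi_i\circ x)$$

Now we can hit this with a tangent vector $v$

$$v(f)=v(f(p))+\sum\limits_{i=1}^n\left(v(x^i)\psi_i(x(p))+x^i(p)v(\psi_i\circ x)\right)=\sum\limits_{i=1}^nv(x^i)\left[\frac{\partial}{\partial x^i}(p)\right](f)$$

where we have used linearity, the derivation property, and the first lemma above, along with the fact that $x^i(p)=0$. Thus we can write

$$v=\sum\limits_{i=1}^nv(x^i)\frac{\partial}{\partial x^i}(p)$$

and the coordinate vectors span the space of tangent vectors at $p$.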

As a consequence, we conclude that $\mathcal{T}_pM$ always has dimension $n$, exactly the same dimension as the manifold itself. And this is exactly what we should expect; if $M$ is $n$-dimensional, then in some sense there are $n$ independent directions to move in near any point $p$, and these “directions to move” are the core of our geometric notion of a tangent vector. Ironically, if we start from a more geometric definition of tangent vectors, it’s actually somewhat harder to establish this fact, which is partly why we’re starting with the more algebraic definition.

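Let’s say we have a coordinate patch $(U,x)$ around a point $p$ in an $n$-dimensional manifold $M$. We can use the function $x$ to give us some tangent vectors at $p$ called the “coordinate vectors”.

We define the coordinate vector $\frac{\partial}{\partial x^i}(p)$ as follows: given a smooth function $f$, we define

$$\left[\frac{\partial}{\partial x^i}(p)\right](f)=\left[D_i\left(f\circ x^{-1}\right)\right](x(p))$$

Okay, I know that that’s confusing. But all we mean is this: start with a function $f:U\to\mathbb{R}$. We compose it with the inverse of the coordinate map to get $f\circ x^{-1}:x(U)\to\mathbb{R}$, where $x(U)\subseteq\mathbb{R}^n$ is some open neighborhood of the point $x(p)$. Now we can take the $i$th partial derivative of this function and evaluate it at the point $x(p)$.

The first thing we really should check is that it doesn’t matter which representative we pick. That is, if $f=g$ in some neighborhood of $p$, do we get the same answer? Indeed, in that case $f\circ x^{-1}=g\circ x^{-1}$ in some neighborhood of $x(p)$, and so their partial derivatives are identical. Thus this operation only depends on the germ of $f$ at $p$.

But is it a tangent vector? It’s easy to see that it’s a linear functional, so we just have to check that it’s a derivation at $p$:

$$\begin{aligned}\left[\frac{\partial}{\partial x^i}(p)\right](fg)&=\left[D_i\left((fg)\circ x^{-1}\right)\right](x(p))\\&=\left[D_i\left((f\circ x^{-1})(g\circ x^{-1})\right)\right](x(p))\\&=\left[D_i\left(f\circ x^{-1}\right)\right](x(p))\,g(p)+f(p)\left[D_i\left(g\circ x^{-1}\right)\right](x(p))\\&=\left[\frac{\partial}{\partial x^i}(p)\right](f)\,g(p)+f(p)\left[\frac{\partial}{\partial x^i}(p)\right](g)\end{aligned}$$

And so we have at least these $n$ vectors at each point $p$. We can even tell that they must be distinct, and even linearly independent, since we can calculate

$$\left[\frac{\partial}{\partial x^i}(p)\right](x^j)=\left[D_i\left(x^j\circ x^{-1}\right)\right](x(p))=\left[D_iu^j\right](x(p))$$

where $u^j$ is the $j$th coordinate projection $\mathbb{R}^n\to\mathbb{R}$. But we know that $D_iu^j$ is always and everywhere $\delta_i^j$: it takes the value $1$ if $i=j$ and $0$ otherwise.

Thus $\frac{\partial}{\partial x^i}(p)$ takes a different value on $x^i$ than on all the other $x^j$. Further, any linear combination of the $\frac{\partial}{\partial x^j}(p)$ for $j\neq i$ must take the value $0$ on $x^i$, while $\frac{\partial}{\partial x^i}(p)$ takes the value $1$; we see that none of the coordinate vectors can be written as a linear combination of the rest, and conclude that the dimension of $\mathcal{T}_pM$ is at least $n$.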
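To make the definition a little more tangible, here is a minimal numerical sketch. It assumes a concrete chart that is not in the discussion above, namely polar coordinates $(r,\theta)$ on the right half-plane, and approximates the partial derivative by a central difference. The names `chart`, `chart_inv`, and `coordinate_vector` are just illustrative.

```python
import math

def chart(q):
    # Assumed example chart x = (r, theta) on the right half-plane
    a, b = q
    return (math.hypot(a, b), math.atan2(b, a))

def chart_inv(u):
    # Inverse of the chart: polar back to Cartesian coordinates
    r, theta = u
    return (r * math.cos(theta), r * math.sin(theta))

def coordinate_vector(i, p, f, h=1e-5):
    # Central-difference approximation of the i-th partial derivative
    # of f composed with chart_inv, evaluated at chart(p)
    u = list(chart(p))
    up, down = u.copy(), u.copy()
    up[i] += h
    down[i] -= h
    return (f(chart_inv(up)) - f(chart_inv(down))) / (2 * h)

# The coordinate functions x^0 = r and x^1 = theta, as functions on the manifold
coords = [lambda q: chart(q)[0], lambda q: chart(q)[1]]

p = (1.0, 1.0)
delta = [[coordinate_vector(i, p, coords[j]) for j in range(2)]
         for i in range(2)]
```

Up to rounding, `delta` is the identity matrix: each coordinate vector takes the value $1$ on its own coordinate function and $0$ on the other.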

Tangent vectors are a very important concept in differential geometry, and they’re one of the biggest stumbling blocks in comprehension. There are two major approaches: one more geometric, and one more algebraic. I find the algebraic approach a bit more satisfying, since it gets straight into the important properties of tangent vectors and how they are used, and it helps set the stage for tangent vectors in other contexts like algebraic geometry. Unfortunately, it’s not at all clear at first what this definition means geometrically, and why these things deserve being called “tangent vectors”. So I have to ask a little patience.

Now, we take a manifold $M$ with structure sheaf $\mathcal{O}$. We pick some point $p\in M$ and get the stalk $\mathcal{O}_p$ of germs of functions at $p$. This is a real algebra, and we define a “tangent vector at $p$” to be a “derivation at $p$” of this algebra. That is, $v$ is a function $v:\mathcal{O}_p\to\mathbb{R}$ satisfying

$$\begin{aligned}v(cf+dg)&=cv(f)+dv(g)\\v(fg)&=f(p)v(g)+v(f)g(p)\end{aligned}$$

The first of these conditions says that $v$ is a linear functional on $\mathcal{O}_p$. It’s the second that’s special: it tells us that $v$ obeys something like the product rule.

Indeed, let’s take a point $a\in\mathbb{R}$ and consider the operation $D_a$ defined by $D_a(f)=f'(a)$ for any function $f$ that is differentiable at $a$. This is linear, since both the derivative and evaluation operations are linear. The product rule tells us that

$$D_a(fg)=[fg]'(a)=f'(a)g(a)+f(a)g'(a)=D_a(f)g(a)+f(a)D_a(g)$$

So $D_a$ satisfies the definition of a “tangent vector at $a$”. Indeed, as it turns out $D_a$ corresponds to what we might normally consider the vector based at $a$ pointing one unit in the positive direction.
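If you like to see such things run, here is a small sketch of the derivative-at-a-point operation on $\mathbb{R}$, using dual numbers to differentiate functions built from addition and multiplication. The `Dual` class and the helper `D` are my own illustrative names, and the two polynomials are arbitrary choices.

```python
class Dual:
    """Numbers v + e*eps with eps*eps = 0; the eps-part carries a derivative."""

    def __init__(self, value, eps=0):
        self.value, self.eps = value, eps

    @staticmethod
    def lift(other):
        # Treat an ordinary number as a dual number with zero eps-part
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.value + other.value, self.eps + other.eps)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual.lift(other)
        return Dual(self.value * other.value,
                    self.value * other.eps + self.eps * other.value)

    __rmul__ = __mul__


def D(a):
    # The operation sending f to f'(a), for f built from + and *
    return lambda f: Dual.lift(f(Dual(a, 1))).eps


f = lambda x: x * x * x + 2 * x   # f(x) = x^3 + 2x
g = lambda x: 3 * x * x + 1       # g(x) = 3x^2 + 1
fg = lambda x: f(x) * g(x)

a = 2
D_a = D(a)
lhs = D_a(fg)
rhs = D_a(f) * g(a) + f(a) * D_a(g)
```

Both `lhs` and `rhs` come out to `326`, which is the derivation property at $a=2$ for this particular pair. Applying `D_a` to a constant function gives `0`.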

It should immediately be clear that the tangent vectors at $p$ form a vector space. Indeed, the sum of two tangent vectors at $p$ is firstly the sum of two linear functionals, which is again a linear functional. To see that it also satisfies the “derivation” condition, let $v$ and $w$ be tangent vectors at $p$ and check

$$\begin{aligned}\left[v+w\right](fg)&=v(fg)+w(fg)\\&=f(p)v(g)+v(f)g(p)+f(p)w(g)+w(f)g(p)\\&=f(p)\left[v+w\right](g)+\left[v+w\right](f)g(p)\end{aligned}$$

Checking that scalar multiples of tangent vectors at $p$ are again tangent vectors at $p$ is similar. We write $\mathcal{T}_pM$ to denote this vector space of tangent vectors at $p$ to the manifold $M$.

I want to call attention to one point of notation here, and I won’t really bother with it again. We seem to be using each of $f$ and $g$ to refer to two different things: a germ in $\mathcal{O}_p$, which is an equivalence class of sorts, and some actual function in $\mathcal{O}(U)$ for some neighborhood $U$ of $p$ which represents the germ. To an extent we are, and the usual excuse is that since we only ever evaluate the function at $p$ itself, it doesn’t really matter which representative of the germ we pick.

However, a more nuanced view will see that we’ve actually overloaded the notation $f(p)$. Normally this would mean evaluating a function at a point, yes, but here we interpret it in terms of the local ring structure of $\mathcal{O}_p$. Given a germ $f$ there is a projection $\mathcal{O}_p\to\mathcal{O}_p/\mathfrak{m}_p\cong\mathbb{R}$, which we write as $f\mapsto f(p)$.

If all this seems complicated, don’t really worry about it. You can forget the whole last paragraph and get by on “sometimes we use a germ as if it’s an actual function defined in a neighborhood of $p$, and it will never matter which specific representative function we use because we only ever ask what happens at $p$ itself.”

As long as we’re in the neighborhood — so to speak — we may as well define the concept of a “local ring”. This is a commutative ring which contains a unique maximal ideal. Equivalently, it’s one in which the sum of any two noninvertible elements is again noninvertible.

Why are these conditions equivalent? Well, if we have noninvertible elements $a$ and $b$ with $a+b$ invertible, then these elements generate proper principal ideals $(a)$ and $(b)$. If we add these two ideals, we must get the whole ring, for the sum contains $a+b$, and so must contain $1$, and thus the whole ring. Thus $(a)$ and $(b)$ cannot both be contained within the same maximal ideal, and thus we would have to have two distinct maximal ideals.

Conversely, if the sum of any two noninvertible elements is itself noninvertible, then the noninvertible elements form an ideal. And this ideal must be maximal, for if we throw in any other (invertible) element, it would suddenly contain the entire ring.
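A small finite experiment may help here. This sketch uses the rings $\mathbb{Z}/n$ (my choice of example, not something from the discussion above) and relies on the standard fact that the maximal ideals of $\mathbb{Z}/n$ are generated by the prime divisors of $n$. The function names are just illustrative.

```python
from math import gcd

def units(n):
    # Invertible elements of Z/n
    return {a for a in range(n) if gcd(a, n) == 1}

def nonunits(n):
    return set(range(n)) - units(n)

def nonunits_closed_under_addition(n):
    # Is the sum of any two noninvertible elements again noninvertible?
    bad = nonunits(n)
    return all((a + b) % n in bad for a in bad for b in bad)

def maximal_ideal_generators(n):
    # Prime divisors of n; each one generates a maximal ideal of Z/n
    return [p for p in range(2, n + 1)
            if n % p == 0 and all(p % q for q in range(2, p))]

local_8 = nonunits_closed_under_addition(8)   # nonunits are 0, 2, 4, 6
local_6 = nonunits_closed_under_addition(6)   # 2 + 3 = 5 is a unit
```

Here $\mathbb{Z}/8$ passes the test and has the single maximal ideal $(2)$, while $\mathbb{Z}/6$ fails it and has the two maximal ideals $(2)$ and $(3)$.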

Why do we care? Well, it turns out that for any manifold $M$ and point $p\in M$ the algebra $\mathcal{O}_p$ of germs of functions at $p$ is a local ring. And in fact this is pretty much the reason for the name “local” ring: it is a ring of functions that’s completely localized to a single point.

To see that this is true, let’s consider which germs are invertible. I say that a germ represented by a function $f$ is invertible if and only if $f(p)\neq0$. Indeed, if $f(p)=0$, then $f$ is certainly not invertible. On the other hand, if $f(p)\neq0$, then continuity tells us that there is some neighborhood $U$ of $p$ where $f(q)\neq0$ for all $q\in U$. Restricting $f$ to this neighborhood if necessary, we have a representative of the germ which never takes the value zero. And thus we can define a function $g(q)=\frac{1}{f(q)}$ for $q\in U$, which represents the multiplicative inverse to the germ of $f$.

With this characterization of the invertible germs in hand, it should be clear that any two noninvertible germs represented by $f_1$ and $f_2$ must have $f_1(p)=f_2(p)=0$. Thus $f_1(p)+f_2(p)=0$, and the germ of $f_1+f_2$ is again noninvertible. Since the sum of any two noninvertible germs is itself noninvertible, the algebra $\mathcal{O}_p$ of germs is local, and its unique maximal ideal $\mathfrak{m}_p$ consists of those germs which vanish at $p$.

Incidentally, we once characterized maximal ideals as those for which the quotient is a field. So which field is it in this case? It’s not hard to see that $\mathcal{O}_p/\mathfrak{m}_p\cong\mathbb{R}$: any germ is sent to its value at $p$, which is just a real number.

Let’s take the structure sheaves we defined last time and consider the stalks at a point $p\in M$. It turns out that since we’re working with sheaves of $\mathbb{R}$-algebras, we can sort of shortcut the messy limit process.

As before, given some open neighborhood $U$ of $p$, we let $\mathcal{O}(U)$ be the algebra of smooth functions (as smooth as $M$ is itself) on $U$. Now we define $\mathcal{Z}_p(U)$ to be the ideal of those functions which vanish on some neighborhood of $p$. Then we define the quotient

$$\mathcal{O}_p=\mathcal{O}(U)/\mathcal{Z}_p(U)$$

Notice that we have effectively pushed our limiting process into the definition of the ideal $\mathcal{Z}_p(U)$, where for each open neighborhood $V\subseteq U$ of $p$ we get an ideal $\mathcal{Z}_V(U)$ of functions vanishing on $V$. The ideal we care about is the union over all such neighborhoods $V$, and the process of taking this union is effectively a limit.

Anyhow, there’s still the possibility that this depends on the $U$ from which we started. But this is actually not the case; we get a uniquely defined algebra $\mathcal{O}_p$ no matter which neighborhood $U$ of $p$ we start from.

Indeed, I say that there is an isomorphism $\mathcal{O}(M)/\mathcal{Z}_p(M)\cong\mathcal{O}(U)/\mathcal{Z}_p(U)$. In the one direction, this is simply induced by the restriction map: if two functions are equal on some neighborhood of $p$ in $M$, then they’re certainly equal on some neighborhood of $p$ in $U$. And this restriction is just as clearly injective, since if two functions are equivalent in $\mathcal{O}(U)/\mathcal{Z}_p(U)$ then they must agree on some neighborhood of $p$, which means they were already equivalent in $\mathcal{O}(M)/\mathcal{Z}_p(M)$.

The harder part is showing that this map is surjective, and thus an isomorphism. But given $f\in\mathcal{O}(U)$, let $V$ be an open neighborhood of $p$ whose closure is contained in $U$. We can find one since $U$ must contain a neighborhood of $p$ homeomorphic to a ball in $\mathbb{R}^n$, and we can certainly find $V$ within such a neighborhood. Anyhow, we know that there exists a bump function $\phi$ which is identically $1$ on $V$ and supported within $U$. We can thus define a smooth function $g$ on all of $M$ by setting $g=\phi f$ inside $U$ and $g=0$ elsewhere. Since $f$ and $g$ agree on the neighborhood $V$ of $p$, they are equivalent in $\mathcal{O}(U)/\mathcal{Z}_p(U)$, and thus every equivalence class in $\mathcal{O}(U)/\mathcal{Z}_p(U)$ has a representative coming from $\mathcal{O}(M)$.

We write the stalk as $\mathcal{O}_{M,p}$, or sometimes $\mathcal{O}_p$ if the manifold $M$ is clear from context, and we call the equivalence classes of functions in this algebra “germs” of functions. Thus a germ subsumes not just the value of a function at a point $p$, but its behavior in an “infinitesimal neighborhood” around $p$. Some authors even call the structure sheaf of a manifold, especially a complex analytic manifold (which we haven’t really discussed yet), the “sheaf of germs” of functions on the manifold, which is a little misleading since the germs properly belong to the stalks of the sheaf. Luckily, this language is somewhat outmoded.

Now that we’ve talked a bunch about presheaves and sheaves in general, let’s talk about some particular sheaves of use in differential topology. Given a smooth manifold $M$, for whatever we choose smooth to mean, we can define sheaves of real algebras of real-valued functions for every less-stringent definition of smoothness.

In the first case of a bare topological manifold $M$, we have no real sense of differentiability at all, and so it only makes sense to talk about continuous real-valued functions $f:M\to\mathbb{R}$. Given an open set $U\subseteq M$ we let $\mathcal{O}^0_M(U)$ be the $\mathbb{R}$-algebra of real-valued functions that are defined and continuous on $U$.

Next, if $M$ is a $C^1$ manifold, then it not only makes sense to talk about continuous real-valued functions (we can define $\mathcal{O}^0_M$ just as above) but we can also talk about differentiable real-valued functions. Given an open set $U\subseteq M$, we let $\mathcal{O}^1_M(U)$ be the $\mathbb{R}$-algebra of continuously-differentiable real-valued functions $f:U\to\mathbb{R}$.

As we increase the smoothness of $M$, we can consider smoother and smoother functions. If $M$ is a $C^k$ manifold, we can define $\mathcal{O}^k_M$, the sheaf of $k$-times continuously-differentiable functions. Given an open set $U\subseteq M$, we let $\mathcal{O}^k_M(U)$ be the $\mathbb{R}$-algebra of real-valued functions on $U$ with $k$ continuous derivatives.

Continuing up the ladder, if $M$ is a $C^\infty$ manifold, then we can define all of the above sheaves, along with the sheaf $\mathcal{O}^\infty_M$ of infinitely-differentiable functions. And if $M$ is analytic, we can also define the sheaf $\mathcal{O}^\omega_M$ of analytic functions.

In each case, I’m not going to bother going through the proof that we actually do get sheaves. The core idea is that continuity, differentiability, and analyticity are notions defined locally, point-by-point. Thus if we restrict the domain of such a function we get another function of the same kind, and pasting together functions that agree on their overlaps preserves smoothness. This doesn’t hold, however, for global notions like boundedness: it’s easy to define a collection of functions on an open cover of $\mathbb{R}$, each of which is bounded, which define an unbounded function when pasted together.

For each class of manifolds, the sheaf of the smoothest functions we can define has a special place. If $M$ is in class $C^k$, where $k$ can be $0$, any finite whole number, $\infty$, or $\omega$, then the sheaf $\mathcal{O}^k_M$ is often just written $\mathcal{O}_M$, and is called the “structure sheaf” of $M$. It turns out that most, if not all, of the geometrical properties of $M$ are actually bound up within its structure sheaf, and so this is a very important object of study indeed.

One more construction we’ll be interested in is finding the “stalk” $\mathcal{F}_p$ of a presheaf $\mathcal{F}$ over a point $p$. We want to talk about how a presheaf behaves at a single point, but a single point is almost never an open set, so we need to be a bit creative.

The other thing to be careful about is that we’re actually not concerned about behavior at a single point. Indeed, considering the sheaf of continuous functions on a space $X$, we see that at any one point the function is just a real number. What’s interesting is how the function behaves in an infinitesimal neighborhood around the point.

The answer is to use the categorical definition of a direct limit. Given a point $p\in X$ the collection of open neighborhoods of $p$ forms a directed set under reverse inclusion, and we can take the limit $\mathcal{F}_p=\varinjlim_{U\ni p}\mathcal{F}(U)$.

Again, we’d like to understand this in more concrete terms, for when $\mathcal{F}(U)$ is a set, or a set with some algebraic structure attached. It turns out that if we unpack all the category theory, basically using the existence theorem, it’s not really that bad.

An element of the stalk $\mathcal{F}_p$ is represented by an element of $\mathcal{F}(U)$ for some neighborhood $U$ of $p$. Two elements are considered equivalent if they agree on some common neighborhood of $p$. That is, if we have $s\in\mathcal{F}(U)$ and $t\in\mathcal{F}(V)$, and if there is some neighborhood $W\subseteq U\cap V$ of $p$ so that $s\vert_W=t\vert_W$, then we consider $s$ and $t$ to be the same element of $\mathcal{F}_p$. They don’t have to be the same everywhere, but so long as they become the same when restricted to a sufficiently small neighborhood of $p$, they’re effectively the same.

Our usual category-theoretical juggling can now reassure us that the stalks of a sheaf of groups are groups, the stalks of a sheaf of rings are rings, and so on, all using this same set-theoretic definition.

So far our morphisms only let us compare presheaves and sheaves on a single topological space $X$. In fact, we have a category of sheaves (of sets, by default) on $X$. But there are also constructions that involve more than one space. The direct image functor is a way of pushing forward a sheaf structure along a continuous map. It’s relatively simple and we may find it useful, so let’s just get it out of the way now.

So, let’s say we have two topological spaces $X$ and $Y$, and a continuous function $f:X\to Y$. I say that if $\mathcal{F}$ is a sheaf on $X$, then we can define a “direct image” sheaf $f_*\mathcal{F}$ on $Y$ in a natural way. Indeed, given an open set $V\subseteq Y$, we know that its preimage $f^{-1}(V)$ is an open subset of $X$. And so it only makes sense to define $[f_*\mathcal{F}](V)=\mathcal{F}(f^{-1}(V))$.
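To see the definition in action, here is a toy sketch on finite topological spaces. Everything concrete in it is an assumption made for illustration: the three-point space, the two-point target, the map between them, and the choice of sheaf (all $\{0,1\}$-valued functions on each open set). The helper names `preimage`, `sections`, and `direct_image` are mine.

```python
from itertools import product

# Assumed toy example: X has three points, Y has two.
X = frozenset("abc")
opens_X = {frozenset(), frozenset("b"), frozenset("bc"), X}
opens_Y = {frozenset(), frozenset({1}), frozenset({0, 1})}
f = {"a": 0, "b": 1, "c": 1}   # continuous for the topologies above

def preimage(f, V):
    # All points of the domain that f sends into V
    return frozenset(x for x in f if f[x] in V)

def sections(U, values=(0, 1)):
    # The sheaf of all functions from U to `values`, each stored as a dict
    pts = sorted(U)
    return [dict(zip(pts, choice))
            for choice in product(values, repeat=len(pts))]

def direct_image(f, V):
    # Sections of the direct image over V are sections over the preimage of V
    return sections(preimage(f, V))

continuous = all(preimage(f, V) in opens_X for V in opens_Y)
sizes = {tuple(sorted(V)): len(direct_image(f, V)) for V in opens_Y}
```

Over the open set $\{1\}\subseteq Y$ the direct image has four sections, one for each $\{0,1\}$-valued function on the preimage $\{b,c\}$; over all of $Y$ it has eight, and over the empty set exactly one.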

Now you might be thinking, “wait, if the canonical example of a sheaf is a sheaf of functions, shouldn’t we be pulling back?” And this does make a certain amount of sense: given a function $u:Y\to\mathbb{R}$ on all of $Y$ we can define a function on all of $X$ by composing it with $f$, so it seems that “functions pull back” naturally. It would seem to make sense for us to take a function $u$ defined on an open subset $V\subseteq Y$, compose it with $f$, and put the resulting $u\circ f$ into the set corresponding to $f^{-1}(V)$.

But while this defines sets for all these preimages, not all open subsets of $X$ are of this form! Indeed, we have no idea how to define the elements of an “inverse image” sheaf over an open set $U\subseteq X$ where $f(U)$ is not itself open, and there is no guarantee at all that it will be. There is a way to remedy this problem, using a method called “sheafification”, but that’s a more involved subject I’d rather not dig into quite yet.

As ever, we want our objects of study to be objects in some category, and presheaves (and sheaves) are no exception. But, luckily, this much is straightforward.

Remember that we ended up defining a presheaf as a functor. Given our topological space $X$ we set up the partial order category $\mathrm{Open}(X)$, flipped it around to $\mathrm{Open}(X)^\mathrm{op}$ so the arrows pointed the opposite way, and then said a presheaf of sets is a functor $\mathcal{F}:\mathrm{Open}(X)^\mathrm{op}\to\mathbf{Set}$. So the natural home for them is the functor category $\mathbf{Set}^{\mathrm{Open}(X)^\mathrm{op}}$, where the morphisms are natural transformations.

So what does this mean for our usual case where we consider presheaves of sets, or of sets equipped with some algebraic structure? Well, it means that we map from one presheaf $\mathcal{F}$ to another one $\mathcal{G}$ by picking a map for each and every open set: $\phi_U:\mathcal{F}(U)\to\mathcal{G}(U)$. But these maps must be compatible with the restrictions: if $V\subseteq U$ then we must have $\phi_V(s\vert_V)=\phi_U(s)\vert_V$. That is, given an element $s$ in $\mathcal{F}(U)$, we can either first restrict it to $V$ and then map it by $\phi_V$ to $\mathcal{G}(V)$, or we can first map it by $\phi_U$ to $\mathcal{G}(U)$ and then restrict the result to $V$. In either case, we should get the same answer.
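The compatibility condition is easy to test on a toy model. In this sketch a section over a finite set of points is just a dict of values, restriction forgets the points outside the smaller set, and the maps `square` and `subtract_max` are hypothetical examples I made up: the first commutes with restriction, the second does not.

```python
def restrict(section, V):
    # Restriction: keep only the values at points of V
    return {x: section[x] for x in V}

def square(section):
    # Post-compose every value with t -> t * t; works the same over any open set
    return {x: value * value for x, value in section.items()}

def subtract_max(section):
    # Depends on the whole domain, so it should NOT commute with restriction
    top = max(section.values())
    return {x: value - top for x, value in section.items()}

U = {1, 2, 3, 4}
V = {2, 3}
s = {1: 9, 2: 5, 3: 0, 4: 2}   # a section over U

natural = square(restrict(s, V)) == restrict(square(s), V)
not_natural = subtract_max(restrict(s, V)) == restrict(subtract_max(s), V)
```

Here `natural` is `True` and `not_natural` is `False`: subtracting the maximum gives different answers depending on whether we restrict first, so it cannot be a component of a morphism of presheaves.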

For the moment we will be more concerned with presheaves, but we may as well go ahead and define sheaves. These embody the way that not only can we restrict functions to localize them to smaller regions, but we can “glue together” local functions on small domains to define functions on larger domains. This time, let’s start with the fancy category-theoretic definition.

For any open cover $\{U_i\}$ of an open set $U$, we can set up the following diagram:

$$\mathcal{F}(U)\to\prod\limits_i\mathcal{F}(U_i)\rightrightarrows\prod\limits_{i,j}\mathcal{F}(U_i\cap U_j)$$

Let’s talk about this as if we’re dealing with a sheaf of sets, to make more sense of it. Usually our sheaves will be of sets with extra structure, anyway. The first arrow on the left just takes an element of $\mathcal{F}(U)$, restricts it to each of the $U_i$, and takes the product of all these restrictions. The upper arrow on the right takes an element of $\mathcal{F}(U_i)$ and restricts it to each intersection $U_i\cap U_j$. Doing this for each $U_i$ we get a map from the product over $i$ to the product over all pairs $(i,j)$. The lower arrow is similar, but it takes an element in $\mathcal{F}(U_j)$ and restricts it to each intersection $U_i\cap U_j$. This may look the same, but the difference in whether the original set was the first or the second in the intersection makes a difference, as we shall see.

Now we say that a presheaf $\mathcal{F}$ is a sheaf if and only if this diagram is an equalizer for every open cover $\{U_i\}$. For it to be an equalizer, first the arrow on the left must be a monomorphism. In terms of sets, this means that if we take two elements $s\in\mathcal{F}(U)$ and $t\in\mathcal{F}(U)$ so that $s\vert_{U_i}=t\vert_{U_i}$ for all $U_i$, then $s=t$. That is, elements over $U$ are uniquely determined by their restrictions to any open cover.

The other side of the equalizer condition is that the image of the arrow on the left consists of exactly those products in the middle for which the two arrows on the right give the same answer. More explicitly, let’s say we have an $s_i\in\mathcal{F}(U_i)$ for each $U_i$, and let’s further assume that these elements agree on their restrictions. That is, we ask that $s_i\vert_{U_i\cap U_j}=s_j\vert_{U_i\cap U_j}$. If this is true for all pairs $(i,j)$, then the product $\prod_is_i$ takes the same value under either arrow on the right. Thus it must be in the image of the arrow on the left: there must be some $s\in\mathcal{F}(U)$ so that $s\vert_{U_i}=s_i$. In other words, as long as the local elements $s_i$ “agree” where their domains overlap, we can “glue them together” to give an element $s\in\mathcal{F}(U)$.

Again, the example to keep in mind is that of continuous real-valued functions. If we have a continuous function $f_1:U_1\to\mathbb{R}$ and another continuous function $f_2:U_2\to\mathbb{R}$, and if $f_1(x)=f_2(x)$ for all $x\in U_1\cap U_2$, then we can define $f:U_1\cup U_2\to\mathbb{R}$ by “gluing” these functions together over their common overlap: $f(x)=f_1(x)$ if $x\in U_1$, $f(x)=f_2(x)$ if $x\in U_2$, and it doesn’t matter which we choose when $x\in U_1\cap U_2$ because both functions give the same value there.

So, a sheaf is a presheaf where we can glue together elements over small domains so long as they agree when restricted to their intersections, and where this process defines a unique element over the larger, “glued-together” domain.
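The gluing condition can be sketched in a finite toy model, where a local element is a dict assigning a value to each point of its (finite) domain. This is only an illustration under that assumption, and `restrict` and `glue` are names I picked, not anything standard.

```python
def restrict(section, V):
    # Restriction: keep only the values at points of V
    return {x: section[x] for x in V}

def glue(local_sections):
    # Glue dict-valued sections if they agree on every overlap;
    # return None when two of them disagree somewhere
    glued = {}
    for s in local_sections:
        for x, value in s.items():
            if x in glued and glued[x] != value:
                return None   # disagreement on an overlap: nothing to glue
            glued[x] = value
    return glued

s1 = {1: "a", 2: "b"}     # element over U_1 = {1, 2}
s2 = {2: "b", 3: "c"}     # element over U_2 = {2, 3}, agrees with s1 at 2
s3 = {2: "z", 3: "c"}     # disagrees with s1 on the overlap {2}

glued = glue([s1, s2])
failed = glue([s1, s3])
```

The glued element restricts back to `s1` and `s2` on their domains, and it is the only dict over $\{1,2,3\}$ that does so; when the local elements disagree on the overlap, there is nothing to glue and `glue` returns `None`.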
