The intuition is that the Leibniz rule at $P$ forces the linear operator $\partial \colon f \mapsto \partial(f) \in \mathbf{R}$ not just to kill constants but also to depend on $f$ only to first order at $P$, and hence to be a directional derivative in local coordinates: $\partial(f) = D_v(f)$ where $v = \sum_i \partial(x_i) e_i$. Indeed, if $f$ vanishes to first order at $P$ (i.e., $f(P) = 0$ and $df(P) = 0$) then for local coordinates $(x_i)$ with $x_i(P) = 0$, Taylor's theorem (the real content!) implies $f = \sum_{i,j} x_i x_j h_{ij}$ for smooth $h_{ij}$ near $P$, so $\partial(f) = 0$ by the Leibniz rule.
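For concreteness, both halves of this claim can be checked symbolically. Here is a small sympy sketch (the chart, the direction $v$, and the sample functions below are my own arbitrary choices, not from the discussion) verifying that a directional derivative at $P$ satisfies the Leibniz rule and kills any function whose Taylor series at $P$ begins in degree $\ge 2$:

```python
import sympy as sp

x, y = sp.symbols('x y')
v = (2, -3)  # an arbitrary direction (hypothetical choice); P is the origin of the chart

def D_v(f):
    """Directional derivative along v, evaluated at P = (0, 0)."""
    return (v[0]*sp.diff(f, x) + v[1]*sp.diff(f, y)).subs({x: 0, y: 0})

f, g = sp.exp(x)*sp.cos(y), x**2 + y + 1  # sample smooth functions

# Leibniz rule at P: D_v(fg) = f(P) D_v(g) + g(P) D_v(f)
fP, gP = f.subs({x: 0, y: 0}), g.subs({x: 0, y: 0})
assert sp.simplify(D_v(f*g) - (fP*D_v(g) + gP*D_v(f))) == 0

# A function whose Taylor series at P begins in degree >= 2 is killed
h = x**2*sp.sin(y) + x*y + y**2
assert D_v(h) == 0
```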
– user30180, Dec 28 '12 at 4:02


You pose the question as a question about calculus on a manifold, but is anything really lost by focusing on calculus on the reals? On the reals, either the product rule or the chain rule can be taken as the fundamental defining property of the derivative. The chain rule implies the product rule in this sense: mathoverflow.net/questions/108773/… The product rule also implies the chain rule: mathoverflow.net/questions/44774/…
– Ben Crowell, Dec 28 '12 at 4:31

If you view point-derivations as operators on global smooth functions rather than on germs of smooth functions near $P$ (the latter is what I used in my preceding comment), it's still OK: the Leibniz rule forces $\partial(f)$ for global smooth $f$ to depend only on the germ of $f$ around $P$ (and so, via bump functions, the distinction between global smooth functions and germs at $P$ doesn't matter). Indeed, if a global $f$ vanishes near $P$ then $f = (1-\phi) f$ for a smooth global bump function $\phi$ that is supported near $P$ and equals $1$ near $P$, so $\partial(f) = (1-\phi)(P)\,\partial(f) + f(P)\,\partial(1-\phi) = 0$ by the Leibniz rule.
– user30180, Dec 29 '12 at 0:59

I think that @ayanta's answer is the one that is closest to what I was aiming at (it is possible that other answers are also good, but I lack the knowledge to understand them). I only wonder whether the Leibniz rule is the only condition forcing linear operators to depend only on first-order behavior, or whether there are other equivalent conditions.
– R S, Jan 1 '13 at 5:20

7 Answers

So I hope you're convinced that a directional derivative should satisfy the product rule. In that case, it's fine to postulate that a directional derivative must satisfy the product rule. The question then is: Why do we assume the Leibniz rule (+linearity) and nothing else?

I would suggest the answer is simply that it gives you the correct result. I.e., you might think that there are too many functionals* that satisfy the Leibniz rule, and we have to impose some other condition to ensure that our functionals are directional derivative operators. But one can prove that the set of functionals that satisfy the Leibniz rule has dimension equal to the dimension of the manifold, so you can be confident that you don't have too many.

Here's a different definition of the tangent space that I find more intuitive. If you have a point $P$ on your manifold $M$ and a smooth curve $\gamma:\mathbb{R} \to M$ such that $\gamma(0)=P$, then this curve defines a functional on functions around $P$. I.e., if $f$ is a smooth real-valued function defined in a neighborhood of $P$, then we get $f \circ \gamma:\mathbb{R} \to \mathbb{R}$. We can then differentiate this function at $0$ to get a number, which we denote $r_\gamma(f)$.

We can then define the tangent space to be the set of linear functionals (on the space of smooth functions on $M$ in a neighborhood of $P$) that arise from smooth paths, as above (i.e. the set of functionals of the form $r_\gamma$ for some $\gamma$). Notice that this purposely is not the set of paths; two paths give rise to the same functional iff they're tangent to each other. This is an equally valid way of defining the tangent space.
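As a sanity check on "two paths give rise to the same functional iff they're tangent to each other", here is a small sympy sketch (taking $M = \mathbb{R}^2$, with sample curves and a sample function of my own choosing) showing that two different curves through $P$ with the same tangent vector define the same functional $r_\gamma$:

```python
import sympy as sp

t, x, y = sp.symbols('t x y')
f = sp.exp(x)*sp.sin(y) + x*y  # a sample smooth function on M = R^2

def r(gamma, f):
    """r_gamma(f) = d/dt f(gamma(t)), evaluated at t = 0."""
    pulled_back = f.subs({x: gamma[0], y: gamma[1]}, simultaneous=True)
    return sp.diff(pulled_back, t).subs(t, 0)

# Two different curves through P = (1, 0), both with tangent vector (1, 2) at t = 0
gamma1 = (1 + t, 2*t)
gamma2 = (sp.exp(t), sp.sin(2*t))

# Tangent curves define the same functional
assert sp.simplify(r(gamma1, f) - r(gamma2, f)) == 0
```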

The reason the Leibniz rule suffices is then the fact that every functional that satisfies the Leibniz rule actually arises as the directional derivative with respect to some smooth path.

In fact, you can ignore functionals entirely. You can define the tangent space to be the set of smooth paths through $P$ modulo the following equivalence relation. Two paths are equivalent if in some coordinate patch around $P$, they have the same tangent vector (in the classical sense). Note that this is independent of the coordinate patch, so it's well-defined. The only reason we cannot just take tangent vectors in some coordinate patch is that this is not coordinate-independent, whereas these equivalence classes of paths are.

It just so happens that two paths are equivalent in the sense I just described iff they define the same functional on the space of smooth functions on $M$, which is why we can use functionals to define the tangent space, and this is the same as the definition above. But I find the definition in terms of equivalence classes of paths to be the most natural.

*By a functional, I mean a linear map from the vector space of smooth real-valued functions defined in a neighborhood of $P$ to $\mathbb{R}$ that depends only on the values of the function in a neighborhood of $P$.

This is only a good definition in situations where you expect tangent vectors to actually extend to paths in some reasonable sense. In algebraic geometry, for example, this is often false.
– Qiaochu Yuan, Dec 28 '12 at 1:53


+1 for using tangency classes of paths instead of derivations as the definition of tangent vectors. Personally I think it makes calculations easier, but that's a subjective judgement, not a mathematical one.
– Michael Murray, Dec 28 '12 at 4:15

This is a great answer. I would only add that if you don't like thinking about the collection of all smooth paths, you can simply think of vectors based at a point in any coordinate chart. Any such vector gives a directional derivative you can apply to any smooth function at the point, and you consider two such vectors to be the same if their directional derivatives happen to agree on all functions. This turns out to be equivalent to saying that the (total) derivative of the change-of-coordinates diffeomorphism between the two charts maps one vector to the other.
– Eric Wofsey, Dec 28 '12 at 4:53

For the sake of reference: W. Kühnel provides in Differential Geometry - Curves, Surfaces, Manifolds (in 5B) three possible ways to define the tangent space. The one given here is one of them.
– Dror Atariah, Dec 31 '12 at 10:00

Let $A$ be an algebra over a field $k$, such as $C^{\infty}(M)$ for $k = \mathbb{R}$. You should think of "points" as meaning $k$-algebra homomorphisms $A \to k$ and "one-parameter families of points" as meaning $k$-algebra homomorphisms $A \to k[t]$ (at least in a more algebraic setting). The intuitive meaning of "tangent vector" is "infinitesimal one-parameter family of points," and algebraically this means a morphism $A \to k[t]/t^2$. The Leibniz rule is equivalent to the statement that this homomorphism preserves products.
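This can be made concrete with a few lines of sympy (the sample functions and base point below are my own choices): the map $f \mapsto f(a) + f'(a)t$ into $\mathbb{R}[t]/t^2$ is multiplicative precisely because of the Leibniz rule.

```python
import sympy as sp

x, t = sp.symbols('x t')

def to_dual(f, a):
    """The map C^inf(R) -> R[t]/t^2 attached to d/dx at a: f |-> f(a) + f'(a)*t."""
    return f.subs(x, a) + sp.diff(f, x).subs(x, a)*t

f, g, a = sp.sin(x), x**3 + 1, sp.Rational(1, 2)  # sample functions and base point

# Multiplicativity modulo t^2 is exactly the Leibniz rule:
# to_dual(f*g) = (fg)(a) + (fg)'(a) t, while the product of images has
# linear t-coefficient f(a)g'(a) + g(a)f'(a) once t^2 is set to 0.
lhs = to_dual(f*g, a)
rhs = sp.expand(to_dual(f, a)*to_dual(g, a)).subs(t**2, 0)
assert sp.simplify(lhs - rhs) == 0
```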

The more fundamental property is really the chain rule, but note that linearity and the Leibniz rule are equivalent to the chain rule for polynomials, and in an algebraic setting polynomials are the only things available. In a less algebraic setting, e.g. smooth manifolds, it's actually more natural to require the chain rule for all smooth functions; this is closely related to the idea that $C^{\infty}(M)$ is not really an algebra but a smooth algebra.

The question is about differential geometry, by someone just getting started. How is introducing the consideration of $\mathbf{R}$-algebra homomorphisms from $C^{\infty}(M)$ into $\mathbf{R}[t]$ supposed to illuminate the situation for such a person?
– user30180, Dec 28 '12 at 2:17


@ayanta: it is a good example of a powerful general pattern in mathematics, roughly encapsulated by the slogan "algebra is dual to geometry." Most people encounter it first in algebraic geometry but I think the realization that it can be used to make sense of certain parts of differential geometry is also valuable. If you disagree, well, you can write your own answer. It is okay to have different answers with different points of view, and answers are not only for the person who asked the question.
– Qiaochu Yuan, Dec 28 '12 at 3:48


@Qiaochu: I agree that "algebra is dual to geometry" is a useful principle for those with enough experience to appreciate its non-formal aspects, but I also think that answers should (try to) be understandable to the person who asked the question (even if partially aimed at a wider audience).
– user30180, Dec 28 '12 at 5:02

One way to think about derivations is as infinitesimal automorphisms of the ring of functions: if $T$ is an operator on functions that is very close to the identity, write $Tf = f + Df$; if $T$ preserves multiplication as well as addition, then $T(fg) = (Tf)(Tg)$ gives $D(fg) = (Df)g + f(Dg)$ up to the higher-order term $(Df)(Dg)$.

You don't need to assume the Leibniz rule, but can use a linearity property instead. I addressed this in comments to my answer at Different ways of thinking about the derivative, but this may be a good place to paste those comments together into a single answer.

The set of all derivations at a point $P$ on a manifold can be thought of as the dual space of $M/M^2$, where $M$ is the maximal ideal of smooth functions vanishing at $P$. That description of the tangent space at $P$, as a dual space to $M/M^2$, makes no mention at all about the product rule, but from the special features of the manifold setting the linearity that is built into the definition of a dual space can indeed be converted into a derivation (something linear with a product rule).

Here's a general setup. Let $A$ be a local ring with maximal ideal $M$ and set $F = A/M$. (Think $A$ = local ring of smooth functions at point $P$ on a manifold and $M$ = the smooth functions on the manifold that vanish at $P$, so $F$ is essentially the field of real numbers as constant terms of series expansions for functions at $P$.) Let's now make an assumption that is true for local rings arising in the setting of manifolds but is not true for all local rings: the field $F = A/M$ naturally lies inside the ring $A$. More precisely, let's suppose there is a homomorphism $F \rightarrow A$ such that the composite $F \rightarrow A \rightarrow A/M$ is the identity. (For rings of functions at a point on a manifold this is very natural since the real numbers sit inside $A$ as constant functions. But not every local ring naturally contains its residue field, such as a local ring of char. 0 whose residue field has characteristic $p$.) We will show with this assumption that any $F$-linear map $f \colon M/M^2 \rightarrow F$ can be turned into a derivation $A \rightarrow F$.

From the assumption we get a direct sum decomposition $A = F \oplus M$. This direct sum decomposition is obvious geometrically for a smooth function $h(x)$ around a point $a$, since it comes from writing $h(x) = h(a) + (h(x) - h(a))$, with $h(a)$ a real number and $h(x) - h(a)$ vanishing at $a$. OK, now let $f$ be any element of the dual space of $M/M^2$, i.e., $f \colon M/M^2 \rightarrow F$ is an $F$-linear map (since $F = A/M$, $M/M^2$ is naturally an $F$-vector space). Saying the function $f$ is $F$-linear is the same as saying it's $A$-linear since both $M/M^2$ and $F = A/M$ are $A$-modules and in both cases $M$ multiplies on them like 0. Since $M$ is an $A$-module (it's an ideal in $A$!), an $A$-linear map $f \colon M/M^2 \rightarrow F$ can be pulled back to an $A$-linear map $M \rightarrow F$ which is 0 on $M^2$. Let us extend $f$ to a map $A \rightarrow F$ by just declaring it to act on the direct sum $A = F \oplus M$ as 0 on the $F$-summand. That is, we simply define
$$f(c+m) = f(m)$$ for any $c$ in $F$ (thought of as a subfield of $A$) and $m$ in $M$.
We are calling this extended function $f$ as well, so we have turned $f \colon M/M^2 \rightarrow F$ into a function $A \rightarrow F$ still denoted $f$.

Now we have a function $f \colon A \rightarrow F$ that is $A$-linear and it is 0 on both $F$ and $M^2$. It's time to see that $f$ satisfies the product rule! For $a$ and $b$ in $A$, write $a = c + m$ and $b = d + n$ where $c, d$ are in $F$ and $m, n$ are in $M$. Then $f(a) = f(m)$ and $f(b) = f(n)$. We have $$f(ab) = f((c+m)(d+n)) = f(cd+cn+dm + mn) = f(cd) + f(cn) + f(dm) + f(mn).$$ Since $f$ is 0 on $F$ and $M^2$, we get $f(ab) = f(cn) + f(dm) = cf(n) + df(m) = cf(b) + df(a)$.
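This computation can be seen in miniature with sympy, modeling $A$ by polynomials at the origin (so $M$ consists of the polynomials with zero constant term and $M^2$ of those starting in degree $\ge 2$; the functional's values $c_1, c_2$ on the coordinate classes are arbitrary choices of mine):

```python
import sympy as sp

x, y = sp.symbols('x y')
c1, c2 = 5, -2  # an arbitrary F-linear functional on M/M^2: x |-> c1, y |-> c2

def f(a):
    """Extend the functional to A = F (+) M: kill the constant term (the F-summand)
    and everything in M^2, keeping only the linear part of the polynomial."""
    p = sp.Poly(sp.expand(a), x, y)
    return c1*p.coeff_monomial(x) + c2*p.coeff_monomial(y)

a = 3 + x - 4*y + x*y   # a = c + m with constant term c = 3
b = -1 + 2*x + y**2     # b = d + n with constant term d = -1

# The extended functional automatically satisfies the product rule:
aP, bP = 3, -1          # "values at the point", i.e. the constant terms
assert f(a*b) == aP*f(b) + bP*f(a)
```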

Summary: Let $A$ be a local ring with maximal ideal $M$ and assume there is an embedding $A/M \rightarrow A$ such that the composite $A/M \rightarrow A \rightarrow A/M$ is the identity map (first map is hypothetical embedding, second is reduction mod $M$). Then any $A/M$-linear map $M/M^2 \rightarrow A/M$ can be turned into an $A$-linear derivation $A \rightarrow A/M$. The essential feature we have on manifolds is that for the local ring $A$ of smooth functions at any point, the residue field $A/M$ naturally lives inside $A$. This is because the residue field of the local ring at any point is isomorphic to ${\mathbf R}$, and we can naturally think of ${\mathbf R}$ inside the local ring at the point as the constant functions at that point.

The assumption that the residue field of a local ring has an embedding back into the local ring and this embedding is a section to the natural reduction homomorphism $A \rightarrow A/M$ could be said in a more down-to-earth way: there is a subfield of $A$ that is a set of representatives for the residue field $A/M$. That is, there's some field $K$ inside $A$ such that the natural map $A \rightarrow A/M$ is onto when we restrict it to $K$. If that's the case, the map $K \rightarrow A/M$ is onto and it's automatically one-to-one since any ring hom. out of a field is one-to-one. Thus $K$ is isom. to $A/M$ so the inverse isomorphism $A/M \rightarrow K$ is a section to $A \rightarrow A/M$.

In the setting of manifolds, $A/M$ is the real numbers and $K$ is always the set of constant functions on a neighborhood of the point of interest, so $K$ is a copy of $A/M$ sitting inside $A$.

Introducing local rings and $M^2$ is too abstract. Your basic point can be said more simply: any $\mathbf{R}$-linear assignment $\partial \colon f \mapsto \partial(f) \in \mathbf{R}$ on smooth functions $f$ near $P$ which kills constants and depends on $f$ only "to first order at $P$" (i.e., kills any $f$ whose Taylor series at $P$ in some, hence any, local coordinate system begins in degree $\ge 2$), as we'd expect of "generalized directional derivatives", is necessarily a directional derivative in local coordinates and so satisfies the Leibniz rule at $P$. But this doesn't answer the question!
– user30180, Dec 28 '12 at 3:51

The derivation property is the simplest possible property; all others are elaborations. Let $M$ be a smooth manifold. Then we have, in the category of commutative $\mathbb{R}$-algebras with unit (see p. 296 of the reference below for more details):

1. $Hom(C^\infty(M,\mathbb R),\mathbb R) = M$.

2. $Hom(C^\infty(M,\mathbb R),\mathbb D) = TM$, where $\mathbb D$ (generated by $1,\epsilon$ with $\epsilon^2=0$) is the algebra of study numbers.

3. $Hom(C^\infty(M,\mathbb R),\mathbb D\otimes \mathbb D) = T(TM)$.

For any finite dimensional commutative $\mathbb R$-algebra $A$ we have:
$Hom(C^\infty(M,\mathbb R),A) = Hom(C^\infty(M,\mathbb R),W(A))$ is a smooth manifold and describes a product preserving covariant endofunctor $T_{W(A)}$ on the category of smooth manifolds and smooth mappings. Here $W(A)$ is the subsalgebra of $A$ generated by all idempotent and nilpotent elements. Up to some covering space phenomena all product preserving functors are of this form. $W(A)$ is called the Weil algebra of $A$.

In item 2 you write "algebra of study numbers". That should be "algebra of dual numbers".
– KConrad, Dec 28 '12 at 18:48


@KConrad: Eduard Study introduced a variant of the quaternions in which the dual numbers appear; for this reason, in older literature one does see the dual numbers called "Study numbers".
– user30180, Dec 29 '12 at 0:44

If this is your first time doing differential geometry you should do the calculation that ayanta is referring to in their comment. Fix a point $x \in M$ and let $X \colon C^\infty(M) \to \mathbb{R}$ be a linear map satisfying the Leibniz rule at $x$,
$$
X(fg) = X(f) g(x) + f(x) X(g)
$$
for all $f, g \in C^\infty(M)$. Show that if $\psi = (\psi^1, \dots, \psi^n)$ are co-ordinates in a neighbourhood of $x$ then there are real numbers $X^1, \dots, X^n$ such that for all $f \in C^\infty(M)$ we have
$$
X(f) = \sum_{i=1}^n X^i \frac{\partial (f \circ \psi^{-1})}{\partial \psi^i} (\psi(x)) .
$$
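The key step in this exercise is Hadamard's lemma: in a chart around $x$ one can write $f(p) = f(x) + \sum_i (p_i - x_i)\,g_i(p)$ with $g_i(x) = \partial f/\partial x^i(x)$, so applying $X$ and using linearity plus Leibniz gives the displayed formula with $X^i = X(\psi^i)$. Here is a sympy check of the lemma for a sample polynomial (my own choice) at the origin of $\mathbb{R}^2$:

```python
import sympy as sp

s, x1, x2 = sp.symbols('s x1 x2')
f = x1**3 + 2*x1*x2 + x2**2 + x1 + 5  # a sample smooth (polynomial) function
P = {x1: 0, x2: 0}                    # base point x = (0, 0)

# Hadamard's construction: g_i(p) = int_0^1 (d_i f)(x + s(p - x)) ds
g = [sp.integrate(sp.diff(f, v).subs({x1: s*x1, x2: s*x2}, simultaneous=True), (s, 0, 1))
     for v in (x1, x2)]

# f(p) = f(x) + sum_i p_i g_i(p), and g_i(x) = (d_i f)(x)
assert sp.expand(x1*g[0] + x2*g[1] - (f - f.subs(P))) == 0
for v, gi in zip((x1, x2), g):
    assert sp.simplify(gi.subs(P) - sp.diff(f, v).subs(P)) == 0
```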

It is also a good exercise to show that for the same $X$ there is a smooth curve $\gamma \colon (-\epsilon, \epsilon) \to M$ such that $\gamma(0) = x$ and for any $f \in C^\infty(M)$ we have
$$
X(f) = \frac{ d (f \circ \gamma) }{ dt}(0).
$$
This connects you to tangent vectors thought of as equivalence classes of curves as Davidac897 discusses.
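For this second exercise, the straight-line curve $\gamma(t) = x + tv$ in a chart already does the job. A sympy sketch (with an arbitrary direction and a sample function of my own choosing, on $\mathbb{R}^2$):

```python
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')
v = (3, -1)                 # hypothetical components X^1, X^2 of the derivation
f = sp.exp(x1) + x1*x2**2   # a sample smooth function; base point x = (0, 0)

# The straight-line curve gamma(t) = x + t*v realizes X:
pullback = f.subs({x1: v[0]*t, x2: v[1]*t}, simultaneous=True)
lhs = sp.diff(pullback, t).subs(t, 0)
rhs = sum(vi*sp.diff(f, xi).subs({x1: 0, x2: 0}) for vi, xi in zip(v, (x1, x2)))
assert lhs == rhs  # d/dt f(gamma(t))|_0 equals sum_i X^i (d_i f)(x)
```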

I just want to quickly mention a connection (sorry about the pun) that hasn't been mentioned yet: the product rule may be thought of as being equivalent to integration by parts, which in turn allows us to extend the notion of differentiation to larger function spaces (e.g., in the analysis of PDEs and the theory of distributions therein).
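A quick sympy illustration of the integration-by-parts identity $\int f'\varphi = -\int f\varphi'$ underlying weak derivatives, valid whenever the test function $\varphi$ vanishes at the boundary (the particular $f$ and $\varphi$ here are my own toy choices, with $[-1,1]$ standing in for compact support):

```python
import sympy as sp

x = sp.symbols('x')
f = x**3 + x**2            # a sample function
phi = (1 - x**2)**2        # "test function": vanishes at the endpoints of [-1, 1]

# Integration by parts with no boundary term: int f' phi = - int f phi'
lhs = sp.integrate(sp.diff(f, x)*phi, (x, -1, 1))
rhs = -sp.integrate(f*sp.diff(phi, x), (x, -1, 1))
assert lhs == rhs
```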

I should also note that defining the tangent space / differentiation via curves is equivalent to defining them via charts and atlases.

Finally, linearity + Leibniz's (product) rule yield all the basic properties we require for differentiation (and actually there aren't any other properties that are "left over"!). See Ben Crowell's comment and links therein.