I suppose this question can be interpreted in two ways. It is often the case that two or more equivalent (but not necessarily semantically equivalent) definitions of the same idea/object are used in practice. Are there examples of equivalent definitions where one is more natural or intuitive? (I mean so much more intuitive that the preference is not merely subjective.)

Alternatively, what common examples are there in standard lecture courses where a particular symbolic definition obscures the concept being conveyed?

This definition of the product topology is really not so bad when you correct the typos and translate it to words: it says every point of X × Y should have an open neighborhood that's a product of open sets in X and in Y. What's wrong with that?
–
Jonathan Wise, Dec 2 '09 at 16:52

10

Well, the natural generalization of that definition is the box topology, whereas the natural generalization of Daniel's definition is the (categorical) product topology.
–
Qiaochu Yuan, Dec 2 '09 at 17:07

9

My second comment: (2) The definition in terms of open sets is spiritually a construction, not a definition. It may be described as "a construction in terms of open sets that works only for finite products". The definition in terms of coarsest topology is a genuine definition, and is generally accepted as the correct definition, but it doesn't give you a construction. The genuine definition gives you much more intuition about the product, but sometimes you need a construction. Some of my fellow category theorists regard that bit about needing a construction as a heresy.
–
SixWingedSeraph, Dec 2 '09 at 17:24

7

The definition in terms of a coarsest topology gives you a perfectly valid construction: take the inverse image of every open set.
–
Qiaochu Yuan, Dec 2 '09 at 17:28

7

This general point about definitions needs to be made: The definition is intended to give a (more or less) minimal technical description of the concept that implies all true theorems about the concept and nothing else. It doesn't matter if the definition emphasizes technical aspects and doesn't mention some big intuitive ideas about it. That's not what definitions are for. A teacher should provide many ways to think about the concept, some of which might constitute definitions.
–
SixWingedSeraph, Dec 5 '09 at 1:35

23 Answers
23

Many topics in linear algebra suffer from the issue in the
question. For example:

In linear algebra, one often sees the determinant of a
matrix defined by some ungodly formula, often even with
special diagrams and mnemonics given for how to compute it
in the 3x3 case, say.

det(A) = some horrible mess of a formula

Even relatively sophisticated people will insist that
det(A) is the sum over permutations, etc. with a sign for
the parity, etc. Students trapped in this way of thinking
do not understand the determinant.

The right definition is that det(A) is the volume of the
image of the unit cube after applying the transformation
determined by A. From this alone, everything follows. One
sees immediately the importance of det(A)=0, the reason why
elementary operations have the corresponding determinant,
why diagonal and triangular matrices have their
determinants.
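To make the volume claim concrete, here is a quick numerical sketch of my own (in Python with NumPy; the matrix is an arbitrary example): the image of the unit square under $A$ is the parallelogram spanned by the columns of $A$, and its signed area is exactly det(A).

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.5, 3.0]])

# Columns of A are the images of the standard basis vectors e1, e2.
u, v = A[:, 0], A[:, 1]

# Signed area of the parallelogram spanned by u and v (2D cross product).
signed_area = u[0] * v[1] - u[1] * v[0]

print(signed_area)          # 5.5
print(np.linalg.det(A))     # 5.5 (up to floating point)

# Swapping the columns reverses orientation, so the sign flips:
signed_area_swapped = v[0] * u[1] - v[1] * u[0]
print(signed_area_swapped)  # -5.5
```

The same picture explains det(A) = 0: the columns are then linearly dependent, the parallelogram is degenerate, and the image has zero area.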

Even matrix multiplication, if defined by the usual
formula, seems arbitrary and even crazy, without some
background understanding of why the definition is that way.

The larger point here is that although the question asked about having a single wrong definition, really the problem is that a limiting perspective can infect one's entire approach to a subject. Theorems, questions, exercises, and examples, as well as definitions, can all come from an incorrect view of a subject!

Too often, (undergraduate) linear algebra is taught as a
subject about static objects---matrices sitting there,
having complicated formulas associated with them and
complex procedures carried out with them, often for no
immediately discernible reason. From this perspective, many
matrix rules seem completely arbitrary.

The right way to teach and to understand linear algebra is as a fully dynamic
subject. The purpose is to understand transformations of
space. It is exciting! We want to stretch space, skew it,
reflect it, rotate it around. How can we represent these
transformations? If they are linear, then we are led to
consider the action on unit basis vectors, so we are led
naturally to matrices. Multiplying matrices should mean
composing the transformations, and from this one derives
the multiplication rules. All the usual topics in
elementary linear algebra have deep connection with
essentially geometric concepts connected with the
corresponding transformations.
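The claim that multiplying matrices should mean composing transformations can be checked directly; here is a small sketch of my own (the rotation and stretch are arbitrary choices):

```python
import numpy as np

# Two transformations of the plane: a 90-degree rotation and a horizontal stretch.
theta = np.pi / 2
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])   # rotate by 90 degrees
S = np.array([[2.0, 0.0],
              [0.0, 1.0]])                         # stretch x by 2

x = np.array([1.0, 1.0])

# "First stretch, then rotate" as function composition...
composed = R @ (S @ x)
# ...equals applying the single matrix R @ S:
product = (R @ S) @ x

print(np.allclose(composed, product))  # True
```

From this point of view the multiplication rule is forced: the columns of R @ S must be the images under R of the columns of S.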

"Even relatively sophisticated people will insist that det(A) is the sum over permutations..." Yes indeed. How else do you prove that SL(n) is an algebraic group? How you want to think of a determinant depends on the situation.
–
JS Milne, Dec 9 '09 at 6:34

33

Of course you are right, and perhaps my post is a bit of a rant! I apologize. (But surely it is implicit in the question that all the equivalent formulations of a definition might find a suitable usage.) My point is that in an undergraduate linear algebra class, a computational approach to the determinant obscures its fundamental geometric meaning as a measure of volume inflation. The permutation sum definition is especially curious in an undergraduate course, because the method is not feasible (exponential time), whereas other methods, such as the LU decomposition, are polynomial time.
–
Joel David Hamkins, Dec 10 '09 at 14:09

8

Victor, of course I mean the signed volume. And I don't think it is so difficult. One can even treat it axiomatically: inflating one dimension by a factor multiplies the volume by that factor; swapping two coordinates reverses orientation; skews do not change volume. From these principles one can derive the usual formulas, while also providing a feasible means to compute it.
–
Joel David Hamkins, Jun 2 '10 at 21:42

20

Another way to say it "axiomatically" is that the determinant is the induced endomorphism of the top exterior power of the vector space. Of course, it probably shouldn't be defined that way in a first linear algebra course!
–
Mike Shulman, Jul 16 '10 at 3:14

8

This reminds me of a story recounted by a friend of mine in graduate school. He spent a lot of time in the department, and one evening was approached by an undergraduate taking a fancy class that had introduced the trace of a linear transformation in the slick coordinate-free manner. This undergraduate had been tasked with computing the trace of a certain $2\times 2$ matrix and had no idea how to proceed.
–
Ramsey, Apr 25 '12 at 4:58

Here's another algebra peeve of mine. The definition of a normal subgroup in terms of conjugation is pretty strange until it's explained that normal subgroups are the ones you can quotient by. In my opinion, normal subgroups should be introduced as kernels of homomorphisms from the get-go.

I totally agree with this and always tell students to think of "kernel of some homomorphism" as the definition and "closed under conjugation by any element of G" as a fact that can be shown to be equivalent to it.
–
gowers, Dec 5 '09 at 22:35

23

I agree that as soon as you define "normal subgroup" you should prove that they are exactly the kernels of homomorphisms, but in some situations (e.g., algebraic groups) it's hard to show that normal subgroups are kernels and in other situations (e.g., group schemes) they aren't.
–
JS Milne, Dec 9 '09 at 6:21

6

Let us also remember that homomorphisms gained a foothold more than a century after normal subgroups. You need the idea of an abstract group in order for the quotients and homomorphism theorems to make sense (which is also the metamathematical reason behind the difficulties mentioned by JS Milne).
–
Victor Protsak, May 28 '10 at 1:27

In my experience, introductory algebra courses never bother to clarify the difference between the direct sum and the direct product. They're the same for a finite collection of abelian groups, which in my opinion makes things confusing.

Of course, they're quite different for infinite collections. I think students should be taught sooner rather than later that the first is the coproduct and the second is the product in $\text{Ab}$. This clarifies the constructions for non-abelian groups as well, since the direct product remains a product in $\text{Grp}$ but the coproduct is very different!
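One concrete way to see the infinite-case difference is to model it; the following is a toy illustration of my own, not a standard construction:

```python
# Model: an element of the direct product of countably many copies of Z is
# an arbitrary function N -> Z; an element of the direct sum is one with
# finite support, represented here as a dict of its nonzero slots.

all_ones = lambda n: 1   # in the direct product, but NOT in the direct sum
                         # (its support is all of N, which is infinite)

x = {0: 5, 3: -2}        # finitely supported: an element of the direct sum
y = {3: 7, 10: 1}

def add(a, b):
    """Componentwise sum; finite support is preserved, so the direct
    sum really is a subgroup of the direct product."""
    total = {k: a.get(k, 0) + b.get(k, 0) for k in set(a) | set(b)}
    return {k: v for k, v in total.items() if v != 0}

print(sorted(add(x, y).items()))  # [(0, 5), (3, 5), (10, 1)]
```

For finitely many factors the two models coincide, which is exactly why the distinction is invisible in a first course.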

I do remember having to explain the difference to many confused people. It is quite confusing to see lecturers use the two interchangeably without mention.
–
Sam Derbyshire, Dec 2 '09 at 17:29

6

I'll vouch for this as a guinea pig: I did learn them as separate concepts my freshman year, and learned to be careful even in the case of two spaces. That gave me a good intuition about when to mistrust products defined only in the finite case (e.g. the box topology, above) because they "could be defined differently" in the infinite case. I didn't need to know the categorical definitions that early on.
–
Elizabeth S. Q. Goodman, Dec 4 '09 at 5:30

10

even in the finite case, products and coproducts differ because they are NOT only the objects, but come together with structure morphisms (as with every universal object).
–
Martin Brandenburg, May 24 '10 at 21:45

1

Indeed I think one has to grasp the distinction between direct sum and direct product to truly appreciate their isomorphism in most cases.
–
darij grinberg, May 28 '10 at 10:05

Not to mention that in $Grp$ the direct sum and the coproduct are also two different things. (For finitely many summands, the direct sum is again the same as the direct product; for infinitely many summands, the direct sum is neither the product nor the coproduct, although it still has a more colimitish flavour.)
–
Toby Bartels, Apr 4 '11 at 4:22

I increasingly abhor the introduction of the finite ring $Z_n$ not as $\mathbb{Z}/n\mathbb{Z}$ but as the set $\{0,\ldots,n-1\}$ with "clock arithmetic". (I understand that if you want to introduce modular arithmetic at the high school level or below, this is the way to go. I am talking about undergraduate abstract algebra textbooks that introduce the concept in this way.)

Two problems:

1) Using clocks to motivate addition modulo $n$: excellent pedagogy. Be sure to mention military time, which goes from $0$ to $23$ instead of $1$ to $12$ twice. But...using clocks to motivate multiplication modulo $n$: WTF? Time squared?? Mod $24$??? It's the worst kind of pedagogy: something that sounds like it should make sense but actually doesn't.

Of course soon enough you stop clowning around and explain that you just want to add/subtract/multiply the numbers and take the remainder mod $n$. This brings me to:

2) Many texts define $Z_n$ as the set $\{0,\ldots,n-1\}$ and endow it with addition and multiplication by taking the remainder mod $n$. Then they say that this gives a ring. Now why is that? For instance, why are addition and multiplication associative operations? If you think about this for a little while, you will find that all explanations must pass through the fact that $\mathbb{Z}$ is a ring under the usual addition and multiplication and the operations on $Z_n$ are induced from those on $\mathbb{Z}$ by passing to the quotient. You don't, of course, have to use these exact words, but I do not see how you can avoid using these concepts. Thus you should be peddling the homomorphism concept from the very beginning.

As a corollary, I'm saying: the concept of a finite ring $Z_n$ for some generic $n$ is more logically complex than that of the one infinite ring $\mathbb{Z}$ (that rules them all?). A lot of people seem, implicitly, to think that the opposite is true.
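The well-definedness point in (2) can be checked concretely; here is a small sketch of my own (`same_class` is a name I made up for the illustration):

```python
# The ring operations on Z_n are induced from Z: associativity etc. hold
# because they hold in Z and the quotient map a -> a mod n respects the
# operations. The check below is what "well defined" means in practice.
n = 12

def same_class(a, b):
    """True iff a and b represent the same residue class mod n."""
    return (a - b) % n == 0

# Pick two different representatives of the same classes mod 12:
a1, a2 = 7, 7 + 5 * n     # both represent [7]
b1, b2 = 10, 10 - 3 * n   # both represent [10]

# Products of different representatives land in the same class,
# which is exactly what makes multiplication on Z_n well defined:
print(same_class(a1 * b1, a2 * b2))  # True
print((a1 * b1) % n)                 # 10, i.e. [7]*[10] = [10] in Z_12
```

Note that the verification passes through arithmetic in $\mathbb{Z}$, which is the point: the quotient construction is doing the work.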

While I agree with the premise here, you can motivate multiplication mod n by performing $a$ tasks which each require $b$ hours...what time will it be at the end? Of course, this is really doing (integer) * (residue) and not (residue)*(residue), but then you have them observe that if you do your task $n$ times, $b$ is irrelevant, and remarkably, all that matters is how many times you perform the task mod n!!
–
Cam McLeman, May 27 '10 at 16:52

12

Also, it's probably worth noting that strictly speaking addition on a clock doesn't make much sense either; an actual clock is not the group $Z_n$, but rather the action of that group on itself.
–
Harry Altman, May 27 '10 at 22:09

10

@Harry I'm glad someone said it! Times form an affine space. You don't add times. "3 o'clock plus 4 o'clock" means nothing. The thing you add are time intervals. Time intervals are measured by stopwatches. Stopwatches with hands generally don't wrap at 12 or 24 hours.
–
Dan Piponi, Jul 15 '10 at 22:34

I have seen students who must have been exposed to the introduction to $Z_n$ that Pete warns of and who think that they are specifying a $\mathbb{Z}$-module homomorphism $\{0,1,2\}\rightarrow \{0,1,2,3,4,5\}$ by setting $i\mapsto i$. This to me is the ultimate reason to avoid introducing $Z_n$ as the set $\{0,\ldots,n-1\}$.
–
Alex B., Oct 23 '10 at 16:07

One that I particularly dislike is the definition of an action of a group G on a set X as being a function $f:G\times X\rightarrow X$ that satisfies certain properties. I cannot understand why anybody gives this definition when "homomorphism from G to the group of permutations of X" is not only easier to understand but is also how one thinks about group actions later.
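The two definitions are related by currying; here is a toy sketch of my own with $G = \mathbb{Z}_3$ acting on a three-element set:

```python
# The cyclic group Z_3 acting on X = {0, 1, 2} by rotation, in both forms.

X = [0, 1, 2]

# Form 1: a function f : G x X -> X satisfying the action axioms.
def f(g, x):
    return (x + g) % 3

# Form 2: curry f into a map g -> (permutation of X), i.e. a homomorphism
# from Z_3 to the symmetric group on X. perm(g) is the tuple of images.
def perm(g):
    return tuple(f(g, x) for x in X)

# The homomorphism property perm(g + h) = perm(g) o perm(h) is exactly
# the action axiom f(g, f(h, x)) = f(g + h, x):
g, h = 1, 2
composed = tuple(perm(g)[perm(h)[x]] for x in X)
print(composed == perm((g + h) % 3))  # True
```

Whichever form one prefers, the translation between them is mechanical, which is perhaps why both survive in the literature.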

Why? They are really different forms of the same definition, and so might as well be given together. In some situations only the $G\times X\to X$ definition makes sense, for example, for an algebraic group G acting on a variety X (the automorphism group of X isn't an algebraic group). In other situations it's easier: when $G$ and $X$ have topologies, it's easier to say that $G\times X\to X$ is continuous than to first define a topology on Aut(X).
–
JS Milne, Dec 9 '09 at 6:29

7

@gowers: Interesting, I think of the "$f$" version as the more natural of the two. What's an action? You take a $g\in G$ and an $x\in X$, and you get a new $x'\in X$. That's precisely encoded by f. Taking in a $g\in G$ and outputting "a function which sends $x$'s to $x'$'s" seems to me to obfuscate the matter.
–
Cam McLeman, May 27 '10 at 16:41

4

Also the definition of a torsor via $G \times X \rightarrow X \times X$ needs the version you dislike...
–
Peter Arndt, May 27 '10 at 19:28

14

@ David: What's the difference between ‘you can multiply by scalars in a ring’ and ‘with a map $R \times M \to M$’?
–
Toby Bartels, Apr 4 '11 at 4:33

3

One place that you must think about it this way is in the theory of Poisson actions. There, $X$ is a Poisson manifold, and you could consider the group $Aut(X)$ of ichthyomorphisms of it, but unless $G$ has a trivial Poisson structure the action of $G$ on $X$ is not by ichthyomorphisms. That is, each $g\in G$ does not preserve $X$'s Poisson structure. This is reflected in the fact that $\{g\}\times X$ is usually not a Poisson submanifold of $G\times X$. All that one has is the map $G\times X \to X$.
–
Allen Knutson, Mar 15 '12 at 13:57

One of my biggest annoyances is professors or books which fail to adequately distinguish between prime and irreducible elements of a ring, Herstein if I remember correctly being a (ha ha) prime example of this. The fact that these are the same in Z, where people first learn about unique factorization, doesn't help matters.
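The standard example showing that the two notions really differ is $\mathbb{Z}[\sqrt{-5}]$, where $2$ is irreducible but not prime: it divides $6 = (1+\sqrt{-5})(1-\sqrt{-5})$ without dividing either factor. Here is a small computational sketch of my own, with elements $a+b\sqrt{-5}$ modeled as pairs:

```python
def mul(u, v):
    """Multiply a + b*sqrt(-5) by c + d*sqrt(-5)."""
    a, b = u
    c, d = v
    return (a * c - 5 * b * d, a * d + b * c)

def norm(u):
    """N(a + b*sqrt(-5)) = a^2 + 5b^2; multiplicative: N(uv) = N(u)N(v)."""
    a, b = u
    return a * a + 5 * b * b

p = (1, 1)    # 1 + sqrt(-5)
q = (1, -1)   # 1 - sqrt(-5)
print(mul(p, q))  # (6, 0): the product is 6, which 2 divides

# But 2 divides neither factor, since (1 +- sqrt(-5))/2 is not in the ring.
# And 2 is irreducible: N(2) = 4, and no element has norm 2
# (a^2 + 5b^2 = 2 has no integer solutions), so 2 has no proper factorization.
print(norm((2, 0)))  # 4
print([(a, b) for a in range(-2, 3) for b in range(-1, 2)
       if norm((a, b)) == 2])  # []
```

In $\mathbb{Z}$ the two notions coincide, which is exactly why students who only ever see $\mathbb{Z}$ never notice the difference.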

Or that we say "irreducible" when talking about polynomials.
–
Qiaochu Yuan, Dec 2 '09 at 17:34

19

It probably doesn't help that the "usual" definition of a prime number is as an irreducible element of the rig $N$ of natural numbers...
–
Yemon Choi, Dec 2 '09 at 17:36

1

What is "prime element of a ring"? An element which generates a prime principal ideal?
–
Victor Protsak, May 28 '10 at 1:42

2

Ahhhhhhhhhhhhhhhhhhhhhhhhhhh. I felt the same way ;)
–
David Corwin, Jul 15 '10 at 19:11

7

@ Yemon: Yes, the word ‘prime’ has been mangled beyond all recognition. Imagine telling Fermat that 1 is no longer prime, but 0 is! (I was going to imagine telling Pythagoras, before I remembered that he doesn't even know about 0 in the first place.)
–
Toby Bartels, Apr 4 '11 at 4:25

A and B are independent events iff P(A∩B) = P(A)P(B).

A and B are independent events iff P(A|B) = P(A).

Some presentations start with Definition 1, which is entirely uninformative: nothing in it explains why on earth we bother discussing this. In contrast, Definition 2 says exactly what "independent" means: knowing that B has occurred does not change the probability that A occurs as well.

A reasonable introduction to the subject should start with Definition 2; then observe there is an issue when P(B)=0, and resolve it; then observe independence is symmetric; then derive Definition 1.
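Both characterizations can be checked on a toy example (my own, not part of the answer above): roll a fair die, let A = "even" and B = "at most 4".

```python
from fractions import Fraction

omega = range(1, 7)                        # a fair six-sided die
A = {x for x in omega if x % 2 == 0}       # even: {2, 4, 6}
B = {x for x in omega if x <= 4}           # at most 4: {1, 2, 3, 4}

def P(E):
    """Uniform probability, computed exactly with rationals."""
    return Fraction(len(E), 6)

# Definition 1: P(A and B) = P(A) * P(B)
print(P(A & B) == P(A) * P(B))  # True

# Definition 2: P(A | B) = P(A), which only makes sense when P(B) > 0
print(P(A & B) / P(B) == P(A))  # True
```

Here P(A) = 1/2, P(B) = 2/3 and P(A∩B) = 1/3, so the two definitions agree, as they must whenever P(B) > 0.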

Resolve it how, short of a serious digression into conditional expectation?
–
Jeff Hussmann, May 27 '10 at 20:02

8

One cannot "resolve" this. If B has probability zero, any conditional probability is admissible. I think a natural approach would be to show that Definition 2 implies Definition 1 when P(B)>0, and then to abstract and generalize to Definition 1. The second definition is simply not workable.
–
Michael Greinecker, Jul 15 '10 at 20:33

1

@Mike: That works, but you can only do it while forgetting about sigma-algebras in nice enough spaces. See S. M. Samuels, The Radon-Nikodym Theorem as a Theorem in Probability. jstor.org/pss/2321055 I think "nice enough" is "Borel" but I can't verify that at this computer.
–
Neil Toronto, Apr 11 '11 at 15:38

1

Alternatively, if you take 2 to be the definition, you must conclude that impossible events are not independent. This actually makes sense, because it is really the same thing as the catastrophe that any false statement implies everything else. So if an impossible thing happened, then it would naturally follow that everything else would happen, and so they are not logically (or probabilistically) independent.
–
Mikola, May 21 '11 at 1:05

4

@MichaelGreinecker, irrespective of its mathematical correctness, the second definition is the right way of motivating the whole thing. We mathematicians need to stop trying to present everything perfectly the first time around. Give the reader a working definition, discuss its problems, and use that to motivate the "proper" definition. That is how you engage the reader.
–
goblin, Jul 12 '14 at 9:01

It's a historical relic in some sense -- topologists were so concerned with naturality, whether manifolds have combinatorially distinct triangulations, and issues such as that, that they decided those preoccupations were more important than imparting a solid foundational intuition as to what a homology class is.

In my experience, people who see Poincare's proof of Poincare duality first have a far better command of what is actually going on than people who first see a singular homology exposition, to the point where they view Poincare duality as something light and natural, while most students who see it through the eyes of singular homology see it as something distant and intractable.

And all that effort is for what? So students can know Poincare duality is true on topological manifolds, when all the examples they've seen are smooth manifolds.

edit: my preferred way to describe Poincare's proof is to modernize it a tad. Your set-up is a triangulated manifold $M$, then you construct the dual polyhedral decomposition (a CW-decomposition) so that the (simplicial) $i$-cells of $M$ are in bijective correspondence with the (dual polyhedral) $(m-i)$-cells of $M$. This is much more straightforward than living in the simplicial world. Then you show that (up to a sign change) the chain complex for the simplicial homology is the chain complex for the cohomology of the dual polyhedral decomposition. The fussiest bit is keeping track of the orientations in the orientable case.

I don't really understand what you want to change concretely. What definition of homology do you prefer? In my opinion, your way of presenting Poincare duality is indeed more intuitive (so it is surely not wrong to give the students an idea of it), but it has at least 3 disadvantages: 1) You first have to prove that smooth manifolds can be triangulated. 2) You have to show that the isomorphism does not depend on the choices (in some sense). 3) These ideas do not generalize well to other situations like more sophisticated dualities or the Thom iso (I think).
–
Lennart Meier, May 27 '10 at 15:33

5

Another good intuitive proof of Poincare duality (in the sense of equality of Betti numbers) is via Morse theory: replace $f$ with $-f.$
–
Victor Protsak, May 28 '10 at 1:39

3

@Meier, Re (1) proving that manifolds have triangulations is at least as fundamental as any homology or fundamental group construction with manifolds, so this seems totally natural to me. (2) depends on what applications you're interested in. After Poincare duality is set up properly there are many alternative formulations you can give it -- once there is a firm foundation in place. Re (3), the search for generality is essentially the complete opposite point of my post. To a student there's little point generalizing something for which there's little initial grasp.
–
Ryan Budney, May 28 '10 at 22:53

2

@Victor, actually using the replace f by −f trick you see that on an oriented manifold the Morse complex is isomorphic to its dual and an orientation is required to construct the map. Depending on whether you want to carefully give the construction of the Morse complex or prove the existence of a triangulation, both methods give a concrete picture of the dual cocycle but require a fair amount of geometric work.
–
Tom Mrowka, Apr 4 '11 at 13:24

I normally won't bother with a 5-month-old community wiki, but someone else bumped it and I couldn't help but notice that the significant majority of the examples are highly algebraic. I wouldn't want the casual reader to go away with the impression that everything is defined correctly all the time in analysis and geometry, so here we go...

1) "A smooth structure on a manifold is an equivalence class of atlases..." Aside from the fact that one hardly ever works directly with an explicit example of an atlas (apart from important counter-examples like stereographic projections on spheres and homogeneous coordinates on projective space), this point of view seems to obscure two important features of a smooth structure. First, the real point of a smooth structure is to produce a notion of smooth functions, and the definition should reflect that focus. With the atlas definition, one has to prove that a function which is smooth in one atlas is also smooth in any equivalent atlas (not exactly difficult, but still an irritating and largely irrelevant chore). Second, it should be clear from the definition that smoothness is really a local condition (the fact that there are global obstructions to every point being a "smooth" point is of course interesting, but also not the point). The solution to both problems is to invoke some version of the locally ringed space formalism from the get-go. Yes, it takes some work on the part of the instructor and the students, but I and a number of my peers are living proof that geometry can be taught that way to second year undergraduates. If you still don't believe there are any benefits, try the following exercise. Sit down and write out a complete proof that the quotient of a manifold by a free and properly discontinuous group action has a canonical smooth structure using (a) the maximal atlas definition and (b) the locally ringed space definition.

2) "A tangent vector on a manifold is a point derivation..." While there are absolutely a lot of advantages to having this point of view around (not the least of which is that it is a better definition in algebraic geometry), I believe that this is misleading as a definition. Indeed, the key property that a good definition should have in my opinion is an emphasis on the close relationship between tangent vectors and smooth curves. Note that such a definition is bound to involve equivalence classes of smooth curves having the same derivative at a given point, and the notion of the derivative of a smooth curve is defined by composing with a smooth function. So for those who really like point derivations, they aren't far behind. There just needs to be some mention of curves, which in many ways are really what give differential geometry its unique flavor.

3) The notion of amenability in geometric group theory particularly lends itself to misleading definitions. I think there are two reasons. The first is that modulo some mild exaggeration basically every property shared by all amenable groups is equivalent to the definition. The second is that amenability comes up in so many different contexts that it is probably impossible to say there is one and only one "right" definition. Every definition is useful for some purposes and not useful for others. For example the definition involving left invariant means is probably most useful to geometric group theorists while the definition involving the topological properties of the regular representation in the dual is probably more relevant to representation theorists. All that being said, I think I can confidently say that there are "wrong" definitions. For example, I spent about a year of my life thinking that the right definition of amenability for a group is that its reduced group C* algebra and its full group C* algebra are the same.

4) Some functional analysis books have really bad definitions of weak topologies, involving specifying certain bases of open sets. This point of view can be useful for proving certain lemmas and working with some examples, but given the plethora of weak topologies in analysis these books should really give an abstract definition of weak topologies relative to any given family of functions and from then on specify the topology by specifying the relevant family of functions.

I'm sure I could go on and on, but these four have proven to be particularly difficult and frustrating for me.

I REALLY want to read a differential geometry course based on locally ringed spaces. Do you have one?
–
darij grinberg, May 27 '10 at 16:37

4

Sheaves on Manifolds by Kashiwara is the only book length treatment of differential geometry from this point of view that I know, but it is far from an introductory text. The course I referred to in my answer was taught by Brian Conrad several years ago, and he still has lots of useful handouts on his web page from that course. Other than that, I can't help you. :(
–
Paul Siegel, May 27 '10 at 17:43

2

I think the one geometric group theorist I've talked about this with considered the existence of a Følner sequence to be the right definition...
–
Harry Altman, May 27 '10 at 21:48

3

Also, this should really be four separate answers, for the purposes of this kind of community-wiki big list question
–
Yemon Choi, Jul 16 '10 at 5:19

3

The definition of tangent spaces via curves has one, very substantial, disadvantage: it is not clear that the so-defined tangent space is a vector space. You can define an addition in charts and show that it is well-defined, but that looks, unfortunately, not very natural.
–
Johannes Ebert, Apr 25 '12 at 8:39

R(.,.) is an equivalence relation iff R is reflexive, symmetric, and transitive.

R(.,.) is an equivalence relation iff there exists a function f such that R(a,b) iff f(a)=f(b).

Most presentations start with Definition 1, which contains no hint as to why we bother discussing such relations or why we call them "equivalences". In contrast, Definition 2 (along with a couple of examples) immediately tells you that R captures one particular attribute of the elements of the domain; and, since elements with the same value for this attribute are called "equivalent", R is called an "equivalence".

A reasonable introduction should start with Definition 2, then go on to prove Definition 1 is a convenient alternative characterization.
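Definition 2 is easy to experiment with; here is a brute-force sketch of my own, taking f to be reduction mod 3 on a small domain:

```python
# Define R via a function f, as in Definition 2, then check the three
# properties of Definition 1 by exhaustive search over a finite domain.
domain = range(10)
f = lambda x: x % 3

def R(a, b):
    return f(a) == f(b)

reflexive  = all(R(a, a) for a in domain)
symmetric  = all(R(a, b) == R(b, a) for a in domain for b in domain)
transitive = all((not (R(a, b) and R(b, c))) or R(a, c)
                 for a in domain for b in domain for c in domain)
print(reflexive, symmetric, transitive)  # True True True
```

The three properties hold for free because equality of the values f(a) is itself reflexive, symmetric, and transitive, which is the content of the proof that Definition 2 implies Definition 1.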

I've never actually seen the second definition explicitly, although I've used it implicitly often enough. I don't completely see how that's a clearer exposition, though.
–
Cory Knapp, Dec 6 '09 at 12:15

6

A function is a great way to capture the intuitive meaning of "some property that we want to be the same." The second definition also doesn't require introducing three new concepts.
–
Qiaochu Yuan, Dec 6 '09 at 17:00

3

It's true, the second definition doesn't require introducing the concepts of a reflexive, symmetric, or transitive relation. But I think it is still useful to give Definition 1 eventually; after all, there are plenty of relations that are transitive (for example) without being equivalences.
–
Gabe Cunningham, Dec 6 '09 at 22:58

37

When equivalence relations are introduced, it is usually shown that giving an equivalence relation on a set is the same as giving a partition of the set. This seems a little more natural than your (2).
–
JS Milne, Dec 9 '09 at 6:37

7

I always thought (1) was very nice and intuitive, after all, it says "an equivalence relation is a relation that behaves like =", which for undergrads is a nice introduction to the idea that one might care about other kinds of similarity than equality.
–
Ketil Tveiten, Jun 2 '10 at 13:08

My biggest issue is with the coordinate definition of tensor products. A physicist defines a rank $k$ tensor over a vector space $V$ of dimension $n$ to be an array of $n^k$ scalars associated to each basis of $V$ which satisfy certain transformation rules; in particular, if we know the array for a given basis, we can automatically determine it for a different basis. Another way to say this is that the space of tensors is the set of pairs consisting of a basis and an $n^k$ array of scalars, identified by an equivalence relation which gives the coordinate transformation law. For some strange reason, people seem to call this a coordinate-free definition. While it is in a sense coordinate-free (the transformation between coordinates lets you break free of coordinates in a sense), it is very confusing at first sight. People who use this definition will then say that certain operations are coordinate-free. What they mean by this, and it took me a long time to figure this out, is that you can do a certain algebraic operation to the coordinates of the tensor, and the formula is the same no matter which basis you work with (e.g., multiplying a covariant rank $1$ tensor with a contravariant rank $1$ tensor to get a scalar, or exterior differentiation of differential forms, or multiplying two vectors to get a rank $2$ tensor).

The much nicer definition uses tensor products. This is a coordinate-free construction, as opposed to the coordinate-full description given above. This definition is nice because it connects to multilinear maps (in particular, it has a nice universal property). It also helped me see why tensors are different from elements of some $n^k$-dimensional vector space over the same field (they are special because we are equipped not just with a vector space but with a multilinear map $V \times \cdots \times V \to V \otimes \cdots \otimes V$). The covariant/contravariant distinction can be explained in terms of functionals. This allows you to talk about contraction of tensors without having to prove that it is coordinate-invariant! Finally, once you have all that under your belt, you can easily derive the coordinate transformation laws from the multilinearity of $\otimes$.
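The last point can be illustrated numerically; in this NumPy sketch of my own (the change-of-basis matrix P is an arbitrary invertible example), the transformation laws for contravariant and covariant rank-2 tensors make the full contraction basis-independent:

```python
import numpy as np

# For a rank-2 contravariant tensor with components T^{ij}, a change of
# basis e'_j = sum_i P_{ij} e_i changes the components by
# T' = P^{-1} T (P^{-1})^T, while a covariant tensor g transforms as
# g' = P^T g P. The contraction g_{ij} T^{ij} is then basis-independent.
rng = np.random.default_rng(0)
T = rng.standard_normal((3, 3))       # contravariant components, old basis
g = rng.standard_normal((3, 3))       # covariant components, old basis
P = rng.standard_normal((3, 3))       # change-of-basis matrix
Pinv = np.linalg.inv(P)

T_new = Pinv @ T @ Pinv.T             # contravariant transformation law
g_new = P.T @ g @ P                   # covariant transformation law

# The full contraction agrees in both bases:
print(np.isclose(np.einsum('ij,ij->', g, T),
                 np.einsum('ij,ij->', g_new, T_new)))  # True
```

Both transformation laws fall out of multilinearity of $\otimes$; the code merely confirms that the two factors of P cancel in the contraction.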

When I look at physics texts on tensors -- even mathematically literate and careful texts like Frankel -- I wonder how ANYONE understands the monstrosity they present tensors as: formulas that transform by indices that raise and lower by certain rules. No wonder only geniuses understood relativity theory before mathematicians began cleaning it up.
–
Andrew L, Jul 15 '10 at 22:13

I'm now satisfied to know that my physics TA also agrees with me about physicists' approach to tensors.
–
David Corwin, Dec 19 '10 at 5:43

8

I think this is outdated. Many physicists learn about tensors in a course on General Relativity and one of the standard textbooks is Wald, "General Relativity". It defines tensors in terms of multilinear maps, not as a collection of scalars obeying certain transformation rules. The same is true of Carroll, "Spacetime and Geometry." Most theoretical physicists in this day and age understand this view of tensors.
–
Jeff HarveyDec 23 '10 at 15:41

3

@Jeff Harvey: When I was doing my bachelor's and master's in physics (in the '00s), I never got the impression that "most theoretical physicists in this day and age understand [the coordinate-free] view of tensors." Maybe it depends on subfield / institution? Certainly I met a lot of people who did understand the coordinate-free view, but I also met a lot of people who appeared not to. This often made life difficult for me, because I have trouble understanding the coordinate-full view, and I had a very hard time getting people to help me translate things into coordinate-free language.
–
VectornautApr 25 '12 at 6:44

I've never heard this definition (the bad one) described as "coordinate-free"; I've only ever heard that term applied to the good one.
–
Toby BartelsOct 6 '14 at 6:04

Really? I think the alternate definition is much more misleading: a function is a rule... by which most students immediately think "algebraic formula."
–
Qiaochu YuanDec 5 '09 at 2:05

20

I would almost prefer not even to say what a function is at all. I'd just say that if f is a function from A to B and x is an element of A then f(x) is an element of B. And that's all you need to know. Of course, I'm exaggerating a bit, and this point of view is not sufficient after a while (e.g. how would you decide whether the set of functions from A to B is countable, how would you define function spaces, etc.?) but in some situations this is the most important fact that you need from the basic definition of functions. Of course, one would also give examples, including artificial ones.
–
gowersDec 5 '09 at 23:25

8

The nice thing about the subset of $A \times B$ definition is that it's clear what it means for one function to be equal to another. If a function is a rule you have to specify what it means for one rule to be equal to another. Similarly, things like the union and intersection of functions do not immediately make sense.
–
Ryan BudneyJan 11 '10 at 6:54
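A small sketch of this point in code (Python; the rules and domain are made up for illustration): two syntactically different "rules" have equal graphs, and equality of graphs is a direct comparison, with no argument about the rules themselves needed.

```python
# Two different "rules" that determine the same input-output pairs.
def rule_a(x):
    return (x + 1) ** 2

def rule_b(x):
    return x ** 2 + 2 * x + 1

domain = range(10)

# As rules (Python callables), they are distinct objects...
assert rule_a is not rule_b

# ...but as sets of pairs (here: dicts, i.e. finite graphs), they are
# equal, and equality is a straightforward comparison.
graph_a = {x: rule_a(x) for x in domain}
graph_b = {x: rule_b(x) for x in domain}
assert graph_a == graph_b
```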

23

I think that the set-of-pairs definition is a neat formal trick, but not really how anyone intuitively thinks about a function (people use mental images of "rules of correspondence" or "machine that produce an output given an input", etc). I had a friend who disagreed and claimed that he truly thought of functions as sets of pairs. A few days later I heard him talking about the graph of a function and asked him "by the graph of a function you simply mean the function, right?". After that incident he agreed with me that nobody thinks of functions as sets of pairs. :)
–
Omar Antolín-CamarenaMay 27 '10 at 20:33

5

Regarding Tim Gowers's comment on not saying what a function is, you can take functions as a basic concept (along with sets) in the foundations of mathematics, in place of the element-hood relation. This is what Lawvere's ETCS does (although there are yet other differences between ETCS and ZFC than this).
–
Toby BartelsApr 4 '11 at 6:18

Well, to use that as a definition, you need to show that there is a thing which makes Stokes theorem true...
–
Mariano Suárez-Alvarez♦Dec 3 '09 at 3:02

4

Right. That approach is essentially the same as defining functors via universal properties; the construction to prove they exist is less important than the property.
–
Qiaochu YuanDec 3 '09 at 15:07

1

The standard definition of the exterior derivative really isn't unpedagogical; it's pretty much the only sensible definition once you have agreed on skew-symmetry (in fact, anyone that accepts the determinant as sensible should think the same of the exterior derivative).
–
Sam DerbyshireDec 4 '09 at 7:32

10

Personally the algebraic formula for $d$ leaves me cold. I have always found it much easier to define it on functions via $df(X) = Xf$ for a vector field $X$, and then extending it as an odd derivation over the wedge product which obeys $d^2=0$. It is easy to see that this defines it uniquely. I think that this is pedagogical and easy to remember.
–
José Figueroa-O'FarrillDec 4 '09 at 19:25
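From that characterization the coordinate formula becomes a two-line computation rather than a definition: for a $1$-form $\omega = \sum_i f_i\,dx^i$, the odd-derivation property and $d(dx^i) = 0$ (from $d^2 = 0$) give

$$d\omega \;=\; \sum_i df_i \wedge dx^i \;=\; \sum_{i,j} \frac{\partial f_i}{\partial x^j}\,dx^j \wedge dx^i,$$

and the skew-symmetry $dx^j \wedge dx^i = -\,dx^i \wedge dx^j$ is exactly what produces the signs in the explicit formula.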

2

@José: It took me a while to understand what "standard algebraic definition" Jon meant (presumably, the one given by an explicit formula with partial derivatives, signs, and omitted indices), because all along I was thinking about your definition, which I think is excellent.
–
Victor ProtsakMay 28 '10 at 2:20

Similar to gowers's answer about group actions, a module over a ring R is an abelian group M together with a function $f:R\times M \to M$ that satisfies certain properties. It may set the beginner's mind at ease to hear, "They're just like vector spaces except over arbitrary rings instead of only fields," which is misleading in itself but is a good mnemonic for remembering the definition. However, I usually find it more intuitive to think of a module over R as a homomorphism from R to the endomorphism ring of an abelian group, and with this definition no mnemonic is necessary.
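The "module = homomorphism into an endomorphism ring" viewpoint can be sketched in a few lines. In the sketch below (assuming NumPy; the choice $R = \mathbb{Z}[i]$ acting on $M = \mathbb{Z}^2$ is purely for illustration), scalar multiplication is literally "apply the endomorphism", and the module axioms hold because $\rho$ is a ring homomorphism:

```python
import numpy as np

# rho sends a + b*i in Z[i] to the 2x2 integer matrix it acts by on Z^2.
def rho(a, b):
    return np.array([[a, -b], [b, a]])

# Scalar multiplication is just "apply the endomorphism rho(r)".
def scal(r, m):
    return rho(*r) @ m

m = np.array([2, 5])
r, s = (1, 2), (3, -1)  # the Gaussian integers 1 + 2i and 3 - i

# (r*s).m == r.(s.m), with r*s computed in Z[i].
rs = (r[0] * s[0] - r[1] * s[1], r[0] * s[1] + r[1] * s[0])
assert np.array_equal(scal(rs, m), scal(r, scal(s, m)))

# (r+s).m == r.m + s.m
r_plus_s = (r[0] + s[0], r[1] + s[1])
assert np.array_equal(scal(r_plus_s, m), scal(r, m) + scal(s, m))
```

The checks are just the statement that $\rho$ preserves products and sums; no separate verification of module axioms is needed.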

I agree. It took me an incredibly long time to realize that vector spaces are fields acting on abelian groups.
–
Qiaochu YuanDec 6 '09 at 4:48

11

But describing modules as "vector spaces over a ring" most directly establishes the motivation of quite a bit of the work done in an introductory course on modules (more or less, try to see how much of the theory of vector spaces goes through). When someone first sees a module, the chances that looking at a morphism from a ring to an endomorphism algebra will sound natural are quite small. The point of view afforded by the "a module is a morphism" fits more naturally in the state of mind induced by representation theory (of groups, say), but I imagine very few people become familiar with...
–
Mariano Suárez-Alvarez♦May 27 '10 at 17:43

5

...representation theory (of anything) soon enough that that can be used as motivation/context for modules and friends.
–
Mariano Suárez-Alvarez♦May 27 '10 at 17:44

Thanks Mariano, point taken. Although I now think in this way, most likely the "vector spaces over a ring" approach was appropriate for my first encounter, and I will try keep this in mind if/when I introduce students to modules. I do think that both approaches should be emphasized, even in a first course, but perhaps this is more to build intuition for later use than to provide an a priori intuitive viewpoint.
–
Jonas MeyerMay 27 '10 at 20:23

2

Also, modules over commutative rings have some special properties which can't be captured easily (or at all) if you think of a module as a representation. Concerning "vector space over a ring" perspective: it's been a long time since I read van der Waerden's Algebra, but I believe, he was descriptively speaking of "groups with operators".
–
Victor ProtsakMay 28 '10 at 2:40

Inspired by some of the comments, I would nominate the definition of infinite product topology in terms of its open sets, found in, e.g., Munkres' otherwise excellent Topology. "The product topology on $X = \prod_{\alpha \in J} X_\alpha$ is the topology generated by the sets of the form $\pi_\alpha^{-1}(U_\alpha)$, where $U_\alpha$ is an open subset of $X_\alpha$." One then proves that one can also use the basis of sets of the form $U = \prod_{\alpha \in J} U_\alpha$ where $U_\alpha$ is open in $X_\alpha$, and $U_\alpha = X_\alpha$ for all but finitely many $\alpha \in J$. This just makes it look like an annoying and unnatural modification of the box topology.

Better in my opinion is to view $X = \prod X_\alpha$ explicitly as a function space (rather than as some sort of tuples, though tuples really are functions underneath), and to use the terminology of nets. Then it becomes clear that the product topology is just the topology of pointwise convergence, i.e. a net $f_i \to f$ iff the nets $f_i(\alpha) \to f(\alpha)$ for all $\alpha \in J$.

Under this definition, Tychonoff's theorem, which previously seemed pretty obscure, has an obvious application when combined with Heine-Borel: given any set $J$ and a pointwise bounded net of functions $f_i : J \to \mathbb{R}$, there is a subnet that converges pointwise. This is maybe the most useful application, especially in functional analysis. (Indeed, I understand this was actually Tychonoff's original theorem, that an arbitrary product of closed intervals is compact.) For instance, it makes Alaoglu's theorem clear, once you see that the weak-* topology is just a topology of pointwise convergence.

It's nice then to compare this with the Arzela-Ascoli theorem, which says that if $J$ is a compact Hausdorff space and the functions $f_i$ are not only pointwise bounded but also continuous and equicontinuous, then a subnet (in fact a subsequence) converges not only pointwise but in fact uniformly.

It's interesting to note this is similar in spirit to item 4) in Paul Siegel's answer.
–
Mark MeckesMay 27 '10 at 17:48

1

It is not similar only in spirit! The product topology is the weak topology for the family of natural projection maps on the product. The precise relation with Nate's answer is that if $X$ is a set equipped with the weak topology corresponding to a family of maps $f_\alpha$ then a net $x_i$ converges in $X$ if and only if $f_\alpha(x_i)$ converges for each $\alpha$. I don't claim that the notion of a weak topology belongs in point set topology classes (though even the subspace topology is the weak topology for the inclusion map), but it is a surprisingly convenient organizing principle.
–
Paul SiegelMay 27 '10 at 18:20

Actually, understanding the product topology as the one forced upon you if you want the categorical product on the particular category $Top$ was one of the first things that really sold me on the usefulness and power of category theory.
–
Todd Trimble♦Apr 4 '11 at 10:50

1

This is great! "An annoying and unnatural modification of the box topology" is exactly what I thought when I first saw the definition of the product topology in Munkres. On the other hand, I've never been comfortable with defining a topology by specifying its convergent nets; how do I check that what I've defined is actually a topology?
–
VectornautApr 25 '12 at 6:25

I see the problem crop up: a certain mathematical object has many characterizations, any one of which can be taken as the definition. Which do you use when you are introducing the subject?

The first one that comes to mind is the basis of a vector space. Perhaps this is not the best example for the title question of this thread of discussion, but I know that this confuses some students. When I last taught linear algebra, we taught them at least four characterizations. It's not really that any of the characterizations is obscuring or misleading. Rather, each one highlights some important property(-ies). Of course, the better students enjoy seeing all of the characterizations, and they appreciate every one. The less facile students get flustered because they want there to be just One Right Way of thinking about them.

A similar issue arises with the characterizations of an invertible matrix or linear transformation, though at least with a matrix it seems most reasonable to define an invertible matrix as one that has an inverse, namely another matrix that you can multiply it by to get the identity matrix.

I usually tend towards the historical definition. Usually that's the one that is best-motivated for people with the least background, since it's what motivated the creator. For example, if you look at Hassler Whitney's original papers on characteristic classes, they're extremely raw, explicit and beautiful. A very charming introduction, IMO.
–
Ryan BudneyDec 4 '09 at 5:50

3

I've never taught linear algebra, so I don't know what I'm talking about here. But perhaps the claim that having an inverse is the most natural characterization of "invertible" is just an artifact of the language? If we used the word "nonsingular" or "nondegenerate" for this property, other characterizations might seem more natural.
–
Michael LugoDec 4 '09 at 15:12

@Michael Lugo: I've never taught linear algebra either, but it seems to me that the essential property of a bijective function is that it has an inverse. One reason to believe this definition is the "right" one is that its generalization, the idea of an isomorphism, is far more important than the idea of "a morphism which is both monic and epic."
–
VectornautApr 25 '12 at 6:14

I remember being confused by the multivariable calculus approach to vector fields, sometimes thinking of functions just as functions and sometimes as fields. It should be possible to convey the idea of having a space of directional derivatives attached to each point without having to talk about vector bundles.

In general I can accept that there is more going on behind the scenes than there is time for in a course, but simply knowing that there is a more general and "right" way of doing things is very helpful to me. This also tends to make the course much more interesting.

Convolution: whether it is convolution of functions, measures or sequences, it is often defined by giving an explicit formula for the resulting function (or measure, etc.). While this definition makes calculations with convolutions relatively easy, it gives little intuition into what convolution really is and often seems largely unmotivated. In my opinion, the right way to define convolution (say, of two finite complex Radon measures on an LCA group $G$, which is a relatively general case) is as the unique bilinear, weak-* continuous extension of the group product to $M(G)$ (the space of measures as above), where $G$ is naturally identified with point masses. Then one can restrict the definition to $L^1 (G)$ and get the well known explicit formula for convolution of functions. Of course, a probabilist will probably prefer to think of convolution as the probability density function associated to the sum of two independent absolutely continuous random variables. And there are other possible alternative definitions (see this Mathoverflow discussion). But the formula definition is really the hardest one to get intuition for, in my opinion.
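Restricted to the group $\mathbb{Z}$ with finitely supported sequences, the explicit formula really is just multiplication of coefficient sequences of polynomials, which is one quick way to see that it is the extension of the group product. A minimal numeric check (assuming NumPy):

```python
import numpy as np

# Coefficients of p(x) = 1 + 2x + 3x^2 and q(x) = 4 + 5x.
p = np.array([1, 2, 3])
q = np.array([4, 5])

# Convolution of the coefficient sequences...
conv = np.convolve(p, q)

# ...equals the coefficient sequence of the product polynomial:
# (1 + 2x + 3x^2)(4 + 5x) = 4 + 13x + 22x^2 + 15x^3.
assert conv.tolist() == [4, 13, 22, 15]
```

Here $x^n$ plays the role of the point mass at $n \in \mathbb{Z}$, and bilinearity over the point masses forces the familiar formula.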

I agree that at first glance the explicit formula doesn't say much, but I guess you don't expect the "unique bilinear, weak-* continuous extension of the group product to M(G)" definition to appear e.g. in an undergraduate real analysis course...
–
Michal KotowskiApr 11 '11 at 13:33

True, though there are some instances where such a definition becomes more easy (e.g. in a course on the representation theory of finite groups, where measures become functions and all continuity issues disappear).
–
MarkApr 12 '11 at 9:41

1

Convolution is like multiplication of polynomials (except the exponents come from some group G and the coefficients can be density functions on G instead of finite sums).
–
John Wiltshire-GordonApr 25 '12 at 4:52

Yes, but it is important. I think it is only taught once students have developed intuition for the epsilon delta definition.
–
David CorwinJul 15 '10 at 23:41

13

"A function $f$ is continuous if and only if $\lim_n f(x_n) = f(\lim_n x_n)$ for all convergent sequences in the domain of definition of $f$." This is intuitive (once convergence of sequences is understood), convenient for proofs, and general (it holds for metric spaces).
–
Johannes EbertOct 23 '10 at 21:16

2

Maybe, but I find the following variant to be more intuitive than both of them: A function f is continuous at a point x if the preimage of any neighborhood of f(x) is a neighborhood of x.
–
ACLApr 4 '11 at 7:17

Induced representations defined in terms of tensor products of $G$-modules and in terms of vector-valued functions on $G$. It would be nice if more textbooks in representation theory stressed this more heavily. Both definitions have their advantages and disadvantages, I guess, but I personally feel more comfortable with the interpretation in terms of functions.

I don't think any of these definitions is misleading, but the fact that many books give only one of them (and some proceed to then use the other...) certainly is!
–
darij grinbergApr 11 '11 at 15:03

This. I'm working on semigroup representations and have been having problems getting a clear mental image of induced representations. I've been looking at the group-oriented literature but am wary of getting my intuition wrong for semigroups.
–
kastbergApr 11 '11 at 17:22

The entire branch of point-set topology as taught in most textbooks has completely unintuitive definitions that obscure the subject. For instance, the definition of a topological space in terms of open sets tells you nothing about the meaning of point-set topology. It would be much clearer if topological spaces were defined in terms of topological closure operators, since intuitively $x\in\overline{A}$ means that the set $A$ touches the point $x$ in some way. Other unnecessarily obscured concepts in point-set topology include the definitions of the product topology, the subspace topology, Hausdorff spaces, regular spaces, compact spaces, and continuous functions. Furthermore, some definitions are obscured when the spaces are not required to be Hausdorff. For instance, the notions of compactness, paracompactness, regularity, and normality do not have much meaning without the Hausdorff separation axiom.
If one has a non-Hausdorff space where every open cover has a finite subcover, then one should call that space quasi-compact and not compact. It is a shame that general topology is taught in such a meaningless fashion.
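For concreteness, the closure-operator axiomatization (due to Kuratowski) is short: a map $A \mapsto \overline{A}$ on subsets of $X$ satisfying

$$\overline{\varnothing} = \varnothing, \qquad A \subseteq \overline{A}, \qquad \overline{A \cup B} = \overline{A} \cup \overline{B}, \qquad \overline{\overline{A}} = \overline{A}$$

determines a topology whose closed sets are the fixed points $A = \overline{A}$ (open sets are their complements). In this language, $f$ is continuous iff $f(\overline{A}) \subseteq \overline{f(A)}$ for every $A$: a point touching a set cannot be torn away from it.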

And a Hausdorff space is a space where every net (or filter) converges to at most one point. When we put this all together, a compact Hausdorff space is a space where every net (or filter) accumulates at some point and converges to at most one point.
–
Joseph Van NameDec 18 '12 at 2:45

And how would you define paracompactness then? If you define it in terms of partitions of unity, you lose the connection with compactness.
–
Ostap ChervakJan 17 '13 at 18:59

1

There are many non-trivial ways of defining paracompactness. A good topology textbook should therefore prove some of these characterizations of paracompactness. Perhaps the most intuitive one would be that a paracompact space is a $T_{1}$-space where each open cover has an open barycentric refinement. This characterization therefore says that paracompact spaces are precisely the spaces where the collection of all open covers generates a uniformity. Moreover, this uniformity is supercomplete. In fact, a space is paracompact iff it has a compatible supercomplete uniformity.
–
Joseph Van NameJan 17 '13 at 22:28

Your proposed alternative definition of compact is equivalent to the open-cover one even for non-Hausdorff spaces. So why is this meaningless again?
–
Toby BartelsFeb 12 '14 at 23:19

When we write tensor products, it's optional to indicate the ring over which we take them: we can write $M \otimes N$ or $M \otimes_R N$. But for elements, we always write $x \otimes y$ without reference to $R$. You must keep it in mind, and that can induce lapses. For example, $v \otimes u^2 - u \otimes uv$ may be $\neq 0$: it depends on the base ring, and that doesn't appear in the notation.

Sometimes the problem is not the concept so much as the notation we use for it.
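One concrete illustration: take a field $k$, let $R = k[u, v]$, and $M = N = R$. Over the base ring $R$ itself, bilinearity lets every polynomial slide across the tensor sign, so $v \otimes u^2 = u^2 v\,(1 \otimes 1) = u \otimes uv$ and the difference is $0$ in $M \otimes_R N$. Over the base field $k$, however, $v \otimes u^2$ and $u \otimes uv$ are distinct basis elements of $M \otimes_k N$ (products of monomials, one from each side, form a $k$-basis), so the very same expression is nonzero.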