In mathematics we introduce many different kinds of notation, and sometimes even a single object or construction can be represented by many different notations. To take two very different examples, the derivative of a function $y = f(x)$ can be written $f'(x)$, $D_x f$, or $\frac{dy}{dx}$; while composition of morphisms in a monoidal category can be represented in traditional linear style, linearly but in diagrammatic order, using pasting diagrams, using string diagrams, or using linear logic / type theory. Each notation has advantages and disadvantages, including clarity, conciseness, ease of use for calculation, and so on; but even more basic than these, a notation ought to be correct, in that every valid instance of it actually denotes something, and that the syntactic manipulations permitted on the notation similarly correspond to equalities or operations on the objects denoted.

Mathematicians who introduce and use a notation do not usually study the notation formally or prove that it is correct. But although this task is trivial to the point of vacuity for simple notations, for more complicated notations it becomes a substantial undertaking, and in many cases has never actually been completed. For instance, in The geometry of tensor calculus it took Joyal and Street some substantial work to prove the correctness of string diagrams for monoidal categories, while the analogous string diagrams used for many other variants of monoidal categories have, in many cases, never been proven correct in the same way. Similarly, it took Streicher a lot of work in his book Semantics of type theory to prove the correctness of the "Calculus of Constructions" dependent type theory as a notation for a kind of "contextual category", and most other dependent type theories have not been analogously shown to be correct as notations for category theory.

My question is, among all these notations which have never been formally proven correct, has any of them actually turned out to be wrong and led to mathematical mistakes?

This may be an ambiguous question, so let me try to clarify a bit what I'm looking for and what I'm not looking for (and of course I reserve the right to clarify further in response to comments).

Firstly, I'm only interested in cases where the underlying mathematics was precisely defined and correct, from a modern perspective, with the mistake only lying in an incorrect notation or an incorrect use of that notation. So, for instance, mistakes made by early pioneers in calculus due to an imprecise notion of "infinitesimal" obeying (what we would now regard as) ill-defined rules don't count; there the issue was with the mathematics, not (just) the notation.

Secondly, I'm only interested in cases where the mistake was made and at least temporarily believed publicly by professional (or serious amateur) mathematicians. Blog posts and arXiv preprints count, but not private conversations on a blackboard, and not mistakes made by students.

An example of the sort of thing I'm looking for, but which (probably) doesn't satisfy this last criterion, is the following derivation of an incorrect "chain rule for the second derivative" using differentials. First here is a correct derivation of the correct chain rule for the first derivative, based on the derivative notation $\frac{dy}{dx} = f'(x)$:
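Writing $z = g(y) = g(f(x))$ and treating the derivatives as literal fractions, the $dy$'s cancel:
$$\frac{dz}{dx} \;=\; \frac{dz}{dy}\cdot\frac{dy}{dx} \;=\; g'(f(x))\,f'(x).$$
Here is the analogous, but incorrect, derivation for the second derivative, treating $d^2z$ and $dx^2 = (dx)^2$ as if they too could be cancelled:
$$\frac{d^2z}{dx^2} \;=\; \frac{d^2z}{dy^2}\cdot\frac{dy^2}{dx^2} \;=\; \frac{d^2z}{dy^2}\cdot\left(\frac{dy}{dx}\right)^{\!2} \;=\; g''(f(x))\,(f'(x))^2.$$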

(The correct second derivative of $g\circ f$ is $g''(f(x)) (f'(x))^2 + g'(f(x)) f''(x)$.) The problem is that the second derivative notation $\frac{d^2y}{dx^2}$ cannot be taken seriously as a "fraction" in the same way that $\frac{dy}{dx}$ can, so the manipulations it appears to justify are incorrect. However, I'm not aware of this mistake ever being made and believed in public by a serious mathematician who understood the precise meaning of derivatives, in a modern sense, but was only led astray by the notation.

Edit 10 Aug 2018: This question has attracted some interesting answers, but none of them is quite what I'm looking for (though Joel's comes the closest), so let me clarify further. By "a notation" I mean a systematic collection of valid syntax and rules for manipulating that syntax. It doesn't have to be completely formalized, but it should apply to many different examples in the same way, and be understood by multiple mathematicians -- e.g. one person writing $e$ to mean two different numbers in the same paper doesn't count. String diagrams and categorical type theory are the real sort of examples I have in mind; my non-example of differentials is borderline, but could in theory be elaborated into a system of syntaxes for "differential objects" that can be quotiented, differentiated, multiplied, etc. And by saying that a notation is incorrect, I mean that the "understood" way to interpret the syntax as mathematical objects is not actually well-defined in general, or that the rules for manipulating the syntax don't correspond to the way those objects actually behave. For instance, if it turned out that string diagrams for some kind of monoidal category were not actually invariant under deformations, that would be an example of an incorrect notation.

It might help if I explain a bit more about why I'm asking. I'm looking for arguments for or against the claim that it's important to formalize notations like this and prove that they are correct. If notations sometimes turn out to be wrong, then that's a good argument that we should make sure they're right! But oppositely, if in practice mathematicians have good enough intuitions when choosing notations that they never turn out to be wrong, then that's some kind of argument that it's not as important to formalize them.

$\begingroup$There must have been cases where someone defined $y$ as an implicit function of $x$ via an equation $f(x,y)=0$ and then wrote something like $dy/dx=(df/dx)/(df/dy)$ (which "follows" from treating apparent fractions as fractions, but of course gets the sign wrong). Whether this has ever made it past a blackboard into a preprint is less certain.$\endgroup$
– Steven Landsburg Aug 9 '18 at 17:26

$\begingroup$In his 2000 article Nick Laskin apparently got the notation $\nabla^\alpha f$ for the fractional derivative wrong and played with it as if it was a local operator. This error still persists among some physicists, for example, it is still there in Laskin's 2018 book; see here for further links. Not sure if this qualifies, so I leave this as a comment.$\endgroup$
– Mateusz Kwaśnicki Aug 9 '18 at 17:50


$\begingroup$@WillSawin: It's tricky. For an example of a paper heavily using raising operators, see arxiv.org/abs/1008.3094v2 . It tries to build them on a rigorous foundation (§2.1--2.2), by letting them act on Laurent polynomials instead of them acting on symmetric functions; but it soon slips back into pretending that they act on symmetric functions themselves (e.g., the first computation in §2.4 relies on associativity of that "action"). I have long thought about asking here on MO if there is a good way of making sense of these operators; but I'm afraid that the answers will not ...$\endgroup$
– darij grinberg Aug 9 '18 at 18:49


$\begingroup$... necessarily be any clearer than the original sources, seeing that this is a matter of confusion rather than a specific question. Adriano Garsia has his own interpretation of raising operators (A. M. Garsia, Raising operators and Young's rule), who suggests that the operators should act on tableaux rather than on symmetric functions (see the sentences after equation 3.3); I'm not sure to what extent his suggestions can be used as a replacement for the uses of raising operators ...$\endgroup$
– darij grinberg Aug 9 '18 at 18:49

5 Answers

Set theorists commonly study not only the theory $\newcommand\ZFC{\text{ZFC}}\ZFC$ and its models, but also various fragments of this theory, such as the theory often denoted $\ZFC-{\rm P}$ or simply $\ZFC^-$, which does not include the power set axiom. One can find numerous instances in the literature where authors simply define $\ZFC-{\rm P}$ or $\ZFC^-$ as "$\ZFC$ without the power set axiom."

The notation itself suggests the idea that one is subtracting the axiom from the theory, and for this reason I find it to be an instance of incorrect notation, in the sense of the question. The problem, you see, is that the process of removing axioms from a theory is not well defined, since different axiomatizations of the same theory may no longer be equivalent when one drops a common axiom.

And indeed, that is exactly the situation with $\ZFC^-$, as was eventually realized. Namely, the theory $\ZFC$ can be equivalently axiomatized using either the replacement axiom or the collection axiom plus separation, and these different approaches to the axiomatization are quite commonly found in practice. But Zarach proved that without the power set axiom, replacement and collection are no longer equivalent.
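For concreteness, in one standard formulation the two schemas assert, for every formula $\varphi$:
$$\text{Replacement:}\qquad \forall x\in a\ \exists!\,y\ \varphi(x,y)\;\longrightarrow\;\exists b\ \forall x\in a\ \exists y\in b\ \varphi(x,y)$$
$$\text{Collection:}\qquad \forall x\in a\ \exists y\ \varphi(x,y)\;\longrightarrow\;\exists b\ \forall x\in a\ \exists y\in b\ \varphi(x,y)$$
Collection simply drops the uniqueness hypothesis; over the other axioms of $\ZFC$, including power set, each schema implies the other, but the implication from replacement to collection fails once power set is removed.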

He also proved that various equivalent formulations of the axiom of choice are no longer equivalent without the power set axiom. For example, the well-order principle is strictly stronger than the choice set principle over $\text{ZF}^-$.

My co-authors and I discuss this at length and extend the analysis further in:

V. Gitman, J. D. Hamkins, and T. A. Johnstone, What is the theory ZFC without power set?, Mathematical Logic Quarterly 62 (2016).

We found particular instances in the previous literature where researchers, including some prominent researchers (and also some of our own prior published work), described their theory in a way that actually leads to the wrong version of the theory. (Nevertheless, all these instances were easily fixable, simply by defining the theory correctly, or by verifying collection rather than merely replacement; so in this sense, it was ultimately no cause for worry.)

$\begingroup$@EmilioPisanty I'm sorry, but I don't quite see the error to which you refer. I am not an expert in grammar, but "This is an apple that I am stating is red" seems fine to me, grammatically.$\endgroup$
– Joel David Hamkins Aug 11 '18 at 17:02


$\begingroup$Everyone agrees what ZFC is, as a theory. What is not agreed is the meaning of this subtraction. The notation leads one incorrectly to think it is meaningful.$\endgroup$
– Joel David Hamkins Aug 11 '18 at 21:15


$\begingroup$It's like talking about "the vector space $\mathbb{R}^3$ without the vector (1,1,0)."$\endgroup$
– Joel David Hamkins Aug 11 '18 at 21:23

$\begingroup$@TomChurch I don't really follow your comment. To my way of thinking, the analogy is that we have a well-known very specific space, ZFC, with several commonly used generating sets, and someone says to remove the power set axiom, a generator common to those generating sets. The point is that removing a generator isn't well-defined as a process on the space, but only as a process on the generating sets.$\endgroup$
– Joel David Hamkins Aug 12 '18 at 11:13

This might not quite count, but if you start with a principal $G$-bundle $f:P\rightarrow B$, there are two natural ways to put a principal $G\times G$-bundle structure on the bundle $P\times G\rightarrow B$ given by $(p,g)\mapsto f(p)$. Because it is standard notational practice to denote such a bundle by simply writing down the map $P\times G\rightarrow B$, there is nothing in the notation to distinguish between these structures, and therefore the notation leads you to believe they're the same.
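For instance (a sketch with conventions assumed here, not taken from Akin's paper: principal right actions are written as juxtaposition), two such structures are
$$(p,g)\cdot(h,k) = (ph,\,gk) \qquad\text{and}\qquad (p,g)\cdot(h,k) = (ph,\,h^{-1}gk),$$
and each makes $P\times G\rightarrow B$ a principal $G\times G$-bundle: both actions are free and transitive on each fiber $P_b\times G$.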

By following this lead, Ethan Akin "proves" here that the $K$-theory of $B$ is trivial, for any base space $B$. He reports that it took three Princeton graduate students (including himself) some non-trivial effort before they found the error.

This might meet the letter of your criterion by virtue of having made it into print, but probably violates the spirit because the author had already discovered the error, and indeed the whole point of the paper was to call attention to it.

$\begingroup$That's a nice example, and an amusing paper to read! But I won't count it even by the letter of my criterion, since it doesn't seem to have been "believed publicly", i.e. Ethan found the error before publishing it -- and presumably even before he found the error, he knew there had to be an error somewhere.$\endgroup$
– Mike Shulman Aug 9 '18 at 18:04

$\begingroup$I'm not sure if this is a mistake so much as a different convention. To a mathematician, $T$ is the name of the function, and it doesn't matter what arguments I feed it. To a scientist, $T(x, y)$ is the name of the function, and $T(r, \theta)$ is a different function, so that there is no way to know what $T(1, \pi)$ means without further context. (An example of context: $T(r, \theta)\bigr|_{(r, \theta) = (1, \pi)}$.) It's not clear to me that either of these is objectively wrong. (If you say "always be more explicit", then I think mathematicians fall down in many other cases.)$\endgroup$
– LSpice Aug 10 '18 at 12:08


$\begingroup$I think this is just plain wrong. I cannot imagine a realistic situation where a mathematician (other than a freshman answering to a professor who is known for silly 'gotcha' questions) would really think of (B). After all, the author meant something when they chose $(r,\theta)$ for their notation, right? Moreover, I find little objectionable about this notation: mathematically, $T$ is a function on a manifold, which has several standard (and standardly denoted) coordinate charts. It is defined in one chart and then calculated in the other.$\endgroup$
– Kostya_I Aug 10 '18 at 13:30


$\begingroup$So the reason why scientists 'are taught in such a strange way' is that all physically meaningful quantities are functions (or vector/tensor fields, etc.) on manifolds, usually specified by picking a chart. To be pedantic, one could introduce coordinate maps and phrase the question as: $T(\varphi(x,y))=k(x^2+y^2)$. What is $T(\psi(r,\theta))$? This is not done, as it is impractical: it blows up formulae while carrying no useful information. The composition with coordinate maps is already implied by the choice of letters.$\endgroup$
– Kostya_I Aug 10 '18 at 13:50


$\begingroup$As far as the original question is concerned, I do kind of think that A is the "right answer" and that people choosing B are making a mistake because of the notation.$\endgroup$
– Timothy Chow Aug 10 '18 at 16:01


$\begingroup$One more story in the same vein: Our chemistry students asked me once for help, as they had been asked the following question at their final exam in probability: "The parameters of the normal distribution are: (a) $(0,1)$; (b) $(\mu, \sigma)$; (c) $(\mu, \sigma^2)$; (d) $(\alpha, \beta)$." I found this question really amusing.$\endgroup$
– Mateusz Kwaśnicki Aug 10 '18 at 21:14

Ramanujan's notebooks are an interesting case study. As discussed in detail in Chapter 24 ("Ramanujan's Theory of Prime Numbers") of Volume IV of Bruce Berndt's series Ramanujan's Notebooks, Ramanujan made a number of errors in his study of $\pi(x)$, the number of primes less than or equal to $x$. It is hard to say for sure that these errors are specifically due to Ramanujan's notation rather than some other misconception, but I think a case can be made that his notation was a contributing factor. For example, Berndt writes:

It is not clear from the notebooks how accurate Ramanujan thought his approximations $R(x)$ and $G(x)$ to $\pi(x)$ were. (Ramanujan always used equality signs in instances where we would use the signs $\approx$, $\sim$, or $\cong$.) According to Hardy, Ramanujan, in fact, claimed that, as $x$ tends to $\infty$,
$$\pi(x)-R(x) = O(1) = \pi(x) - G(x),$$
both of which are false.

One could therefore argue that Ramanujan's careless use of equality signs contributed to his overestimating the accuracy of his approximations. On the other hand, one could also argue that Ramanujan's mistake was more fundamental, traceable to his inadequate understanding of the complex zeros of the zeta function.

Ramanujan also used (in effect) the notation $d\pi(x)/dx$, and one could argue that some of his misunderstandings were traceable to not having a proper definition of this notation while nevertheless assuming that it denoted a definite mathematical object with specific properties. Because Hardy voiced his objections, Ramanujan was aware of the need for some justification, and he attempted to defend his notation (in this context, $n=\pi(x)$):

I think I am correct in using $dn/dx$ which is not the differential coefficient of a discontinuous function but the differential coefficient of an average continuous function passing fairly (though not exactly) through the isolated points. I have used $dn/dx$ in finding the number of numbers of the form $2^p3^q$, $2^p+3^q$, etc., less than $x$ and have got correct results.

However, as Berndt explains, Ramanujan's defense is inadequate.
For more discussion, I recommend reading the entire chapter.
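For comparison, here is one standard way of making a statement like $dn/dx$ precise (a sketch of the modern viewpoint, not Ramanujan's own formulation): the prime number theorem in the form
$$\pi(x) \sim \operatorname{Li}(x) = \int_2^x \frac{dt}{\log t} \qquad (x\to\infty),$$
in which the "average continuous function passing fairly through the isolated points" is $\operatorname{Li}(x)$ and the well-defined derivative is $\operatorname{Li}'(x) = 1/\log x$. The notation $d\pi(x)/dx$ presumes such a smoothing without specifying it, and different smoothings can differ at exactly the level of the error terms in question.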

$\begingroup$In particular, for $3$-smooth numbers Ramanujan's heuristics made him believe that his approximation was within $O(1)$ of the true value. Hardy and Littlewood later proved first error terms of the form $O(x^\vartheta)$ and $O(\log x)$ depending on Diophantine properties of the logarithms of integers, then showed that in fact the error term is always unbounded!$\endgroup$
– Emanuele Tron Aug 10 '18 at 9:30


$\begingroup$This is interesting, but if I understand it correctly, not I think an answer to the original question. It seems to me an instance of a single mathematician using a notation incorrectly, rather than a notation itself being incorrect (and leading to wrong results).$\endgroup$
– Mike Shulman Aug 10 '18 at 17:47

I am aware of a few articles that discuss certain Macdonald polynomials in the introduction (as motivation), and then proceed to study properties of another family of polynomials.

The polynomials actually studied are not the same as the Macdonald polynomials in the introduction (only similar), yet the exact same notation/symbol is used for both families.