Undergraduate Texts in Mathematics are generally aimed at third- and fourth-year undergraduate mathematics students at North American universities. These texts strive to provide students and teachers with new perspectives and novel approaches. The books include motivation that guides the reader to an appreciation of interrelations among different aspects of the subject. They feature examples that illustrate key concepts as well as exercises that strengthen understanding.

More information about this series at http://www.springer.com/series/666


For Mari and Rex, Pina, Sarah, Hannah, and John

Preface

Many college students take a couple of courses in calculus. Afterwards, they either (i) take no more calculus, (ii) take calculus of several variables, or (iii) take real analysis. This book offers another option, not exclusive from options (ii) or (iii).

In the first two college calculus courses, much attention is given (naturally) to preparing students for things to come. But typically there is little time devoted to appreciating the bigger picture or for generally admiring the scenery. Many of the most appealing aspects of the subject are often left for students to pick up on their own. Unfortunately, however, students seldom do so.

I have taught undergraduate real analysis and graduate real analysis for teachers for over 15 years. These courses have evolved in a direction which attempts to address these concerns, and this book is a product of the evolution. The main goal is to see how beautifully things fit together, while admiring the scenery along the way. There are a lot of things in this book that experts in real and classical analysis already know; part of the idea here is that there is no reason why a good calculus student should not know them as well.

The book could be used as a text for a third course in calculus of a single real variable, as a supplementary text for a first course in real analysis, or as a reference for anyone who teaches calculus. The book is almost entirely self-contained, but readers would be best to have already taken the equivalent of two one-semester courses in single variable calculus. Some familiarity with sequences and some experience with proofs would be beneficial, though not entirely necessary.

I have presented ideas in a manner which emphasizes breadth as much as depth. Throughout the text and in the exercises, alternative approaches to many topics are taken. Such explorations are frequently more meaningful than simply aiming for generalization. Indeed, different arguments offer different insights.
But whenever I have diverged from what is customary, I have given the usual treatments their due attention.

Many of the methods, examples, and exercises in the text are adapted from papers in the recent mathematics literature, chiefly: The American Mathematical Monthly, The College Mathematics Journal, Mathematics Magazine, The Mathematical Gazette, and the International Journal of Mathematics and Mathematical Sciences in Education and Technology. I hope that readers will be encouraged to read these and other mathematics journals. At the very least, they will find therein solutions to many of the exercises.

I have also tried to emphasize pattern: pattern of development, pattern of proof, pattern of argument, and pattern of generalization. In calculus many threads are related in many ways, but in the end it is a coherent subject. Theorems are often named to emphasize pattern, for example the Cauchy–Schwarz Inequality and the Cauchy–Schwarz Integral Inequality, Jensen’s Inequality and Jensen’s Integral Inequality, the Mean Value Theorem for Sums and the Mean Value Theorem for Integrals, etc.

• The real numbers are introduced carefully, but with an eye on economy. One can be something of an expert in calculus without necessarily knowing all there is to know about the real numbers. As presented here, the “completeness axiom” for the real numbers is the Increasing Bounded Sequence Property, rather than the Least Upper Bound Property or Cauchy-completeness, though the latter two notions are explored in some exercises. The Archimedean Property of the real numbers is used freely without explicit mention, but it too is addressed in a few exercises. The word “compact” is never used. The Nested Interval Property, a close cousin of the Increasing Bounded Sequence Property, also plays an important part, with bisection algorithms getting their fair attention.
• Important throughout the entire book is the pair of inequalities

\[ \left(1 + \frac{1}{n}\right)^{n} < e < \left(1 + \frac{1}{n}\right)^{n+1} \quad \text{for } n = 1, 2, 3, \ldots \]

where $e$ is Euler’s number $e \cong 2.71828$. These estimates are frequently revisited, refined, and extended. Inequalities in general play a prominent role as well. The most important of these are Bernoulli’s Inequality, the Arithmetic Mean – Geometric Mean Inequality (the AGM Inequality for short), $1 + x \le e^x$, and Jensen’s Inequality.
• Considerable emphasis is given to the symbiotic relationship between the exponential function and calculus itself. The former, for example, gives meaning to functions involving real exponents, it enables us to extend Bernoulli’s Inequality and the AGM Inequality, and many of their consequences.
• Considerable attention is devoted to three consequences of the Intermediate Value Theorem which are often overlooked: the Universal Chord Theorem, the Average Value Theorem for Sums and its weighted version, the Mean Value Theorem for Sums. The latter two are so named because of their integral analogues, the Average Value Theorem and the Mean Value Theorem for Integrals. In obtaining these, the Extreme Value Theorem is indispensable. The relationship between sums and integrals is emphasized throughout.

• The definite integral is developed as an extension of the notion of the average value of a continuous function evaluated at $N$ points. Proving that the average value of a continuous function exists is deferred to Appendix A; the proof uses some rather sophisticated ideas which are not used elsewhere. The definite integral’s relationship with area is then discussed. Readers who go on to study mathematical analysis will see that the integral as an average is a more enduring theme than the integral as area.
• Chapter 7 (Other Mean Value Theorems) contains some results which have a flavor similar to that of the Mean Value Theorem. Subsequent chapters are independent of this one.
• Chapter 12 (Classic Examples) is also independent of the rest of the book, except that Wallis’s product in Sect. 12.1 is used in Chap. 13 to obtain the constant $\sqrt{2\pi}$ which appears in Stirling’s formula.
• Some important series are studied, for example, Geometric series, p-series, the Alternating Harmonic series, the Gregory-Leibniz series, and some Taylor series. But series in general are not covered systematically. For example, there is no treatment of power series, tests for convergence, radius of convergence, etc.
• Quadrature rules are studied as means for doing calculus and studying inequalities, rather than being used for conventional numerical methods. Indeed, the quadrature rules are usually applied to a function whose definite integral is known. Particular attention is given to the Trapezoid and Midpoint Rules applied to convex/concave functions.
• Motivated largely by the Mean Value Theorem for the Second Derivative, error terms are studied in Chap. 14. An inequality can often be recast as an equality which contains an error term. Jensen’s Inequality, the AGM Inequality, Young’s Integral Inequality (among others), and quadrature rules are considered in this way.

Many of the topics in the book, even the better-known ones, have not been collected elsewhere in any single volume. And a good number of these have been published heretofore only in journals.
As a result, this book is not in direct competition with any others. Still, there is naturally some overlap between this and other books. Most notably:

A Primer of Real Functions, by R.P. Boas Jr. (Math. Assoc. of America, 1981).
Excursions in Classical Analysis: Pathways to Advanced Problem Solving and Undergraduate Research, by H. Chen (Math. Assoc. of America, 2010).
The Cauchy-Schwarz Master Class, by J.M. Steele (Math. Assoc. of America and Cambridge University Press, 2004).
Inequalities, by G.H. Hardy, J.E. Littlewood & G. Polya (Cambridge Mathematical Library, 2nd edition, 1988).
Mean Value Theorems and Functional Equations, by P.K. Sahoo & T. Riedel (World Scientific, Singapore, 1998).

I have tried to write the sort of book that I would use, that I would like to own as a reference, and that I would fairly recommend to others. To borrow a phrase from G. H. Hardy: I can hardly have failed completely, the subject matter being so attractive that only extravagant incompetence could make it dull.

Acknowledgments

I am fortunate to have learned mathematics from some first-rate mathematicians. I am particularly grateful to my best calculus and analysis professors at the University of Guelph: Pal Fischer, John A. Holbrook, George Leibbrandt, and Alexander McD. Mercer. At the University of Toronto: Ian Graham (who was also my thesis advisor) and Joe Repka. And at UNC Chapel Hill: Joseph Cima and Norberto Kerzman.

Alexander Mercer, who is also my father, has been my greatest influence in all matters mathematical. His encouragement for this project in particular cannot be overstated. Indeed, he and my mother, Mari Mercer, have forever been supportive of all my endeavors. I am so very thankful.

My father also read large portions of the manuscript and made countless wise suggestions. My wonderful colleague Tina Carter and my excellent student Thomas Morse III read large portions of the manuscript as well, likewise making many fine suggestions.

My friends (also colleagues at Buffalo State College) Daniel W. Cunningham and Thomas Giambrone have been a great help to me over the years. These are two guys who can be relied upon for virtually anything. I am particularly indebted to Dan, for general advice about this book, for additional proof reading, and for teaching me most of what I know about LaTeX.

I am grateful to all those at Springer Mathematics who have helped me along the way. My editors, Kaitlin Leach and Eugene Ha, have provided exceptional guidance. The superb yet anonymous reviewers have devoted considerable time and thought. They have improved the book immeasurably.

Such fine support notwithstanding, there surely remain some typographical errors, oversights, and even blunders – for which I take sole responsibility. I will be happy to be informed of any errata, via the email address below.


Finally, I thank my wife Pina and children Sarah, Hannah, and John for their unwavering patience throughout this project. They could always guess the reason, whenever their husband/father disappeared to the basement for hours upon hours.

Everything is vague to a degree you do not realize till you have

tried to make it precise. —Bertrand Russell

We assume that the reader has some familiarity with the set of real numbers, which we denote by $\mathbb{R}$. We review interval notation, absolute value, rational and irrational numbers, and we say a few things about sequences. The main point of this chapter is to acquaint the reader with two very important properties of $\mathbb{R}$: the Increasing Bounded Sequence Property, and the Nested Interval Property.

1.1 Intervals and Absolute Value

For two real numbers $a < b$, we write

along with the obvious definitions for $[a, b)$ etc.

For $a < b$, the distance from $a$ to $b$ is $b - a$. One half of this distance is $\frac{b-a}{2}$. The midpoint of the interval $[a, b]$ is $c = \frac{a+b}{2}$. It satisfies

\[ a + \frac{b-a}{2} = c = b - \frac{b-a}{2}. \]

These are simple but useful observations. See Fig. 1.1.

The distance between any two real numbers is measured via the absolute value function. For $x \in \mathbb{R}$, the absolute value of $x$ is given by

\[ |x| = \sqrt{x^2}. \]

For (iv), we write $|x| = |(x - y) + y|$ and apply (iii) to obtain

\[ |x| \le |x - y| + |y|, \quad \text{which gives} \quad |x| - |y| \le |x - y|. \]

Now we reverse the roles of $x$ and $y$, and use (ii), to get

\[ |y| - |x| \le |x - y|. \]

Taken together, these last two inequalities read

\[ \pm\bigl(|x| - |y|\bigr) \le |x - y|. \]

That is,

\[ \bigl|\,|x| - |y|\,\bigr| \le |x - y|. \]

□

In Lemma 1.1, item (iii) is known as the triangle inequality, which is very useful. We’ll see in Sect. 2.3 why it gets this name. Item (iv) is also useful; it is called the reverse triangle inequality.

Remark 1.2. The trick used in the proof of item (iv) in Lemma 1.1, of subtracting $y$ and adding $y$, then using the triangle inequality, is very common in calculus and real analysis. ∘
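The triangle inequality and the reverse triangle inequality can be spot-checked numerically. The following Python sketch is an illustration only and not part of the original text. It assumes that item (iii) is the usual statement $|x + y| \le |x| + |y|$; the helper name `check_triangle_inequalities` and the random sampling scheme are hypothetical choices. Exact rational arithmetic is used so that rounding plays no role.

```python
from fractions import Fraction
import random


def check_triangle_inequalities(x, y):
    """Return True if both inequalities hold for the pair (x, y)."""
    triangle = abs(x + y) <= abs(x) + abs(y)        # assumed form of item (iii)
    reverse = abs(abs(x) - abs(y)) <= abs(x - y)    # item (iv)
    return triangle and reverse


random.seed(0)  # fixed seed so the run is reproducible
samples = [
    (Fraction(random.randint(-50, 50), random.randint(1, 20)),
     Fraction(random.randint(-50, 50), random.randint(1, 20)))
    for _ in range(1000)
]
all_hold = all(check_triangle_inequalities(x, y) for x, y in samples)
print(all_hold)
```

A finite sample proves nothing, of course; the proof above does that. The check is merely a quick way to catch a mis-stated inequality.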

1.2 Rational and Irrational Numbers

We denote by $\mathbb{N}$ the set of natural numbers:

\[ \mathbb{N} = \{1, 2, 3, 4, \ldots\}. \]

The set $\mathbb{N}$ is closed under addition and multiplication, but it is not closed under subtraction; that is, the difference of two natural numbers need not be a natural number.

Appending to $\mathbb{N}$ all differences of all pairs of elements from $\mathbb{N}$, we get the set of integers:

\[ \mathbb{Z} = \{\ldots, -2, -1, 0, 1, 2, 3, \ldots\}. \]

The set $\mathbb{Z}$ is closed under addition, multiplication, and subtraction. But $\mathbb{Z}$ is not closed under division; that is, the quotient of two integers need not be an integer.

Appending to $\mathbb{Z}$ all quotients of all pairs of elements from $\mathbb{Z}$ (with nonzero denominators) we get the set of rational numbers:

\[ \mathbb{Q} = \left\{ \frac{p}{q} : p, q \in \mathbb{Z}, \text{ and } q \ne 0 \right\}. \]

The set $\mathbb{Q}$ is closed under addition, multiplication, subtraction, and division. (The reader should agree that it is indeed closed under division.) But as the following lemma shows, $\mathbb{Q}$ is not closed under the operation of taking square roots; that is, the square root of a rational number need not be a rational number.

Lemma 1.3. There is no rational number $x$ such that $x^2 = 2$. That is, $\sqrt{2}$ is an irrational number.

Proof. (e.g., [10, 21]) We prove by contradiction. Suppose that $x = p/q$, where $p, q \in \mathbb{Z}$, with $q \ne 0$, is such that $x^2 = 2$. Then $2q^2 = p^2$. If we factor $p$ into a product of prime numbers, then there is a certain number of 2’s in the product. Whatever this certain number is, $p^2$ then has an even number of 2’s in its product of primes. Likewise, $q^2$ has an even number of 2’s in its product of primes. But then $2q^2$ must have an odd number of 2’s in its product of primes. Therefore $2q^2 = p^2$ cannot hold and we have a contradiction, as desired. □

For other proofs of Lemma 1.3, see Exercises 1.13 and 1.14. Essentially the same argument as given above shows that the square root of any prime number (e.g., $\sqrt{3}$, or $\sqrt{5}$, … etc.) is irrational. Therefore the square root of any natural number that is not itself a perfect square, is irrational.

Extending this argument a little further, we can see that the $n$th root of any prime number (like $\sqrt{2}$ or $\sqrt{3}$, but also $\sqrt[3]{2}$, or $\sqrt[3]{3}$, …, or $\sqrt[4]{2}$, or $\sqrt[4]{3}$, … etc.) is irrational. Therefore the $n$th root of any natural number that is not itself a perfect $n$th power, is irrational. (See Exercise 1.15.)

In older textbooks, numbers like $\sqrt[3]{2}$, $\sqrt[3]{3}$, $\sqrt[4]{2}$, $\sqrt[4]{3}$ etc. are called surds. For other proofs that many surds are irrational, see Exercises 1.16 and 1.17.

Remark 1.4. The reader will be aware of the usual English meaning of the word irrational. The English meaning of the word surd is something like lacking sense. ∘

To expand $\mathbb{Q}$ to include numbers like $\sqrt{2}$, and indeed all the surds, a reasonable idea now might be to append to $\mathbb{Q}$ all $n$th roots of all rational numbers. But still, important numbers like $e$ and $\pi$ would remain excluded. (We’ll say a little more about this later.) So instead we take a different approach.

Observe that any number whose decimal expansion (i.e., base 10) either terminates or eventually repeats, is a rational number. For example,

The reader is probably familiar with this observation, although a proper proof is somewhat cumbersome. We explore this in Exercise 1.19. (And we’ll see it again in Sect. 2.1.)
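The observation that a terminating or eventually repeating decimal is rational can be made concrete with a short computation. The sketch below is an illustration and not the book’s own proof. The function name `repeating_decimal_to_fraction` and its three-argument interface are hypothetical, and the sample inputs are chosen here rather than taken from the text. It uses the fact that a block of $m$ digits repeating forever contributes the block divided by $10^m - 1$.

```python
from fractions import Fraction


def repeating_decimal_to_fraction(integer_part, prefix, block):
    """Convert integer_part.prefix(block)(block)... to an exact Fraction.

    integer_part: non-negative int (hypothetical interface)
    prefix: string of digits before the repeating block (may be "")
    block: string of repeating digits ("" for a terminating decimal)
    """
    k = len(prefix)
    value = Fraction(integer_part)
    if prefix:
        value += Fraction(int(prefix), 10 ** k)
    if block:
        m = len(block)
        # The repeating block sums to block / (10^m - 1),
        # shifted k places to the right.
        value += Fraction(int(block), (10 ** m - 1) * 10 ** k)
    return value


print(repeating_decimal_to_fraction(0, "", "3"))     # 0.333...  -> 1/3
print(repeating_decimal_to_fraction(0, "1", "6"))    # 0.1666... -> 1/6
print(repeating_decimal_to_fraction(2, "375", ""))   # 2.375     -> 19/8
```

Since the result is always a `Fraction`, every input of this form yields a quotient of two integers, which is the content of the observation.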

Remark 1.5. The converse of this observation is also true: the decimal expansion of any rational number either terminates or repeats. See Exercise 1.20. ∘

So since $\sqrt{2}$ is irrational (by Lemma 1.3), its decimal expansion neither terminates nor repeats. It begins $1.414213562373095\ldots$ and continues endlessly with no repeating pattern. This gives us an idea of how to proceed.

We append to $\mathbb{Q}$ all nonterminating nonrepeating decimals. This gives the set of real numbers $\mathbb{R}$:

\[ \mathbb{R} = \{\text{all decimals: terminating, repeating, or otherwise}\}. \]

As we have already indicated with our pictures, a model for $\mathbb{R}$ is the familiar number line: $x \in \mathbb{R}$ corresponds to some specific point on the number line. But a real number can have two different decimal expansions. For example, if $x = 0.\overline{9}$ then $10x = 9.\overline{9}$, and subtracting the first equation from the second, we get $9x = 9$. Therefore, $x = 1.0$ (as well). Likewise $3.5\overline{9} = 3.6$, $6.0237\overline{9} = 6.0238$, etc. Fortunately, ambiguities of this sort are the only ones that exist.
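The identity $0.\overline{9} = 1$ can also be seen by watching the gap between 1 and the truncations $0.9,\ 0.99,\ 0.999, \ldots$ shrink. The following Python sketch is an illustration under that reading and not part of the original text; the helper name `nines` is hypothetical.

```python
from fractions import Fraction


def nines(n):
    """Return 0.99...9 (n nines) as an exact Fraction."""
    return sum(Fraction(9, 10 ** k) for k in range(1, n + 1))


for n in (1, 5, 10):
    gap = 1 - nines(n)
    print(n, gap)   # the gap is exactly 1/10^n
```

Because the gap after $n$ nines is exactly $1/10^n$, no positive number can separate $0.\overline{9}$ from 1.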

1.3 Sequences

Our description of the real numbers thus far has been very qualitative. To describe the real numbers precisely (and to remove our dependency on base 10, or on any other base), one must use sequences.

A sequence is a function $a : \mathbb{N} \to \mathbb{R}$. That is, the domain of the function is $\mathbb{N}$. As such its set of values, or terms, can be written as

\[ \{a(1), a(2), a(3), \ldots\}. \]

But it is more customary to write $\{a_1, a_2, a_3, \ldots\}$, or $\{a_n\}_{n=1}^{\infty}$, or simply $\{a_n\}$.

Remark 1.6. For a sequence $\{a_n\}$, the domain doesn’t really need to be $\mathbb{N}$. Sometimes it is convenient to start at index zero, or somewhere else. For example $\{a(0), a(1), a(2), a(3), \ldots\}$, or $\{a(5), a(6), a(7), \ldots\}$, or $\{a(2), a(4), a(6), \ldots\}$ etc. Or a sequence may be finite, for example, $\{1, 3, 5, 7, 9\}$. Sequences that we consider will generally not be finite, unless otherwise stated; the domain of the sequence is usually clear from the context. ∘

Saying that the sequence $\{a_n\}$ converges to $A$ means intuitively that $a_n$ gets closer and closer to $A$, as $n$ gets larger and larger. When $\{a_n\}$ converges to $A$, we often write simply: $a_n \to A$.

Example 1.7. The reader is surely comfortable accepting that the sequence

As we have done in Example 1.7, one can often think along the lines of the intuitive notion of convergence of a sequence. But sometimes more care is required.

Precisely, $\{a_n\}$ converges to $A$ means that for any $\varepsilon > 0$ (no matter how small) we can find $N > 0$ (which is typically large) such that

The idea now is to show that $\frac{n+100}{3n^2}$ is $<$ something, wherein the something can easily be made $< \varepsilon$. There are any number of ways to proceed from here. (This is part of the reason why showing that a sequence converges can be tricky.) We observe that for $n$ large enough, we shall have $\frac{1}{3n} + \frac{100}{3}\cdot\frac{1}{n^2} < \frac{1}{n}$. More precisely,

\[ \frac{n + 100}{3n^2} < \frac{1}{n} \quad \text{for } n > 50. \tag{1.1} \]

And

\[ \frac{1}{n} < \varepsilon \iff n > \frac{1}{\varepsilon}. \]
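The threshold in inequality (1.1) can be confirmed by exact arithmetic: multiplying both sides by $3n^2$ shows that (1.1) is equivalent to $n + 100 < 3n$, that is, $n > 50$. The Python sketch below is an illustration and not part of the original text; the helper names `lhs` and `holds_1_1` are hypothetical.

```python
from fractions import Fraction


def lhs(n):
    """The term (n + 100) / (3 n^2) as an exact Fraction."""
    return Fraction(n + 100, 3 * n * n)


def holds_1_1(n):
    """True when (n + 100)/(3 n^2) < 1/n, i.e. inequality (1.1)."""
    return lhs(n) < Fraction(1, n)


# (1.1) is equivalent to n + 100 < 3n, that is, n > 50.
print(all(holds_1_1(n) for n in range(51, 5001)))   # True: holds past 50
print(any(holds_1_1(n) for n in range(1, 51)))      # False: fails up to 50
```

At $n = 50$ the two sides are equal, which is why the strict inequality requires $n > 50$.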

The inclination now might be to choose $N = 1/\varepsilon$. But to make every step of the above analysis valid, we must actually choose $N$ to be the larger of $1/\varepsilon$ and 50. That is, $N = \max\{50, 1/\varepsilon\}$. (If $\varepsilon = 1/10$ for example, then simply taking $N = 10$ does not guarantee that (1.1) holds.) Then for $n >$ this $N$, we have $\left|\frac{n+100}{3n^2} - 0\right| < \varepsilon$ as desired. ◇

The reader should agree that a proof that $\left\{\frac{1}{n}\right\}$ converges to 0 is more or less contained in Example 1.9.

Example 1.10. Consider the sequence

and this is about as much simplifying as can be done. The idea again is to show that $\frac{5}{4(2n+1)}$ is $<$ something, wherein the something can easily be made $< \varepsilon$. And again, there are any number of ways to proceed. Observe that

In this case we would choose $N = 1/\varepsilon$. Then for $n >$ this $N$, we have $\left|\frac{3n-1}{4n+2} - \frac{3}{4}\right| < \varepsilon$ as desired.

The latter approach yielded $N = 1/\varepsilon$. So the $N = 5/(2\varepsilon)$ chosen in the former approach is larger than it really needs to be. But no matter, we just wanted to find any $N$ that works. (If something is true from Monday onwards and it is true from Thursday onwards, then it is (obviously) true from Thursday onwards.) ◇

Example 1.11. Consider the sequence

In this case we would choose $N = \max\{22, 1/\varepsilon\}$. Then for $n >$ this $N$, we have $\left|\frac{1}{2n^2 - 1{,}001} - 0\right| < \varepsilon$ as desired. Whether the $N$ in this latter approach is larger than the $N$ of the former approach, depends on $\varepsilon$. ◇

The reader should look again at Example 1.9 and find a way to proceed different from the way given there, thus obtaining (probably) a different $N$.

The following is an important fundamental fact about sequences, which is almost obvious. One uses it routinely without explicit mention.

Lemma 1.12. If $a_n \to A_1$ and $a_n \to A_2$, then $A_1 = A_2$.

Proof. We show that for any given $\varepsilon > 0$, no matter how small, it is the case that $|A_1 - A_2| < \varepsilon$. For then we must have $A_1 = A_2$. So let $\varepsilon > 0$ be arbitrary. By the triangle inequality (i.e., item (iii) of Lemma 1.1) and item (ii) of Lemma 1.1,

\[ |A_1 - A_2| = |A_1 - a_n + a_n - A_2| \le |A_1 - a_n| + |a_n - A_2| = |a_n - A_1| + |a_n - A_2|. \]

These last two terms are getting small as $n$ gets large, since $a_n \to A_1$ and $a_n \to A_2$. So things look good. To make their sum $< \varepsilon$, we proceed as follows. Since $a_n \to A_1$, there is $N_1$ such that

\[ |a_n - A_1| < \frac{\varepsilon}{2} \quad \text{for } n > N_1. \]

Since $a_n \to A_2$, there is $N_2$ such that

Of course, not all sequences are convergent. We say that the sequence $\{a_n\}$ diverges, or is divergent, if there exists no $A \in \mathbb{R}$ for which $a_n \to A$.

Example 1.14. Consider the sequence

\[ \{a_n\} = \{(-1)^{n+1}\} = \{1, -1, 1, -1, 1, -1, \ldots\}. \]

For any real number $A$ we have, by the triangle inequality:

\[ |a_{n+1} - a_n| = |a_{n+1} - A + A - a_n| \le |a_{n+1} - A| + |a_n - A|. \]

If $a_n \to A$ then for any $\varepsilon > 0$, we can make the right-hand side above $< \varepsilon$, by taking $n$ large enough. But this is impossible since $|a_{n+1} - a_n| = 2$ for every $n$. Therefore $\{a_n\}$ diverges. ◇

Remark 1.15. In Example 1.14 above we used that trick again: subtracting $A$ and adding $A$, then applying the triangle inequality. ∘

Now back to the real numbers, for a moment. We have seen that the decimal expansion of $\sqrt{2}$ begins $1.414213562\ldots$. So let us associate with $\sqrt{2}$ the sequence

This sequence is increasing: each term (after the first) is larger than the previous term. This sequence is also bounded above: each term is less than 1.5 say, or less than 2, or less than 1.42, etc. Now each term $a_n$ of this sequence is a rational number, and

\[ \left| a_n - \sqrt{2} \right| < \frac{1}{10^{n}} \to 0. \]

So, by its very construction, the sequence converges to $\sqrt{2}$, which is irrational. So $\mathbb{Q}$ is not closed under the operation of taking the limit of a sequence which is increasing and bounded above.

This suggests a way to extend the rational numbers to include the irrational numbers: we append to $\mathbb{Q}$ all limits of all sequences of rational numbers which are increasing and bounded above. But we must do some more groundwork in order to make these ideas precise.
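The decimal truncations of $\sqrt{2}$ can be generated exactly, without floating point, which makes the three claims above (rational terms, increasing, bounded above) easy to inspect. The Python sketch below is an illustration and not part of the original text. The function name `sqrt2_truncated` is hypothetical, and the indexing (the term with index $n$ keeps $n$ decimal places, starting from $n = 0$) is an assumption that may differ from the book’s.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncated(n):
    """Decimal expansion of sqrt(2) truncated after n decimal places."""
    # isqrt(2 * 10^(2n)) is the integer part of sqrt(2) * 10^n.
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)


terms = [sqrt2_truncated(n) for n in range(0, 12)]
increasing = all(s <= t for s, t in zip(terms, terms[1:]))
bounded = all(t < Fraction(3, 2) for t in terms)
# a_n^2 < 2 < (a_n + 10^-n)^2 says sqrt(2) lies within 10^-n of a_n.
close = all(
    t ** 2 < 2 < (t + Fraction(1, 10 ** n)) ** 2
    for n, t in enumerate(terms)
)
print(float(terms[5]), increasing, bounded, close)
```

Every term is a `Fraction`, so the sequence lives entirely in $\mathbb{Q}$ even though its limit does not.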

1.4 Increasing Sequences

A sequence $\{a_n\}$ is increasing if $a_n \le a_{n+1}$ for $n = 1, 2, \ldots$. And $\{a_n\}$ is strictly increasing if $a_n < a_{n+1}$ for $n = 1, 2, \ldots$.

A sequence $\{a_n\}$ is decreasing if $a_n \ge a_{n+1}$ for $n = 1, 2, \ldots$. (That is, $\{-a_n\}$ is increasing.) And $\{a_n\}$ is strictly decreasing if $a_n > a_{n+1}$ for $n = 1, 2, \ldots$.

A sequence $\{a_n\}$ is bounded above if there exists a number $U$ such that $a_n \le U$ for $n = 1, 2, \ldots$. The number $U$ is called an upper bound for $\{a_n\}$.

A sequence $\{a_n\}$ is bounded below if there exists a number $L$ such that $L \le a_n$ for $n = 1, 2, \ldots$. (In which case, $\{-a_n\}$ is bounded above.) The number $L$ is called a lower bound for $\{a_n\}$.

Remark 1.16. If a sequence $\{a_n\}$ has an upper bound $U$, then $U$ is not unique: the number $U + 1$, for example, also serves as an upper bound for $\{a_n\}$. So does $U + 1/10$, as does $U + 1{,}000$, etc. Likewise, if $\{a_n\}$ has a lower bound $L$, then $L$ is not unique. ∘

Example 1.17. The sequence $\{a_n\} = \{\sqrt{n}\}$ is not bounded above: there is no $U$ for which $a_n \le U$ for $n = 1, 2, \ldots$. It is bounded below, by $L = 1$ (and by $L = 1/2$, and by $L = 0$, and by $L = -10$ etc.). This sequence is strictly increasing. ◇

Example 1.18. The sequence $\{a_n\} = \left\{\frac{n}{n+1}\right\}$ is bounded above, by $U = 1$ for example. It is also bounded below, by $L = 1/2$ for example. This sequence is also strictly increasing (as the reader may verify). ◇

A sequence $\{a_n\}$ is bounded if $\{a_n\}$ is bounded above and bounded below. That is, there are numbers $L, U$ such that $a_n \in [L, U]$ for every $n \in \mathbb{N}$. Setting $M = \max\{|L|, |U|\}$, we see that this is equivalent to saying that there exists a number $M$ such that $|a_n| \le M$ for every $n \in \mathbb{N}$.

Example 1.19. Since $-4/3 \le (-1)^n - 1/3 \le 2/3$ for every $n \in \mathbb{N}$, the sequence $\{a_n\} = \{(-1)^n - 1/3\}$ is bounded above and bounded below. As such $\{a_n\}$ is bounded: $|a_n| \le 4/3$ for every $n \in \mathbb{N}$. This sequence is neither increasing nor decreasing. ◇
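The definitions of increasing and bounded can be tried out on the first several terms of Examples 1.18 and 1.19. The Python sketch below is an illustration and not part of the original text; the helper names `is_increasing` and `is_bounded` and the cutoff of 1,000 terms are hypothetical choices. A check on finitely many terms cannot establish a property of the whole sequence, but it can expose a wrong guess at a bound.

```python
from fractions import Fraction


def is_increasing(terms, strict=False):
    """Check a_n <= a_(n+1) (or < when strict) along a finite list."""
    pairs = zip(terms, terms[1:])
    return all((a < b) if strict else (a <= b) for a, b in pairs)


def is_bounded(terms, lower, upper):
    """Check lower <= a_n <= upper for every listed term."""
    return all(lower <= a <= upper for a in terms)


N = 1000  # number of terms inspected (an arbitrary choice)
ex_1_18 = [Fraction(n, n + 1) for n in range(1, N + 1)]
ex_1_19 = [(-1) ** n - Fraction(1, 3) for n in range(1, N + 1)]

print(is_increasing(ex_1_18, strict=True))
print(is_bounded(ex_1_18, Fraction(1, 2), 1))
print(is_bounded(ex_1_19, Fraction(-4, 3), Fraction(2, 3)))
# Decreasing is tested as "the negated sequence is increasing".
print(is_increasing(ex_1_19), is_increasing([-a for a in ex_1_19]))
```

The last line tests both directions for Example 1.19, following the remark that $\{a_n\}$ is decreasing exactly when $\{-a_n\}$ is increasing.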

Example 1.20. The terms of the sequence $\{a_n\} = \left\{\frac{n^2}{n+1}\right\} = \left\{\frac{n}{1+1/n}\right\}$ appear to be getting arbitrarily large as $n$ increases. We show that indeed this sequence is not bounded above. Let $M$ be any given number (to be thought of as large). Then

make $\{a_n\}$ as large as we like. Therefore $\{a_n\} = \left\{\frac{n^2}{n+1}\right\}$ is not bounded above. ◇

If a sequence is increasing without any upper bound, or decreasing without any lower bound, then its terms cannot be getting arbitrarily close to any particular number. This, in its contrapositive form, is the idea behind the following.

Lemma 1.21. If $a_n \to A$, then $\{a_n\}$ is bounded.

Proof. This is Exercise 1.26. □

Example 1.22. We saw in Example 1.20 that the sequence $\{a_n\} = \left\{\frac{n^2}{n+1}\right\} = \left\{\frac{n}{1+1/n}\right\}$ is not bounded. So applying Lemma 1.21 in its contrapositive form, $\{a_n\}$ diverges. We might say that $\{a_n\}$ diverges to $+\infty$. ◇

The converse of Lemma 1.21 does not hold: The sequence $\{a_n\} = \{(-1)^{n+1}\}$ is bounded, but as we saw in Example 1.14, it diverges.

Example 1.23. Suppose that $a_n \to A$. We show that $a_n^2 \to A^2$. But here we forgo the formal definition (i.e., the $\varepsilon$ and the $N$); this is usually done by people with some experience in real analysis. By the triangle inequality,

\[ \left| a_n^2 - A^2 \right| = |a_n + A|\,|a_n - A| \le \bigl(|a_n| + |A|\bigr)\,|a_n - A|. \]

Now by Lemma 1.21, $\{a_n\}$ is bounded because it converges. That is, there is $M > 0$ such that

And again, since $\{a_n\}$ converges,

We close this section by stating four more useful facts which, again, one uses routinely without explicit mention. We assume that the reader can prove these, or is quite comfortable in accepting them. We leave their proofs as exercises.

Lemma 1.24. Let $\alpha, \beta \in \mathbb{R}$. If $a_n \to A$ and $b_n \to B$, then $\alpha a_n + \beta b_n \to \alpha A + \beta B$. In particular, $a_n + b_n \to A + B$ and $a_n - b_n \to A - B$.

Proof. This is Exercise 1.28. □

Lemma 1.25. If $a_n \to A$ and $b_n \to B$, then $a_n b_n \to AB$.

Proof. This is Exercise 1.29. □

Lemma 1.26. If $a_n \to A$ and $b_n \to B$, with $b_n \ne 0$ for all $n$ and $B \ne 0$, then $\frac{a_n}{b_n} \to \frac{A}{B}$.

Proof. This is Exercise 1.30. □

Lemma 1.27. If $a_n \to A$ and $a_n \ge 0$, then $A \ge 0$. Consequently (upon consideration of $b_n - a_n$), if $a_n \to A$ and $b_n \to B$, with $a_n \le b_n$, then $A \le B$.

Proof. This is Exercise 1.31. □

These four lemmas say, respectively, that convergent sequences respect linear combinations, products, quotients, and nonstrict inequalities. Nonstrict because even if $a_n > 0$ in Lemma 1.27, we can still only conclude that $A \ge 0$. For example, $1/n > 0$, but $\lim_{n \to \infty} 1/n = 0$.

Example 1.28. As indicated, the proof of Lemma 1.25 is the content of Exercise 1.29. But here is another approach. In Example 1.23 we showed that if $c_n \to C$ then $c_n^2 \to C^2$. Now it is easily verified that

\[ a_n b_n = \frac{1}{4}\left[ (a_n + b_n)^2 - (a_n - b_n)^2 \right]. \]

So if $a_n \to A$ and $b_n \to B$ then $a_n b_n \to AB$, by applying Lemma 1.24, in various combinations, to the right-hand side. ◇
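The identity used in Example 1.28 expresses a product through squares, sums, and differences only. The Python sketch below checks it with exact rational arithmetic; it is an illustration and not part of the original text. The function name `product_via_squares` and the two test sequences $a_n = 2 + 1/n$ and $b_n = 3 - 1/n^2$ are hypothetical choices made here.

```python
from fractions import Fraction


def product_via_squares(a, b):
    """Recover a*b from squares only: ((a + b)^2 - (a - b)^2) / 4."""
    return Fraction(1, 4) * ((a + b) ** 2 - (a - b) ** 2)


# Hypothetical test sequences: a_n = 2 + 1/n -> 2 and b_n = 3 - 1/n^2 -> 3.
for n in (1, 10, 100, 1000):
    a_n = 2 + Fraction(1, n)
    b_n = 3 - Fraction(1, n * n)
    assert product_via_squares(a_n, b_n) == a_n * b_n
    print(n, float(a_n * b_n))   # values approach A*B = 6
```

The point of the identity is that limits of squares (Example 1.23) and of linear combinations (Lemma 1.24) are already in hand, so nothing new is needed for products.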

1.5 The Increasing Bounded Sequence Property

We have seen that $\mathbb{Q}$ is not closed under the operation of taking the limit of a sequence which is increasing and bounded above. (Again, such a sequence may well have a limit, but this limit may not be in $\mathbb{Q}$, as is the case with $\sqrt{2}$.) So appending to $\mathbb{Q}$ all such limits, we get the set of real numbers $\mathbb{R}$:

\[ \mathbb{R} = \mathbb{Q} \cup \{\text{limits of all sequences from } \mathbb{Q} \text{ which are increasing and bounded above}\}. \]

Example 1.29. The irrational number $\sqrt{2}$ is the limit of the sequence of rational numbers

\[ \{a_n\} = \{1,\ 1.2,\ 1.23,\ 1.234,\ 1.2345,\ 1.23456,\ 1.234567,\ \ldots\}. \]

And this sequence is increasing and bounded above (by 2, say). So $x \in \mathbb{R}$. ◇

Remark 1.31. The two sequences in Examples 1.29 and 1.30, which converge to $\sqrt{2}$ and to $x$ respectively, are not unique. The reader should think of other sequences $\{a_n\}$ which are increasing and bounded above, for which $a_n \to \sqrt{2}$, and $a_n \to x$. ∘

The following is a very important example of a sequence which is increasing and bounded above.

Example 1.32. Using an idea from [12], we show that the sequence

\[ \left\{ \left(1 + \frac{1}{n}\right)^{n} \right\} \]

of rational numbers is increasing and bounded above. As such, it converges to some real number. For $n \in \mathbb{N}$ and $a \ne b$, the following identity can be found by doing long division on the left-hand side, or simply verified by cross multiplication:

\[ \left(1 + \frac{1}{2n}\right)^{n} < 2, \quad \text{and so} \quad \left(1 + \frac{1}{2n}\right)^{2n} < 4. \]

But since $\left\{\left(1 + \frac{1}{n}\right)^{n}\right\}$ is increasing, $\left(1 + \frac{1}{n}\right)^{n} < \left(1 + \frac{1}{2n}\right)^{2n}$. Therefore $\left(1 + \frac{1}{n}\right)^{n} < 4$, and so $\left\{\left(1 + \frac{1}{n}\right)^{n}\right\}$ is bounded above. ◇

The real number to which $\left\{\left(1 + \frac{1}{n}\right)^{n}\right\}$ converges is denoted by $e$. The symbol $e$ is used in honor of the great Swiss mathematician Leonhard Euler (1707–1783); it is often called Euler’s number. We shall prove in Sect. 8.4 that $e$ is irrational. Approximately, $e = 2.718281828459\ldots$.

Remark 1.33. Saying that $e$ is irrational is the same as saying that $e$ is not the solution to any equation $ax + b = 0$, where $a$ and $b$ are integers (with $a \ne 0$). Notice that $ax + b = 0$ is a polynomial equation of degree 1. The French mathematician Charles Hermite (1822–1901) proved in 1873 that $e$ is not a solution to any polynomial equation of any degree with integer coefficients. That is, $e$ is a transcendental number. So even by somehow attaching to $\mathbb{Q}$ all $n$th roots of all rational numbers, or even all linear combinations of these, we would still not obtain all of the real numbers because $e$ would remain excluded. ∘

The following theorem contains a fundamental property of the real numbers. We shall appeal to it many times. It says that $\mathbb{R}$ is closed under the operation of taking the limit of a sequence which is increasing and bounded above.

Theorem 1.34. (The Increasing Bounded Sequence Property of $\mathbb{R}$.) Any sequence of real numbers which is increasing and bounded above converges to a real number.

Proof. Let $\{a_n\}$ be a sequence of real numbers which is increasing and bounded above. If each $a_n \in \mathbb{Q}$ then $a_n \to A \in \mathbb{R}$, exactly by our definition of $\mathbb{R}$, and we are finished. Otherwise, we consider a related sequence $\{b_n\}$ defined by

\[ b_n = a_n, \text{ but truncated after the } n\text{th decimal place.} \]

Then $\{b_n\}$ is increasing and bounded above and each $b_n \in \mathbb{Q}$, and so we must have $b_n \to B$, for some $B \in \mathbb{R}$. Now by the triangle inequality,

The first few terms of this sequence are as follows.

This is a sequence of real numbers which is increasing and bounded above. (We leave the verification of this for Exercise 1.40.) As such, by the Increasing Bounded Sequence Property (Theorem 1.34), $\{a_n\}$ converges to some real number $\varphi$. To find $\varphi$, notice that since $a_n \to \varphi$ and $a_{n+1} = \sqrt{1 + a_n}$, we must have $\varphi = \sqrt{1 + \varphi}$. Then squaring both sides and using the quadratic formula gives $\varphi = \frac{1+\sqrt{5}}{2} \cong 1.618\ldots$. ◊

Remark 1.37. The number $\varphi$ is called the golden mean. It is irrational. But it is clearly not transcendental because, as we saw, it is the root of a quadratic equation with integer coefficients. There is some debate among historians of mathematics as to whether $\sqrt{2}$ or $\varphi$ was the first-ever irrational number to be discovered [23]. ◦

Example 1.38. The ubiquitous number $\pi$ is the ratio of the circumference to the diameter of any circle. The reader is surely familiar with the formula $A = \pi r^2$, where $A$ is the area of a circle with radius $r$. By approximating the area of a circle of radius $r = 1$ with the area of an inscribed equilateral triangle, square, regular pentagon, regular hexagon etc., we see that $\pi$ is the limit of an increasing sequence of real numbers which is bounded above. As such, $\pi$ is a real number. We shall prove in Sect. 12.2 that $\pi$ is irrational. (So, in particular, $\pi \neq 22/7$ !!) ◊

Remark 1.39. The German mathematician F. Lindemann (1852–1939) proved in 1882 that $\pi$ is in fact transcendental. So again, even by somehow attaching to $\mathbb{Q}$ all $n$th roots of all rational numbers, or even all linear combinations of these, we would still not obtain all of the real numbers: $\pi$ would remain excluded. ◦

Remark 1.40. One doesn't normally worry too much about such things, but all of this gives meaning to arithmetic in $\mathbb{R}$. For example, consider $\sqrt{2} + e$. Each of $\sqrt{2}$ and $e$ is the limit of a sequence of rational numbers which is increasing and bounded above, say $a_n \to \sqrt{2}$ and $b_n \to e$. Then each $a_n + b_n$ is a rational number, and $\sqrt{2} + e$ is the limit of $\{a_n + b_n\}$, a sequence of rational numbers which is increasing and bounded above. And one can verify (but not easily) that this limit is independent of the specific choice of the sequences $\{a_n\}$ and $\{b_n\}$ of rationals, as long as each is increasing and bounded above, with $a_n \to \sqrt{2}$ and $b_n \to e$ respectively. ◦
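As a numerical illustration of the recursion $a_{n+1} = \sqrt{1 + a_n}$ with $a_1 = 1$ (see Exercise 1.40), the following Python sketch lists the first few terms and compares the last one with $(1+\sqrt{5})/2$. It is only a floating-point experiment, not a proof, and the function name `golden_sequence` is our own choice rather than anything from the text.

```python
import math


def golden_sequence(terms):
    """Return the first `terms` values of a_1 = 1, a_{n+1} = sqrt(1 + a_n)."""
    a = 1.0
    sequence = [a]
    for _ in range(terms - 1):
        a = math.sqrt(1.0 + a)  # one step of the recursion
        sequence.append(a)
    return sequence


if __name__ == "__main__":
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    for n, a_n in enumerate(golden_sequence(10), start=1):
        # Each term is larger than the previous one and stays below 2.
        print(n, a_n, phi - a_n)
```

In this experiment the gap between $a_n$ and the golden mean shrinks at every step, which is consistent with the sequence being increasing and bounded above.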

1.6 The Nested Interval Property

If $\{a_n\}$ converges to $A$, then $\{-a_n\}$ clearly converges to $-A$. Therefore, by the Increasing Bounded Sequence Property of $\mathbb{R}$ (Theorem 1.34), every sequence $\{a_n\}$ of real numbers which is decreasing and bounded below also converges to a real number. This leads to another very important property of $\mathbb{R}$, as follows.

Theorem 1.41. (Nested Interval Property of $\mathbb{R}$) For any collection of nested intervals $[a_1, b_1] \supseteq [a_2, b_2] \supseteq [a_3, b_3] \supseteq \cdots$ with the property that $b_n - a_n \to 0$, there is a unique point which belongs to each interval.

Proof. The sequence $\{a_n\}$ is increasing and bounded above (by $b_1$) and so by the Increasing Bounded Sequence Property (Theorem 1.34), it converges to some $A \in \mathbb{R}$. The sequence $\{b_n\}$ is decreasing and bounded below (by $a_1$) and so it converges to some $B \in \mathbb{R}$. We must then have $a_n \le A \le B \le b_n$ for $n = 1, 2, 3, \ldots$. But we cannot have $A < B$ because $b_n - a_n \to 0$. Therefore $A = B$, and this real number belongs to each interval $[a_n, b_n]$, as desired. □

Example 1.42. Again, the decimal expansion for $\sqrt{2}$ begins $1.41421356\ldots$. The number $\sqrt{2}$ is the only point that belongs to each of the nested intervals:

$$[1.4, 1.5] \supseteq [1.41, 1.42] \supseteq [1.414, 1.415] \supseteq [1.4142, 1.4143] \supseteq \cdots. \quad ◊$$

Example 1.43. We showed in Example 1.32 that the sequence $\left\{\left(1+\frac{1}{n}\right)^{n}\right\}$ is increasing and bounded above. So it has a limit, which is denoted by $e$ (Euler's number). In a similar way, which we leave for Exercise 1.39, it happens that the sequence $\left\{\left(1+\frac{1}{n}\right)^{n+1}\right\}$ is decreasing and bounded below. Now observe that

$$\left(1+\frac{1}{n}\right)^{n+1} - \left(1+\frac{1}{n}\right)^{n} = \left(1+\frac{1}{n}\right)^{n}\left(1+\frac{1}{n}-1\right) = \left(1+\frac{1}{n}\right)^{n}\frac{1}{n} \to 0.$$

Therefore, by the Nested Interval Property of $\mathbb{R}$ (Theorem 1.41), the collection of nested intervals

$$\left[\left(1+\frac{1}{n}\right)^{n},\ \left(1+\frac{1}{n}\right)^{n+1}\right], \quad\text{where } n = 1, 2, 3, \ldots$$

contains a single point, which must be $e$. Taking $n = 5{,}000$, for example, gives $2.7180 \le e \le 2.7186$. ◊

The basic string of inequalities which comes from Examples 1.32 and 1.43 is

$$\left(1+\frac{1}{n}\right)^{n} < e < \left(1+\frac{1}{n}\right)^{n+1}.$$

We shall revisit these inequalities many times.
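To see these bounds at work, here is a minimal Python sketch that evaluates both ends of the interval $\left[(1+\frac1n)^n, (1+\frac1n)^{n+1}\right]$ in floating-point arithmetic. The function name `e_bounds` is our own, and the computed values are approximations subject to rounding, so this is an illustration rather than a proof.

```python
import math


def e_bounds(n):
    """Return the pair ((1 + 1/n)**n, (1 + 1/n)**(n + 1))."""
    base = 1.0 + 1.0 / n
    lower = base ** n        # increasing in n
    upper = lower * base     # equals base**(n + 1), decreasing in n
    return lower, upper


if __name__ == "__main__":
    for n in (1, 10, 100, 5000):
        lower, upper = e_bounds(n)
        # The interval should contain e, and its width is lower / n.
        print(n, lower, upper, lower < math.e < upper)
```

For $n = 5{,}000$ the two printed endpoints should agree with the estimate quoted in Example 1.43 to the digits shown there.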


Example 1.44. By approximating the area of a circle of radius 1 by areas of inscribed regular polygons with increasing areas, we described $\pi$ as the limit of an increasing sequence of real numbers which is bounded above. By also approximating the area of the same circle by areas of circumscribed regular polygons with decreasing areas, we can describe $\pi$ as the single point which belongs to a sequence of nested intervals. ◊

Remark 1.45. Around 250 B.C., Archimedes used 96-sided inscribed and circumscribed polygons to obtain the rather impressive estimates

$$3.14085 \cong \frac{223}{71} < \pi < \frac{22}{7} \cong 3.14286. \quad ◦$$

Finally, we point out that the three collections of nested intervals in Examples 1.42–1.44, which yielded the numbers $\sqrt{2}$, $e$, and $\pi$, are not unique.
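The nested intervals for $\pi$ in Example 1.44 can be sketched numerically. The Python code below starts from the inscribed and circumscribed squares of the unit circle (areas 2 and 4) and repeatedly doubles the number of sides. The doubling rule it uses, namely that the inscribed $2n$-gon has area $\sqrt{I_n C_n}$ and the circumscribed $2n$-gon has area $2 I_{2n} C_n / (I_{2n} + C_n)$, is a standard trigonometric identity that we assume here; it is not taken from the text, and the names `polygon_area_bounds`, `inner` and `outer` are ours.

```python
import math


def polygon_area_bounds(doublings):
    """Areas of inscribed/circumscribed regular polygons of the unit circle.

    Returns a list of (sides, inscribed_area, circumscribed_area), starting
    with squares and doubling the number of sides `doublings` times.
    """
    sides, inner, outer = 4, 2.0, 4.0
    bounds = [(sides, inner, outer)]
    for _ in range(doublings):
        # Assumed doubling identities (geometric, then harmonic mean).
        inner = math.sqrt(inner * outer)
        outer = 2.0 * inner * outer / (inner + outer)
        sides *= 2
        bounds.append((sides, inner, outer))
    return bounds


if __name__ == "__main__":
    for sides, inner, outer in polygon_area_bounds(8):
        print(sides, inner, outer)
```

Each printed pair is an interval containing $\pi$, and the intervals shrink as the number of sides grows. Archimedes worked with perimeters of 96-sided polygons rather than with this area recurrence.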

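[Fig. 1.3 For Exercise 1.12: a triangle with vertices $A(x_1, y_1)$, $B(x_2, y_2)$, $C(x_3, y_3)$ in the $xy$-plane, with points $P$, $Q$, $R$ marked on the $x$-axis]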

Here, "det" is short for determinant. See [4] for some interesting extensions of this formula.

1.13. Fill in the details of the standard textbook proof that $\sqrt{2}$ is irrational. (This proof was known to Euclid (300 B.C.) and was probably known even to Aristotle (384–322 B.C.).) Assume that $\sqrt{2} = p/q$ is rational, so that $2 = p^2/q^2$. Then $2q^2 = p^2$ must be even. Therefore $p$ is even. Therefore $q^2$ is even and so $q$ is even. So from each of $p$ and $q$ we may cancel a factor of 2. Repeat.

1.14. [6] Fill in the details of the following very slick proof that $\sqrt{2}$ is irrational, due to American mathematician Ivan Niven (1915–1999). If $\sqrt{2}$ is rational then there is a smallest positive integer $b$ such that $b\sqrt{2}$ is an integer. Then $b\sqrt{2} - b$ is a smaller positive integer. Now consider $(b\sqrt{2} - b)\sqrt{2}$.

1.15. (a) Prove that $\sqrt{3}$ is irrational. (b) Prove that $\sqrt[3]{11}$ is irrational. (c) Prove that $\sqrt[7]{45}$ is irrational. (d) Describe how to prove that the $n$th root of any natural number which is not itself an $n$th power is irrational.

1.16. [16] Fill in the details of another proof that the square root of any natural number that is not itself a perfect square is irrational: Let $\sqrt{n} = p/q$ where $p$ and $q$ are positive integers, and have no common factors. Then $p^2$ and $q$ also have no common factors. But then $p^2/q = p\sqrt{n} = qn$, which is an integer.

1.17. [3] Fill in the details of another proof that the square root of any natural number that is not itself a perfect square is irrational: Suppose that $\sqrt{n} = p/q$ where $p$ and $q$ are positive integers, and have no common factors. Then $\sqrt{n} = nq/p$ also, but this is not in lowest terms. Therefore $p$ is an integer multiple of $q$. Therefore $n$ is a perfect square, a contradiction.

1.18. We haven't officially met logarithms yet. Still, prove that $\log_{10}(2)$ is irrational. ($\log_{10}(2)$ is that real number $x$ for which $10^x = 2$.)

1.19. (a) Write $0.823$, $0.455$, and $0.9999\ldots$ as fractions.

(b) Multiply x D 0:2 by 10 and subtract x from the result. Then solve for x to write x as a fraction.(c) Multiply y D 0:91 by 100 and subtract y from the result. Then solve for y to write y as a fraction.(d) Write 0:237 and 6:457132 as fractions.(e) Describe how to prove that any decimal which eventually repeats is a rational number. We’ll see another way to do some of this in Sect. 2.1; see also Exercise 2.4.1.20. Show by doing long division that (a) 1=3 D 0:3 (and conclude that 0:9 D 1), 1(b) 11 D 0:09; and (c) 22 7 D 3:142857. (d) Describe how to prove that any rationalnumber is either a terminating or repeating decimal. (e) Let x D p=q be a rationalnumber. What is the longest that the repeating string in its decimal expansion couldpossibly be? (The length of the repeating string in the decimal expansions in eachof 1=7 and 1=97; for example, is maximal.)1.21. (a) Is it true that the sum of two rational numbers is rational? Explain.(b) How about the sum of a rational number and an irrational number? Explain.(c) How about the sum of two irrational numbers? Explain.1.22. Prove Lemma 1.12 another way: Looking for a contradiction, assume thatA1 ¤ A2 ; say A1 < A2 : Then let " D .A2 A1 /=2.1.23. Four of the following seven sequences converge.˚Decide which four they are, nthen prove that each of them converges. (a) f n3=2 1 g; (b) cos. n / C n1 ; (c) fn.1/ g; ˚p p ˚ n 2 C3 o n2 o(d) n C 1 n ; (e) 5n61 1 ; (f) n2nC2n1 , (g) pnC1 1 p n :

1.24. Use the reverse triangle inequality to prove that if an ! A then jan j ! jAj :1.25. (a) Prove that if an ! A then for any " > 0 there exists N > 0 such that all of the terms of fan g belong to the interval .A "; A C "/; except possibly a1 ; a2 ; : : : ; aN 1 ; aN :(b) Use (a) to prove the following. If an ! A with A > 0 then there exists m > 0 and N > 0 such that an m for n > N .1.26. Prove Lemma 1.21. Suggestion: Use Exercise 1.25(a).1.27. (a) Prove that if an ! 0 and fbn g is a bounded sequence, then an bn ! 0:(b) Show that .1/ nC1=2 ! 0: n

then use Lemma 1.21.

then use the triangle inequality and Exercise 1.25(b).

1.31. (a) Prove the first part of Lemma 1.27. Hint: Assume that A < 0 to get a contradiction.(b) Explain how (a) implies the second part of Lemma 1.27: if an ! A and bn ! B; with an bn ; then A B: p p1.32. Prove that if an ! A, with an 0 (so that A 0 too) then an ! A:Hint: First dispense with the case A D 0: Then for A ¤ 0; ˇ ˇ ˇ ˇ ˇp p ˇˇ ˇˇ an A ˇˇ ˇ an A ˇ ˇ ˇ ˇ: ˇ na A D ˇ ˇp p ˇ p ˇ an C A ˇ ˇ A ˇ

Show that bn ! A. Is the converse true? Explain.1.34. Associate with each of the four real numbers x D 3:6912151821242730 : : : ,x D 0:3; x D 0:1002000300004 : : : , and x D 10:567 an increasing sequencewhich converges to x.1.35. (e.g., [1, 11, 13, 15])(a) Prove that there exist irrational numbers a and b such that ab is rational. Hint: p p2 Begin by considering 2 .(b) Prove that there exist irrational numbers a and b such that ab is irrational. Hint: p p2C1 Begin by considering 2 . See also Exercise 1.36.1.36. [20] Here is a constructive approach to Exercise 1.35.(a) We haven’t officially met logarithms yet. Still, prove that log2 .3/ is irrational. (log2 .3/ is that real number x for which 2x D 3.) p 2 log .3/(b) Verify that 2 2 is rational. So there are irrational numbers a and b such that ab is rational. p log .3/(c) Verify that 2 2 is irrational. So there are irrational numbers a and b such that ab is irrational.Exercises 23

1.37. [19] Consider the area between the circumscribed and inscribed circles of a regular $n$-sided polygon with side lengths 1. Show that this area is independent of $n$ and find the area.

1.38. (a) Show that between any two real numbers there are infinitely many rational numbers.
(b) Show that between any two real numbers there are infinitely many irrational numbers.

1.39. [12] Show that the sequence $\left\{\left(1+\frac{1}{n}\right)^{n+1}\right\}$ is decreasing and bounded below (and hence converges), as follows.
(a) Show that for $0 \le a < b$,

$$a^{n}(b - a) < \frac{b^{n+1} - a^{n+1}}{n+1}.$$

(b) Now set $a = 1 + \frac{1}{n+1}$ and $b = 1 + \frac{1}{n}$.

1.40. Consider the sequence $\{a_n\} \subset \mathbb{R}$ defined recursively via $a_1 = 1$, and $a_{n+1} = \sqrt{1 + a_n}$ for $n = 1, 2, 3, \ldots$. Use the Increasing Bounded Sequence Property (Theorem 1.34) and mathematical induction to show that $\{a_n\}$ converges to a real number $\varphi$. Show that $\varphi = (1 + \sqrt{5})/2$, the golden mean. Show that $\varphi$ is irrational.

1.41. For each of the four real numbers $x = 3.69121518\ldots$, $x = 0.1002000300004\ldots$, $x = 0.3$, and $x = 10.567$, construct a sequence of nested intervals $[a_1, b_1] \supseteq [a_2, b_2] \supseteq \cdots$, with $b_n - a_n \to 0$, such that each interval contains $x$.

1.42. Consider the sequence defined by $a_0 = 1$, and $a_{n+1} = 1 + \frac{1}{a_n}$ for $n = 0, 1, 2, 3, \ldots$.
(a) Show that $\{[a_{2n}, a_{2n+1}]\}$ is a collection of nested intervals, with $a_{2n+1} - a_{2n} \to 0$.
(b) Show that the point given by the Nested Interval Property (Theorem 1.41) is the golden mean $\varphi = (1 + \sqrt{5})/2$.

1.43. A set $A$ is countable if there is a one-to-one onto function $f : \mathbb{N} \to A$. (So all of its elements can be listed off: $f(1), f(2), f(3), \ldots$.)
(a) Show that $\mathbb{Z}$ is countable.
(b) Show that $\{x \in \mathbb{Q} : 0 < x < 1\}$ is countable.
(c) Show that $\{x \in \mathbb{R} : 0 < x < 1\}$ is uncountable, that is, is not countable.
(d) Show that $\mathbb{Q}$ is countable. (A formula for a one-to-one onto $f : \mathbb{Z} \to \mathbb{Q}$ can be found in [9]. For a very slick proof that $\mathbb{Q}$ is countable, see [7] or [22].)

1.44. [17] (If you did Exercise 1.43.) Fill in the details of the following proof that the set of algebraic numbers (that is, the set of all roots of all polynomials of any degree, with integer coefficients) is countable. This amazing fact was discovered in 1871 by the great German mathematician Georg Cantor (1845–1918). But first:
(a) Show that a quick consequence of Cantor's discovery is that $\mathbb{Q}$ is countable.
(b) Consider the polynomial equation

$$p(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 = 0,$$

where the $a_j$'s are integers. This equation has at most $n$ solutions. We may assume that $a_n \ge 1$. How?
(c) Define the index of any such polynomial $p$ as

$$\operatorname{index}(p) = n + |a_n| + |a_{n-1}| + \cdots + |a_1| + |a_0|.$$

Show, for example, that there is only one such polynomial with index 2. There are four such polynomials with index 3. There are 11 such polynomials with index 4. Argue that there are only finitely many polynomials with a given index.
(d) Now show that the set of algebraic numbers is countable.
(e) Show that the set of transcendental numbers is uncountable.
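The counts quoted in Exercise 1.44(c) can be checked by brute force. The Python sketch below assumes that the index of $p(x) = a_n x^n + \cdots + a_0$ (with $n \ge 1$ and $a_n \ge 1$) is $n + |a_n| + \cdots + |a_0|$, that is, that the degree is included in the index; this is the reading under which the counts 1, 4 and 11 come out. The function name `polynomials_with_index` and the tuple representation of a polynomial are our own choices.

```python
from itertools import product


def polynomials_with_index(index):
    """List coefficient tuples (a_n, ..., a_0) with n >= 1, a_n >= 1 and
    n + |a_n| + ... + |a_0| == index (assumed definition of the index)."""
    found = []
    for degree in range(1, index):
        total = index - degree  # what the absolute values must add up to
        for lead in range(1, total + 1):
            rest = total - lead
            # Remaining `degree` coefficients may have either sign.
            for tail in product(range(-rest, rest + 1), repeat=degree):
                if sum(abs(c) for c in tail) == rest:
                    found.append((lead,) + tail)
    return found


if __name__ == "__main__":
    for k in (2, 3, 4):
        polys = polynomials_with_index(k)
        print(k, len(polys), polys)
```

Because each index admits only finitely many coefficient tuples, listing the polynomials index by index is one way to organize the enumeration asked for in part (d).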

I speak not as desiring more, but rather wishing a more strict

restraint. —Isabella, in Measure for Measure, by William Shakespeare

In this chapter we meet three very important inequalities: Bernoulli's Inequality, the Arithmetic Mean–Geometric Mean Inequality, and the Cauchy–Schwarz Inequality. At first we consider only pre-calculus versions of these inequalities, but we shall soon see that a thorough study of inequalities cannot be undertaken without calculus. And really, calculus cannot be thoroughly understood without some knowledge of inequalities. We define Euler's number $e$ by a more systematic method than that of Example 1.32. We'll see that this method engenders many fine extensions.

2.1 Bernoulli’s Inequality and Euler’s Number e

The following is a very useful little inequality. It is named for the Swiss mathematician Johann Bernoulli (1667–1748).

Lemma 2.1. (Bernoulli's Inequality) Let $n = 1, 2, 3, \ldots$. Then for $x > -1$,

is called a geometric series. In it, each term $x^k$ after the first is the geometric mean of the term just before it and the term just after it: $x^k = \sqrt{x^{k-1} x^{k+1}}$. Here we find a formula for the sum of a geometric series, when it exists. For $x \neq 1$ the following identity can be found by doing long division on the right-hand side, or simply verified by cross multiplication:

$$\sum_{k=0}^{n} x^k = 1 + x + x^2 + \cdots + x^n = \frac{1 - x^{n+1}}{1 - x}.$$
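The finite geometric sum identity above is easy to test with exact rational arithmetic. In this minimal Python sketch the function names `geometric_sum` and `closed_form` are our own; passing `fractions.Fraction` values avoids rounding, so the two sides can be compared with `==`.

```python
from fractions import Fraction


def geometric_sum(x, n):
    """Add up 1 + x + x**2 + ... + x**n term by term."""
    return sum(x ** k for k in range(n + 1))


def closed_form(x, n):
    """Evaluate (1 - x**(n + 1)) / (1 - x); requires x != 1."""
    return (1 - x ** (n + 1)) / (1 - x)


if __name__ == "__main__":
    for x in (Fraction(1, 2), Fraction(-3, 7), Fraction(5, 2)):
        for n in (0, 1, 5, 12):
            # Both sides are exact rationals, so equality is exact.
            print(x, n, geometric_sum(x, n) == closed_form(x, n))
```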

and the result follows upon letting $n \to \infty$. ◊

Example 2.6. [12,16,55] Using Bernoulli's Inequality (Lemma 2.1) we show again (cf. Example 1.32) that $\left\{\left(1+\frac{1}{n}\right)^{n}\right\}$ converges. We have seen that the number to which this sequence converges is Euler's number $e$. First, observe that

Then applying Bernoulli's Inequality,

$$\left(1+\frac{1}{n}\right)\left(1-\frac{1}{(n+1)^2}\right)^{n+1} \ge \left(1+\frac{1}{n}\right)\left(1-\frac{1}{n+1}\right) = 1.$$

Therefore

$$\left(1+\frac{1}{n+1}\right)^{n+1} \ge \left(1+\frac{1}{n}\right)^{n},$$

and so $\left\{\left(1+\frac{1}{n}\right)^{n}\right\}$ is increasing. In a very similar way, which we leave for Exercise 2.6 (see also Example 1.43 and Exercise 1.39), one can show that $\left\{\left(1+\frac{1}{n}\right)^{n+1}\right\}$ is decreasing. Then we have

$$(a + b)^2 - 4ab = (a - b)^2 \ge 0.$$

Therefore,

$$(a + b)^2 \ge 4ab, \quad\text{or}\quad \sqrt{ab} \le \frac{a+b}{2}.$$

Now if $a = b$, then clearly $\sqrt{ab} = \frac{a+b}{2}$. Conversely, if $\sqrt{ab} = \frac{a+b}{2}$ then in the first line of the proof we must have $(a - b)^2 = 0$, and so $a = b$. □

The average $A = \frac{a+b}{2}$ is known as the Arithmetic Mean of $a$ and $b$. The quantity $G = \sqrt{ab}$ is known as their Geometric Mean. A rather satisfying Proof Without Words for Lemma 2.7, which also suggests why $\sqrt{ab}$ is called the Geometric Mean, is shown in Fig. 2.1. See also Exercise 2.19.

[Fig. 2.1: $G = \sqrt{ab} \le A = \frac{a+b}{2}$, with segments labeled $a$, $b$, $A$, and $G$]

Example 2.8. Suppose we use a balance to determine the mass of an object. We place the object on the left side of the balance and a known mass on the right side, to obtain a measurement $a$. Then we place the object on the right side of the balance and a known mass on the left side, to obtain a measurement $b$. By the principle of the lever (or more generally, the principle of moments) the true mass of the object is the Geometric Mean $\sqrt{ab}$. (See also Exercise 2.15.) ◊

Sometimes the most important feature of an inequality is the case in which equality occurs, as the following example illustrates.

Example 2.9. A rectangle with side lengths $a$ and $b$ has perimeter $P = 2a + 2b$ and area $T = ab$. Lemma 2.7 reads $ab \le \left(\frac{a+b}{2}\right)^2$, or $T \le (P/4)^2$. So a rectangle with given perimeter has greatest area when $a = b$, i.e., when the rectangle is a square. Likewise, a rectangle with given area has least perimeter when $a = b$, again when the rectangle is a square. In either case, $T = (P/4)^2$. ◊

The most natural extension of Lemma 2.7 is to allow $n$ positive numbers instead of just two. But we need to know what would be meant by Arithmetic Mean and Geometric Mean in this case. These turn out to be exactly as one might expect, as follows. Let $a_1, a_2, \ldots, a_n$ be real numbers. Their Arithmetic Mean is given by

$$A = \frac{a_1 + a_2 + \cdots + a_n}{n} = \frac{1}{n}\sum_{j=1}^{n} a_j.$$

If these numbers are also nonnegative, then their Geometric Mean is given by

$$G = \left[(a_1)(a_2)\cdots(a_n)\right]^{1/n} = \left(\prod_{j=1}^{n} a_j\right)^{1/n}.$$

A number $M = M(a_1, a_2, \ldots, a_n)$ which depends on $a_1, a_2, \ldots, a_n$ is called a mean simply if it satisfies

$$\min_{1 \le j \le n}\{a_j\} \le M \le \max_{1 \le j \le n}\{a_j\}.$$

However, for practical purposes one often desires other properties, like (i) having $M(a_1, a_2, \ldots, a_n)$ independent of the order in which the numbers $a_1, a_2, \ldots, a_n$ are arranged, and (ii) having $M(ta_1, ta_2, \ldots, ta_n) = tM(a_1, a_2, \ldots, a_n)$ for any $t \ge 0$. The reader should agree that $A$ and $G$ each satisfy (i) and (ii).

The Arithmetic Mean–Geometric Mean Inequality below, or what we shall call the AGM Inequality for short, extends Lemma 2.7 to $n$ numbers. This inequality is of fundamental importance in mathematical analysis. The great French mathematician Augustin Cauchy (1789–1857) was the first to prove it, in 1821. We provide his proof at the end of this section. (The Scottish mathematician Colin Maclaurin (1698–1746) had an earlier proof, around 1729, which wasn't quite complete.)

The list of mathematicians who have offered proofs of the AGM Inequality over the years is impressive. It includes Liouville, Hurwitz, Steffensen, Bohr, Riesz, Sturm, Rado, Hardy, Littlewood, and Polya. (See, e.g., [5,9,18,32,49]; the book [9] contains over 75 proofs.) Below we provide the clever 1976 proof given by K.M. Chong [10].

Theorem 2.10. (AGM Inequality) Let $a_1, a_2, \ldots, a_n$ be $n$ positive real numbers, where $n \ge 2$. Then

$$G \le A,$$

and equality occurs here $\iff a_1 = a_2 = \cdots = a_n$.
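Before turning to the proof, the statement $G \le A$ can be tried out numerically. The following Python sketch computes both means for a list of positive numbers; the function names and the sample data are our own, and since floating-point results are approximate, the equality case is compared with `math.isclose` rather than `==`.

```python
import math


def arithmetic_mean(values):
    """(a_1 + ... + a_n) / n."""
    return sum(values) / len(values)


def geometric_mean(values):
    """(a_1 * ... * a_n) ** (1/n), for positive numbers."""
    return math.prod(values) ** (1.0 / len(values))


if __name__ == "__main__":
    for sample in ([1.0, 4.0], [1.1, 1.5, 1.3], [2.0, 2.0, 2.0, 2.0]):
        G, A = geometric_mean(sample), arithmetic_mean(sample)
        # G should not exceed A, with (near) equality only for equal entries.
        print(sample, G, A, G <= A or math.isclose(G, A))
```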

Proof. If $n = 2$ then the result is simply Lemma 2.7, so we consider $n \ge 3$. By rearranging the $a_j$'s if necessary, we may suppose that $a_1 \le a_2 \le \cdots \le a_{n-1} \le a_n$. Then $0 < a_1 \le A \le a_n$, and so

$$A(a_1 + a_n - A) - a_1 a_n = (a_1 - A)(A - a_n) \ge 0.$$

That is,

$$a_1 + a_n - A \ge \frac{a_1 a_n}{A}. \tag{2.1}$$

Take $n = 3$ here, and notice that the Arithmetic Mean of the two numbers $a_2$ and $a_1 + a_3 - A$ is $A$. Now we apply Lemma 2.7 to these two numbers, along with (2.1), to get

$$A^2 \ge a_2(a_1 + a_3 - A) \ge a_2 \frac{a_1 a_3}{A}.$$

That is,

$$A^3 \ge a_1 a_2 a_3.$$

Now take $n = 4$. The Arithmetic Mean of the three numbers $a_2$, $a_3$ and $a_1 + a_4 - A$ is again $A$, so by what we have just shown applied to these three numbers, along with (2.1),

$$A^3 \ge a_2 a_3 (a_1 + a_4 - A) \ge a_2 a_3 \frac{a_1 a_4}{A}.$$

That is,

$$A^4 \ge a_1 a_2 a_3 a_4.$$

Clearly we could continue this procedure indefinitely, showing that $A \ge G$ for any positive integer $n$, and so we have proved the main part of the theorem. Now to address the equality conditions. If $a_1 = a_2 = \cdots = a_n$, then it is easily verified that $G = A$. Conversely, if $A = G$ for some particular $n$, then in the argument above, $(a_1 - A)(A - a_n) = 0$ so that $a_1 = a_n = A$, and therefore $a_1 = a_2 = \cdots = a_n = A$. □

Example 2.11. Suppose that an investment returns 10 % in the first year, 50 % in the second year, and 30 % in the third year. Using the Geometric Mean

$$\left[(1.1)(1.5)(1.3)\right]^{1/3} \cong 1.289\ldots,$$

the average rate of return over the 3 years is just under 29 %. The Arithmetic Mean gives an overestimate of the average rate of return, at 30 %. ◊

Example 2.12. We saw in Example 2.9 that a rectangle with given perimeter has greatest area when the rectangle is a square, and that a rectangle with given area has least perimeter when the rectangle is a square. Likewise, using the AGM Inequality (Theorem 2.10), a box (even in $n$ dimensions) with given surface area has greatest volume when the box is a cube, and a box (even in $n$ dimensions) with given volume has least surface area when the box is a cube. ◊

Example 2.13. Named for Heron of Alexandria (c. 10–70 AD), Heron's formula gives the area $T$ of a triangle in terms of its three side lengths $a, b, c$ and perimeter $P$, as follows:

$$16T^2 = P(P - 2a)(P - 2b)(P - 2c).$$

Therefore, for a triangle with fixed perimeter $P$, its area $T$ is largest possible when $P - 2a = P - 2b = P - 2c$. This is precisely when $a = b = c$, that is, when the triangle is equilateral. Likewise a triangle with fixed area $T$ has least perimeter $P$ when it is an equilateral triangle. In either case, $T = P^2/(12\sqrt{3})$. ◊

Remark 2.14. We saw in Example 2.13 that for a triangle, we have $T \le P^2/(12\sqrt{3})$. In Example 2.9, we saw that for a rectangle, $T \le P^2/16$. This latter inequality persists for all quadrilaterals having area $T$ and perimeter $P$. See Exercise 2.29. These inequalities are called isoperimetric inequalities. The isoperimetric inequality for an $n$-sided polygon is

$$T \le \frac{P^2}{4n\tan(\pi/n)},$$

and equality holds if and only if the polygon is regular. The isoperimetric inequality for any plane figure with area $T$ and perimeter $P$ is

$$T \le \frac{P^2}{4\pi}.$$

The famous isoperimetric problem was to prove that equality holds here if and only if the plane figure is a circle. The solution of the isoperimetric problem takes up an important and interesting episode in the history of mathematics [37]. For a polished modern solution, see [25]. The reader might find it somewhat comforting that

$$\lim_{n \to \infty}\left[n\tan\left(\frac{\pi}{n}\right)\right] = \pi.$$

This can be verified quite easily (see Exercise 5.46) using L'Hospital's Rule, which we meet in Sect. 5.3. ◦

Example 2.15. [26] We show that Bernoulli's Inequality (Lemma 2.1)

Then in $H \le G \le A$, the outside inequality can be rewritten rather nicely as

$$\sum_{j=1}^{n} a_j \sum_{j=1}^{n} \frac{1}{a_j} \ge n^2. \tag{2.2}$$

Again, equality occurs here if and only if $a_1 = a_2 = \cdots = a_n$. The Harmonic Mean of two numbers $a, b > 0$ is simply

$$H = \frac{2ab}{a+b}.$$

In this case, (2.2) reads

$$(a + b)\left(\frac{1}{a} + \frac{1}{b}\right) \ge 4,$$

and equality occurs here if and only if $a = b$. ◊
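Inequality (2.2) and the two-number formula for the Harmonic Mean can be checked with a few lines of Python. The sketch below assumes the usual definition of the Harmonic Mean of $n$ positive numbers, $H = n / \sum_{j} (1/a_j)$; the function names are hypothetical, and the floating-point comparisons are only a sanity check.

```python
import math


def harmonic_mean(values):
    """n divided by the sum of reciprocals (assumed definition of H)."""
    return len(values) / sum(1.0 / v for v in values)


def sum_times_reciprocal_sum(values):
    """Left-hand side of (2.2): (sum of a_j) * (sum of 1/a_j)."""
    return sum(values) * sum(1.0 / v for v in values)


if __name__ == "__main__":
    sample = [1.0, 2.0, 4.0]
    n = len(sample)
    print(harmonic_mean(sample), sum(sample) / n)       # H and A
    print(sum_times_reciprocal_sum(sample), n ** 2)     # (2.2): lhs >= n^2
    a, b = 3.0, 6.0
    print(harmonic_mean([a, b]), 2 * a * b / (a + b))   # two-number formula
```

Under this definition, $H \le A$ is exactly the statement $\sum a_j \sum (1/a_j) \ge n^2$ after clearing denominators.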

To close this section, we supply Cauchy's brilliant 1821 proof of the AGM Inequality (Theorem 2.10) but without addressing the equality conditions; these we leave for Exercise 2.35. The pattern of argument here is powerful and has since been used by mathematicians in many other contexts. (We shall see it applied in one other context in Sect. 8.3.)

Proof. Again, if $n = 2$, this is simply Lemma 2.7. If $n = 4$, we use Lemma 2.7 twice:

$$\frac{a_1 + a_2 + \cdots + a_n + (2^m - n)A}{2^m} = A.$$

The numerator of the left-hand side here has $2^m$ members in the sum and so we can apply what we have proved so far to see that

$$\left(a_1 a_2 \cdots a_n A^{2^m - n}\right)^{1/2^m} \le A.$$

That is,

$$a_1 a_2 \cdots a_n A^{2^m - n} \le A^{2^m} \implies G^n \le A^n.$$

□

Remark 2.17. Extending Lemma 2.7 to $n = 4, 8, 16, \ldots$ as above is not too hard, just a bit messy. Cauchy's genius lies in being able to extend the result to any $n$. With this in mind we mention that T. Harriot proved the AGM Inequality (Theorem 2.10) for $n = 3$ around 1600 [39]. No small feat for the time. ◦

which is really what we wanted to show. □

For example, suppose that three squares with side lengths $a_1$, $a_2$ and $a_3$ have average area $T$. Then the single square with area $T$ is the one with side length $R$. The reader should verify that $R$ is a mean. We have seen that $G \le A$. The Cauchy–Schwarz Inequality (Theorem 2.18) shows that $A \le R$, on taking $b_1 = b_2 = \cdots = b_n = 1/n$. ◊

Remark 2.20. Readers who know some linear algebra might recognize the Cauchy–Schwarz Inequality in the following form. For two vectors $u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$ in $\mathbb{R}^n$, their dot product is given by $u \cdot v = \sum_{j=1}^{n} a_j b_j$, and the length of $u$ is given by $\|u\| = \sqrt{u \cdot u}$. Then the Cauchy–Schwarz Inequality reads

$$|u \cdot v| \le \|u\| \, \|v\|.$$

(See [48], for example, for a proof of the Cauchy–Schwarz Inequality in this context.) This says that for non-zero vectors $u$ and $v$ we have

$$-1 \le \frac{u \cdot v}{\|u\| \, \|v\|} \le 1,$$

so we may define the angle between such vectors in $\mathbb{R}^n$ as that $\theta \in [0, \pi]$ for which

$$\cos(\theta) = \frac{u \cdot v}{\|u\| \, \|v\|}.$$

by the Cauchy–Schwarz Inequality. This last piece equals $(\|u\| + \|v\|)^2$, and so we have the triangle inequality in $\mathbb{R}^n$:

$$\|u + v\| \le \|u\| + \|v\|.$$

So the Cauchy–Schwarz Inequality is fundamental for working in $\mathbb{R}^n$. And, in $\mathbb{R}^n$ it is evident why the triangle inequality is so named; see Fig. 2.2. ◦

[Fig. 2.2 The Triangle Inequality $\|u + v\| \le \|u\| + \|v\|$, showing the vectors $v$ and $u + v$]
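The vector form of the Cauchy–Schwarz Inequality, the angle it defines, and the triangle inequality from Remark 2.20 can all be illustrated with plain Python lists. This is a minimal sketch: the helper names `dot`, `norm` and `angle` are ours, and the clamp inside `angle` is only a guard against floating-point rounding pushing the cosine slightly outside $[-1, 1]$.

```python
import math


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Length of u, i.e. sqrt(u . u)."""
    return math.sqrt(dot(u, u))


def angle(u, v):
    """Angle in [0, pi] between two non-zero vectors."""
    cosine = dot(u, v) / (norm(u) * norm(v))
    return math.acos(max(-1.0, min(1.0, cosine)))  # guard against rounding


if __name__ == "__main__":
    u, v = [1.0, 2.0, 2.0], [2.0, -1.0, 3.0]
    total = [a + b for a, b in zip(u, v)]
    print(abs(dot(u, v)), norm(u) * norm(v))   # Cauchy-Schwarz
    print(norm(total), norm(u) + norm(v))      # triangle inequality
    print(angle(u, v))
```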

There are many other proofs of the Cauchy–Schwarz Inequality, a few of which we explore in the exercises. However, we would be remiss if we did not supply what is essentially H. Schwarz's (1843–1921) own ingenious proof, as follows. (See also Exercises 2.38 and 2.41.) For any real number $t$,

$$\sum_{j=1}^{n} (t a_j + b_j)^2 \ge 0.$$

That is,

$$t^2 \sum_{j=1}^{n} a_j^2 + 2t \sum_{j=1}^{n} a_j b_j + \sum_{j=1}^{n} b_j^2 \ge 0.$$

Now the left-hand side is a quadratic in the variable $t$ and since it is $\ge 0$, it must have either no real root or one real root. (It cannot have two distinct real roots.) Therefore its discriminant $B^2 - 4AC$ must be $\le 0$. That is,

$$\left(2\sum_{j=1}^{n} a_j b_j\right)^2 - 4\sum_{j=1}^{n} a_j^2 \sum_{j=1}^{n} b_j^2 \le 0.$$
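Schwarz's argument can be mirrored in a short computation: form the coefficients of the quadratic in $t$ and inspect the sign of its discriminant. The Python sketch below uses integer inputs so the arithmetic is exact; the function names are our own.

```python
def schwarz_quadratic(a, b):
    """Coefficients (A, B, C) of sum((t*a_j + b_j)**2) = A*t**2 + B*t + C."""
    A = sum(x * x for x in a)
    B = 2 * sum(x * y for x, y in zip(a, b))
    C = sum(y * y for y in b)
    return A, B, C


def discriminant(a, b):
    """B**2 - 4*A*C for the quadratic above; never positive."""
    A, B, C = schwarz_quadratic(a, b)
    return B * B - 4 * A * C


if __name__ == "__main__":
    print(discriminant([1, 2, 3], [4, -5, 6]))  # negative: no real root
    print(discriminant([1, 2, 3], [2, 4, 6]))   # zero: b is a multiple of a
```

A zero discriminant corresponds to the quadratic having one real root, which happens when one list is a constant multiple of the other.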

[Fig. 2.3 For Exercise 2.11: a diagram with segments labeled $a$, $b$, and $a - b$]

2.12. (a) Fill in the details of another proof of Lemma 2.7, as follows. Let $a, b \ge 0$. Since $(t - \sqrt{a})(t - \sqrt{b})$ has real zeros, conclude that $\sqrt{ab} \le \frac{a+b}{2}$.
(b) Let $a, c > 0$. Show that if $|b| > a + c$ then $ax^2 + bx + c$ has two (distinct) real roots.

2.13. [19] In Fig. 2.4, $ABCD$ is a trapezoid with $AB$ parallel to $DC$, and $EF$ is parallel to each of these. Show that $m$ is a weighted average of $a$ and $b$. That is, $m = \frac{pb + qa}{p + q}$, for some $p, q > 0$.

2.14. (a) Show that for $x > 0$, we have $x + 1/x \ge 2$, with equality if and only if $x = 1$.
(b) Conclude (even though we have not yet officially met the exponential function) that

$$\cosh(x) = \frac{e^x + e^{-x}}{2} \ge 1,$$

with equality if and only if $x = 0$.
(c) [31] Show that for $x > 0$,

$$\frac{x^n}{1 + x + x^2 + \cdots + x^{2n}} \le \frac{1}{2n+1}.$$

[Fig. 2.4 For Exercise 2.13: trapezoid $ABCD$ with side $DC$ labeled $a$, side $AB$ labeled $b$, and segment $EF$ labeled $m$ between them]

2.15. [51] The bank in your town sells British pounds at the rate of $S$ dollars per pound and buys them at the rate of $B$ dollars per pound. You and your friend want to exchange dollars and pounds between the two of you, at a rate that is fair to both. Show that the fair exchange rate is $\sqrt{SB}$ dollars per pound, the Geometric Mean of $S$ and $B$.

2.16. [13] Let $H \le G \le A$ denote respectively the Harmonic, Geometric and Arithmetic Means of two positive numbers.
(a) Show that $H$, $G$, and $A$ are the side lengths of a triangle if and only if

$$\frac{3 - \sqrt{5}}{2} < \frac{A}{H} < \frac{3 + \sqrt{5}}{2}.$$

(b) Show that $H$, $G$, and $A$ are the side lengths of a right triangle if and only if $A/H$ is the golden mean:

$$\frac{A}{H} = \frac{1 + \sqrt{5}}{2}.$$

2.17. [56] (a) Prove that for any natural number $n$, we have $\sum_{k=1}^{n} k = \frac{n(n+1)}{2}$.

(d) Use this to obtain the desired result.

2.19. In Fig. 2.5, which shows a semicircle with diameter $a + b$, we can see that $A > G > H$ as labeled. Use elementary geometry to show that $A$, $G$, and $H$ are respectively the Arithmetic, Geometric and Harmonic Means of $a$ and $b$.

[Fig. 2.5 For Exercise 2.19: a semicircle with diameter $a + b$ and segments labeled $A$, $G$, and $H$]

2.20. (a) Suppose that a car travels at $a$ miles per hour from point A to point B, then returns at $b$ miles per hour. Show that the average speed for the trip is the Harmonic Mean of $a$ and $b$.
(b) Show that $G - H < A - G$, where $H$, $G$ and $A$ are the Harmonic, Geometric, and Arithmetic Means of two numbers $a, b > 0$.
(c) For $a, b > 0$, Heron's Mean, named for Heron of Alexandria (c. 10–70 AD), is

$$H_O = \frac{a + \sqrt{ab} + b}{3}.$$

Show that if $a \neq b$, then $G < H_O < A$, where $G$ and $A$ are the Geometric and Arithmetic Means of $a$ and $b$.

2.21. Let $a_1, a_2, \ldots, a_n > 0$. We saw in (2.2) that

$$\sum_{j=1}^{n} a_j \sum_{j=1}^{n} \frac{1}{a_j} \ge n^2.$$

Apply this to the three numbers $a + b$, $a + c$, and $b + c$ to obtain Nesbitt's Inequality:

$$\frac{a}{b+c} + \frac{b}{a+c} + \frac{c}{a+b} \ge \frac{3}{2}.$$

2.22. Denote by H; G and A the Harmonic, Geometric, and Arithmetic Means of

anC1 D H.an ; bn / and bnC1 D A.an ; bn /:

Show that fŒan ; bn g is a sequence of nested intervals, with bn an ! 0:

Conclude by the Nested Interval Property (Theorem 1.41) that there is c belonging to p each of these p intervals.(c) Show that an bn D ab D G for all n to conclude that c D G: (For example, if a D 1 and b D 2;pthen fan g is an increasing sequence of rational numbers which converges to 2:)2.23.p[43] In Example 2.5 we used Bernoulli’s Inequality (Lemma 2.1) to showthat n n ! 1 as n ! 1: Prove this usingp the AGM Inequality (Theorem 2.10), bysetting a1 D a2 D D an1 and an D n:2.24. Show that

Note: Many variations of Exercise 2.25 have been discovered and rediscoveredover the years (e.g., [15, 22, 27, 29–31, 41, 57]). Other approaches can be found in[3, 17, 42, 44].2.26. [59] Apply the AGM Inequality (Theorem 2.10) to the n C k numbers

where is half of the sum of any pair of opposite angles. If the quadrilateral canbe inscribed in a circle then elementary geometry shows that D =2 and we getBrahmagupta’s formula p A D .s a/.s b/.s c/.s d /:

(And if d D 0 then the quadrilateral is in fact a triangle and we get Heron’s formula.)Show that among all quadrilaterals with a given perimeter, the square has the largestarea.Exercises 43

2.30. In our proof (that is, K.M. Chong’s) of the AGM Inequality (Theorem 2.10)we focused on A and used the inequality (2.1). Fill in the details of the followingproof, which focuses instead on G.(a) Show (assuming again a1 a2 an ) that a1 an a1 C an G : G(b) Use this to prove the AGM Inequality.2.31. Fill in the details of H. Dorrie’s beautiful 1921 proof of the AGM Inequality(Theorem 2.10), as follows. (This proof was rediscovered by P.P. Korovkin in 1952[22] and again by G. Ehlers in 1954 [5].) Lemma 2.7 is the case n D 2; so weproceed by induction, assuming that the result is true for n 1 numbers. What we Pnwant to show is that aj nG: j D1

Q n(a) Argue that since G n D aj ; at least one aj must be G; and some other aj j D1 must be G: So we may assume that a1 G and a2 G:(b) Show that a1 G and a2 G imply that

2.35. Analyze Cauchy’s proof of the AGM Inequality (Theorem 2.10) given at theend of Sect. 2.2 to obtain necessary and sufficient conditions for equality.2.36. In Cauchy’s proof of the AGM Inequality (Theorem 2.10) given at the end ofSect. 2.2, we focused on A and had (for 2m > n):

then apply the result for the 2m case.

2.37. We used Lemma 2.7 to prove the Cauchy–Schwarz Inequality (Theo-rem 2.18). Then we used the Cauchy–Schwarz Inequality to show that A R.Show that A R using Lemma 2.7 directly. When does equality hold?2.38. Fill in the details of the following proof of the Cauchy–Schwarz Inequality(Theorem 2.18), which is very similar to Schwarz’s.(a) Dispense with the case a1 D a2 D D an D 0: Pn(b) Expand the sum in the expression 0 .t aj C bj /2 . j D1 P n P n(c) Set t D aj bj = aj2 : j D1 j D1 P n (This is the t at which the quadratic .t aj C bj /2 attains its minimum.) j D1

2.39. Fill in the details of another proof of the Cauchy–Schwarz Inequality

(Theorem 2.18), as follows.(a) Replace a with a2 and b with b 2 in Lemma 2.7 to get ab 12 a2 C 12 b 2 : p(b) Write ab D t a p1 t b in (a) to show that for numbers a; b and any t > 0,

(d) Dispense with the case a1 D a2 D D an D 0; then set

2.40. Apply Schwarz’s idea, as in his proof of the Cauchy–Schwarz Inequality

P n(Theorem 2.18), to .aj C t /2 : What do you get? Can you prove whatever you j D1got using the Cauchy–Schwarz Inequality?2.41. [53] Fill in the details of the following proof of the Cauchy–SchwarzInequality (Theorem 2.18), which is quite possibly just as slick as Schwarz’s.Observe that 0 12 P n

And the inequality is reversed if the sequences have opposite monotonicity. Fill in the details of the following proof of Chebyshev's Inequality, for the $a_j$'s and $b_j$'s both increasing. (The other case is handled similarly.) First, let $A = \frac{1}{n} \sum_{j=1}^{n} a_j$.

in order to conclude that $\frac{1}{n} \sum_{j=1}^{n} (a_j - A)^2 \le (M - A)(A - m)$. (This inequality was obtained differently, and generalized considerably, in [4].)
(b) Show that this inequality is better than, that is, is a refinement of Popoviciu's Inequality:

$$\frac{1}{n} \sum_{j=1}^{n} (a_j - A)^2 \le \frac{1}{4} (M - m)^2 .$$

Hint: Show that the quadratic $(Q - x)(x - q)$ is maximized when $x = \tfrac12 (Q + q)$.

It is easy to be brave from a safe distance.

—Aesop

Let $I \subseteq \mathbb{R}$ be an interval (open, closed, or otherwise). For the sake of simplicity, but without great loss, we mainly consider functions $f : I \to \mathbb{R}$. Roughly speaking, if $f$ is continuous then $f(x)$ is close to $f(x_0)$ whenever $x \in I$ is close to $x_0 \in I$. Many functions which arise naturally in applications are continuous on some interval $I$. We shall see that continuous functions have very nice properties. The two big theorems in the world of continuous functions are the Intermediate Value Theorem and the Extreme Value Theorem. We prove these using bisection algorithms.

3.1 Basic Properties

Let $I$ be an interval (open, closed, or otherwise) and let $f : I \to \mathbb{R}$. We say that $f$ is continuous on $I$ if $f$ is continuous at every $x_0 \in I$. That is, for every $x_0 \in I$ and for any sequence $\{x_n\}$ in $I$ for which $x_n \to x_0$,

$$\lim_{n \to \infty} f(x_n) = f\big(\lim_{n \to \infty} x_n\big) = f(x_0).$$

So the operation defined by $f$ and the operation of taking the limit can be interchanged. More precisely: For any sequence $\{x_n\}$ in $I$ for which $x_n \to x_0 \in I$, and for any $\varepsilon > 0$, there is a number $N$ such that $|f(x_n) - f(x_0)| < \varepsilon$ for $n > N$.

If a function is continuous on $I$ then its graph has no jumps or breaks on $I$. (So one cannot really know that a particular function has a graph with no jumps or breaks until it has been verified that the function is continuous.) In Fig. 3.1, the graphed function is continuous on $(a, b)$, except at two points.

Example 3.1. We can rely very heavily on what we know about sequences to prove things about continuous functions. For example, if $f$ and $g$ are each continuous at $x_0$ then so is $f + g$. Here's why: If $x_n \to x_0$ then $f(x_n) \to f(x_0)$ and $g(x_n) \to g(x_0)$;

is continuous on $(-\infty, 0)$ and on $(0, +\infty)$, but $f$ is not continuous at $x_0 = 0$: Any

Fig. 3.3 For Example 3.3. Here, $x_n \to 0$ can be arbitrarily close to 0, but with $f(x_n)$ always being a fixed positive distance from $f(0)$. So $f(x)$ does not get close to $f(0)$ as $x$ gets close to 0

Roughly, the idea is that for a continuous function $f$ defined on $I$, if $x \in I$ is close to $x_0 \in I$ then $f(x)$ is close to $f(x_0)$. (The formal definition is needed to make precise the two instances of the word close.) The following useful result illustrates this idea very nicely.

Lemma 3.4. Let $f$ be continuous on $[a, b]$, with $f(x_0) \neq 0$ for some $x_0 \in [a, b]$. Then there is a closed interval $J \subseteq [a, b]$ containing $x_0$ such that $f(x) \neq 0$ for every $x \in J$.

Proof. For $n = 1, 2, 3, \ldots$, let $J_n$ be any closed interval of length $(b-a)/n$ which contains $x_0$. If the conclusion of the lemma is not true, then there is a point $x_n \in J_n$ such that $f(x_n) = 0$. Now $(b-a)/n \to 0$ and so $x_n \to x_0$, and since $f$ is continuous we must have $f(x_n) \to f(x_0)$. Finally, $f(x_n) = 0$ implies that $f(x_0) = 0$, a contradiction. $\square$

We assume that the reader has some familiarity with continuous functions. We cite the following simple facts which are inherited from Lemmas 1.24–1.26. We shall use these facts freely, often without explicit mention. Their proofs are left as Exercises 3.2, 3.4 and 3.6, respectively. If $f$ and $g$ are each continuous functions on $I$, then so are $\alpha f + \beta g$ (for any $\alpha, \beta \in \mathbb{R}$), $fg$, and $f/g$ (as long as $g \neq 0$ on $I$).

The reader should agree that it is immediate from the definition that the functions $f(x) = 1$ and $g(x) = x$ are continuous on $\mathbb{R}$. Therefore, by the first two facts from the previous paragraph, any polynomial is continuous on $\mathbb{R}$. And by the third fact, any rational function (a polynomial divided by a polynomial) is continuous wherever it is defined, that is, wherever its denominator is not zero.

We can add many more functions to our collection of continuous functions using the fact that a composition of continuous functions is a continuous function. More precisely: Let $g : J \to I$ and $f : I \to \mathbb{R}$ be continuous functions. Then the composition $f \circ g : J \to \mathbb{R}$ defined by $(f \circ g)(x) = f(g(x))$ is a continuous function. Here's why: If $x_n \to x_0$ then $g(x_n) \to g(x_0)$, because $g$ is continuous. And then $f(g(x_n)) \to f(g(x_0))$, because $f$ is continuous.

Example 3.5. We show that if $g$ is continuous, then $|g|$ is continuous. Let $y_0 \in \mathbb{R}$. If $y_n \to y_0$ then by the reverse triangle inequality,

$$\big|\, |y_n| - |y_0| \,\big| \le |y_n - y_0| \to 0,$$

and so $f(y) = |y|$ is continuous at $y_0$. Now if $g$ is continuous at $x_0$ and $g(x_0) = y_0$ then, being a composition of continuous functions, $f(g(x)) = |g(x)|$ is also continuous at $x_0$. That is, $|g|$ is continuous if $g$ is continuous. $\diamond$

The trigonometric function $\sin(x)$ is continuous on $\mathbb{R}$. We leave the verification of this claim for Exercise 3.8. Then, being a composition of continuous functions, $\cos(x) = \sin(\pi/2 - x)$ is continuous on $\mathbb{R}$. Then, being quotients of continuous functions, $\tan(x)$, $\csc(x)$, $\sec(x)$, and $\cot(x)$ are continuous wherever they are defined, i.e., wherever their denominators are not zero.

One can define the exponential function $f(x) = e^x$ for $x \in \mathbb{R}$ and then, after some justification, name $f^{-1}(x) = \ln(x)$ as its inverse (for $x > 0$). Alternatively, one can define the natural logarithmic function $f(x) = \ln(x)$ for $x > 0$ and then, after some justification, name $f^{-1}(x) = e^x$ as its inverse (for $x \in \mathbb{R}$). We shall say more about each of these approaches in Chaps. 6 and 10 respectively. Either way, $e^x$ is continuous on $(-\infty, +\infty)$, and $\ln(x)$ is continuous on $(0, +\infty)$. We assume that the reader is comfortable in accepting these two claims, even though we postpone their proper verification. Graphs of $e^x$ and $\ln(x)$ are shown in Fig. 3.4.

Example 3.6. $f(x) = \dfrac{\cos(x)}{x^2 + 1} + e^{\cos(x^3)} + x \ln(\sin(x) + 2)$ is continuous for $x \in \mathbb{R}$. $\diamond$

3.2 Bolzano’s Theorem

The following important result is named for the Bohemian mathematician Bernhard Bolzano (1781–1848). For its statement, we use the fact that two real numbers $A$ and $B$ have opposite signs if and only if $AB < 0$.

Theorem 3.7. (Bolzano's Theorem) Let $f$ be a continuous function on $[a, b]$ with $f(a) f(b) < 0$. Then there is at least one $c \in (a, b)$ for which $f(c) = 0$.

Fig. 3.4 The graphs of $y = e^x$ and its inverse, $y = \ln(x)$. Each is the graph of the other, reflected in the line $y = x$

the midpoint of $[a_0, b_0]$, and bisect $[a_0, b_0]$ into intervals $[a_0, c_0]$ and $[c_0, b_0]$. Now if $f(c_0) = 0$ then we are done; that is, $c = c_0$ (and we count ourselves very fortunate). Otherwise, since $f$ changes sign on $[a_0, b_0]$, it must change sign on either $[a_0, c_0]$ or on $[c_0, b_0]$ (or on both). Keep an interval on which $f$ changes sign, rename it $[a_1, b_1]$ and discard the other. Now we continue this process. That is, for $n = 1, 2, 3, \ldots$ do the following:

$(*)$ Let $c_n = \dfrac{a_n + b_n}{2}$.
If $f(c_n) = 0$ then we are done; that is, $c = c_n$.
If $f(a_n) f(c_n) < 0$ then set $a_{n+1} = a_n$ and $b_{n+1} = c_n$, and go back to $(*)$.
If $f(c_n) f(b_n) < 0$ then set $a_{n+1} = c_n$ and $b_{n+1} = b_n$, and go back to $(*)$.

Then $[a, b] \supseteq [a_1, b_1] \supseteq [a_2, b_2] \supseteq [a_3, b_3] \supseteq \cdots$ is a sequence of nested intervals with $b_n - a_n = \dfrac{b - a}{2^n} \to 0$. So by the Nested Interval Property of $\mathbb{R}$ (Theorem 1.41), there is a unique point $c$ belonging to each interval. Now $a_n \to c$ and $b_n \to c$ and $f$ is continuous, so we must therefore have $f(a_n) \to f(c)$ and $f(b_n) \to f(c)$. Now we observe that $f(a_n) f(b_n) < 0$ after each pass through the algorithm, and so we must have $f(c)^2 \le 0$ (by Lemma 1.27). This is only possible if $f(c) = 0$, as desired. $\square$

See Fig. 3.5 for an illustration of Bolzano's Theorem (Theorem 3.7). For its proof, we employed what is known as a bisection algorithm. At each step of such an algorithm an interval is bisected, then one of the halves is kept (and the other discarded), based on some particular criterion. We shall employ a bisection algorithm again in our proof of the Extreme Value Theorem (Theorem 3.23). Bisection algorithms are also used in a number of the exercises.

Example 3.8. Consider the equation $x \sin(x) = 1$. Set $f(x) = x \sin(x) - 1$, which is continuous on $\mathbb{R}$. Now $f(0) = -1 < 0$, $f(\pi/2) = \pi/2 - 1 > 0$, and $f(\pi) = -1 < 0$, so applying Bolzano's Theorem (Theorem 3.7) we see that the equation has (at least) one solution in $(0, \pi/2)$ and (at least) one solution in $(\pi/2, \pi)$. $\diamond$

Let $f$ be a function defined on $I$ and let $p \in I$. Then $f$ has a fixed point $p$ if $f(p) = p$. That is, the point $p$ is not changed by $f$; it is fixed. If $f$ has fixed point $p$, then the function $f(x) - x$ has a zero at $x = p$, and conversely. This simple observation can be very useful.
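The bisection scheme from the proof of Bolzano's Theorem can be sketched in a few lines of Python and tried on the equation of Example 3.8. This is only an illustrative sketch: the function name `bisect_zero`, the tolerance, and the iteration cap are assumptions made here, not part of the text, and floating-point arithmetic only approximates the real-number argument.

```python
import math

def bisect_zero(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a zero of f on [a, b] by repeated bisection.

    Assumes f is continuous on [a, b] with f(a) * f(b) < 0.
    (The name, tol and max_iter are illustrative choices.)
    """
    fa, fb = f(a), f(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2              # the midpoint c_n
        fc = f(c)
        if fc == 0 or (b - a) / 2 < tol:
            return c
        if fa * fc < 0:              # f changes sign on [a_n, c_n]
            b, fb = c, fc
        else:                        # f changes sign on [c_n, b_n]
            a, fa = c, fc
    return (a + b) / 2

def f(x):
    return x * math.sin(x) - 1

root_left = bisect_zero(f, 0.0, math.pi / 2)
root_right = bisect_zero(f, math.pi / 2, math.pi)
print(root_left, root_right)
```

Each pass halves the interval, so about 40 passes suffice to shrink an interval of length $\pi/2$ below the tolerance used here.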

Fig. 3.5 Bolzano's Theorem

Example 3.9. Let $f(x) = x^2 - x - 1/\sqrt{x}$. Then $p > 0$ is a zero of $f$ if and only if $p$ is a fixed point of $F(x) = x^2 - 1/\sqrt{x}$; $p > 0$ is a zero of $f$ if and only if $p$ is a fixed point of $H(x) = 1 + 1/x^{3/2}$; $p > 0$ is a zero of $f$ if and only if $p$ is a fixed point of $G(x) = \sqrt{x + 1/\sqrt{x}}$. $\diamond$

The following result shows that if the graph of a continuous function $f$ is entirely contained within the rectangle $[a, b] \times [a, b]$, then $f$ must have a fixed point in $[a, b]$. That is, the graph must intersect the line $y = x$ at least once. See Fig. 3.6.

Lemma 3.10. (Fixed Point Lemma) Let $f : [a, b] \to [a, b]$ be continuous. Then $f$ has at least one fixed point in $[a, b]$.

Proof. If $f(a) = a$ then $a$ is a fixed point, or if $f(b) = b$ then $b$ is a fixed point. So we may assume that $f(a) \neq a$ and $f(b) \neq b$. Let $g(x) = f(x) - x$. Then $g$ is continuous on $[a, b]$. Since $f$ takes its values in $[a, b]$, we have $g(a) = f(a) - a > 0$ and $g(b) = f(b) - b < 0$. That is, $g(a) g(b) < 0$. So we apply Bolzano's Theorem (Theorem 3.7) to $g$ to see that there is $p \in (a, b)$ for which $g(p) = 0$. That is, $g(p) = f(p) - p = 0$, or $f(p) = p$, as desired. $\square$
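The proof of the Fixed Point Lemma is constructive once it is combined with bisection: a fixed point of $f$ is a zero of $g(x) = f(x) - x$. The following Python sketch applies this to $\cos$ on $[0, 1]$ and to the function $H$ of Example 3.9 on $[1, 2]$. The helper name `fixed_point`, the choice of intervals, and the tolerance are assumptions made for illustration only.

```python
import math

def fixed_point(f, a, b, tol=1e-12):
    """Approximate p in [a, b] with f(p) = p by bisecting g(x) = f(x) - x.

    Assumes f is continuous and maps [a, b] into [a, b].
    (Hypothetical helper, not taken from the text.)
    """
    g = lambda x: f(x) - x
    if g(a) == 0:
        return a
    if g(b) == 0:
        return b
    # Now g(a) > 0 > g(b), as in the proof of the lemma.
    while b - a > tol:
        c = (a + b) / 2
        gc = g(c)
        if gc == 0:
            return c
        if gc > 0:
            a = c        # g still changes sign on [c, b]
        else:
            b = c        # g still changes sign on [a, c]
    return (a + b) / 2

# cos maps [0, 1] into [cos(1), 1], which lies inside [0, 1].
p = fixed_point(math.cos, 0.0, 1.0)

# Example 3.9: a zero of f is a fixed point of each of F, H and G.
f = lambda x: x**2 - x - 1 / math.sqrt(x)
F = lambda x: x**2 - 1 / math.sqrt(x)
H = lambda x: 1 + 1 / x**1.5
G = lambda x: math.sqrt(x + 1 / math.sqrt(x))

# H is decreasing with H(1) = 2 and H(2) > 1, so H maps [1, 2] into [1, 2].
z = fixed_point(H, 1.0, 2.0)
print(p, z, f(z), F(z) - z, G(z) - z)
```

The last three printed numbers should all be close to zero, which is the numerical counterpart of the equivalences stated in Example 3.9.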

Fig. 3.6 A continuous $f : [a, b] \to [a, b]$ has a fixed point: $f(x_0) = x_0$

Remark 3.11. The Fixed Point Lemma (Lemma 3.10) is a special case of Brouwer's Theorem, due to Dutch mathematician L.E.J. Brouwer (1881–1966), which holds in much more generality. Here is an amusing instance of the theorem being applied in two dimensions. A map of Wyoming, say, can be regarded as a function from Wyoming to a large piece of paper: Each actual point in Wyoming is mapped by the function to a dot on the paper which represents that point. Then placing the map flat and wholly within Wyoming (anywhere on the ground, say) can be regarded as a mapping from Wyoming to a subset of Wyoming. Brouwer's Theorem says that there must be a dot on the map which sits exactly over the actual point in Wyoming which the dot represents. (See also Exercise 3.20, and the reader might consult [3, 4, 9] for other amusing examples.) $\circ$

3.3 The Universal Chord Theorem

A function $f$ defined on $I$ has a horizontal chord if $f(a) = f(b)$ for some $a, b \in I$ with $a < b$. The length of this horizontal chord is then $b - a$. A continuous function need not, of course, have any horizontal chords ($f(x) = x$, for example). The result below shows however, that if a continuous function happens to have a horizontal chord of length $b - a$, then it must also have a horizontal chord of length $(b - a)/2$. See Fig. 3.7. We state and prove this result on $[0, 1]$ instead of $[a, b]$, only for the sake of simplicity; the $[a, b]$ case is left for Exercise 3.25. (See also Exercise 3.26.)

Lemma 3.12. (Half-Chord Lemma) Let $f$ be continuous on $[0, 1]$, with $f(0) = f(1)$. Then there is $c \in [0, 1/2]$ such that $f(c + 1/2) = f(c)$.

Proof. Define the function $g$ on $[0, 1/2]$ via $g(x) = f(x + 1/2) - f(x)$. Then $g$ is continuous on $[0, 1/2]$. We want to show that $g$ has a zero in $[0, 1/2]$. If $g$ does not have a zero in $[0, 1/2]$ then, by Bolzano's Theorem (Theorem 3.7), $g$ is either always positive or always negative on $[0, 1/2]$. Say it's positive; if it's negative we would consider $-g$. Then $f(x) < f(x + 1/2)$ on $[0, 1/2]$. Setting $x = 0$ and then $x = 1/2$, we get

$$f(0) < f(1/2) < f(1/2 + 1/2) = f(1).$$

But $f(0) = f(1)$ and so we have a contradiction. Therefore $g$ indeed has a zero $c$ in $[0, 1/2]$. Then $g(c) = 0$ yields $f(c + 1/2) = f(c)$, as desired. $\square$

Remark 3.13. Think of a loop of wire shaped like a circle, heated in any manner whatsoever. Then since temperature along the wire is a continuous function, and since the wire forms a circle (i.e., $f(0) = f(1)$), the Half-Chord Lemma (Lemma 3.12) implies that at any given moment, there is a pair of opposite points on the wire having the same temperature. Notice that the wire doesn't really need to be a circular loop; we only need to have a well defined notion of opposite. For example, at any given moment in time there is a pair of antipodal points on the earth's equator having the same temperature, and another pair of antipodal points having the same wind speed (and another pair having the same atmospheric pressure, etc.). $\circ$
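A half-chord can also be located numerically. Since $f(0) = f(1)$, the function $g(x) = f(x + 1/2) - f(x)$ satisfies $g(1/2) = -g(0)$, so bisection applies to $g$ on $[0, 1/2]$. The sketch below is a hedged illustration: the helper name `half_chord` and the sample "temperature" profile are assumptions chosen here, not taken from the text.

```python
import math

def half_chord(f, tol=1e-12):
    """Approximate c in [0, 1/2] with f(c + 1/2) = f(c).

    Assumes f is continuous on [0, 1] with f(0) = f(1).
    (Hypothetical helper for illustration.)
    """
    g = lambda x: f(x + 0.5) - f(x)
    a, b = 0.0, 0.5
    ga = g(a)
    if ga == 0:
        return a
    # g(1/2) = f(1) - f(1/2) = -g(0), so g changes sign on [0, 1/2].
    while b - a > tol:
        c = (a + b) / 2
        gc = g(c)
        if gc == 0:
            return c
        if ga * gc < 0:
            b = c                # sign change on [a, c]
        else:
            a, ga = c, gc        # sign change on [c, b]
    return (a + b) / 2

# A sample profile around a loop of wire, parametrized by [0, 1].
temperature = lambda x: math.sin(2 * math.pi * x) + math.cos(2 * math.pi * x)

c = half_chord(temperature)
print(c, temperature(c), temperature(c + 0.5))
```

For this particular profile the opposite points found are at $c = 3/8$ and $c + 1/2 = 7/8$, where both sine and cosine terms cancel.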

Fig. 3.7 The Half-Chord Lemma (Lemma 3.12) on $[a, b]$: The continuous function $f$ has a horizontal chord of length $b - a$ so it must also have a horizontal chord of length $(b - a)/2$

Applying the Half-Chord Lemma (Lemma 3.12), or more precisely Exercise 3.25, over and over again, we can see that if a continuous function has a horizontal chord of length $L$, then it must also have horizontal chords of lengths $L/2^k$, for each $k \in \mathbb{N}$. But even more is true, as follows. Again, we take $[a, b] = [0, 1]$ for the sake of simplicity; the general $[a, b]$ case is left for Exercise 3.27. (See also Exercise 3.28.)

Theorem 3.14. (Universal Chord Theorem) Let $f$ be continuous on $[0, 1]$ with $f(0) = f(1)$, and let $k$ be any positive integer. Then there is $c \in [0, 1 - 1/k]$ such that $f(c + 1/k) = f(c)$. That is, $f$ has a horizontal chord of length $1/k$.

Proof. The case $k = 1$ is just the hypothesis $f(0) = f(1)$ of the theorem, so we let $k \ge 2$. Consider the function $g(x) = f(x + 1/k) - f(x)$ on $[0, 1 - 1/k]$. Then $g$ is continuous on $[0, 1 - 1/k]$. We want to show that $g$ has a zero in $[0, 1 - 1/k]$. If not, then by

But $f(0) = f(1)$ and so we have a contradiction. Therefore $g$ indeed has a zero $c$ in $[0, 1 - 1/k]$. Then $g(c) = 0$ yields $f(c + 1/k) = f(c)$, as desired. $\square$

Remark 3.15. Think again of a circular wire of length $L$, heated in any manner whatsoever. The Universal Chord Theorem (Theorem 3.14) implies that for each $m \in \mathbb{N}$, there is a pair of points on the wire of distance $L/m$ from each other (measured along the wire) which have the same temperature. $\circ$

We leave it for Exercise 3.29 to show that a continuous function $f$ on $[0, 1]$ with $f(0) = f(1)$ need not have a horizontal chord of length $t$, if $t \in (1/2, 1)$. But more interesting is the fact that $f$ need not have a horizontal chord of length $1/m$, if $m \neq 1, 2, 3, \ldots$. In this sense the Universal Chord Theorem (Theorem 3.14) is as good as it can be. This is demonstrated by the function

$$f(x) = x \sin^2(m\pi) - \sin^2(m\pi x).$$

Here, $f$ is continuous on $[0, 1]$, with $f(0) = f(1) = 0$, yet one can check that

$$f\big(x + \tfrac{1}{m}\big) - f(x) = \tfrac{1}{m} \sin^2(m\pi) = 0$$

only if $m$ is an integer. This is Exercise 3.30.
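The identity behind this example is easy to probe numerically. The Python sketch below evaluates $f(x + 1/m) - f(x)$ at a few points for a non-integer $m$ and for an integer $m$. The helper name `levy` and the sample values $m = 1.5$ and $m = 3$ are assumptions chosen for illustration; the computation is a sanity check, not a proof.

```python
import math

def levy(m):
    """Return f(x) = x sin^2(m pi) - sin^2(m pi x) for the given m."""
    s = math.sin(m * math.pi) ** 2
    return lambda x: x * s - math.sin(m * math.pi * x) ** 2

def chord_gaps(m, points):
    """Values of f(x + 1/m) - f(x) at the given points x in [0, 1 - 1/m]."""
    f = levy(m)
    return [f(x + 1 / m) - f(x) for x in points]

# Non-integer m: the gap is the nonzero constant (1/m) sin^2(m pi),
# so there is no horizontal chord of length 1/m.
gaps_non_integer = chord_gaps(1.5, [0.0, 0.1, 0.2, 0.3])

# Integer m: the gap vanishes (up to rounding) at every point.
gaps_integer = chord_gaps(3, [0.0, 0.2, 0.4, 0.6])

print(gaps_non_integer)
print(gaps_integer)
```

With $m = 1.5$ the constant is $(2/3)\sin^2(3\pi/2) = 2/3$, which is what the first list should show in every entry.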

Remark 3.16. This section owes much to the excellent book [3]. From there: "Even though the Universal Chord Theorem was discovered by A.M. Ampère in 1806, it is commonly attributed to P. Lévy, who rediscovered it in 1934, but also showed that it is optimal." Lévy showed that it is optimal by using precisely the $f(x)$ from the above paragraph. $\circ$

3.4 The Intermediate Value Theorem

The Intermediate Value Theorem is arguably the most important theorem about continuous functions. It amounts to improving Bolzano's Theorem (Theorem 3.7) to allow $f(a)$ and $f(b)$ to be any two values, not just one negative and one positive. It is equivalent to Bolzano's Theorem but it gets used more often in this latter form. The theorem says that a continuous function on a closed interval $[a, b]$ attains every value between $f(a)$ and $f(b)$. See Fig. 3.8. This property is called the Intermediate Value Property on $[a, b]$.

Fig. 3.8 The Intermediate y

let $y_0$ be any number between $f(a)$ and $f(b)$. Then there is at least one $c \in [a, b]$ for which $f(c) = y_0$.

Proof. If $y_0 = f(a)$ or $y_0 = f(b)$ then we take $c = a$ or $c = b$ and we are done. So let $y_0$ be strictly between $f(a)$ and $f(b)$ and consider the function $g(x) = f(x) - y_0$. Then $g$ is continuous on $[a, b]$. And since $y_0$ is strictly between $f(a)$ and $f(b)$ we have $g(a) g(b) = [f(a) - y_0][f(b) - y_0] < 0$. Applying Bolzano's Theorem (Theorem 3.7) to $g$ we see that there is $c \in (a, b)$ for which $g(c) = 0$. That is, $f(c) = y_0$ as desired. $\square$

Remark 3.18. The reader should agree, perhaps after making a sketch or two, that a function may have the Intermediate Value Property on some particular interval, yet not be continuous on that interval. Nevertheless, it might seem that the Intermediate Value Property should characterize continuous functions in the following way: If $f$ satisfies the Intermediate Value Property on every subinterval of $[a, b]$, then $f$ should be continuous on $[a, b]$. But this is not the case either, as the following function demonstrates. Let $\alpha \in \mathbb{R}$ and define

$$f(x) = \begin{cases} \sin\big(\tfrac{1}{x}\big) & \text{if } x \neq 0 \\ \alpha & \text{if } x = 0 . \end{cases}$$

This function attains every value between $-1$ and $+1$ (infinitely many times) on any interval which contains $x = 0$, yet it is not continuous on any such interval, no matter what value is chosen for $\alpha$. We leave the verification of this claim for Exercise 3.34. The graph of $y = f(x)$ is shown (very roughly) in Fig. 3.9. $\circ$

The following is a useful consequence of the Intermediate Value Theorem (Theorem 3.17). Among other things, it is the basis for some important results that we shall meet in Chap. 9.

Fig. 3.9 The graph of $f(x) = \sin\big(\tfrac{1}{x}\big)$. This function has the Intermediate Value Property on any interval which contains $x = 0$, yet it is not continuous on any such interval
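The claim that $\sin(1/x)$ attains every value in $[-1, 1]$ arbitrarily close to $x = 0$ can be made concrete: if $\sin(\theta) = y_0$ then $x = 1/(\theta + 2\pi k)$ satisfies $\sin(1/x) = y_0$ for every positive integer $k$, and $x \to 0$ as $k \to \infty$. The Python sketch below lists a few such points inside $(0, \varepsilon)$. The function name and the sample values $y_0 = 0.3$, $\varepsilon = 10^{-3}$ are illustrative assumptions.

```python
import math

def points_with_value(y0, eps, count=3):
    """Return `count` points x in (0, eps) with sin(1/x) = y0 (up to rounding).

    Assumes -1 <= y0 <= 1 and eps > 0. (Hypothetical helper.)
    """
    theta = math.asin(y0)                      # sin(theta) = y0, |theta| <= pi/2
    # Choose k0 so that theta + 2*pi*k0 > 1/eps, which forces x < eps.
    k0 = math.ceil((1 / eps + math.pi / 2) / (2 * math.pi)) + 1
    return [1 / (theta + 2 * math.pi * (k0 + i)) for i in range(count)]

xs = points_with_value(0.3, 1e-3)
for x in xs:
    print(x, math.sin(1 / x))
```

Shrinking `eps` only pushes `k0` higher, so the same value is attained in every interval $(0, \varepsilon)$, which is why no choice of $\alpha$ makes the function continuous at $0$.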

For $n$ large enough we shall have $[a_n, b_n] \subseteq J$, with $u \notin [a_n, b_n]$, and therefore $f(u) > f(x)$ for all $x \in [a_n, b_n]$. But by the choice of $[a_n, b_n]$ at each stage of the algorithm, there is $t \in [a_n, b_n]$ such that

3.1. Let $f$ be a function defined on $I$.
(a) Suppose that for every $\varepsilon > 0$ there is $\delta > 0$ such that $|f(x) - f(x_0)| < \varepsilon$ whenever $x \in I$ and $|x - x_0| < \delta$. Show that $f$ is continuous at $x_0$.
(b) Suppose that $f$ is continuous at $x_0$. Show that for any $\varepsilon > 0$ there is $\delta > 0$ such that $|f(x) - f(x_0)| < \varepsilon$ whenever $x \in I$ and $|x - x_0| < \delta$.

3.2. Let $\alpha, \beta \in \mathbb{R}$. Use Lemma 1.24 to prove that if $f$ and $g$ are continuous on $I$, then $\alpha f + \beta g$ is continuous on $I$. (In particular, $f \pm g$ is continuous on $I$.)

3.3. Let $f$ and $g$ be continuous functions on $I$. Show that the functions $(f \wedge g)(x) = \min\{f(x), g(x)\}$ and $(f \vee g)(x) = \max\{f(x), g(x)\}$

jf .x/ g.x/j/ and .f _ g/.x/ D 12 .f .x/ C g.x/ C jf .x/ g.x/j/:3.4. Use Lemma 1.25 to prove that if f and g are continuous on I , then the productf g is continuous on I:3.5. (a) Prove directly from the definition that if f is continuous, then so is f 2 : Hint: Look at Example 1.23. (b) Write f g D 14 .f C g/2 .f g/2 to prove that if f and g are continuous, then f g is continuous. Hint: Look at Example 1.28.3.6. Use Lemma 1.26 to prove that if f and g are continuous at on I and g.x/ ¤ 0for x 2 I; then the quotient f =g is continuous on I . p3.7. Let f be continuous on I , with f 0: Show that f is continuous on I .3.8. In this exercise we show that sin.x/ is continuous at every x0 2 R.(a) Show (a picture will be helpful) that

$$\sin(A \pm B) = \sin(A)\cos(B) \pm \cos(A)\sin(B)$$

to show that $\sin(x) - \sin(x_0) = 2 \cos\big(\tfrac{x + x_0}{2}\big) \sin\big(\tfrac{x - x_0}{2}\big)$.

(d) Use (c) to show that $\sin(x)$ is continuous at every $x_0 \in \mathbb{R}$.

3.9. (a) Prove that the function f .x/ D x 2 x ex has a positive root.(b) Prove that the equation cos.x/ D x has a solution in Œ0; =2: Make a sketch.3.10. Show that the equation x 4 x 2 2 D x has one negative solution and onepositive solution. Draw a picture.3.11. Prove that the equation ex D x 4 has three solutions. Make a sketch.68 3 Continuous Functions

then x0 is a solution to 2x 3 C 4x 2 2x 5 D 0: 1 x23.16. (a) Show that f .x/ D has a fixed point in Œ0; 1. 1 C x2(b) Sketch the graphs y D f .x/ and y D x on Œ0; 1: 1(c) Show that g.x/ D has a fixed point in Œ0; 1. What is this fixed point? 1Cx(d) Sketch the graphs y D g.x/ and y D x on Œ0; 1: 1 C 2 cos.x/3.17. (a) Show that f .x/ D has a fixed point in Œ0; =2: .cos.x/ C 2/2(b) Can you show that the fixed point is in fact in Œ1=4; 1=2?3.18. Let f W Œ0; 1 ! Œ0; 1 be continuous.(a) Show that there is a 2 Œ0; 1 such that f .a/ D p a2 :(b) Show that there is b 2 Œ0; 1 such that f .b/ D b:(c) Show that there is c 2 Œ0; 1 such that f .c/ D sin. c=2/:3.19. [7] Let f be continuous on Œ1; 1; with f .1/ 1 and f .1/ 1. Showthat f has a fixed point.3.20. We saw in Remark 3.11 that the Fixed Point Lemma (Lemma 3.10) is aspecial case of Brouwer’s Theorem, which holds in two dimensions. Use the factthat Brouwer’s Theorem also holds in three dimensions to argue the following: Afterstirring a cup of coffee in any manner whatsoever and then letting it settle, there isa point in the coffee which ends up exactly where it began.3.21. Suppose that one ride on a particular roller coaster lasts exactly 3 min. To keeppeople moving along, the amusement park staff runs a set of cars exactly 1:5 minafter the previous set has left. Suppose that Hannah rides at the front of a set andSarah rides at the front of the next set. Show that during Hannah’s ride there is aninstant at which she is at precisely the same elevation as Sarah.

3.22. [5] A snail begins to crawl up a stick at 6 am and reaches the top of the stick at noon. It spends the rest of the day and that night at the top. The next morning it leaves the top at 6 am and descends by the same route it used the day before, reaching the bottom at noon. Prove that there is a time between 6 am and noon at which the snail was at exactly the same spot on the stick on both days. Note: The snail may crawl at different speeds, rest, or even go backwards. Snails do that. (This problem has appeared in many places in many forms. It was originally posed by the American mathematician and science writer Martin Gardner (1914–2010).)

3.23. Suppose that one ride on a particular roller coaster lasts exactly 2 min and you take a ride. Show that there is a time interval 15 s in length after which your net change in elevation is zero.

3.24. Suppose that a roller coaster ride is 1.2 miles long and you take a ride. Show that there is a stretch of 0.24 miles after which your net change in elevation is zero.

3.25. Modify the proof of the Half-Chord Lemma (Lemma 3.12) to obtain a version for $[a, b]$, as follows. Let $f$ be a continuous function on $[a, b]$, with $f(a) = f(b)$. Show that there is at least one $c \in \big[a, \tfrac{a+b}{2}\big]$ such that $f\big(c + \tfrac{b-a}{2}\big) = f(c)$. That is, $f$ has a horizontal chord of length $(b - a)/2$.
Hint: Consider $g(x) = f(x + (b - a)/2) - f(x)$ on $[a, (a + b)/2]$.

3.26. (a) Prove the Half-Chord Lemma (Lemma 3.12) another way: Consider the function $g(x) = f(x + 1/2) - f(x)$ on $[0, 1/2]$ and observe that $g(0) g(1/2) < 0$. Now apply Bolzano's Theorem (Theorem 3.7).
(b) Is this proof preferable to the one in the text? Why or why not?
(c) Modify the argument in (a) to prove the Half-Chord Lemma on $[a, b]$.

3.27. Modify the proof of the Universal Chord Theorem (Theorem 3.14) to obtain a version for $[a, b]$, as follows. Let $f$ be a continuous function on $[a, b]$ with $f(a) = f(b)$, and let $k$ be any positive integer. Show that there exists $c$ in $[a, b - (b - a)/k]$ such that

(b) Find an example which shows that even if $f$ is continuous on $[0, 1]$ with $f(0) = f(1)$, $f$ need not have a horizontal chord of length $t$, for $t \in (1/3, 1/2)$. A picture will suffice.

3.30. Verify the claim made prior to Remark 3.16, that P. Lévy's example

$$f(x) = x \sin^2(m\pi) - \sin^2(m\pi x)$$

has f .0/ D f .1/; yet has no horizontal chord of length 1=m unless m is a naturalnumber. This shows that the Universal Chord Theorem (Theorem 3.14) is as goodas it can be.3.31. A snail begins to crawl up a stick at 6 am and reaches the top of the stickat noon. It spends the rest of the day and that night at the top. The next morningit leaves the top at 6 am and descends by the same route it used the day before,reaching the bottom at noon. Prove that there are two times, 21 h apart, at which thesnail was at exactly the same spot on the stick. (The snail may crawl at differentspeeds, rest, or even go backwards. Snails do that.)3.32. (a) Show that a 12-h clock that is stopped is correct twice a day.(b) Show that the conclusion of the Intermediate Value Theorem (Theorem 3.17) no longer holds if f is not continuous on Œa; b.3.33. Let a > 0: Show that the equation x 4 x 2 x D a has one negative solutionand one positive solution.3.34. Let y0 2 R and consider the function

3.37. Let $f$ be continuous on $[a, b]$ and let $x_1, x_2, \ldots, x_n \in [a, b]$. Prove that there is a number $c \in [a, b]$ at which the Geometric Mean of $f$ evaluated at these points is attained. That is, prove that there is a number $c \in [a, b]$ such that

$$f(c) = \Big( \prod_{j=1}^{n} f(x_j) \Big)^{1/n} = \big( f(x_1) f(x_2) \cdots f(x_n) \big)^{1/n} .$$

3.38. Prove the part of the Extreme Value Theorem (Theorem 3.23) which pertains to $x_m$.

3.39. (a) Show that the conclusion of the Extreme Value Theorem (Theorem 3.23) is no longer true if $f$ is not continuous on $[a, b]$.
(b) What happens if $f$ is continuous, but on an interval that is not closed?
(c) Is the Extreme Value Theorem (Theorem 3.23) true if we assume only that $f$ has the Intermediate Value Property on $[a, b]$?

Where the telescope ends the microscope begins, and who can say which has the wider vision? —Les Misérables, by Victor Hugo

Again we denote by $I \subseteq \mathbb{R}$ a generic interval, and we mainly consider functions $f : I \to \mathbb{R}$. Roughly speaking, if a function has a derivative at $x \in I$ then it has a well-defined tangent line at $(x, f(x))$. Asking for a function to have a derivative is more than asking for it to be continuous. Still, many functions which arise naturally in applications do have a derivative for all $x$ in some interval $I$. After defining the derivative, we remind the reader how to find derivatives of many different kinds of functions. And we shall see, for the sake of applications, that horizontal tangent lines are particularly desirable.

4.1 Basic Properties

Let $I$ be an interval (open, closed, or otherwise) and let $f : I \to \mathbb{R}$. We say that $f$ is differentiable on $I$ if $f$ is differentiable at every $x_0 \in I$. That is, for every $x_0 \in I$ and for any sequence $\{x_n\}$ in $I$ for which $x_n \to x_0$ and $x_n \neq x_0$, it happens that

$$\lim_{n \to \infty} \frac{f(x_n) - f(x_0)}{x_n - x_0} \quad \text{exists},$$

and depends only on $x_0$ (that is, not on $\{x_n\}$). When this is the case, we call the limit the derivative of $f$ at $x_0$ and we denote it by $f'(x_0)$.

To be more precise, if $f'(x_0)$ exists then for any sequence $\{x_n\} \subseteq I$ for which $x_n \to x_0$ and $x_n \neq x_0$, and for any $\varepsilon > 0$, there is a number $N$ such that

$$\left| \frac{f(x_n) - f(x_0)}{x_n - x_0} - f'(x_0) \right| < \varepsilon \quad \text{for } n > N.$$

$y - 9 = 6(x - 3)$; that is, $y = 6x - 9$. $\diamond$

$$\frac{f(x_n) - f(x_0)}{x_n - x_0} = \frac{|x_n| - |0|}{x_n - x_0} = \frac{1/n - 0}{(-1)^n/n - 0} = (-1)^n .$$

Now since the sequence $\{(-1)^n\}$ diverges, $f'(0)$ does not exist. Therefore $f$ is not differentiable at $x = 0$. The problem here is that the graph of $f(x) = |x|$ has a cusp/corner at $(0, 0)$ and so there is no well-defined tangent line at $(0, 0)$. (However, $f$ is differentiable on $(-\infty, 0) \cup (0, +\infty)$.) See Fig. 4.2. $\diamond$
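The oscillation of the difference quotients is easy to see numerically. The Python sketch below evaluates the quotient of $f(x) = |x|$ at $x_0 = 0$ along the sequence $x_n = (-1)^n/n$ used above. The helper name `difference_quotient` is an assumption introduced for illustration.

```python
def difference_quotient(f, x0, x):
    """Slope of the chord of f between x0 and x (requires x != x0)."""
    return (f(x) - f(x0)) / (x - x0)

# x_n = (-1)^n / n tends to 0 while alternating in sign.
quotients = [difference_quotient(abs, 0.0, (-1) ** n / n) for n in range(1, 9)]
print(quotients)
```

The printed values alternate between $-1$ and $1$, so they have no limit, in agreement with the conclusion that $f'(0)$ does not exist.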

Fig. 4.2 For Examples 4.5

Observe that if $f$ is differentiable at $x$, then for any sequence $\{x_n\}$ with $x_n \to x$ (and $x_n \neq x$),

$$\lim_{n \to \infty} \frac{f(x_n) - f(x)}{x_n - x} \quad \text{exists}.$$

Clearly the denominator here $x_n - x \to 0$, so for the limit to exist it must be the case that the numerator $f(x_n) - f(x) \to 0$ also. That is, $f(x_n) \to f(x)$. This simple observation yields a connection between the differentiable functions and the continuous functions, as follows.

Lemma 4.6. If $f$ is differentiable at $x \in I$, then $f$ is continuous at $x$.

Proof. This is Exercise 4.1. $\square$

Lemma 4.6 shows that the differentiable functions form a subset of the continuous functions. And it is a proper subset because $f(x) = |x|$ is continuous at $x = 0$ (Example 3.5) but as we saw in Example 4.5, it is not differentiable at $x = 0$. Still, many functions which arise naturally in applications are differentiable on some interval $I$.

Remark 4.7. It was long believed by mathematicians that a continuous function must be differentiable, except perhaps at some isolated points, just as $f(x) = |x|$ is differentiable everywhere except at $x = 0$. But around 1872, the German mathematician Karl Weierstrass (1815–1897) constructed a function which is continuous at each point of $\mathbb{R}$, yet is differentiable at no point of $\mathbb{R}$. Weierstrass's example has an important place in the history of mathematics. $\circ$

Example 4.8. Suppose that $h$ is differentiable at $x_0 \in \mathbb{R}$ and let $H(x) = h(x)^2$. Then for $x_n \to x_0$ with $x_n \neq x_0$,

Now since $h$ is differentiable at $x_0$, it is continuous at $x_0$, by Lemma 4.6. Therefore $h(x_n) \to h(x_0)$. Finally then, $h^2$ is differentiable on $\mathbb{R}$ and

$$(h^2)'(x_0) = 2 h(x_0) h'(x_0) \quad \text{for } x_0 \in \mathbb{R}. \qquad \diamond$$
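As a numerical illustration of Example 4.8, the sketch below takes $h = \sin$ and compares difference quotients of $h^2$ along $x_n = x_0 + 1/n$ with $2h(x_0)h'(x_0)$. It assumes the familiar fact $(\sin x)' = \cos x$, which the text establishes separately, and the choice $x_0 = 0.7$ is arbitrary.

```python
import math

h = math.sin
x0 = 0.7

def quotient(n):
    """Difference quotient of h^2 at x0 along x_n = x0 + 1/n."""
    xn = x0 + 1 / n
    return (h(xn) ** 2 - h(x0) ** 2) / (xn - x0)

# Assumes h'(x) = cos(x) for h = sin.
target = 2 * h(x0) * math.cos(x0)

values = [quotient(n) for n in (10, 100, 1000, 10 ** 6)]
errors = [abs(v - target) for v in values]
print(target, values)
```

The errors shrink roughly in proportion to $1/n$, as one would expect from a one-sided difference quotient.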

Remark 4.9. In Example 4.8, Lemma 4.6 is essential. Example 4.8 is a special case of the Chain Rule, which we meet in Sect. 4.2. $\circ$

Suppose for the moment that $I = [a, b]$ is a closed interval and that $f$ is differentiable on $I$. Then in particular, $f$ is differentiable at $x_0 = a$. That is, the derivative from the right exists at $x_0 = a$. We denote this by $f_R'(a)$. Likewise, $f$ is differentiable at $x_0 = b$. That is, the derivative from the left exists at $x_0 = b$. This is denoted by $f_L'(b)$.

Example 4.10. For $f(x) = |x|$, we have $f_R'(0) = 1$ and $f_L'(0) = -1$. Indeed, $f'(0)$ does not exist because these two limits are not equal. Again, see Fig. 4.2. $\diamond$

If $x_0$ is not an endpoint of $I$, then for $|h|$ small enough, $x_0 + h \in I$. Therefore we may write

$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h} .$$

When the derivative is to be thought of as a function, which is typically the case, the $x_0$ in $f'(x_0)$ is usually replaced simply by $x$ (or by $t$, or by $s$, etc.).

It is customary to denote a small increment in $x$ by $\Delta x$. Then for $y = f(x)$, the resulting increment in $f(x)$ is denoted by $\Delta y = f(x + \Delta x) - f(x)$. As such, we write

$$f'(x) = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = \frac{dy}{dx} .$$

Since $\frac{\Delta y}{\Delta x}$ is the average rate of change of $f$ between $x$ and $x + \Delta x$, this notation emphasizes the important fact that $f'(x) = \frac{dy}{dx}$ is the instantaneous rate of change of $f$ with respect to $x$. Other notations are, depending on the context,

$$y' = f'(x) = \frac{dy}{dx} = \frac{d}{dx}\, y = \frac{d}{dx} f(x) = \frac{df}{dx} .$$

4.2 Differentiation Rules

In practice, appealing to the definition of the derivative is often unnecessary. Instead

Proof. This is Exercise 4.2. $\square$

For the case of a product of functions, the differentiation rule is not quite so straightforward.

Product Rule: If $f$ and $g$ are each differentiable for $x \in I$ then the product $fg$ is differentiable for $x \in I$, with

$$\big( f(x) g(x) \big)' = f(x) g'(x) + g(x) f'(x) .$$

For $x^4$ we do the same sort of thing:

as desired. $\square$

We have seen that the function $f(x) \equiv 1$ is differentiable on $\mathbb{R}$ (with $f'(x) \equiv 0$). Therefore, by the Power Rule for Positive Integer Powers, and the Linear Combination Rule, any polynomial is differentiable on $\mathbb{R}$.

Assume that $f$ and $g$ are differentiable, and consider their quotient $h = f/g$ (wherever $g \neq 0$). As in [28], we write $hg = f$ and use the Product Rule to obtain

as desired. $\square$

Thus far, we have seen how to obtain derivatives of linear combinations (including sums and differences), products, and quotients of functions. For compositions of functions, we use the Chain Rule below. Proofs of the Chain Rule are somewhat tricky so we supply one, which is motivated by [26]. (See also [5] and [23].) Another proof is outlined in Exercise 4.13.

Chain Rule: Let $g : J \to I$ and $f : I \to \mathbb{R}$. If $g$ is differentiable on $J$, and $f$ is differentiable on $I$, then their composition $f \circ g$ is differentiable on $J$, and

and the theorem is proved. However, there are functions $g$ for which there exists no such $N$ as above. (See Exercise 4.11.) So if there is no such $N$, let $a_1$ be the $x_j$ in $\{x_1, x_2, x_3, x_4, \ldots\}$ which has the smallest subscript, and for which $g(x_j) = g(x_0)$. Let $a_2$ be the $x_j$ in $\{x_2, x_3, x_4, \ldots\}$ which has the smallest subscript, and for which $g(x_j) = g(x_0)$. Let $a_3$ be the $x_j$ in $\{x_3, x_4, \ldots\}$ which has the smallest subscript, and for which $g(x_j) = g(x_0)$, etc. Then $\{a_n\}$ is a sequence in $J$, with $a_n \to x_0$ and $a_n \neq x_0$. Here we have

$$\frac{g(a_n) - g(x_0)}{a_n - x_0} = 0 \quad \text{for each } n.$$

Therefore $g'(x_0) = 0$. But we also have

$$\frac{f(g(a_n)) - f(g(x_0))}{a_n - x_0} = 0 \quad \text{for each } n,$$

and therefore

$$\frac{f(g(a_n)) - f(g(x_0))}{a_n - x_0} \to f'(g(x_0))\, g'(x_0),$$

as desired. $\square$

In terms of instantaneous rates of change, the Chain Rule can be stated as follows. If $f$ is a function of $g$, and $g$ is a function of $x$, then ultimately $f$ is a function of $x$. And if $f$ and $g$ are also differentiable, then

$$\frac{df}{dx} = \frac{df}{dg}\, \frac{dg}{dx} .$$

Indeed, if Fergus runs three times faster than Giuseppina and Giuseppina runs two times faster than Xavier, then Fergus runs six times faster than Xavier.

For $f(x) = x^{p/q}$ we write $f(x)^q = x^p$; then the Chain Rule and the Power Rule for Integer Powers can be used to obtain a Power Rule for Rational Powers, as stated below.

Power Rule for Rational Powers: Let $p/q \in \mathbb{Q}$ be a rational number and for $x > 0$, let $f(x) = x^{p/q}$. Then $f$ is differentiable for $x > 0$, and $f'(x) = \dfrac{p}{q}\, x^{\frac{p}{q} - 1}$.

Proof. This is Exercise 4.14. (See also Exercise 4.12.) $\square$
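The route to the Power Rule for Rational Powers described above (differentiate $f(x)^q = x^p$ with the Chain Rule, then solve for $f'$) can be checked numerically. In the Python sketch below, the sample values $p = 2$, $q = 3$, $x = 1.7$ and the central-difference helper are assumptions chosen for illustration; the comparison is a sanity check rather than a proof.

```python
def central_difference(f, x, step=1e-6):
    """Symmetric difference quotient approximating f'(x)."""
    return (f(x + step) - f(x - step)) / (2 * step)

p, q = 2, 3
x = 1.7
f = lambda t: t ** (p / q)

# Direct numerical estimate of f'(x).
numeric = central_difference(f, x)

# The Power Rule for Rational Powers: f'(x) = (p/q) x^(p/q - 1).
power_rule = (p / q) * x ** (p / q - 1)

# Differentiating f(x)^q = x^p with the Chain Rule gives
# q f(x)^(q-1) f'(x) = p x^(p-1); solve this for f'(x).
implicit = p * x ** (p - 1) / (q * f(x) ** (q - 1))

print(numeric, power_rule, implicit)
```

All three numbers should agree to many decimal places; the implicit form and the closed form are algebraically identical, while the numerical estimate differs only by discretization and rounding error.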

4.3 Derivatives of Transcendental Functions

A transcendental function is a function that cannot be expressed as a finite combination of the operations of addition, subtraction, multiplication, division, raising to powers, and taking roots. That is, it transcends the basic algebraic operations.

The simplest examples of transcendental functions are the trigonometric functions and their inverses, and the exponential function $e^x$ and its inverse, the natural logarithmic function $\ln(x)$. It is beyond the scope of this book to show that these functions are indeed transcendental, so we shall have to be content with transcendental being simply a name.

We leave it for Exercise 4.15 to show that $(\sin(x))' = \cos(x)$.

$\arctan(\tan(\theta)) = \theta$ for $\theta \in (-\pi/2, \pi/2)$, and

Differentiating the latter expression, the Chain Rule gives:

However, these manipulations only suggest the answer because they assume that the derivative of $\arctan(x)$ exists. One can show that it indeed exists using Exercise 4.16. But here, following [13], we do so more directly. For $0 \le x < y$, set

Applying the trigonometric identities

$\sin(A - B) = \sin(A)\cos(B) - \cos(A)\sin(B)$ and

with $y = \tan(A)$ and $x = \tan(B)$ on the left-hand and right-hand sides respectively, we get

$$\frac{1}{\sqrt{1 + y^2}\,\sqrt{1 + x^2}} < \frac{\arctan(y) - \arctan(x)}{y - x} < \frac{1}{1 + xy}.$$

Therefore, letting $y \to x$ (or $x \to y$), we get

$$\big(\arctan(x)\big)' = \big(\tan^{-1}(x)\big)' = \frac{1}{1 + x^2}, \quad \text{for } x \ge 0.$$
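The squeeze behind this limit can be watched numerically. The sketch below is our own illustration, not part of the argument from [13]; it assumes the two-sided estimate reads $1/\big(\sqrt{1+y^2}\sqrt{1+x^2}\big) < (\arctan(y) - \arctan(x))/(y - x) < 1/(1+xy)$ for $0 \le x < y$, and the function name `arctan_bounds` and the sample point are hypothetical.

```python
import math


def arctan_bounds(x, y):
    """Return (lower, quotient, upper) for 0 <= x < y.

    lower    = 1 / (sqrt(1 + y^2) * sqrt(1 + x^2))
    quotient = (arctan(y) - arctan(x)) / (y - x)
    upper    = 1 / (1 + x*y)
    """
    lower = 1.0 / (math.sqrt(1 + y * y) * math.sqrt(1 + x * x))
    quotient = (math.atan(y) - math.atan(x)) / (y - x)
    upper = 1.0 / (1 + x * y)
    return lower, quotient, upper


x = 0.5
for gap in (1.0, 0.1, 0.001):
    lower, quotient, upper = arctan_bounds(x, x + gap)
    # As the gap shrinks, all three numbers should approach 1 / (1 + x^2).
    print(gap, lower, quotient, upper, 1 / (1 + x * x))
```

As $y$ approaches $x$, both outer bounds close in on $1/(1 + x^2)$, which is what forces the difference quotient to the same value.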

Now since $\arctan(x)$ is an odd function, that is, $\arctan(-x) = -\arctan(x)$ for all $x$, this is sufficient to show that the formula holds for all real $x$.

The exponential function $e^x$ and its inverse function $\ln(x)$ are related by

$$\ln(e^x) = x \quad \text{for } x \in \mathbb{R}, \quad \text{and}$$

$$e^{\ln(x)} = x \quad \text{for } x > 0.$$

For the moment we assume that the reader is familiar with the basic properties of $e^x$ and $\ln(x)$. One such property is

$$(e^x)' = e^x \quad \text{for } x \in \mathbb{R}.$$

Then differentiating $e^{\ln(x)} = x$ using the Chain Rule gives

$$e^{\ln(x)}\,(\ln(x))' = 1.$$

Therefore

$$\big(\ln(x)\big)' = \frac{1}{x} \quad \text{for } x > 0.$$

But again, these manipulations are only suggestive because in them, we have assumed that the derivative of $\ln(x)$ exists. One can show that it indeed exists using Exercise 4.16, but we shall do so more directly in Sect. 6.3, in a similar spirit to how we obtained the derivative of $\arctan(x)$ above.

4.4 Fermat’s Theorem and Applications

Surely the most practical application of the derivative is to find maximum and minimum values (these are called extrema) for various functions. To this end, the following result is very useful.

Theorem 4.13. (Fermat’s Theorem) Let $f$ be defined on $(a, b)$ and let $c \in (a, b)$ be such that $f(c) \ge f(x)$ for all $x \in (a, b)$. Then either $f'(c) = 0$ or $f'(c)$ does not exist.

Proof. Let $\{x_n\}$ be any sequence in $(a, b)$ with $x_n \to c$ (and $x_n \ne c$), so by hypothesis, $f(c) \ge f(x_n)$. Now whenever $c > x_n$, we have

$$\frac{f(x_n) - f(c)}{x_n - c} \ge 0.$$

But whenever $c < x_n$, we have

$$\frac{f(x_n) - f(c)}{x_n - c} \le 0.$$

So if $f'(c)$ exists then we must have $f'(c) = 0$. (Otherwise, of course, $f'(c)$ does not exist.) □

This theorem is named for French mathematician Pierre de Fermat (1601–1665). Although his life predates the discovery of calculus proper, Fermat computed tangent lines and extrema for many families of curves.

The number $f(c)$ in this context is called a local maximum for $f$. It is local because $f$ may attain larger values outside $(a, b)$. Fermat’s Theorem (Theorem 4.13) holds also for $c$ yielding a local minimum: $f(c) \le f(x)$ for all $x \in (a, b)$. We leave the verification of this claim for Exercise 4.24. The number $f(c)$ is called a local extremum if it is either a local maximum or a local minimum.

The number $f(c)$ is called an absolute maximum for $f$ if $f(c) \ge f(x)$ for all $x$ in the domain of $f$, and the number $f(c)$ is called an absolute minimum if $f(c) \le f(x)$ for all $x$ in the domain of $f$. The number $f(c)$ is called an absolute extremum if it is either an absolute maximum or an absolute minimum.

Any point $c$ for which either $f'(c) = 0$ or $f'(c)$ does not exist is called a critical point for $f$. So Fermat’s Theorem (Theorem 4.13) says, in short, that a local extremum for $f$ must occur at a critical point for $f$. These are places at which $f$ has either a horizontal tangent line or a cusp/corner. In either case, $f(c)$ is called a critical value for $f$.

The converse of Fermat’s Theorem (Theorem 4.13) does not hold. That is, a critical value need not be a local extremum: consider $f(x) = x^3$ at $x = 0$. (The paper [7] amusingly calls such points “duds.”) So Fermat’s Theorem only tells us where to look for local extrema; it does not guarantee success. (If someone has caught a fish, then they must have been at a body of water. So suggesting that your friend go fishing at a body of water is good advice, but this of course does not guarantee success.)

Example 4.14. Let us seek the point(s) on the curve $y = x^2$ closest to the point $(0, 1)$. See Fig. 4.3.

Fig. 4.3 For Example 4.14: the distance $d$ from $(0, 1)$ to the curve $y = x^2$.

The distance from any point $(x, x^2)$ on the curve to the point $(0, 1)$ is given by

$$d = d(x) = \sqrt{(x - 0)^2 + (x^2 - 1)^2} = \sqrt{x^2 + (x^2 - 1)^2}.$$

Now it is clear from Fig. 4.3, or from the observation that $d(x) \to +\infty$ as $x \to \pm\infty$, that $d$ indeed has an absolute minimum (and no absolute maximum). An absolute minimum is also a local minimum and by Fermat’s Theorem (Theorem 4.13), it must occur at a critical point. By the Chain Rule, and after some simplifying,

$$d'(x) = \frac{x(2x^2 - 1)}{\sqrt{x^2 + (x^2 - 1)^2}},$$

and so $d$ has critical points at $x = 0$ and $x = \pm 1/\sqrt{2}$, at which $d' = 0$. Now $d(0) = 1$ and $d(1/\sqrt{2}) = d(-1/\sqrt{2}) = \sqrt{3}/2 < 1$. Therefore the points on $y = x^2$ closest to $(0, 1)$ are $(1/\sqrt{2}, 1/2)$ and $(-1/\sqrt{2}, 1/2)$; each attains the minimum distance $\sqrt{3}/2$. (The value $d(0) = 1$ is a local maximum.) ◊

In Example 4.14 we were able to justify, within the context, that Fermat’s Theorem (Theorem 4.13) indeed led us to the absolute minimum that we sought. Generally however, deciding which critical points yield absolute extrema can be a delicate matter. We pursue this further in Sect. 5.2.

But if the interval under consideration is $[a, b]$, that is, if it is closed, then Fermat’s Theorem (Theorem 4.13) and the Extreme Value Theorem (Theorem 3.23) together give a recipe by which the absolute extrema of a continuous function can be found quite easily:

The absolute maximum value of a continuous function $f$ on $[a, b]$ is the largest of

{the critical values of $f$, $f(a)$, and $f(b)$}.

The absolute minimum value of a continuous function $f$ on $[a, b]$ is the smallest of

{the critical values of $f$, $f(a)$, and $f(b)$}.
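The recipe above is mechanical enough to sketch in a few lines. The following is a minimal illustration of our own, in which the helper name `absolute_extrema` is hypothetical and the critical points are supplied by hand rather than computed. We apply it to the distance function of Example 4.14, restricted (as an assumption made only for this illustration) to the closed interval $[-2, 2]$.

```python
import math


def absolute_extrema(f, a, b, critical_points):
    """Closed-interval recipe: compare f at the endpoints and at the
    critical points lying in [a, b]. Returns ((min, where), (max, where))."""
    candidates = [a, b] + [c for c in critical_points if a <= c <= b]
    values = [(f(c), c) for c in candidates]
    return min(values), max(values)


# Distance from (x, x^2) to (0, 1), as in Example 4.14.
d = lambda x: math.sqrt(x**2 + (x**2 - 1) ** 2)
crit = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2)]   # where d'(x) = 0

(lo_val, lo_at), (hi_val, hi_at) = absolute_extrema(d, -2.0, 2.0, crit)
print(lo_val, lo_at)   # smallest candidate value and a point where it occurs
print(hi_val, hi_at)   # largest candidate value and a point where it occurs
```

On this interval the smallest candidate is $\sqrt{3}/2$, attained at $x = \pm 1/\sqrt{2}$, and the largest is the endpoint value $d(\pm 2) = \sqrt{13}$. When two candidates tie, the sketch simply reports one of them.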

(Here, if your friend goes to the right body of water then there are definitely fish to be caught. And if your friend employs impeccable fishing techniques, then success is guaranteed!)

Example 4.15. Consider the function

$$f(x) = e^x \sqrt{5 - 4x},$$

which is continuous on $[-1, 1]$. Here, by the Product and Chain Rules and after some simplifying,

$$f'(x) = \frac{e^x (3 - 4x)}{\sqrt{5 - 4x}}.$$

The only critical point that $f$ has in $[-1, 1]$ is $x = 3/4$, at which $f' = 0$. Now $f(3/4) \cong 2.99$, $f(-1) \cong 1.1$, and $f(1) = e \cong 2.718$. Therefore, on $[-1, 1]$, $f$ has a maximum value of about $2.99$ and a minimum value of about $1.1$. ◊

Example 4.16. John is on one side of a river half a mile wide, say at point $A$. He notices his house burning 2 miles downstream, but it is on the opposite side of the river; naturally, he wants to get to his house as quickly as possible. John can run at 5 miles per hour and he can swim downstream at 3 miles per hour. How should he proceed?

For a solution, consider Fig. 4.4, which helps us to obtain an expression for the time $T$ that it takes John to get to his house (point $B$). We denote by $P$ any point on the opposite bank to which he may swim. The distance from $P$ to the point directly across the river from $A$ (let’s call it $A'$) is $x$.

Fig. 4.4 For Example 4.16: John swims from $A$ to $P$ (a distance of $\sqrt{x^2 + 1/4}$), then runs from $P$ to $B$ (a distance of $2 - x$).

As time is distance divided by speed,

$$T(x) = \frac{\sqrt{x^2 + 1/4}}{3} + \frac{2 - x}{5}.$$

Obviously, we need only consider $T(x)$ for $0 \le x \le 2$. So we want to find a value

and so $T$ is differentiable for all $x$, and $T'(3/8) = 0$. That is, $T$ has a critical point at $x = 3/8$. Now $T(3/8) = 8/15 = 0.5\overline{3}$, $T(0) = 17/30 = 0.5\overline{6}$, and $T(2) = \sqrt{17}/6 \cong 0.687$. Therefore, John should swim to the point exactly $3/8$ of a mile downstream, and run the rest of the way. In doing so, it would take him $0.5\overline{3}$ of an hour to get to his house. (If John wanted to allow his house to burn in order to collect insurance money, then try to convince investigators that he did his best in attempting to save his house, he might swim the entire way.) ◊

Remark 4.17. Example 4.16 is a classic [20]. A version of it, in which a man can walk on smooth ground at a certain speed, and walk on plowed ground at a certain (slower) speed, appears in a 1691–1692 manuscript by the Swiss mathematician Johann Bernoulli (1667–1748). The manuscript was published in 1742, just 6 years before Maria Agnesi’s book. ∘
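The arithmetic in Example 4.16 is easy to confirm by machine. The sketch below is our own; the function names are hypothetical, and the derivative is worked out by hand from $T(x) = \sqrt{x^2 + 1/4}/3 + (2 - x)/5$, which is an assumption on our part rather than a formula quoted from the text. It evaluates $T$ at the critical point and at the two endpoints.

```python
import math


def travel_time(x):
    """Hours to swim from A to P, then run from P to B (Example 4.16)."""
    return math.sqrt(x**2 + 0.25) / 3 + (2 - x) / 5


def travel_time_derivative(x):
    """T'(x), differentiated by hand (our assumption, not quoted from the text)."""
    return x / (3 * math.sqrt(x**2 + 0.25)) - 1 / 5


for x in (0.0, 3 / 8, 2.0):
    print(x, travel_time(x))

# Should be zero up to rounding, confirming the critical point at x = 3/8.
print(travel_time_derivative(3 / 8))
```

The three printed times correspond to running the whole way after swimming straight across ($x = 0$), the optimal landing point ($x = 3/8$), and swimming the entire way ($x = 2$).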

Exercises

4.1. Prove Lemma 4.6: If $f$ is differentiable at $x \in I$, then $f$ is continuous at $x$.

$$f(g(x)) - f(g(x_0)) = \big(g(x) - g(x_0)\big)\,h(g(x)).$$

(b) Now, for $x_n \ne x_0$, consider the quotient

$$\frac{f(g(x_n)) - f(g(x_0))}{x_n - x_0} = \frac{\big(g(x_n) - g(x_0)\big)\,h(g(x_n))}{x_n - x_0}.$$

Observe that $h \circ g$ is continuous on $J$, and let $x_n \to x_0$.

4.14. Use the Chain Rule and the Power Rule for Integer Powers to prove the Power Rule for Rational Powers.

4.15. (a) Show that

$$\lim_{h \to 0} \frac{\sin(h)}{h} = 1.$$

(b) Use this to show that

$$\lim_{h \to 0} \frac{1 - \cos(h)}{h} = 0.$$

(c) Now write

$$\sin(x + h) = \sin(x)\cos(h) + \cos(x)\sin(h)$$

to show that $(\sin(x))' = \cos(x)$.

(d) What if x is in degrees, rather than radians?

(e) In a similar way, show that $(\cos(x))' = -\sin(x)$.

(f) Find the derivatives (where they exist) of the other four trigonometric functions. (See [11, 19, 25] for neat ways of showing that $(\sin(x))' = \cos(x)$ using some simple geometry then taking a limit.)

4.16. A function $f$ is strictly increasing on $I$ if

$$f(x_1) < f(x_2) \quad \text{whenever } x_1, x_2 \in I \text{ with } x_1 < x_2.$$

And $f$ is strictly decreasing on $I$ if $-f$ is strictly increasing there. A function which is either strictly increasing or strictly decreasing is called strictly monotonic. A function $f\colon I \to J$ is onto if for each $y_0 \in J$, there is $x_0 \in I$ such that $f(x_0) = y_0$. If $f\colon (a, b) \to (p, q)$ is strictly monotonic and onto, then the inverse $f^{-1}\colon J \to I$ exists. It is defined by $f^{-1}(y) = x \iff y = f(x)$. Suppose that $f\colon (a, b) \to (p, q)$ is strictly monotonic and onto.

(a) Prove that $f$ is continuous on $(a, b)$.

(b) Prove that $f^{-1}$ is strictly monotonic and onto, and therefore continuous.

(c) Show that if $f$ is also differentiable and $f'(x) \ne 0$ then $f^{-1}$ is differentiable, with

$$(f^{-1})'(y) = \frac{1}{f'(f^{-1}(y))}.$$

(d) [24] Explain what the formula in (c) has to do with Fig. 4.5.

Fig. 4.5 For Exercise 4.16. Observe that $\tan(\beta) = 1/\tan(\alpha)$, since $\alpha + \beta = \pi/2$.

4.17. [16] Show that for any $n$ distinct real numbers ($n \ge 4$) there are at least two which satisfy

$$0 < \frac{x - y}{1 + xy} < \tan\!\left(\frac{\pi}{n - 1}\right).$$

Hint: Each number can be written as $\tan(u)$, where $-\pi/2 < u < \pi/2$. The trigonometric identity $\tan(A - B) = \dfrac{\tan(A) - \tan(B)}{1 + \tan(A)\tan(B)}$ will be useful.

4.30. [22] Let f be differentiable on an open interval which contains Œ1; 1; withjf 0 .x/j 1 for x 2 Œ1; 1: Show that there is x0 2 .1; 1/ for which jf 0 .x0 /j < 4:Hint: Consider g.x/ D f .x/ C 2x 2 :4.31. (a) Let f .x/ D 2x 2 x: Consider a rectangle in the first quadrant with one side on the positive x-axis and inscribed under the graph of y D f .x/: Find the rectangle so described which has maximal area.(b) [15] Let g.x/ D x 2xC1 : Consider a rectangle in the first quadrant with one side on the positive x-axis and inscribed under the graph of y D g.x/: Show that there is no such rectangle which has maximal area.

4.32. A piece of wire $L$ inches long is cut into two pieces: one bent into the shape of a square, and one into the shape of a circle. How should the wire be cut so that the total area of the two shapes is as small as possible? How should the wire be cut so that the total area of the two shapes is as large as possible?

4.33. Show, using calculus, that the rectangle with given perimeter which has the greatest area is a square. Show that the rectangle with given area which has the least perimeter is a square. (We’ve already shown these in Example 2.9, using the AGM Inequality. Using calculus here is rather like killing a mite with a sledgehammer.)

4.34. (a) A rectangular plot of ground is to be enclosed by fencing on three sides, with a long existing wall serving as boundary for the fourth side. Find the dimensions of the plot of greatest area which can be enclosed with 1,000 ft of fencing. Can you do this without using calculus?

(b) Suppose now that we have the same situation as in (a), but that the existing wall is 400 ft long. Find the dimensions of the plot of greatest area which can be enclosed with 1,000 ft of fencing. Can you do this without using calculus? For a thorough study of problems such as these, see [21].

4.35. A box with open top is made from a rectangular piece of cardboard 12 in. by 18 in.; congruent squares are cut from each corner and the edges are folded up. Find the dimensions of such a box which has largest volume.

4.36. (a) Find the dimensions of the rectangle of maximum area that can be inscribed in a circle with radius $R$.

(b) Find the dimensions of the right circular cylinder of maximum volume that can be inscribed in a sphere with radius $R$.

4.37. A wooden beam is to be carried horizontally around a corner, from a hallway of width 12 ft into a hallway of width 8 ft. Find the length of the longest beam that can be so carried. Can you do this without using calculus?

4.38. A rectangular piece of paper is 6 in. wide and 25 in. long. The paper is folded, creating a crease, so that the lower right corner just touches the left side. Describe the fold which minimizes the length of the crease.

4.39. (e.g., [6]) Let $A > 0$, $a > 0$ and $B > 0$, so that $P = (0, A)$ and $Q = (a, B)$ are points on the positive $y$-axis and in the first quadrant respectively. Find the point $C$ on the $x$-axis so that the sum of distances $PC + CQ$ is minimized. The answer is known as Fermat’s Law of Reflection.

4.40. (e.g., [6]) Let $A > 0$, $a > 0$ and $B < 0$, so that $P = (0, A)$ and $Q = (a, B)$ are points on the positive $y$-axis and in the fourth quadrant respectively. Suppose that light travels with velocity $p$ above the $x$-axis and velocity $q$ below the $x$-axis. Find the path from $P$ to $Q$ which takes the least time. The answer is known as Snell’s Law of Refraction.

Up the airy mountain, Down the rushy glen . . .

– The Fairies, by William Allingham

The main focus of this chapter is the Mean Value Theorem and some of its applications. This is the big theorem in the world of differentiable functions. Many important results in calculus (and well beyond!) follow from the Mean Value Theorem. We also look at an interesting and useful generalization, due to Cauchy.

5.1 The Mean Value Theorem

The following result is named for French mathematician Michel Rolle (1652–1719).

Theorem 5.1. (Rolle’s Theorem) Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$, with $f(a) = f(b)$. Then there exists $c \in (a, b)$ such that $f'(c) = 0$.

Proof. If $f$ is constant on $[a, b]$, then its derivative is zero and so any $c \in (a, b)$ satisfies the conclusion of the theorem. So we assume that $f$ is not constant on $[a, b]$. By the Extreme Value Theorem (Theorem 3.23), $f$ attains an absolute maximum and an absolute minimum on $[a, b]$. Since $f$ is not constant, at least one of these absolute extrema must occur at $c \in (a, b)$. Then since $f$ is differentiable on $(a, b)$, an application of Fermat’s Theorem (Theorem 4.13) gives $f'(c) = 0$, as desired. □

Rolle’s Theorem (Theorem 5.1) is fairly obvious, upon drawing a picture: If a differentiable function starts at $f(a)$ then returns to $f(b) = f(a)$, there must be at least one place on its graph at which the tangent line is horizontal. See Fig. 5.1.

Here is another neat proof [2, 7, 41] of Rolle’s Theorem (Theorem 5.1); we leave the details for Exercise 5.3. Since $f$ is continuous and $f(a) = f(b)$, by the Half-Chord Lemma there are $a_1, b_1 \in [a, b]$ with $f(a_1) = f(b_1)$ and $b_1 - a_1 = (b - a)/2$. Again by the Half-Chord Lemma, there are $a_2, b_2 \in [a_1, b_1]$ with $f(a_2) = f(b_2)$ and $b_2 - a_2 = (b - a)/2^2$. Continuing in this way, we obtain a sequence of nested

Fig. 5.1 Rolle’s Theorem

The Mean Value Theorem, which is commonly attributed to French mathematician Joseph-Louis Lagrange (1736–1813), extends Rolle’s Theorem to allow $f(a) \ne f(b)$. It is equivalent to Rolle’s Theorem but since $f(a) = f(b)$ is generally not the case, it gets used most often in this form. Our proof follows [43]. This proof is a little different from the one found in most textbooks, which we leave for Exercise 5.9.

Theorem 5.2. (Mean Value Theorem) Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $c \in (a, b)$ such that

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

Proof. The equation of the line $L$ through the origin $(0, 0)$ which is parallel to the line through $(a, f(a))$ and $(b, f(b))$ is given by

$$y = \frac{f(b) - f(a)}{b - a}\,x.$$

See Fig. 5.2. Therefore the vertical displacement between $f(x)$ and $L$ is given by the function

$$h(x) = f(x) - \frac{f(b) - f(a)}{b - a}\,x.$$

It is clear from Fig. 5.2 that $h(a) = h(b)$, or the reader may verify directly that

$$h(a) = h(b) = \frac{b f(a) - a f(b)}{b - a}.$$

Now $h$ is continuous on $[a, b]$ and differentiable on $(a, b)$, so by Rolle’s Theorem there is $c \in (a, b)$ for which $h'(c) = 0$. Finally,

$$h'(x) = f'(x) - \frac{f(b) - f(a)}{b - a},$$

and so $h'(c) = 0$ gives $f'(c) = \dfrac{f(b) - f(a)}{b - a}$, as desired. □
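The theorem only asserts that some $c$ exists, but for a concrete function one can hunt for it numerically. The sketch below is our own illustration and is not how the proof proceeds: it uses bisection on $f'(x) - \frac{f(b) - f(a)}{b - a}$, which works only under the extra assumption that this quantity changes sign between the endpoints. The helper name `mean_value_point` and the test function $f(x) = x^3$ on $[0, 2]$ are hypothetical choices.

```python
def mean_value_point(df, f, a, b, tol=1e-12):
    """Locate c in (a, b) with df(c) equal to the chord slope of f on [a, b].

    Uses bisection, so it assumes df(x) - slope changes sign on [a, b]
    and that df is continuous there (more than the theorem itself requires).
    """
    slope = (f(b) - f(a)) / (b - a)
    g = lambda x: df(x) - slope
    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("no sign change; bisection does not apply")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid      # a root lies in [lo, mid]
        else:
            lo = mid      # a root lies in [mid, hi]
    return (lo + hi) / 2


# f(x) = x^3 on [0, 2]: the chord slope is 4, so 3c^2 = 4 and c = 2 / sqrt(3).
c = mean_value_point(lambda x: 3 * x**2, lambda x: x**3, 0.0, 2.0)
print(c)
```

For this choice the value can also be found by hand, which gives a convenient check on the numerical answer.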

Fig. 5.2 The proof of the Mean Value Theorem (Theorem 5.2)

Remark 5.3. The quotient $\dfrac{b f(a) - a f(b)}{b - a}$ is the $y$-intercept of the line through the points $(a, f(a))$ and $(b, f(b))$. We shall meet this quotient again in Sect. 7.3. ∘

Remark 5.4. We outlined just above a proof of Rolle’s Theorem (Theorem 5.1) which uses the Half-Chord Lemma (Lemma 3.12) and the Nested Interval Property (Theorem 1.41). In [16], this idea is eloquently modified to prove the Mean Value Theorem (Theorem 5.2). ∘

The Mean Value Theorem (Theorem 5.2) says that between points $(a, f(a))$ and $(b, f(b))$, the graph of a differentiable function must have at least one place $(c, f(c))$ at which the tangent line is parallel to the line through the points $(a, f(a))$ and $(b, f(b))$. See Fig. 5.3.

Suppose that over some journey, a car has some particular average speed. Then by the Mean Value Theorem (Theorem 5.2) there must have been an instant during the journey at which the car was travelling at precisely that average speed. (There is a rumor that if someone arrives in their car at a toll booth too soon after leaving a previous toll booth, then they could get a ticket for speeding.) But it is not really the full Mean Value Theorem that is required here because a car travels with a continuous position function certainly, but it travels with continuous speed as well. The Mean Value Theorem only requires that the speed function exists. It is an interesting fact that there seems to be no simpler proof of the Mean Value Theorem assuming also that $f'$ is continuous, even though this is the context in which it is usually applied [24].

Fig. 5.3 The Mean Value Theorem (Theorem 5.2): There is at least one place $(c, f(c))$ at which the tangent line is parallel to the line through $(a, f(a))$ and $(b, f(b))$. (In Fig. 5.2 there are three such places.)

Example 5.5. Recall the Fixed Point Lemma (Lemma 3.10): If $f\colon [a, b] \to [a, b]$ is continuous, then $f$ has a fixed point in $[a, b]$. If we know also that $f'(x) < 1$ for each $x \in [a, b]$, then the fixed point is unique. (This is reasonable upon drawing a picture.) Here’s why: If there are two fixed points, say $f(x_0) = x_0$ and $f(y_0) = y_0$, then by the Mean Value Theorem (Theorem 5.2) there is $c$ between $x_0$ and $y_0$ such that

$$f'(c) = \frac{f(x_0) - f(y_0)}{x_0 - y_0}.$$

But this reads

$$f'(c) = \frac{x_0 - y_0}{x_0 - y_0} = 1,$$

which contradicts $f' < 1$. So we must have $x_0 = y_0$. ◊

It may seem that we could prove the Mean Value Theorem (Theorem 5.2) by suitably rotating the $x$- and $y$-axes, to get $f(a) = f(b)$, and then applying Rolle’s Theorem (Theorem 5.1). But the function $f(x) = x^3 - x$, for example, shows that this idea does not work. See Fig. 5.4. Here, $f(2) = 6$ and $f(-2) = -6$. If we rotate the axes so that the line through $(-2, -6)$ and $(2, 6)$ is the new $x$-axis, then the image of the graph of $f$ under this rotation is not a function. Indeed, the new $y$-axis (which is the old $y = -x/3$ line) intersects the graph of $f$ three times. See [17, 51].

on $(a, b)$.) The function $f$ being strictly increasing means that the $\le$ above can be replaced with $<$, and the function $f$ being strictly decreasing means that the $\ge$ above can be replaced with $>$.

Lemma 5.6. Suppose that $f$ is differentiable on $(a, b)$.
(i) If $f' \ge 0$ on $(a, b)$ then $f$ is increasing on $(a, b)$.
(ii) If $f' \le 0$ on $(a, b)$ then $f$ is decreasing on $(a, b)$.
(iii) If $f'(x) = 0$ for every $x \in (a, b)$, then $f$ is constant on $(a, b)$.

Proof. We let $a < x_1 < x_2 < b$ and apply the Mean Value Theorem (Theorem 5.2) to $f$ on $[x_1, x_2]$. Then there is $c \in (x_1, x_2)$ such that

$$f'(c) = \frac{f(x_2) - f(x_1)}{x_2 - x_1}.$$

That is,

$$f(x_2) - f(x_1) = f'(c)\,(x_2 - x_1).$$

Now for (i), if $f' \ge 0$ then the right-hand side is $\ge 0$, and so the left-hand side is $\ge 0$. That is, $f(x_2) \ge f(x_1)$.

For (ii), if $f' \le 0$ then the right-hand side is $\le 0$, and so the left-hand side is $\le 0$. That is, $f(x_2) \le f(x_1)$.

For (iii), we must have $f'(c) = 0$ and so the right-hand side is $= 0$. Therefore $f(x_1) = f(x_2)$. This is true for any choice of $x_1 < x_2$ in $(a, b)$ and so $f$ must be constant. □

The following consequence of part (iii) of Lemma 5.6 is particularly important.

can be verified by expanding the left-hand side then tidying up. But here is an easier way: The derivative of the left-hand side with respect to $x$ is

$$-\sum_{j=1}^{n} (a_j - A) = -\sum_{j=1}^{n} a_j + \sum_{j=1}^{n} A = 0.$$

The derivative of the right-hand side with respect to $x$ is zero, since $x$ does not appear there. By Corollary 5.7 then, the left-hand and right-hand sides differ by a constant. Setting $x = 0$ reveals that the constant is zero, as we wanted to show.

To illustrate one instance in which the identity can be used, we take $b_j = a_j$ for each $j$. Then

$$0 \le \sum_{j=1}^{n} (a_j - A)^2 = \sum_{j=1}^{n} a_j^2 - \frac{1}{n}\left(\sum_{j=1}^{n} a_j\right)^{2}.$$

That is,

$$\left(\frac{1}{n}\sum_{j=1}^{n} a_j\right)^{2} \le \frac{1}{n}\sum_{j=1}^{n} a_j^2.$$
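The inequality just obtained, and the identity behind it, can be spot-checked on sample data. The sketch below is our own illustration; the helper name `mean_square_gap` and the sample numbers are hypothetical. It computes the gap between the mean of the squares and the square of the mean in two ways, taking $A$ to be the arithmetic mean of the $a_j$.

```python
def mean_square_gap(a):
    """Return (gap, direct), where
    gap    = (1/n) * sum(a_j^2) - ((1/n) * sum(a_j))^2, and
    direct = (1/n) * sum((a_j - A)^2), with A the arithmetic mean.
    The two should agree, and both should be nonnegative.
    """
    n = len(a)
    A = sum(a) / n
    gap = sum(x * x for x in a) / n - A * A
    direct = sum((x - A) ** 2 for x in a) / n
    return gap, direct


gap, direct = mean_square_gap([1.0, 4.0, 2.5, -3.0])
print(gap, direct)
```

The gap vanishes exactly when every $a_j$ equals the mean, which is a useful observation to keep in mind for Exercise 5.19.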

This can also be obtained from the Cauchy-Schwarz Inequality (Theorem 2.18). We leave this for the reader to verify; see also Exercise 2.40. ◊

With parts (i) and (ii) of Lemma 5.6, we are better equipped to handle many extrema problems. But we continue to rely on Fermat’s Theorem (Theorem 4.13) which says that we should look for extrema of a function $f$ defined on $(a, b)$ at the critical points for $f$. That is, at $c \in (a, b)$ for which either $f'(c) = 0$ or $f'(c)$ does not exist.

Example 5.9. Consider $f(x) = x^{7/3} + x^{4/3} - x^{1/3}$, for $x \in \mathbb{R}$. Then

5.3 Cauchy’s Mean Value Theorem

If we apply the Mean Value Theorem (Theorem 5.2) to each of two functions $f$ and $g$ continuous on $[a, b]$ and differentiable on $(a, b)$, we can conclude that there are $c_1, c_2 \in (a, b)$ such that

$$f'(c_1) = \frac{f(b) - f(a)}{b - a} \quad \text{and} \quad g'(c_2) = \frac{g(b) - g(a)}{b - a}.$$

This gives

$$f'(c_1)\,\big(g(b) - g(a)\big) = g'(c_2)\,\big(f(b) - f(a)\big).$$

But it happens that there is in fact one $c \in (a, b)$ which works for both functions, as follows.

Theorem 5.11. (Cauchy’s Mean Value Theorem) Let $f$ and $g$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $c \in (a, b)$ such that

$$\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}.$$
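As a small worked illustration (the functions and interval are our own choice, not taken from the text), take $f(x) = x^2$ and $g(x) = x^3$ on $[0, 1]$. Applying the Mean Value Theorem to each function separately gives two different points $c_1$ and $c_2$, while Cauchy’s Mean Value Theorem supplies a single point $c$ that serves both:

```latex
\begin{align*}
f'(c_1) &= 2c_1 = \frac{1 - 0}{1 - 0} = 1
  &&\Longrightarrow\quad c_1 = \tfrac{1}{2}, \\
g'(c_2) &= 3c_2^{2} = \frac{1 - 0}{1 - 0} = 1
  &&\Longrightarrow\quad c_2 = \tfrac{1}{\sqrt{3}}, \\
\frac{f'(c)}{g'(c)} &= \frac{2c}{3c^{2}} = \frac{2}{3c}
  = \frac{f(1) - f(0)}{g(1) - g(0)} = 1
  &&\Longrightarrow\quad c = \tfrac{2}{3}.
\end{align*}
```

Here $g'(c) = 3c^2 \ne 0$ on $(0, 1)$ and $g(1) \ne g(0)$, so both quotients make sense.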

Remark 5.12. We saw the geometrical interpretation of the Mean Value Theorem (Theorem 5.2) in Fig. 5.3. Cauchy’s Mean Value Theorem also has a geometrical interpretation, though perhaps not so obvious (e.g., [17, 35]). If $P(t) = (g(t), f(t))$ is a point in the $xy$-plane which depends on $t$, then $f'(t)/g'(t)$, when it exists, is the slope of the tangent line to the curve that $P(t)$ traces as $t$ varies in $[a, b]$. What Cauchy’s Mean Value Theorem says is that as long as $g' \ne 0$, there is at least one place on the curve at which the tangent line is parallel to the line through $(g(a), f(a))$ and $(g(b), f(b))$. See Fig. 5.6. If $g(t) = t$ then we recover the Mean Value Theorem (Theorem 5.2) and its geometric interpretation. ∘

Probably the best known application of Cauchy’s Mean Value Theorem (Theorem 5.11) is L’Hospital’s Rule, as follows. (But we shall see others.)

Theorem 5.13. (L’Hospital’s Rule) Let $f$ and $g$ be continuous on $[a, b]$ and differentiable on $(a, b)$ and let $x_0 \in (a, b)$. Let $f(x_0) = g(x_0) = 0$, but suppose that $g \ne 0$ at all other points of $(a, b)$. Suppose also that $g' \ne 0$ on $(a, b)$. Then

$$\lim_{x \to x_0} \frac{f'(x)}{g'(x)} = L \quad \Longrightarrow \quad \lim_{x \to x_0} \frac{f(x)}{g(x)} = L.$$

Proof. Since $f(x_0) = g(x_0) = 0$, for $x \ne x_0$ we may write

$$\frac{f(x)}{g(x)} = \frac{f(x) - f(x_0)}{g(x) - g(x_0)}.$$

Fig. 5.6 Cauchy’s Mean Value Theorem (Theorem 5.11): There is at least one place on the curve at which the tangent line is parallel to the line through $(g(a), f(a))$ and $(g(b), f(b))$. In this picture, there are two such places.

In Exercises 5.43 and 5.44 we consider the roles that some of the various hypotheses in L’Hospital’s Rule play. There are also many variants of L’Hospital’s Rule, a few of which we explore in Exercises 5.45 and 5.46.

Remark 5.15. L’Hospital’s Rule is in fact due to Swiss mathematician Johann Bernoulli (1667–1748). Guillaume de L’Hospital (1661–1704) was a French student of Bernoulli’s who, with permission, published notes from his teacher’s lectures in 1696. This was the first-ever calculus textbook. ∘

Remark 5.16. [9] There is some disagreement among historians of mathematics on the spelling of L’Hospital’s name. He himself spelled it, at times, Lhospital. That is, without the apostrophe and with a lower case h. R.P. Boas Jr. used this spelling on occasion (e.g., [5]). On the cover of the 1696 calculus book, it is spelled l’Hospital. The official French national bibliographic entry is L’Hospital, which is what most historians choose. ∘
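To see Theorem 5.13 at work, here is a short worked example of our own (the functions and the interval are assumptions chosen for illustration). Take $f(x) = e^x - 1$ and $g(x) = \sin(x)$ on $[-1, 1]$, with $x_0 = 0$. Then $f(0) = g(0) = 0$, $g \ne 0$ at all other points of $(-1, 1)$, and $g'(x) = \cos(x) \ne 0$ on $(-1, 1)$, so the hypotheses appear to be met. Using $(e^x)' = e^x$ and $(\sin(x))' = \cos(x)$ from Sect. 4.3:

```latex
\[
\lim_{x \to 0} \frac{f'(x)}{g'(x)}
  = \lim_{x \to 0} \frac{e^{x}}{\cos(x)} = 1
\quad \Longrightarrow \quad
\lim_{x \to 0} \frac{e^{x} - 1}{\sin(x)} = 1 .
\]
```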

Exercises

5.1. (a) Show that a polynomial of degree $n$ cannot have more than $n$ real zeros.

(b) Show that if a polynomial $p$ has $n$ distinct real zeros then $p'$ has $n - 1$ distinct real zeros.

5.2. [30] Let $p$ be a cubic polynomial with real zeros $a_1 < a_2 < a_3$.

(a) Show that $p$ has a critical point $c$, with $a_1 < c < a_2$.

(b) Show that $c$ is closer to $a_1$ than to $a_2$.

5.3. (a) Fill in the details of the proof of Rolle’s Theorem (Theorem 5.1) outlined in Sect. 5.1, which uses the Half-Chord Lemma (Lemma 3.12) and the Nested Interval Property of $\mathbb{R}$ (Theorem 1.41).

(b) Is the “$c \in (a, b)$” from our proof of Rolle’s Theorem (Theorem 5.1) necessarily the same as the “$c \in (a, b)$” from the proof in (a)? Explain.

5.4. [54] (a) Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$ such that $f(a) = f(b) = 0$. Show that there is $c \in (a, b)$ such that $f'(c) = f(c)$. Hint: Consider $g(x) = e^{-x} f(x)$.

(b) Interpret the result in (a) geometrically.

5.5. [25] (a) Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$ such that $f(a) = f(b) = 0$, but $f$ is not the zero function. Show that for any real number $r \ne 0$ there is $c \in (a, b)$ such that $r f'(c) + f(c) = 0$. Hint: Consider $g(x) = e^{x/r} f(x)$.

(b) Interpret the result in (a) geometrically.

5.6. [50] (a) Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Set

5.9. The proof of the Mean Value Theorem (Theorem 5.2) given in the text is slightly different from the one given in most textbooks. Typically, one uses the auxiliary function

$$h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}\,(x - a).$$

(a) Prove the Mean Value Theorem using this h.

(b) What is the significance of this $h$, geometrically?

(c) Is the “$c \in (a, b)$” from the proof supplied in the text necessarily the same as the “$c \in (a, b)$” from the proof in (a)?

5.10. [53] Apply Rolle’s Theorem (Theorem 5.1) to

$$g(x) = \big(f(x) - f(a)\big)(x - b) - \big(f(x) - f(b)\big)(x - a)$$

to obtain another proof of the Mean Value Theorem (Theorem 5.2). The function $g(x)$ is $\pm$ twice the area of the triangle determined by the points $(a, f(a))$, $(x, f(x))$, and $(b, f(b))$. See Exercise 1.12.

5.11. Show that the conclusion of the Mean Value Theorem (Theorem 5.2) can be written

$$f'\big(a + t(b - a)\big) = \frac{f(b) - f(a)}{b - a},$$

for some $t \in (0, 1)$.

5.14. [27] Prove the following converse to the Mean Value Theorem (Theorem 5.2). Let $F$ and $f$ be defined on $(a, b)$ and let $f$ be continuous there. Suppose that for every $x, y \in (a, b)$ there is $c$ between $x$ and $y$ such that

$$F(x) - F(y) = (x - y)\,f(c).$$

Show that $F$ is differentiable on $(a, b)$, and that $f$ is its derivative.

5.17. Prove part (iii) of Lemma 5.6 by going at it in the contrapositive direction. That is, show that if $f$ is not constant then there is a $c \in (a, b)$ for which $f'(c) \ne 0$. (For another interesting proof, which uses a bisection algorithm and not the Mean Value Theorem, see [39].)

5.18. Suppose that $f$ is differentiable on $(a, b)$ with $f'(x) \ne 0$ for every $x \in (a, b)$. Prove that $f$ is one-to-one on $(a, b)$.

5.19. [36] Have a look again at Example 5.8. Under what conditions do the real numbers $a_1, a_2, \ldots, a_n$ satisfy

$$\sum_{j=1}^{n} a_j^2 = \frac{1}{n}\left(\sum_{j=1}^{n} a_j\right)^{2}\,?$$

5.20. [1] Here’s a slick way of verifying the trigonometric identities

(a) Show that if it is the case that the sequence $\{x_n\}$ converges to some number $p$, then $p$ is necessarily a fixed point for $f$.

(b) Now suppose that $f\colon [a, b] \to [a, b]$ has a continuous derivative. Let $0 < k < 1$ and suppose that $|f'(x)| \le k$ for all $x \in (a, b)$. Show that for any $x_0 \in [a, b]$, the iteration scheme defined in (a) converges.

(c) Verify the hypotheses in (b), for $f(x) = \frac{1}{1 + x}$ on $[0, 1]$. To what number does the fixed point iteration scheme converge in this case? (Take $x_0 = 0$, say.)

5.37. [22] Let $x, y, z \ge 0$. Schur’s Inequality is: For any $\lambda \in \mathbb{R}$,

$$(x - y)(x - z)\,x^{\lambda} + (y - x)(y - z)\,y^{\lambda} + (z - x)(z - y)\,z^{\lambda} \ge 0,$$

with equality if and only if $x = y = z = 0$. Prove Schur’s Inequality, as follows.

The greatest shortcoming of the human race is our inability to understand the exponential function.

– Albert A. Bartlett

By now we know Euler’s number $e = e^1$ quite well. In this chapter we define the exponential function $e^x$ for any $x \in \mathbb{R}$, and its inverse the natural logarithmic function $\ln(x)$, for $x > 0$. (In the first section of the chapter we take a concise approach to the exponential function; in the second section we do things carefully.) These functions enable us to extend many of our previous results to allow for real exponents. For example, we obtain the Power Rule for real exponents, we extend Bernoulli’s Inequality, and we obtain a more strapping version of the AGM Inequality. We also meet the Logarithmic Mean, the Harmonic series and its close relatives the Alternating Harmonic series and $p$-series, and Euler’s constant $\gamma$.

6.1 The Exponential Function, Quickly

In this section we take a concise approach (e.g., [52]) to the exponential function, while omitting some details of rigor. In the next section we offer an entirely rigorous and self-contained approach. The reader may choose to concentrate on this section or on the next before proceeding to Sect. 6.3, but understanding both would be best.

We begin with the basic assumption that there exists a function $E(x)$ defined for all $x \in \mathbb{R}$ such that

By the definition of the derivative we have (for n large):

Then since $E(0) = 1$ and $E(x) = E'(x)$,

And continuing in this way, we get

$$E\!\left(\frac{n}{n}\right) = E(1) \cong \left(1 + \frac{1}{n}\right)^{n}.$$

Taking $n$ as large as we please we can obtain $e \cong 2.71828$, say.
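The stepping scheme just described is short enough to run. The sketch below is our own illustration, with the hypothetical name `euler_step_estimate`. It assumes only what is used above, namely a function whose value at $0$ is $1$ and whose derivative equals its own value, and takes $n$ tangent-line steps of size $1/n$ from $x = 0$ to $x = 1$; each step multiplies the current value by $(1 + 1/n)$.

```python
import math


def euler_step_estimate(n):
    """Estimate e with n tangent-line steps of size 1/n from x = 0 to x = 1.

    The running value approximates the function at x = k/n; the slope of the
    tangent line there is taken to be the current value itself.
    """
    value = 1.0                 # value at x = 0
    for _ in range(n):
        value += value / n      # step of size 1/n along the tangent line
    return value


for n in (10, 1000, 100000):
    # math.e is printed only for comparison.
    print(n, euler_step_estimate(n), math.e)
```

The estimates creep up toward $e$ rather slowly: roughly speaking, the error shrinks in proportion to $1/n$.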

The symbol $e$ is used in honor of the Swiss mathematician Leonhard Euler (1707–1783). It is often called Euler’s number.

Remark 6.1. The scheme used above for approximating $e$ is a special case of Euler’s method of tangent lines. This is a method for obtaining approximate solutions to differential equations, like $E'(x) = E(x)$. See also Exercise 6.2. ∘

Finally, because $E(1) = e$, and because of item (iii) which is evocative of the “same base add the exponents” rule, it is customary to write

$$E(x) = e^x.$$

The most important properties of this, the exponential function $e^x$, are (arguably):

$$e^x e^y = e^{x+y}, \qquad (e^x)' = e^x,$$

and

$$1 + x \le e^x \quad \text{for } x \in \mathbb{R} \text{ (with strict inequality for } x \ne 0).$$

122 6 The Exponential Function

This last inequality is tremendously useful, as we shall see many times. A graph of $1 + x$ and $e^x$ over $[-1, 2]$ is shown in Fig. 6.1.

we might just as well have considered

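$$\left(1 + \frac{1}{n}\right)^{n} < e < \left(1 - \frac{1}{n}\right)^{-n} \quad \text{for } n = 2, 3, 4, \ldots.$$

Indeed, this latter form will be more suitable for the present purposes. Among other things, its obvious symmetry will be useful. For a given $x \in \mathbb{R}$, we consider now the sequences

$$\left\{\left(1 + \frac{x}{n}\right)^{n}\right\} \quad \text{and} \quad \left\{\left(1 - \frac{x}{n}\right)^{-n}\right\}, \quad \text{for natural numbers } n > |x|.$$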

And for $|h| < 1$ the estimates (6.3) give

The result now follows upon letting $h \to 0$. $\square$

.x/
.x/ D 1 for all x 2 R:

Therefore
is never zero. Also, since
is differentiable (Lemma 6.6), it is

continuous (Lemma 4.6). So by the Intermediate Value Theorem (Theorem 3.17)
is either positive or negative. Since
.0/ D 1;
must be positive.

We have already seen that 1 C x
.x/ for x 1: Since
is positive, we

must therefore have

1 C x
.x/ for all x 2 R:

Now because
satisfies the functional equation
.x/
.y/ D
.x C y/ which

is evocative of the “same base add the exponents” rule, and because
.1/ D e, it iscustomary to write

.x/ D ex :

The most important properties of this, the exponential function $e^x$, are (arguably) the contents of Lemmas 6.5 and 6.6:

$$e^x e^y = e^{x+y}, \qquad (e^x)' = e^x,$$

and

$$1 + x \le e^x \quad \text{for } x \in \mathbb{R} \text{ (with strict inequality for } x \ne 0). \qquad (6.5)$$

The inequality (6.5) is tremendously useful, as we shall see many times. A graph of $1 + x$ and $e^x$ over $[-1, 2]$ is shown in Fig. 6.1 in Sect. 6.1.
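A quick numerical look at the inequality $1 + x \le e^x$ can make it more concrete. The following Python sketch (the helper name `gap` is ours, for illustration only) tabulates the difference $e^x - (1 + x)$ at a few sample points. It is a sanity check at finitely many points, not a proof.

```python
import math


def gap(x: float) -> float:
    """Return e^x - (1 + x); the inequality 1 + x <= e^x says this is never negative."""
    return math.exp(x) - (1.0 + x)


if __name__ == "__main__":
    for x in (-1.0, -0.5, 0.0, 0.5, 1.0, 2.0):
        print(f"x = {x:5.2f}   1 + x = {1 + x:7.4f}   e^x = {math.exp(x):7.4f}   gap = {gap(x):.4f}")
```

The gap vanishes only at $x = 0$, where the line $y = 1 + x$ is tangent to the graph of $e^x$.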

6.3 The Natural Logarithmic Function

In this section we show that the exponential function has an inverse. To do so, we establish a few more of its properties.

Lemma 6.7. The exponential function $e^x$ has the following properties:
(i) $e^x$ is strictly increasing on $(-\infty, +\infty)$,
(ii) $e^x \to +\infty$ as $x \to +\infty$,
(iii) $e^x \to 0$ as $x \to -\infty$.

Proof. For (i), we have seen that $e^x > 0$. Then since $(e^x)' = e^x$, we must have $(e^x)' > 0$. Therefore $e^x$ is strictly increasing, by Lemma 5.6.
For (ii), we saw in (6.5) that $1 + x \le e^x$. As such, $e^x \to +\infty$ as $x \to +\infty$.
For (iii), by (ii) we have $e^x \to +\infty$ as $x \to +\infty$. Then since $e^{-x} = 1/e^x$ by the functional equation (6.4), we have $e^{-x} \to 0$ as $x \to +\infty$. That is, $e^x \to 0$ as $x \to -\infty$. $\square$

Example 6.8. As regards item (ii) of Lemma 6.7, much more can be said:

This follows from applying L’Hospital’s Rule (Theorem 5.13) $n$ times. It says that as $x \to +\infty$, $e^x \to +\infty$ faster than any polynomial. $\diamond$

Lemma 6.7 shows that the function $x \mapsto e^x$ has an inverse, defined on $(0, \infty)$. This inverse is denoted by $\ln(x)$, and its range is $(-\infty, +\infty)$. This is the natural logarithmic function. Being the inverse of $e^x$, $\ln(x)$ satisfies:

$$e^{\ln(x)} = x \ \text{ for } x > 0 \quad \text{and} \quad \ln(e^x) = x \ \text{ for } x \in \mathbb{R}.$$

Graphs of $e^x$ and $\ln(x)$ are shown in Fig. 6.2.

Fig. 6.2 The graphs of $y = e^x$ and $y = \ln(x)$. Each is the graph of the other, reflected in the line $y = x$

Any property of the exponential function gives rise to a property of the natural logarithmic function, since the latter is the inverse of the former. We list some of these properties below and leave their proofs as an exercise. We shall use them freely without explicit mention.

Lemma 6.9. The natural logarithmic function $\ln(x)$ has the following properties:
(i) $\ln(ab) = \ln(a) + \ln(b)$ for $a, b > 0$,
(ii) $\ln(a/b) = \ln(a) - \ln(b)$ for $a, b > 0$,
(iii) $\ln(a^r) = r \ln(a)$ for $a > 0$ and $r \in \mathbb{R}$,
(iv) $\ln(x)$ is a strictly increasing function,
(v) $\ln(x) \to +\infty$ as $x \to +\infty$,
(vi) $\ln(x) \to -\infty$ as $x \to 0^+$.

Proof. This is Exercise 6.15. $\square$

We saw in Lemma 6.6 that $(e^x)' = e^x$ for all $x$ and so by the Chain Rule,

$$\left(e^{\alpha(x)}\right)' = \alpha'(x)\, e^{\alpha(x)}, \quad \text{for differentiable functions } \alpha(x).$$

Then the relationship $e^{\ln(x)} = x$ appears to imply that $\left(e^{\ln(x)}\right)' = e^{\ln(x)} (\ln(x))' = 1$, and so $(\ln(x))' = 1/x$ for $x > 0$. But this only shows that if $(\ln(x))'$ exists, then it equals $1/x$ (for $x > 0$). One really must show that $(\ln(x))'$ indeed exists. This can be done using Exercise 4.16, but here we do so more directly, similarly in spirit to how we obtained the derivative of $\arctan(x)$ in Sect. 4.3. In inequality (6.5), we replace $x$ with $u - v$ and then with $v - u$ to get

We close this section with two more examples (the first is very simple), in which a property of the exponential function gives rise to a corresponding property of the natural logarithmic function.

Example 6.10. The reader may verify by taking logarithms then dividing by $n$ and $n + 1$ in turn, that the estimates (6.1), i.e., $\left(1 + \frac{1}{n}\right)^{n} < e < \left(1 + \frac{1}{n}\right)^{n+1}$, are equivalent to the equally useful estimates

In this context, the $w_j$’s are called weights. For the special case in which $p_1 = p_2 = \cdots = p_n = 1$, we get $w_j = \frac{1}{n}$ for each $j$ and so indeed $\sum_{j=1}^{n} w_j = 1$, and $\sum_{j=1}^{n} w_j a_j$ is the ordinary Arithmetic Mean $A = \frac{a_1 + a_2 + \cdots + a_n}{n}$.

Example 6.14. In the weighted Arithmetic Mean

$$\frac{2a_1 + 7a_2 + a_3 + 5a_4}{15},$$

we set $w_1 = 2/15$, $w_2 = 7/15$, $w_3 = 1/15$, and $w_4 = 5/15$. Then

$$\frac{2a_1 + 7a_2 + a_3 + 5a_4}{15} = \frac{2}{15} a_1 + \frac{7}{15} a_2 + \frac{1}{15} a_3 + \frac{1}{3} a_4 = \sum_{j=1}^{4} w_j a_j. \qquad \diamond$$

The AGM Inequality (Theorem 2.10) can be extended without too much difficulty to allow for positive rational weights. This is Exercise 6.25.

But there is an even more general version of the AGM Inequality, as follows, which allows all positive real numbers as weights, not only positive rationals. We provide the beautiful proof [30, 81] by American (Hungarian born) mathematician George Polya (1887–1985) which uses (6.5) (or see (6.7)) in the form:

6.6 The Logarithmic Mean

The Logarithmic Mean of the positive numbers $a$ and $b$ is given by

$$L = L(a, b) = \begin{cases} \dfrac{b - a}{\ln(b) - \ln(a)} & \text{if } a \ne b, \\[1ex] a & \text{if } a = b. \end{cases}$$
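The definition translates directly into code. Below is a minimal Python sketch (the function name `log_mean` is ours, chosen for illustration) that handles the two cases of the definition and prints $L$ alongside $\min\{a, b\}$ and $\max\{a, b\}$ for a few sample pairs.

```python
import math


def log_mean(a: float, b: float) -> float:
    """Logarithmic Mean L(a, b) of two positive numbers a and b."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    if a == b:
        return a                       # the case a = b of the definition
    return (b - a) / (math.log(b) - math.log(a))


if __name__ == "__main__":
    for a, b in [(1.0, 2.0), (3.0, 3.0), (0.5, 8.0)]:
        print(f"a = {a}, b = {b}: min = {min(a, b)}, L = {log_mean(a, b):.6f}, max = {max(a, b)}")
```

In floating-point arithmetic the quotient loses accuracy when $a$ and $b$ are extremely close but not equal, so this sketch is meant for experimentation rather than careful numerical work.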

As well as having intrinsic interest, the Logarithmic Mean arises in problems dealing with heat transfer and fluid mechanics. Since $L(a, b) = L(b, a)$, we may suppose that $a \le b$. Applying Cauchy’s Mean Value Theorem (Theorem 5.11) to

$$f(x) = x - a \quad \text{and} \quad g(x) = \ln(x) - \ln(a)$$

on $[a, b]$, there is $c \in (a, b)$ such that

$$\min\{a, b\} \le L \le \max\{a, b\},$$

and so $L$ is indeed a mean. And this justifies the choice $L(a, a) = a$, making $L$ continuous. (The reader might also verify that $L(ta, tb) = tL(a, b)$, for $t > 0$.)

Of course we didn’t really use Cauchy’s Mean Value Theorem here, just the Mean Value Theorem (Theorem 5.2) upside down. But the idea of using Cauchy’s Mean Value Theorem can give us more, as follows. First, recall that for positive numbers $a$ and $b$, their Arithmetic Mean is

After a little manipulation, this yields

$$\left(1 + \frac{1}{x}\right)^{\sqrt{x(x+1)}} \le e \le \left(1 + \frac{1}{x}\right)^{x + 1/2}. \qquad (6.10)$$

These estimates improve (6.9) considerably. See also Exercises 6.9 and 6.45. Since $\sqrt{x(x+1)}$ is the Geometric Mean of $x + 1$ and $x$, and $x + 1/2$ is their Arithmetic Mean, we point out the rather satisfying fact that (6.10) reads:

$$\left(\frac{x+1}{x}\right)^{G(x+1,\, x)} \le e \le \left(\frac{x+1}{x}\right)^{A(x+1,\, x)}.$$

6.7 The Harmonic Series and Some Relatives

(i) The Harmonic series is the infinite series

$$\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \cdots.$$

Each term (after the first) is the Harmonic Mean of the term just before it and the term just after it. As with any infinite series, this expression denotes the limit (if it exists) of the sequence of partial sums $\{S_N\}$. In this case, $\{S_N\} = \left\{\sum_{n=1}^{N} \frac{1}{n}\right\}$.

We saw in the course of obtaining (6.7) that $\ln(1 + x) \le x$ for $x > -1$. Therefore, as in [7, 48] for example,

$$\ln(N + 1) = \sum_{n=1}^{N} \left[\ln(n + 1) - \ln(n)\right] = \sum_{n=1}^{N} \ln\left(1 + \frac{1}{n}\right) \le \sum_{n=1}^{N} \frac{1}{n}.$$

Now since $\ln(N + 1) \to +\infty$ as $N \to +\infty$, we must also have $S_N \to +\infty$. So we write

$$\sum_{n=1}^{\infty} \frac{1}{n} = +\infty.$$
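The comparison $\ln(N + 1) \le S_N$ behind this divergence is easy to observe numerically. The sketch below (the function name `harmonic` is ours, for illustration) computes partial sums of the Harmonic series with `math.fsum` and prints them next to $\ln(N + 1)$.

```python
import math


def harmonic(N: int) -> float:
    """Partial sum S_N = 1 + 1/2 + ... + 1/N of the Harmonic series."""
    return math.fsum(1.0 / n for n in range(1, N + 1))


if __name__ == "__main__":
    for N in (10, 100, 10_000, 1_000_000):
        print(f"N = {N:>9}   ln(N + 1) = {math.log(N + 1):9.5f}   S_N = {harmonic(N):9.5f}")
```

Both columns grow without bound, but only logarithmically, which is why the divergence is so hard to see from a handful of terms.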

We might say that the Harmonic series diverges to $+\infty$. Exercises 6.47–6.50 contain several other proofs of this important fact.

Remark 6.22. The partial sums of the Harmonic series grow without bound, but they do so very slowly. For example, $S_{10{,}000} \approx 9.8$; the smallest $N$ for which $S_N > 20$ is 272400600; and the smallest $N$ for which $S_N > 1{,}000$ is greater than $10^{434}$. $\circ$

(ii) Euler’s constant. In (6.6) we saw that

and this is $> \frac{1}{n}$, by the right-hand side of (6.6). Therefore $\{\gamma_n\}$ is bounded below. So by the Increasing Bounded Sequence Property (Theorem 1.34), $\{\gamma_n\}$ converges to some real number $\gamma \ge 0$. The number $\gamma$ is called Euler’s constant. Since $\gamma_n$ is decreasing, $\gamma < \gamma_1 = 1$, $\gamma < \gamma_2 \approx 0.807$, $\gamma < \gamma_3 \approx 0.735$, etc. In fact,

contexts. Mathematicians typically rate $\gamma$ just below $\pi$ and $e$ in its overall importance in mathematical analysis. Still, $\gamma$ remains elusive. It is not even known whether $\gamma$ is irrational. Another approach to $\gamma$ can be found in [12]. Also, the book [31] is highly recommended. $\circ$

(iii) The Alternating Harmonic series is the infinite series

$$\sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots.$$

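Here $\{S_N\} = \left\{\sum_{n=1}^{N} (-1)^{n+1} \frac{1}{n}\right\}$, and so

$$S_{2N} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots + \frac{1}{2N-1} - \frac{1}{2N}$$

$$= \left(1 - \frac{1}{2}\right) + \left(\frac{1}{3} - \frac{1}{4}\right) + \left(\frac{1}{5} - \frac{1}{6}\right) + \cdots + \left(\frac{1}{2N-1} - \frac{1}{2N}\right).$$

Therefore $\{S_{2N}\}$ is increasing. But also,

$$S_{2N} = 1 - \left(\frac{1}{2} - \frac{1}{3}\right) - \left(\frac{1}{4} - \frac{1}{5}\right) - \cdots - \left(\frac{1}{2N-2} - \frac{1}{2N-1}\right) - \frac{1}{2N},$$

and therefore $\{S_{2N}\}$ is bounded above, by 1. So by the Increasing Bounded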

Using Euler’s constant $\gamma$, we can find $S$ as follows. Observe that

Notice that $\ln(2n) - \ln(n) = \ln(2)$ and $\gamma_n \to$ Euler’s constant $\gamma$. So we have at hand the sum of the Alternating Harmonic series:

$$\sum_{n=1}^{\infty} (-1)^{n+1} \frac{1}{n} = \ln(2) \approx 0.693147.$$

We shall show in Corollary 12.6 that $\ln(2)$ is irrational.
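The convergence to $\ln(2)$ can be watched numerically. The Python sketch below makes two assumptions that should be read as ours rather than the text’s: the function names are illustrative, and $\gamma_n$ is taken to be $1 + \frac{1}{2} + \cdots + \frac{1}{n} - \ln(n)$, which matches the values $\gamma_1 = 1$, $\gamma_2 \approx 0.807$ and $\gamma_3 \approx 0.735$ quoted earlier. With that reading, the even partial sums satisfy the standard identity $S_{2n} = \gamma_{2n} - \gamma_n + \ln(2)$, which the code checks for one value of $n$.

```python
import math


def alt_harmonic(N: int) -> float:
    """Partial sum S_N of the Alternating Harmonic series 1 - 1/2 + 1/3 - ..."""
    return math.fsum((-1) ** (n + 1) / n for n in range(1, N + 1))


def gamma_n(n: int) -> float:
    """Assumed form of gamma_n: (1 + 1/2 + ... + 1/n) - ln(n)."""
    return math.fsum(1.0 / k for k in range(1, n + 1)) - math.log(n)


if __name__ == "__main__":
    for N in (10, 1_000, 100_000):
        print(f"N = {N:>7}   S_N = {alt_harmonic(N):.6f}   ln(2) = {math.log(2):.6f}")
    n = 50
    print(alt_harmonic(2 * n), gamma_n(2 * n) - gamma_n(n) + math.log(2))
```

The error after $N$ terms is roughly $1/(2N)$, so the series is a poor way to compute $\ln(2)$ in practice, even though it identifies the sum exactly.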

Remark 6.24. For the Alternating Harmonic series we have (for example)

$$S = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots$$

$$= \left(1 - \frac{1}{2}\right) - \frac{1}{4} + \left(\frac{1}{3} - \frac{1}{6}\right) - \frac{1}{8} + \left(\frac{1}{5} - \frac{1}{10}\right) - \frac{1}{12} + \cdots$$

$$= \frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \frac{1}{8} + \frac{1}{10} - \frac{1}{12} + \cdots$$

$$= \frac{1}{2}\left(1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots\right)$$

$$= \frac{1}{2} S.$$

This means that the sum $S$ is dependent on how the terms are arranged! There is a theorem (see [42] or [61]) due to the German mathematician Bernhard Riemann (1826–1866) which implies that for any real number $S$, there is a rearrangement of the Alternating Harmonic series which sums to $S$. And there are rearrangements which diverge to each of $\pm\infty$ as well. We address this phenomenon a little more in Exercise 6.57 and in Sect. 10.1. But for more about rearrangements of the Alternating Harmonic series, see for example, [5, 8, 20, 46]. $\circ$

(iv) For any real number $p$, the associated $p$-series is the infinite series

$$\sum_{n=1}^{\infty} \frac{1}{n^p} = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \frac{1}{4^p} + \frac{1}{5^p} + \cdots.$$

For $p = 1$ this is simply the Harmonic series and so it diverges. It is easy to see, and we leave this for Exercise 6.58, that a $p$-series diverges also for $p < 1$. Here we show, as in [19], that a $p$-series converges for $p > 1$. For the $N$th partial sum

and so it converges, by the Increasing Bounded Sequence Property (Theorem 1.34). We have then:

$$\text{the } p\text{-series } \sum_{n=1}^{\infty} \frac{1}{n^p} \text{ converges} \iff p > 1.$$

Taking $p = 2$ in the analysis above, we get

$$\sum_{n=1}^{\infty} \frac{1}{n^2} < 2.$$

We shall see in Theorem 12.7 that in fact,

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} \approx 1.645.$$

Being the first to find the sum of this series was one of Euler’s many great triumphs.

Remark 6.25. Taking $p = 3$, it is the case that

$$\sum_{n=1}^{\infty} \frac{1}{n^3} < \frac{1}{1 - \frac{1}{4}} = \frac{4}{3},$$

but the precise value of this sum is not known. It was proved only in 1979, by the French mathematician Roger Apéry (1916–1994), that $\sum_{n=1}^{\infty} \frac{1}{n^3} \approx 1.202$ is irrational. The values of the sums $\sum_{n=1}^{\infty} \frac{1}{n^p}$ are known if $p$ is a positive even integer. $\circ$
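The bounds and values quoted for $p = 2$ and $p = 3$ can be checked against partial sums. Here is a minimal Python sketch (the function name `p_series_partial` is ours, for illustration). Partial sums only approximate the infinite sums from below, so agreement to a few decimal places is all one should expect.

```python
import math


def p_series_partial(p: float, N: int) -> float:
    """N-th partial sum 1 + 1/2^p + ... + 1/N^p of the p-series."""
    return math.fsum(1.0 / n ** p for n in range(1, N + 1))


if __name__ == "__main__":
    N = 100_000
    print(f"p = 2: partial sum = {p_series_partial(2, N):.6f}, pi^2/6 = {math.pi ** 2 / 6:.6f}, bound 2")
    print(f"p = 3: partial sum = {p_series_partial(3, N):.6f}, bound 4/3 = {4 / 3:.6f}")
```

For $p = 2$ the partial sums approach $\pi^2/6 \approx 1.645$ and for $p = 3$ they settle near $1.202$, comfortably below the bounds $2$ and $4/3$.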

(b) Conclude, in particular, that

$$\lim_{n \to \infty} \frac{(n!)^{1/n}}{n} = \frac{1}{e}.$$

Hint: In $\left(1 + \frac{1}{k}\right)^{k} < e < \left(1 + \frac{1}{k}\right)^{k+1}$, take the product for $k = 1, 2, \ldots, n$.

(b) Denote by $A_n$ the Arithmetic Mean and by $G_n$ the Geometric Mean, of the first $n$ natural numbers. Show that the result in (a) is the same as

$$\lim_{n \to \infty} \frac{G_n}{A_n} = \frac{2}{e}.$$

Other approaches to this problem can be found in [13, 44, 82]. It is generalized in various directions in [43, 68, 71, 83].

6.13. Suppose that a certain population at year $t \ge 0$ is given (approximately) by $P(t) = Ce^{kt}$, and that the population’s growth is $r\%$ per year. Show that the population doubles in size every $\ln(2)/\ln(1 + r/100)$ years.

6.14. Let $x_1 > 1$ and for $n = 1, 2, \ldots$, let $x_{n+1} = \dfrac{x_n}{\ln(x_n)}$. Show that $\{x_n\}$ converges and find the limit.

6.15. Prove Lemma 6.9.

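6.16. [41] Here’s a way to show that $(\ln(x))' = \frac{1}{x}$, assuming we already know that $\left(1 + \frac{1}{u}\right)^{u} \to e$ as $u \to \infty$. Verify that

$$\frac{\ln(x + h) - \ln(x)}{h} = \frac{1}{x} \ln\left(\left(1 + \frac{h}{x}\right)^{x/h}\right), \quad \text{then let } h \to 0.$$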

6.17. Show that for $a > 0$,

$$(a^x)' = a^x \ln(a)$$

and for differentiable functions $\alpha(x)$,

$$\left(a^{\alpha(x)}\right)' = a^{\alpha(x)} \ln(a)\, \alpha'(x).$$

6.18. (a) Use the Chain Rule to find $(\ln(ax))'$.

(b) Conclude that $\ln(ab) = \ln(a) + \ln(b)$ for $a, b > 0$.

6.19. The logarithmic function with base $a > 0$ but $a \ne 1$ is defined by

(c) Show that for $x > 1$ and $L > 0$,

$$\frac{d}{dx} \log_x(L) = -\frac{\log_x(L)}{x \ln(x)}.$$

6.20. [28] Fill in the details of the following proof that $\ln(x)$ is not a rational function. If it were, we could write $\ln(x) = \frac{p(x)}{q(x)}$, where $p$ and $q$ are polynomials with no common factors. Now differentiate both sides of this expression to obtain a contradiction.

apply the AGM Inequality to a suitable collection of $M$ numbers, which contains (perhaps lots of) repetition.

6.26. [30] Fill in the details of another proof of the weighted AGM Inequality (Theorem 6.15), as follows. Set $A = \sum_{j=1}^{n} w_j a_j$ and $x = a_j / A$ in (6.7), i.e., in

$$\ln(x) \le x - 1 \quad \text{for } x > 0.$$

Now multiply by $w_j$, and sum. This is a 1930 proof by Hungarian mathematician Frigyes Riesz (1880–1956). It is the logarithmic companion of the proof we gave (i.e., G. Polya’s) of Theorem 6.15.

6.27. [50, 64, 66] Fill in the details of another proof of the weighted AGM Inequality (Theorem 6.15), as follows. Set $G = \prod_{j=1}^{n} a_j^{w_j}$ and $x = a_j / G$ in (6.7): $\ln(x) \le x - 1$ for $x > 0$. Now multiply by $w_j$, and sum. This is the Geometric Mean companion to Riesz’s proof from Exercise 6.26.

6.28. [69] Fill in the details of another proof of the weighted AGM Inequality (Theorem 6.15), as follows.

(a) Verify that

on $[a, b]$ to show that $L \le \frac{2}{3} G + \frac{1}{3} A \le A$. This refines the inequality $L \le A$ from Lemma 6.20. (This was first obtained by other methods in [60].)

6.44. Let $a, b > 0$ and denote by $G$ and $L$ the Geometric and Logarithmic Means of $a$ and $b$ respectively. Apply Cauchy’s Mean Value Theorem (Theorem 5.11) to

6.47. [18, 19, 25, 26]

(a) Fill in the details of the following proof, due to American mathematician Leonard Gillman (1917–2009), that the Harmonic series diverges:

$$S = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots$$

$$> \left(\frac{1}{2} + \frac{1}{2}\right) + \left(\frac{1}{4} + \frac{1}{4}\right) + \left(\frac{1}{6} + \frac{1}{6}\right) + \cdots$$

$$= S.$$

(b) And here’s a similar argument, though slightly more complicated. If $S = \sum_{n=1}^{\infty} \frac{1}{n}$, then $\frac{1}{2} S = \frac{1}{2} \sum_{n=1}^{\infty} \frac{1}{n} = \sum_{n=1}^{\infty} \frac{1}{2n}$ and so we must have $\sum_{n=1}^{\infty} \frac{1}{2n-1} = \frac{1}{2} S$ also. Show that this leads to a contradiction.

6.48. [79] Fill in the details of another proof that the Harmonic series diverges: If $S = \sum_{n=1}^{\infty} \frac{1}{n}$ exists then $S_{2N} - S_N \to 0$, where $S_N = \sum_{n=1}^{N} \frac{1}{n}$. Show that $S_{2N} - S_N \ge \frac{1}{2}$ to obtain a contradiction.

6.49. [21] Fill in the details of another proof that the Harmonic series diverges: Obtain a contradiction by observing that

$$\sum_{n=1}^{\infty} \frac{1}{n} = \sum_{n=0}^{\infty} \left(\frac{1}{2n+1} + \frac{1}{2n+2}\right) = \sum_{n=0}^{\infty} \left(\frac{1}{n+1} + \frac{1}{(2n+1)(2n+2)}\right).$$

6.50. [17] Fill in the details of another proof that the Harmonic series diverges:

(a) Prove (or at least recall) the well-known fact (e.g., Exercise 5.35) that for the Fibonacci sequence $f_1 = 1$, $f_2 = 1$, $f_{n+2} = f_n + f_{n+1}$, it is the case that $f_{n+1}/f_n \to \varphi$, where $\varphi$ is the golden mean, i.e., the positive root of $x^2 - x - 1 = 0$.

(b) By collecting successive blocks of the Harmonic series whose lengths are the Fibonacci numbers, show that

(b) Can you sum other rearrangements in a similar way?

6.58. (a) Show that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ diverges for $p \le 1$.

(b) Use

$$S_N = \sum_{n=1}^{N} \frac{1}{n^2} = 1 + \sum_{n=2}^{N} \frac{1}{n^2} < 1 + \sum_{n=2}^{N} \frac{1}{(n-1)n}$$

to show that $\sum_{n=1}^{\infty} \frac{1}{n^2} \le 2$. (The series on the right-hand side is a telescoping series.)

(c) Use (b) to show that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges for $p \ge 2$. (We showed in Sect. 6.7 that it also converges for $1 < p < 2$.)

6.59. [57] Here is an extension of Example 1.32, which shows that $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges for $p > 1$.

(a) Verify that for natural numbers $r > 1$,

Suppose that we stack cubes with side lengths $1, 1/2, 1/3, 1/4, 1/5, \ldots$ together, as shown in Fig. 6.4.

(a) Looking at the side view, the height of each vertical stack is $> 1/2$ and so we have $\sum_{n=1}^{\infty} \frac{1}{n} = +\infty$.

(b) Looking again at the side view, the total area obtained by taking one face of each cube gives $\sum_{n=2}^{\infty} \frac{1}{n^2} < 1$, and so $\sum_{n=1}^{\infty} \frac{1}{n^2} < 2$.

(c) Looking at the full view, all of the cubes are inside a $1 \times 1 \times \frac{3}{2}$ box. Therefore their total volume gives $\sum_{n=1}^{\infty} \frac{1}{n^3} < \frac{3}{2}$.

One cannot fix one’s eyes on the commonest natural production

without finding food for a rambling fancy. —Mansfield Park, by Jane Austen

In this chapter, which is independent of all subsequent chapters, we allow ourselves a brief diversion. We have met and used Rolle’s Theorem (Theorem 5.1), its extension the Mean Value Theorem (Theorem 5.2), and its extension Cauchy’s Mean Value Theorem (Theorem 5.11). Here we consider other Mean Value-type theorems. Each of these, as with their namesake, has an appealing geometric interpretation. For convenience we recall below the Mean Value Theorem.

Theorem 5.2. (Mean Value Theorem): Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $c \in (a, b)$ such that

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

7.1 Darboux’s Theorem

We begin with a preliminary result, which extends Rolle’s Theorem (Theorem 5.1) to cases in which $f'(b)$ exists. It was obtained by D.H. Trahan [18] in 1966.

Lemma 7.1. Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b]$, with

$$\left[f(b) - f(a)\right] f'(b) \le 0.$$

Then there exists $c \in (a, b]$ such that $f'(c) = 0$.

Proof. If $f(b) - f(a) = 0$ then the result holds, by Rolle’s Theorem (Theorem 5.1). If $f'(b) = 0$ then the result holds, with $c = b$. Otherwise, we consider the function $g$ defined on $[a, b]$ via

The Intermediate Value Theorem (Theorem 3.17) says, in short, that a continuous function satisfies the Intermediate Value Property. It is a rather surprising fact that derivatives (which need not be continuous) also satisfy the Intermediate Value Property. This discovery was made in 1875 by French mathematician J.G. Darboux (1842–1917).

We provide a 2004 proof due to L. Olsen [10], which is a natural extension of the proof of Lemma 7.1: the $g$ in the proof of Lemma 7.1 is the $g_1$ in the proof below. This is different from the proof found in most textbooks, which we leave for Exercise 7.3. (See [4, 8] for two other clever proofs.)

Theorem 7.2. (Darboux’s Theorem) Let $f$ be differentiable on $[a, b]$. Let $y$ be between $f'(a)$ and $f'(b)$. Then there is $c \in (a, b)$ such that $f'(c) = y$.

$$\lim_{x \to 0} f'(x) \ne f'(0).$$

Therefore $f'$ is not continuous at $x = 0$. Even so, by Darboux’s Theorem

7.2 Flett’s Mean Value Theorem

The following Mean Value-type theorem was discovered by T.M. Flett in 1958 [3]. Our proof follows Trahan’s paper [18], which uses Lemma 7.1. (And the $g_2$ in the proof of Darboux’s Theorem (Theorem 7.2) is the $g$ in the proof below.) Another proof can be found in [12]; see also [1].

Theorem 7.4. (Flett’s Mean Value Theorem) Let $f$ be differentiable on $[a, b]$, with $f'(a) = f'(b)$. Then there exists $c \in (a, b)$ such that

Finally, setting c D 1 ; this reads

7.4 A Related Result

We close this chapter with another Mean Value-type theorem [6], some variants of which we explore in the exercises. Let $f$ be defined on $[a, b]$. For $c \in [a, b]$ we denote by $C = (c, f(c))$ any point on the graph of $f$ and by

$$M = \left(\frac{a + b}{2}, \frac{f(a) + f(b)}{2}\right)$$

the midpoint of the chord from $(a, f(a))$ to $(b, f(b))$.

Recall that two lines are perpendicular if their slopes are negative reciprocals of each other. The following result says that for a function $f$ differentiable on $(a, b)$, either $M$ is on the graph of $f$ or there is a $C$ such that the line through $M$ and $C$ is perpendicular to the tangent line at $C$. See Fig. 7.4.

Theorem 7.6. Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $c \in [a, b]$ such that

$$f'(c) \left[f(c) - \frac{f(a) + f(b)}{2}\right] = -\left(c - \frac{a + b}{2}\right).$$

Proof. Define $h$ on $[a, b]$ via

$$h(x) = (x - a)^2 + \left(f(x) - f(a)\right)^2 + (x - b)^2 + \left(f(x) - f(b)\right)^2.$$

Then $h$ is continuous on $[a, b]$ and differentiable on $(a, b)$ and, as the reader can easily verify, $h(a) = h(b)$. So we apply Rolle’s Theorem (Theorem 5.1) to $h$ to conclude that there is $c \in (a, b)$ such that

7.7. Suppose that you take a ride on a roller coaster. Show that there is a moment during the ride at which your instantaneous speed is equal to your average speed up to that moment.

7.8. Consider Flett’s Mean Value Theorem (Theorem 7.4), but from the other side. That is: Let $f$ be differentiable on $[a, b]$ with $f'(a) = f'(b)$. Then there exists $c \in (a, b)$ such that

to obtain a version which does not require $f'(a) = f'(b)$.

$$h(x) = (b - x)\left[f(x) - f(a)\right]$$

to prove the following: Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then there exists $c \in (a, b)$ such that

$$f'(c) = \frac{f(c) - f(a)}{b - c}.$$

(b) Show that the triangle formed by the $x$-axis, the tangent line at $(c, f(c))$, and the line through $(c, f(c))$ and $(b, f(a))$, is isosceles.

7.11. In Pompeiu’s Mean Value Theorem (Theorem 7.5), is it necessary that the interval $[a, b]$ does not contain 0? Explain.

7.12. (a) What does the function $h$ in the proof of Theorem 7.6 represent geometrically?

(b) Prove Theorem 7.6 by instead applying Rolle’s Theorem (Theorem 5.1) to

$$g(x) = \left(x - \frac{a + b}{2}\right)^2 + \left(f(x) - \frac{f(a) + f(b)}{2}\right)^2.$$

(c) What does the function $g$ in (b) represent geometrically?

7.13. [6](a) Apply Rolle’s Theorem (Theorem 5.1) to

h.x/ D Œx a2 Œf .x/ f .a/2 C Œx b2 Œf .x/ f .b/2

to prove the following: Let f be continuous on Œa; b and differentiable on

A smile is a curve that sets everything straight.

– Phyllis Diller

In this chapter we consider the higher derivatives of a function $f$. These are $f'' = (f')'$, $f^{(3)} = (f'')'$, etc. We extend the Mean Value Theorem to an analogous statement about the second derivative, and this takes us naturally to the notion of convexity. Once there, we meet the very important Jensen’s Inequality. Then we extend the Mean Value Theorem to the $(n+1)$st derivative; this is Taylor’s Theorem. We prove that $e$ is irrational and we take a brief look at Taylor series.

8.1 Higher Derivatives

We saw in Sect. 4.1 that

$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) - f(x_0)}{h},$$

whenever this limit exists. It is often useful to consider higher derivatives of $f$, whenever possible:

$$f''(x_0) = \lim_{h \to 0} \frac{f'(x_0 + h) - f'(x_0)}{h}, \qquad f^{(3)}(x_0) = \lim_{h \to 0} \frac{f''(x_0 + h) - f''(x_0)}{h}, \qquad \text{etc.}$$

Generally, we write $f^{(0)} = f$ and $f^{(1)} = f'$, then

$$f^{(n)}(x_0) = \lim_{h \to 0} \frac{f^{(n-1)}(x_0 + h) - f^{(n-1)}(x_0)}{h} \quad \text{for } n = 2, 3, 4, \ldots$$

whenever these limits exist. $f^{(n)}(x_0)$ is called the $n$th derivative of $f$ at $x_0$. As we have already seen with $f'$, it is common to replace $x_0$ with simply $x$, when $f''$, $f^{(3)}$, $f^{(4)}$ etc. are to be thought of as functions.

$$f^{(2n-1)}(x) = (-1)^{n-1} \cos(x), \qquad f^{(2n)}(x) = (-1)^{n} \sin(x),$$

$$f^{(n)}(x) = \sin\left(x + \frac{n\pi}{2}\right). \qquad \diamond$$

Then we find the $a_k$’s as follows. Taking derivatives up to order $j$ $(0 \le j \le n)$ of each side we get

$$n(n-1)\cdots(n-(j-1))\,(1 + x)^{n-j} = \sum_{k=j}^{n} k(k-1)\cdots(k-(j-1))\, a_k x^{k-j}.$$

If we set $x = 0$ here, the only nonzero term on the right-hand side is that for which $k = j$. Then solving for $a_j$ we get

$$a_j = \frac{n(n-1)\cdots(n-(j-1))}{j(j-1)\cdots(j-(j-1))} = \frac{n!}{(n-j)!\,j!}.$$

And so we have obtained the Binomial formula

$$(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k,$$

where the $\binom{n}{k}$’s are the binomial coefficients

$$\binom{n}{k} = \frac{n!}{(n-k)!\,k!}.$$

We leave it for Exercise 8.3 to find an expression for $(a + b)^n$. See also [19]. $\diamond$
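The formula $a_j = \frac{n(n-1)\cdots(n-(j-1))}{j!}$ obtained by repeated differentiation is easy to check by machine. The following Python sketch (the function name `binom_via_derivatives` is ours, for illustration) builds the falling product in the numerator and compares the result with the standard library’s `math.comb`.

```python
import math


def binom_via_derivatives(n: int, j: int) -> int:
    """Coefficient a_j = n(n-1)...(n-(j-1)) / j!, from differentiating (1 + x)^n j times."""
    numerator = 1
    for i in range(j):
        numerator *= n - i               # builds n (n-1) ... (n-(j-1))
    # j! always divides this product, so integer division is exact
    return numerator // math.factorial(j)


if __name__ == "__main__":
    n = 6
    coefficients = [binom_via_derivatives(n, j) for j in range(n + 1)]
    print(coefficients)                   # coefficients of (1 + x)^6
    print(coefficients == [math.comb(n, j) for j in range(n + 1)])
```

Setting $x = 1$ in the Binomial formula shows that the coefficients for a given $n$ must add up to $2^n$, which gives one more easy check.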

Remark 8.3. Let $k, n \in \mathbb{N}$, with $k \le n$. The number of ways of selecting a $k$-element set from a set having $n$ elements is the binomial coefficient $\binom{n}{k}$. The number of ways of arranging $k$ distinct elements is $k!$. The number of arrangements of $k$ elements taken from a set having $n$ elements is $k! \binom{n}{k} = \frac{n!}{(n-k)!}$. $\circ$

Example 8.4. Here is a neat fact about derivatives which we will use in the next section. It provides a way of computing $f''$ using $f$, but not $f'$. We show that if $f$ is defined on an open interval containing $x$, and if $f''$ exists, then

$$f''(x) = \lim_{h \to 0} \frac{f(x + h) - 2f(x) + f(x - h)}{h^2}.$$
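This limit is also the basis of a common numerical recipe for second derivatives. As a hedged illustration, the Python sketch below (the function name `second_difference` is ours) evaluates the quotient for a fixed small $h$ and compares it with the known second derivative of $\sin(x)$, namely $-\sin(x)$.

```python
import math
from typing import Callable


def second_difference(f: Callable[[float], float], x: float, h: float) -> float:
    """Symmetric quotient (f(x+h) - 2 f(x) + f(x-h)) / h^2, which tends to f''(x) as h -> 0."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


if __name__ == "__main__":
    x = 1.0
    for h in (1e-1, 1e-2, 1e-3):
        approx = second_difference(math.sin, x, h)
        print(f"h = {h:.0e}   quotient = {approx:.8f}   -sin(x) = {-math.sin(x):.8f}")
```

In floating-point arithmetic $h$ should not be taken too small: the numerator then suffers from cancellation, and dividing by $h^2$ magnifies the rounding error.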

With h as the independent variable, we apply L’Hospital’s Rule (Theorem 5.13)

is the average of the right-hand and left-hand derivatives of $f$ at $x$. (See Exercise 4.10.) This average is called the Schwarz derivative, or the symmetric derivative. It may exist even when $f'(x)$ does not: Consider $f(x) = |x|$ at $x = 0$. For Rolle-type theorems and Mean Value-type theorems for the symmetric derivative, see [2]. For Flett’s Mean Value Theorem (Theorem 7.4) as regards the symmetric derivative, see [39]. See also Exercise 8.17. $\circ$

Suppose now that $f$ is differentiable on an open interval $I$ and that $x_0 \in I$. The Mean Value Theorem (Theorem 5.2) says that for each $x \in I$ there is $c$ between $x$ and $x_0$ such that

So the Mean Value Theorem for the Second Derivative (Theorem 8.6) says that $\left|\frac{f''(c)}{2}(x - x_0)^2\right|$ is the error which comes about in approximating $f(x)$ with the linear function $L(x) = f(x_0) + f'(x_0)(x - x_0)$.

If $f''(x) = 0$ for all $x \in (a, b)$ then by the Mean Value Theorem for the Second Derivative (Theorem 8.6), $f$ must be a linear function. (Compare with Lemma 5.6.)

If $f''(x) \ge 0$ for all $x \in (a, b)$ then applying Lemma 5.6 to $f'' = (f')'$, we see that $f'$ is increasing on $(a, b)$. In freshman calculus, the graph of such a function is usually called concave upward. (And $-f$ is called concave downward.)

The following is immediate from the Mean Value Theorem for the Second Derivative (Theorem 8.6). It says that the graph of a function which is concave upward lies on or above all of its tangent lines.

Lemma 8.7. Let $f$ be such that $f'' \ge 0$ on $(a, b)$, and let $x_0 \in (a, b)$. Then for each $x \in (a, b)$,

$$f(x) \ge f(x_0) + f'(x_0)(x - x_0).$$

Proof. Let $x_0 \in (a, b)$. By the Mean Value Theorem for the Second Derivative (Theorem 8.6), there is $c$ between $x$ and $x_0$ such that

Exercise 8.19 contains a neat proof that $x^e < e^x$ implies the AGM Inequality (Theorem 2.10). $\diamond$

Also immediate from the Mean Value Theorem for the Second Derivative (Theorem 8.6) is the Second Derivative Test, which we leave for Exercise 8.18. It says that if $f''$ exists on $(a, b)$ and if for some $c \in (a, b)$ we have $f'(c) = 0$, then $f''(c) > 0$ implies that $f$ has a local minimum at $c$, and $f''(c) < 0$ implies that $f$ has a local maximum at $c$. See also Exercise 8.54.

Fig. 8.2 $(1 - t)x + ty$ for a few values of $t \in [0, 1]$

Now let $f$ be a function defined on some interval $I$. Then $f$ is convex on $I$ means that for any $x, y \in I$ with $x < y$,

$$f\left((1 - t)x + ty\right) \le (1 - t) f(x) + t f(y) \quad \text{for every } t \in (0, 1).$$

And $f$ being strictly convex means that the $\le$ above can be replaced with $<$. So geometrically, a strictly convex function is one whose graph lies below all of its chords. See Fig. 8.3.

Applying the exponential function to both sides we obtain

$$x^{1-t} y^{t} < (1 - t)x + ty.$$

This is the weighted AGM Inequality with $n = 2$ (Corollary 6.16). We have seen that this is equivalent to Young’s Inequality (Corollary 6.19):

$$ab \le \frac{a^p}{p} + \frac{b^q}{q} \quad \text{where } \frac{1}{p} + \frac{1}{q} = 1.$$
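Both inequalities are simple to probe numerically. The Python sketch below uses illustrative function names of our own choosing and assumes $p > 1$, so that the conjugate exponent $q = p/(p-1)$ is positive. It computes the two gaps, right-hand side minus left-hand side, at sample points. This is a spot check, not a proof.

```python
def agm_gap(x: float, y: float, t: float) -> float:
    """(1 - t) x + t y - x^(1-t) y^t for x, y > 0 and 0 < t < 1; nonnegative by the weighted AGM Inequality."""
    return (1.0 - t) * x + t * y - x ** (1.0 - t) * y ** t


def young_gap(a: float, b: float, p: float) -> float:
    """a^p / p + b^q / q - a b, with q the conjugate exponent of p > 1; nonnegative by Young's Inequality."""
    q = p / (p - 1.0)                     # ensures 1/p + 1/q = 1
    return a ** p / p + b ** q / q - a * b


if __name__ == "__main__":
    print(agm_gap(1.0, 4.0, 0.5))         # arithmetic mean 2.5 minus geometric mean 2.0
    print(young_gap(2.0, 3.0, 2.0))       # 2 + 4.5 - 6
    print(young_gap(3.0, 3.0, 2.0))       # equality case, since a^p = b^q
```

With $p = q = 2$, Young’s Inequality reduces to $ab \le (a^2 + b^2)/2$, so the gap is $(a - b)^2/2$ and vanishes exactly when $a = b$.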

These results can also be obtained by using the fact that $e^x$ is convex on $(-\infty, +\infty)$. We leave this for Exercise 8.34. $\diamond$

We close this section by showing that the converse of Lemma 8.12 is also true, as long as $f''$ exists.

Lemma 8.16. If $f$ is convex on $(a, b)$ and $f''$ exists, then $f'' \ge 0$. That is, if $f$ is convex on $(a, b)$ and $f''$ exists there, then $f$ is concave upward on $(a, b)$.

Proof. Let $x \in (a, b)$ and choose $h > 0$ small enough that $(x - h, x + h) \subset (a, b)$. We write $x = \left(1 - \frac{1}{2}\right)(x - h) + \frac{1}{2}(x + h)$. Then since $f$ is convex,

and so we must have $f''(x) \ge 0$, as desired. $\square$

The paper [7] contains a thorough treatment of many of the various geometric characterizations of a convex function.

8.3 Jensen’s Inequality

The big theorem in the world of convex functions is due to Danish mathematician J.W. Jensen (1859–1925). Many of the most important results related to convexity follow from Jensen’s Inequality. In the definition of convexity, we have

$$f\left((1 - t)x + ty\right) \le (1 - t) f(x) + t f(y).$$

The idea in Jensen’s Inequality is that the $x$ and $y$ can be replaced by any number of points in $I$, and the $(1 - t)x + ty$ can be replaced by any weighted Arithmetic Mean

For an interesting geometric explanation of Jensen’s Inequality (Theorem 8.17),

Example 8.18. We showed in Example 8.15 that the concavity of $\ln(x)$ can be used to obtain the weighted AGM Inequality with $n = 2$ (Corollary 6.16). More generally, the concavity of $\ln(x)$ and Jensen’s Inequality (Theorem 8.17) can be used to obtain

After some tidying, this reads

$$\left(\sum_{j=1}^{n} a_j b_j\right)^{2} \le \sum_{j=1}^{n} a_j^2 \, \sum_{j=1}^{n} b_j^2,$$

which is the Cauchy–Schwarz Inequality. $\diamond$

We close this section by showing that Jensen’s Inequality (Theorem 8.17) actually holds assuming only that $f$ is convex, that is, without assuming even that $f$ is continuous, much less $f'' \ge 0$. We prove it for the equal weights case (8.1) and leave the more general version for Exercise 8.39. The proof is exactly analogous to Cauchy’s proof of the AGM Inequality (Theorem 2.10) which we provided at the

end of Sect. 2.2. In fact, it was a careful analysis of Cauchy’s proof of the AGM inequality which led Jensen to discover his inequality and thus initiate the study of convex functions [40].

Proof. If $n = 2$ then Jensen’s Inequality is simply the convexity condition (with $t = 1/2$). If $n = 4$ we use the condition twice:

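$$f\left(\tfrac{1}{4}(x_1 + x_2 + x_3 + x_4)\right) = f\left(\tfrac{1}{2} \cdot \tfrac{1}{2}[x_1 + x_2] + \tfrac{1}{2} \cdot \tfrac{1}{2}[x_3 + x_4]\right)$$

$$\le \tfrac{1}{2} f\left(\tfrac{1}{2}[x_1 + x_2]\right) + \tfrac{1}{2} f\left(\tfrac{1}{2}[x_3 + x_4]\right)$$

$$= \tfrac{1}{2}\left[f\left(\tfrac{1}{2}[x_1 + x_2]\right) + f\left(\tfrac{1}{2}[x_3 + x_4]\right)\right]$$

$$\le \tfrac{1}{2}\left[\tfrac{1}{2}\left(f(x_1) + f(x_2)\right) + \tfrac{1}{2}\left(f(x_3) + f(x_4)\right)\right]$$

$$= \tfrac{1}{4}\left(f(x_1) + f(x_2) + f(x_3) + f(x_4)\right).$$

And for $n = 8$, we would use the $n = 4$ case twice. Etcetera: We could continue this procedure indefinitely, and so we may assume that Jensen’s Inequality holds for any $n$ of the form $2^m$ $(m \ge 0)$. For any (other) $n$, we choose $m$ so large that $2^m > n$. Now writing

$$A = \frac{1}{n} \sum_{j=1}^{n} x_j,$$

we observe that

$$\frac{x_1 + x_2 + \cdots + x_n + (2^m - n)A}{2^m} = A.$$

The numerator of the left-hand side here has $2^m$ members in the sum and so we can apply what we have proved so far to see that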

8.4 Taylor’s Theorem: e Is Irrational

Looking at the Mean Value Theorem (Theorem 5.2) and then the Mean Value Theorem for the Second Derivative (Theorem 8.6) one might ask, “why stop at two derivatives?” Indeed, continuing on to the $(n+1)$st derivative yields the important theorem below, named for English mathematician Brook Taylor (1685–1731). The proof we provide is just an extension of the proof of the Mean Value Theorem for the Second Derivative. (We shall prove it an entirely different way in Sect. 11.4.)

Theorem 8.20. (Taylor’s Theorem) Let $f$ be such that $f^{(n+1)}$ exists on some open interval $I$ and let $x_0 \in I$. Then for each $x \in I$ there is $c$ between $x$ and $x_0$ such that

So we might expect that the error should be small if $n$ is large and/or if $x$ is close to $x_0$. And this expectation seems to be supported by the form of the remainder term. (See also Exercise 8.59.) Of course, having $n = 0$ and $n = 1$ gives the Mean Value Theorem (Theorem 5.2) and the Mean Value Theorem for the Second Derivative (Theorem 8.6) respectively.

Example 8.21. For $f(x) = e^x$ and for $k = 0, 1, 2, \ldots$,

$$f^{(k)}(x) = e^x \quad \text{and so} \quad f^{(k)}(0) = 1.$$

So by Taylor’s Theorem (Theorem 8.20), with $x_0 = 0$, there is $c$ between 0 and $x$

The first term on the right-hand side is an integer for any $n$. If $n \ge b$, then the left-hand side is an integer. And if $n > e^c$ then the second term on the right-hand side is between 0 and 1. So choosing $n > \max\{e^c, b\}$ yields a contradiction. Therefore, $e$ must be irrational. $\diamond$

We generalize Example 8.22 considerably in Theorem 12.3 and then Corollary 12.4, showing that $e^r$ is irrational for any nonzero rational number $r$.

Remark 8.23. Euler was the first to prove that $e$ is irrational, in 1737. Saying that $e$ is irrational is the same as saying that $e$ is not the solution to any linear equation $ax + b = 0$ with integer coefficients. The French mathematician J. Liouville (1809–1882) proved around 1844 that $e$ is not a solution to any quadratic equation $ax^2 + bx + c = 0$ with integer coefficients. The French mathematician C. Hermite (1822–1901) proved in 1873 that $e$ is not a solution to any polynomial equation of any degree with integer coefficients. That is, $e$ is not an algebraic number; it is a transcendental number. $\circ$

Example 8.24. For $x > 0$ and $f(x) = \ln(x)$, and for $k = 1, 2, 3, \ldots$,

8.5 Taylor Series

Now if $f$ has derivatives of all orders and it so happens that for a given $x$, it is the case that the remainder term

$$\frac{f^{(n+1)}(c)}{(n+1)!}(x - x_0)^{n+1} \to 0 \quad\text{as } n \to \infty,$$

then we may reasonably write (for such $x$):

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n.$$

This is called the Taylor series for $f$ about the point $x = x_0$. If $x_0 = 0$ it is often called the Maclaurin series for $f$, for Scottish mathematician Colin Maclaurin (1698–1746).

Example 8.25. Again, for $f(x) = e^x$ and $x_0 = 0$, the remainder term is

$$\frac{f^{(n+1)}(c)}{(n+1)!}(x - x_0)^{n+1} = \frac{e^c x^{n+1}}{(n+1)!}.$$

We claim that for any given $x \in \mathbb{R}$,

$$\frac{x^N}{N!} \to 0 \quad\text{as } N \to \infty.$$

Then since $c$ is between $0$ and $x$, the remainder term

(b) Apply Rolle’s Theorem (Theorem 5.1) to $F$ on $[x, x_0]$ to show there is $c \in$

for the Schwarz derivative, or the symmetric derivative. The second Schwarz derivative, or the second symmetric derivative is:

$$f^{[2]}(x) = \lim_{h \to 0} \frac{f(x+h) - 2f(x) + f(x-h)}{h^2}.$$

It is easy to see (e.g., Exercise 4.10) that if $f'(x)$ exists, then $f'(x) = f^{[1]}(x)$ and we know that if $f' \equiv 0$ then $f$ is constant.
(a) If $f^{[1]} = 0$ then is $f$ necessarily constant?
(b) In Example 8.4 we showed that if $f''(x)$ exists, then $f''(x) = f^{[2]}(x)$ and we know that if $f'' \equiv 0$ then $f$ is a linear function. Show that if $f^{[2]}(x) = 0$ for all $x \in (a,b)$ then $f$ is a linear function on $[a,b]$, as follows. If $f$ is linear, it must look like $\frac{f(b)-f(a)}{b-a}(x-a) + f(a)$. So consider

$$F(x) = F_n(x) = f(x) - \left[\frac{f(b)-f(a)}{b-a}(x-a) + f(a)\right] + \frac{(x-a)(x-b)}{n}.$$

Now use $F^{[2]}(x) = 2/n$ to show that $F(x) \le 0$.

(c) Consider

$$G(x) = G_n(x) = \left[\frac{f(b)-f(a)}{b-a}(x-a) + f(a)\right] - f(x) + \frac{(x-a)(x-b)}{n},$$

and show that $G(x) \le 0$.

(d) Combine (b) and (c) and let $n \to \infty$.

8.18. This is the Second Derivative Test. Suppose that $f''$ exists on $(a,b)$ and that for some $c \in (a,b)$, we have $f'(c) = 0$. Show that if $f''(c) > 0$ then $f$ has a local minimum at $c$. Show that if $f''(c) < 0$ then $f$ has a local maximum at $c$. What if $f''(c) = 0$? What if $f''(c)$ does not exist? Draw generic pictures which illustrate these cases.

8.19. [44] We saw in Example 8.10 that $x^e < e^x$ for $x \ne e$. Use this to prove the AGM Inequality (Theorem 2.10) as follows. Set $x = e\,a_j/G$ for $j = 1, 2, \ldots, n$, then multiply.

8.20. [14] Suppose that $f > 0$ and has two derivatives on $\mathbb{R}$. Show that there is $x_0 \in \mathbb{R}$ such that $f''(x_0) \ge 0$.

8.21. [13] Let $f$ be a function with continuous second derivative on $\mathbb{R}$. Show that if $\lim_{x \to \infty} f(x) = 0$ and $f''$ is bounded, then $\lim_{x \to \infty} f'(x) = 0$.

8.22. [12] Extend Lemma 8.7 as follows. Show that if $f$ is convex and differentiable on $(a,b)$ with $x_0 \in (a,b)$ then for $x \in (a,b)$,

$$f(x) \ge f(x_0) + f'(x_0)(x - x_0).$$

Hint: Show that the convexity condition can be manipulated to obtain (for $t \ne 0$ and $x \ne x_0$)

$$\frac{f\big(t(x - x_0) + x_0\big) - f(x_0)}{t(x - x_0)}\,(x - x_0) \le f(x) - f(x_0),$$

then let $t \to 0$, and hence $t(x - x_0) \to 0$.

8.23. [12] Show that the converse of Exercise 8.22 holds: Suppose that $f$ is differentiable on $(a,b)$ and for each $x, x_0 \in (a,b)$,

$$f(x) \ge f(x_0) + f'(x_0)(x - x_0).$$

Then $f$ is convex on $(a,b)$.


8.24. [10]
(a) Suppose that $f$ is convex on $[a,b]$, $f''$ exists, and that $f < 1$. Show that $1/(1 - f)$ is convex.
(b) Let $f$ be such that $f''$ is continuous on $[a,b]$. Show that

$$f(x) - m\,\frac{x^2}{2} \quad\text{and}\quad M\,\frac{x^2}{2} - f(x)$$

are each convex, where $m = \min_{x \in [a,b]} \{f''(x)\}$ and $M = \max_{x \in [a,b]} \{f''(x)\}$. ($m$ and $M$ exist by the Extreme Value Theorem (Theorem 3.23).)

8.25. Here is another proof of Lemma 8.12, that if $f'' \ge 0$ on $(a,b)$ then $f$ is convex on $(a,b)$.
(a) Since $f'' \ge 0$, $f'$ is increasing. Conclude that for $x < c < y$,

(b) Verify that g 0 .0/ D 0 and use this to conclude that g.0/ g.1/:(c) Can you extend this to prove the weighted AGM Inequality (Theorem 6.15)?8.37. [30, 46] Fill in the details of another proof of Jensen’s Inequality (Theo- Pnrem 8.17), as follows. Set A D wj xj ; and for 0 t 1, let j D1

(a) Verify, for example, that $M_1$ is the Arithmetic Mean, $M_{-1}$ is the Harmonic Mean, and $M_2$ is the Root Mean Square.
(b) Use Jensen’s Inequality (Theorem 8.17) to show that if $s < r$, then $M_s < M_r$.
(c) Show that it is reasonable, for the sake of continuity, to define $M_0 =$ the Geometric Mean $G = (x_1 x_2 \cdots x_n)^{1/n}$.
(d) What are reasonable definitions of $M_{-\infty}$ and $M_{\infty}$?
(e) How would you define the weighted Power Means?

is called the Lagrangian Mean of $a$ and $b$. So that the Lagrangian Mean is continuous, what should we define as the Lagrangian Mean of $a$ and $b$, if $a = b$?
(c) Compute the Lagrangian Mean for $f(x) = x^2$, for $f(x) = 1/x$, and for a few other functions of your choice. Try $f(x) = x^r$ and let $r \to 0$.

8.52. Extend Exercise 8.16 above to prove Taylor’s Theorem (Theorem 8.20). This is essentially the proof to be found in most textbooks. Yet another proof can be found in [5].

8.53. Let $n$ be a positive integer. Prove the Binomial formula

$$(1 + x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k,$$

where the coefficient of $x^k$ is the binomial coefficient

$$\binom{n}{k} = \frac{n!}{(n-k)!\,k!} = \frac{n(n-1)(n-2)\cdots(n-k+1)}{k(k-1)(k-2)\cdots 2 \cdot 1},$$

by using Taylor’s Theorem (Theorem 8.20), as follows.


(a) Let $f(x) = (1 + x)^n$ and verify that for $k \le n$,

$$f^{(k)}(x) = n(n-1)\cdots(n-k+1)(1 + x)^{n-k}.$$

(b) Conclude that

$$f^{(k)}(0) = \begin{cases} n(n-1)\cdots(n-k+1) & \text{if } k \le n, \\ 0 & \text{if } k > n. \end{cases}$$

(c) Now apply Taylor’s Theorem with $x_0 = 0$.

8.54. Here we extend the Second Derivative Test from Exercise 8.18. Suppose that each of $f, f', f'', \ldots, f^{(n+1)}$ is continuous on an open interval $I$ containing $c$, that

$$0 = f'(c) = f''(c) = \cdots = f^{(n)}(c), \quad\text{but that}\quad f^{(n+1)}(c) \ne 0.$$

Show that if $f^{(n+1)}(c) > 0$ and $n$ is odd (so that $n + 1$ is even, as in the Second Derivative Test, where $n = 1$), then $c$ yields a local minimum for $f$. Can you summarize the other possibilities, for example $f^{(n+1)}(c) < 0$ and $n$ odd?

8.55. Use the Taylor polynomial of degree $n$ and corresponding remainder for $e^x$ with $x_0 = 0$, to show that $n! > \left(\frac{n}{e}\right)^n$. We did this another way in Exercise 2.27. In Exercise 2.17 we saw that $n! < \left(\frac{n+1}{2}\right)^n$.

8.56. [11] Suppose that $f''(x)$ exists for all $x$ and that $p, q > 1$ satisfy $1/p + 1/q = 1$. Show that if

remainder, for f .x/ D 1=x:(b) Find the Taylor series for f .x/ D 1=x at x0 D 1 and show that it converges for 1=2 x 2:8.65. (a) Compute the Taylor polynomial of degree n at x0 D 0 and correspondingremainder, for f .x/ D cos.x/:(b) Show that for x 2 R,

(b) How does this compare with Jordan’s Inequality $\sin(x) \ge \frac{2}{\pi}x$ from Example 8.14?
(c) How does this compare with the inequality $\sin(x) \ge x - \frac{x^3}{6}$ from Exercise 8.66?

8.68. (a) Compute the Taylor polynomial of degree $n$ at $x_0 = 0$ and corresponding remainder, for

$$\cosh(x) = \frac{e^x + e^{-x}}{2}.$$

(b) Find (with justification) the Maclaurin series for $\cosh(x)$.

(c) Compute the Taylor polynomial of degree $n$ at $x_0 = 0$ and corresponding remainder, for

$$\sinh(x) = \frac{e^x - e^{-x}}{2}.$$

(d) Find (with justification) the Maclaurin series for $\sinh(x)$.

Show that $f$ has derivatives of all orders at $x_0 = 0$, but that

$$f(x) \ne \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n,$$

unless $x = 0$. (This function is not analytic except at zero. A function is analytic wherever it equals its Taylor series.)

8.71. Newton’s method (Sir Isaac Newton, English (1642–1727); no introduction necessary) is a method for approximating a root $c$ of a function $f$, i.e., a number $c$ for which $f(c) = 0$. It begins with an initial guess $x_0$, then uses the iteration scheme

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)} \quad\text{for } n = 0, 1, 2, 3, \ldots.$$

As long as $f'(c) \ne 0$ and $x_0$ is close to $c$, the scheme converges to $c$. (See, for example, [5] or [12] for details.)
(a) Show that $x_{n+1}$ is the $x$-intercept of the tangent line to $y = f(x)$ at $x = x_n$. That is, $x_{n+1}$ is where the Taylor polynomial $p_1$ of degree 1 at $x = x_n$ has a zero.
(b) Show that if we take instead $x_{n+1}$ as a zero of the Taylor polynomial $p_2$ of degree 2 then we get the expression

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n) + \frac{f''(x_n)}{2}(x_{n+1} - x_n)}.$$

(c) [36] Solve this for $x_{n+1}$ to get another iteration scheme.
(d) [9] A different approach from (c) is to use Newton’s method to approximate the $x_{n+1}$ on the right-hand side. Show that this leads to the iteration scheme

$$x_{n+1} = x_n - \frac{2f(x_n)f'(x_n)}{2f'(x_n)f'(x_n) - f(x_n)f''(x_n)}.$$

This scheme is known as Halley’s method, named for English mathematician and astronomer Edmond Halley (1656–1742). (Yes, this is the same Halley as the comet: Halley’s comet can be seen from Earth with the naked eye every 75 years or so. It is due to next come around in 2061.)
(e) [6] Show that Newton’s method applied to

$$g(x) = \frac{f(x)}{\sqrt{f'(x)}}$$

yields Halley’s method.


(f) Show that Newton’s method is fixed point iteration (see Exercise 5.36) applied to

$$g(x) = x - \frac{f(x)}{f'(x)}.$$

(g) Show that Halley’s method is fixed point iteration (see Exercise 5.36) applied to

$$g(x) = x - \frac{2f(x)f'(x)}{2f'(x)^2 - f(x)f''(x)}.$$
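The two iteration schemes of Exercise 8.71 can be tried numerically. The following is only a minimal sketch: the test function $f(x) = x^2 - 2$, the starting guess, the step count, and the helper names `newton` and `halley` are assumptions made for illustration and do not come from the exercise.

```python
import math

def newton(f, df, x0, steps=6):
    """Newton's method: x_{n+1} = x_n - f(x_n) / f'(x_n)."""
    x = x0
    for _ in range(steps):
        x = x - f(x) / df(x)
    return x

def halley(f, df, d2f, x0, steps=6):
    """Halley's method: x_{n+1} = x_n - 2 f f' / (2 f'^2 - f f'')."""
    x = x0
    for _ in range(steps):
        fx, dfx, d2fx = f(x), df(x), d2f(x)
        x = x - 2 * fx * dfx / (2 * dfx * dfx - fx * d2fx)
    return x

# Illustrative choice (an assumption): f(x) = x^2 - 2, whose positive root is sqrt(2).
f = lambda x: x * x - 2
df = lambda x: 2 * x
d2f = lambda x: 2.0

root_newton = newton(f, df, 1.0)
root_halley = halley(f, df, d2f, 1.0)
print(root_newton, root_halley, math.sqrt(2))
```

For this particular function both schemes settle on $\sqrt{2}$ within a few steps; printing the intermediate iterates is an easy way to compare how quickly each one does so.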

It has long been an axiom of mine that the little things are infinitely more important.

– Sherlock Holmes, in A Case of Identity,

by Sir Arthur Conan Doyle

A function’s range is a collection of values and so we might expect that it should have an average value, as long as the function is reasonably well behaved. A fence or a wall for example, no matter how long or how irregular in height, should have an average height. In Sect. 3.4 we considered the average value

$$\frac{1}{N} \sum_{j=1}^{N} f(x_j)$$

of a continuous function $f : [a,b] \to \mathbb{R}$ evaluated at $N$ sample points $x_1, x_2, \ldots, x_N$ from $[a,b]$. By choosing these sample points in a systematic way and then letting $N \to \infty$, we define the average value of $f$ over the interval $[a,b]$. This naturally gives rise to the notion of area under a curve and the definite integral. Then, since the definite integral is defined in terms of sums, we see that many properties of sums give rise to properties of definite integrals, and vice versa. For example, we obtain integral analogues for many of the inequalities from Chaps. 2 and 6.

Fig. 9.1 The partitions $P_n$ (panel label: $n = 0$, $N = 2^0 = 1$)

Finally, we denote by $x_j^*$ any particular point of each subinterval $[x_{j-1}, x_j]$:

$$x_j^* \in [x_{j-1}, x_j] \quad\text{for } j = 1, 2, \ldots, N.$$

That is,

$$x_1^* \in [x_0, x_1],\quad x_2^* \in [x_1, x_2],\quad \ldots,\quad x_N^* \in [x_{N-1}, x_N].$$

With all of this notation in place, it is a very important fact that if $f$ is continuous on $[a,b]$, then $\lim_{N \to \infty} \frac{1}{N} \sum_{j=1}^{N} f(x_j^*)$ exists and is independent of the choices for $x_j^*$. This limit is the average value of $f$ over $[a,b]$ and we denote it by

$$A_f([a,b]).$$

We merely state this fact as a theorem below. Its proof requires some rather deep ideas that would take us somewhat off course, so we leave it for Appendix A.

Theorem 9.1. Let $f$ be continuous on $[a,b]$. With the notation as above (in particular $N = 2^n$),

$$A_f([a,b]) = \lim_{N \to \infty} \left( \frac{1}{N} \sum_{j=1}^{N} f(x_j^*) \right) \text{ exists,}$$

and is independent of the choices $x_j^* \in [x_{j-1}, x_j]$.

Proof. See the Appendix, Sect. A.3. □

a constant function $f(x) \equiv C$, we should expect that the average

In practice, since $x_j^*$ can be any particular point of each interval $[x_{j-1}, x_j]$, a convenient choice is usually made, like for example $x_j^* = x_{j-1}$, or $x_j^* = x_j$, or $x_j^* =$ the midpoint: $x_j^* = \frac{x_{j-1} + x_j}{2}$. We shall make such choices in the next few examples. We shall also make use of the formulas (for $N$ being a natural number)

$$\sum_{j=1}^{N} j = \frac{N(N+1)}{2} \quad\text{and}\quad \sum_{j=1}^{N} j^2 = \frac{N(N+1)(2N+1)}{6}.$$

These were verified in Exercises 2.17 and 5.39 (and again in Exercise 9.1).
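The independence from the choice of sample points can be observed numerically. The sketch below assumes equal-width subintervals with $N = 2^n$ and uses $f(x) = x^2$ on $[0,1]$ purely as an illustration; the function name `average_value` and its parameters are hypothetical.

```python
def average_value(f, a, b, n, choice="mid"):
    """Approximate the average of f over [a, b] by (1/N) * sum of f(x_j*),
    using N = 2**n equal subintervals and the named choice of sample point."""
    N = 2 ** n
    dx = (b - a) / N
    total = 0.0
    for j in range(1, N + 1):
        left, right = a + (j - 1) * dx, a + j * dx
        if choice == "left":
            x_star = left                 # x_j* = x_{j-1}
        elif choice == "right":
            x_star = right                # x_j* = x_j
        else:
            x_star = (left + right) / 2   # x_j* = midpoint
        total += f(x_star)
    return total / N

# Illustrative function (an assumption): f(x) = x^2 on [0, 1].
for choice in ("left", "right", "mid"):
    print(choice, average_value(lambda x: x * x, 0.0, 1.0, 12, choice))
```

All three choices give values close to one another for large $N$; for this function the sums can also be evaluated exactly with the formula for $\sum j^2$ quoted above.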

$$\min_{x \in [a,b]} \{f(x)\} \le f(x) \le \max_{x \in [a,b]} \{f(x)\},$$

and therefore, essentially by Example 9.2,

$$\min_{x \in [a,b]} \{f(x)\} \le A_f([a,b]) \le \max_{x \in [a,b]} \{f(x)\}. \tag{9.1}$$

Now recall from Sect. 2.2 that the average, or Arithmetic Mean, $A = \frac{1}{n} \sum_{j=1}^{n} a_j$ of the $n$ numbers $a_1, a_2, \ldots, a_n$ is called a mean simply because it satisfies

$$\min_{1 \le j \le n} \{a_j\} \le A \le \max_{1 \le j \le n} \{a_j\}.$$

So in view of (9.1), calling $A_f([a,b])$ an average is natural. This is the analogue, for functions, of the Arithmetic Mean. But somewhat more than (9.1) is true, as follows.

Lemma 9.9. Let $f$ and $g$ be continuous on $[a,b]$, with $f \le g$ there. Then

$$A_f([a,b]) \le A_g([a,b]).$$

Proof. This is Exercise 9.5. □

The following important result is the analogue, for functions, of the Average Value Theorem for Sums (Theorem 3.19). It says that the average value of a continuous function on a closed interval is actually attained by the function.

Theorem 9.10. (Average Value Theorem) Let $f$ be continuous on $[a,b]$. Then there is $c \in [a,b]$ such that

$$f(c) = A_f([a,b]).$$

9.2 The Definite Integral

So it is customary to denote the average value of the continuous function $f$ on $[a,b]$ by

$$A_f([a,b]) = \frac{1}{b-a} \int_a^b f(x)\,dx.$$

This notation serves vaguely as a reminder of where it comes from: As $N \to \infty$, the idea is that

$$\sum_{j=1}^{N} \to \int_a^b, \qquad f(x_j^*) \to f(x), \qquad\text{and}\qquad \Delta x_N \to dx.$$

Here,

$$\int_a^b f(x)\,dx$$

is called the definite integral (or simply the integral) of $f$ from $a$ to $b$.

In case we need to interchange the roles of $a$ and $b$, we define

$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx, \tag{9.2}$$

which is consistent with the set-up: for $b < a$, we have $x_0 = b$ and $x_N = a$, and $\Delta x_N < 0$. And notice that taking $a = b$ in (9.2), we get $\int_a^a f(x)\,dx = -\int_a^a f(x)\,dx$, so that (as we should expect):

$$\int_a^a f(x)\,dx = 0.$$

With this notation in place, Lemma 9.9 reads, for $f$ and $g$ continuous on $[a,b]$:

$$f \le g \implies \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx. \tag{9.3}$$

This says that the definite integral is a positive operator. This simple but very important property of the definite integral is sometimes taken for granted. This property is not shared, for example, by the derivative: The reader should agree that it is not the case that $f(x) \le g(x) \implies f'(x) \le g'(x)$.

Remark 9.11. The expression

$$\sum_{j=1}^{N} f(x_j^*)\,\Delta x_N$$

is called a Riemann sum, after the great German mathematician Bernhard Riemann (1826–1866). For $N$ large,

$$\sum_{j=1}^{N} f(x_j^*)\,\Delta x_N \cong \int_a^b f(x)\,dx.$$

This is made more precise in the Appendix (Theorem A.9 of Sect. A.3). ∘

Remark 9.12. In any sum, the index of summation plays no essential role. For example,

$$\sum_{j=1}^{N} f(x_j^*)\,\Delta x_N = \sum_{j=1}^{N} f(t_j^*)\,\Delta t_N = \sum_{j=1}^{N} f(u_j^*)\,\Delta u_N \quad\text{etc.}$$

In the same way, the variable of integration in a definite integral plays no essential role. It might be $x$, or just as well be $t$, or $u$, or virtually anything else:

$$\int_a^b f(x)\,dx = \int_a^b f(t)\,dt = \int_a^b f(u)\,du \quad\text{etc.} \quad ∘$$

Since integrals are defined in terms of sums, we can often use a property of sums to deduce a property of integrals. For example, the property

easily gives rise to the following.

Proof. This is Exercise 9.6. □

Lemma 9.13 says that the definite integral is a linear operator. The derivative is also a linear operator; we saw in Sect. 4.2 that the derivative obeys what we called the Linear Combination Rule.

Observe now that the conclusion of the Average Value Theorem (Theorem 9.10) reads

$$f(c) = \frac{1}{b-a} \int_a^b f(x)\,dx = \frac{\int_a^b f(x) \cdot 1\,dx}{\int_a^b 1\,dx}.$$

If we replace the 1’s in the numerator and the denominator of the right-hand side above with a suitable continuous function $g$ then we get the more general Mean Value Theorem for Integrals below. It is the analogue, for functions, of the Mean Value Theorem for Sums (Theorem 3.22).

Theorem 9.14. (Mean Value Theorem for Integrals) Let $f$ and $g$ be continuous on $[a,b]$ and suppose that $g$ does not change signs on $[a,b]$, and that $g(x) \not\equiv 0$. Then there is $c \in [a,b]$ such that

$$f(c) = \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx}.$$

Proof. We may assume that $g(x) \ge 0$ on $[a,b]$ for otherwise, we would consider $-g(x)$. By the Extreme Value Theorem (Theorem 3.23), there are $x_m, x_M \in [a,b]$ such that

$$f(x_m) \le f(x) \le f(x_M) \quad\text{for every } x \in [a,b].$$

Multiplying through by $g(x)$, we get

$$f(x_m)g(x) \le f(x)g(x) \le f(x_M)g(x) \quad\text{for every } x \in [a,b].$$

Then integrating and using (9.3) and Lemma 9.13 we obtain

$$f(x_m) \int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le f(x_M) \int_a^b g(x)\,dx.$$

Now since $g(x) \not\equiv 0$, there exists $x_0 \in [a,b]$ such that $g(x_0) > 0$. Then by Lemma 3.4 there is a closed interval $J$ containing $x_0$ such that $g(x) > 0$ for $x \in J$. Therefore $\int_a^b g(x)\,dx > 0$, and we may divide through to obtain

$$f(x_m) \le \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx} \le f(x_M).$$

Then by the Intermediate Value Theorem (Theorem 3.17) there exists $c$ between $x_m$ and $x_M$ (and so $c \in [a,b]$) such that

$$f(c) = \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx},$$

as desired. □

In the context of the Mean Value Theorem for Integrals (Theorem 9.14), for $g(x) > 0$ on $[a,b]$ one often sets

is the analogue, for functions, of the weighted Arithmetic Mean. Then the conclusion of the Mean Value Theorem for Integrals (Theorem 9.14) reads

$$f(c) = \int_a^b w(x)f(x)\,dx.$$
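The point $c$ promised by the Mean Value Theorem for Integrals can be located numerically in simple cases. In the sketch below the functions $f(x) = x^2$ and $g(x) = x$ on $[0,1]$, the midpoint-rule helper `midpoint_integral`, and the use of bisection (which relies on this particular $f$ being increasing) are all assumptions chosen for illustration.

```python
import math

def midpoint_integral(h, a, b, N=20000):
    """Midpoint Riemann sum for the integral of h over [a, b]."""
    dx = (b - a) / N
    return sum(h(a + (j - 0.5) * dx) for j in range(1, N + 1)) * dx

# Illustrative data (assumptions): f increasing, g nonnegative and not identically 0.
f = lambda x: x * x
g = lambda x: x
a, b = 0.0, 1.0

# The weighted mean value that f must attain somewhere on [a, b].
target = midpoint_integral(lambda x: f(x) * g(x), a, b) / midpoint_integral(g, a, b)

# Bisection for f(c) = target; valid here because f is increasing on [a, b].
lo, hi = a, b
for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid) < target:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2
print(target, c, math.sqrt(0.5))
```

Here the exact integrals are $1/4$ and $1/2$, so $f(c) = 1/2$ and $c = 1/\sqrt{2}$, which lies in $[0,1]$ as the theorem requires.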

9.3 The Definite Integral as Area

For $f$ continuous on $[a,b]$, the conclusion of the Average Value Theorem (Theorem 9.10) reads:

$$f(c) = \frac{1}{b-a} \int_a^b f(x)\,dx.$$

Therefore $f(c)(b-a)$ is the average value of $f$ over $[a,b]$, multiplied by the length of $[a,b]$. So if $f$ is also nonnegative we adopt this (very naturally) as the definition of the area between the graph of $f$ and the $x$-axis, from $x = a$ to $x = b$. That is,

$$\int_a^b f(x)\,dx = \text{the area between the graph of } f \text{ and the } x\text{-axis, from } x = a \text{ to } x = b.$$

(See Fig. 9.2.)

Fig. 9.2 The area of the shaded region is $\int_a^b f(x)\,dx$

So for example, if $f$ defines the varying height of a wall running straight along the ground from $a$ to $b$, then the area of the wall’s face is

$$\int_a^b f(x)\,dx.$$

The average height of the wall is

$$\frac{1}{b-a} \int_a^b f(x)\,dx,$$

and the Average Value Theorem (Theorem 9.10) says that there is at least one place along the ground over which the wall is exactly its average height.

For any continuous function $f$, not just nonnegative ones, $\int_a^b f(x)\,dx$ is the signed area between the graph of $f$ and the $x$-axis, from $x = a$ to $x = b$. The area is signed because function values below the $x$-axis give negative contributions in the sums which ultimately define the integral. The total area between the graph of $f$ and the $x$-axis, from $x = a$ to $x = b$ is then

$$\int_a^b \big|f(x)\big|\,dx.$$
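The difference between signed area and total area shows up clearly in a small computation. This is only a sketch: the choice of $\sin(x)$ on $[0, 2\pi]$ and the midpoint-rule helper `midpoint_integral` are assumptions made for illustration.

```python
import math

def midpoint_integral(h, a, b, N=20000):
    """Midpoint Riemann sum for the integral of h over [a, b]."""
    dx = (b - a) / N
    return sum(h(a + (j - 0.5) * dx) for j in range(1, N + 1)) * dx

# Illustrative function (an assumption): sin(x) over one full period.
signed_area = midpoint_integral(math.sin, 0.0, 2 * math.pi)
total_area = midpoint_integral(lambda x: abs(math.sin(x)), 0.0, 2 * math.pi)
print(signed_area, total_area)
```

The positive and negative lobes cancel in the signed area, while the total area counts both lobes and comes out close to 4.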

The following is the analogue, for functions, of the triangle inequality

Proof. This is Exercise 9.10. □

Example 9.16. We saw in Example 9.2 that $\int_a^b C\,dx = C(b-a)$. For $a < b$ and $C > 0$, this is the area of the rectangle with base $[a,b]$ and height $C$. ◊

Example 9.17. For $r > 0$ the graph of $f(x) = \sqrt{r^2 - x^2}$ on $[-r, r]$ is the top half of the circle with radius $r$, centered at the origin. Therefore

$$\int_{-r}^{r} \sqrt{r^2 - x^2}\,dx = \frac{\pi r^2}{2}. \quad ◊$$
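As a numerical sanity check of Example 9.17, a Riemann sum for the semicircle can be compared with $\pi r^2/2$. The radius $r = 2$, the number of subintervals, and the helper `midpoint_integral` are assumptions of this sketch.

```python
import math

def midpoint_integral(h, a, b, N=100000):
    """Midpoint Riemann sum for the integral of h over [a, b]."""
    dx = (b - a) / N
    return sum(h(a + (j - 0.5) * dx) for j in range(1, N + 1)) * dx

r = 2.0  # illustrative radius (an assumption)
# Midpoints lie strictly inside (-r, r), so the square root is always defined.
half_disc = midpoint_integral(lambda x: math.sqrt(r * r - x * x), -r, r)
print(half_disc, math.pi * r * r / 2)
```

The two printed values agree to several decimal places, even though the integrand has vertical tangents at the endpoints.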

Example 9.18. By interpreting the definite integral as a signed area (and knowing the formula for the area of a trapezoid), one can verify that

So, for example,

$$\int_0^{3\pi/2} \sin(x)\,dx = \cos(0) - \cos(3\pi/2) = 1.$$

And using the symmetry of the sine function,

9.4 Some Applications

The following lemma seems perfectly reasonable but its proof is surprisingly tricky, so we leave it for Appendix A. It enables us to consider the definite integral of certain functions which are not continuous. See Fig. 9.3.

Lemma 9.22. Let $f$ be continuous on $[a,b]$ and let $c \in (a,b)$. Then

Example 9.25. We showed in Sect. 6.7 that a $p$-series

converges for $p > 1$ and diverges to $+\infty$ for $p \le 1$. In Example 10.5 we shall show that for $p \ne 1$:

$$\int_1^N \frac{1}{x^p}\,dx = \frac{1}{1-p} \left( \frac{N}{N^p} - 1 \right), \quad\text{and for } p = 1: \quad \int_1^N \frac{1}{x}\,dx = \ln(N).$$

Therefore (as we shall conclude in Example 10.6) the convergence/divergence of a $p$-series also follows from the Integral Test (Theorem 9.24). ◊

For another application of the definite integral, we suppose that $f$ and $f'$ are continuous on $[a,b]$. Then we define the length of the curve described by $y = f(x)$ from $x = a$ to $x = b$, as follows. (But see also [9, 21].) Again, let

with $N$ line segments yields a polygonal approximation to the graph of $y = f(x)$

So (see Fig. 9.6) the total length of these segments is

$$\sum_{j=1}^{N} \sqrt{(x_j - x_{j-1})^2 + \big(f(x_j) - f(x_{j-1})\big)^2}.$$

And taking $N$ very large it seems that the total length of these segments would provide a pretty good approximation to what we think would be the length of the curve $y = f(x)$ from $x = a$ to $x = b$.

Now applying the Mean Value Theorem (Theorem 5.2) on each interval $[x_{j-1}, x_j]$, there is $x_j^* \in (x_{j-1}, x_j)$ such that

This is the length of the curve $y = f(x)$ over $[a,b]$. It is typically denoted by the letter $s$. For example, if $f$ defines the varying height of a fence that runs straight along the ground from $a$ to $b$, then

Then using Remark 9.6 and Lemma 9.13, we get

$$s = \frac{1}{2} \left( e - \frac{1}{e} \right). \quad ◊$$
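The polygonal approximation to the length of a curve is easy to experiment with. The sketch below assumes, purely for illustration, the curve $y = \cosh(x)$ on $[0,1]$, whose length $\sinh(1) = \frac{1}{2}(e - \frac{1}{e})$ happens to equal the value displayed above; this choice of curve and the function name `polygonal_length` are assumptions of the sketch.

```python
import math

def polygonal_length(f, a, b, N):
    """Total length of the N line segments joining the points (x_j, f(x_j)),
    where x_0 < x_1 < ... < x_N are equally spaced in [a, b]."""
    dx = (b - a) / N
    xs = [a + j * dx for j in range(N + 1)]
    return sum(
        math.hypot(xs[j] - xs[j - 1], f(xs[j]) - f(xs[j - 1]))
        for j in range(1, N + 1)
    )

# Illustrative curve (an assumption): y = cosh(x) on [0, 1].
for N in (1, 4, 16, 256):
    print(N, polygonal_length(math.cosh, 0.0, 1.0, N))
print("(e - 1/e)/2 =", (math.e - 1 / math.e) / 2)
```

The polygonal lengths increase toward the limiting value as $N$ grows, since each straight segment is no longer than the arc it replaces.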

9.5 Famous Inequalities for the Definite Integral

Continuing the theme that properties of sums can give rise to corresponding properties of integrals, we extend some of the famous inequalities for sums to obtain analogous inequalities for integrals. And quite often, their proofs are really no more difficult.

In Sect. 6.4 we obtained Hölder’s Inequality (Lemma 6.18) using the weighted AGM Inequality with $n = 2$ (Corollary 6.16). In an entirely similar way we obtain the following.

Theorem 9.27. (Hölder’s Integral Inequality) Let $f$ and $g$ be continuous and nonnegative on $[a,b]$ and let $p, q > 1$ satisfy $\frac{1}{p} + \frac{1}{q} = 1$. Then

Jensen’s Integral Inequality is

which is the Cauchy–Schwarz Integral Inequality (Corollary 9.28). ◊

We saw at the end of Sect. 8.3 with Cauchy’s proof of Jensen’s Inequality (Theorem 8.17) and again in Exercise 8.39, that Jensen’s Inequality continues to hold even if $f$ is only assumed to be convex and continuous, i.e., not requiring that $f'' \ge 0$, or even that $f'$ exists. Likewise, Jensen’s Integral Inequality (Theorem 9.29) holds under less restrictive conditions than the ones we have imposed on the function $\varphi$. But $\varphi$ must still be convex; see Exercise 9.35.

For the remainder of this section we focus on the important special case of Jensen’s Integral Inequality (Theorem 9.29) in which $w(x) \equiv \frac{1}{b-a}$. That is, for $f$ continuous and $\varphi$ convex on the range of $f$:

$$\varphi\left( \frac{1}{b-a} \int_a^b f(x)\,dx \right) \le \frac{1}{b-a} \int_a^b \varphi(f(x))\,dx.$$

This is the AGM Inequality for Functions: it is the analogue, for continuous functions, of the weighted AGM Inequality (Theorem 6.15). The left-hand side above is the Geometric Mean of $f$, and the right-hand side is of course the Average Value, or the Arithmetic Mean of $f$.

Here is a very simple example which shows how the AGM Inequality for Functions can reduce to the weighted AGM Inequality (Theorem 6.15). Let

The right-hand side of (9.4), i.e., the Average Value of $f$, is

The reader should agree that this procedure could be carried out for any number of weights. That is, for any $w_1, w_2, \ldots, w_n > 0$ with $\sum_{j=1}^{n} w_j = 1$. ◊

9.6 Epilogue

The (definite) integral that we have considered is called the Riemann integral. We defined it only for continuous functions (and we were able to extend it somewhat, using Lemma 9.22). This is usually quite adequate for calculus. But there are functions which are not continuous, and not covered by Lemma 9.22 (in fact, very nasty ones), which nevertheless possess a Riemann integral. A nice approach to the Riemann integral, to which ours can be viewed as a precursor, can be found in [29].

Recall from Chap. 1 that the set $\mathbb{Q}$ was not (for us) a large enough place in which to work: the Increasing Bounded Sequence Property does not hold within $\mathbb{Q}$. In the same way, the collection of Riemann integrable functions is not large enough, in the sense that it lacks an analogue of the Increasing Bounded Sequence Property. (That is, one can construct a sequence of Riemann integrable functions which is increasing and bounded above, whose limit is not a Riemann integrable function.)

So in Chap. 1 we extended $\mathbb{Q}$ to get $\mathbb{R}$, a set in which the Increasing Bounded Sequence Property does hold. The analogue here, for integrals, is the Lebesgue integral, developed by French mathematician Henri Lebesgue (1875–1941). Enough functions are Lebesgue integrable that the analogue of the Increasing Bounded Sequence Property does hold, and the Lebesgue integral reduces to the Riemann integral when applied to functions which are Riemann integrable.

It is the Lebesgue integral which makes many areas of modern mathematical analysis possible. It is typically first encountered in a graduate level real analysis course. An excellent account of the historical development of the Riemann and Lebesgue integrals, and many other topics from calculus, can be found in [8].

Exercises

9.1. [6] Show, as follows, that

$$S = \sum_{j=1}^{N} j^2 = \frac{N(N+1)(2N+1)}{6}.$$

(a) Verify that $T = \sum_{j=1}^{N} j = \frac{N(N+1)}{2}$.
(b) In the identity

$$(k+1)^3 - k^3 = 3k^2 + 3k + 1,$$

set $k = 0, 1, 2, \ldots, N$ and add each of these together to get

$$(N+1)^3 = 3S + 3T + N + 1.$$

(c) Now solve for $S$ and use (a).
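Before working Exercise 9.1 by hand, the identities involved can be spot-checked for a few values of $N$. The sketch below is only a numerical check, not a proof; the helper name `power_sums` and the sampled values of $N$ are assumptions.

```python
def power_sums(N):
    """Return (S, T) with S = 1^2 + ... + N^2 and T = 1 + ... + N."""
    T = sum(j for j in range(1, N + 1))
    S = sum(j * j for j in range(1, N + 1))
    return S, T

for N in (1, 5, 10, 100):
    S, T = power_sums(N)
    # Adding (k+1)^3 - k^3 over k = 0, ..., N telescopes to (N+1)^3.
    telescoped = sum((k + 1) ** 3 - k ** 3 for k in range(0, N + 1))
    assert telescoped == (N + 1) ** 3 == 3 * S + 3 * T + N + 1
    assert 6 * S == N * (N + 1) * (2 * N + 1)
print("identities hold for the sampled values of N")
```

Integer arithmetic is used throughout, so the comparisons are exact rather than approximate.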

9.2. (a) Show that

$$\sum_{j=1}^{N} j^3 = \frac{N^2(N+1)^2}{4} = \left( \sum_{j=1}^{N} j \right)^2.$$

(b) Find the average value of $f(x) = x^3$ over $[0,1]$.

9.3. Find the average value of $f(x) = e^x$ over $[a,b]$.

9.4. [18] Find the average value of $f(x) = \cos(x)$ over $[a,b]$.

9.5. Prove Lemma 9.9: Let $f$ and $g$ be continuous on $[a,b]$, with $f \le g$. Then

9.8. Let $f$ be continuous on $[a,b]$.
(a) Show that $\left( \frac{1}{b-a} \int_a^b f(x)^2\,dx \right)^{1/2}$ is a mean. Which mean of $n$ numbers, that we have met before, does this generalize?
(b) Show that there is $c \in [a,b]$ such that

$$f(c) = \left( \frac{1}{b-a} \int_a^b f(x)^2\,dx \right)^{1/2}.$$

(b) Prove this another way, by writing

$$\int_a^b f(x)g(x)\,dx = \int_a^b \left( f(x) - \frac{M + m}{2} \right) g(x)\,dx.$$

9.18. Let $f$ be continuous on $[a,b]$ and suppose that $\int_a^b f(x)g(x)\,dx = 0$ for every continuous function $g$ on $[a,b]$. Show that $f(x) \equiv 0$ on $[a,b]$.

9.19. [31]
(a) Suppose that $f$ and $g$ are continuous on $[a,b]$. Show that there is $c \in [a,b]$ such that

$$\int_a^c f(x)\,dx + \int_c^b g(x)\,dx = f(c)(b - c) + g(c)(c - a).$$

(b) Draw a picture to show what this says.

(c) Show that this generalizes the Average Value Theorem (Theorem 9.10).

9.20. [2] Suppose that $f$ is continuous on $[a,b]$, and that there are $m, M \in [a,b]$ such that

9.22. Let $f$ be continuous on $[a,b]$ and let $c \in (a,b)$. Show that the average value of $f$ on $[a,b]$ is a weighted average of the average value of $f$ on $[a,c]$ and the average value of $f$ on $[c,b]$.

9.23. Let $f$ be increasing and continuous on $[a,b]$, with $f'$ continuous. Show that

9.26. [13]
(a) Prove the following Mean Value-type theorem for the length of a curve. Let $f$ and $f'$ be continuous on $[a,b]$. Denote by $t(x)$ the length of the tangent line to $y = f(x)$ at $x$, between the vertical lines $x = a$ and $x = b$. Then there is $c \in [a,b]$ such that

$$t(c) = \int_a^b \sqrt{1 + f'(x)^2}\,dx.$$

(b) Draw a picture which shows what this is saying geometrically.


9.27. [14] Let $f$ and $g$ be continuous on $[a,b]$ with $f \ge 0$, and $g \ge 0$ and decreasing. Show that

$$\frac{\int_a^b x f(x)g(x)\,dx}{\int_a^b f(x)g(x)\,dx} \le \frac{\int_a^b x f(x)\,dx}{\int_a^b f(x)\,dx}.$$

9.28. [23] Let $f$ be differentiable on $[a,b]$ with $f(a) = f(b) = 0$. Show that as long as $f(x) \not\equiv 0$, there is $c \in (a,b)$ such that

is $c \in (-r, r)$ such that $f''(c) = \frac{3}{r^3} \int_{-r}^{r} \big( f(x) - f(0) \big)\,dx$. Hint: Start with the Mean Value Theorem for the Second Derivative (Theorem 8.6), then integrate, then use the Mean Value Theorem for Integrals (Theorem 9.14).

9.31. [12] Let $f$ be continuous and increasing on $[0,1]$.
(a) Show that for any positive integer $n$,

$$\frac{1}{n} \sum_{k=1}^{n} f\left( \frac{k}{n} \right) \ge \int_0^1 f(x)\,dx.$$

(b) If $f$ is also convex, then these sums decrease to $\int_0^1 f(x)\,dx$. Verify and fill in the details of the following proof. Since $f$ is convex, for $x, y \in [0,1]$,

Now sum from $k = 1$ to $n$.

(c) Show that the assumption that $f$ is increasing is necessary.
(d) Show that if $f$ is concave the sums also decrease to $\int_0^1 f(x)\,dx$.

9.32. Prove the Cauchy–Schwarz Integral Inequality (Corollary 9.28) by suitably modifying H. Schwarz’s proof, from Sect. 2.3, of the Cauchy–Schwarz Inequality (Theorem 2.18) for sums. (It was in fact in the context of integrals that Schwarz’s proof first appeared [30].)

9.33. [4] Apply Schwarz’s idea as in Exercise 9.32 to $\int_a^b \big( f(x) + t \big)^2\,dx$. What do you get?

9.34. [25, 26]
(a) Take $f(x) = 1/x$ and $g \equiv 1$, then $f(x) = 1/\sqrt{x}$ and $g \equiv 1$, in the Cauchy–Schwarz Integral Inequality (Corollary 9.28) to give another proof of Lemma 6.20:

$$G < L < A,$$

where $G$, $L$, and $A$ are respectively, the Geometric, Logarithmic, and Arithmetic Means of $a$ and $b$.
(b) Manipulate the latter case carefully, to show that in fact $L < \frac{A + G}{2} < A$.

9.35. (a) Show that Jensen’s Integral Inequality (Theorem 9.29) still holds for $\varphi$ convex, but $\varphi$ is only assumed to be differentiable. Hint: Look at Exercise 8.22.
(b) Can you show that Jensen’s Integral Inequality (Theorem 9.29) still holds for $\varphi$ convex, but $\varphi$ is only assumed to be continuous?

9.36. Fill in the details of another proof of the Cauchy–Schwarz Integral Inequality (Corollary 9.28) which is similar to Schwarz’s. (The sum version of this is the content of Exercise 2.38.) First dispense with the case in which $g(x) \equiv 0$. Observe that for any real number $t$,

(b) How would this read on $[a,b]$ instead of $[0,1]$?

(The sum version of this exercise is the content of Exercise 2.51.)

9.40. [11] We used the weighted AGM Inequality with $n = 2$ (Corollary 6.16) to obtain Hölder’s Integral Inequality (Theorem 9.27). Use the full weighted AGM Inequality (Theorem 6.15) to obtain the following extension of Hölder’s Integral

Let $f$ and $g$ be continuous on $[a,b]$ with either both increasing or both decreasing. Then

$$\left( \frac{1}{b-a} \int_a^b f(x)\,dx \right) \left( \frac{1}{b-a} \int_a^b g(x)\,dx \right) \le \frac{1}{b-a} \int_a^b f(x)g(x)\,dx.$$

And the inequality is reversed if $f$ and $g$ have opposite monotonicity. (The sum version of this is the content of Exercise 2.54.) Fill in the details of the following proof of Chebyshev’s Integral Inequality. If $f$ and $g$ have the same monotonicity, then

$$\big( f(x) - f(c) \big)\big( g(x) - g(c) \big) \ge 0 \quad\text{for any } c \in [a,b].$$

So by Lemma 9.9,

$$\int_a^b \big( f(x) - f(c) \big)\big( g(x) - g(c) \big)\,dx \ge 0.$$

Now take $c \in [a,b]$ as given for $f$ by the Average Value Theorem (Theorem 9.10): $f(c) = \frac{1}{b-a} \int_a^b f(x)\,dx$ and expand the left-hand side.
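Chebyshev’s Integral Inequality, and its reversal for opposite monotonicity, can be observed on simple examples. In this sketch the functions $x$, $x^2$ and $1 - x$ on $[0,1]$ and the midpoint-rule helper `average` are assumptions chosen only for illustration.

```python
def average(h, a, b, N=10000):
    """Midpoint-rule approximation of the average value of h over [a, b]."""
    dx = (b - a) / N
    return sum(h(a + (j - 0.5) * dx) for j in range(1, N + 1)) / N

a, b = 0.0, 1.0
f = lambda x: x            # increasing
g_same = lambda x: x * x   # increasing: same monotonicity as f
g_opp = lambda x: 1.0 - x  # decreasing: opposite monotonicity to f

# Product of the averages versus the average of the product.
lhs_same = average(f, a, b) * average(g_same, a, b)
rhs_same = average(lambda x: f(x) * g_same(x), a, b)
lhs_opp = average(f, a, b) * average(g_opp, a, b)
rhs_opp = average(lambda x: f(x) * g_opp(x), a, b)

print(lhs_same, "<=", rhs_same)
print(lhs_opp, ">=", rhs_opp)
```

With the exact averages the first comparison is $\frac{1}{6} \le \frac{1}{4}$ and the second is $\frac{1}{4} \ge \frac{1}{6}$.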

9.45. [28] Suppose that $f$ is positive and has two continuous derivatives on $[a,b]$, with each of $f$ and $1/f$ convex. Use Chebyshev’s Integral Inequality (Exercise 9.44) to show that

$$\int_a^b \left( \frac{f'(x)}{f(x)} \right)^2 dx \le \frac{1}{b-a} \cdot \frac{\big( f(b) - f(a) \big)^2}{f(a)f(b)}.$$

9.46. (a) Exercise 9.44 contains Chebyshev’s Integral Inequality. Prove the weighted version of Chebyshev’s Integral Inequality: Let $F$ and $G$ be continuous on $[a,b]$ with either both increasing or both decreasing. Let $w > 0$ be continuous on $[a,b]$. Then

And the inequality is reversed if $F$ and $G$ have opposite monotonicity. (A sum version of this is the content of Exercise 6.40.)
(b) Use this to prove the Cauchy–Schwarz Integral Inequality (Corollary 9.28). Hint: Set $w = g^2$, and $F = G = f/g$. (A sum version of this is also contained in Exercise 6.40.)

9.47. Let $f$ be continuous on $[a,b]$ with $m \le f \le M$, and set $A = \frac{1}{b-a} \int_a^b f(x)\,dx$.
(a) Verify that

(b) Conclude that $\frac{1}{b-a} \int_a^b (f(x) - A)^2\,dx \le (M - A)(A - m)$.
(c) Show that the inequality in (b) is better than (that is, is a refinement of) the integral version of Popoviciu’s Inequality:

$$\frac{1}{b-a} \int_a^b (f(x) - A)^2\,dx \le \frac{1}{4}(M - m)^2.$$

Hint: Show that for q < Q; the quadratic .Qx/.x q/ is maximized precisely when x D 12 .Q C q/: (A sum version of this is the content of Exercise 2.57.)9.48. [20](a) Look carefully at Exercise 9.47. Extend the ideas there to prove Grüss’s Integral Inequality: Let f; g be continuous on Œa; b; with m f M and g : Then

For example, $M_1$ is the average value of $f$ and $M_2$ is called the Root Mean Square.
(a) Apply Jensen’s Integral Inequality (Theorem 9.29) with $\varphi(x) = x^{r/s}$ to show that $M_s < M_r$ if $s < r$.
(b) What is a reasonable way to define $M_0$?
(c) What is a reasonable definition of $M_\infty$?
(d) How might you define weighted Power Means?
(A sum version of this exercise is the content of Exercise 8.48 and another approach is the content of Exercise 10.18.)

9.52. [20] Let $f$ and $g$ be continuous on $[0,1]$, with $f$ decreasing and $0 \le g \le 1$. Let $\lambda = \int_0^1 g(x)\,dx$. Steffensen’s Inequalities are:

$$\int_{1-\lambda}^{1} f(x)\,dx \le \int_0^1 f(x)g(x)\,dx \le \int_0^{\lambda} f(x)\,dx.$$

(a) By considering $1 - g(x)$, show that the left-hand inequality follows from the right-hand inequality.

(b) Prove the right-hand inequality. Hint: Verify that

$$\int_0^{\lambda} f(x)\,dx - \int_0^1 f(x)g(x)\,dx = \int_0^{\lambda} \big( 1 - g(x) \big) f(x)\,dx - \int_{\lambda}^{1} f(x)g(x)\,dx \ge f(\lambda) \int_0^{\lambda} \big( 1 - g(x) \big)\,dx - \int_{\lambda}^{1} f(x)g(x)\,dx,$$

then show that this is $\ge 0$.

(c) Prove the left-hand inequality directly, that is, without using (a).
(d) How would Steffensen’s Inequalities read on $[a,b]$?

Z bGenerally however, computing integrals f .x/ dx which routinely appear in aproblems with no apparent notion of area or of average value in sight, can bevery difficult. Coming to the rescue in many cases is the Fundamental Theoremof Calculus. With it, many more definite integrals can be computed relatively easily.But this—the most important theorem in all of calculus—gives us a great deal more.

$f'(t)\,dt$. $\square$

The Fundamental Theorem (Theorem 10.1) contains the astonishing fact that if $f'$ is continuous, then the operations of differentiation and (definite) integration are inverses of one another, except for the "$f(a)$" which appears in Part (ii). So virtually any useful or interesting fact about derivatives corresponds to a useful or interesting fact about integrals, and vice versa. Here is a good example.

Example 10.2. We show how the Mean Value Theorem (Theorem 5.2) yields the Average Value Theorem (Theorem 9.10), and vice versa, by way of the Fundamental Theorem (Theorem 10.1). For $f$ continuous on $[a,b]$, the Fundamental Theorem Part (i) says in particular that $F(x) = \int_a^x f(t)\,dt$ is differentiable. Then the Mean Value Theorem applied to $F$ says that there is $c \in (a,b)$ such that

which is the Mean Value Theorem. In this latter analysis, we require that $f'$ is continuous because if $f$ is only differentiable, then $\int_a^b f'(x)\,dx$ may not be defined. And even if it were defined, the Average Value Theorem might not be applicable. A function $f$ for which $f'$ is continuous is often called continuously differentiable. $\Diamond$

Example 10.3. Here is the integral analogue of Cauchy's Mean Value Theorem (Theorem 5.11). For $f$ and $g$ continuous on $[a,b]$, Cauchy's Mean Value Theorem applied to

says that there is $c \in (a,b)$ such that

This is Cauchy's Mean Value Theorem for Integrals. $\Diamond$

For $f(b) - f(a)$ the notation $f(t)\big|_a^b$ is commonly used. This way, Part (ii) of the Fundamental Theorem with $x = b$ is written

\[ \int_a^b f'(t)\,dt = f(t)\Big|_a^b. \]

As we saw in the proof of Part (ii), an antiderivative $f$ of a given $f'$ is not unique: $f + C$ is also an antiderivative of $f'$ for any $C \in \mathbb{R}$. But we also saw in the proof that whichever constant $C$ is chosen does not matter, since the $C$'s cancel out. So in practice one usually chooses $C = 0$. The string of symbols

\[ \int f(t)\,dt \]

is used to denote an antiderivative of $f$, but when written this way it is referred to as an indefinite integral. It is the Fundamental Theorem which makes this terminology and notation reasonable; indeed, Part (i) tells how to produce an antiderivative of a continuous function. And again, since any two indefinite integrals of $f$ differ by a constant, the expression

\[ \int f(t)\,dt + C \]

is used to denote all indefinite integrals of $f$, that is, all antiderivatives of $f$.

The Fundamental Theorem Part (ii) (Theorem 10.1) contains another astonishing fact: for a continuous function $f$, evaluating a definite integral $\int_a^b f(x)\,dx$ (which, remember, is defined in terms of averages or area) comes down to the apparently very different problem of finding an antiderivative for $f$.

Example 10.4. In Example 9.7 we showed that

\[ \int_a^b \sin(x)\,dx = \cos(a) - \cos(b), \]

by considering an appropriate sequence of Riemann sums. But $(-\cos(x))' = \sin(x)$ and so by the Fundamental Theorem (Theorem 10.1),

\[ \int_a^b \sin(x)\,dx = -\cos(x)\Big|_a^b = -\bigl(\cos(b) - \cos(a)\bigr) = \cos(a) - \cos(b). \]

Likewise, since $(\sin(x))' = \cos(x)$,

\[ \int_a^b \cos(x)\,dx = \sin(x)\Big|_a^b = \sin(b) - \sin(a). \qquad \Diamond \]
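The agreement between the two approaches in Example 10.4 can be watched numerically. The sketch below is illustrative only: the endpoints $a = 0.5$, $b = 2$ and the helper name `riemann_sum` are assumptions made for the demonstration, and right-endpoint sums are just one convenient choice of Riemann sum.

```python
import math

def riemann_sum(func, a, b, n):
    """Right-endpoint Riemann sum of func on [a, b] with n equal subintervals."""
    dx = (b - a) / n
    return dx * sum(func(a + j * dx) for j in range(1, n + 1))

a, b = 0.5, 2.0                        # hypothetical endpoints
exact = math.cos(a) - math.cos(b)      # value given by the Fundamental Theorem

# The Riemann sums approach the antiderivative value as n grows.
for n in (10, 100, 1000):
    print(n, riemann_sum(math.sin, a, b, n), exact)
```

The Riemann sums need more and more terms for each extra digit, whereas the antiderivative gives the value in one step.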

Example 10.5. For $r \in \mathbb{R}$ with $r \ne -1$ we have $\left(\dfrac{x^{r+1}}{r+1}\right)' = x^r$, by the Power Rule. So by the Fundamental Theorem,

Example 10.6. Taking $r = -p \ne -1$ in Example 10.5,

This limit exists if $p > 1$, and it diverges to $+\infty$ if $p < 1$. Also from Example 10.5,

\[ \lim_{N \to \infty} \int_1^N \frac{1}{x}\,dx = \lim_{N \to \infty} \ln(N), \]

which diverges to $+\infty$. Therefore, by the Integral Test (Theorem 9.24), the $p$-series

\[ \sum_{n=1}^{\infty} \frac{1}{n^p} \quad \text{converges if and only if } p > 1. \]

(We obtained this result differently in Sect. 6.7.) For example, the Harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges to $+\infty$ and $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges. We will see in Theorem 12.7 that in fact,

\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} \cong 1.645. \]
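The contrast between $p = 1$ and $p = 2$ is easy to observe in partial sums. This is a rough numerical sketch, with the helper name `p_series_partial_sum` and the chosen numbers of terms being illustrative assumptions; partial sums can suggest, but never prove, convergence or divergence.

```python
import math

def p_series_partial_sum(p, terms):
    """Sum of 1/n^p for n = 1, ..., terms."""
    return sum(1.0 / n ** p for n in range(1, terms + 1))

# Harmonic partial sums keep growing (roughly like ln(N)),
# while the p = 2 partial sums settle near pi^2 / 6.
for terms in (10, 1000, 100000):
    print(terms, p_series_partial_sum(1, terms), p_series_partial_sum(2, terms))

print("pi^2/6 =", math.pi ** 2 / 6)
```

The slow logarithmic growth of the Harmonic partial sums is exactly what the comparison with $\int_1^N \frac{1}{x}\,dx = \ln(N)$ predicts.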

Being the first to find the sum of this series was one of Euler's many triumphs. $\Diamond$

Example 10.7. We have met several times, beginning with Example 6.11 (line (6.7)), the useful inequality:

\[ \ln(x) \le x - 1 \quad \text{for } x > 0. \tag{10.1} \]

Here is a way to obtain it using integrals. Again by Example 10.5,

\[ \int_1^x \frac{1}{t}\,dt = \ln(x) - \ln(1) = \ln(x). \]

Now if $x > 1$, then $\frac{1}{t} \le 1$ on $[1, x]$ and so

\[ \ln(x) = \int_1^x \frac{1}{t}\,dt \le \int_1^x 1\,dt = x - 1. \]

If $0 < x < 1$, then $\frac{1}{t} \ge 1$ on $[x, 1]$ and so

\[ \int_x^1 \frac{1}{t}\,dt \ge \int_x^1 1\,dt = 1 - x. \]

Therefore

\[ \ln(x) = \int_1^x \frac{1}{t}\,dt = -\int_x^1 \frac{1}{t}\,dt \le -\int_x^1 1\,dt = x - 1. \]

Either way, $\ln(x) \le x - 1$ for $x > 0$. $\Diamond$
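The argument of Example 10.7 can be mirrored numerically by approximating $\int_1^x \frac{1}{t}\,dt$ directly and comparing with $x - 1$. The sketch below assumes a midpoint rule with a fixed number of steps; the helper name `log_by_integral` and the sample points are hypothetical choices for illustration.

```python
import math

def log_by_integral(x, steps=2000):
    """Midpoint-rule approximation of the integral of 1/t from 1 to x, for x > 0."""
    width = (x - 1.0) / steps          # negative when 0 < x < 1, which flips the sign
    return width * sum(1.0 / (1.0 + (i + 0.5) * width) for i in range(steps))

# Columns: x, integral approximation, math.log(x), and the bound x - 1.
for x in (0.1, 0.5, 1.0, 2.0, 10.0):
    print(x, log_by_integral(x), math.log(x), x - 1.0)
```

On both sides of $x = 1$ the second column stays at or below the last one, with equality only at $x = 1$.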

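Example 10.8. Sometimes a complicated limit can be viewed as the limit of a Riemann sum, so that evaluating the limit comes down to evaluating a definite integral. For example, let us consider the limit (for $k \ne -1$)

\[ \lim_{n \to \infty} \frac{1}{n}\,\frac{1^k + 2^k + \cdots + n^k}{n^k}. \]

We write

\[ \frac{1}{n}\,\frac{1^k + 2^k + \cdots + n^k}{n^k} = \frac{1}{n}\left[\left(\frac{1}{n}\right)^k + \left(\frac{2}{n}\right)^k + \cdots + \left(\frac{n}{n}\right)^k\right], \]

and notice that this is a Riemann sum for $\int_0^1 x^k\,dx$. Therefore

\[ \lim_{n \to \infty} \frac{1}{n}\,\frac{1^k + 2^k + \cdots + n^k}{n^k} = \int_0^1 x^k\,dx = \frac{1}{k+1}. \qquad \Diamond \]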
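The limit in Example 10.8 can be checked for a particular exponent. The sketch below takes $k = 3$, an assumed value chosen so that $x^k$ is continuous on $[0,1]$ and the Riemann sum interpretation applies without any care at $x = 0$; the helper name `scaled_power_sum` is hypothetical.

```python
def scaled_power_sum(k, n):
    """(1/n) * (1^k + 2^k + ... + n^k) / n^k, written as a right-endpoint Riemann sum."""
    return sum((j / n) ** k for j in range(1, n + 1)) / n

k = 3                                   # hypothetical exponent
for n in (10, 100, 10000):
    print(n, scaled_power_sum(k, n), 1 / (k + 1))
```

The values approach $1/(k+1)$ from above, as one expects of right-endpoint sums for an increasing integrand.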

Example 10.9. We have seen that for a function $u$ which is differentiable and never zero,

Integrating from $0$ to $u > -1$ and using Example 10.5, we get

Now for $u \ge 0$ and $x \in [0, u]$, we have $1 \le 1 + x \le 1 + u$, so that

\[ \frac{x^{n+1}}{1+u} \le \frac{x^{n+1}}{1+x} \le x^{n+1}. \]

Integrating with respect to $x$ from $0$ to $u$, we get

\[ \frac{1}{1+u}\,\frac{u^{n+2}}{n+2} \le \int_0^u \frac{x^{n+1}}{1+x}\,dx \le \frac{u^{n+2}}{n+2}. \]

For $-1 < u < 0$ and $x \in [u, 0]$, a similar analysis leads to the same inequalities, but reversed. So either way, if $-1 < u \le 1$, each of the right-hand and left-hand sides $\to 0$ as $n \to \infty$. Therefore we are justified in writing (for $-1 < u \le 1$):

Now for $u > 0$ and $t \in [0, u]$, we have $1 \le 1 + t^2 \le 1 + u^2$, so that

\[ \frac{t^{2n+2}}{1+u^2} \le \frac{t^{2n+2}}{1+t^2} \le t^{2n+2}. \]

Integrating with respect to $t$ from $0$ to $u$, we get

\[ \frac{1}{1+u^2}\,\frac{u^{2n+3}}{2n+3} \le \int_0^u \frac{t^{2n+2}}{1+t^2}\,dt \le \frac{u^{2n+3}}{2n+3}. \]

For $u < 0$ and $t \in [u, 0]$, a similar analysis leads to the same inequalities, but reversed. So either way, if $-1 \le u \le 1$, each of the right-hand and left-hand sides $\to 0$ as $n \to \infty$. Therefore we are indeed justified in writing (for $-1 \le u \le 1$):

\[ \arctan(u) = u - \frac{1}{3}u^3 + \frac{1}{5}u^5 - \frac{1}{7}u^7 + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}\,u^{2n+1}. \]

Taking $u = 1$ in this, the Leibniz series, we get the Gregory-Leibniz series, named also for the Scottish mathematician James Gregory (1638–1675):

\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \frac{1}{9} - \frac{1}{11} + \cdots = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}. \qquad \Diamond \]
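The integral estimate above says that stopping the arctangent series after the term with index $n$ leaves an error of at most $|u|^{2n+3}/(2n+3)$. The following sketch compares the actual error with that bound at $u = 1$, the slowest case; the helper name `arctan_partial_sum` is hypothetical, and floating-point rounding means that very small errors cannot be resolved this way.

```python
import math

def arctan_partial_sum(u, n):
    """Sum of (-1)^j * u^(2j+1) / (2j+1) for j = 0, ..., n."""
    return sum((-1) ** j * u ** (2 * j + 1) / (2 * j + 1) for j in range(n + 1))

u = 1.0                                  # Gregory-Leibniz case
for n in (5, 50, 500):
    error = abs(math.atan(u) - arctan_partial_sum(u, n))
    bound = abs(u) ** (2 * n + 3) / (2 * n + 3)
    print(n, 4 * arctan_partial_sum(u, n), error, bound)
```

At $u = 1$ the bound decays only like $1/(2n+3)$, which is why the Gregory-Leibniz series is a poor practical route to $\pi$.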

Remark 10.14. A result similar to the Integral Test (Theorem 9.24) is used in [2] to obtain sums of rearrangements of various alternating series. For example, taking three positive terms then two negative terms and so on, in the Gregory-Leibniz series, one has:

\[ 1 + \frac{1}{5} + \frac{1}{9} - \frac{1}{3} - \frac{1}{7} + \frac{1}{13} + \frac{1}{17} + \frac{1}{21} - \frac{1}{11} - \cdots = \frac{\pi}{4} + \frac{1}{4}\ln\!\left(\frac{3}{2}\right). \]

See also [7]. And taking three positive terms then two negative terms and so on, in the Alternating Harmonic series, one has:

\[ 1 + \frac{1}{3} + \frac{1}{5} - \frac{1}{2} - \frac{1}{4} + \frac{1}{7} + \frac{1}{9} + \frac{1}{11} - \frac{1}{6} - \cdots = \ln(2) + \frac{1}{2}\ln\!\left(\frac{3}{2}\right). \]

In these formulas, replacing the 3 positive terms and 2 negative terms with $m$ positive terms and $n$ negative terms respectively, we get $\ln\!\left(\frac{m}{n}\right)$ instead of the $\ln\!\left(\frac{3}{2}\right)$. $\circ$
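The two rearrangement formulas of Remark 10.14 can be tested by summing many blocks of three positive and two negative terms. This is a numerical sketch only: the helper `rearranged_sum`, its argument names, and the block count are assumptions made for illustration, and the partial sums converge slowly (the error shrinks roughly like one over the number of blocks).

```python
import math

def rearranged_sum(positive, negative, m, n, blocks):
    """Add m positive terms, subtract n negative-term magnitudes, and repeat."""
    total, p, q = 0.0, 0, 0
    for _ in range(blocks):
        total += sum(positive(p + i) for i in range(m))
        total -= sum(negative(q + i) for i in range(n))
        p, q = p + m, q + n
    return total

# Gregory-Leibniz series: positive terms 1/(4i+1), negative terms of size 1/(4i+3).
gl = rearranged_sum(lambda i: 1 / (4 * i + 1), lambda i: 1 / (4 * i + 3), 3, 2, 20000)
# Alternating Harmonic series: positive terms 1/(2i+1), negative terms of size 1/(2i+2).
ah = rearranged_sum(lambda i: 1 / (2 * i + 1), lambda i: 1 / (2 * i + 2), 3, 2, 20000)

print(gl, math.pi / 4 + 0.25 * math.log(3 / 2))
print(ah, math.log(2) + 0.5 * math.log(3 / 2))
```

Changing the arguments `3, 2` to other values of $m$ and $n$ gives a way to experiment with the $\ln(m/n)$ term mentioned in the remark.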

10.2 The Natural Logarithmic and Exponential Functions Again

In Chap. 6 we defined the exponential function and then showed that its inverse exists; this is the natural logarithmic function. An alternative way is to define the natural logarithmic function and then show that its inverse exists; this is the exponential function. Here we outline the latter approach, made possible by the Fundamental Theorem of Calculus (Theorem 10.1).

Let $a, b > 0$. The Power Rule for Rational Powers does not apply to $\int_a^b x^n\,dx$ when $n = -1$, but the integral still makes sense for $n = -1$. Therefore we can define the natural logarithmic function $\ln(x)$ by

\[ \ln(x) = \int_1^x \frac{1}{t}\,dt, \quad \text{for } x > 0. \]

The integrand is positive and so $\ln(x)$ is an increasing function, with $\ln(x) < 0$ for $x \in (0, 1)$, $\ln(1) = 0$, and $\ln(x) > 0$ for $x \in (1, \infty)$. By its very definition, $\ln(x)$ is differentiable by the Fundamental Theorem (Theorem 10.1), and that theorem gives

set $x = 1$. Now set $x = a$.

For (iv) we have observed already that $\frac{1}{x} > 0$ for $x > 0$ and so $\ln(x)$ is in fact strictly increasing.

For (v), notice that $\ln(2^n) = n\ln(2)$. Now $\ln(x)$ is increasing, so $\ln(x) > n\ln(2)$ for $x > 2^n$. Therefore, since $\ln(2) > 0$, we can make $\ln(x)$ as large as we please, by taking $x$ large.

For (vi), we simply write $\ln(x) = -\ln(1/x)$ and appeal to (v). $\square$

Since $\ln(x)$ is continuous (it is differentiable) and $\ln(1) = 0$, by Lemma 10.15 part (v) and the Intermediate Value Theorem (Theorem 3.17) we may define the number $e > 1$ as that number for which $\ln(e) = 1$. That is,

\[ \ln(e) = \int_1^e \frac{1}{t}\,dt = 1. \]
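The definition of $e$ as the solution of $\int_1^e \frac{1}{t}\,dt = 1$ can be turned into a computation. The sketch below is one possible illustration, not the method of the text: it approximates the integral by a midpoint rule and locates the solution by bisection, which works because the integral is increasing in its upper limit. The helper names, the bracket $[2, 4]$, and the step counts are all assumptions.

```python
import math

def ln_by_integral(x, steps=4000):
    """Midpoint-rule approximation of the integral of 1/t from 1 to x, for x > 0."""
    width = (x - 1.0) / steps
    return width * sum(1.0 / (1.0 + (i + 0.5) * width) for i in range(steps))

def solve_ln_equals_one(lo=2.0, hi=4.0, iterations=40):
    """Bisection for the x with ln(x) = 1, assuming ln(lo) < 1 < ln(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if ln_by_integral(mid) < 1.0:
            lo = mid                    # the solution lies to the right of mid
        else:
            hi = mid                    # the solution lies at or to the left of mid
    return (lo + hi) / 2.0

print(solve_ln_equals_one(), math.e)
```

The Intermediate Value Theorem guarantees that the bracket contains a solution, and strict monotonicity guarantees that bisection closes in on the only one.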

This number is unique, by Rolle's Theorem (Theorem 5.1) and Lemma 10.15 part (iv). Then by Lemma 10.15 parts (iv)–(vi), we see that $f(x) = \ln(x)$ has an inverse function, defined on $(-\infty, \infty)$, with range $(0, \infty)$. It is denoted by $\exp(x)$. Since $\ln(e) = 1$, we have $\exp(1) = e$. And since $\ln(1) = 0$, we have $\exp(0) = 1$.

Now any property of $\ln(x)$ gives rise to a property of $\exp(x)$, since the latter is the inverse of the former. We list those, in order, which correspond to the properties listed in Lemma 10.15. We leave their proofs as an exercise.

Lemma 10.16. The function $\exp(x)$ has the following properties:
(i) $\exp(a + b) = \exp(a)\exp(b)$ for $a, b \in \mathbb{R}$,
(ii) $\exp(a - b) = \exp(a)/\exp(b)$ for $a, b \in \mathbb{R}$,
(iii) $(\exp(a))^r = \exp(ar)$ for $a \in \mathbb{R}$ and $r \in \mathbb{Q}$,
(iv) $\exp(x)$ is a strictly increasing function,
(v) $\exp(x) \to \infty$ as $x \to \infty$,
(vi) $\exp(x) \to 0$ as $x \to -\infty$.

Proof. This is Exercise 10.43. $\square$

We now show that $\exp(t)$ is continuous at every $t \in \mathbb{R}$. (Or we could apply Exercise 4.16.) First, using Lemma 10.16 (i) or (ii), observe that

\[ \exp(t) - \exp(s) = \exp(t)\,[1 - \exp(s - t)]. \]

Therefore, since $\exp(0) = 1$, it suffices to show that $\exp(t)$ is continuous at $t = 0$.

We use the inequality (10.1), which we obtained using integrals in Example 10.7:

\[ \ln(t) \le t - 1 \quad \text{for } t > 0. \]

Replacing $t$ with $\exp(t)$ in this inequality, we get

\[ 1 + t \le \exp(t) \quad \text{for } t \in \mathbb{R}. \]

Notice that $\exp(t)$ is increasing, so

\[ 1 + t \le \exp(t) \le 1 \quad \text{for } t \le 0. \]

This shows that $\exp(t)$ is continuous from the left at $t = 0$.

Now replacing $t$ with $1/t$ in (10.1) we get

\[ 1 - \frac{1}{t} < \ln(t) \quad \text{for } t > 0. \]

And replacing $t$ with $\exp(t)$ here, we get

\[ \exp(t) < 1 + t\exp(t) \quad \text{for } t > 0. \]

Again $1 + t \le \exp(t)$ and $\exp(t)$ is increasing, so

\[ 1 + t \le \exp(t) < 1 + t\exp(t) \le 1 + te \quad \text{for } 0 < t \le 1. \]

This shows that $\exp(t)$ is continuous from the right at $t = 0$. So $\exp(t)$ is indeed continuous at $t = 0$.

In view of Lemma 10.15 (iii), $a^r = \exp(\ln(a^r)) = \exp(r\ln(a))$ for any $a > 0$ and any rational number $r$. But since $\exp(x)$ is continuous, for any $a > 0$ and any real number $x$ it is reasonable to define

\[ e^x = \exp(x\ln(e)) = \exp(x). \]

This being the case, it is most customary to denote the inverse of $\ln(x)$ by $e^x$ instead of $\exp(x)$. This function is called the exponential function. It satisfies

\[ e^{\ln(x)} = x \ \text{ for } x > 0 \quad \text{and} \quad \ln(e^x) = x \ \text{ for } x \in \mathbb{R}. \]

The graphs of $\ln(x)$ and $e^x$ are shown in Fig. 10.1.

Fig. 10.1 The graphs of $y = \ln(x)$ and its inverse, $y = e^x$. Each is the graph of the other, reflected in the line $y = x$

The relationship $\ln(e^x) = x$ seems to imply, by the Chain Rule, that $\frac{1}{e^x}(e^x)' = 1$ and so $(e^x)' = e^x$. But this only shows that if $(e^x)'$ exists, then it equals $e^x$. One really must show first that $(e^x)'$ exists. This can be done using Exercise 4.16, but here we do so more directly. We show that $e^x$ is differentiable at every $x \in \mathbb{R}$, and we find its derivative while doing so. In (10.1), we replace $x$ with each of $e^{x-y}$ and $e^{y-x}$, to get

10.4. Assume the hypotheses which yield Cauchy's Mean Value Theorem for Integrals (Example 10.3). Show how, if we assume further that one of the two functions is never zero, we can obtain the Mean Value Theorem for Integrals (Theorem 9.14).

10.5. [41] Let $f$ and $g$ be continuous on $[a,b]$, with $g$ nonnegative and not identically zero. Set

\[ F(x) = \int_a^x f(t)g(t)\,dt \quad \text{and} \quad G(x) = \int_a^x g(t)\,dt. \]

\[ H(x) = F(x)G(b) - F(b)G(x) \]

to obtain the Mean Value Theorem for Integrals (Theorem 9.14). (The function $H(x)$ is $\pm$ twice the area of the triangle determined by the points $(0,0)$, $(F(x), F(b))$, and $(G(x), G(b))$. See Exercise 1.12.)

10.6. [51] Let $f$ be continuous on $[1,2]$, with $\int_1^2 f(x)\,dx = 0$. Show that there is $c \in (1,2)$ such that

(b) Now use the fact that f is continuous.

10.8. Here is another proof of Part (i) of the Fundamental Theorem (Theorem 10.1).

(a) Again set $F(x) = \int_a^x f(t)\,dt$. Use the Extreme Value Theorem (Theorem 3.23) to show that there are $t_m$ and $t_M$ (depending on $x$ and $h$) such that $f(t_m) \le f(t) \le f(t_M)$ for all $t \in [x, x+h]$.
(b) Show that $h f(t_m) \le F(x+h) - F(x) \le h f(t_M)$, then use the continuity of $f$.

10.9. [6, 9] Fill in the details of the following proof of Part (ii) of the Fundamental Theorem (Theorem 10.1), which does not rely on Part (i). With $f'$ continuous on $[a,b]$, let $a = x_0 < x_1 < \cdots < x_{N-1} < x_N = b$, with $x_{j+1} - x_j = \frac{b-a}{N}$ for each $j$.

(a) Verify that

\[ f(b) - f(a) = \sum_{j=1}^{N} \bigl[f(x_j) - f(x_{j-1})\bigr]. \]

(b) Apply the Mean Value Theorem (Theorem 5.2) to each term in the sum.
(c) Let $N \to \infty$ in a suitable way.

10.10. Let $f(t) = 0$ for $t < 0$, and $f(t) = 1$ for $t \ge 0$.

(a) For $a < 0$, compute $F(x) = \int_a^x f(t)\,dt$.
(b) Is $F$ differentiable?
(c) How, if at all, does this fit into the Fundamental Theorem (Theorem 10.1)?

10.11. [16] Use the Fundamental Theorem (Theorem 10.1) to show that if $f$ is continuous on $\mathbb{R}$ and periodic with period $T$, then $\int_a^{a+T} f(x)\,dx$ is independent of $a$. Conclude that

\[ \int_a^{a+T} f(x)\,dx = \int_0^T f(x)\,dx. \]

10.12. Let $f$ be a function defined (for simplicity) on all of $\mathbb{R}$. Then $f$ is an odd function if $f(-x) = -f(x)$ for all $x$ and $f$ is an even function if $f(-x) = f(x)$ for all $x$.

(a) Show that $x$, $x^3$ and $\sin(x)$ are odd, while $1$, $x^2$, and $\cos(x)$ are even.
(b) Use the Fundamental Theorem (Theorem 10.1) to show that if $f$ is odd and continuous then $\int_{-a}^{a} f(x)\,dx = 0$ for all $a$.
(c) Use the Fundamental Theorem (Theorem 10.1) to show that if $f$ is even and continuous then $\int_{-a}^{a} f(x)\,dx = 2\int_0^a f(x)\,dx$ for all $a$.
(d) Show that any function defined on $\mathbb{R}$ can be written as the sum of an odd function and an even function. (For example, $e^x = \sinh(x) + \cosh(x)$.)

10.13. [18] Show that

\[ \lim_{n \to \infty} \left( \int_0^1 \bigl(1 + x^n\bigr)^n\,dx \right)^{1/n} = 2. \]

Hint: First show that $\int_0^1 (1 + x^n)^n\,dx \le 2^n$. Then use the AGM Inequality (Theorem 2.10) to show that

\[ \int_0^1 \bigl(1 + x^n\bigr)^n\,dx \ge 2^n \int_0^1 x^{n^2/2}\,dx = 2^n\,\frac{2}{n^2 + 2}. \]

10.14. [5] We have seen that the Harmonic series diverges, so for each $n = 1, 2, 3, \ldots$ there is a least positive integer $a_n$ such that

(b) Use (a) to find the length of the curve $y = \frac{1}{2}\ln(\sin(2x))$ from $x = \pi/6$ to $x = \pi/3$.

10.22. [13] Let $f$ be defined on $[a,b]$ with $f'$ continuous there, and let $P$ and $Q$ be points on the graph of $f$. Denote by $L(P,Q)$ the length of the curve $y = f(x)$ from $P$ to $Q$ and denote by $D(P,Q)$ the length of the chord from $P$ to $Q$. Show that

10.42. (a) Set $x = 1/\sqrt{3}$ in the Leibniz series (Example 10.13) to obtain a series for $\pi/6$.
(b) Set $\tan\alpha = 1/2$ and $\tan\beta = 1/3$ in the trigonometric identity

\[ \tan(\alpha + \beta) = \frac{\tan\alpha + \tan\beta}{1 - \tan\alpha\,\tan\beta} \]

to obtain Euler's formula

\[ \frac{\pi}{4} = \arctan\!\left(\frac{1}{2}\right) + \arctan\!\left(\frac{1}{3}\right). \]

(c) Use Euler's formula to obtain another series for $\pi/4$, which converges much more quickly than does the Gregory-Leibniz series (Example 10.13).

Note: Machin's formula, obtained in 1706 by John Machin,

\[ \frac{\pi}{4} = 4\arctan\!\left(\frac{1}{5}\right) - \arctan\!\left(\frac{1}{239}\right), \]

yields a series which converges even more quickly. William Shanks, around 1873, used Machin's formula to compute $\pi$ to 707 decimal places. It took him 15 years. It was discovered in the 1950s with the aid of contemporary computers that his computation was incorrect at the 528th decimal place.

10.43. Prove Lemma 10.16.

10.44. [12, 33] (a) Show that $f(x) = c\ln(x)$, for $x > 0$ and arbitrary $c \in \mathbb{R}$, is the only continuous function which satisfies

Hint: Show that $\ln\!\left(\dfrac{\sqrt[n]{n!}}{n}\right) = \dfrac{1}{n}\sum_{k=1}^{n} \ln\!\left(\dfrac{k}{n}\right)$, and recognize this as a Riemann sum related to (a).

(c) Denote by $A_n$ the Arithmetic Mean and by $G_n$ the Geometric Mean, of the first $n$ natural numbers. Show that the result in (b) is the same as

\[ \lim_{n \to \infty} \frac{G_n}{A_n} = \frac{2}{e}. \]

Other methods can be found in [29, 47, 54], and are generalized in [26] and [48]. (Also, cf. Exercise 6.12.)

10.48. Fill in the details of the following argument, which shows how to obtain Jensen's Inequality (Theorem 8.17) from Steffensen's Inequality (Exercise 9.52): Let $f$ and $g$ be continuous on $[a,b]$, with $f$ increasing, $0 \le g \le 1$, and $\lambda = \int_a^b g(x)\,dx$. Then

\[ \int_a^{a+\lambda} f(x)\,dx \le \int_a^b f(x)g(x)\,dx. \]

(a) Let $a = x_0 \le x_1 \le x_2 \le \cdots \le x_n$ and let $w_1, \ldots, w_n$ be positive, with $\sum_{j=1}^{n} w_j = 1$. Define $g$ on $[a, x_n]$ via

\[ g = \sum_{j=1}^{k} w_j \quad \text{on } [x_{k-1}, x_k], \quad \text{for } k = 1, 2, \ldots, n. \]

Verify that $0 \le g \le 1$ and that

\[ \lambda = \int_0^{x_n} g(t)\,dt = \sum_{j=1}^{n} x_j w_j. \]

(b) If $f$ is convex then $f'' \ge 0$, so that $f'$ is increasing. Apply Steffensen's Inequality to $f'$ and $g$ as above, and use the Fundamental Theorem (Theorem 10.1).

10.49. [43, 44, 52] (a) Prove the following integral analogue of Flett's Mean Value Theorem (Theorem 7.4). Let $f$ be continuous on $[a,b]$ with $f(a) = f(b)$. Show that there is $c \in (a,b)$ such that

(a) Show that if $b$ is near enough to $a$, then $c$ is unique.

(b) Evaluate

\[ \lim_{b \to a} \frac{\int_a^b f(x)\,dx - (b-a)f(a)}{(b-a)^2} \]

two different ways, to show that

\[ \lim_{b \to a} \frac{c - a}{b - a} = \frac{1}{2}. \]

(This result is generalized to the case where $f'(a) = 0$ in [1, 19, 40], and in other ways in [23, 36].)

(c) Draw a picture which shows that the result in (b) is really not very surprising; remember that $f'(a) \ne 0$.

10.51. [44] Let $f$ be continuously differentiable on $[a,b]$ with $f''(a) \ne 0$. Let $c \in [a,b]$ be as given by the Mean Value Theorem (Theorem 5.2):

\[ \frac{f(b) - f(a)}{b - a} = f'(c). \]

(a) Show that if $b$ is near enough to $a$, then $c$ is unique.

(b) Use the Fundamental Theorem (Theorem 10.1) along with the result of Exercise 10.50 to show that if $f''$ is continuous and $f''(a) \ne 0$ then

\[ \lim_{b \to a} \frac{c - a}{b - a} = \frac{1}{2}. \]

(c) Prove the result in (b) above directly, that is, without the Fundamental Theorem. (All of this is generalized considerably in [35].)