Diary for Math 311, spring 2003

Let's analyze and prove one version of the Fundamental Theorem of
Calculus (FTC). This is discussed in section 7.3 of the text.

This version of FTC has to do with how the integral behaves as a
function of its upper parameter. Perhaps an example will make the
difficulties clearer. Let's look at a function defined in [0,3] by the
"piecewise" formula f(0)=2, f(x)=1 for 0<x<=1, f(x)=0 for
1<x<=3. We first observed that f was indeed Riemann integrable
on [0,3]. We could use partitions such as {0,B,H,C,3} where B is
slightly bigger than 0, H is slightly less than 1, and C is slightly
larger than 1. The difference between the resulting upper and lower
sums is "very small". And consideration of such sums allows one to
actually compute the Riemann integral. Of course f is also Riemann
integrable on subintervals of [0,3], so we could define F by
F(x) = int_0^x f for x in [0,3]. We can actually
"compute" F. First, F(0) = int_0^0 f must be 0,
because the width of the "subintervals" in any Riemann sum must be
0. Now if x is between 0 and 1 (or actually equal to 1) we can use the
partition {0,B,x} with B close to 0 to see that
int_0^x f is x (really, the width x-B is close
to x and the height, the sup of f on [B,x], is 1). Now if we have x>1
and x<=3, we can use a partition {0,B,H,C,x} as before to see
that for such x's, F(x) is 1. What about the derivative of F? Everyone
who has gone through a calc 1 class can tell that F is differentiable
for x in [0,1) and its derivative is 1, and it is also differentiable
in (1,3] with derivative 0. To the left below is a graph of F, and to the
right below is a graph of F'. I would like you to compare F' and
f. The Riemann integral, by the way, doesn't even "notice" the
discontinuity of f at 0. The Riemann integral locally averages the
behavior of f and reports that local average to F.
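This computation of F can be checked numerically. Here is a short Python sketch (my own illustration, not part of the course materials): it approximates F(x) = int_0^x f by a left-endpoint Riemann sum on a fine partition and compares with the formula F(x) = min(x,1) derived above.

```python
# Approximate F(x) = integral of f from 0 to x for the piecewise f above,
# using a left-endpoint Riemann sum, and compare with the computed formula
# F(x) = x on [0,1] and F(x) = 1 on (1,3].  (My illustration only.)

def f(x):
    if x == 0:
        return 2.0       # the isolated value f(0)=2
    if x <= 1:
        return 1.0       # f(x)=1 for 0 < x <= 1
    return 0.0           # f(x)=0 for 1 < x <= 3

def F_approx(x, n=100000):
    """Left-endpoint Riemann sum for the integral of f on [0, x]."""
    h = x / n
    return sum(f(i * h) for i in range(n)) * h

def F_exact(x):
    return x if x <= 1 else 1.0

for x in (0.5, 1.0, 2.0, 3.0):
    assert abs(F_approx(x) - F_exact(x)) < 1e-3
```

The isolated value f(0)=2 contributes only one term of width h to the sum, so it is invisible in the limit, exactly as the text says: the Riemann integral does not "notice" the discontinuity at 0.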

[graphs of F and of the derivative of F shown here]

In what follows, we will need to "recall" some facts about the Riemann
integral.

If f is Riemann integrable in an interval, then it is Riemann
integrable in any subinterval.
Proof: Discussed in the writeup for the lecture of 4/28/2003.

If f is Riemann integrable in an interval, then it is bounded, say
by M (so |f(x)|<=M for all x in the interval). Also the absolute
value of the integral of f on the interval is less than or equal to M
multiplied by the length of the interval.
Proof: Discussed in the writeup for the lecture of 4/24/2003.

Theorem (continuity of the integrated function) Suppose f is
Riemann integrable on [a,b]. Then f is Riemann integrable on [a,c] for
all c in [a,b], and if F(x) = int_a^x f for x in
[a,b], F is continuous on [a,b]. Indeed, if f is bounded by M, F
satisfies a Lipschitz condition with Lipschitz constant M: for all x
and y in [a,b], |F(x)-F(y)|<=M|x-y|.
Proof: If x<y, then F(y) = int_a^y f = int_a^x f + int_x^y f by
additivity on intervals (4/28 lecture). Since
F(x) = int_a^x f, we see that
F(y)-F(x) = int_x^y f. If we know that
-M<=f(t)<=M for all t, then -M(y-x)<=F(y)-F(x)<=M(y-x), so that
|F(x)-F(y)|<=M|x-y|.

Comment: the function g(x) which is 0 if x<0 and is sqrt(x) if
x>=0 does not satisfy a Lipschitz condition in any interval which
includes the "right side" of 0 because sqrt(x) doesn't satisfy the
Lipschitz property in such an interval (see the lecture on 4/14,
please). So this g can't be an F corresponding to any Riemann
integrable f. The same is true for other functions which don't
satisfy Lipschitz conditions. Generally, people expect that
"integrating" makes functions "smoother" and better behaved. Thus, we
go from Riemann integrable to Lipschitz. Along this line is the next
result, which says we go from continuity to differentiability.
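The failure of the Lipschitz property for sqrt near 0 is easy to see numerically. A quick Python check (mine, with arbitrary sample points): the difference quotient (sqrt(h)-sqrt(0))/(h-0) = 1/sqrt(h) grows without bound as h shrinks, so no single constant M can work.

```python
# The quotient |g(h)-g(0)| / |h-0| for g(x)=sqrt(x) is 1/sqrt(h),
# which blows up as h -> 0+; any proposed Lipschitz constant M
# is eventually exceeded.  (My quick check, not from the lecture.)
import math

quotients = [math.sqrt(h) / h for h in (1e-2, 1e-4, 1e-6)]
assert quotients[0] < quotients[1] < quotients[2]   # strictly growing
assert quotients[2] > 999                           # already beyond M=999
```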

Theorem (a version of FTC) Suppose f is Riemann integrable on
[a,b]. Then f is Riemann integrable on [a,c] for all c in [a,b], and
if F(x) = int_a^x f for x in [a,b] and f is
continuous at c, then F is differentiable at c and F'(c)=f(c).
Proof: Recall that F'(c) is the limit as h-->0 of (F(c+h)-F(c))/h.
This proof naturally separates into two cases, h>0 and
h<0. We'll do just the case h>0 (as in class).
Now
F(c+h) = int_a^{c+h} f = int_a^c f + int_c^{c+h} f = F(c) + int_c^{c+h} f,
so that F(c+h)-F(c) = int_c^{c+h} f. Since f is
continuous at c, given epsilon>0 we can find delta>0 so that when
|x-c|<delta, |f(x)-f(c)| must be less than
epsilon. So if |f(x)-f(c)|<epsilon we can "unroll" the inequality
to get -epsilon<f(x)-f(c)<epsilon. Therefore
f(c)-epsilon<f(x)<f(c)+epsilon. Both "ends" of this inequality
are constants, so if 0<h<delta and we integrate on the interval
[c,c+h] (every point of which is within delta of c), the result
for the ends is just multiplication by the length of the interval,
which is h. The result of integrating the central term is
F(c+h)-F(c). The result is
h(f(c)-epsilon)<=F(c+h)-F(c)<=h(f(c)+epsilon). Since h is
positive here, we know that
f(c)-epsilon<=(F(c+h)-F(c))/h<=f(c)+epsilon (division by a
positive number does not change the direction of the
inequalities). Now subtract f(c) to get
-epsilon<=[(F(c+h)-F(c))/h]-f(c)<=epsilon, which means
|[(F(c+h)-F(c))/h]-f(c)|<=epsilon. This will certainly hold if
0<h<delta.
A similar inequality can be proved for h<0. But this is precisely
what is meant by declaring that the limit as h-->0 of (F(c+h)-F(c))/h
exists and equals f(c). So we are done.
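The convergence of the difference quotient to f(c) can be watched numerically. A Python sketch (my illustration; f(x) = x^2 at c = 1 is an arbitrary choice of continuous integrand):

```python
# For f(x) = x**2 the function F(x) = integral of f from 0 to x is
# approximated here by a left-endpoint Riemann sum; the difference
# quotient (F(c+h)-F(c))/h should approach f(c) = 1 as h shrinks.
# (My illustration, not from the lecture.)

def F(x, n=200000):
    h = x / n
    return sum((i * h) ** 2 for i in range(n)) * h   # left-endpoint sum

c = 1.0
for h in (0.1, 0.01, 0.001):
    quotient = (F(c + h) - F(c)) / h
    assert abs(quotient - c ** 2) < 2 * h    # error shrinks along with h
```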

Although this result is almost always used where f is continuous in
the whole interval, so F is differentiable in the whole interval with
F'=f, we actually don't "need" continuity of f. Here's a fairly simple
example of discontinuities in f not "noticed" by F. Suppose f(x) is 1
if x=1/n for some n in N and f(x)=0 otherwise. Then f is Riemann
integrable in any interval (!) and int_a^b f
is 0 for any a and b. Therefore a candidate F will always be 0 and
will be differentiable everywhere, and F'=f for x not equal to 1/n. F
doesn't notice f's values on a thin set like {1/n}.

Of course if we make f non-zero on a "thicker" set then we may run
into trouble. We have already seen examples where f is not Riemann
integrable as a result (f(x)=1 if x is rational, or even f(x)=x if x
is rational).

If we had also verified the Mean Value Theorem, then we would know
that two differentiable functions defined on the same interval with
the same derivative would differ by a constant. That would be enough
when combined with our version of FTC to prove that if G'=f on [a,b]
and f is Riemann integrable, then
int_a^b f = G(b)-G(a). This is the version of FTC
which is used everywhere in calculus and associated subjects.
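That evaluation form can be sanity-checked numerically. A small Python sketch (mine; cos and its antiderivative sin are an arbitrary example pair):

```python
# With f(x) = cos(x) and antiderivative G(x) = sin(x), a Riemann sum for
# the integral of f on [0, pi/2] should be close to G(pi/2) - G(0) = 1.
# (My sketch, not from the lecture.)
import math

a, b, n = 0.0, math.pi / 2, 100000
h = (b - a) / n
riemann = sum(math.cos(a + i * h) for i in range(n)) * h  # left-endpoint sum
assert abs(riemann - (math.sin(b) - math.sin(a))) < 1e-4
```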

The final exam is Tuesday, May 13, 12:00-3:00 PM in SEC 205,

and by popular demand (!?) I will have a

Review session on Saturday,
May 10, at 1 PM in Hill 525

I will also have office hours in Hill 542 on Monday, May 12, from 1
PM to 5 PM. Almost surely I'll be in my office most days next
week, and I will also respond to e-mail. I will try to produce some
review material to hand out on Monday. We will cover a version of the
Fundamental Theorem of Calculus in the last class.

Material related to what I discuss today is covered in the textbook in
section 5.6 (see 5.6.1 through 5.6.4) and in section 7.2 (see 7.2.7).

In probability one builds models of "chance". The cdf (cumulative
distribution function) of a random variable X, which is defined by
f(x)=probability{X<=x} contains most of the useful probability
information. Quantities such as the mean (expectation) and the
variance can be computed from it (usually involving various
integrals). The function I'd like to study today has the essential
properties of a cdf which are listed below. I won't discuss a
probability "model" that this function might come from.

1. f's values are in [0,1].

2. The limit of f(x) as x-->infinity is 1.

3. The limit of f(x) as x-->-infinity is 0.

4. If x<y, then f(x)<=f(y).

5. If a is in R, then the limit of f(x) as x-->a-
exists. In the language of 311, this is the limit of f(x) as
x-->a when the domain of f is (-infinity,a).

6. If a is in R, then the limit of f(x) as x-->a+
exists and equals f(a). In the language of 311, this is the limit of
f(x) as x-->a when the domain of f is (a,infinity).

Since this is Math 311, I should prove something. So I
will show that if property 4 holds (so if x<y, then f(x)<=f(y)),
then property 5 is true (if a is in R, then the limit of f(x)
as x-->a- exists).

Theorem Suppose for all x, y in R, if x<y, then
f(x)<=f(y). Then the limit of f(x) as x-->a- exists and
is equal to sup{f(x):x<a}.
Proof: Let's call the set {f(x) : x< a}, W, and call its sup,
S. Why should S exist? We will use the completeness axiom. W is not
empty (since, for example, f(a-33) is in W). W is bounded above, and
one upper bound is f(a) (this uses the increasing assumption, of
course). Therefore S exists using the completeness axiom on the set W,
which is non-empty and bounded above.

I claim that the stated limit exists. This is a one-sided limit, so
the following implication must be verified: if epsilon>0 is given,
then there is a delta>0 so that if a-delta<x<a, then
S-epsilon<f(x)<=S. Since S is sup{f(x) : x< a}, given
epsilon>0, there will be w in W so that S-epsilon<w<=S. But w
is a value of f, so that there is v<a with f(v)=w. Take delta to be
a-v. Then if a-delta<x<a, we know that a-(a-v)<x<a, so
v<x<a. Since f is increasing, we may "apply" f to this
inequality and get f(v)<=f(x)<=f(a). This means that
w<=f(x)<=f(a). But w>S-epsilon, and since x<a, f(x)<=S
because S is the sup of W. Now we have S-epsilon<f(x)<=S, which
is what we wanted and the proof is done.

Of course a similar statement is true about limits "on the right" with
sup replaced by inf.

Now I'll begin creating a weird example. We know that Q is
a countably infinite set (go back and look at the first day or two in this
course). In fact, Q intersect any interval of positive length
is countably infinite. Countably infinite means that there is a
bijection (a pairing, a function which is 1-to-1 and onto) between
N, the positive integers, and the set. So there is a bijection
B:N-->{elements of Q, the rational numbers, in the open
interval (0,1)}. Now remember that the
sum_{n=1}^infinity 1/2^n is 1. So what's
f(x), finally?

f(x) = the sum of 1/2^n over those n's which have B(n)<=x.

This is a weird definition. Since the range of B, the bijection, is
only the rationals in (0,1), if x is less than or equal to 0, there
are no B(n)'s less than or equal to x. Therefore the sum is
"empty", and the legal (?!) interpretation of an empty sum is 0. Thus
f(x)=0 for x<=0. Now if x>=1, all of the rationals in the
open interval (0,1) are below x, so f(x) must be 1. Notice also that
f's values are some sort of "subsum" of the complete sum of
1/2^n for n in N, and therefore f(x) must be in [0,1]
for all x. So we have verified requirements 1 and 2 and 3 for cdf's
above.
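A finite truncation of this construction can be computed directly. Here is a Python sketch (mine, not the Maple code used later; the enumeration B below is a hypothetical one, listing rationals by increasing denominator):

```python
# Take the first N terms of an enumeration B of the rationals in (0,1)
# and set f(x) = sum of 1/2^n over those n with B(n) <= x.  The tail
# beyond N has total weight 1/2^N, so this approximates the true f to
# within 2^(-N).  (My sketch; the enumeration order is hypothetical.)
from fractions import Fraction

B = []                        # B[0] is B(1), B[1] is B(2), ...
for q in range(2, 40):        # denominators 2, 3, 4, ...
    for p in range(1, q):
        r = Fraction(p, q)
        if r not in B:        # skip repeats like 2/4 = 1/2
            B.append(r)
N = 30
B = B[:N]

def f(x):
    return sum(0.5 ** n for n, r in enumerate(B, start=1) if r <= x)

assert f(0) == 0.0                        # the empty sum: no B(n) <= 0
assert abs(f(1) - (1 - 0.5 ** N)) < 1e-15  # all N terms; only the tail is missing
assert f(0.4) < f(0.6)                    # increasing on [0,1]
```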

Things will get even more interesting when we look at #4. If x<y
and x and y are in the unit interval,
then the interval (x,y) has infinitely many rational numbers between 0
and 1 in it. Therefore f(x) must be strictly less than f(y). Thus, on
[0,1], f is a strictly increasing function: f(x)<f(y) if
x<y. This is more than #4 requires.

We saw that #4 implies that the left and right hand limits exist. So
all we need to do is investigate where f is continuous. Well, since
limx-->a-f=sup{f(x):x<a}=LEFT and
limx-->a+f=inf{f(x):x>a}=RIGHT we just need to
think about where f(a) fits. Certainly since f is increasing, we know
that LEFT<=RIGHT. We'll say that f has a jump at a if
LEFT<RIGHT, and the amount of the jump is RIGHT-LEFT. Can f have a
jump of, say, 33? That is, can RIGHT-LEFT be 33? Since f's values are
in [0,1], I don't think this is possible. Can f have a jump of 1/33?
It could have such a jump, but it actually couldn't have too
many of such jumps: it certainly couldn't have more than 33 of those,
because notice that f can't jump down, only up, since f is
increasing. Actually the total length of the jumps of f can be at most 1,
since f(large negative) is 0 and f(large positive) is 1. Now where do
the jumps take place?

Let's imagine an example where, say, B(17)=3/7. If x<3/7, the sum
for f(x) would not have the 1/2^17 term in it. As x
increases "towards" 3/7, f(x) would increase towards
sum_{B(n)<3/7} 1/2^n, which would be
sup{f(x):x<3/7}.
If x>=3/7,
the sum would have the term 1/2^17. And inf{f(x):x>3/7}
would be exactly 1/2^17 larger than sup{f(x):x<3/7}, and
would be equal to f(3/7).
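The jump can be observed in a tiny standalone model. In this Python sketch (mine), B is a hypothetical 5-term enumeration, and the jump at B(3) has size exactly 1/2^3:

```python
# f has a jump of exactly 1/2^n at B(n), and is right-continuous there.
# (My sketch; the 5-term enumeration is hypothetical.)
B = {1: 0.5, 2: 0.25, 3: 0.75, 4: 0.125, 5: 0.625}   # B(n) for n = 1..5

def f(x):
    return sum(0.5 ** n for n, r in B.items() if r <= x)

eps = 1e-9
n = 3                                  # look at the rational B(3) = 0.75
jump = f(B[n] + eps) - f(B[n] - eps)
assert abs(jump - 0.5 ** n) < 1e-12    # jump of size 1/2^3 at B(3)
assert f(B[n]) == f(B[n] + eps)        # right-continuous: f(a+) = f(a)
```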

In fact, f has a jump of 1/2^n at B(n): f has jumps at every
rational number in (0,1). The total sum of the jumps at the rationals is
1/2+1/4+1/8+...=1. There are no other possible jumps, since any
additional jump would mean that the function increases more than 1,
and we already know this is not possible.

In [0,1], this f is continuous at every irrational number and is
continuous at 0 and at 1, and is not continuous at every
rational number between 0 and 1. Is f a cdf of a continuous
distribution? Is it a cdf of a discrete distribution? Well, maybe f
shares aspects of both kinds of distributions.

Is f Riemann integrable on [0,1]? Of course this is the same as asking
if, given epsilon>0, we can find a partition P of [0,1] so
that US(f,P)-LS(f,P)<epsilon. For example, suppose we
take epsilon=1/10. Is there some partition which clearly satisfies
the requirement? Remember that f is increasing on [0,1], and f(0)=0
and f(1)=1. If we partition [0,1] into n equal subintervals of width
1/n, then the difference US(f,P)-LS(f,P) must actually
equal the total increase (1) multiplied by the width of the
subintervals (1/n), so the difference is 1/n. And if n>10, we have
a satisfactory partition. And by the Archimedean property, we can
always find n so 1/n<epsilon. In fact, any increasing function must
be Riemann integrable. Any decreasing function must be Riemann
integrable. There are problems with integrability when there is much
combined "wiggling" up and down.
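The telescoping computation for an increasing function is easy to verify in Python (my sketch; f(x) = x^2 is an arbitrary increasing example):

```python
# For an increasing f on [0,1] partitioned into n equal pieces, the sup
# on each piece is the right-endpoint value and the inf is the
# left-endpoint value, so US - LS telescopes to (f(1)-f(0))/n.
# (My sketch, not from the lecture.)

def f(x):
    return x * x          # any increasing function on [0,1] works here

n = 1000
h = 1.0 / n
US = sum(f((i + 1) * h) for i in range(n)) * h   # sup = right endpoint
LS = sum(f(i * h) for i in range(n)) * h         # inf = left endpoint
assert abs((US - LS) - (f(1.0) - f(0.0)) / n) < 1e-10
```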

What is the Riemann integral of f on [0,1]? Some thought should
convince you that we know approximately what the integral is: it must
be between 0 and 1. But I don't know more than that. In fact, until
after this lecture I really didn't think much about what f looks
like. So here is what I did: I asked Maple to "draw" an approximation
of the graph of an f. I listed 30 rational numbers between 0 and 1,
and had Maple draw the approximation to the graph of f by just using
these thirty rational numbers, in order, as B(1), B(2), ..., and
B(30). The total "weight" that's left over sums up to at most
1/2^30, which is less than 10^-8, a very small
number. So the graph drawn is quite close (probably beyond screen
resolution) to a "true" graph of f.
Here are the Maple procedures I used.

gen:=rand(1..99999);

This asks Maple to create a "random" integer between 1 and 99,999.

A:=[seq((gen()/100000),j=1..30)];n:=30;

This asks Maple to create a sequence of "random" rationals (the
integers divided by 100,000) in the open unit interval. It assigns
this sequence to the name A. The next statement creates the variable n
with value 30.

The associated graph is shown and it has approximate area
.57585733. One surprising aspect of the graph to me was the enormous
flatness of most of it: but of course the graph is not "flat" anywhere
(any interval has infinitely many rationals, so the graph must increase).
The amount of increase is mostly very very very small. Maple also
displays the vertical jumps with vertical line segments.

I don't know or understand very much about the possible values of
int_0^1 f. It is between 0 and 1, but it can be
very close to either end: if I select the first 30 "random" rationals close to
0, then I get the first graph shown below and the area is .97827779,
and if I select the first 30 "random" rationals close to 1, then I get
the second graph shown below and the area is .05210699.

Problem Is there a bijection B which has
int_0^1 f = 1/2? I don't know. In fact I don't know
any specific value (or non-value!) of
int_0^1 f. Certainly fairly easy reasoning (moving
around the big blocks) shows that the values of
int_0^1 f are dense in (0,1), but I really
don't know the answer to the question just asked. I suspect it is "yes".
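One observation of mine, not from the lecture: writing f as the sum of 2^-n times the indicator function of [B(n),1] and interchanging the sum with the integral suggests that int_0^1 f = sum of 2^-n (1-B(n)). A Python sketch comparing this with a Riemann sum, for a hypothetical 10-term truncation of B:

```python
# For a truncated f built from finitely many rationals, the integral on
# [0,1] should equal sum of 2^(-n) * (1 - B(n)), since each term
# contributes 2^(-n) times the length of [B(n), 1].  (My observation;
# the 10-term enumeration below is hypothetical.)
rationals = [0.5, 1/3, 2/3, 0.25, 0.75, 0.2, 0.4, 0.6, 0.8, 1/6]  # B(1..10)

def f(x):
    return sum(0.5 ** n for n, r in enumerate(rationals, 1) if r <= x)

closed_form = sum(0.5 ** n * (1 - r) for n, r in enumerate(rationals, 1))

n_steps = 100000
h = 1.0 / n_steps
riemann = sum(f(i * h) for i in range(n_steps)) * h   # left-endpoint sum

assert abs(riemann - closed_form) < 1e-3
```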

4/30/2003

Tomorrow I will give out student evaluation forms. Also tomorrow I
will request information on when I can usefully be available before
the final exam. The final is scheduled for

Tuesday, May 13, 12:00-3:00 PM in SEC 205

I began by discussing some of the material in
yesterday's diary entry. We went through the proof in detail, and I
think at least some students understood and verified it.

Students did not want to discuss the material in yesterday's diary
entry on additivity of the integral over intervals. This material
will be used in the proof of a version of
the Fundamental Theorem of Calculus, which I hope to give at the last
meeting of the class Monday.

Historically probability has been the inspiration of many of the more
intricate results about integration. I now take a small detour to
present a complicated example of a function. First I tried to give
some background on probability.

Probability originated
in the 1600's in an effort to predict gambling odds. Here's the
basic idea, as it is now understood. One plays a game "many" times and
observes the outcomes. A quotient called the "relative frequency" is
computed: this is the (number of outcomes of a desired type) divided
by (total number of times the game has been played). Of course
relative frequency is a number between 0 and 1. Now the idea or hope
or desire is that as the (total number of times the game has been
played) gets large (approaches infinity?) this relative frequency
should somehow "stabilize" or approach a limit. This limit is called
the probability of the outcome of the desired type. Since the
limit of a sequence of numbers in [0,1] is also a number in [0,1], the
probability of a collection of outcomes is always in [0,1]. Of course
this is all a model of reality, and building these models can be
difficult. And certainly the relevance to "reality" of what's deduced
using these models can also be debated. But that's the basic idea. Now
I'll introduce some vocabulary and illustrate the vocabulary with a
few examples.

Vocabulary, with (approximate) definitions:

Outcome: one of a list of possible results of "the game".

Sample space: the collection of all possible outcomes.

Event: a collection of certain specified outcomes: an event is a subset
of the sample space.

Probability: an assignment of a number in [0,1] to an event: a
measurement of how "likely" the event is.

Random variable: a real-valued function defined on the sample
space. Maybe think of such a function as the amount of "winnings" (in
$?) that each outcome generates.

Example: a fair die

Outcome: the outcomes are classified by the number of dots on the face
showing.

Sample space: the sample space could be labeled {1,2,3,4,5,6}.

Event: one event could be the collection of "odd" outcomes:
{1,3,5}. There are 2^6=64 different events.

Probability: the probability of an event is measured by the number
of distinct outcomes in it divided by 6. So pr({1,3,5})=3/6.

Random variable: one random variable could be the number of dots
showing: an integer from 1 to 6.

Example: a fair coin, flipped until heads shows

Outcome: the game ends when the first head shows. One
outcome could be H. Another could be TTTH, which can be
abbreviated as T^3H. Another could be a sequence of
all T's, abbreviated as T^infinity.

Sample space: for each n in N, we have
T^(n-1)H. We also have T^infinity.

Event: one event could be all outcomes with at most 50
flips: {T^(n-1)H with 1<=n<=50}.

Probability: a fair coin flipped independently would likely get
the following: pr(T^(n-1)H)=1/2^n, and
pr(T^infinity)=0. See Note 1 below.

Random variable: a random variable could be the number of flips
needed. This random variable is defined and is a real number on all of
the outcomes T^(n-1)H. It isn't defined on
T^infinity, but that doesn't matter since the
probability of that happening is 0.

Example: pick a number at random from [0,1], all numbers being
equally likely

Outcome: 1/3 is an outcome, and so is Pi/7 and sqrt(2)-1.

Sample space: the sample space is [0,1].

Event: well, one event could be A={x with
0<=x<1/2}. Another could be B={x with 3/7<x<5/7}. Another
could be C={1/3} (one number!).

Probability: the probability of A is 1/2. The probability of B
is 2/7. The probability of C is 0. Please see Note 2
below.

Random variable: a random variable could be the following: if the
number x is picked, then the value is x^2.

Note 1 Here is one unpleasant consequence of this model. Since
the probabilities of all the outcomes in the sample space should add
up to 1, there is no positive number "left over" to assign as
the probability of T^infinity (since the sum of
1/2^n as n goes from 1 to infinity is 1), so its probability
must be 0! So here is a conceivable event which happens hardly ever,
according to this model.

Note 2 This "game" is called "choosing a number from the unit
interval uniformly at random". Clearly (?) the correct model
would assume that the probability of an "interval event" is the length
of the interval. Being subintervals of [0,1] means that the
lengths are correctly weighted so the probability of the whole sample
space is 1, as it should be. But then the probability of, say, the
event which is the
interval (1/3-1/n,1/3+1/n) is 2/n for all positive integers n. Then
since pr({1/3})<=pr((1/3-1/n,1/3+1/n)) should be true (smaller
events should have smaller probabilities!) we see that
pr({1/3})<=2/n for all n in N. Thus (Archimedean property!)
pr({1/3})=0 in this model. What is more unsettling to realize is that
the probability of any one number event is 0! So the chance of
picking any one number, according to this model, is 0, but we've got
to pick some number! Both note 1 and note 2 deal with the paradoxes of
trying to model infinite "games" with a series of rules that lead to
weirdness. These weirdnesses seem to be necessary.

The most famous results of probability deal with repeated experiments
and the tendency of random variables to have nice "asymptotic"
properties. One such result is the Central Limit Theorem, which
essentially states that the normal curve rules every repeated
experiment. Here are two applets simulating the CLT, one with
dice and one with
a sort of pachinko-like "game". Such results are usually
understood and investigated using the cumulative distribution
function, cdf, of a random variable X. So cdf's are extremely
important in probability.

If X is a random variable, the cumulative distribution
function, f of X is defined by this:
f(x)=the probability that X is less
than or equal to x. That is, f(x)=pr(X<=x).

Some effort is needed to be acquainted with this definition. Let's
look at our three random variable examples, and graph their cdf's.

Tossing a fair die Here there are jumps of 1/6 at 1 and 2 and 3
and 4 and 5 and 6. A graph of the cdf follows. There is a solid dot
where the value of the function is, and an empty circle where it "isn't".
The values of the cdf are always in [0,1], since it is a
probability. The cdf is always increasing but may not be strictly
increasing. That is, if x<y, then f(x)<=f(y).

Flipping until a head occurs Here there are jumps of 1/2 and 1/4
and 1/8 and ... at 1 and 2 and 3 and ... This graph takes some
thinking about.
Notice that this cdf never "reaches" 1, but its limit as x-->infinity
is 1. We could easily get cdf's which are never 0, but whose limit
as x-->-infinity is 0.

Squaring a uniformly distributed number from [0,1] What is the
probability that such a number is less than 1/2? This is the same as
asking for the length of the interval of x's in [0,1] for which
x^2<=1/2. That interval is [0,1/sqrt(2)], so its length
is 1/sqrt(2). Therefore the cdf is sqrt(x) for x in [0,1], and 0 for
x<0 and 1 for x>1.
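This cdf can be checked by simulation. A Python sketch (mine, not from the lecture):

```python
# The cdf of X**2, for X uniform on [0,1], should be sqrt(t): the
# fraction of sampled squares at or below t should approach sqrt(t).
# (My simulation sketch.)
import math
import random

random.seed(0)                         # fixed seed for reproducibility
samples = [random.random() ** 2 for _ in range(200000)]
for t in (0.1, 0.5, 0.9):
    empirical = sum(1 for s in samples if s <= t) / len(samples)
    assert abs(empirical - math.sqrt(t)) < 0.01
```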
The first two random variables are examples of discrete random
variables, and the third is a continuous random variable. In
many manipulations concerning random variables, we build the model (as
described above) but then once the cdf of a random variable is known,
almost all other information is discarded, and work is done with the
cdf alone. We can generalize some facts about cdf's from these
examples.

Properties of cdf's
Suppose f is a cdf. Then:

1. f's values are in [0,1].

2. The limit of f(x) as x-->infinity is 1.

3. The limit of f(x) as x-->-infinity is 0.

4. If x<y, then f(x)<=f(y).

5. If a is in R, then the limit of f(x) as x-->a-
exists. In the language of 311, this is the limit of f(x) as
x-->a when the domain of f is (-infinity,a).

6. If a is in R, then the limit of f(x) as x-->a+
exists and equals f(a). In the language of 311, this is the limit of
f(x) as x-->a when the domain of f is (a,infinity).

Sometimes people say that properties 5 and 6 mean that the cdf is a
cadlag function (!). This is an acronym for the French phrase
"Continue a droite, limite a gauche": the function is continuous from
the right, and has left-hand limits. We will discuss and further
verify these properties tomorrow, and also try to count the number of
jumps that any cdf can have. And we will look at a remarkable cdf.

In the case of continuous random variables, another function is
sometimes studied, the density function. This turns out to be the
derivative of the cdf, and its utility for discrete random variables
is not immediately clear. (What should the derivative of a mostly
horizontal function be?) So I will just look at cdf's here, today and
tomorrow.

4/28/2003

Again we are going through the technicalities on integral and order,
integral and linearity, and additivity of the integral over intervals.
This takes effort and discipline, but Math 311 is the course
whose total object is constructing calculus (also called "analysis") with all
the interconnections showing. So let's move on and finish up these
technicalities.

Proposition (negating integrands) Suppose f is Riemann
integrable on [a,b]. Then the function h defined by h(x)=-f(x)
is also Riemann integrable on [a,b], and
int_a^b h = -int_a^b f.
Proof: If A is a bounded subset of the reals, define B to be the
subset which consists of -a for a in A. Then -sup A = inf B. This is
an exercise in sup's and inf's, suitable for
earlier in the course. If A=(-13,5], then B would be [-5,13), and sup
A would be 5, and inf B would be -5. We will apply this repeatedly.
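The sup/inf fact can be illustrated on the example from the text. A quick Python verification (mine; a finite sample of A=(-13,5] only illustrates the identity, since max and min play the roles of sup and inf for finite sets):

```python
# If B = {-a : a in A}, then -sup A = inf B.  For a finite sample of
# A = (-13, 5], sup is max and inf is min.  (My quick check.)
A = [-12.9, -7.0, 0.0, 3.2, 5.0]       # finite sample of (-13, 5]
B = [-a for a in A]
assert -max(A) == min(B)               # -sup A = inf B  (both are -5 here)
assert max(A) == 5.0 and min(B) == -5.0
```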

Initially I want to show that h is Riemann integrable. I will use the
necessary and sufficient condition with epsilon. What do I mean? I
must show that given epsilon>0, there is a partition P of
[a,b] so that US(h,P)-LS(h,P)<epsilon. Since f is Riemann
integrable, we can pick P so that US(f,P)-LS(f,P)<epsilon. Since the
upper sums and lower sums of f involve infs and sups, we can apply the
previous remark. We get US(h,P)=-LS(f,P) and
LS(h,P)=-US(f,P). Therefore
US(h,P)-LS(h,P)=-LS(f,P)-(-US(f,P))=US(f,P)-LS(f,P)<epsilon.
We now know that h is Riemann integrable.

We need to show that
int_a^b h = -int_a^b f. Let's look
at UI(h,[a,b]). This is the inf of the upper sums of h. But each upper sum
of h is minus a lower sum of f. Therefore the inf of h's upper
sums is (again by the remark above!) equal to minus the sup of the
lower sums of f. But this is -LI(f,[a,b]). Therefore we have shown
that UI(h,[a,b])=-LI(f,[a,b]). Since f and h are Riemann integrable,
the upper and lower integrals of each are equal to the "integral" of
each. That is, we have verified
int_a^b h = -int_a^b f, as
desired.

Note This differs slightly from the presentation made in
class. I believe it is more systematic, and perhaps better. I am not
sure, though.

Proposition (addition of functions) Suppose f and g are
Riemann integrable functions on [a,b]. If the function h is
defined by h(x)=f(x)+g(x), then h is Riemann integrable on
[a,b], and
int_a^b h = int_a^b f + int_a^b g.
Proof: Again we must show that h is Riemann integrable first. So given
epsilon>0 we must find a partition P of [a,b] so that
US(h,P)-LS(h,P)<epsilon. We can begin with
considering f and g separately first, and split up epsilon
suitably.

So there must be a partition A of [a,b] so that
US(f,A)-LS(f,A)<epsilon/2
and there must be a partition B of [a,b] so that
US(g,B)-LS(g,B)<epsilon/2. We will let P be
the union of A and B. Note that P is a partition:
it is a finite subset of [a,b] (since both A and B are)
and it contains both a and b, since both A and B
do. P is sometimes called the "smallest common refinement" of
A and B. As a refinement, we know that the approximation
gets "better": that
US(f,P)-LS(f,P)<epsilon/2
and US(g,P)-LS(g,P)<epsilon/2.

Now we must compare Riemann sums of f+g with those for f and for g. We
know that if W is a subset of [a,b], then the sets f(W)={f(x) : x in W}
and g(W)={g(x) : x in W} are both bounded in R, since we are
dealing with Riemann integrable functions, which are bounded both above
and below. If S_f=sup f(W) and S_g=sup g(W),
then S_f is an upper bound of f(x) when x is in W and
S_g is an upper bound of g(x) when x is in W. Therefore
S_f+S_g is an upper bound of f(x)+g(x) for x in W,
and since S_{f+g}=sup{f(x)+g(x) : x in W} is a least
upper bound of those numbers, we know
S_{f+g}<=S_f+S_g. Therefore we know
that US(f+g,P)<=US(f,P)+US(g,P) and similarly
LS(f+g,P)>=LS(f,P)+LS(g,P). Now we subtract,
combining the inequalities in the proper fashion:
US(f+g,P)-LS(f+g,P)<=US(f,P)+US(g,P)-(LS(f,P)+LS(g,P)).
Of course, if we "distribute" the -'s, we see that this is less than
epsilon/2+epsilon/2=epsilon. So now we know that f+g is Riemann
integrable on [a,b]. We've got to investigate the relationship between
the actual integrals of all of these functions.

UI(f+g,[a,b]) is the inf of the US(f+g,P), while UI(f,[a,b]) is the
inf of the US(f,P) and UI(g,[a,b]) is the inf of the
US(g,P). Here is where I got tired and confused in class. Let's
see if I can work my way through this with less hesitation now! Let's
call the set of the US(f+g,P), M; the set of the US(f,P), N;
and the set of the US(g,P), R (a new letter, to avoid a clash with
the partition name P). What do we know? M and N and R are
all bounded sets of real numbers. If n is in N and r is in R, then
there must be m in M with m<=n+r. (We can take m to be an upper sum
for f+g on the partition which is the common refinement of the
partitions giving n and r.)
What about inf M? If inf M > inf N + inf R, then we can find n in N
and r in R with inf M>n+r>=inf N + inf R (using our old
characterization of inf's). But there is m in M with m<=n+r, which
implies m<inf M, a contradiction. Therefore inf M <= inf N + inf
R. This means UI(f+g,[a,b])<=UI(f,[a,b])+UI(g,[a,b]). Similarly we
can prove that LI(f+g,[a,b])>=LI(f,[a,b])+LI(g,[a,b]). But f and g
and f+g are all Riemann integrable, so
UI=LI=int_a^b for each of f and g and f+g. Thus the
inequalities stated above prove that
int_a^b h = int_a^b f + int_a^b g.

Note I think that I made a mistake in this proof in class. What
I've written above is, I think, correct. But it is different from what
I stated in class. Please look at your notes and decide. Let me know,
please.
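The additivity just proved can be sanity-checked numerically (my Python sketch; sin and cos are arbitrary choices of integrands):

```python
# Riemann sums for f+g on a common partition split term by term into
# the sums for f and for g, so the integrals add.  (My sketch.)
import math

def riemann(func, a, b, n=100000):
    h = (b - a) / n
    return sum(func(a + i * h) for i in range(n)) * h  # left-endpoint sum

def h_sum(x):
    return math.sin(x) + math.cos(x)

a, b = 0.0, 1.0
lhs = riemann(h_sum, a, b)
rhs = riemann(math.sin, a, b) + riemann(math.cos, a, b)
assert abs(lhs - rhs) < 1e-9
```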

The last three propositions can be abbreviated by writing that:
The collection of Riemann integrable functions on [a,b] is a real
vector space, and the mapping from these functions to the real
numbers defined by integration is a linear transformation.

Finally here's the last technical result we need. Again, all of these
should be familiar from calculus days!

Theorem (Additivity over intervals) Suppose
a<b<c. Then:
I) If f is defined and Riemann integrable on the interval [a,c],
then f is Riemann integrable on [a,b] and [b,c], and further
int_a^c f = int_a^b f + int_b^c f.
II) Suppose that f is defined on the interval [a,c], and that f
is Riemann integrable on [a,b] and that f is Riemann integrable
on [b,c]. Then f is Riemann integrable on [a,c] and
int_a^c f = int_a^b f + int_b^c f.
Proof of I): I will show that f is Riemann integrable on [a,b] in the
"standard" way: given epsilon>0, I will try to produce a partition
P of [a,b] for which
US(f,P)-LS(f,P)<epsilon. How can we create such a
P? Since f is Riemann integrable on [a,c], there is a partition
Q of [a,c] so that
US(f,Q)-LS(f,Q)<epsilon. Now Q might not
contain b, since b is a "random" interior point of [a,c]. However, we
can "throw in" b and only increase the lower sum and decrease the
upper sum. So we have:
US(f,Q union {b})-LS(f,Q union {b})<epsilon. But Q
union {b} is now a partition of [a,c] which contains b, so if we write
P=(Q union {b})intersection [a,b], we will have a finite set of
points in [a,b], containing a and b: this is a partition of [a,b]. But
also the upper and lower sums of P will be part of the upper
and lower sums of Q union {b}, and therefore
US(f,P)-LS(f,P)<=US(f,Q union {b})-LS(f,Q
union {b})<epsilon. So we proved that f is Riemann integrable on
[a,b]. Similar reasoning proves that f is Riemann integrable on
[b,c]. Now we need to analyze the integrals.

DA CAPO [begin]
Again we know that if P_1 is a partition of [a,b] and P_2 is a
partition of [b,c], then P, the union of P_1 and P_2, is a
partition of [a,c]. We can also see that
US(f,P) = US(f,P_1) + US(f,P_2).
Also, using the refinement logic we wrote above, if Q is any
partition of [a,c], then there is a refinement P of Q
which contains b. Thus if Q is any partition of [a,c], there
are partitions P_1 of [a,b] and P_2 of [b,c] so that
US(f,Q) >= US(f,P_1) + US(f,P_2).
I claim that this then implies
UI(f,[a,c]) >= UI(f,[a,b]) + UI(f,[b,c]). If
UI(f,[a,c]) < UI(f,[a,b]) + UI(f,[b,c]), there is a partition Q
of [a,c] so that
UI(f,[a,c]) <= US(f,Q) < UI(f,[a,b]) + UI(f,[b,c]). But then
US(f,P_1) + US(f,P_2) <= US(f,Q) < UI(f,[a,b]) + UI(f,[b,c]),
and that's impossible since the UI's are the inf's of the respective
upper sums.
DA CAPO [end]

So now we know that UI(f,[a,c]) >= UI(f,[a,b]) + UI(f,[b,c]). We can
similarly prove LI(f,[a,c]) <= LI(f,[a,b]) + LI(f,[b,c]). Since again
we already know these functions are Riemann integrable, UI = LI = the
Riemann integral on each of the three intervals, so we have verified
that
int_a^c f = int_a^b f + int_b^c f.
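The conclusion int_a^c f = int_a^b f + int_b^c f can be sanity-checked numerically. A minimal sketch, assuming nothing beyond the definitions; the function f(x)=x^2 on [0,3] with b=1 is my own choice, and since this f is increasing there, the sup and inf on each subinterval sit at the endpoints:

```python
# Numeric check of additivity over intervals, for an increasing f.

def upper_sum(f, lo, hi, n):
    """Upper sum of an increasing f on [lo,hi], n equal subintervals."""
    w = (hi - lo) / n
    # for an increasing f the sup on each piece is at the right endpoint
    return sum(f(lo + (j + 1) * w) * w for j in range(n))

def lower_sum(f, lo, hi, n):
    """Lower sum of an increasing f: inf is at the left endpoint."""
    w = (hi - lo) / n
    return sum(f(lo + j * w) * w for j in range(n))

f = lambda x: x * x          # increasing on [0, 3]
a, b, c, n = 0.0, 1.0, 3.0, 10000

whole = upper_sum(f, a, c, n)
pieces = upper_sum(f, a, b, n) + upper_sum(f, b, c, n)
# both are squeezed around the true value 9 = int_0^3 x^2
assert abs(whole - pieces) < 1e-2
assert lower_sum(f, a, c, n) <= 9 <= upper_sum(f, a, c, n)
```
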

Note There is much repetition of ideas in all of these proofs!

Proof of II): Since f is Riemann integrable on [a,b] and f is Riemann
integrable on [b,c], we know, given epsilon>0, there is a partition
Q1 of [a,b] and a partition Q2 of
[b,c] so that
US(f,Q1)-LS(f,Q1)<epsilon/2 and
US(f,Q2)-LS(f,Q2)<epsilon/2.
Now take P to be the union of Q1 and
Q2. This is a partition of [a,c], and if we add the
two inequalities just above, we have verified that
US(f,P)-LS(f,P)<epsilon so that f is indeed Riemann
integrable on [a,c].

We finally must verify that the Riemann integrals "behave" correctly.
Read through the section labeled DA CAPO
from [begin] to [end]. I think that finishes the proof.

Possible question of the day Suppose g is Riemann
integrable on [0,5], and you are told that
int_0^5 g = 13, and that |g(x)| <= 3 for all x in
[0,2]. What over- and underestimates can you make for
int_2^5 g, and why?
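A numeric illustration of the reasoning behind such estimates (the particular g below is my own invention, chosen just to satisfy the hypotheses): since int_2^5 g = int_0^5 g - int_0^2 g, the bound |g| <= 3 on [0,2] controls the piece being subtracted from 13.

```python
# Sanity check with one admissible g (my choice, not from the problem).

def integral(g, lo, hi, n=100000):
    """Midpoint-rule approximation; good enough for a sanity check."""
    w = (hi - lo) / n
    return sum(g(lo + (j + 0.5) * w) for j in range(n)) * w

g = lambda x: 3.0 if x <= 2 else 7.0 / 3.0   # int_0^5 g = 6 + 7 = 13

assert abs(integral(g, 0, 5) - 13.0) < 1e-6
# int_0^2 g is forced into [-6, 6], so int_2^5 g must lie in [7, 19];
# this particular g realizes the extreme value 7
assert abs(integral(g, 2, 5) - 7.0) < 1e-6
```
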

Several (dubiously) interesting (?) linguistic comments were made
during today's class.

First was MEGO. The web site
www.acronymfinder.com reports that this means "My Eyes Glaze Over
(during a boring speech or briefing)".

I asked for the source of the quotation "sup of the evening, beautiful
sup" -- this was a misspelling of the word "soup" and the phrase comes
from chapter 10 of Lewis Carroll's "Alice in
Wonderland" where it is the first line of a song that the Mock
Turtle sings. John Tenniel's historic illustration is shown.

The Mock Turtle also discusses its education, and remarks that it
studied `the different branches of Arithmetic-- Ambition, Distraction,
Uglification, and Derision.'

Lewis Carroll was actually an academic mathematician at Oxford
University named Charles Lutwidge Dodgson. Biographical information is
abundant.

4/24/2003

The Question of the Day
Suppose f:R-->R is defined by f(x)=5 when x=3 and
f(x)=-9 when x=6, while f(x)=0 for all other x's. Is f Riemann
integrable on [2,7], and, if it is, what is the Riemann integral of f
on that interval?
Answers: Yes, and 0.

I began by observing that the special arguments last time actually
proved more than I stated.

Theorem (Integrability of Lipschitz functions) Suppose f
satisfies a Lipschitz condition on [a,b]: that is, there is a constant
K>0 so that for all x,y in [a,b], |f(x)-f(y)| <= K|x-y|. Then f is
Riemann integrable on [a,b].

Comment The method of proof actually also provides the
beginning of an algorithm to approximate definite integrals, so the
work is not totally wasted, even though the conclusions of the theorem
to be stated below apply to many more functions than this one.

Corollary Suppose f is differentiable on [a,b], and there is
K>0 so that |f'(x)|<=K for all x in [a,b]. Then f is Lipschitz
and therefore Riemann integrable.
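The corollary's mechanism can be sketched with f = sin, where |f'| <= 1 everywhere, so K = 1 is a Lipschitz constant on any interval (my example, not one from the lecture):

```python
# Bounded derivative => Lipschitz, checked numerically for f = sin.
import math
import random

f, K = math.sin, 1.0
random.seed(0)
for _ in range(1000):
    x, y = random.uniform(0, 10), random.uniform(0, 10)
    # Mean Value Theorem: |f(x)-f(y)| = |f'(c)|·|x-y| <= K|x-y|
    assert abs(f(x) - f(y)) <= K * abs(x - y) + 1e-12
```
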

We reconsidered an example discussed on 4/14/2003 (see the material on boxes
and butterflies): the function sqrt(x) on the interval [0,1]. Since
f'(x) = (1/2)x^(-1/2) for x>0, the derivative is not
bounded on (0,1]. And we actually saw that this f does not
satisfy a Lipschitz condition on [0,1]. But everyone who has been
through a calculus course knows that sqrt(x) for x between 0 and 1
does have an area, and this area is even easily computable with
the Fundamental Theorem of Calculus. So how can we verify this
function is Riemann integrable on [0,1]? The following result is a
major success of the course.

Theorem (continuous functions are Riemann integrable) Suppose f
is continuous on [a,b]. Then f is Riemann integrable on [a,b].
Proof: The key to this proof is to use uniform continuity to "control"
the amount of variation of the function on an interval. So we know:
given eta>0, there is alpha>0 so that if x and y are in [a,b]
and if |x-y|<alpha, then |f(x)-f(y)|<eta. I'll try to relate
alpha and eta to what we need to verify Riemann integrability.

We will use a simple partition again. So P will break up [a,b]
into n equal subintervals, each of length (b-a)/n. The number of boxes
is n. On each subinterval, the inf and sup of f on
[x_{j-1},x_j] are actually attained: there are numbers
m_j and M_j so that the sup of f on
[x_{j-1},x_j] is M_j = f(p_j) and
the inf of f on [x_{j-1},x_j] is
m_j = f(q_j). This is a consequence of continuity.
The difference between the upper and lower sums will be bounded
by:
(the number of boxes)·(the width of the boxes)·(the max
height of the boxes). If we can show that this will be less than some
given epsilon>0, we will be done. But the number of boxes is n, and
the width is always (b-a)/n, so the product of those two factors is
b-a. Therefore if the height of the boxes is always less than
epsilon/(b-a), we will be done. But the height is
f(p_j) - f(q_j) with
|p_j - q_j| < (b-a)/n. So now we use uniform
continuity. We want eta to be epsilon/(b-a). Uniform continuity
tells us there is some alpha>0 which guarantees that
|f(p_j)-f(q_j)| < eta whenever |p_j-q_j| < alpha. Now take
n large enough so that (b-a)/n < alpha, always possible by the
Archimedean property. And we are done.
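The theorem in action for the motivating example sqrt(x) on [0,1], which is continuous but not Lipschitz. Since sqrt is increasing, the sup and inf on each subinterval sit at the endpoints, and the upper-lower gap can be computed exactly:

```python
# US - LS for f(x) = sqrt(x) on [0,1] with n equal subintervals.
import math

def gap(n):
    """US(f,P) - LS(f,P) for the n-piece equal partition of [0,1]."""
    w = 1.0 / n
    us = sum(math.sqrt((j + 1) * w) * w for j in range(n))
    ls = sum(math.sqrt(j * w) * w for j in range(n))
    return us - ls

# for an increasing f the gap telescopes to (f(1)-f(0))·w = 1/n
assert abs(gap(100) - 1 / 100) < 1e-12
assert gap(10000) < gap(100)   # the gap shrinks as the mesh shrinks
```
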

An example or two
I will assume the standard properties of sine for these examples (we
could instead get functions whose graphs would be polygons with
similar properties, but would it be worth the trouble to define them?).

Here is a picture of the function f(x) = x·sin(1/x) for x not 0, and
f(0)=0. This function is continuous on [0,1]. If (x_j) is
any sequence in (0,1] with 0 as a limit, then the sequence
(f(x_j)) is squeezed between -x_j and x_j, and its limit will be 0. I
had Maple draw in the "squeezing lines", y = x and y = -x, as well as
x·sin(1/x) in this picture. This
function has an infinite number of wiggles up and down in [0,1], but
it is still Riemann integrable.

Now consider the function sin(1/x) on (0,1]. A first observation is
that there's no way to define this function at 0 so it will be
continuous there. That's because it is possible to find sequences
(x_j) in (0,1] whose limits are 0 for which
(f(x_j)) is either a sequence without a limit, or
sequences with different limits (it isn't hard to get explicit
sequences converging to 0 or 1 or -1, for example). So there's no
"natural" f(0). For simplicity, let's define f(0)=0. I claim that this
f is indeed Riemann integrable on [0,1]. Here is a verification of
this claim.

Given epsilon>0, we need a partition P of [0,1] so that
US(f,P) - LS(f,P) is less than epsilon. The maximum height
of any box in that difference is 2, since the range of sine is [-1,1].
So let's "waste" epsilon/2 on a first box: take x_1 with
0 < x_1 < epsilon/4. Then the upper-lower difference on that
subinterval must be less than (epsilon/4)·(maximum
height) = epsilon/2. Now consider sin(1/x) on the interval
[x_1,1]: there the function is continuous, hence Riemann
integrable, and there is a partition Q of [x_1,1] so
that US(f,Q) - LS(f,Q) < epsilon/2. Now take P to
be the points in Q together with 0, and the discrepancy will be
at most the sum of the two, so we have the desired partition
whose difference between upper and lower sums is less than epsilon. A
picture may help understanding. The large vertical box all the
way on the left of the picture contains infinitely much
wiggling of sin(1/x). The finite amount of wiggling not contained in
that box is "captured" inside a finite sequence of other boxes.
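The epsilon/2-splitting argument can be rendered numerically. The sups and infs on the subintervals below are only approximated by sampling, so this is an estimate of US(f,P) - LS(f,P), not an exact computation; the particular eps and the fineness of the partition are my choices.

```python
# One "wasted" box at 0, then a fine partition of the rest.
import math

def f(x):
    return math.sin(1.0 / x) if x != 0 else 0.0

def approx_gap(partition, samples=60):
    """Approximate US(f,P) - LS(f,P) by sampling each subinterval."""
    total = 0.0
    for lo, hi in zip(partition, partition[1:]):
        vals = [f(lo + (hi - lo) * k / samples) for k in range(samples + 1)]
        total += (max(vals) - min(vals)) * (hi - lo)
    return total

eps = 0.1
x1 = eps / 4                  # first box: width < eps/4, height <= 2
fine = [x1 + (1 - x1) * j / 2000 for j in range(2001)]
P = [0.0] + fine
assert approx_gap(P) < eps
```
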

This web
page has links to notes of Professor Cohen which we are (approximately)
following. Please look down the page and find "Riemann Integral, Section 1".

Not mentioned in class
I really should have mentioned that
the combination of the Lipschitz idea and the analysis above allows
the
creation of an algorithm to approximate the definite integral of a
continuously differentiable function as accurately as desired. If
|f'| <= K on [a,b], then on a partition with mesh delta the difference
between the upper and lower sums is at most K·delta·(b-a). So given
epsilon>0, we can take delta = epsilon/(K(b-a)). If we want to
approximate within 1/n we could take any Riemann sum on an
equal-length partition with more than nK(b-a)^2 subintervals. Any
Riemann sum will be between the upper and lower sums, and those will
both be within 1/n of the true value of the integral. Naturally,
"real" numerical analysis considers algorithms which converge more
rapidly, but what is described here is the beginning.
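Here is a sketch of that algorithm; the helper names and the example f = cos on [0, pi/2] (where K = 1 works and the true integral is 1) are mine, not from the lecture.

```python
# Guaranteed-accuracy integration of a function with |f'| <= K.
import math

def riemann_sum(f, a, b, n):
    """Left-endpoint Riemann sum on n equal subintervals."""
    w = (b - a) / n
    return sum(f(a + j * w) for j in range(n)) * w

def n_for_accuracy(K, a, b, eps):
    """Number of equal subintervals guaranteeing error < eps,
    since US - LS <= K·(b-a)^2 / n."""
    return math.ceil(K * (b - a) ** 2 / eps) + 1

a, b, K, eps = 0.0, math.pi / 2, 1.0, 1e-3
n = n_for_accuracy(K, a, b, eps)
approx = riemann_sum(math.cos, a, b, n)
assert abs(approx - 1.0) < eps   # true value: int_0^{pi/2} cos = 1
```
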

Plans ...
We need to prove "technical" results about order, linearity of the
integral, and additivity over intervals to help present a version of
the Fundamental Theorem of Calculus. I'll also go through a short
excursion on probability in order to show the class some interesting
and almost unbelievable examples.

Theorem (order and integral) Suppose f and g are Riemann
integrable on [a,b], and that for all x in [a,b], f(x) <= g(x). Then
int_a^b f <= int_a^b g.
Proof: This result should be relatively easy to verify. Let's
see: since f(x) <= g(x), the sup of g on an interval will be at least
as large as the sup of f on the same interval. Therefore
US(f,P) <= US(g,P) for all partitions P: each
of f's upper sums is less than or equal to the corresponding upper
sum of g. Now what can we say about the inf of f's upper sums (what
we called the upper integral of f, UI(f,[a,b]))? Since UI(f,[a,b]) is a
lower bound for all of f's upper sums, it is also a lower bound for
all of g's upper sums. Therefore UI(f,[a,b]) <= the greatest lower
bound of all of g's upper sums, so UI(f,[a,b]) <=
UI(g,[a,b]). Since both f and g are Riemann integrable, the UI's are
equal to the Riemann integrals, so that
int_a^b f <= int_a^b g.

Corollary Suppose that there are real numbers m and M so that
for all x in [a,b], m <= f(x) <= M, and f is Riemann integrable on
[a,b]. Then m(b-a) <= int_a^b f <= M(b-a).
Of course this can also be proved by just computing US(f,P)
and LS(f,P) when P is the "trivial" partition,
P={a,b}.

Linearity of the integral here will mean that
int_a^b (qf+g) = q·int_a^b f + int_a^b g
when f and g are Riemann integrable on [a,b] and q is a constant. I
will divide this into three parts, allowing me to concentrate on
smaller steps:
i) (Positive homogeneity) int_a^b q·f = q·int_a^b f when q>0.
ii) (Multiplication by -1) int_a^b -f = -int_a^b f.
iii) (Addition of functions) int_a^b (f+g) = int_a^b f + int_a^b g.

Proposition (Positive homogeneity)
Suppose q is a positive constant and f is
Riemann integrable on [a,b]. If g is the function defined by
g(x)=q·f(x), then g is Riemann integrable on [a,b] and
int_a^b g = q·int_a^b f.
Proof: If S is any subset of R and we define qS to be the set
of numbers y=qx where x is in S, then we know the following results
from long ago:
If S is bounded above (or below), then qS is bounded above (or
below). The converse is also true.
Why? If t is an upper bound of S, then t>=x for all x in S, so that
qt>=qx always. Lower bounds work the same way. The converse is
verified by "multiplying" qS by the positive number 1/q.
If S is bounded above, then sup(qS) = q·sup(S).
Why? If the sup of S is v, then qv is an upper bound of qS. Also, if w
is any upper bound of qS, then (1/q)w is an upper bound of S, so that
v <= (1/q)w, so that qv <= w, and we've verified that qv = sup(qS) as
desired.

Now apply these observations to the collection of upper sums of g and
the upper sums of f. Note that US(g,P) = q·US(f,P) because
of the ideas above. So we have verified that
UI(g,[a,b]) = q·UI(f,[a,b]). We can similarly verify that
LI(g,[a,b]) = q·LI(f,[a,b]). Since f is Riemann integrable,
UI(f,[a,b]) = LI(f,[a,b]), implying that UI(g,[a,b]) = LI(g,[a,b]), and
therefore g is also Riemann integrable. Whew! And we also know that
the upper and lower integrals both multiply by q, so that
int_a^b g = q·int_a^b f.
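The bookkeeping in this proof can be checked on a concrete partition; the partition and the pretend sups below are my own choices. With g = q·f and q > 0, every sup gets multiplied by q, so US(g,P) = q·US(f,P).

```python
# Homogeneity of upper sums, checked on one partition.

def upper_sum(sups, partition):
    """Upper sum, given the sup of f on each subinterval."""
    widths = [hi - lo for lo, hi in zip(partition, partition[1:])]
    return sum(s * w for s, w in zip(sups, widths))

P = [0.0, 0.4, 1.0, 1.5]      # a partition of [0, 1.5]
sups_f = [2.0, -1.0, 3.5]     # pretend sups of some f on the 3 pieces
q = 2.5
sups_g = [q * s for s in sups_f]   # sup(qS) = q·sup(S) for q > 0

assert abs(upper_sum(sups_g, P) - q * upper_sum(sups_f, P)) < 1e-12
```
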

4/23/2003

We used the lemma proved last time to verify the following:

Theorem (upper sums dominate lower sums) If S and
T are any partitions of [a,b], then
US(f,S) >= LS(f,T).
Proof: Here's the proof, which is very witty. Last time we proved that
more points in the partition may make the upper sum decrease, but
can't make it increase. A similar result (reversing directions,
though!) is true for lower sums. Therefore, if P is the union
of the partitions S and T, we have the following
sequence of inequalities:
LS(f,T)<=LS(f,P)<=US(f,P)<=US(f,S).
The central inequality (between LS and US for P) is true
because sups are bigger than infs, always.
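The inequality chain can be checked numerically, with P the common refinement of S and T. The sups and infs below are exact because f(x) = x·x is increasing on [0,1] (my example):

```python
# LS(f,T) <= LS(f,P) <= US(f,P) <= US(f,S) with P = S union T.

def sums(f, partition):
    """(lower sum, upper sum) for an increasing f: inf left, sup right."""
    ls = us = 0.0
    for lo, hi in zip(partition, partition[1:]):
        ls += f(lo) * (hi - lo)
        us += f(hi) * (hi - lo)
    return ls, us

f = lambda x: x * x
S = [0.0, 0.3, 0.8, 1.0]
T = [0.0, 0.5, 0.9, 1.0]
P = sorted(set(S) | set(T))          # common refinement

ls_T, _ = sums(f, T)
ls_P, us_P = sums(f, P)
_, us_S = sums(f, S)
assert ls_T <= ls_P <= us_P <= us_S
```
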

Comment I can't imagine a totally convincing picture of the
situation addressed in this result -- it seems really complicated.
Temporarily, I defined:
A = the set of all lower sums of f. That is, x is in A if
x = LS(f,P) for some partition P of [a,b].
B = the set of all upper sums of f. That is, x is in B if
x = US(f,P) for some partition P of [a,b].

The theorem just stated presents us with a situation which should be
familiar from earlier work in the course (a month or more ago). The
sets A and B have the following properties: if a is in A, then a is a
lower bound of B, and if b is in B, then b is an upper bound of A. It
is natural to look at the sup of A and the inf of B. Here we will use
special phrases:
The sup of A is called the lower Riemann integral of f on
[a,b], and is denoted LI(f,[a,b]).
The inf of B is called the upper Riemann integral of f on
[a,b], and is denoted UI(f,[a,b]).
It is always true that LI(f,[a,b]) <= UI(f,[a,b]). If these numbers
are equal, then we say that f is Riemann integrable on [a,b],
and the common value is called the Riemann integral of f on [a,b],
written int_a^b f, or, more commonly,
int_a^b f(x) dx. Note that the "x" in
the integration is a "local" (dummy) variable, and therefore the value
of int_a^b f(x) dx is the same as
int_a^b f(w) dw, which is the same as
int_a^b f(t) dt, etc.

If f is example 3 of the last lecture (0 on the irrationals, and 1 on
the rationals, and the interval is [0,1]) then A={0} and B={1}, not
very big sets, and not very complicated!

How can we tell if the Riemann integral exists, and how can we get
interesting examples? We will begin with this theorem:

Theorem f on [a,b] is Riemann integrable if and only if for
every epsilon>0, there is a partition P of [a,b] so that
US(f,P) - LS(f,P) < epsilon.
Proof: Let's first assume that f is Riemann integrable. Then
LI(f,[a,b])=UI(f,[a,b]). Now since LI(f,[a,b]) is a sup (the sup of
the set A), given epsilon/2, we can find a partition S so that
LI(f,[a,b])-epsilon/2<LS(f,S)<=LI(f,[a,b]). Since UI(f,[a,b])
is an inf (the inf of the set B), given epsilon/2, we can find a partition
T so that
UI(f,[a,b])<=US(f,T)<UI(f,[a,b])+epsilon/2. Now let P
be the partition which is the union of S and T (what is
called classically the common refinement of the two
partitions). We also know that
LI(f,[a,b]) = UI(f,[a,b]) = int_a^b f. Now we can
package all this into a wonderful chain of inequalities:
int_a^b f - epsilon/2 < LS(f,S) <= LS(f,P) <= US(f,P) <= US(f,T) < int_a^b f + epsilon/2.
So the numbers LS(f,P) and US(f,P) are both
"sandwiched" into an interval centered around
int_a^b f, and this interval has length at most
epsilon. So we are done with this part of the proof.

Now we need to verify that the "epsilon condition" implies Riemann
integrability. Remember what the sets A and B are. Since every element
of B is an upper bound of A, and every element of A is a lower bound
for B, we already know that sup A <= inf B. Why is this
true? Well, if sup A > inf B, then take epsilon =
sup A - inf B > 0. We can find (sup characterization) a in A so
that sup A >= a > sup A - epsilon = inf B. But we could then
(inf characterization) find b in B with
a > b >= inf B, which contradicts the known fact that a <= b
for all choices of a in A and b in B. How can we prove Riemann
integrability? This condition is exactly the same as proving
sup A=inf B. Since we already know
sup A<=inf B, let us see what happens when
sup A<inf B. Then take
epsilon=inf B-sup A>0. The assumption in the statement of the
theorem says we can find a in A and b in B with b-a<"this" epsilon. So
b-a<inf B-sup A.
But certainly a<=sup A and inf B<=b, yielding (since
-sup A<=-a) inf B-sup A<=b-a. This is a
contradiction! Whew. (The logic in all this is a bit intricate, but is
very similar to lots of proofs we did a month or two ago.)

Even better is the following result:

Theorem f is Riemann integrable on [a,b] if and only if there
is a sequence of partitions (P_n) so that
US(f,P_n) - LS(f,P_n) < 1/n. If
such a sequence of partitions exists, then the two sequences of
numbers, (US(f,P_n)) and
(LS(f,P_n)), both converge to a common limit, and
that limit is int_a^b f.

With this result (whose proof I postponed until next time) we will
actually be able to effectively recognize some Riemann integrable
functions and maybe compute some integrals.

Example 1 A step function: Select numbers
a<c<d<b. Define the function f(x) by f(x)=1 if c<x<d,
and f(x)=0 otherwise. This is the simplest example of what the text
(and other sources) call a step function. With some effort, we
decided to look at the following partition:
P = {a, c, c+(1/3n), d-(1/3n), d, b}.
The lower sum is exactly
1·((d-(1/3n))-(c+(1/3n))) = d-c-(2/3n). Be careful with
endpoints -- only one rectangle is non-zero! The upper sum is
1·((d-(1/3n))-(c+(1/3n))) + 1·(d-(d-(1/3n))) + 1·((c+(1/3n))-c) = d-c
(no n's intervene at all!). The difference is 2/3n, which is
certainly less than 1/n, so the hypotheses of the preceding theorem
are satisfied. The limits of the sequences (d-c-(2/3n)) and (d-c) are
both d-c, which therefore must be the integral of f over [a,b].
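The step-function computation can be replayed in exact rational arithmetic; the concrete values of a, c, d, b, and n below are my own choices.

```python
# f is 1 on (c,d) and 0 elsewhere; with the partition
# P = {a, c, c+1/(3n), d-1/(3n), d, b}, the lower sum should come out
# to d-c-2/(3n) and the upper sum to d-c.
from fractions import Fraction as F

def lower_upper(a, c, d, b, n):
    P = [a, c, c + F(1, 3 * n), d - F(1, 3 * n), d, b]
    ls = us = F(0)
    for lo, hi in zip(P, P[1:]):
        # inf of f is 1 only on the middle piece, wholly inside (c,d);
        # sup of f is 1 on any piece lying within [c,d]
        inf = 1 if (lo > c and hi < d) else 0
        sup = 1 if (lo >= c and hi <= d) else 0
        ls += inf * (hi - lo)
        us += sup * (hi - lo)
    return ls, us

a, c, d, b, n = F(0), F(1), F(2), F(3), 6
ls, us = lower_upper(a, c, d, b, n)
assert ls == d - c - F(2, 3 * n)
assert us == d - c
```
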

Example 2
f(x) = (1/5)x^2 + arctan(cos(x^2)). This is a
revision of history. In class I actually analyzed
f(x) = 5x^7 + arctan(cos(x^17)) on the interval
[3,11]. This was, of course, partly a joke, but also partly a serious
effort to show that we could "handle" something this complicated. I am
switching to the function written above because I had Maple graph it,
and this one wiggled a lot, whereas the function I did in class has
very small scale wiggles (the seventh power really dominates the
arctan term!) so it doesn't look as weird. So here
f(x) = (1/5)x^2 + arctan(cos(x^2)). We will analyze
this function with the help of some tools from calc 1. Therefore
f'(x) = (2/5)x + (1/(1+(cos(x^2))^2))·(-sin(x^2))·(2x).
And even more is true if we try to get a very rough upper bound on
|f'(x)|. The triangle inequality tells us that this is
<= |(2/5)x| + |(1/(1+(cos(x^2))^2))·(-sin(x^2))·(2x)|.
The first term is (2/5)x, which on [3,11] is at most 22/5 < 5. Much
of the second term can be overestimated by 1 (the fraction, the sine)
so we just have left |2x|, which is at most 22, so that always
|f'(x)| < 27.
The Mean Value Theorem of calculus says that for x and y in [3,11],
|f(x)-f(y)|<=27|x-y| (because the quotient (f(x)-f(y))/(x-y) will
always be f'(c) and this in absolute value is less than 27). So this f
is a Lipschitz function with Lipschitz constant less than 27. (We
encountered such functions before [butterfly functions!] in the
lecture of 4/14). Now let us think about a partition of [3,11] into
J (an integer) equal pieces. The mesh of the partition will be
(11-3)/J=8/J, and that will be the length of each subinterval. If x
and y are in the same subinterval, therefore,
|x-y|<8/J. And from the Lipschitz inequality,
|f(x)-f(y)|<(27)(8/J). Notice that f is continuous, so that the sup
and the inf over each subinterval are values of f on the subinterval
(f(x) and f(y) in the above). Thus the difference between the upper
sum and the lower sum will be at most: (# of
subintervals)·(maximum variation in the
subinterval)·(length of the subinterval). This is
J·(27)(8/J)·(8/J). A simplification gives
(27)(8)(8)/J. And when J gets large, this gets small, so we have
indeed verified the criterion for this function to be Riemann
integrable on [3,11]. Wow! More to come next time.
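A numeric check of this estimate: the upper-lower gap on J equal pieces of [3,11] should be at most 27·8·(8/J) = 1728/J. The sups and infs below are approximated by sampling, which can only underestimate the true gap, so the inequality is a fair test.

```python
# Upper-lower gap for f(x) = (1/5)x^2 + arctan(cos(x^2)) on [3,11].
import math

def f(x):
    return 0.2 * x * x + math.atan(math.cos(x * x))

def approx_gap(J, samples=50):
    """Approximate US - LS on the J-piece equal partition of [3,11]."""
    w = 8.0 / J
    total = 0.0
    for j in range(J):
        lo = 3.0 + j * w
        vals = [f(lo + w * k / samples) for k in range(samples + 1)]
        total += (max(vals) - min(vals)) * w
    return total

for J in (100, 1000):
    assert approx_gap(J) <= 1728.0 / J   # the Lipschitz bound
assert approx_gap(1000) < approx_gap(100)  # the gap really shrinks
```
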

4/21/2003

The problem of area
What is area?

This is a serious geometric question and difficult to answer. Usually
we would like "area" to be something satisfying the following rules:

The area of a piece of R^2 should be a
non-negative number.

The area of bigger pieces should be a bigger number.

The area of congruent pieces should be the same.

If two pieces don't overlap or overlap only at the boundary, then
the area of the union of the two pieces should be the sum of the areas
of the pieces. More generally, if you split up pieces of the plane
into subpieces, the areas of the subpieces should add up to the area
of the original piece.

(Normalization) The area of the unit square should be 1.

These are the rules that area should seem to follow. Of course, some
of the words need explaining, and some of them need lots of
explanation. One book that I've read recently which discusses the
classical Euclidean approach to this problem and others is
Hartshorne's Geometry: Euclid and Beyond which I would
recommend for those who want to study the axiomatics of classical
geometry, now thousands of years old. The book isn't easy, but it has
a great deal of content. The idea of a "piece" of the plane is
certainly imprecise. "Bigger" in the case of numbers is <=. In the
case of pieces of the plane, it probably should mean "is a subset
of", so larger sets have larger areas. The word "congruent" probably
means, as was said in class, the same shape and size: this means that
if we translate or rotate or flip sets, the results should have areas
equal to the sets we started with. In the case of sets being "split
up" they should not have overlapping interiors: only the boundaries
are allowed to overlap. Whew!

Just an initial statement of these properties is awesome. What is more
distressing is the following statement, whose verification needs more
time than this course has to run: there is no way to assign
area to every subset of the plane in a way that obeys all of the
rules above. This is irritating. In R^3 the situation
is even worse. It turns out that some "obvious" facts about
decomposition of polyhedra into pieces of equal volume are also not
true! There is more information about this in Hartshorne's book. Here
we do something much more pedestrian. We will try to assign "area"
(actually, the definite integral) to regions in the plane bounded by
the x-axis, x=a, x=b (here a<b), and y=f(x). Even this seemingly
more modest goal will turn out to be more difficult than one thinks,
and the examples we will consider will be intricate and
irritating.

We will follow the lead of Cauchy and Riemann in this. Bressoud's book
(referenced in the general background to the course) discusses some of
Cauchy's ideas and shows that some of what Cauchy wrote was just
wrong! This stuff can be difficult. The basic idea is exactly
described by the picture to the right, which is almost surely rather
familiar to every student who has been through a basic calculus
course. We need to label and define and investigate every aspect of
this picture. I note that we are investigating what is called the
Riemann integral (Google has over 72,000 responses to "Riemann
integral"). Another candidate for integration is called the "Lebesgue
integral" (Google has over 29,700 responses to "Lebesgue
integral").

We will start with a function f defined on [a,b]. We need to split up
the interval. The word "partition" is used both as a verb and as a
noun in this subject. As a verb, partition means to break up the
interval into subintervals. As a noun, currently more important,
"partition" P will mean a finite subset of [a,b] which contains
at least both a and b. So a partition could be as small as just {a,b}
(I'll assume here that a<b, so a and b are distinct). Or a
partition could have 10^100 points. The points in P
will usually be written here as
{a = x_0 < x_1 < x_2 < ... < x_{n-1} < x_n = b}.
In this case the partition has n+1 elements, and it has divided the
interval into n subintervals (although frequently the subintervals
have equal length, this is not required). The mesh of the
partition P will be written ||P|| and it means the
maximum of x_j - x_{j-1} for j running from 1 to
n. Since P is a finite set, the word "maximum" can be used, and
the mesh will equal one of these differences. Additionally, we will
need to specify "tags" (word used in the text) or "sample points"
(phrase I am more used to). So this is a selection of t_j in each
subinterval [x_{j-1},x_j] as j runs from 1 to n. The
t_j will be used to create the height of each
subrectangle. Then the Riemann sum of f on [a,b] with partition
P and sample points T is
sum_{j=1}^n f(t_j)·(x_j - x_{j-1}).
The idea is that as ||P|| --> 0, this sum should tend to some
sort of limit, and this will be the area or the definite
integral. We'll call this RS(f,P,T): it is a complicated
creature. I will return to this general sum later, but right now I
will try something which may be a bit easier to handle: upper and
lower sums.
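The definition of RS(f,P,T) translates almost verbatim into code. A small sketch; the midpoint tags and the linear test function are my own choices.

```python
# RS(f,P,T) = sum over j of f(t_j)·(x_j - x_{j-1}).

def riemann_sum(f, P, T):
    """Riemann sum of f for partition P and tags T (one tag per piece)."""
    assert len(T) == len(P) - 1
    return sum(f(t) * (hi - lo) for t, lo, hi in zip(T, P, P[1:]))

# example: f(x) = x on [0,1], 4 equal pieces, midpoint tags
P = [0.0, 0.25, 0.5, 0.75, 1.0]
T = [(lo + hi) / 2 for lo, hi in zip(P, P[1:])]
rs = riemann_sum(lambda x: x, P, T)
assert abs(rs - 0.5) < 1e-12   # midpoint tags are exact for f(x) = x
```
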

The Upper Sum of f on [a,b] with partition P, US(f,P), is
sum_{j=1}^n (sup of f on [x_{j-1},x_j])·(x_j - x_{j-1}).
Several observations should be made about this, and about the
corresponding definition, to be made below, of "lower sum". First, sup
and inf need to be used, and these sums don't need to be Riemann
sums. The reason for this is that functions don't need to attain their
sups and infs (example: f(x)=x for x in [0,1) and f(1)=0). If, however,
the function f is continuous, then it will attain its sup and inf on
closed bounded intervals, so that the upper and lower sums will be
Riemann sums. Second, for us to know that sup and inf exist and are
real numbers, the function f should be bounded on [a,b]: there is a
positive real number M so that |f(x)| <= M for all x in [a,b]. The
function f(x)=1/x for x>0 and f(0)=0 is not bounded on [0,1], and
therefore our theory will not apply to it (such phenomena need to be
analyzed as improper integrals -- we are considering only proper
definite integrals here). Whew! Now for the lower sum:
The Lower Sum of f on [a,b] with partition P,
LS(f,P),
is
sum_{j=1}^n (inf of f on [x_{j-1},x_j])·(x_j - x_{j-1}).
Notice that Riemann sums (with tags or sample points) will always be
caught between the upper and lower sums associated with their
partitions.

Example 1 f(x)=x, [0,1]. Here I looked at the partition
P which was {0,1/n,2/n,3/n,...,(n-1)/n,1}, an evenly spaced
partition dividing [0,1] into n equal subintervals. The difference
between the upper and lower sums can be exactly computed in
this case (most unusual, because almost always we will need to
estimate such things). Just "shove over" the boxes so they line up and
have height 1 and width 1/n, so that the difference between the upper
and lower sums is 1/n. As n gets large, this discrepancy -->0.
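Example 1 can be replayed in a few lines. Since f(x)=x is increasing, the sups and infs sit at subinterval endpoints, the gap comes out to exactly 1/n, and randomly chosen tags illustrate the remark that Riemann sums are always caught between the upper and lower sums.

```python
# Upper and lower sums for f(x) = x on [0,1], n equal subintervals.
import random

def us_ls(f, P):
    """(upper sum, lower sum) for an increasing f on partition P."""
    us = sum(f(hi) * (hi - lo) for lo, hi in zip(P, P[1:]))
    ls = sum(f(lo) * (hi - lo) for lo, hi in zip(P, P[1:]))
    return us, ls

f = lambda x: x
n = 10
P = [j / n for j in range(n + 1)]
us, ls = us_ls(f, P)

random.seed(1)
tags = [random.uniform(lo, hi) for lo, hi in zip(P, P[1:])]
rs = sum(f(t) * (hi - lo) for t, lo, hi in zip(tags, P, P[1:]))
assert ls <= rs <= us                    # any tagged sum is squeezed
assert abs((us - ls) - 1 / n) < 1e-12    # here the gap is exactly 1/n
```
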

Example 2 f(x)=1 if x=1/2 and f(x)=0 otherwise. On the interval
[0,1] use the partition {0, 1/2-1/n, 1/2+1/n, 1}. This partition has four
points and three subintervals. The lower sum is 0 because the inf on
each of the three subintervals is 0. The upper sum has three parts. The
left and right parts are 0 because the sup is 0 there, but the inside
part, with width 2/n, has sup=1. Hence the upper sum is 2/n. The
difference between the upper and lower sums here also -->0 as n gets
large.

Example 3 Consider the function f which is 0 on all of the
irrationals and 1 on all the rationals. Then since there are rationals
and irrationals in all intervals of positive length (the "density" of
the rationals and irrationals) all of the upper sums on the interval
[0,1] are 1 and all of the lower sums on the interval [0,1] are
0. There is always a discrepancy of 1 between the upper and lower
sums.

Definition The Upper Riemann integral is the inf of all
of the upper sums of f. The Lower Riemann integral is the sup
of all of the lower sums of f. We will say that f is Riemann
integrable if the upper and lower Riemann integrals are equal. The
common value will be called the Riemann integral of f.

So the functions of examples 1 and 2 are Riemann integrable, with
integrals of value 1/2 and 0 respectively. The function in example 3
is not Riemann integrable.

The following result plays an important part in any development of the
Riemann integral.

Lemma Suppose P is a partition of [a,b] and q is a point
of [a,b] which is not in P. Let Q be the partition
obtained by taking P union {q}. Then:
US(f,P) >= US(f,Q) and
LS(f,P) <= LS(f,Q).
The idea is that the approximations should get closer when we throw in
more points. We will use this lemma many times.
Proof: I will only look at the upper sums. Let me suppose that q is
between x_{j-1} and x_j. Then all of the terms in
the upper sums for P and Q are exactly the same except
for one term in the P sum:
(sup of f on [x_{j-1},x_j])·(x_j - x_{j-1}), which
is replaced by these terms in the Q sum:
(sup of f on [x_{j-1},q])·(q - x_{j-1}) + (sup of f on [q,x_j])·(x_j - q).
There are now several observations:
1. (x_j - x_{j-1}) = (x_j - q) + (q - x_{j-1}).
2. (sup of f on [x_{j-1},x_j]) >= (sup of f on [q,x_j]).
3. (sup of f on [x_{j-1},x_j]) >= (sup of f on [x_{j-1},q]).
2 and 3 are true because sups on bigger sets may be larger but cannot
be smaller.
These observations combine to prove the result stated: multiply
inequality 2 by x_j - q; multiply inequality 3 by
q - x_{j-1}; add the results and use the equation in 1. This
shows that the Q sum is overestimated by the P sum. The result for
lower sums is similar.
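The lemma can be checked numerically: throwing an extra point q into a partition cannot increase the upper sum. The sups below are exact because the example f(x) = x·x is increasing on [0,2] (my choice).

```python
# Refining a partition by one point never increases the upper sum.

def upper_sum(f, P):
    """Upper sum for an increasing f: sup at the right endpoint."""
    return sum(f(hi) * (hi - lo) for lo, hi in zip(P, P[1:]))

f = lambda x: x * x
P = [0.0, 0.7, 1.5, 2.0]
for q in (0.3, 1.0, 1.9):
    Q = sorted(P + [q])
    assert upper_sum(f, Q) <= upper_sum(f, P)
```
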

On Wednesday I hope to give out some notes written by Professor Cohen
which will outline this material.

4/16/2003

The instructor addressed the question, "Where do we go from
here?" and then answered even more inquiries in preparation for the
exam.

I brought in the text (a standard calculus book) used for Math 151 and
compared what's in Chapter 6 of the Math 311 textbook with that
text. The approach and even many of the pictures, are the same. What
happens?

The definition of derivative; a differentiable function is
continuous.

If a function has a local extremum and is differentiable there,
then the derivative is 0 at that point.

Rolle's Theorem (a special case of the MVT): a function which is
differentiable in [a,b] with f(a)=f(b)=0 must have at least one c in
(a,b) with f'(c)=0.

Mean Value Theorem (tilted form of Rolle's Theorem): if f is
differentiable in [a,b], there is at least one c in (a,b) with
f'(c)=(f(b)-f(a))/(b-a).

f' positive in an interval implies f is increasing there; f'
negative in an interval implies f is decreasing there.

More topics: l'Hopital's rule; Taylor's Theorem.

Of course there are examples in the 311 text which have more interest
due to our developing knowledge than what's in the calculus text. But
I would like to temporarily skip this material and discuss the
integral, which is almost always given less attention than the
derivative in calculus courses. The examples illustrating features of
the integral are to me more interesting than corresponding examples
for the derivative. The examples tie into a number of applications in
other areas, such as probability. Therefore on Monday, April 21, we
will begin material in chapter 7. The treatment in class will not be
the same as in the textbook.

In the balance of time remaining for the class I tried to answer more
questions in preparation for the exam. Maybe the only interesting
comment was my remembering a "workshop" problem from calculus. The
problem was something like this:

Aliens change and permute 10,000 values of the function
F(x)=x^2. That is, they might change (2,4) and (5,25) to
(2,25) and (5,4), except that this is done to 10,000 points. What is
the value of lim_{x-->a} F(x) and why?

The answer is that lim_{x-->a} F(x) exists for all a's, and its
value is always a^2. This is rather surprising, but once one
is "close enough" to a (and not equal to a) there is no difference
between the original function and the altered function.

I worked on some other textbook and workshop problems, just as I had
in the review session the night before.