Background: When I first took measure theory/integration, I was bothered by the idea that the integral of a real-valued function w.r.t. a measure was defined first for nonnegative functions and only then for real-valued functions using the crutch of positive and negative parts (and only then for complex-valued functions using their real and imaginary parts). It seemed like a strange starting point to make the theory dependent on knowledge of the nonnegative function case when this certainly isn't necessary for Riemann integrals or infinite series: in those cases you just take the functions or sequences as they come to you and put no bias on positive or negative parts in making the definitions of integrating or summing.

Later on I learned about integration w.r.t. a measure of Banach-space valued functions in Lang's Real and Functional Analysis. You can't break up a Banach-space valued function into positive and negative parts, so the whole positive/negative part business has to be tossed aside as a foundational concept. At the end of this development in the book, Lang isolates the special aspects of integration for nonnegative real-valued functions (which potentially could take the value $\infty$). Overall it seemed like a more natural method.

Now I don't think a first course in integration theory has to start off with Banach-space valued functions, but there's no reason you couldn't take a cue from that future generalization by developing the real-valued case in the same way Banach-space valued functions are handled, thereby avoiding the positive/negative part business as part of the initial steps.

Finally my question: Why do analysts prefer the positive/negative part foundations for integration when there is a viable alternative that doesn't put any bias on which function values are above 0 or below 0 (which seems to me like an artificial distinction to make)?

Note: I know that the Lebesgue integral is an "absolute" integral, but I don't see that as a justification for making the very definition of the integral require treatment of nonnegative functions first. (Lang's book shows it is not necessary. I know analysts are not fond of his books, but I don't see a reason that the method he uses, which is just copying Bochner's development of the integral, should be so wildly unpopular.)

Isn't it true that Lang was just following the Bourbaki approach?
– Victor Protsak, May 19 '10 at 4:51

In Lang's treatment, the monotone convergence theorem and Fatou's lemma play a comparatively minor role in the foundational development, which might seem very strange to the analysts. But at the end of the day his book produces all the same basic theorems, and he has an exercise on non-negative functions to show the traditional approach via suprema of simple non-negative functions produces the same completion as the approach used in his book.
– KConrad, May 19 '10 at 23:00

8 Answers

An order-theoretic completion. For this, it's easiest to start with non-negative functions, and have infinite values dealt with pretty naturally.

A metric completion. For this, it's more natural to start with finite-valued signed simple functions.

It's not exactly that simple -- historically, signed simple functions (well, actually, I think they used step functions) were used in an order-theoretic treatment by Riesz and Nagy. But I think this is a good way to look at the two ways of approaching this integral.

And needless to say, these two approaches generalize in two different contexts. They are both interesting and illuminate somewhat different aspects of the Lebesgue integral, even on the real line. For instance, the order-theoretic approach leads quickly to results such as the monotone convergence and bounded convergence theorems, while the metric approach leads naturally to the topology of convergence in measure and completeness of the $L_p$ spaces.
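
To make the contrast concrete, here is a rough sketch in symbols. The notation $\varphi$, $\varphi_n$ for simple functions is an assumption introduced only for illustration. The order-theoretic route defines, for measurable $f$ with values in $[0,\infty]$,

$$\int f\,d\mu = \sup\Big\{ \int \varphi\,d\mu \;:\; 0 \le \varphi \le f,\ \varphi \text{ simple} \Big\}.$$

The metric route starts from finite-valued signed simple functions $\varphi = \sum_i a_i 1_{A_i}$ (with the $A_i$ disjoint and $\mu(A_i) < \infty$), uses the seminorm

$$\|\varphi\|_1 = \int |\varphi|\,d\mu = \sum_i |a_i|\,\mu(A_i), \qquad \Big|\int \varphi\,d\mu\Big| \le \|\varphi\|_1,$$

and extends the integral by continuity: if $(\varphi_n)$ is Cauchy for $\|\cdot\|_1$ and converges to $f$ almost everywhere, one sets $\int f\,d\mu = \lim_n \int \varphi_n\,d\mu$.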

Ah, this is a nice point of view that had never explicitly occurred to me before. It's just like Dedekind cuts vs. Cauchy sequences to construct the real numbers from the rationals. Each of those has generalizations in directions not covered by the other (e.g., ordered groups being completed using Dedekind cuts). To a newcomer, Dedekind cuts require less baggage to describe, even though they might have far less potential down the road. In the case of integration, does the order-theoretic way of approaching $L^1$ have any application beyond the case of real-valued functions?
– KConrad, May 19 '10 at 22:51

Completion of what? I understand how the extended reals can be seen as an order completion of ${\mathbb R}.$ But what does metric completion have to do with vector valued integrals? What are we metrically completing? ${\mathbb R}$ is already complete. The connection between order and monotone convergence makes sense. But the examples of the "metric" approach seem off topic. Are you saying convergence in measure and completeness of $L_p$ are more natural if you avoid the extended reals? I don't get it.
– Fabrizio Polo, May 20 '10 at 7:28

Fabrizio, Carl is referring, I believe, to the $L^1$-functions as a metric completion in the $L^1$-seminorm of the space of step maps (simple maps) with real values. You could also look at the $L^1$-seminorm on step maps with values in a Banach space (as Lang does, with absolutely no difference in treatment, for him, from the case of real-valued functions, as he makes no systematic use of monotone convergence or Fatou's lemma from the start).
– KConrad, May 20 '10 at 9:34

Yes, that's exactly right. If you're dealing with a metric space, everything has to be finite, so you don't want to start out dealing with simple functions that can assume infinite values. Another book that develops the integral metrically is Halmos. These two ways of looking at the integral are both important, and most treatments actually slip back and forth from one to the other. And what makes the integral such a versatile and remarkable object is its relation to different structures -- order, metric, and certainly linear as well. The fact that it is a linear operator is quite central.
– Carl Offner, May 20 '10 at 22:18

There is an extensive theory of ordered vector spaces, somewhat analogous to the theory of topological vector spaces (there is considerable overlap, but neither includes the other). It is not as widely known nowadays as it once was; its heyday was perhaps 1930 to 1960.
– Gerald Edgar, Jun 10 '11 at 15:55

The integration of nonnegative functions deserves its own chapter, just like nonnegative measures. It has more features than the general case and there are cases when you need exactly these features and do not need negative numbers.

And it is so elegant: every nonnegative measurable function has an integral, and integration is uniquely characterized by 3 properties: the integral of $1_A$ is the measure of $A$, integration is additive, and it satisfies the monotone convergence theorem.

For example, let $f:X\to Y$ be a map and $\mu$ a measure on $X$. Then one has a push-forward measure $\mu'$ on $Y$ defined by $\mu'(A)=\mu(f^{-1}(A))$. Integration against $\mu'$ is given by the formula $\int_Y h\,d\mu' =\int_X (h\circ f)\,d\mu$. Why? Because the right-hand side satisfies the above 3 axioms. This is trivial to check, and going through any explicit definition of the integral would be painful.

So essentially one proves the formula first for step functions, then for nonnegative functions via monotone limits, and then the general case follows. It is a standard type of argument, and it looks natural and obvious to a student who learned integration the traditional way.
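
For what it's worth, the check can be sketched as follows, assuming $f$ is measurable and writing $I(h) = \int_X (h\circ f)\,d\mu$ for nonnegative measurable $h$ on $Y$ (the symbol $I$ is introduced here only for illustration):

$$\begin{aligned}
I(1_A) &= \int_X 1_{f^{-1}(A)}\,d\mu = \mu(f^{-1}(A)) = \mu'(A),\\
I(g+h) &= \int_X \big(g\circ f + h\circ f\big)\,d\mu = I(g) + I(h),\\
h_n \uparrow h \ &\Longrightarrow\ h_n\circ f \uparrow h\circ f \ \Longrightarrow\ I(h_n) \uparrow I(h).
\end{aligned}$$

Each line uses only the corresponding property of integration against $\mu$, so $I$ satisfies the three properties that characterize integration against $\mu'$.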

I agree the non-negative case deserves a separate treatment because of its features which don't work in the general case (function values not being in the real numbers), just like positive measures have their own features which don't occur in other settings like complex measures. But usually the vector-valued treatment builds on this non-negative case rather than developing it from scratch to see it runs on its own internal logic.
– KConrad, May 19 '10 at 22:57

When I said in my previous comment that the vector-valued treatment builds on the non-negative case, I meant that in a negative sense. It seems weird to create $L^1$ for real-valued functions in a fundamentally different way than you do for vector-valued (i.e., Banach space valued) functions.
– KConrad, May 19 '10 at 23:05

My point is that this approach has pedagogical advantages: a student gets familiar with a useful technique while working on this definition. Another plus is that this definition is short.
– Sergei Ivanov, May 20 '10 at 8:49

When I was a student, the apparent ugliness of this approach struck me too. I tried to find a better way but failed: everything had serious disadvantages from an undergraduate teaching perspective.
– Sergei Ivanov, May 20 '10 at 9:20

Trying to get somewhere quickly is not a point I had thought about. I assumed that students who study Lebesgue integration will already know about completions of metric spaces, so the vector-valued approach (even if used exclusively for real-valued functions in the course) should be accessible, although admittedly it may take more time than the traditional way to reach key theorems. Trying to teach with the vector-valued approach while also doing things specific to the real-valued case (e.g., the monotone convergence theorem) which require $\infty$ as a function value could create confusion. Thank you for the answer.
– KConrad, May 20 '10 at 9:48

In standard textbooks, the Lebesgue integral is first defined for $f$ with values in $[0,\infty]$. Note that the value $\infty$ is allowed, and it is important that it can be taken even on a set of positive measure.

You can see why for example in the one-line proof of the Borel-Cantelli lemma:

Let $\mu$ be finite, and let $A_i$ be measurable sets such that $\sum_i \mu(A_i)< \infty$. Then a.e. $x$ belongs to only finitely many of the $A_i$.
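
A sketch of that one-line proof, assuming the monotone convergence theorem for $[0,\infty]$-valued functions (the name $N$ for the counting function is introduced here for illustration): put $N = \sum_i 1_{A_i}$, so that $N(x) \in [0,\infty]$ counts how many of the $A_i$ contain $x$. Then

$$\int N\,d\mu = \sum_i \int 1_{A_i}\,d\mu = \sum_i \mu(A_i) < \infty,$$

so $N < \infty$ almost everywhere. Note that $N$ is a priori allowed to be $\infty$ on a set of positive measure; it is the finiteness of the integral that rules this out.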

If you start with a set of functions $f$ with values in $[-\infty,\infty]$, you will have to assume that either $-\infty$ or $\infty$ is taken only on a set of zero measure (so as to prevent the $\infty -\infty$ problem), and if you want it to work both for $f$ and $-f$, you will have to assume that neither $-\infty$ nor $\infty$ is taken on a set of positive measure, thus ruling out the kind of argument used in the Borel-Cantelli lemma. This is of course unbearable for an analyst, and even more so for a probabilist, but it is perfectly sound from a functional analysis perspective.

You want $\int f\,d\mu$ to make sense for every nonnegative measurable function, and you want to be able to employ the monotone convergence theorem even when the limit is not integrable.

But sometimes I have developed the Lebesgue integral for bounded measurable functions supported on sets of finite measure and introduced $L_1$ as the completion of this. Some books start similarly (they define the integral for bounded measurable functions supported on sets of finite measure) but then define $\int f\,d\mu$ for nonnegative measurable functions and take differences when it makes sense. That mixed approach seems a bit weird to me.

Traditionally, infinite integrals are seen as a more immediate obstacle (${\mathbb R}$ is an infinite measure space so this difficulty shows up quite quickly.) Such integrals probably seem more obviously relevant to students at first. So texts usually develop integration with (1) in mind from the beginning.

To me, integration of vector valued functions is a lot more natural, and playing with the extended reals seems like little more than convenient notational trickery. But this is hindsight. I'm quite sure that when I first learned integration I would have been much more concerned with problems caused by infinity.

Well, it is not true that the Riemann integral and series avoid the distinction altogether.

If you want to define improper Riemann integrals, you can proceed in two ways. Say you want to define $\int_a^b f(x)\,dx$, where $a, b$ are not necessarily finite and $f$ need not be bounded.

Either you split $f$ into its positive and negative parts, or you define the integral as a limit of the truncated functions on a truncated domain, something like

$$\lim_{t \to +\infty} \lim_{s \to a^+} \lim_{r \to b^-} \int_s^r \max \{ \min \{ f(x), t \}, -t \} dx $$
But then the result, for functions not in $L^1$, depends on the way you choose to go to the limit.
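
One concrete instance, as a sketch (the function is my choice of example, not part of the argument above): take $f(x) = x/(1+x^2)$ on $\mathbb{R}$, which is not in $L^1$. Two different ways of exhausting the domain give two different answers:

$$\int_{-r}^{r} \frac{x}{1+x^2}\,dx = 0, \qquad \int_{-r}^{2r} \frac{x}{1+x^2}\,dx = \frac12 \ln\frac{1+4r^2}{1+r^2} \xrightarrow[r\to\infty]{} \ln 2.$$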

Exactly the same happens for series: for those which are not absolutely convergent, the result depends on the order of summation.
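
The standard illustration, included here as a sketch, is the alternating harmonic series: summed in its natural order it gives $\ln 2$, while the rearrangement taking two positive terms for each negative one converges to a different value,

$$1 - \tfrac12 + \tfrac13 - \tfrac14 + \cdots = \ln 2, \qquad 1 + \tfrac13 - \tfrac12 + \tfrac15 + \tfrac17 - \tfrac14 + \cdots = \tfrac32 \ln 2.$$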

The trick of considering $f_{+}$ and $f_{-}$ allows one to handle improper integrals, which may well be infinite, and to declare that the integral of $f$ does not make sense in those cases where the order of the limits matters.

Of course you know all this, but I post it as an answer since it would be too long for a comment, so that you can comment to explain why this reason is not enough for you.

I consider the improper/principal value integrals not to be part of the basic development but something introduced only later.
– KConrad, May 18 '10 at 20:57

Uhm... but this way you cannot even integrate a function on $\mathbb{R}$!
– Andrea Ferretti, May 18 '10 at 23:11

If I may intercede: sure you can, the function just needs to be compactly supported! This illustrates a functional (unintended pun) distinction between the Riemann integral and the Lebesgue integral: the former is a differential geometric tool and the latter is an analytic tool.
– Victor Protsak, May 19 '10 at 4:48

@Victor Protsak: for a first course, if you want to integrate some smooth examples on $\mathbb{R}$, it is much easier not to require that they be compactly supported.
– Benoît Kloeckner, May 19 '10 at 8:30

Andrea, when first developing the Riemann integral you don't integrate over the whole real line. That only comes later, with the definition being in terms of integration on bounded intervals (of, say, continuous functions), which has already been defined by more basic methods. This is what I had in mind when I said improper integrals are not part of the basic development. By "basic" I meant "initial". It comes later.
– KConrad, May 19 '10 at 22:46

I haven't looked at Lang's book before, but after a quick skim on Google Books, I see that his approach is to define the integral of simple functions, then use a completion with respect to the $L^1$ seminorm. One reason to avoid this approach is that it requires more functional analytic sophistication than we usually want to assume when first developing Lebesgue integration. This is true even for real-valued functions. Reducing first to the case of nonnegative functions allows the reduction to simple functions to be done by more elementary means.

Another point is that a rigorous treatment of improper Riemann integration usually does involve splitting into positive and negative parts.

"Another point is that a rigorous treatment of improper Riemann integration usually does involve splitting into positive and negative parts." Really? This seems to only give the notion of an absolutely convergent improper integral, which is too weak for applications. E.g., we want $\int_1^{\infty} \frac{\sin x}{x}\,dx$ to be convergent, right?
– Pete L. Clark, May 19 '10 at 0:21

I just noticed that my point is addressed in Andrea Ferretti's answer.
– Pete L. Clark, May 19 '10 at 0:23

The question has been exhaustively answered, but I'd like to add a remark. The way I view the Lebesgue integral is: to every positive measurable function you can associate a meaningful integral (i.e. stable under all natural operations and limit procedures), which might be infinite. Now if you have a sign-changing measurable function, you can assign an integral to its positive part and to its negative part. If one of the two is finite, then you can associate a meaningful integral to that function too. If both are infinite, there are ways to define an integral in some cases, but they are much less meaningful and stable; some natural operations become impossible to define or unstable in general. I find this quite a natural approach: in most theorems you can replace the assumption 'integrable' with 'whose integral is defined'. Also, it is pretty intuitive that if both areas above and below the $x$-axis are infinite, there is little point in defining the signed area of the whole region.

Consider for instance the Fubini theorem (for $L^1$ functions) and its counterpart for positive functions, which we call the Tonelli theorem here in Italy. You can actually merge the two results and just say: if the integral of a measurable $f(x,y)$ is defined, then both iterated integrals are defined and give the same result as the double integral.
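
In symbols, a sketch of the merged statement, under assumptions not spelled out above: $\mu$ and $\nu$ are $\sigma$-finite measures on $X$ and $Y$, and $f$ is measurable for the product $\sigma$-algebra. Say that the integral of $f$ is defined when $\min\big(\int f^{+}, \int f^{-}\big) < \infty$, and then set $\int f = \int f^{+} - \int f^{-}$. If the integral of $f$ against $\mu\otimes\nu$ is defined, then

$$\int_{X\times Y} f\,d(\mu\otimes\nu) = \int_X \Big( \int_Y f(x,y)\,d\nu(y) \Big)\,d\mu(x) = \int_Y \Big( \int_X f(x,y)\,d\mu(x) \Big)\,d\nu(y),$$

where each inner integral is defined for almost every value of the outer variable. Taking $f \ge 0$ recovers the Tonelli theorem, and taking $f \in L^1$ recovers the Fubini theorem.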