## Some thoughts on Lebesgue integration

§1 Riemann’s idea. Being the “inverse operation” to differentiation, integration holds a fundamental place in calculus. It is taught, not only to the mathematics major, but to students of disciplines ranging all the way from business to engineering, and usually rears its head in the first or second course of the traditional calculus sequence.

The definition of the integral employed by almost all non-mathematicians, whether they are conscious of it or not (or even conscious of there being a “definition” of the integral), is of course the intuitive Riemann integral: a limiting process in which we mercilessly butcher the function’s domain (which in this case, one might note, is already required to be an interval, a very special kind of set!) into subintervals, draw rectangles “approximating” the area under the graph, and consider their total area as the number (and hence density) of the rectangles increases without bound. This method of “finding” (or at least approximating) the area under a curve is quite likely the first that a student would come up with. It is naïve, but ironically turns out to be more than sufficient for almost anyone except mathematical analysts. “After all, most functions are continuous anyway, usually even $\mathcal{C}^\infty$!” … right?
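As a rough illustration of the rectangle picture, here is a minimal Python sketch of a left-endpoint Riemann sum. The function name `riemann_sum`, the equal-width partition, and the test function $x^2$ on $[0,1]$ are choices made for this example only; the Riemann integral itself allows arbitrary partitions and sample points.

```python
from typing import Callable


def riemann_sum(f: Callable[[float], float], a: float, b: float, n: int) -> float:
    """Left-endpoint Riemann sum of f over [a, b] using n equal subintervals."""
    width = (b - a) / n
    # One rectangle per subinterval: height f(left endpoint), base `width`.
    return sum(f(a + i * width) * width for i in range(n))


def square(x: float) -> float:
    return x * x


# Finer and finer partitions of [0, 1]; the sums approach 1/3 from below,
# because x^2 is increasing and left endpoints underestimate it.
approximations = [riemann_sum(square, 0.0, 1.0, n) for n in (10, 100, 1000)]
```

Note that everything here is organised around chopping up the domain $[a, b]$, which is exactly why the domain has to be an interval in the first place.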

§2 Shortcomings; Lebesgue’s idea. This belief may well turn out to be true in many areas where integration is used. However, for more advanced applications, and especially for us theoretical mathematicians, the Riemann integral kind of just sucks in some fundamental ways: certain Riemann integrals unnaturally fail to exist, even though the integrals of a sequence of approximating functions give a strong impression that the integral should exist, and a pretty good idea of what its value should be. For instance, the indicator function of $\mathbb{Q} \cap [0,1]$ is a pointwise limit of indicator functions of finite sets, each of which has Riemann integral $0$, yet it is not itself Riemann integrable.

Enter Henri Lebesgue. At the turn of the 20th century, Lebesgue brought to the table a more subtle and refined notion of integration, which, being based on measure theory, not only exhibited dramatically more desirable limiting behaviour, but also later found itself vastly generalized, thereby enabling us to think about integration in very abstract spaces, far beyond the horizons of $\mathbb{R}$ or even $\mathbb{R}^n$. Of course, wherever the Riemann integral “works”, so too does the more robust Lebesgue integral — so we say that the latter extends the former. After all, an integral would probably not be well-received if it clashed too forcefully with our early Riemannian indoctrination.

§3 Outline of Lebesgue integration. The following is an outline of the general strategy used in defining the Lebesgue integral. I will not, however, go into detail about Lebesgue measure on the real line, a topic which is itself worthy of detailed study. The set of (Lebesgue-)measurable subsets of $\mathbb{R}$ will be denoted $\mathcal{L}(\mathbb{R})$. The Lebesgue measure is denoted $\lambda$. Throughout, $A$ is assumed to be some measurable set. Be warned that the notion of measurability defined for functions below is different from the one for sets; it does, however, have a nice, natural connection through indicator functions $\chi_S$ (also called characteristic functions).

We say $f : \mathbb{R} \to \mathbb{R}$ is measurable if $f^{-1}((\alpha, \infty)) \in \mathcal{L}(\mathbb{R})$ for all $\alpha \in \mathbb{R}$. Here $f^{-1}(S)$ refers to the pullback of $S$ under $f$.

It turns out that equivalently, we could require that $f$ pull back Borel sets to measurable sets, that is, $f^{-1}(B) \in \mathcal{L}(\mathbb{R})$ for all $B \in \mathcal{B}(\mathbb{R})$. If the domain of $f$ is some $A \in \mathcal{L}(\mathbb{R})$ rather than all of $\mathbb{R}$ itself, we say $f$ is measurable if the function $\tilde{f} : \mathbb{R} \to \mathbb{R}$ obtained by “zeroing $f$ outside its domain” is measurable by the original definition.

Finally, for functions $f$ which take values in the extended reals $\overline{\mathbb{R}} = \mathbb{R} \cup \{ +\infty, -\infty \}$, we call $f$ measurable if $f$ pulls Borel subsets of $\mathbb{R}$ back to measurable sets, and if both $f^{-1}(\{ +\infty \})$ and $f^{-1}(\{ -\infty \})$ lie in $\mathcal{L}(\mathbb{R})$.

We say $f : A \to \mathbb{R}$ is simple if $f$ has finite image, that is, $f(A) = \{ a_1, \ldots, a_n \}$ with the $a_i$ distinct. It is convenient, for such functions, to define $E_i = f^{-1}(\{ a_i \})$ (so that the $E_i$ are pairwise disjoint and cover $A$), and then write

$f = \displaystyle\sum\limits_{i=1}^n a_i \chi_{E_i}$

where $\chi_S$ is the indicator function of the set $S$. So every simple function is just a linear combination of indicator functions.

Since $\chi_S$ is a measurable function if and only if $S$ is a measurable set, we observe that a simple function is measurable if and only if all its corresponding $E_i$ are measurable sets.

Next we define $\mathcal{S}(A)$ to be the set of all measurable, simple $\varphi : A \to \mathbb{R}$. We let $\mathcal{S}^+(A)$ be the subset of such functions which are also non-negative.

Now with these simple functions in hand, we can define the very first hint of an integral: their proto-integral. The proto-integral of $\varphi \in \mathcal{S}^+(A)$ over $A$, where $\varphi = \sum_{i=1}^n a_i \chi_{E_i}$ as above, is defined by

$I_A(\varphi) := \displaystyle\sum\limits_{i=1}^n a_i \lambda(E_i)$.

In other words, we define $I_A(\chi_S) = \lambda(S)$ for all measurable $S \subseteq A$, and extend the definition of $I_A$ to support any $\varphi \in \mathcal{S}^+(A)$ by linearity.
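To make the proto-integral concrete, here is a minimal Python sketch for the special case where each $E_i$ is a finite union of pairwise disjoint intervals, so that $\lambda(E_i)$ is just a sum of lengths. The names `measure` and `proto_integral`, the representation of $\varphi$ as a list of `(a_i, intervals)` pairs, and the convention $0 \cdot \infty = 0$ are assumptions of this example (the last one is the usual measure-theoretic convention, but it is not stated above).

```python
import math
from typing import List, Tuple

Interval = Tuple[float, float]        # (left, right) endpoints with left <= right
Piece = Tuple[float, List[Interval]]  # (value a_i, disjoint intervals making up E_i)


def measure(intervals: List[Interval]) -> float:
    """Lebesgue measure of a finite union of pairwise disjoint intervals."""
    return sum(right - left for left, right in intervals)


def proto_integral(pieces: List[Piece]) -> float:
    """I_A(phi) = sum of a_i * lambda(E_i) for a non-negative simple phi."""
    total = 0.0
    for value, intervals in pieces:
        if value < 0:
            raise ValueError("the proto-integral is only defined for phi >= 0")
        if value == 0:
            continue  # assumed convention: 0 * infinity = 0
        total += value * measure(intervals)
    return total


# phi = 2 on [0, 1), 5 on [1, 3) and on [4, 4.5), and 0 elsewhere.
phi = [(2.0, [(0.0, 1.0)]), (5.0, [(1.0, 3.0), (4.0, 4.5)])]
result = proto_integral(phi)  # 2 * 1 + 5 * (2 + 0.5) = 14.5

# A piece with value 0 contributes nothing, even on a set of infinite measure.
zero_on_half_line = proto_integral([(0.0, [(0.0, math.inf)])])
```

Endpoints are irrelevant here, since single points have measure zero; that is why the sketch does not track whether intervals are open or closed.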

OK, we’re making progress; we now have a proto-integral for simple measurable non-negative $\varphi : A \to \mathbb{R}$. The idea now is that we want to extend this to an integral for any non-negative measurable $\overline{\mathbb{R}}$-valued function. How do we do this? Well, we just approximate from below:

For a measurable function $f : A \to \overline{\mathbb{R}}$ satisfying $f \geq 0$, we define the integral of $f$ by

$\displaystyle\int\limits_A f := \sup \{ I_A(\varphi) : \varphi \in \mathcal{S}^+(A), \varphi \leq f \}$.
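One way to get a feel for this supremum is to compute the proto-integrals of one particular increasing sequence of simple functions below $f$. The sketch below uses the standard dyadic staircase $\varphi_n = \min\{ n, 2^{-n} \lfloor 2^n f \rfloor \}$, which is a choice made for this example rather than part of the definition. Since $\varphi_n = \sum_{k=1}^{n 2^n} 2^{-n} \chi_{\{ f \geq k 2^{-n} \}}$, its proto-integral only needs the measures of the sets $\{ f \geq t \}$. The helper names and the test function $f(x) = x^2$ on $A = [0,1]$ are hypothetical.

```python
import math
from typing import Callable


def lower_approximation(level_measure: Callable[[float], float], n: int) -> float:
    """Proto-integral of the staircase phi_n = min(n, floor(2^n f) / 2^n).

    level_measure(t) must return lambda({x in A : f(x) >= t}) for t > 0.
    """
    step = 2.0 ** -n
    # phi_n is a sum of n * 2^n indicator functions, each scaled by `step`.
    return sum(step * level_measure(k * step) for k in range(1, n * 2 ** n + 1))


def level_measure_square(t: float) -> float:
    """lambda({x in [0, 1] : x^2 >= t}) = length of [sqrt(t), 1], or 0 if t > 1."""
    return max(0.0, 1.0 - math.sqrt(t))


# Proto-integrals of phi_1, ..., phi_12: each is a lower bound for the
# integral of x^2 over [0, 1], and the bounds improve as n grows.
values = [lower_approximation(level_measure_square, n) for n in range(1, 13)]
```

In contrast with the Riemann picture, the slicing happens in the range of $f$: the domain only enters through the measures of the level sets, so nothing requires $A$ to be an interval.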

At this point we are able to integrate a fairly large class of functions, and prove many of the desirable properties of the integral. However, we are not done, since we are still only dealing with non-negative creatures!

We first define $(f^+)(x) = \max\{ 0, f(x) \}$ and $(f^-)(x) = \max\{ 0, -f(x) \}$. Observe $f = f^+ - f^-$ and $|f| = f^+ + f^-$.

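For a measurable function $f : A \to \overline{\mathbb{R}}$ we call it Lebesgue integrable on $A$ if

$\displaystyle\int_A f^+ < \infty \quad \text{and} \quad \displaystyle\int_A f^- < \infty$

(equivalently, if $\int_A |f| < \infty$), and in this case, we call the difference $\int_A f := \int_A f^+ - \int_A f^-$ the Lebesgue integral of $f$ on $A$. Both parts must be finite: requiring only that the difference be less than $\infty$ would admit the undefined expression $\infty - \infty$, as well as the value $-\infty$.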
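Here is a small Python sketch of the positive/negative split for a signed simple function, given directly by its values $a_i$ and the measures $\lambda(E_i)$. It only forms the difference when both parts are finite, which sidesteps $\infty - \infty$. The names `integral_of_part` and `lebesgue_integral`, and the use of `None` to signal "not integrable", are assumptions of this example.

```python
import math
from typing import List, Optional, Tuple

Piece = Tuple[float, float]  # (value a_i, measure lambda(E_i)), E_i pairwise disjoint


def integral_of_part(pieces: List[Piece], sign: int) -> float:
    """Integral of f^+ (sign = +1) or f^- (sign = -1) for a simple function f."""
    total = 0.0
    for value, size in pieces:
        part = max(0.0, sign * value)  # f^+ = max(0, f), f^- = max(0, -f)
        if part > 0:                   # skip zeros so that 0 * infinity counts as 0
            total += part * size
    return total


def lebesgue_integral(pieces: List[Piece]) -> Optional[float]:
    """Integral of f^+ minus integral of f^-, or None if either is infinite."""
    positive = integral_of_part(pieces, +1)
    negative = integral_of_part(pieces, -1)
    if math.isinf(positive) or math.isinf(negative):
        return None
    return positive - negative


# f = 3 on a set of measure 2 and -1 on a set of measure 4: integral 6 - 4 = 2.
f = [(3.0, 2.0), (-1.0, 4.0)]

# g = 1 and -1 on two sets of infinite measure: both parts are infinite.
g = [(1.0, math.inf), (-1.0, math.inf)]
```

With these inputs, `lebesgue_integral(f)` returns `2.0`, while `lebesgue_integral(g)` returns `None` rather than attempting $\infty - \infty$.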

§4 Some questions. Why did we define the non-negative Lebesgue integral as a limit “from below” of proto-integrals? Why not a limit from above? Also, note that the Lebesgue integral does not depend at all on the orientation of an interval (in the special case where $A$ is chosen to be an interval, that is), whereas the Riemann integral does. Further, does Lebesgue’s monotone convergence theorem still hold for a monotonically decreasing sequence of functions, rather than an increasing one? Why is it that the functions considered in Lebesgue integration theory are measurable functions $f : (\mathbb{R}, \mathcal{L}(\mathbb{R})) \to (\mathbb{R}, \mathcal{B}(\mathbb{R}))$? What’s an example of a measurable function which pulls back some non-Borel measurable set to a non-measurable set (recall every Borel set is measurable)?