# 2.8: Existence and Uniqueness


Suppose that \( S \) is a set and \( \mathscr{S} \) a \( \sigma \)-algebra of subsets of \( S \), so that \( (S, \mathscr{S}) \) is a measurable space. In many cases, it is impossible to define a positive measure \(\mu\) on \(\mathscr{S}\) explicitly, by giving a formula for computing \(\mu(A)\) for each \(A \in \mathscr{S}\). Rather, we often know how the measure \(\mu\) should work on some class of sets \(\mathscr{B}\) that generates \( \mathscr{S} \). We would then like to know that \(\mu\) can be extended to a positive measure on \(\mathscr{S}\), and that this extension is unique. The purpose of this section is to discuss the basic results on this topic. To understand this section you will need to review the sections on Measure Theory and Special Set Structures in the chapter on Foundations, and the section on Measure Spaces in this chapter. If you are not interested in questions of existence and uniqueness of positive measures, you can safely skip this section.

## Basic Theory

### Positive Measures on Algebras

Suppose first that \( \mathscr A \) is an algebra of subsets of \(S\). Recall that this means that \( \mathscr A \) is a collection of subsets that contains \(S\) and is closed under complements and finite unions (and hence also finite intersections). Here is our first definition:

A positive measure on \(\mathscr A\) is a function \( \mu: \mathscr A \to [0, \infty] \) that satisfies the following properties:

- \( \mu(\emptyset) = 0 \)
- If \( \{A_i: i \in I\} \) is a countable, disjoint collection of sets in \( \mathscr A \) and if \( \bigcup_{i \in I} A_i \in \mathscr A \) then \[ \mu\left(\bigcup_{i \in I} A_i\right) = \sum_{i \in I} \mu(A_i) \]

Clearly the definition of a positive measure on an algebra is very similar to the definition for a \( \sigma \)-algebra. If the collection of sets in (b) is finite, then \( \bigcup_{i \in I} A_i \) must be in the algebra \( \mathscr A \). Thus, \( \mu \) is finitely additive. If the collection is countably infinite, then there is no guarantee that the union is in \( \mathscr A \). If it is, however, then \( \mu \) must be additive over this collection. Given the similarity, it is not surprising that \( \mu \) shares many of the basic properties of a positive measure on a \( \sigma \)-algebra, with proofs that are almost identical.
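To make the definition concrete, here is a minimal sketch on a finite set, where every disjoint collection is finite and so only finite additivity is at stake. The set `S`, the partition `blocks`, the `weights`, and the helper names are all assumptions chosen for illustration; they do not come from the text.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset({1, 2, 3, 4})
# Hypothetical partition of S; the algebra consists of all unions of blocks.
blocks = [frozenset({1, 2}), frozenset({3}), frozenset({4})]
# Hypothetical weights assigned to the blocks.
weights = {
    blocks[0]: Fraction(1, 2),
    blocks[1]: Fraction(1, 3),
    blocks[2]: Fraction(1, 6),
}

def generated_algebra(blocks):
    """Return all unions of sub-collections of the partition blocks."""
    algebra = set()
    for r in range(len(blocks) + 1):
        for chosen in combinations(blocks, r):
            algebra.add(frozenset().union(*chosen))
    return algebra

def mu(A):
    """Measure of a set A in the algebra: total weight of the blocks inside A."""
    return sum((w for b, w in weights.items() if b <= A), Fraction(0))

algebra = generated_algebra(blocks)

# Additivity over every disjoint pair of sets in the algebra.
additive = all(
    mu(A | B) == mu(A) + mu(B)
    for A in algebra
    for B in algebra
    if not (A & B)
)
print(len(algebra), mu(frozenset()), mu(S), additive)
```

Because the sample algebra is finite, the countable additivity condition reduces to the finite case here; the interesting behavior in the general theory comes from countably infinite collections whose union happens to lie in the algebra.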

If \( A, \, B \in \mathscr A \), then \( \mu(B) = \mu(A \cap B) + \mu(B \setminus A) \).

## Proof

Note that \( B = (A \cap B) \cup (B \setminus A) \), and the sets in the union are in the algebra \( \mathscr A \) and are disjoint.

If \( A, \, B \in \mathscr A \) and \( A \subseteq B \) then

- \( \mu(B) = \mu(A) + \mu(B \setminus A) \)
- \( \mu(A) \le \mu(B) \)

## Proof

Part (a) follows from the previous theorem, since \( A \cap B = A \). Part (b) follows from part (a).

Thus \( \mu \) is increasing, relative to the subset partial order \( \subseteq \) on \( \mathscr A \) and the ordinary order \( \le \) on \( [0, \infty] \). Note also that if \( A, \, B \in \mathscr A \) and \( \mu(B) \lt \infty \) then \( \mu(B \setminus A) = \mu(B) - \mu(A \cap B) \). In the special case that \( A \subseteq B \), this becomes \( \mu(B \setminus A) = \mu(B) - \mu(A) \). If \( \mu(S) \lt \infty \) then \( \mu(A^c) = \mu(S) - \mu(A) \). These are the familiar difference and complement rules.

The following result is the subadditive property for a positive measure \( \mu \) on an algebra \( \mathscr A \).

Suppose that \( \{A_i: i \in I \}\) is a countable collection of sets in \( \mathscr A \) and that \( \bigcup_{i \in I} A_i \in \mathscr A \). Then \[ \mu\left(\bigcup_{i \in I} A_i \right) \le \sum_{i \in I} \mu(A_i) \]

## Proof

The proof is just like before. Assume that \( I = \N_+ \). Let \( B_1 = A_1 \) and \( B_i = A_i \setminus (A_1 \cup \ldots \cup A_{i-1}) \) for \( i \in \{2, 3, \ldots\} \). Then \( \{B_i: i \in I\} \) is a disjoint collection of sets in \( \mathscr A \) with the same union as \( \{A_i: i \in I\} \). Also \( B_i \subseteq A_i \) for each \( i \) so \( \mu(B_i) \le \mu(A_i) \). Hence if the union is in \( \mathscr A \) then \[ \mu\left(\bigcup_{i \in I} A_i \right) = \mu\left(\bigcup_{i \in I} B_i \right) = \sum_{i \in I} \mu(B_i) \le \sum_{i \in I} \mu(A_i) \]
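The disjointification step in the proof can be sketched in a few lines. This is only an illustration under the assumption that the sets are finite and that \( \mu \) is counting measure (so `len` plays the role of \( \mu \)); the function name `disjointify` is hypothetical.

```python
def disjointify(sets):
    """Return B_1, B_2, ... where B_1 = A_1 and B_i is A_i with all earlier sets removed."""
    seen = set()
    result = []
    for A in sets:
        result.append(set(A) - seen)
        seen |= set(A)
    return result

A = [{1, 2, 3}, {3, 4}, {1, 4, 5}, {6}]
B = disjointify(A)

union_A = set().union(*A)
union_B = set().union(*B)

# With counting measure, mu is just the number of elements.
mu_union = len(union_A)
sum_mu_B = sum(len(b) for b in B)
sum_mu_A = sum(len(a) for a in A)
print(B, mu_union, sum_mu_B, sum_mu_A)
```

The sets in `B` are disjoint with the same union as the sets in `A`, so the measure of the union equals the sum over `B`, which is at most the sum over `A`.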

For a finite union of sets with finite measure, the inclusion-exclusion formula holds, and the proof is just like the one for a probability measure.

Suppose that \(\{A_i: i \in I\}\) is a finite collection of sets in \( \mathscr A \) where \(\#(I) = n \in \N_+\), and that \( \mu(A_i) \lt \infty \) for \( i \in I \). Then \[\mu \left( \bigcup_{i \in I} A_i \right) = \sum_{k = 1}^n (-1)^{k - 1} \sum_{J \subseteq I, \; \#(J) = k} \mu \left( \bigcap_{j \in J} A_j \right)\]
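As a quick check of the formula, the following sketch evaluates the right-hand side for three finite sets, again assuming counting measure so that `len` stands in for \( \mu \). The function name and the sample sets are assumptions made for illustration.

```python
from itertools import combinations

def inclusion_exclusion(sets, mu):
    """Evaluate the inclusion-exclusion sum for a finite list of sets."""
    n = len(sets)
    total = 0
    for k in range(1, n + 1):
        sign = (-1) ** (k - 1)
        for J in combinations(sets, k):
            # Intersection of the k chosen sets.
            total += sign * mu(set.intersection(*J))
    return total

sets = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]
lhs = len(set().union(*sets))          # measure of the union
rhs = inclusion_exclusion(sets, len)   # alternating sum over intersections
print(lhs, rhs)
```

Here the single sets contribute \( 4 + 3 + 4 = 11 \), the pairwise intersections \( 2 + 1 + 2 = 5 \), and the triple intersection \( 1 \), so the sum is \( 11 - 5 + 1 = 7 \), the size of the union.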

The continuity theorems hold for a positive measure \( \mu \) on an algebra \( \mathscr A \), just as for a positive measure on a \( \sigma \)-algebra, assuming that the appropriate union and intersection are in the algebra. The proofs are just as before.

Suppose that \( (A_1, A_2, \ldots) \) is a sequence of sets in \( \mathscr A \).

- If the sequence is increasing, so that \( A_n \subseteq A_{n+1} \) for each \( n \in \N_+ \), and \( \bigcup_{i = 1}^\infty A_i \in \mathscr A \), then \( \mu\left(\bigcup_{i=1}^\infty A_i \right) = \lim_{n \to \infty} \mu(A_n) \).
- If the sequence is decreasing, so that \( A_{n+1} \subseteq A_n \) for each \( n \in \N_+ \), and \( \mu(A_1) \lt \infty \) and \( \bigcap_{i=1}^\infty A_i \in \mathscr A \), then \( \mu\left(\bigcap_{i=1}^\infty A_i \right) = \lim_{n \to \infty} \mu(A_n) \).

## Proof

- Note that if \( \mu(A_k) = \infty \) for some \( k \) then \( \mu(A_n) = \infty \) for \( n \ge k \) and \( \mu\left(\bigcup_{i=1}^\infty A_i \right) = \infty \) if this union is in \( \mathscr A \). Thus, suppose that \( \mu(A_i) \lt \infty \) for each \( i \). Let \( B_1 = A_1 \) and \( B_i = A_i \setminus A_{i-1} \) for \( i \in \{2, 3, \ldots\} \). Then \( (B_1, B_2, \ldots) \) is a disjoint sequence in \( \mathscr A \) with the same union as \( (A_1, A_2, \ldots) \). Also, \( \mu(B_1) = \mu(A_1) \) and \( \mu(B_i) = \mu(A_i) - \mu(A_{i-1}) \) for \( i \in \{2, 3, \ldots\} \). Hence if the union is in \( \mathscr A \), \[ \mu\left(\bigcup_{i=1}^\infty A_i \right) = \mu \left(\bigcup_{i=1}^\infty B_i \right) = \sum_{i=1}^\infty \mu(B_i) = \lim_{n \to \infty} \sum_{i=1}^n \mu(B_i) \] But \( \sum_{i=1}^n \mu(B_i) = \mu(A_1) + \sum_{i=2}^n [\mu(A_i) - \mu(A_{i-1})] = \mu(A_n) \).
- Note that \( A_1 \setminus A_n \in \mathscr A \) for each \( n \in \N_+ \) and this sequence is increasing. Moreover, \( \bigcup_{n=1}^\infty (A_1 \setminus A_n) = \left(\bigcap_{n=1}^\infty A_n \right)^c \cap A_1 \). Hence if \( \bigcap_{n=1}^\infty A_n \in \mathscr A \) then \( \bigcup_{n=1}^\infty (A_1 \setminus A_n) \in \mathscr A \). Thus using the continuity result for increasing sets, \begin{align} \mu \left(\bigcap_{i=1}^\infty A_i \right) & = \mu\left[A_1 \setminus \bigcup_{i=1}^\infty (A_1 \setminus A_i) \right] = \mu(A_1) - \mu\left[\bigcup_{i=1}^\infty (A_1 \setminus A_i)\right]\\ & = \mu(A_1) - \lim_{n \to \infty} \mu(A_1 \setminus A_n) = \mu(A_1) - \lim_{n \to \infty} [\mu(A_1) - \mu(A_n)] = \lim_{n \to \infty} \mu(A_n) \end{align}
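The telescoping argument in part (a) can be checked numerically. The sketch below assumes a specific example that is not in the text: \( \mu \) is length, and \( A_n = (0, 1 - 1/(n+1)] \), an increasing sequence of intervals whose union \( (0, 1) \) is again an interval with length 1. The function names are hypothetical.

```python
from fractions import Fraction

def mu_A(n):
    """Length of the interval A_n = (0, 1 - 1/(n + 1)]."""
    return 1 - Fraction(1, n + 1)

def mu_B(i):
    """Length of B_1 = A_1, and of B_i = A_i minus A_{i-1} for i >= 2."""
    return mu_A(1) if i == 1 else mu_A(i) - mu_A(i - 1)

def partial_sum(n):
    """Sum of the lengths of B_1, ..., B_n, which telescopes to the length of A_n."""
    return sum(mu_B(i) for i in range(1, n + 1))

print(partial_sum(50), mu_A(50))
# The gap between the length of the union (which is 1) and mu(A_n) is 1/(n + 1).
print(1 - mu_A(10**6))
```

The partial sums agree exactly with \( \mu(A_n) \), and \( \mu(A_n) \to 1 \), the length of the union, as the theorem asserts.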

Recall that if the sequence \( (A_1, A_2, \ldots) \) is increasing, then we define \( \lim_{n \to \infty} A_n = \bigcup_{n=1}^\infty A_n \), and if the sequence is decreasing then we define \( \lim_{n \to \infty} A_n = \bigcap_{n=1}^\infty A_n \). Thus the conclusion of both parts of the continuity theorem is \[ \mu\left(\lim_{n \to \infty} A_n\right) = \lim_{n \to \infty} \mu(A_n) \] Finite additivity and continuity for increasing sets imply countable additivity:

If \( \mu: \mathscr A \to [0, \infty] \) satisfies the properties below then \( \mu \) is a positive measure on \( \mathscr A \).

- \( \mu(\emptyset) = 0 \)
- \( \mu\left(\bigcup_{i \in I} A_i\right) = \sum_{i \in I} \mu(A_i) \) if \( \{A_i: i \in I\} \) is a finite disjoint collection of sets in \( \mathscr A \)
- \( \mu\left(\bigcup_{i=1}^\infty A_i \right) = \lim_{n \to \infty} \mu(A_n) \) if \( (A_1, A_2, \ldots) \) is an increasing sequence of sets in \( \mathscr A \) and \( \bigcup_{i=1}^\infty A_i \in \mathscr A \).

## Proof

All that is left to prove is additivity over a countably infinite collection of sets in \( \mathscr A \) when the union is also in \( \mathscr A \). Thus suppose that \(\{A_n: n \in \N_+\} \) is a disjoint collection of sets in \( \mathscr A \) with \( \bigcup_{n=1}^\infty A_n \in \mathscr A \). Let \( B_n = \bigcup_{i=1}^n A_i \) for \( n \in \N_+ \). Then \( B_n \in \mathscr A \), the sequence \( (B_1, B_2, \ldots) \) is increasing, and \( \bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n \). Hence using the finite additivity and the continuity property we have \[ \mu\left(\bigcup_{n = 1}^\infty A_n\right) = \mu\left(\bigcup_{n=1}^\infty B_n\right) = \lim_{n \to \infty} \mu(B_n) = \lim_{n \to \infty} \sum_{i=1}^n \mu(A_i) = \sum_{i=1}^\infty \mu(A_i) \]

Many of the basic theorems in measure theory require that the measure not be too far removed from being finite. This leads to the following definition, which is just like the one for a positive measure on a \( \sigma \)-algebra.

A measure \( \mu \) on an algebra \( \mathscr A \) of subsets of \( S \) is \( \sigma \)-finite if there exists a sequence of sets \( (A_1, A_2, \ldots) \) in \( \mathscr A \) such that \( \bigcup_{n=1}^\infty A_n = S \) and \( \mu(A_n) \lt \infty \) for each \( n \in \N_+ \). The sequence is called a \( \sigma \)-finite sequence for \( \mu \).

Suppose that \( \mu \) is a \( \sigma \)-finite measure on an algebra \( \mathscr A \) of subsets of \( S \).

- There exists an increasing \( \sigma \)-finite sequence.
- There exists a disjoint \( \sigma \)-finite sequence.

## Proof

We use the same tricks that we have used before. Suppose that \( (A_1, A_2, \ldots) \) is a \( \sigma \)-finite sequence for \( \mu \).

- Let \( B_n = \bigcup_{i = 1}^n A_i \). Then \( B_n \in \mathscr A \) for \( n \in \N_+ \) and this sequence is increasing. Moreover, \( \mu(B_n) \le \sum_{i=1}^n \mu(A_i) \lt \infty \) for \( n \in \N_+ \) and \( \bigcup_{n=1}^\infty B_n = \bigcup_{n=1}^\infty A_n = S \).
- Let \( C_1 = A_1 \) and let \( C_n = A_n \setminus \bigcup_{i=1}^{n-1} A_i \) for \( n \in \{2, 3, \ldots\} \). Then \( C_n \in \mathscr A \) for each \( n \in \N_+ \) and this sequence is disjoint. Moreover, \( C_n \subseteq A_n \) so \( \mu(C_n) \le \mu(A_n) \lt \infty \) and \( \bigcup_{n=1}^\infty C_n = \bigcup_{n=1}^\infty A_n = S \).

### Extension and Uniqueness Theorems

The fundamental theorem on measures states that a positive, \( \sigma \)-finite measure \( \mu \) on an algebra \( \mathscr A \) can be uniquely extended to \( \sigma(\mathscr A) \). The extension part is sometimes referred to as the Carathéodory extension theorem, and is named for the Greek mathematician Constantin Carathéodory.

If \( \mu \) is a positive, \( \sigma \)-finite measure on an algebra \(\mathscr A\), then \( \mu \) can be extended to a positive measure on \( \mathscr{S} = \sigma(\mathscr A) \).

## Proof

The proof is complicated, but here is a broad outline. First, for \( A \subseteq S \), we define a cover of \( A \) to be a countable collection \( \{A_i: i \in I\} \) of sets in \( \mathscr A \) such that \( A \subseteq \bigcup_{i \in I} A_i \). Next, we define a new set function \( \mu^* \), the outer measure, on all subsets of \( S \): \[ \mu^*(A) = \inf \left\{ \sum_{i \in I} \mu(A_i): \{A_i: i \in I\} \text{ is a cover of } A \right\}, \quad A \subseteq S \] Outer measure satisfies the following properties.

- \( \mu^*(A) \ge 0 \) for \( A \subseteq S \), so \( \mu^* \) is nonnegative.
- \( \mu^*(A) = \mu(A) \) for \( A \in \mathscr A \), so \( \mu^* \) extends \( \mu \).
- If \( A \subseteq B \) then \( \mu^*(A) \le \mu^*(B) \), so \( \mu^* \) is increasing.
- If \( A_i \subseteq S \) for each \( i \) in a countable index set \( I \) then \( \mu^*\left(\bigcup_{i \in I} A_i\right) \le \sum_{i \in I} \mu^*(A_i) \), so \( \mu^* \) is countably subadditive.

Next, \( A \subseteq S \) is said to be measurable if \[ \mu^*(B) = \mu^*(B \cap A) + \mu^*(B \setminus A), \quad B \subseteq S \] Thus, \( A \) is measurable if \( \mu^* \) is additive with respect to the partition of \( B \) induced by \( \{A, A^c\} \), for every \( B \subseteq S \). We let \( \mathscr{M} \) denote the collection of measurable subsets of \( S \). The proof is finished by showing that \( \mathscr A \subseteq \mathscr{M} \), \( \mathscr{M} \) is a \( \sigma \)-algebra of subsets of \( S \), and \( \mu^* \) is a positive measure on \( \mathscr{M} \). It follows that \( \sigma(\mathscr A) = \mathscr{S} \subseteq \mathscr{M} \) and hence \( \mu^* \) is a measure on \( \mathscr{S} \) that extends \( \mu \).
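The outline above can be traced by brute force on a small finite example. The following sketch is an illustration only: the four-element set `S`, the algebra with atoms \( \{1, 2\} \) and \( \{3, 4\} \), and the values of `mu` are assumptions. Since this algebra is finite, searching finite sub-collections is enough to compute the infimum over covers.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset({1, 2, 3, 4})
# Hypothetical algebra with atoms {1, 2} and {3, 4}, and an assumed measure on it.
mu = {
    frozenset(): Fraction(0),
    frozenset({1, 2}): Fraction(1),
    frozenset({3, 4}): Fraction(2),
    S: Fraction(3),
}

def subsets(X):
    """All subsets of the finite set X."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def outer_measure(A):
    """Smallest total measure of a cover of A by sets in the algebra."""
    members = list(mu)
    best = None
    for r in range(1, len(members) + 1):
        for cover in combinations(members, r):
            if A <= frozenset().union(*cover):
                cost = sum(mu[C] for C in cover)
                best = cost if best is None else min(best, cost)
    return best

def is_measurable(A):
    """Check the splitting condition for A against every test set B."""
    return all(
        outer_measure(B) == outer_measure(B & A) + outer_measure(B - A)
        for B in subsets(S)
    )

measurable = {A for A in subsets(S) if is_measurable(A)}
print(sorted(sorted(A) for A in measurable))
```

In this example the outer measure agrees with `mu` on the algebra, and the measurable sets turn out to be exactly the four sets in the algebra. A set such as \( \{1\} \) fails the condition with \( B = \{1, 2\} \), because \( \mu^*(\{1\}) + \mu^*(\{2\}) = 2 \) while \( \mu^*(\{1, 2\}) = 1 \).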

Our next goal is the basic uniqueness result, which serves as the complement to the basic extension result. But first we need another variation of the term *\( \sigma \)-finite*.

Suppose that \( \mu \) is a measure on a \( \sigma \)-algebra \( \mathscr{S} \) of subsets of \( S \) and \( \mathscr{B} \subseteq \mathscr{S} \). Then \( \mu \) is \( \sigma \)-finite on \( \mathscr{B} \) if there exists a countable collection \( \{B_i: i \in I\} \subseteq \mathscr{B} \) such that \( \mu(B_i) \lt \infty \) for \( i \in I \) and \( \bigcup_{i \in I} B_i = S \).

The next result is the uniqueness theorem. The proof, like others that we have seen, uses Dynkin's \( \pi \)-\( \lambda \) theorem, named for Eugene Dynkin.

Suppose that \( \mathscr{B} \) is a \( \pi \)-system and that \( \mathscr{S} = \sigma(\mathscr{B}) \). If \( \mu_1 \) and \( \mu_2 \) are positive measures on \( \mathscr{S} \) and are \( \sigma \)-finite on \( \mathscr{B} \), and if \( \mu_1(A) = \mu_2(A) \) for all \( A \in \mathscr{B} \), then \( \mu_1(A) = \mu_2(A) \) for all \( A \in \mathscr{S} \).

## Proof

Suppose that \( B \in \mathscr{B} \) and that \( \mu_1(B) = \mu_2(B) \lt \infty \). Let \( \mathscr{L}_B = \{A \in \mathscr{S}: \mu_1(A \cap B) = \mu_2(A \cap B) \} \). Then \( S \in \mathscr{L}_B \) since \( \mu_1(B) = \mu_2(B) \). If \( A \in \mathscr{L}_B \) then \( \mu_1(A \cap B) = \mu_2(A \cap B) \) so \( \mu_1(A^c \cap B) = \mu_1(B) - \mu_1(A \cap B) = \mu_2(B) - \mu_2(A \cap B) = \mu_2(A^c \cap B) \) and hence \( A^c \in \mathscr{L}_B \). Finally, suppose that \( \{A_j: j \in J\} \) is a countable, disjoint collection of sets in \( \mathscr{L}_B \). Then \( \mu_1(A_j \cap B) = \mu_2(A_j \cap B) \) for each \( j \in J \) and hence \begin{align} \mu_1\left[ \left(\bigcup_{j \in J} A_j \right) \cap B \right] & = \mu_1 \left(\bigcup_{j \in J} (A_j \cap B) \right) = \sum_{j \in J} \mu_1(A_j \cap B) \\ & = \sum_{j \in J} \mu_2(A_j \cap B) = \mu_2\left(\bigcup_{j \in J} (A_j \cap B) \right) = \mu_2 \left[ \left(\bigcup_{j \in J} A_j \right) \cap B \right] \end{align} Therefore \( \bigcup_{j \in J} A_j \in \mathscr{L}_B \), and so \( \mathscr{L}_B \) is a \( \lambda \)-system. If \( A \in \mathscr{B} \) then \( A \cap B \in \mathscr{B} \) since \( \mathscr{B} \) is a \( \pi \)-system, so \( \mu_1(A \cap B) = \mu_2(A \cap B) \) by assumption. Thus \( \mathscr{B} \subseteq \mathscr{L}_B \) and therefore by the \( \pi \)-\( \lambda \) theorem, \( \mathscr{S} = \sigma(\mathscr{B}) \subseteq \mathscr{L}_B \).

Next, by assumption there exists \( B_i \in \mathscr{B} \) with \( \mu_1(B_i) = \mu_2(B_i) \lt \infty \) for each \( i \in \N_+ \) and \( S = \bigcup_{i=1}^\infty B_i \). If \( A \in \mathscr{S} \) then each set \( A \cap B_i \) has finite measure, so the inclusion-exclusion rule can be applied to \[ \mu_k\left[\left(\bigcup_{i=1}^n B_i\right) \cap A \right] = \mu_k\left[\bigcup_{i=1}^n (A \cap B_i) \right] \] where \( k \in \{1, 2\} \) and \( n \in \N_+ \). The inclusion-exclusion formula only has terms of the form \( \mu_k \left[ \bigcap_{j \in J} (A \cap B_j) \right] = \mu_k \left[ A \cap \left(\bigcap_{j \in J} B_j\right) \right] \) where \( J \subseteq \{1, 2, \ldots, n\} \) is nonempty. But \( \bigcap_{j \in J} B_j \in \mathscr{B} \) since \( \mathscr{B} \) is a \( \pi \)-system, so by the previous paragraph, \( \mu_1 \left[ \bigcap_{j \in J} (A \cap B_j) \right] = \mu_2 \left[ \bigcap_{j \in J} (A \cap B_j) \right] \). It then follows that for each \( n \in \N_+ \) \[ \mu_1\left[\left(\bigcup_{i=1}^n B_i\right) \cap A \right] = \mu_2\left[\left(\bigcup_{i=1}^n B_i\right) \cap A \right] \] Finally, letting \( n \to \infty \) and using the continuity theorem for increasing sets gives \( \mu_1(A) = \mu_2(A) \).

An algebra \( \mathscr A \) of subsets of \( S \) is trivially a \( \pi \)-system. Hence, if \( \mu_1 \) and \( \mu_2 \) are positive measures on \( \mathscr{S} = \sigma(\mathscr A) \) and are \( \sigma \)-finite on \( \mathscr A \), and if \( \mu_1(A) = \mu_2(A) \) for \( A \in \mathscr A \), then \( \mu_1(A) = \mu_2(A) \) for \( A \in \mathscr{S} \). This completes the second part of the fundamental theorem.
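The \( \pi \)-system assumption in the uniqueness theorem cannot simply be dropped. The following sketch, with a made-up four-point space and two made-up probability measures, shows two measures that agree on a generating collection that is not closed under intersection, yet differ on the generated \( \sigma \)-algebra. All names and values are assumptions for illustration.

```python
from fractions import Fraction

S = frozenset({1, 2, 3, 4})
# Hypothetical generating collection: it generates all subsets of S,
# but {1, 2} and {2, 3} intersect in {2}, which is not in the collection.
B = [S, frozenset({1, 2}), frozenset({2, 3})]

# Two hypothetical probability measures, given by their point masses.
mu1 = {x: Fraction(1, 4) for x in S}
mu2 = {1: Fraction(1, 2), 2: Fraction(0), 3: Fraction(1, 2), 4: Fraction(0)}

def measure(point_mass, A):
    """Measure of A as the sum of the point masses of its elements."""
    return sum((point_mass[x] for x in A), Fraction(0))

def is_pi_system(collection):
    """True if the collection is closed under pairwise intersection."""
    return all((A1 & A2) in collection for A1 in collection for A2 in collection)

agree_on_B = all(measure(mu1, A) == measure(mu2, A) for A in B)
differ_on_2 = measure(mu1, frozenset({2})) != measure(mu2, frozenset({2}))
print(is_pi_system(B), agree_on_B, differ_on_2)
```

The two measures agree on every set in `B` (and are finite, since \( S \) is in `B`), but they disagree on \( \{2\} = \{1, 2\} \cap \{2, 3\} \). Adding \( \{2\} \) to `B` produces a \( \pi \)-system on which the measures no longer agree, consistent with the theorem.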

Of course, the results of this subsection hold for probability measures. Formally, a probability measure \( \P \) on an algebra \( \mathscr A \) of subsets of \( S \) is a positive measure on \( \mathscr A \) with the additional requirement that \( \P(S) = 1 \). Probability measures are trivially \( \sigma \)-finite, so a probability measure \( \P \) on an algebra \( \mathscr A \) can be uniquely extended to \( \mathscr{S} = \sigma(\mathscr A) \).

However, usually we start with a collection that is more primitive than an algebra. The next result combines the definition with the main theorem associated with the definition. For a proof see the section on Special Set Structures in the chapter on Foundations.

Suppose that \( \mathscr{B} \) is a nonempty collection of subsets of \( S \) and let \[ \mathscr A = \left\{\bigcup_{i \in I} B_i: \{B_i: i \in I\} \text{ is a finite, disjoint collection of sets in } \mathscr{B}\right\} \] If the following conditions are satisfied, then \( \mathscr{B} \) is a semi-algebra of subsets of \( S \), and then \( \mathscr A \) is the algebra generated by \(\mathscr{B}\).

- If \( B_1, \, B_2 \in \mathscr{B} \) then \( B_1 \cap B_2 \in \mathscr{B} \).
- If \( B \in \mathscr{B} \) then \( B^c \in \mathscr A \).
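For a finite set, the two conditions can be checked by brute force. The sketch below is a hypothetical helper, not part of the text. It treats the empty set as the empty union (an assumption about the convention) and uses the "intervals" of the ordered set \( \{1, 2, 3\} \) as a discrete stand-in for intervals of real numbers.

```python
from itertools import combinations

def is_disjoint_union_of(target, collection):
    """True if target is a finite, disjoint union of sets in the collection."""
    if not target:
        return True  # assumed convention: the empty union is the empty set
    members = [B for B in collection if B and B <= target]
    for r in range(1, len(members) + 1):
        for chosen in combinations(members, r):
            same_union = frozenset().union(*chosen) == target
            # For finite sets, equal total size plus equal union means disjoint.
            if same_union and sum(len(B) for B in chosen) == len(target):
                return True
    return False

def is_semi_algebra(collection, universe):
    """Check closure under intersection and the complement condition."""
    closed = all((B1 & B2) in collection for B1 in collection for B2 in collection)
    complements = all(
        is_disjoint_union_of(universe - B, collection) for B in collection
    )
    return bool(collection) and closed and complements

universe = frozenset({1, 2, 3})
# All sets of consecutive integers in {1, 2, 3}, including the empty set.
intervals = {frozenset(range(i, j)) for i in range(1, 5) for j in range(i, 5)}
not_closed = {frozenset({1, 2}), frozenset({2, 3}), universe}

print(is_semi_algebra(intervals, universe), is_semi_algebra(not_closed, universe))
```

The collection `intervals` passes both conditions but is not an algebra, since \( \{1, 3\} = \{2\}^c \) is not itself an interval; it is only a disjoint union of the intervals \( \{1\} \) and \( \{3\} \).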

Suppose now that we know how a measure \( \mu \) should work on a semi-algebra \( \mathscr{B} \) that generates an algebra \( \mathscr A \) and then a \( \sigma \)-algebra \( \mathscr{S} = \sigma(\mathscr A) = \sigma(\mathscr{B}) \). That is, we know \( \mu(B) \in [0, \infty] \) for each \( B \in \mathscr{B} \). Because of the additivity property, there is no question as to how we should extend \( \mu \) to \(\mathscr A\). We must have \[ \mu(A) = \sum_{i \in I} \mu(B_i)\] if \(A = \bigcup_{i \in I} B_i\) for some finite, disjoint collection \( \{B_i: i \in I\} \) of sets in \( \mathscr{B} \) (so that \( A \in \mathscr A \)). However, we cannot assign the values \( \mu(B) \) for \( B \in \mathscr{B} \) arbitrarily. The following extension theorem states that, subject just to some essential consistency conditions, the extension of \( \mu \) from the semi-algebra \( \mathscr{B} \) to the algebra \( \mathscr A \) does in fact produce a measure on \( \mathscr A \). The consistency conditions are that \( \mu \) be *finitely additive* and *countably subadditive* on \( \mathscr{B} \).

Suppose that \( \mathscr{B} \) is a semi-algebra of subsets of \( S \) and that \( \mathscr A \) is the algebra of subsets of \( S \) generated by \(\mathscr{B}\). A function \( \mu: \mathscr{B} \to [0, \infty] \) can be uniquely extended to a measure on \( \mathscr A \) if and only if \( \mu \) satisfies the following properties:

- If \( \emptyset \in \mathscr{B} \) then \( \mu(\emptyset) = 0 \).
- If \( \{B_i: i \in I\} \) is a finite, disjoint collection of sets in \( \mathscr{B} \) and \( B = \bigcup_{i \in I} B_i \in \mathscr{B} \) then \( \mu(B) = \sum_{i \in I} \mu(B_i) \).
- If \( B \in \mathscr{B} \) and \( B \subseteq \bigcup_{i \in I} B_i \) where \( \{B_i: i \in I\} \) is a countable collection of sets in \( \mathscr{B} \) then \( \mu(B) \le \sum_{i \in I} \mu(B_i) \).

If the measure \( \mu \) on the algebra \( \mathscr A \) is \( \sigma \)-finite, then the extension theorem and the uniqueness theorem apply, so \( \mu \) can be extended uniquely to a measure on the \( \sigma \)-algebra \( \mathscr{S} = \sigma(\mathscr A) = \sigma(\mathscr{B}) \). This chain of extensions, starting with a semi-algebra \( \mathscr{B} \), is often how measures are constructed.

## Examples and Applications

### Product Spaces

Suppose that \( (S, \mathscr{S}) \) and \( (T, \mathscr{T}) \) are measurable spaces. For the Cartesian product set \( S \times T \), recall that the product \( \sigma \)-algebra is \[ \mathscr{S} \otimes \mathscr{T} = \sigma\{A \times B: A \in \mathscr{S}, B \in \mathscr{T}\} \] the \( \sigma \)-algebra generated by the Cartesian products of measurable sets, sometimes referred to as measurable rectangles.

Suppose that \( (S, \mathscr S, \mu) \) and \( (T, \mathscr T, \nu) \) are \( \sigma \)-finite measure spaces. Then there exists a unique \( \sigma \)-finite measure \( \mu \otimes \nu \) on \((S \times T, \mathscr{S} \otimes \mathscr{T}) \) such that \[ (\mu \otimes \nu)(A \times B) = \mu(A) \nu(B); \quad A \in \mathscr{S}, \; B \in \mathscr{T} \] The measure space \( (S \times T, \mathscr{S} \otimes \mathscr{T}, \mu \otimes \nu) \) is the product measure space associated with \( (S, \mathscr{S}, \mu) \) and \( (T, \mathscr{T}, \nu) \).

## Proof

Recall that the collection \( \mathscr{B} = \{A \times B: A \in \mathscr{S}, B \in \mathscr{T}\} \) is a semi-algebra: the intersection of two product sets is another product set, and the complement of a product set is the union of two disjoint product sets. We define \( \rho: \mathscr{B} \to [0, \infty] \) by \( \rho(A \times B) = \mu(A) \nu(B) \). The consistency conditions hold, so \( \rho \) can be extended to a measure on the algebra \( \mathscr A \) generated by \( \mathscr{B} \). The algebra \( \mathscr A \) is the collection of all finite, disjoint unions of products of measurable sets. We will now show that the extended measure \( \rho \) is \( \sigma \)-finite on \( \mathscr A \). Since \( \mu \) is \( \sigma \)-finite, there exists an increasing sequence \( (A_1, A_2, \ldots) \) of sets in \( \mathscr{S} \) with \( \mu(A_i) \lt \infty \) and \( \bigcup_{i = 1}^\infty A_i = S \). Similarly, there exists an increasing sequence \( (B_1, B_2, \ldots) \) of sets in \( \mathscr{T} \) with \( \nu(B_j) \lt \infty \) and \( \bigcup_{j = 1}^\infty B_j = T \). Then \( \rho(A_i \times B_j) = \mu(A_i) \nu(B_j) \lt \infty \), and since the sets are increasing, \( \bigcup_{(i, j) \in \N_+ \times \N_+} A_i \times B_j = S \times T \). The standard extension theorem and uniqueness theorem now apply, so \( \rho \) can be extended uniquely to a measure on \( \sigma(\mathscr A) = \mathscr{S} \otimes \mathscr{T} \).
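In the discrete case the product measure can be written down directly, which gives a simple check of the rectangle formula. The sketch below assumes two small finite spaces with made-up point masses; the names `mu`, `nu`, `measure`, and `product_measure` are illustrative choices.

```python
from fractions import Fraction
from itertools import product

# Hypothetical point masses on S = {"a", "b"} and T = {0, 1, 2}.
mu = {"a": Fraction(1, 2), "b": Fraction(3, 2)}
nu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1, 3)}

def measure(point_mass, A):
    """Measure of A as the sum of the point masses of its elements."""
    return sum((point_mass[x] for x in A), Fraction(0))

def product_measure(C):
    """Product measure of a set C of pairs, summing the products of point masses."""
    return sum((mu[x] * nu[y] for x, y in C), Fraction(0))

A, B = {"b"}, {0, 2}
rectangle = set(product(A, B))

lhs = product_measure(rectangle)
rhs = measure(mu, A) * measure(nu, B)
print(lhs, rhs)
```

For this rectangle both sides equal \( \frac{3}{2} \cdot \frac{4}{3} = 2 \). The function `product_measure` is defined on every subset of the product, not just on rectangles, which mirrors the extension from the semi-algebra of rectangles to the full product \( \sigma \)-algebra.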

Recall that for \( C \subseteq S \times T \), the cross section of \( C \) in the first coordinate at \( x \in S \) is \( C_x = \{ y \in T: (x, y) \in C\} \). Similarly, the cross section of \( C \) in the second coordinate at \( y \in T \) is \( C^y = \{ x \in S: (x, y) \in C\} \). We know that the cross sections of a measurable set are measurable. The following result shows that the measures of the cross sections of a measurable set form measurable functions.

Suppose again that \( (S, \mathscr S, \mu) \) and \( (T, \mathscr T, \nu) \) are \( \sigma \)-finite measure spaces. If \( C \in \mathscr{S} \otimes \mathscr{T} \) then

- \( x \mapsto \nu(C_x) \) is a measurable function from \( S \) to \( [0, \infty] \).
- \( y \mapsto \mu(C^y) \) is a measurable function from \( T \) to \( [0, \infty] \).

## Proof

We prove part (a), since of course the proof for part (b) is symmetric. Suppose first that the measure spaces are finite. Let \( \mathscr{R} = \{A \times B: A \in \mathscr{S}, B \in \mathscr{T}\} \) denote the set of measurable rectangles. Let \( \mathscr{C} = \{C \in \mathscr{S} \otimes \mathscr{T}: x \mapsto \nu(C_x) \text{ is measurable}\}\). If \( A \times B \in \mathscr{R} \), then \( A \times B \in \mathscr{C} \), since \( \nu[(A \times B)_x] = \nu(B) \bs{1}_A(x) \). Next, suppose \( C \in \mathscr{C} \). Then \( (C^c)_x = (C_x)^c \), so \( \nu[(C^c)_x] = \nu(T) - \nu(C_x) \) and this is a measurable function of \( x \in S \). Hence \( C^c \in \mathscr{C} \). Next, suppose that \( \{C_i: i \in I\} \) is a countable, disjoint collection of sets in \( \mathscr{C} \) and let \( C = \bigcup_{i \in I} C_i \). Then \( \{(C_i)_x: i \in I\} \) is a countable, disjoint collection of sets in \( \mathscr{T} \), and \( C_x = \bigcup_{i \in I} (C_i)_x \). Hence \( \nu(C_x) = \sum_{i \in I} \nu[(C_i)_x] \), and this is a measurable function of \( x \in S \). Hence \( C \in \mathscr{C} \). It follows that \( \mathscr{C} \) is a \( \lambda \)-system that contains \( \mathscr{R} \), which in turn is a \( \pi \)-system. It follows from Dynkin's \(\pi\)-\(\lambda \) theorem that \( \mathscr{S} \otimes \mathscr{T} = \sigma(\mathscr{R}) \subseteq \mathscr{C} \). Thus \( \mathscr{C} = \mathscr{S} \otimes \mathscr{T} \).

Consider now the general case where the measure spaces are \( \sigma \)-finite. There exists an increasing sequence of sets \( C_n \in \mathscr{S} \otimes \mathscr{T} \) for \( n \in \N_+ \) with \( (\mu \otimes \nu)(C_n) \lt \infty \) for \( n \in \N_+ \) and \( \bigcup_{n=1}^\infty C_n = S \times T \). If \( C \in \mathscr{S} \otimes \mathscr{T} \), then \( C \cap C_n \) is increasing in \( n \in \N_+ \), and \( C = \bigcup_{n=1}^\infty (C \cap C_n) \). Hence, for \( x \in S \), \( (C \cap C_n)_x \) is increasing in \( n \in \N_+ \) and \( C_x = \bigcup_{n=1}^\infty (C \cap C_n)_x \). Therefore \( \nu(C_x) = \lim_{n \to \infty} \nu[(C \cap C_n)_x] \). But \( x \mapsto \nu[(C \cap C_n)_x] \) is a measurable function of \( x \in S \) for each \( n \in \N_+ \) by the previous argument, so \( x \mapsto \nu(C_x) \) is a measurable function of \( x \in S \).

In the next chapter, where we study integration with respect to a measure, we will see that for \( C \in \mathscr{S} \otimes \mathscr{T} \), the product measure \( (\mu \otimes \nu)(C) \) can be computed by integrating \( \nu(C_x) \) over \( x \in S \) with respect to \( \mu \) or by integrating \( \mu(C^y) \) over \( y \in T \) with respect to \( \nu \). These results, generalizing the definition of the product measure, are special cases of Fubini's theorem, named for the Italian mathematician Guido Fubini.
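For finite spaces, integration with respect to a measure reduces to a weighted sum, so the cross-section identities can be previewed without any integration theory. The following sketch assumes the same kind of made-up point masses as a discrete illustration; the set `C` is deliberately not a rectangle, and all names are hypothetical.

```python
from fractions import Fraction

# Hypothetical point masses on S = {"a", "b"} and T = {0, 1, 2}.
mu = {"a": Fraction(1, 2), "b": Fraction(3, 2)}
nu = {0: Fraction(1), 1: Fraction(2), 2: Fraction(1, 3)}

# A subset of S x T that is not a product set.
C = {("a", 0), ("a", 1), ("b", 1), ("b", 2)}

def measure(point_mass, A):
    """Measure of A as the sum of the point masses of its elements."""
    return sum((point_mass[x] for x in A), Fraction(0))

def section_x(C, x):
    """Cross section of C in the first coordinate at x."""
    return {v for (u, v) in C if u == x}

def section_y(C, y):
    """Cross section of C in the second coordinate at y."""
    return {u for (u, v) in C if v == y}

direct = sum((mu[x] * nu[y] for x, y in C), Fraction(0))
via_x = sum((mu[x] * measure(nu, section_x(C, x)) for x in mu), Fraction(0))
via_y = sum((nu[y] * measure(mu, section_y(C, y)) for y in nu), Fraction(0))
print(direct, via_x, via_y)
```

All three computations give 5 here: summing \( \nu(C_x) \) against \( \mu \), or \( \mu(C^y) \) against \( \nu \), recovers the product measure of \( C \).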

Except for more complicated notation, these results extend in a perfectly straightforward way to the product of a finite number of \( \sigma \)-finite measure spaces.

Suppose that \( n \in \N_+ \) and that \( (S_i, \mathscr S_i, \mu_i) \) is a \( \sigma \)-finite measure space for \( i \in \{1, 2, \ldots, n\} \). Let \( S = \prod_{i=1}^n S_i \) and let \( \mathscr S \) denote the corresponding product \( \sigma \)-algebra. There exists a unique \( \sigma \)-finite measure \( \mu \) on \( (S, \mathscr{S}) \) satisfying \[ \mu\left(\prod_{i=1}^n A_i\right) = \prod_{i=1}^n \mu_i(A_i), \quad A_i \in \mathscr{S}_i \text{ for } i \in \{1, 2, \ldots, n\} \] The measure space \( (S, \mathscr S, \mu) \) is the product measure space associated with the given measure spaces.

### Lebesgue Measure

The next discussion concerns our most important and essential application. Recall that the Borel \( \sigma \)-algebra on \( \R \), named for Émile Borel, is the \( \sigma \)-algebra \( \mathscr{R} \) generated by the standard Euclidean topology on \( \R \). Equivalently, \( \mathscr{R} = \sigma(\mathscr{I}) \) where \( \mathscr{I} \) is the collection of intervals of \( \R \) (of all types—bounded and unbounded, with any type of closure, and including single points and the empty set). Next recall how the length of an interval is defined. For \( a, \, b \in \R \) with \( a \le b \), each of the intervals \( (a, b) \), \( [a, b) \), \( (a, b] \), and \( [a, b] \) has length \( b - a \). For \( a \in \R \), each of the intervals \( (a, \infty) \), \( [a, \infty) \), \( (-\infty, a) \), \( (-\infty, a] \) has length \( \infty \), as does \( \R \) itself. The standard measure on \( \mathscr{R} \) generalizes the length measurement for intervals.

There exists a unique measure \( \lambda \) on \( \mathscr{R} \) such that \( \lambda(I) = \length(I) \) for \( I \in \mathscr{I} \). The measure \( \lambda \) is Lebesgue measure on \( (\R, \mathscr R) \).

## Proof

Recall that \( \mathscr{I} \) is a semi-algebra: the intersection of two intervals is another interval, and the complement of an interval is either another interval or the union of two disjoint intervals. Define \( \lambda \) on \( \mathscr{I} \) by \( \lambda(I) = \length(I) \) for \( I \in \mathscr{I} \). Then \( \lambda \) satisfies the consistency condition and hence can be extended to a measure on the algebra \( \mathscr{J} \) generated by \( \mathscr{I} \), namely the collection of finite, disjoint unions of intervals. The measure \( \lambda \) on \( \mathscr{J} \) is clearly \( \sigma \)-finite, since \( \R \) can be written as a countably infinite union of bounded intervals. Hence the standard extension theorem and uniqueness theorem apply, so \( \lambda \) can be extended uniquely to a measure on \( \mathscr{R} = \sigma(\mathscr{I}) \).
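The intermediate step, extending length from intervals to finite unions of intervals, can be sketched in code. The following is an illustration under simplifying assumptions: intervals are bounded and given as hypothetical `(a, b)` endpoint pairs with `a <= b`, and the function name is ours. Since endpoints contribute no length, the type of closure is ignored. Overlapping intervals are first merged into disjoint blocks, and then the lengths of the blocks are added.

```python
def length_of_union(intervals):
    """Total length of a finite union of bounded intervals.

    Each interval is an (a, b) pair with a <= b. Overlapping or touching
    intervals are merged into disjoint blocks before lengths are added.
    """
    total = 0
    current_start = current_end = None
    for a, b in sorted(intervals):
        if current_end is None or a > current_end:
            # Disjoint from the block built so far: bank it and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = a, b
        else:
            # Overlaps or touches the current block: extend it.
            current_end = max(current_end, b)
    if current_end is not None:
        total += current_end - current_start
    return total

print(length_of_union([(0, 1), (2, 5)]))   # disjoint: lengths simply add
print(length_of_union([(0, 2), (1, 3)]))   # overlapping: the overlap is counted once
```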

The name is in honor of Henri Lebesgue, of course. Since \( \lambda \) is \( \sigma \)-finite, the \( \sigma \)-algebra of Borel sets \( \mathscr{R} \) can be completed with respect to \( \lambda \).

The completion of the Borel \( \sigma \)-algebra \( \mathscr R \) with respect to \( \lambda \) is the Lebesgue \( \sigma \)-algebra \( \mathscr R^* \).

Recall that completed means that if \( A \in \mathscr{R}^* \), \( \lambda(A) = 0 \) and \( B \subseteq A \), then \( B \in \mathscr{R}^* \) (and then \( \lambda(B) = 0 \)). The Lebesgue measure \( \lambda \) on \( \R \), with either the Borel \( \sigma \)-algebra \( \mathscr{R} \) or its completion \( \mathscr{R}^* \), is the standard measure that is used for the real numbers. Other properties of the measure space \( (\R, \mathscr R, \lambda) \) are given below, in the discussion of Lebesgue measure on \( \R^n \).

For \( n \in \N_+ \), let \( \mathscr R_n \) denote the Borel \( \sigma \)-algebra corresponding to the standard Euclidean topology on \( \R^n \), so that \( (\R^n, \mathscr R_n) \) is the \( n \)-dimensional Euclidean measurable space. The \( \sigma \)-algebra \( \mathscr{R}_n \) is also the \( n \)-fold power of \( \mathscr{R} \), the Borel \( \sigma \)-algebra of \( \R \). That is, \( \mathscr{R}_n = \mathscr{R} \otimes \mathscr{R} \otimes \cdots \otimes \mathscr{R} \) (\( n \) times). It is also the \( \sigma \)-algebra generated by the products of intervals: \[ \mathscr{R}_n = \sigma\left\{I_1 \times I_2 \times \cdots \times I_n: I_j \in \mathscr{I} \text{ for } j \in \{1, 2, \ldots, n\}\right\} \] As above, let \( \lambda \) denote Lebesgue measure on \( (\R, \mathscr R) \).

For \( n \in \N_+ \), the \( n \)-fold power of \( \lambda \), denoted \( \lambda_n \), is Lebesgue measure on \( (\R^n, \mathscr R_n) \). In particular, \[ \lambda_n(A_1 \times A_2 \times \cdots \times A_n) = \lambda(A_1) \lambda(A_2) \cdots \lambda(A_n); \quad A_1, \, \ldots, A_n \in \mathscr{R} \]

Specializing further, if \( I_j \in \mathscr{I} \) is an interval for \( j \in \{1, 2, \ldots, n\} \) then \[ \lambda_n\left(I_1 \times I_2 \times \cdots \times I_n\right) = \length(I_1) \length(I_2) \cdots \length(I_n) \] In particular, \( \lambda_2 \) extends the *area* measure on \( \mathscr{R}_2 \) and \( \lambda_3 \) extends the *volume* measure on \( \mathscr{R}_3 \). In general, \( \lambda_n(A) \) is sometimes referred to as the \( n \)-dimensional volume of \( A \in \mathscr{R}_n \). As in the one-dimensional case, \( \mathscr{R}_n \) can be completed with respect to \( \lambda_n \), essentially adding all subsets of sets of measure 0 to \( \mathscr{R}_n \). The completed \( \sigma \)-algebra is the \( \sigma \)-algebra of Lebesgue measurable sets. Since \( \lambda_n(U) \gt 0 \) if \( U \subseteq \R^n \) is open and nonempty, the support of \( \lambda_n \) is all of \( \R^n \). In addition, Lebesgue measure has the regularity properties that are concerned with approximating the measure of a set, from below with the measure of a compact set, and from above with the measure of an open set.
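The formula for a product of intervals is easy to compute directly. The sketch below assumes each interval is given as a hypothetical `(a, b)` endpoint pair with `a <= b`, where the endpoints may be infinite; the function name is ours. The one subtle point is the usual measure-theoretic convention \( 0 \cdot \infty = 0 \): a box with a degenerate side has measure 0 even if another side is unbounded, whereas floating-point arithmetic would return `nan` for that product.

```python
from math import inf, prod

def box_volume(sides):
    """lambda_n of I_1 x ... x I_n, each interval given as an (a, b) pair, a <= b."""
    lengths = [b - a for a, b in sides]
    # Convention 0 * infinity = 0: a side of length 0 forces measure 0.
    if any(length == 0 for length in lengths):
        return 0
    return prod(lengths)

print(box_volume([(0, 2), (1, 4)]))          # a 2 x 3 rectangle in the plane
print(box_volume([(0, 0), (-inf, inf)]))     # a vertical line in the plane
print(box_volume([(0, 1), (0, inf)]))        # an unbounded strip
```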

The measure space \( (\R^n, \mathscr R_n, \lambda_n) \) is regular. That is, for \( A \in \mathscr R_n \),

- \( \lambda_n(A) = \sup\{\lambda_n(C): C \text{ is compact and } C \subseteq A\} \) (inner regularity)
- \( \lambda_n(A) = \inf\{\lambda_n(U): U \text{ is open and } A \subseteq U\} \) (outer regularity)
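As one illustration of approximation from above (not a proof of regularity), consider a countable set \( A = \{x_1, x_2, \ldots\} \subseteq \R \), which has Lebesgue measure 0. Given \( \epsilon \gt 0 \), place an open interval of length \( \epsilon / 2^k \) around \( x_k \). The union of these intervals is an open set containing \( A \), and by subadditivity its measure is at most \( \epsilon \). The sketch below only checks the arithmetic of the covering lengths, using exact fractions; the function name and the particular values of `epsilon` and `count` are hypothetical.

```python
from fractions import Fraction

def cover_lengths(epsilon, count):
    """Lengths epsilon / 2**k, k = 1..count, of the intervals around the first `count` points."""
    return [epsilon / 2**k for k in range(1, count + 1)]

epsilon = Fraction(1, 100)
total = sum(cover_lengths(epsilon, 50))

# The partial sums stay strictly below epsilon, however many points are covered.
print(total < epsilon, epsilon - total)
```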

The following theorem describes how the measure of a set is changed under certain basic transformations. These are essential properties of Lebesgue measure. To set up the notation, suppose that \( n \in \N_+ \), \( A \subseteq \R^n \), \( x \in \R^n \), \( c \in (0, \infty) \) and that \( T \) is an \( n \times n \) matrix. Define \[ A + x = \{a + x: a \in A\}, \quad c A = \{c a: a \in A\}, \quad TA = \{T a: a \in A\} \]

Suppose that \( A \in \mathscr R_n \).

- If \( x \in \R^n \) then \( \lambda_n(A + x) = \lambda_n(A) \) (translation invariance)
- If \( c \in (0, \infty) \) then \( \lambda_n(c A) = c^n \lambda_n(A) \) (dilation property)
- If \( T \) is an \( n \times n \) matrix then \( \lambda_n(T A) = |\det(T)| \lambda_n(A) \) (the scaling property)
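These three properties can be checked by hand in the plane, where the image of the unit square under each transformation is a polygon whose area is given by the shoelace formula. The following is a small sanity check for \( n = 2 \) only; the helper names, the matrix `T`, the shift, and the scale factor are all hypothetical choices.

```python
def polygon_area(vertices):
    """Area of a polygon from its vertices listed in order (shoelace formula)."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2

def transform(T, point):
    """Multiply the 2 x 2 matrix T (a pair of rows) by a point of R^2."""
    (a, b), (c, d) = T
    x, y = point
    return (a * x + b * y, c * x + d * y)

square = [(0, 0), (1, 0), (1, 1), (0, 1)]             # unit square, area 1
T = ((2, 1), (1, 3))                                  # det(T) = 2 * 3 - 1 * 1 = 5

shifted = [(x + 7, y - 2) for x, y in square]         # A + x
scaled = [(3 * x, 3 * y) for x, y in square]          # c A with c = 3
image = [transform(T, p) for p in square]             # T A

print(polygon_area(shifted), polygon_area(scaled), polygon_area(image))
```

The three printed areas are \( 1 \), \( 3^2 = 9 \), and \( |\det(T)| = 5 \), matching the translation, dilation, and scaling properties for a set of measure 1.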

### Lebesgue-Stieltjes Measures on \( \R \)

The construction of Lebesgue measure on \( \R \) can be generalized. Here is the definition that we will need.

A function \( F: \R \to \R \) that satisfies the following properties is a distribution function on \( \R \):

- \( F \) is increasing: if \( x \le y \) then \( F(x) \le F(y) \).
- \( F \) is continuous from the right: \( \lim_{t \downarrow x} F(t) = F(x) \) for all \( x \in \R \).

Since \( F \) is increasing, the limit from the left at \( x \in \R \) exists in \( \R \) and is denoted \( F(x^-) = \lim_{t \uparrow x} F(t) \). Similarly \(F(\infty) = \lim_{x \to \infty} F(x) \) exists, as a real number or \( \infty \), and \(F(-\infty) = \lim_{x \to -\infty} F(x) \) exists, as a real number or \( -\infty \).

If \( F \) is a distribution function on \( \R \), then there exists a unique measure \( \mu \) on \( \mathscr{R} \) that satisfies \[ \mu(a, b] = F(b) - F(a), \quad -\infty \le a \le b \le \infty \]

The measure \( \mu \) is called the Lebesgue-Stieltjes measure associated with \( F \), named for Henri Lebesgue and Thomas Joannes Stieltjes. Distribution functions and the measures associated with them are studied in more detail in the chapter on Distributions. When \( F(-\infty) = 0 \) and \( F(\infty) = 1 \), so that in particular \( F \) takes values in \( [0, 1] \), the associated measure \( \P \) is a probability measure, and the function \( F \) is the probability distribution function of \( \P \). Probability distribution functions are also studied in much more detail (but with less technicality) in the chapter on Distributions.

Note that the identity function \( x \mapsto x \) for \( x \in \R \) is a distribution function, and the measure associated with this function is ordinary Lebesgue measure on \( \R \), as constructed above.
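The defining formula \( \mu(a, b] = F(b) - F(a) \) can be tried out on two simple distribution functions. The sketch below, with hypothetical function names, uses the identity function, which returns interval length, and the floor function, which is increasing and continuous from the right. For the floor function, \( \mu(a, b] \) counts the integers in \( (a, b] \), so the associated measure is counting measure on the integers.

```python
from math import floor

def stieltjes_measure(F, a, b):
    """mu(a, b] = F(b) - F(a) for a distribution function F and real a <= b."""
    if a > b:
        raise ValueError("need a <= b")
    return F(b) - F(a)

def identity(x):
    """The distribution function of Lebesgue measure."""
    return x

print(stieltjes_measure(identity, 2, 5))      # length of (2, 5]
print(stieltjes_measure(floor, 0.5, 3))       # integers in (0.5, 3]: 1, 2, 3
print(stieltjes_measure(floor, 1, 3))         # integers in (1, 3]: 2, 3
print(stieltjes_measure(floor, 1.999, 2))     # a short interval ending at 2
```

The last line illustrates that a jump of \( F \) at a point shows up as mass at that point: however short the interval \( (a, 2] \) is, its measure under the floor function remains 1.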