1.11: Measurable Spaces


\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

\(\newcommand{\R}{\mathbb{R}}\) \(\newcommand{\N}{\mathbb{N}}\) \(\newcommand{\Z}{\mathbb{Z}}\) \(\newcommand{\Q}{\mathbb{Q}}\) \(\newcommand{\D}{\mathbb{D}}\) \(\newcommand{\bs}{\boldsymbol}\)

In this section we discuss some topics from measure theory that are a bit more advanced than the topics in the early sections of this chapter. However, measure-theoretic ideas are essential for a deep understanding of probability, since probability is itself a measure. The most important of the definitions is the \(\sigma\)-algebra, a collection of subsets of a set with certain closure properties. Such collections play a fundamental role, even for applied probability, in encoding the state of information about a random experiment.

On the other hand, we won't be overly pedantic about measure-theoretic details in this text. Unless we say otherwise, we assume that all sets that appear are measurable (that is, members of the appropriate \(\sigma\)-algebras), and that all functions are measurable (relative to the appropriate \(\sigma\)-algebras).

Although this section is somewhat abstract, many of the proofs are straightforward. Be sure to try the proofs yourself before reading the ones in the text.

Algebras and \( \sigma \)-Algebras

Suppose that \(S\) is a set, playing the role of a universal set for a particular mathematical model. It is sometimes impossible to include all subsets of \(S\) in our model, particularly when \(S\) is uncountable. In a sense, the more sets that we include, the harder it is to have consistent theories. However, we almost always want the collection of admissible subsets to be closed under the basic set operations. This leads to some important definitions.

Algebras of Sets

Suppose that \(\mathscr S\) is a nonempty collection of subsets of \(S\). Then \(\mathscr S\) is an algebra (or field) if it is closed under complement and union:

If \(A \in \mathscr S\) then \(A^c \in \mathscr S\).
If \(A \in \mathscr S\) and \(B \in \mathscr S\) then \(A \cup B \in \mathscr S\).

If \(\mathscr S\) is an algebra of subsets of \(S\) then

\( S \in \mathscr S \)
\( \emptyset \in \mathscr S \)

Proof

Since \( \mathscr S \) is nonempty, there exists \( A \in \mathscr S \). Hence \( A^c \in \mathscr S \) so \( S = A \cup A^c \in \mathscr S \).
\( \emptyset = S^c \in \mathscr S \)

Suppose that \(\mathscr S\) is an algebra of subsets of \(S\) and that \(A_i \in \mathscr S\) for each \(i\) in a finite index set \(I\).

\(\bigcup_{i \in I} A_i \in \mathscr S\)
\(\bigcap_{i \in I} A_i \in \mathscr S\)

Proof

This follows by induction on the number of elements in \(I\).
This follows from (a) and DeMorgan's law. If \( A_i \in \mathscr S \) for \( i \in I \) then \( A_i^c \in \mathscr S \) for \( i \in I \). Therefore \( \bigcup_{i \in I} A_i^c \in \mathscr S \) and hence \( \bigcap_{i \in I} A_i = \left(\bigcup_{i \in I} A_i^c\right)^c \in \mathscr S \).

Thus it follows that an algebra of sets is closed under a finite number of set operations. That is, if we start with a finite number of sets in the algebra \( \mathscr S \), and build a new set with a finite number of set operations (union, intersection, complement), then the new set is also in \( \mathscr S \). However, in many mathematical theories, probability in particular, this is not sufficient; we often need the collection of admissible subsets to be closed under a countable number of set operations.
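For a finite universal set, the two axioms can be checked by brute force. The following is a minimal Python sketch, not part of the original text; the function name `is_algebra` and the sample collections are illustrative assumptions.

```python
from itertools import combinations


def is_algebra(universe, collection):
    """Return True if `collection` is an algebra of subsets of `universe`."""
    universe = frozenset(universe)
    sets = {frozenset(a) for a in collection}
    if not sets:
        return False  # an algebra must be nonempty
    if any(not a <= universe for a in sets):
        return False  # every member must be a subset of the universe
    # Closed under complement.
    if any(universe - a not in sets for a in sets):
        return False
    # Closed under pairwise union (finite unions then follow by induction).
    return all(a | b in sets for a, b in combinations(sets, 2))


S = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, S]
bad = [set(), {1}, {2}, S]  # missing the complement {2, 3, 4} of {1}
print(is_algebra(S, good), is_algebra(S, bad))  # prints: True False
```

Only complements and pairwise unions are tested, since closure under finite unions and intersections then follows from the results above.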

\(\sigma\)-Algebras of Sets

Suppose that \(\mathscr S\) is a nonempty collection of subsets of \(S\). Then \(\mathscr S\) is a \(\sigma\)-algebra (or \(\sigma\)-field) if the following axioms are satisfied:

If \(A \in \mathscr S\) then \(A^c \in \mathscr S\).
If \(A_i \in \mathscr S\) for each \(i\) in a countable index set \(I\), then \(\bigcup_{i \in I} A_i \in \mathscr S\).

Clearly a \(\sigma\)-algebra of subsets is also an algebra of subsets, so the basic results for algebras above still hold. In particular, \( S \in \mathscr S \) and \( \emptyset \in \mathscr S \).

If \(A_i \in \mathscr S\) for each \(i\) in a countable index set \(I\), then \(\bigcap_{i \in I} A_i \in \mathscr S\).

Proof

The proof is just like the one above for algebras. If \( A_i \in \mathscr S \) for \( i \in I \) then \( A_i^c \in \mathscr S \) for \( i \in I \). Therefore \( \bigcup_{i \in I} A_i^c \in \mathscr S \) and hence \( \bigcap_{i \in I} A_i = \left(\bigcup_{i \in I} A_i^c\right)^c \in \mathscr S \).

Thus a \(\sigma\)-algebra of subsets of \(S\) is closed under countable unions and intersections. This is the reason for the symbol \(\sigma\) in the name. As mentioned in the introductory paragraph, \( \sigma \)-algebras are of fundamental importance in mathematics generally and probability theory specifically, and thus deserve a special definition:

If \( S \) is a set and \( \mathscr S \) a \( \sigma \)-algebra of subsets of \( S \), then the pair \( (S, \mathscr S) \) is called a measurable space.

The term measurable space will make more sense in the next chapter, when we discuss positive measures (and in particular, probability measures) on such spaces.

Suppose that \(S\) is a set and that \(\mathscr S\) is a finite algebra of subsets of \(S\). Then \(\mathscr S\) is also a \(\sigma\)-algebra.

Proof

Any countable union of sets in \(\mathscr S\) reduces to a finite union.

However, there are algebras that are not \(\sigma\)-algebras. Here is the classic example:

Suppose that \( S \) is an infinite set. The collection of finite and co-finite subsets of \( S \) defined below is an algebra of subsets of \( S \), but not a \(\sigma\)-algebra: \[ \mathscr{F} = \{A \subseteq S: A \text{ is finite or } A^c \text{ is finite}\} \]

Proof

\( S \in \mathscr{F} \) since \( S^c = \emptyset \) is finite. If \( A \in \mathscr{F} \) then \( A^c \in \mathscr{F} \) by the symmetry of the definition. Suppose that \( A, \, B \in \mathscr{F} \). If \( A \) and \( B \) are both finite then \( A \cup B \) is finite. If \( A^c \) or \( B^c \) is finite, then \( (A \cup B)^c = A^c \cap B^c \) is finite. In either case, \( A \cup B \in \mathscr{F} \). Thus \( \mathscr{F} \) is an algebra of subsets of \( S \).

Since \( S \) is infinite, it contains a countably infinite subset \( \{x_0, x_1, x_2, \ldots\} \). Let \( A_n = \{x_{2 n}\} \) for \( n \in \N \). Then \( A_n \) is finite, so \( A_n \in \mathscr{F} \) for each \( n \in \N \). Let \( E = \bigcup_{n=0}^\infty A_n = \{x_0, x_2, x_4, \ldots\} \). Then \( E \) is infinite by construction. Also \(\{x_1, x_3, x_5, \ldots\} \subseteq E^c \), so \( E^c \) is infinite as well. Hence \( E \notin \mathscr{F} \) and so \( \mathscr{F} \) is not a \( \sigma \)-algebra.

General Constructions

Recall that \(\mathscr{P}(S)\) denotes the collection of all subsets of \(S\), called the power set of \(S\). Trivially, \(\mathscr{P}(S)\) is the largest \(\sigma\)-algebra of \(S\). The power set is often the appropriate \( \sigma \)-algebra if \( S \) is countable, but as noted above, is sometimes too large to be useful if \( S \) is uncountable. At the other extreme, the smallest \(\sigma\)-algebra of \(S\) is given in the following result:

The collection \(\{\emptyset, S\}\) is a \(\sigma\)-algebra.

Proof

Clearly \( \{\emptyset, S\} \) is a finite algebra: \( S \) and \( \emptyset \) are complements of each other, and \( S \cup \emptyset = S \). Hence \( \{S, \emptyset\} \) is a \( \sigma \)-algebra by the result above.

In many cases, we want to construct a \(\sigma\)-algebra that contains certain basic sets. The next two results show how to do this.

Suppose that \(\mathscr S_i\) is a \(\sigma\)-algebra of subsets of \(S\) for each \(i\) in a nonempty index set \(I\). Then \( \mathscr S = \bigcap_{i \in I} \mathscr S_i\) is also a \(\sigma\)-algebra of subsets of \(S\).

Proof

The proof is completely straightforward. First, \( S \in \mathscr S_i \) for each \( i \in I \) so \( S \in \mathscr S \). If \( A \in \mathscr S \) then \( A \in \mathscr S_i \) for each \( i \in I \) and hence \( A^c \in \mathscr S_i \) for each \( i \in I \). Therefore \( A^c \in \mathscr S \). Finally suppose that \( A_j \in \mathscr S \) for each \( j \) in a countable index set \( J \). Then \( A_j \in \mathscr S_i \) for each \( i \in I \) and \( j \in J \) and therefore \( \bigcup_{j \in J} A_j \in \mathscr S_i \) for each \( i \in I \). It follows that \( \bigcup_{j \in J} A_j \in \mathscr S \).

Note that no restrictions are placed on the index set \( I \), other than that it be nonempty, so in particular it may well be uncountable.

Suppose that \( S \) is a set and that \(\mathscr B\) is a collection of subsets of \(S\). The \(\sigma\)-algebra generated by \(\mathscr B\) is \[\sigma(\mathscr B) = \bigcap \{\mathscr S: \mathscr S \text{ is a } \sigma\text{-algebra of subsets of } S \text{ and } \mathscr B \subseteq \mathscr S\}\] If \( \mathscr B \) is countable then \( \sigma(\mathscr B) \) is said to be countably generated.

So the \(\sigma\)-algebra generated by \(\mathscr B\) is the intersection of all \(\sigma\)-algebras that contain \(\mathscr B\), which by the previous result really is a \(\sigma\)-algebra. Note that the collection of \( \sigma \)-algebras in the intersection is not empty, since \( \mathscr{P}(S) \) is in the collection. Think of the sets in \(\mathscr B\) as basic sets that we want to be measurable, but that do not themselves form a \(\sigma\)-algebra.

The \(\sigma\)-algebra \(\sigma(\mathscr B)\) is the smallest \(\sigma\)-algebra containing \(\mathscr B\).

\(\mathscr B \subseteq \sigma(\mathscr B)\)
If \(\mathscr S\) is a \(\sigma\)-algebra of subsets of \(S\) and \(\mathscr B \subseteq \mathscr S\) then \(\sigma(\mathscr B) \subseteq \mathscr S\).

Proof

Both of these properties follow from the definition of \( \sigma(\mathscr B) \) as the intersection of all \( \sigma \)-algebras that contain \( \mathscr B \).

Note that the conditions in the last theorem completely characterize \( \sigma(\mathscr B) \). If \( \mathscr S_1 \) and \( \mathscr S_2 \) satisfy the conditions, then by (a), \( \mathscr B \subseteq \mathscr S_1 \) and \( \mathscr B \subseteq \mathscr S_2 \). But then by (b), \( \mathscr S_1 \subseteq \mathscr S_2 \) and \( \mathscr S_2 \subseteq \mathscr S_1\), so \( \mathscr S_1 = \mathscr S_2 \).

If \(A\) is a subset of \(S\) then \(\sigma\{A\} = \{\emptyset, A, A^c, S\}\).

Proof

Let \( \mathscr S = \{\emptyset, A, A^c, S\} \). Clearly \( \mathscr S \) is an algebra: \( A \) and \( A^c \) are complements of each other, as are \( \emptyset \) and \( S \). Also, \begin{align*} &A \cup A^c = A \cup S = A^c \cup S = S \cup S = \emptyset \cup S = S \\ &A \cup \emptyset = A \cup A = A \\ &A^c \cup \emptyset = A^c \cup A^c = A^c \\ &\emptyset \cup \emptyset = \emptyset \end{align*} Since \( \mathscr S \) is finite, it is a \( \sigma \)-algebra by the result above for finite algebras. Next, \( A \in \mathscr S \). Conversely, if \( \mathscr T \) is a \( \sigma \)-algebra and \( A \in \mathscr T \) then of course \( \emptyset, S, A^c \in \mathscr T \) so \( \mathscr S \subseteq \mathscr T \). Hence \( \mathscr S = \sigma\{A\} \).
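When \( S \) is finite, \( \sigma(\mathscr B) \) can be computed by repeatedly closing under complement and union until nothing new appears. The Python sketch below is illustrative only. It does not follow the intersection definition; it relies on the fact that a finite algebra is a \( \sigma \)-algebra. The name `generated_sigma_algebra` is an assumption, and the generators are assumed to be subsets of the universe.

```python
def generated_sigma_algebra(universe, generators):
    """Smallest collection containing `generators` that is closed under
    complement and union. Intended for a FINITE universe only."""
    universe = frozenset(universe)
    # Including the universe handles the case of an empty generating collection.
    sets = {frozenset(g) for g in generators} | {universe}
    while True:
        new = {universe - a for a in sets}
        new |= {a | b for a in sets for b in sets}
        if new <= sets:          # nothing new: the collection is closed
            return sets
        sets |= new


S = {1, 2, 3, 4, 5}
A = {1, 2}
sigma_A = generated_sigma_algebra(S, [A])
print(sorted(sorted(a) for a in sigma_A))
```

For the single generator \( A = \{1, 2\} \) the result consists of the four sets \( \emptyset \), \( A \), \( A^c \), and \( S \), in agreement with the result just proved.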

We can generalize the previous result. Recall that a collection of subsets \( \mathscr{A} = \{A_i: i \in I\} \) is a partition of \( S \) if \( A_i \cap A_j = \emptyset \) for \( i, \; j \in I \) with \( i \ne j \), and \( \bigcup_{i \in I} A_i = S \).

Suppose that \( \mathscr{A} = \{A_i: i \in I\} \) is a countable partition of \( S \) into nonempty subsets. Then \( \sigma(\mathscr{A}) \) is the collection of all unions of sets in \( \mathscr{A} \). That is, \[ \sigma(\mathscr{A}) = \left\{ \bigcup_{j \in J} A_j: J \subseteq I \right\} \]

Proof

Let \( \mathscr S = \left\{ \bigcup_{j \in J} A_j: J \subseteq I \right\} \). Note that \( S \in \mathscr S \) since \( S = \bigcup_{i \in I} A_i \). Next, suppose that \( B \in \mathscr S \). Then \( B = \bigcup_{j \in J} A_j \) for some \( J \subseteq I \). But then \( B^c = \bigcup_{j \in J^c} A_j \), so \( B^c \in \mathscr S \). Next, suppose that \( B_k \in \mathscr S \) for \( k \in K \) where \( K \) is a countable index set. Then for each \( k \in K \) there exists \( J_k \subseteq I \) such that \( B_k = \bigcup_{j \in J_k} A_j \). But then \( \bigcup_{k \in K} B_k = \bigcup_{k \in K} \bigcup_{j \in J_k} A_j = \bigcup_{j \in J} A_j \) where \( J = \bigcup_{k \in K} J_k \). Hence \( \bigcup_{k \in K} B_k \in \mathscr S \). Therefore \( \mathscr S \) is a \( \sigma \)-algebra of subsets of \( S \). Trivially, \( \mathscr{A} \subseteq \mathscr S \). If \( \mathscr T \) is a \( \sigma \)-algebra of subsets of \( S \) and \( \mathscr{A} \subseteq \mathscr T \), then clearly \( \bigcup_{j \in J} A_j \in \mathscr T \) for every \( J \subseteq I \). Hence \( \mathscr S \subseteq \mathscr T\).

A \( \sigma \)-algebra of this form is said to be generated by a countable partition. Note that since \( A_i \ne \emptyset \) for \( i \in I \), the representation of a set in \( \sigma(\mathscr{A}) \) as a union of sets in \( \mathscr{A} \) is unique. That is, if \( J, \, K \subseteq I \) and \( J \ne K \) then \( \bigcup_{j \in J} A_j \ne \bigcup_{k \in K} A_k \). In particular, if there are \( n \) nonempty sets in \( \mathscr{A} \), so that \( \#(I) = n \), then there are \( 2^n \) subsets of \( I \) and hence \( 2^n \) sets in \( \sigma(\mathscr{A}) \).
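For a finite partition, the description of \( \sigma(\mathscr{A}) \) as the collection of all unions can be enumerated directly. Here is a small Python sketch under that assumption; `sigma_from_partition` is a hypothetical helper name, and the blocks are assumed to be nonempty and pairwise disjoint.

```python
from itertools import chain, combinations


def sigma_from_partition(blocks):
    """All unions of sub-collections of a finite partition."""
    blocks = [frozenset(b) for b in blocks]
    # Every subset J of the index set {0, ..., n-1}, as a tuple of indices.
    index_subsets = chain.from_iterable(
        combinations(range(len(blocks)), r) for r in range(len(blocks) + 1)
    )
    return {frozenset().union(*(blocks[j] for j in J)) for J in index_subsets}


partition = [{1}, {2, 3}, {4, 5, 6}]
sigma = sigma_from_partition(partition)
print(len(sigma))  # prints: 8, that is, 2**3 distinct unions
```

The count \( 2^3 = 8 \) reflects the uniqueness of the representation noted above: distinct index sets give distinct unions because the blocks are nonempty.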

Suppose now that \( \mathscr{A} = \{A_1, A_2, \ldots, A_n\} \) is a collection of \(n\) subsets of \(S\) (not necessarily disjoint). To describe the \( \sigma \)-algebra generated by \( \mathscr{A} \) we need a bit more notation. For \( x = (x_1, x_2, \ldots, x_n) \in \{0, 1\}^n \) (a bit string of length \( n \)), let \( B_x = \bigcap_{i=1}^n A_i^{x_i} \) where \( A_i^1 = A_i \) and \( A_i^0 = A_i^c \).

In the setting above,

\( \mathscr B = \{B_x: x \in \{0, 1\}^n\} \) partitions \( S \).
\( A_i = \bigcup\left\{B_x: x \in \{0, 1\}^n, \; x_i = 1\right\}\) for \(i \in \{1, 2, \ldots, n\}\).
\(\sigma(\mathscr{A}) = \sigma(\mathscr B) = \left\{\bigcup_{x \in J} B_x: J \subseteq \{0, 1\}^n\right\}\).

Proof

Suppose that \( x, \; y \in \{0, 1\}^n \) and that \( x \ne y \). Without loss of generality we can suppose that for some \( j \in \{1, 2, \ldots, n\} \), \(x_j = 0 \) while \( y_j = 1 \). Then \( B_x \subseteq A_j^c \) and \( B_y \subseteq A_j \) so \( B_x \) and \( B_y \) are disjoint. Suppose that \( s \in S \). Construct \( x \in \{0, 1\}^n \) by \( x_i = 1 \) if \( s \in A_i \) and \( x_i = 0 \) if \( s \notin A_i \), for each \( i \in \{1, 2, \ldots, n\} \). Then by definition, \( s \in B_x \). Hence \( \mathscr B \) partitions \( S \).
Fix \( i \in \{1, 2, \ldots, n\}\). Again if \( x \in \{0, 1\}^n \) and \( x_i = 1 \) then \( B_x \subseteq A_i \). Hence \(\bigcup\left\{B_x: x \in \{0, 1\}^n, \; x_i = 1\right\} \subseteq A_i\). Conversely, suppose \( s \in A_i \). Define \( y \in \{0, 1\}^n \) by \( y_j = 1 \) if \( s \in A_j \) and \( y_j = 0 \) if \( s \notin A_j \) for each \( j \in \{1, 2, \ldots, n\} \). Then \( y_i = 1 \) and \( s \in B_y \). Hence \( s \in \bigcup\left\{B_x: x \in \{0, 1\}^n, \; x_i = 1\right\}\).
Clearly, every \( \sigma \)-algebra of subsets of \( S \) that contains \( \mathscr{A} \) must also contain \( \mathscr B \), and every \( \sigma \)-algebra of subsets of \( S \) that contains \( \mathscr B \) must also contain \( \mathscr{A} \). It follows that \( \sigma(\mathscr{A}) = \sigma(\mathscr B) \). The characterization in terms of unions now follows from the previous result.

Recall that there are \( 2^n \) bit strings of length \( n \). The sets in \( \mathscr{A} \) are said to be in general position if the sets in \( \mathscr B \) are distinct (and hence there are \( 2^n \) of them) and are nonempty. In this case, there are \( 2^{2^n} \) sets in \( \sigma(\mathscr{A}) \).
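The sets \( B_x \) are easy to compute when \( S \) is finite. The following Python sketch is an illustration, not part of the original text; the helper name `atoms` and the sample sets are assumptions. It builds \( B_x \) for every bit string \( x \) and tests for general position.

```python
from itertools import product


def atoms(universe, sets):
    """Map each bit string x to B_x, the intersection over i of
    A_i (when x_i = 1) or the complement of A_i (when x_i = 0)."""
    universe = frozenset(universe)
    sets = [frozenset(a) for a in sets]
    result = {}
    for x in product((0, 1), repeat=len(sets)):
        b = universe
        for a, bit in zip(sets, x):
            b &= a if bit == 1 else universe - a
        result[x] = b
    return result


S = set(range(1, 9))
A1, A2 = {1, 2, 3, 4}, {3, 4, 5, 6}
B = atoms(S, [A1, A2])
for x, b in B.items():
    print(x, sorted(b))

# General position: all B_x nonempty and distinct.
in_general_position = all(B.values()) and len(set(B.values())) == len(B)
print(in_general_position)  # prints: True
```

In this example the four sets \( B_x \) are \( \{7, 8\} \), \( \{5, 6\} \), \( \{1, 2\} \), and \( \{3, 4\} \), so \( A_1 \) and \( A_2 \) are in general position and \( \sigma\{A_1, A_2\} \) has \( 2^{2^2} = 16 \) sets.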

Open the Venn diagram app. This app shows two subsets \(A\) and \(B\) of \(S\) in general position, and lists the 16 sets in \( \sigma\{A, B\} \).

Select each of the 4 sets that partition \( S \): \( A \cap B \), \( A \cap B^c \), \( A^c \cap B \), \( A^c \cap B^c \).
Select each of the other 12 sets in \(\sigma\{A, B\}\) and note how each is a union of some of the sets in (a).

Sketch a Venn diagram with sets \( A_1, \, A_2, \, A_3 \) in general position. Identify the set \( B_x \) for each \( x \in \{0, 1\}^3 \).

If a \( \sigma \)-algebra is generated by a collection of basic sets, then each set in the \( \sigma \)-algebra is generated by a countable number of the basic sets.

Suppose that \( S \) is a set and \( \mathscr B \) a nonempty collection of subsets of \( S \). Then

\[ \sigma(\mathscr B) = \{A \subseteq S: A \in \sigma(\mathscr{C}) \text{ for some countable } \mathscr{C} \subseteq \mathscr B\} \]

Proof

Let \( \mathscr S \) denote the collection on the right. We first show that \( \mathscr S \) is a \( \sigma \)-algebra. First, pick \( B \in \mathscr B \), which we can do since \( \mathscr B \) is nonempty. Then \( S \in \sigma\{B\} \) so \( S \in \mathscr S \). Let \( A \in \mathscr S \) so that \( A \in \sigma(\mathscr{C}) \) for some countable \( \mathscr{C} \subseteq \mathscr B \). Then \( A^c \in \sigma(\mathscr{C}) \) so \( A^c \in \mathscr S \). Finally, suppose that \( A_i \in \mathscr S \) for \( i \) in a countable index set \( I \). Then for each \( i \in I \), there exists a countable \( \mathscr{C}_i \subseteq \mathscr B \) such that \( A_i \in \sigma(\mathscr{C}_i) \). But then \( \bigcup_{i \in I} \mathscr{C}_i \) is also countable and \( \bigcup_{i \in I} A_i \in \sigma\left(\bigcup_{i \in I} \mathscr{C}_i \right) \). Hence \( \bigcup_{i \in I} A_i \in \mathscr S \).

Next if \( B \in \mathscr B \) then \( B \in \sigma\{B\} \) so \( B \in \mathscr S \). Hence \( \sigma(\mathscr B) \subseteq \mathscr S \). Conversely, if \( A \in \sigma(\mathscr{C}) \) for some countable \( \mathscr{C} \subseteq \mathscr B \) then trivially \( A \in \sigma(\mathscr B) \).

A \( \sigma \)-algebra on a set naturally leads to a \( \sigma \)-algebra on a subset.

Suppose that \((S, \mathscr S)\) is a measurable space, and that \(R \subseteq S\). Let \(\mathscr{R} = \{A \cap R: A \in \mathscr S\}\). Then

\( \mathscr{R} \) is a \(\sigma\)-algebra of subsets of \(R\).
If \(R \in \mathscr S\) then \(\mathscr{R} = \{B \in \mathscr S: B \subseteq R\}\).

Proof

First, \( S \in \mathscr S \) and \( S \cap R = R \) so \( R \in \mathscr{R} \). Next suppose that \( B \in \mathscr{R} \). Then there exists \( A \in \mathscr S \) such that \( B = A \cap R \). But then \( A^c \in \mathscr S \) and \( R \setminus B = R \cap B^c = R \cap A^c \), so \( R \setminus B \in \mathscr{R} \). Finally, suppose that \( B_i \in \mathscr{R} \) for \( i \) in a countable index set \( I \). For each \( i \in I \) there exists \( A_i \in \mathscr S \) such that \( B_i = A_i \cap R \). But then \( \bigcup_{i \in I} A_i \in \mathscr S \) and \( \bigcup_{i \in I} B_i = \left(\bigcup_{i \in I} A_i \right) \cap R \), so \( \bigcup_{i \in I} B_i \in \mathscr{R} \).
Suppose that \( R \in \mathscr S \). Then \( A \cap R \in \mathscr S \) for every \( A \in \mathscr S \), and of course, \( A \cap R \subseteq R \). Conversely, if \( B \in \mathscr S \) and \( B \subseteq R \) then \( B = B \cap R \) so \( B \in \mathscr{R} \).
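A small finite example shows why part (b) needs the hypothesis \( R \in \mathscr S \). The Python sketch below is illustrative; the function name `trace` and the sample sets are assumptions.

```python
def trace(collection, R):
    """Return the collection of all intersections A & R for A in `collection`."""
    R = frozenset(R)
    return {frozenset(a) & R for a in collection}


S = frozenset({1, 2, 3, 4})
sigma = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), S}

R1 = {1, 2}   # R1 belongs to sigma
R2 = {2, 3}   # R2 does not belong to sigma
print(sorted(sorted(b) for b in trace(sigma, R1)))
print(sorted(sorted(b) for b in trace(sigma, R2)))
```

For \( R_1 = \{1, 2\} \) the result is \( \{\emptyset, \{1, 2\}\} \), which is exactly the collection of sets in \( \mathscr S \) contained in \( R_1 \). For \( R_2 = \{2, 3\} \) the result is the full power set of \( R_2 \), while the only set in \( \mathscr S \) contained in \( R_2 \) is \( \emptyset \).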

The \( \sigma \)-algebra \(\mathscr{R}\) is the \(\sigma\)-algebra on \(R\) induced by \(\mathscr S\). The following construction is useful for counterexamples. Compare this example with the one for finite and co-finite sets.

Let \( S \) be a nonempty set. The collection of countable and co-countable subsets of \( S \) is \[ \mathscr{C} = \{A \subseteq S: A \text{ is countable or } A^c \text{ is countable}\} \]

\( \mathscr{C} \) is a \( \sigma \)-algebra
\( \mathscr{C} = \sigma\{\{x\}: x \in S\} \), the \( \sigma \)-algebra generated by the singleton sets.

Proof

First, \( S \in \mathscr{C} \) since \( S^c = \emptyset \) is countable. If \( A \in \mathscr{C} \) then \( A^c \in \mathscr{C} \) by the symmetry of the definition. Suppose that \( A_i \in \mathscr{C} \) for each \( i \) in a countable index set \( I \). If \( A_i \) is countable for each \( i \in I \) then \( \bigcup_{i \in I} A_i \) is countable. If \( A_j^c \) is countable for some \( j \in I \) then \( \left(\bigcup_{i \in I} A_i \right)^c = \bigcap_{i \in I} A_i^c \subseteq A_j^c \) is countable. In either case, \( \bigcup_{i \in I} A_i \in \mathscr{C} \).
Let \( \mathscr{D} = \sigma\{\{x\}: x \in S\} \). Clearly \( \{x\} \in \mathscr{C} \) for \( x \in S \). Hence \( \mathscr{D} \subseteq \mathscr{C} \). Conversely, suppose that \( A \in \mathscr{C} \). If \( A \) is countable, then \( A = \bigcup_{x \in A} \{x\} \in \mathscr{D} \). If \( A^c \) is countable, then by an identical argument, \( A^c \in \mathscr{D} \) and hence \( A \in \mathscr{D} \).

Of course, if \( S \) is itself countable then \( \mathscr{C} = \mathscr{P}(S) \). On the other hand, if \( S \) is uncountable, then there exists \( A \subseteq S \) such that \( A \) and \( A^c \) are uncountable. Thus, \( A \notin \mathscr{C} \), but \( A = \bigcup_{x \in A} \{x\} \), and of course \( \{x\} \in \mathscr{C} \). Thus, we have an example of a \( \sigma \)-algebra that is not closed under general unions.

Topology and Measure

One of the most important ways to generate a \( \sigma \)-algebra is by means of topology. Recall that a topological space consists of a set \( S \) and a topology \(\mathscr S\), the collection of open subsets of \( S \). Most spaces that occur in probability and stochastic processes are topological spaces, so it's crucial that the topological and measure-theoretic structures are compatible.

Suppose that \( (S, \mathscr S) \) is a topological space. Then \( \sigma(\mathscr S) \) is the Borel \( \sigma \)-algebra on \(S\), and \((S, \sigma(\mathscr S))\) is a Borel measurable space.

So the Borel \( \sigma \)-algebra on \( S \), named for Émile Borel, is generated by the open subsets of \( S \). Thus, a topological space \( (S, \mathscr S) \) naturally leads to a measurable space \( (S, \sigma(\mathscr S))\). Since a closed set is simply the complement of an open set, the Borel \( \sigma \)-algebra contains the closed sets as well (and in fact is generated by the closed sets). Here are some other sets that are in the Borel \(\sigma\)-algebra:

Suppose again that \((S, \mathscr S)\) is a topological space and that \(I\) is a countable index set.

If \(A_i\) is open for each \(i \in I\) then \(\bigcap_{i \in I} A_i \in \sigma(\mathscr S)\). Such sets are called \(G_\delta\) sets.
If \(A_i\) is closed for each \(i \in I\) then \(\bigcup_{i \in I} A_i \in \sigma(\mathscr S)\). Such sets are called \(F_\sigma\) sets.
If \((S, \mathscr S)\) is Hausdorff then \(\{x\} \in \sigma(\mathscr S)\) for every \(x \in S\).

Proof

This follows directly from the closure property for intersections.
This follows from the definition.
This follows since \(\{x\}\) is closed for each \(x \in S\) if the topology is Hausdorff.

In terms of part (c), recall that a topological space is Hausdorff, named for Felix Hausdorff, if the topology can distinguish individual points. Specifically, if \(x, \, y \in S\) are distinct then there exist disjoint open sets \(U, \, V\) with \(x \in U\) and \(y \in V\). This is a very basic property possessed by almost all topological spaces that occur in applications. A simple corollary of (c) is that if the topological space \((S, \mathscr S)\) is Hausdorff then \(A \in \sigma(\mathscr S)\) for every countable \(A \subseteq S\).

Let's note the extreme cases. If \( S \) has the discrete topology \( \mathscr{P}(S) \), so that every set is open (and closed), then of course the Borel \( \sigma \)-algebra is also \( \mathscr{P}(S) \). As noted above, this is often the appropriate \( \sigma \)-algebra if \( S \) is countable, but is often too large if \( S \) is uncountable. If \(S\) has the trivial topology \(\{S, \emptyset\}\), then the Borel \(\sigma\)-algebra is also \(\{S, \emptyset\}\), and so is also trivial.

Recall that a base for a topological space \( (S, \mathscr T) \) is a collection \( \mathscr B \subseteq \mathscr T \) with the property that every set in \(\mathscr T\) is a union of a collection of sets in \( \mathscr B \). In short, every open set is a union of some of the basic open sets.

Suppose that \( (S, \mathscr S) \) is a topological space with a countable base \( \mathscr B \). Then \( \sigma(\mathscr B) = \sigma(\mathscr S) \).

Proof

Since \( \mathscr B \subseteq \mathscr S \) it follows trivially that \( \sigma(\mathscr B) \subseteq \sigma(\mathscr S) \). Conversely, if \( U \in \mathscr S \), there exists a collection of sets in \( \mathscr B \) whose union is \( U \). Since \( \mathscr B \) is countable, this is a countable union, so \( U \in \sigma(\mathscr B) \). Hence \( \mathscr S \subseteq \sigma(\mathscr B) \) and therefore \( \sigma(\mathscr S) \subseteq \sigma(\mathscr B) \).

The topological spaces that occur in probability and stochastic processes are usually assumed to have a countable base (along with other nice properties such as the Hausdorff property and local compactness). The \( \sigma \)-algebra used for such a space is usually the Borel \( \sigma \)-algebra, which by the previous result, is countably generated.

Measurable Functions

Recall that a set usually comes with a \(\sigma\)-algebra of admissible subsets. A natural requirement on a function is that the inverse image of an admissible set in the range space be admissible in the domain space. Here is the formal definition.

Suppose that \( (S, \mathscr S) \) and \( (T, \mathscr T) \) are measurable spaces. A function \( f: S \to T \) is measurable if \( f^{-1}(A) \in \mathscr S \) for every \( A \in \mathscr T \).

If the \( \sigma \)-algebra in the range space is generated by a collection of basic sets, then to check the measurability of a function, we need only consider inverse images of basic sets:

Suppose again that \( (S, \mathscr S) \) and \( (T, \mathscr T) \) are measurable spaces, and that \( \mathscr T = \sigma(\mathscr B) \) for a collection of subsets \( \mathscr B \) of \( T \). Then \( f: S \to T \) is measurable if and only if \( f^{-1}(B) \in \mathscr S \) for every \( B \in \mathscr B \).

Proof

First \( \mathscr B \subseteq \mathscr T \), so if \( f: S \to T \) is measurable then the condition in the theorem trivially holds. Conversely, suppose that the condition in the theorem holds, and let \( \mathscr{U} = \{A \in \mathscr T: f^{-1}(A) \in \mathscr S\} \). Then \( T \in \mathscr{U} \) since \( f^{-1}(T) = S \in \mathscr S \). If \( A \in \mathscr{U} \) then \( f^{-1}(A^c) = \left[f^{-1}(A)\right]^c \in \mathscr S \), so \( A^c \in \mathscr{U} \). If \( A_i \in \mathscr{U} \) for \( i \) in a countable index set \( I \), then \( f^{-1}\left(\bigcup_{i \in I} A_i\right) = \bigcup_{i \in I} f^{-1}(A_i) \in \mathscr S \), and hence \( \bigcup_{i \in I} A_i \in \mathscr{U} \). Thus \( \mathscr{U} \) is a \( \sigma \)-algebra of subsets of \( T \). But \( \mathscr B \subseteq \mathscr{U} \) by assumption, so \( \mathscr T = \sigma(\mathscr B) \subseteq \mathscr{U} \). Of course \( \mathscr{U} \subseteq \mathscr T \) by definition, so \( \mathscr{U} = \mathscr T \) and hence \( f \) is measurable.
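On finite spaces, the theorem means that a measurability check only needs to look at the generating sets. The following Python sketch illustrates this; the names `preimage` and `is_measurable`, and the particular spaces and functions, are assumptions made for the example.

```python
def preimage(f, domain, target_set):
    """Return the inverse image of `target_set` under f, as a frozenset."""
    return frozenset(s for s in domain if f(s) in target_set)


def is_measurable(f, domain, sigma_domain, sets_to_check):
    """True if the inverse image of every set in `sets_to_check`
    belongs to `sigma_domain`."""
    return all(preimage(f, domain, b) in sigma_domain for b in sets_to_check)


S = frozenset({1, 2, 3, 4})
sigma_S = {frozenset(), frozenset({1, 2}), frozenset({3, 4}), S}

# Target T = {'a', 'b', 'c'} with the sigma-algebra generated by the set {'a'}.
generators_T = [frozenset({'a'})]

f = {1: 'a', 2: 'a', 3: 'b', 4: 'c'}.get   # constant on {1, 2}
g = {1: 'a', 2: 'b', 3: 'b', 4: 'c'}.get   # separates 1 from 2
print(is_measurable(f, S, sigma_S, generators_T))  # prints: True
print(is_measurable(g, S, sigma_S, generators_T))  # prints: False
```

Here \( f^{-1}\{a\} = \{1, 2\} \in \mathscr S \), while \( g^{-1}\{a\} = \{1\} \notin \mathscr S \), so checking the single generator already settles both cases.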

If you have reviewed the section on topology then you may have noticed a striking parallel between the definition of continuity for functions on topological spaces and the definition of measurability for functions on measurable spaces: A function from one topological space to another is continuous if the inverse image of an open set in the range space is open in the domain space. A function from one measurable space to another is measurable if the inverse image of a measurable set in the range space is measurable in the domain space. If we start with topological spaces, which we often do, and use the Borel \( \sigma \)-algebras to get measurable spaces, then we get the following (hardly surprising) connection.

Suppose that \( (S, \mathscr S) \) and \( (T, \mathscr T) \) are topological spaces, and that we give \( S \) and \( T \) the Borel \( \sigma \)-algebras \( \sigma(\mathscr S) \) and \( \sigma(\mathscr T) \) respectively. If \( f: S \to T \) is continuous, then \( f \) is measurable.

Proof

If \( V \in \mathscr T \) then \( f^{-1}(V) \in \mathscr S \subseteq \sigma(\mathscr S) \). Hence \( f \) is measurable by the previous theorem.

Measurability is preserved under composition, the most important method for combining functions.

Suppose that \((R, \mathscr{R})\), \((S, \mathscr S)\), and \((T, \mathscr T)\) are measurable spaces. If \(f: R \to S\) is measurable and \(g: S \to T\) is measurable, then \(g \circ f: R \to T\) is measurable.

Proof

If \( A \in \mathscr T \) then \( g^{-1}(A) \in \mathscr S \) since \( g \) is measurable, and hence \( (g \circ f)^{-1}(A) = f^{-1}\left[g^{-1}(A)\right] \in \mathscr{R} \) since \( f \) is measurable.
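The identity \( (g \circ f)^{-1}(A) = f^{-1}\left[g^{-1}(A)\right] \) used in the proof can be checked numerically on small finite sets. This Python sketch is only an illustration; the sets and functions are arbitrary choices, not taken from the text.

```python
def preimage(h, domain, target_set):
    """Return the inverse image of `target_set` under h, as a frozenset."""
    return frozenset(x for x in domain if h(x) in target_set)


R = range(-3, 4)           # domain of f
S = range(0, 10)           # domain of g; contains every value of f on R
f = lambda r: r * r        # f: R -> S
g = lambda s: s % 3        # g: S -> T = {0, 1, 2}
A = {0}

lhs = preimage(lambda r: g(f(r)), R, A)      # inverse image under g o f
rhs = preimage(f, R, preimage(g, S, A))      # f^{-1} of g^{-1}(A)
print(lhs == rhs)  # prints: True
```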

If \( T \) is given the smallest possible \( \sigma \)-algebra or if \( S \) is given the largest one, then any function from \( S \) into \( T \) is measurable.

Every function \( f: S \to T \) is measurable in each of the following cases:

\( \mathscr T = \{\emptyset, T\} \) and \( \mathscr S \) is an arbitrary \( \sigma \)-algebra of subsets of \( S \)
\( \mathscr S = \mathscr{P}(S) \) and \( \mathscr T \) is an arbitrary \( \sigma \)-algebra of subsets of \( T \).

Proof

Suppose that \( \mathscr T = \{\emptyset, T\} \) and that \( \mathscr S \) is an arbitrary \( \sigma \)-algebra on \( S \). If \( f: S \to T \), then \( f^{-1}(T) = S \in \mathscr S \) and \( f^{-1}(\emptyset) = \emptyset \in \mathscr S \) so \( f \) is measurable.
Suppose that \( \mathscr S = \mathscr{P}(S) \) and that \( \mathscr T \) is an arbitrary \( \sigma \)-algebra on \( T \). If \( f: S \to T \), then trivially \( f^{-1}(A) \in \mathscr S \) for every \( A \in \mathscr T \) so \( f \) is measurable.

When there are several \( \sigma \)-algebras for the same set, then we use the phrase with respect to so that we can be precise. If a function is measurable with respect to a given \( \sigma \)-algebra on its domain, then it's measurable with respect to any larger \( \sigma \)-algebra on this space. If the function is measurable with respect to a \( \sigma \)-algebra on the range space then it's measurable with respect to any smaller \( \sigma \)-algebra on this space.

Suppose that \( S \) has \( \sigma \)-algebras \( \mathscr{R} \) and \( \mathscr S \) with \( \mathscr{R} \subseteq \mathscr S \), and that \( T \) has \( \sigma \)-algebras \( \mathscr T \) and \( \mathscr{U} \) with \( \mathscr T \subseteq \mathscr{U} \). If \( f: S \to T \) is measurable with respect to \( \mathscr{R} \) and \( \mathscr{U} \), then \( f \) is measurable with respect to \( \mathscr S \) and \( \mathscr T \).

Proof

If \( A \in \mathscr T \) then \( A \in \mathscr{U} \). Hence \( f^{-1}(A) \in \mathscr{R} \) so \( f^{-1}(A) \in \mathscr S \).

The following construction is particularly important in probability theory:

Suppose that \( S \) is a set and \( (T, \mathscr T) \) is a measurable space. Suppose also that \(f: S \to T\) and define \(\sigma(f) = \left\{f^{-1}(A): A \in \mathscr T\right\}\). Then

\( \sigma(f) \) is a \(\sigma\)-algebra on \(S\).
\( \sigma(f) \) is the smallest \( \sigma \)-algebra on \( S \) that makes \( f \) measurable.

Proof

The key to the proof is that the inverse image preserves all set operations. First, \( S \in \sigma(f) \) since \( T \in \mathscr T \) and \( f^{-1}(T) = S \). If \( B \in \sigma(f) \) then \( B = f^{-1}(A) \) for some \( A \in \mathscr T \). But then \( A^c \in \mathscr T \) and hence \( B^c = f^{-1}(A^c) \in \sigma(f) \). Finally, suppose that \( B_i \in \sigma(f) \) for \( i \) in a countable index set \( I \). Then for each \( i \in I \) there exists \( A_i \in \mathscr T \) such that \( B_i = f^{-1}(A_i) \). But then \( \bigcup_{i \in I} A_i \in \mathscr T \) and \( \bigcup_{i \in I} B_i = f^{-1}\left(\bigcup_{i \in I} A_i \right) \). Hence \( \bigcup_{i \in I} B_i \in \sigma(f) \).
If \( \mathscr S \) is a \( \sigma \)-algebra on \( S \) and \( f \) is measurable with respect to \( \mathscr S \) and \( \mathscr T \), then by definition \( f^{-1}(A) \in \mathscr S \) for every \( A \in \mathscr T \), so \( \sigma(f) \subseteq \mathscr S \).
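For a finite set, \( \sigma(f) \) can be computed directly from the definition. The sketch below is only an illustration under assumed data: the function \( f(x) = x \bmod 3 \) on \( \{1, \ldots, 6\} \) and the helper names `sigma_of` and `is_sigma_algebra` are not from the text.

```python
from itertools import combinations

def power_set(s):
    """All subsets of the finite set s, as frozensets."""
    items = list(s)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def sigma_of(f, domain, sigma_cod):
    """The collection of inverse images f^{-1}(A) for A in sigma_cod."""
    return {frozenset(x for x in domain if f[x] in A) for A in sigma_cod}

def is_sigma_algebra(universe, sets):
    """Finite check: contains the universe, closed under complement and union."""
    universe = frozenset(universe)
    return (universe in sets
            and all(universe - A in sets for A in sets)
            and all(A | B in sets for A in sets for B in sets))

S = frozenset(range(1, 7))
T = frozenset({0, 1, 2})
f = {x: x % 3 for x in S}          # assumed example function

sigma_f = sigma_of(f, S, power_set(T))
print(len(sigma_f), is_sigma_algebra(S, sigma_f))  # 8 True
```

The eight sets are the unions of the level sets \( \{3, 6\} \), \( \{1, 4\} \) and \( \{2, 5\} \). A set such as \( \{1\} \) splits a level set of \( f \) and so is not in \( \sigma(f) \).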

Appropriately enough, \( \sigma(f) \) is called the \(\sigma\)-algebra generated by \(f\). Often, \( S \) will have a given \( \sigma \)-algebra \( \mathscr S \) and \( f \) will be measurable with respect to \( \mathscr S \) and \( \mathscr T \). In this case, \( \sigma(f) \subseteq \mathscr S \). We can generalize to an arbitrary collection of functions on \( S \).

Suppose \( S \) is a set and that \((T_i, \mathscr T_i)\) is a measurable space for each \(i\) in a nonempty index set \(I\). Suppose also that \(f_i: S \to T_i\) for each \(i \in I\). The \(\sigma\)-algebra generated by this collection of functions is \[ \sigma\left\{f_i: i \in I\right\} = \sigma\left\{\sigma(f_i): i \in I\right\} = \sigma\left\{f_i^{-1}(A): i \in I, \, A \in \mathscr T_i\right\} \]

Again, this is the smallest \(\sigma\)-algebra on \(S\) that makes \(f_i\) measurable for each \(i \in I\).
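On a finite set, the \( \sigma \)-algebra generated by a collection of sets can be found by repeatedly closing under complements and pairwise unions until nothing new appears. The sketch below uses this to compare \( \sigma(f_1) \) with \( \sigma\{f_1, f_2\} \). The functions `f1`, `f2` and the helper `generate_sigma_algebra` are assumptions made for the example, and the closure loop is only valid for finite sets.

```python
def generate_sigma_algebra(universe, generators):
    """Smallest collection containing the generators that is closed under
    complement and union (sufficient for a sigma-algebra on a finite set)."""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(g) for g in generators}
    while True:
        closure = ({universe - A for A in sets}
                   | {A | B for A in sets for B in sets})
        if closure <= sets:
            return sets
        sets |= closure

def preimages(f, domain, sigma_cod):
    """All inverse images f^{-1}(A) for A in sigma_cod."""
    return {frozenset(x for x in domain if f[x] in A) for A in sigma_cod}

S = frozenset(range(4))
f1 = {x: x % 2 for x in S}    # assumed example: low bit of x
f2 = {x: x // 2 for x in S}   # assumed example: high bit of x
bits = {frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})}

sigma_f1 = generate_sigma_algebra(S, preimages(f1, S, bits))
sigma_both = generate_sigma_algebra(
    S, preimages(f1, S, bits) | preimages(f2, S, bits))
print(len(sigma_f1), len(sigma_both))  # 4 16
```

Taken together, `f1` and `f2` separate the points of `S`, so the generated \( \sigma \)-algebra is the full power set, while `f1` alone only distinguishes even from odd points.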

Product Sets

Product sets arise naturally in the form of the higher-dimensional Euclidean spaces \( \R^n \) for \( n \in \{2, 3, \ldots\} \). In addition, product spaces are particularly important in probability, where they are used to describe the spaces associated with sequences of random variables. More general product spaces arise in the study of stochastic processes. We start with the product of two sets; the generalization to products of \( n \) sets and to general products is straightforward, although the notation gets more complicated.

Suppose that \( (S, \mathscr S) \) and \( (T, \mathscr T) \) are measurable spaces. The product \( \sigma \)-algebra on \( S \times T \) is \[\mathscr S \otimes \mathscr T = \sigma\{A \times B: A \in \mathscr S, \; B \in \mathscr T\} \]

So the definition is natural: the product \( \sigma \)-algebra is generated by products of measurable sets. Our next goal is to consider the measurability of functions defined on, or mapping into, product spaces. Of basic importance are the projection functions. If \( S \) and \( T \) are sets, let \( p_1: S \times T \to S \) and \( p_2: S \times T \to T \) be defined by \( p_1(x, y) = x \) and \( p_2(x, y) = y \) for \( (x, y) \in S \times T \). Recall that \( p_1 \) is the projection onto the first coordinate and \( p_2 \) is the projection onto the second coordinate. The product \( \sigma \)-algebra is the smallest \( \sigma \)-algebra that makes the projections measurable:

Suppose again that \( (S, \mathscr S) \) and \( (T, \mathscr T) \) are measurable spaces. Then \( \mathscr S \otimes \mathscr T = \sigma\{p_1, p_2\} \).

Proof

If \( A \in \mathscr S \) then \( p_1^{-1}(A) = A \times T \in \mathscr S \otimes \mathscr T\). Similarly, if \( B \in \mathscr T \) then \( p_2^{-1}(B) = S \times B \in \mathscr S \otimes \mathscr T \). Hence \( p_1 \) and \( p_2 \) are measurable, so \( \sigma\{p_1, p_2\} \subseteq \mathscr S \otimes \mathscr T \). Conversely, if \( A \in \mathscr S \) and \( B \in \mathscr T \) then \( A \times B = p_1^{-1}(A) \cap p_2^{-1}(B) \in \sigma\{p_1, p_2\}\). Since sets of this form generate the product \( \sigma \)-algebra, we have \( \mathscr S \otimes \mathscr T \subseteq \sigma\{p_1, p_2\} \).
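The identity \( \mathscr S \otimes \mathscr T = \sigma\{p_1, p_2\} \) can be verified numerically for small finite spaces. In the sketch below, the spaces, their \( \sigma \)-algebras, and the helper names are assumptions chosen for illustration. The closure routine is a finite-set shortcut, not a general construction.

```python
from itertools import product

def generate_sigma_algebra(universe, generators):
    """Close the generators under complement and union (finite sets only)."""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(g) for g in generators}
    while True:
        closure = ({universe - A for A in sets}
                   | {A | B for A in sets for B in sets})
        if closure <= sets:
            return sets
        sets |= closure

def rectangle(A, B):
    """The product set A x B as a frozenset of pairs."""
    return frozenset(product(A, B))

S = frozenset({0, 1, 2})
T = frozenset({"a", "b"})
sigma_S = {frozenset(), frozenset({0}), frozenset({1, 2}), S}
sigma_T = {frozenset(), frozenset({"a"}), frozenset({"b"}), T}
ST = rectangle(S, T)

# Product sigma-algebra: generated by all measurable rectangles A x B.
rectangles = {rectangle(A, B) for A in sigma_S for B in sigma_T}
product_sigma = generate_sigma_algebra(ST, rectangles)

# sigma{p1, p2}: generated by p1^{-1}(A) = A x T and p2^{-1}(B) = S x B.
cylinders = ({rectangle(A, T) for A in sigma_S}
             | {rectangle(S, B) for B in sigma_T})
projection_sigma = generate_sigma_algebra(ST, cylinders)

print(product_sigma == projection_sigma, len(product_sigma))  # True 16
```

Both collections consist of the 16 unions of the four atoms \( \{0\} \times \{a\} \), \( \{0\} \times \{b\} \), \( \{1, 2\} \times \{a\} \) and \( \{1, 2\} \times \{b\} \).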

Projection functions make it easy to study functions mapping into a product space.

Suppose that \( (R, \mathscr{R}) \), \( (S, \mathscr S) \) and \( (T, \mathscr T) \) are measurable spaces, and that \( S \times T \) is given the product \( \sigma \)-algebra \( \mathscr S \otimes \mathscr T \). Suppose also that \( f: R \to S \times T \), so that \( f(x) = \left(f_1(x), f_2(x)\right) \) for \( x \in R \), where \( f_1: R \to S \) and \( f_2: R \to T \) are the coordinate functions. Then \( f \) is measurable if and only if \( f_1 \) and \( f_2 \) are measurable.

Proof

Note that \( f_1 = p_1 \circ f \) and \( f_2 = p_2 \circ f \). So if \( f \) is measurable then \( f_1 \) and \( f_2 \) are compositions of measurable functions, and hence are measurable. Conversely, suppose that \( f_1 \) and \( f_2 \) are measurable. If \( A \in \mathscr S \) and \( B \in \mathscr T \) then \( f^{-1}(A \times B) = f_1^{-1}(A) \cap f_2^{-1}(B) \in \mathscr{R} \). Since products of measurable sets generate \( \mathscr S \otimes \mathscr T \), it follows that \( f \) is measurable.

Our next goal is to consider cross sections of sets in a product space and cross sections of functions defined on a product space. It will help to introduce some new functions, which in a sense are complementary to the projection functions.

Suppose again that \( (S, \mathscr S) \) and \( (T, \mathscr T) \) are measurable spaces, and that \( S \times T \) is given the product \( \sigma \)-algebra \( \mathscr S \otimes \mathscr T \).

For \( x \in S \) the function \( 1_x : T \to S \times T \), defined by \( 1_x(y) = (x, y) \) for \( y \in T \), is measurable.
For \( y \in T \) the function \( 2_y: S \to S \times T \), defined by \( 2_y(x) = (x, y) \) for \( x \in S \), is measurable.

Proof

To show that the functions are measurable, it suffices to consider inverse images of products of measurable sets, since such sets generate \( \mathscr S \otimes \mathscr T \). Thus, let \( A \in \mathscr S \) and \( B \in \mathscr T \).

For \( x \in S \) note that \( 1_x^{-1}(A \times B) \) is \( B \) if \( x \in A \) and is \( \emptyset \) if \( x \notin A \). In either case, \( 1_x^{-1}(A \times B) \in \mathscr T \).
Similarly, for \( y \in T \) note that \( 2_y^{-1}(A \times B) \) is \( A \) if \( y \in B \) and is \( \emptyset \) if \( y \notin B \). In either case, \( 2_y^{-1}(A \times B) \in \mathscr S \).

Now our work is easy.

Suppose again that \( (S, \mathscr S) \) and \( (T, \mathscr T) \) are measurable spaces, and that \( C \in \mathscr S \otimes \mathscr T \). Then

For \( x \in S \), \( \{y \in T: (x, y) \in C\} \in \mathscr T \).
For \( y \in T \), \( \{x \in S: (x, y) \in C\} \in \mathscr S\).

Proof

These results follow immediately from the measurability of the functions \( 1_x \) and \( 2_y \):

For \( x \in S \), \( 1_x^{-1}(C) = \{y \in T: (x, y) \in C\} \).
For \( y \in T \), \( 2_y^{-1}(C) = \{x \in S: (x, y) \in C\} \).

The set in (a) is the cross section of \( C \) in the first coordinate at \( x \), and the set in (b) is the cross section of \( C \) in the second coordinate at \( y \). As a simple corollary to the theorem, note that if \( A \subseteq S \), \( B \subseteq T \) and \( A \times B \in \mathscr S \otimes \mathscr T \) then \( A \in \mathscr S \) and \( B \in \mathscr T \). That is, the only measurable product sets are products of measurable sets. Here is the measurability result for cross-sectional functions:

Suppose again that \( (S, \mathscr S) \) and \( (T, \mathscr T) \) are measurable spaces, and that \( S \times T \) is given the product \( \sigma \)-algebra \( \mathscr S \otimes \mathscr T \). Suppose also that \( (U, \mathscr{U}) \) is another measurable space, and that \( f: S \times T \to U \) is measurable. Then

The function \( y \mapsto f(x, y) \) from \( T \) to \( U \) is measurable for each \( x \in S \).
The function \( x \mapsto f(x, y) \) from \( S \) to \( U \) is measurable for each \( y \in T \).

Proof

Note that the function in (a) is just \( f \circ 1_x\), and the function in (b) is just \( f \circ 2_y \). Both are compositions of measurable functions, and hence are measurable.
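The cross-section result for sets can be checked exhaustively on a small finite product space. The following sketch is an illustration only: the spaces and the names `section_at_x`, `section_at_y` and `C_bad` are assumptions, and the closure routine works only for finite sets.

```python
from itertools import product

def generate_sigma_algebra(universe, generators):
    """Close the generators under complement and union (finite sets only)."""
    universe = frozenset(universe)
    sets = {frozenset(), universe} | {frozenset(g) for g in generators}
    while True:
        closure = ({universe - A for A in sets}
                   | {A | B for A in sets for B in sets})
        if closure <= sets:
            return sets
        sets |= closure

S = frozenset({0, 1, 2})
T = frozenset({"a", "b"})
sigma_S = {frozenset(), frozenset({0}), frozenset({1, 2}), S}
sigma_T = {frozenset(), frozenset({"a"}), frozenset({"b"}), T}
ST = frozenset(product(S, T))
rectangles = {frozenset(product(A, B)) for A in sigma_S for B in sigma_T}
product_sigma = generate_sigma_algebra(ST, rectangles)

def section_at_x(C, x):
    """Cross section of C in the first coordinate at x: {y : (x, y) in C}."""
    return frozenset(y for y in T if (x, y) in C)

def section_at_y(C, y):
    """Cross section of C in the second coordinate at y: {x : (x, y) in C}."""
    return frozenset(x for x in S if (x, y) in C)

# Every cross section of every measurable set is measurable.
all_sections_measurable = all(
    section_at_x(C, x) in sigma_T and section_at_y(C, y) in sigma_S
    for C in product_sigma for x in S for y in T)

# A set with a non-measurable cross section cannot be measurable.
C_bad = frozenset({(1, "a")})
print(all_sections_measurable,
      C_bad in product_sigma,
      section_at_y(C_bad, "a") in sigma_S)  # True False False
```

The set `C_bad` has the cross section \( \{1\} \) at \( y = a \), which is not in the chosen \( \sigma \)-algebra on \( S \), so by the contrapositive of the theorem it is not in the product \( \sigma \)-algebra.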

The results for products of two spaces generalize in a completely straightforward way to a product of \( n \) spaces.

Suppose \( n \in \N_+ \) and that \( (S_i, \mathscr S_i) \) is a measurable space for each \( i \in \{1, 2, \ldots, n\} \). The product \( \sigma \)-algebra on the Cartesian product set \( S_1 \times S_2 \times \cdots \times S_n \) is \[ \mathscr S_1 \otimes \mathscr S_2 \otimes \cdots \otimes \mathscr S_n = \sigma\left\{ A_1 \times A_2 \times \cdots \times A_n: A_i \in \mathscr S_i \text{ for all } i \in \{1, 2, \ldots, n\}\right\} \]

So again, the product \( \sigma \)-algebra is generated by products of measurable sets. Results analogous to the theorems above hold. In the special case that \( (S_i, \mathscr S_i) = (S, \mathscr S) \) for \( i \in \{1, 2, \ldots, n\} \), the Cartesian product becomes \( S^n \) and the corresponding product \( \sigma \)-algebra is denoted \( \mathscr S^n \). The notation is natural, but potentially confusing. Note that \( \mathscr S^n \) is not the Cartesian product of \( \mathscr S \) \( n \) times, but rather the \( \sigma \)-algebra generated by sets of the form \( A_1 \times A_2 \times \cdots \times A_n \) where \( A_i \in \mathscr S \) for \( i \in \{1, 2, \ldots, n\} \).

We can also extend these ideas to a general product. To recall the definition, suppose that \( S_i \) is a set for each \( i \) in a nonempty index set \( I \). The product set \( \prod_{i \in I} S_i \) consists of all functions \( x: I \to \bigcup_{i \in I} S_i \) such that \( x(i) \in S_i \) for each \( i \in I \). To make the notation look more like a simple Cartesian product, we often write \( x_i \) instead of \( x(i) \) for the value of a function in the product set at \( i \in I \). The next definition gives the appropriate \( \sigma \)-algebra for the product set.

Suppose that \( (S_i, \mathscr S_i) \) is a measurable space for each \(i \) in a nonempty index set \( I \). The product \( \sigma \)-algebra on the product set \( \prod_{i \in I} S_i \) is \[ \sigma\left\{\prod_{i \in I} A_i: A_i \in \mathscr S_i \text{ for each } i \in I \text{ and } A_i = S_i \text{ for all but finitely many } i \in I \right\}\]

If you have reviewed the section on topology, the definition should look familiar. If the spaces were topological spaces instead of measurable spaces, with \( \mathscr S_i \) the topology of \( S_i \) for \( i \in I \), then the set of products in the displayed expression above is a base for the product topology on \( \prod_{i \in I} S_i \).

The definition can also be understood in terms of projections. Recall that the projection onto coordinate \( j \in I \) is the function \( p_j: \prod_{i \in I} S_i \to S_j \) given by \( p_j(x) = x_j \). The product \( \sigma \)-algebra is the smallest \( \sigma \)-algebra on the product set that makes all of the projections measurable.

Suppose again that \( (S_i, \mathscr S_i)\) is a measurable space for each \( i \) in a nonempty index set \( I \), and let \( \mathfrak{S} \) denote the product \( \sigma \)-algebra on the product set \( S_I = \prod_{i \in I} S_i \). Then \(\mathfrak{S} = \sigma\{p_i: i \in I\} \).

Proof

Let \( j \in I \) and \( A \in \mathscr S_j \). Then \( p_j^{-1}(A) = \prod_{i \in I} A_i \) where \( A_i = S_i \) for \( i \ne j \) and \( A_j = A \). This set is in \( \mathfrak{S} \) so \( p_j \) is measurable. Hence \( \sigma\{p_i: i \in I\} \subseteq \mathfrak{S} \). For the other direction, consider a product set \( \prod_{i \in I} A_i \) where \( A_i = S_i \) except for \( i \in J \), where \( J \subseteq I \) is finite. Then \( \prod_{i \in I} A_i = \bigcap_{j \in J} p_j^{-1}(A_j) \). This set is in \( \sigma\{p_i: i \in I\} \). Product sets of this form generate \( \mathfrak{S} \) so it follows that \( \mathfrak{S} \subseteq \sigma\{p_i: i \in I\} \).

In the special case that \( (S, \mathscr S) \) is a fixed measurable space and \( (S_i, \mathscr S_i) = (S, \mathscr S) \) for all \( i \in I \), the product set \( \prod_{i \in I} S \) is just the collection of functions from \( I \) into \( S \), often denoted \( S^I \). The product \( \sigma \)-algebra is then denoted \( \mathscr S^I \), a notation that is natural, but again potentially confusing. Here is the main measurability result for a function mapping into a product space.

Suppose that \( (R, \mathscr{R}) \) is a measurable space, and that \( (S_i, \mathscr S_i) \) is a measurable space for each \( i \) in a nonempty index set \( I \). As before, let \(\prod_{i \in I} S_i \) have the product \( \sigma \)-algebra. Suppose now that \( f: R \to \prod_{i \in I} S_i \). For \( i \in I \) let \( f_i: R \to S_i \) denote the \( i \)th coordinate function of \( f \), so that \( f_i(x) = [f(x)]_i \) for \( x \in R \). Then \( f \) is measurable if and only if \( f_i \) is measurable for each \( i \in I \).

Proof

Suppose that \( f \) is measurable. For \( i \in I \) note that \( f_i = p_i \circ f \) is a composition of measurable functions, and hence is measurable. Conversely, suppose that \( f_i \) is measurable for each \( i \in I \). To show measurability of \( f \) we need only consider inverse images of sets that generate the product \( \sigma \)-algebra. Thus, suppose that \( A_j \in \mathscr S_j \) for \( j \) in a finite subset \( J \subseteq I \), and let \( A_i = S_i \) for \( i \in I - J \). Then \( f^{-1}\left(\prod_{i \in I} A_i\right) = \bigcap_{j \in J} f_j^{-1}(A_j) \). This set is in \( \mathscr{R} \) since the intersection is over a finite index set.

Just as with the product of two sets, cross-sectional sets and functions are measurable with respect to the product \( \sigma \)-algebra. Again, it's best to work with some special functions.

Suppose that \( (S_i, \mathscr S_i) \) is a measurable space for each \( i \) in an index set \( I \) with at least two elements. For \( j \in I \) and \( u \in S_j \), define the function \( j_u: \prod_{i \in I - \{j\}} S_i \to \prod_{i \in I} S_i \) by \( j_u(x) = y \) where \( y_i = x_i \) for \( i \ne j \) and \( y_j = u \). Then \( j_u \) is measurable with respect to the product \( \sigma \)-algebras.

Proof

Once again, it suffices to consider the inverse image of the sets that generate the product \( \sigma \)-algebra. So suppose \( A_i \in \mathscr S_i \) for \( i \in I \) with \( A_i = S_i \) for all but finitely many \( i \in I \). Then \( j_u^{-1}\left(\prod_{i \in I} A_i\right) = \prod_{i \in I - \{j\}} A_i \) if \( u \in A_j \), and the inverse image is \( \emptyset \) otherwise. In either case, \( j_u^{-1}\left(\prod_{i \in I} A_i\right) \) is in the product \( \sigma \)-algebra on \( \prod_{i \in I - \{j\}} S_i \).

In words, for \( j \in I \) and \( u \in S_j \), the function \( j_u \) takes a point in the product set \( \prod_{ i \in I - \{j\}} S_i \) and assigns \( u \) to coordinate \( j \) to give a point in \( \prod_{i \in I} S_i \). If \( A \subseteq \prod_{i \in I} S_i \), then \( j_u^{-1}(A) \) is the cross section of \( A \) in coordinate \( j \) at \( u \). So it follows immediately from the previous result that the cross sections of a measurable set are measurable. Cross sections of measurable functions are also measurable. Suppose that \( (T, \mathscr T) \) is another measurable space, and that \( f: \prod_{i \in I} S_i \to T \) is measurable. The cross section of \( f \) in coordinate \( j \in I \) at \( u \in S_j \) is simply \( f \circ j_u: S_{I - \{j\}} \to T\), a composition of measurable functions.
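The function \( j_u \) and the resulting cross sections are easy to model when the index set and the coordinate sets are finite. In the sketch below, a point of the product set is represented as a frozenset of (index, value) pairs. This representation, the example set `A`, and the names `product_set`, `insert_coordinate` and `cross_section` are assumptions made for illustration.

```python
from itertools import product

def product_set(spaces):
    """All points of the product of the sets spaces[i], i in the index set.
    Each point is a frozenset of (index, value) pairs."""
    indices = sorted(spaces)
    return {frozenset(zip(indices, values))
            for values in product(*(spaces[i] for i in indices))}

def insert_coordinate(j, u):
    """Return the map j_u, which assigns the value u to coordinate j."""
    def j_u(x):
        return x | {(j, u)}
    return j_u

def cross_section(A, j, u, spaces):
    """Inverse image of A under j_u: the cross section of A in coordinate j at u."""
    rest = {i: s for i, s in spaces.items() if i != j}
    j_u = insert_coordinate(j, u)
    return {x for x in product_set(rest) if j_u(x) in A}

spaces = {1: {0, 1}, 2: {0, 1}, 3: {0, 1}}

# Assumed example set: points whose coordinates sum to 2.
A = {x for x in product_set(spaces) if sum(v for _, v in x) == 2}

section = cross_section(A, 3, 1, spaces)
print(len(section))  # 2
```

Fixing the third coordinate at 1 leaves the points of \( \{0, 1\}^2 \) whose coordinates sum to 1, namely \( (0, 1) \) and \( (1, 0) \).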

However, a non-measurable set can have measurable cross sections, even in a product of two spaces.

Suppose that \( S \) is an uncountable set with the \( \sigma \)-algebra \( \mathscr{C} \) of countable and co-countable sets as in (21). Consider \( S \times S \) with the product \( \sigma \)-algebra \( \mathscr{C} \otimes \mathscr{C} \). Let \( D = \{(x, x): x \in S\}\), the diagonal of \( S \times S \). Then \( D \) has measurable cross sections, but \( D \) is not measurable.

Proof

For \( x \in S \), the cross section of \( D \) in the first coordinate at \( x \) is \( \{y \in S: (x, y) \in D\} = \{x\} \in \mathscr{C} \). Similarly, for \( y \in S \), the cross section of \( D \) in the second coordinate at \( y \) is \( \{x \in S: (x, y) \in D\} = \{ y\} \in \mathscr{C} \). But \( D \) cannot be generated by a countable collection of sets of the form \( A \times B \) with \( A, \, B \in \mathscr{C} \), so \( D \notin \mathscr{C} \otimes \mathscr{C} \), by the result above.
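The last step of the proof is brief, so here is one possible way to fill it in. This is a sketch of a standard argument, not necessarily the one the author has in mind. It assumes the fact that every set in a generated \( \sigma \)-algebra already lies in the \( \sigma \)-algebra generated by some countable subcollection of the generators, and the symbols \( K \), \( E \) and \( \mathscr{G} \) are introduced here only for the argument.

```latex
Suppose that $D \in \mathscr{C} \otimes \mathscr{C}$. Then there exist
$A_n, B_n \in \mathscr{C}$ for $n \in \mathbb{N}$ such that
\[ D \in \sigma\{A_n \times B_n : n \in \mathbb{N}\}. \]
Let $K$ be the union of those sets among $A_n, A_n^c, B_n, B_n^c$
($n \in \mathbb{N}$) that are countable. Then $K$ is countable, and each of
the sets $A_n$, $B_n$ is either contained in $K$ or contains $K^c$.
With $E = K^c \times K^c$ it follows that
\[ (A_n \times B_n) \cap E = (A_n \cap K^c) \times (B_n \cap K^c)
   \in \{\emptyset, E\}, \quad n \in \mathbb{N}. \]
The collection
\[ \mathscr{G} = \{G \subseteq S \times S : G \cap E \in \{\emptyset, E\}\} \]
is a $\sigma$-algebra (it contains $S \times S$ and is closed under
complements and countable unions) and contains each $A_n \times B_n$,
so $D \in \mathscr{G}$. But $K^c$ is uncountable, so there exist distinct
$x, y \in K^c$. Then $(x, x) \in D \cap E$ while $(x, y) \in E \setminus D$,
so $D \cap E$ is neither $\emptyset$ nor $E$, a contradiction.
```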

Special Cases

Most of the sets encountered in applied probability are either countable, or subsets of \(\R^n\) for some \(n\), or more generally, subsets of a product of a countable number of sets of these types. In the study of stochastic processes, various spaces of functions play an important role. In this subsection, we will explore the most important special cases.

Discrete Spaces

If \(S\) is countable and \(\mathscr S = \mathscr P(S)\) is the collection of all subsets of \(S\), then \((S, \mathscr S)\) is a discrete measurable space.

Thus if \((S, \mathscr S)\) is discrete, all subsets of \( S \) are measurable and every function from \( S \) to another measurable space is measurable. The power set is also the discrete topology on \( S \), so \( \mathscr S \) is a Borel \( \sigma \)-algebra as well. As a topological space, \( (S, \mathscr S) \) is complete, locally compact, Hausdorff, and since \( S \) is countable, separable. Moreover, the discrete topology corresponds to the discrete metric \( d \), defined by \( d(x, x) = 0 \) for \( x \in S \) and \( d(x, y) = 1 \) for \( x, \, y \in S \) with \( x \ne y \).

Euclidean Spaces

Recall that for \(n \in \N_+\), the Euclidean topology on \(\R^n\) is generated by the standard Euclidean metric \( d_n \) given by \[ d_n(\bs x, \bs y) = \sqrt{\sum_{i=1}^n (x_i - y_i)^2}, \quad \bs x = (x_1, x_2, \ldots, x_n), \, \bs y = (y_1, y_2, \ldots, y_n) \in \R^n \] With this topology, \( \R^n \) is complete, connected, locally compact, Hausdorff, and separable.

For \(n \in \N_+\), the \(n\)-dimensional Euclidean measurable space is \((\R^n, \mathscr R_n)\) where \(\mathscr R_n\) is the Borel \(\sigma\)-algebra corresponding to the standard Euclidean topology on \(\R^n\).

The one-dimensional case is particularly important. In this case, the standard Euclidean metric \( d \) is given by \( d(x, y) = \left|x - y\right| \) for \( x, \, y \in \R \). The Borel \(\sigma\)-algebra \(\mathscr R\) can be generated by various collections of intervals.

Each of the following collections generates \( \mathscr R \).

\( \mathscr B_1 = \{I \subseteq \R: I \text{ is an interval} \} \)
\( \mathscr B_2 = \{(a, b]: a, \, b \in \R, \; a \lt b \}\)
\( \mathscr B_3 = \{(-\infty, b]: b \in \R \} \)

Proof

The proof involves showing that each set in any one of the collections is in the \( \sigma \)-algebra of any other collection. Let \( \mathscr S_i = \sigma(\mathscr B_i) \) for \( i \in \{1, 2, 3\} \).

Clearly \( \mathscr B_2 \subseteq \mathscr B_1 \) and \( \mathscr B_3 \subseteq \mathscr B_1 \) so \( \mathscr S_2 \subseteq \mathscr S_1 \) and \( \mathscr S_3 \subseteq \mathscr S_1 \).
If \( a, \, b \in \R \) with \( a \le b \) then \( [a, b] = \bigcap_{n=1}^\infty \left(a - \frac{1}{n}, b\right] \) and \( (a, b) = \bigcup_{n=1}^\infty \left(a, b - \frac{1}{n}\right] \), so \( [a, b], \, (a, b) \in \mathscr S_2 \). Also \( [a, b) = \bigcup_{n=1}^\infty \left[a, b - \frac{1}{n}\right] \) so \( [a, b) \in \mathscr S_2 \). Thus all bounded intervals are in \( \mathscr S_2 \). Next, \( [a, \infty) = \bigcup_{n=1}^\infty [a, a + n) \), \( (a, \infty) = \bigcup_{n=1}^\infty (a, a + n) \), \( (-\infty, a] = \bigcup_{n=1}^\infty (a - n, a] \), and \( (-\infty, a) = \bigcup_{n=1}^\infty (a - n, a) \), so each of these intervals is in \( \mathscr S_2 \). Of course \( \R \in \mathscr S_2 \), so we now have that \( I \in \mathscr S_2 \) for every interval \( I \). Thus \( \mathscr S_1 \subseteq \mathscr S_2 \), and so from (a), \( \mathscr S_2 = \mathscr S_1\).
If \( a, \, b \in \R \) with \( a \lt b \) then \( (a, b] = (-\infty, b] - (-\infty, a] \) so \( (a, b] \in \mathscr S_3 \). Hence \( \mathscr S_2 \subseteq \mathscr S_3 \). But then from (a) and (b) it follows that \( \mathscr S_3 = \mathscr S_1 \).

Since the Euclidean topology has a countable base, \(\mathscr R\) is countably generated. In fact each collection of intervals above, but with endpoints restricted to \( \Q \), generates \(\mathscr R\). Moreover, \( \mathscr R \) can also be constructed from \( \sigma \)-algebras that are generated by countable partitions. First recall that for \( n \in \N \), the set of dyadic rationals (or binary rationals) of rank \( n \) or less is \( \D_n = \{j / 2^n: j \in \Z\} \). Note that \( \D_n \) is countable and \( \D_n \subseteq \D_{n+1} \) for \( n \in \N \). Moreover, the set \( \D = \bigcup_{n \in \N} \D_n \) of all dyadic rationals is dense in \( \R \). The dyadic rationals are often useful in various applications because \( \D_n \) has the natural ordered enumeration \( j \mapsto j / 2^n \) for each \( n \in \N \). Now let \[ \mathscr{D}_n = \left\{\left(j / 2^n, (j + 1) / 2^n\right]: j \in \Z\right\}, \quad n \in \N \] Then \( \mathscr{D}_n \) is a countable partition of \( \R \) into nonempty intervals of equal size \( 1 / 2^n \), so \( \mathscr{E}_n = \sigma(\mathscr{D}_n) \) consists of unions of sets in \( \mathscr{D}_n \) as described above. Every set in \( \mathscr{D}_{n} \) is the union of two sets in \( \mathscr{D}_{n+1} \) so clearly \( \mathscr{E}_n \subseteq \mathscr{E}_{n+1} \) for \( n \in \N \). Finally, the Borel \( \sigma \)-algebra on \( \R \) is \( \mathscr{R} = \sigma\left(\bigcup_{n=0}^\infty \mathscr{E}_n\right) = \sigma\left(\bigcup_{n=0}^\infty \mathscr{D}_n\right) \). This construction turns out to be useful in a number of settings.
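The dyadic partitions are simple to compute with. As a minimal sketch (the function name `dyadic_interval` and the sample point \( 1/3 \) are assumptions for illustration), the unique interval of \( \mathscr{D}_n \) containing a given point \( x \) has index \( j = \lceil 2^n x \rceil - 1 \), since \( j / 2^n \lt x \le (j + 1) / 2^n \) is equivalent to \( j \lt 2^n x \le j + 1 \). Exact rational arithmetic avoids rounding problems at the endpoints.

```python
from fractions import Fraction
from math import ceil

def dyadic_interval(x, n):
    """Endpoints (a, b) of the unique interval (a, b] in D_n that contains x."""
    x = Fraction(x)
    j = ceil(x * 2**n) - 1          # unique integer with j/2^n < x <= (j+1)/2^n
    return Fraction(j, 2**n), Fraction(j + 1, 2**n)

x = Fraction(1, 3)
intervals = [dyadic_interval(x, n) for n in range(4)]
for n, (a, b) in enumerate(intervals):
    print(n, a, b)
# 0 0 1
# 1 0 1/2
# 2 1/4 1/2
# 3 1/4 3/8
```

The intervals are nested and shrink to \( x \), which reflects the fact that each set in \( \mathscr{D}_n \) splits into two sets in \( \mathscr{D}_{n+1} \). A right endpoint belongs to its interval, so for example the point \( 1/2 \) lies in \( (0, 1/2] \) at rank 1.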

For \( n \in \{2, 3, \ldots\} \), the Euclidean topology on \(\R^n\) is the \( n \)-fold product topology formed from the Euclidean topology on \( \R \). So the Borel \( \sigma \)-algebra \( \mathscr R_n \) is also the \( n \)-fold power \( \sigma \)-algebra \( \mathscr R^n \) formed from \( \mathscr R \). Finally, \( \mathscr R_n \) can be generated by \( n \)-fold products of sets in any of the three collections in the previous theorem.

Space of Real Functions

Suppose that \( (S, \mathscr S) \) is a measurable space. From our general discussion of functions, recall that the usual arithmetic operations on functions from \( S \) into \( \R \) are defined pointwise.

If \( f: S \to \R \) and \( g: S \to \R \) are measurable and \( a \in \R \), then each of the following functions from \( S \) into \( \R \) is also measurable:

\( f + g \)
\( f - g \)
\( f g \)
\( a f \)

Proof

These results follow from the fact that the arithmetic operators are continuous, and hence measurable. That is, \( (x, y) \mapsto x + y \), \( (x, y) \mapsto x - y \), and \( (x, y) \mapsto x y \) are continuous as functions from \( \R^2 \) into \( \R \). Thus, if \( f, \, g: S \to \R \) are measurable, then \( (f, g): S \to \R^2 \) is measurable by the result above. Then, \( f + g \), \( f - g \), \( f g \) are the compositions, respectively, of \( + \), \( - \), \( \cdot \) with \( (f, g) \). Of course, (d) is a simple corollary of (c).

Similarly, if \( f: S \to \R \setminus \{0\} \) is measurable, then so is \( 1 / f \). Recall that the set of functions from \( S \) into \( \R \) is a vector space, under the pointwise definitions of addition and scalar multiplication. But once again, we usually want to restrict our attention to measurable functions. Thus, it's nice to know that the measurable functions from \( S \) into \( \R \) also form a vector space. This follows immediately from the closure properties (a) and (d) of the previous theorem. Of particular importance in probability and stochastic processes is the vector space of bounded, measurable functions \( f: S \to \R \), with the supremum norm \[ \|f\| = \sup\left\{\left|f(x)\right|: x \in S \right\} \]

The elementary functions that we encounter in calculus and other areas of applied mathematics are functions from subsets of \( \R \) into \( \R \). The elementary functions include algebraic functions (which in turn include the polynomial and rational functions), the usual transcendental functions (exponential, logarithm, trigonometric), and the usual functions constructed from these by composition, the arithmetic operations, and by piecing together. As we might hope, all of the elementary functions are measurable.