5.20: General Uniform Distributions
This section explores uniform distributions in an abstract setting. If you are a new student of probability, or are not familiar with measure theory, you may want to skip this section and read the sections on the uniform distribution on an interval and the discrete uniform distributions.
Basic Theory
Definition
Suppose that \( (S, \mathscr S, \lambda) \) is a measure space. That is, \( S \) is a set, \( \mathscr S \) a \( \sigma \)-algebra of subsets of \( S \), and \( \lambda \) a positive measure on \( \mathscr S \). Suppose also that \( 0 \lt \lambda(S) \lt \infty \), so that \( \lambda \) is a finite, positive measure.
A random variable \( X \) with values in \( S \) has the uniform distribution on \( S \) (with respect to \( \lambda \)) if \[ \P(X \in A) = \frac{\lambda(A)}{\lambda(S)}, \quad A \in \mathscr S \]
Thus, the probability assigned to a set \( A \in \mathscr S\) depends only on the size of \( A \) (as measured by \( \lambda \)).
The most common special cases are as follows:
- Discrete : The set \( S \) is finite and non-empty, \( \mathscr S \) is the \( \sigma \)-algebra of all subsets of \( S \), and \( \lambda = \# \) (counting measure).
- Euclidean : For \(n \in \N_+\), let \(\mathscr R_n\) denote the \(\sigma\)-algebra of Borel measurable subsets of \(\R^n\) and let \(\lambda_n\) denote Lebesgue measure on \((\R^n, \mathscr R_n)\). In this setting, \(S \in \mathscr R_n\) with \(0 \lt \lambda_n(S) \lt \infty\), \(\mathscr S = \{A \in \mathscr R_n: A \subseteq S\}\), and the measure is \(\lambda_n\) restricted to \((S, \mathscr S)\).
In the Euclidean case, recall that \( \lambda_1 \) is length measure on \( \R \), \( \lambda_2 \) is area measure on \( \R^2 \), \( \lambda_3 \) is volume measure on \( \R^3 \), and in general \( \lambda_n \) is sometimes referred to as \( n \)-dimensional volume. Thus, \( S \in \mathscr R_n \) is a set with positive, finite volume.
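As a small illustration of the discrete case, the following Python sketch computes \( \P(X \in A) = \#(A) / \#(S) \) exactly. The function name `uniform_probability` and the fair-die example are assumptions made for illustration only.

```python
from fractions import Fraction


def uniform_probability(event, sample_space):
    """Return P(X in A) = #(A) / #(S) for the discrete uniform distribution on S."""
    S = set(sample_space)
    if not S:
        raise ValueError("the sample space must be finite and non-empty")
    A = set(event) & S  # only the part of the event inside S is counted
    return Fraction(len(A), len(S))


# Hypothetical example: a fair six-sided die.
die = range(1, 7)
p_even = uniform_probability({2, 4, 6}, die)
p_one = uniform_probability({1}, die)
```

Using `Fraction` keeps the ratio of counts exact, which mirrors the definition more closely than floating-point division would.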
Properties
Suppose \((S, \mathscr S, \lambda)\) is a finite, positive measure space, as above, and that \( X \) is uniformly distributed on \( S \).
The probability density function \( f \) of \( X \) (with respect to \( \lambda \)) is \[ f(x) = \frac{1}{\lambda(S)}, \quad x \in S \]
Proof
This follows directly from the definition of probability density function: \[\int_A \frac 1 {\lambda(S)} \, d\lambda(x) = \frac{\lambda(A)}{\lambda(S)}, \quad A \in \mathscr S\]
Thus, the defining property of the uniform distribution on a set is constant density on that set. Another basic property is that uniform distributions are preserved under conditioning.
Suppose that \( R \in \mathscr S \) with \( \lambda(R) \gt 0 \). The conditional distribution of \( X \) given \( X \in R \) is uniform on \( R \).
Proof
For \(A \in \mathscr S\) with \( A \subseteq R \), we have \( \{X \in A\} \cap \{X \in R\} = \{X \in A\} \), and hence \[ \P(X \in A \mid X \in R) = \frac{\P(X \in A)}{\P(X \in R)} = \frac{\lambda(A)/\lambda(S)}{\lambda(R)/\lambda(S)} = \frac{\lambda(A)}{\lambda(R)} \]
In the setting of the previous result, suppose that \( \bs{X} = (X_1, X_2, \ldots) \) is a sequence of independent variables, each uniformly distributed on \( S \). Let \( N = \min\{n \in \N_+: X_n \in R\} \). Then \( N \) has the geometric distribution on \( \N_+ \) with success parameter \( p = \P(X \in R) \). More importantly, the distribution of \( X_N \) is the same as the conditional distribution of \( X \) given \( X \in R \), and hence is uniform on \( R \). This is the basis of the rejection method of simulation. If we can simulate a uniform distribution on \( S \), then we can simulate a uniform distribution on \( R \).
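The rejection method can be sketched in Python as follows. The choice of \( S = [-1, 1]^2 \) and \( R \) the unit disk is an assumption made for illustration, as are the function name `sample_uniform_disk` and the seed. In this setup \( p = \lambda_2(R) / \lambda_2(S) = \pi / 4 \), so the mean of \( N \) should be close to \( 4 / \pi \).

```python
import math
import random


def sample_uniform_disk(rng):
    """Rejection method: draw uniform points on S = [-1, 1]^2 until one lands in R.

    R is the closed unit disk. Returns the accepted point X_N and the trial count N.
    """
    n = 0
    while True:
        n += 1
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # accept only points that fall in R
            return (x, y), n


rng = random.Random(2024)  # fixed seed so the run is reproducible
draws = [sample_uniform_disk(rng) for _ in range(20_000)]

p = math.pi / 4  # lambda_2(R) / lambda_2(S)
mean_trials = sum(n for _, n in draws) / len(draws)  # should be near 1 / p

# Uniformity check: the disk of radius 1/2 has one quarter of the area of R.
inner_fraction = sum(1 for (x, y), _ in draws if x * x + y * y <= 0.25) / len(draws)
```

The trial count returned with each point is a simulated value of \( N \), so the same run gives an empirical check of both the geometric distribution of \( N \) and the uniformity of \( X_N \) on \( R \).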
If \( h \) is a real-valued function on \( S \), then \( \E[h(X)] \) is the average value of \( h \) on \( S \), as measured by \( \lambda \):
If \( h: S \to \R \) is integrable with respect to \( \lambda \), then \[ \E[h(X)] = \frac{1}{\lambda(S)} \int_S h(x) \, d\lambda(x) \]
Proof
This result follows from the change of variables theorem for expected value, since \[ \E[h(X)] = \int_S h(x) f(x) \, d\lambda(x) = \frac 1 {\lambda(S)} \int_S h(x) \, d\lambda(x)\]
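To make the averaging interpretation concrete, here is a minimal Python sketch covering both special cases. In the discrete case the average is computed exactly; in the Euclidean case it is only estimated by simulation. The helper names, the test function \( h(x) = x^2 \), the interval \( [0, 2] \), and the seed are all assumptions chosen for illustration.

```python
from fractions import Fraction
import random


def uniform_mean_discrete(h, points):
    """Exact E[h(X)] for X uniform on a finite set: (1 / #S) * sum of h over S."""
    points = list(points)
    return sum(Fraction(h(x)) for x in points) / len(points)


def uniform_mean_interval(h, a, b, n, rng):
    """Monte Carlo estimate of E[h(X)] = (1 / (b - a)) * integral of h over [a, b]."""
    return sum(h(rng.uniform(a, b)) for _ in range(n)) / n


def square(x):
    return x * x


# Discrete case: average of x^2 over the faces of a die, (1 + 4 + ... + 36) / 6.
exact = uniform_mean_discrete(square, range(1, 7))

# Euclidean case: the average of x^2 on [0, 2] is (1/2) * (8/3) = 4/3.
estimate = uniform_mean_interval(square, 0.0, 2.0, 100_000, random.Random(7))
```

The simulated value is only an approximation of \( 4/3 \), with an error that shrinks as the number of samples grows.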
The entropy of the uniform distribution on \( S \) depends only on the size of \( S \), as measured by \( \lambda \):
The entropy of \( X \) is \( H(X) = \ln[\lambda(S)] \).
Proof
\[ H(X) = \E\{-\ln[f(X)]\} = \int_S -\ln\left(\frac{1}{\lambda(S)}\right) \frac{1}{\lambda(S)} \, d\lambda(x) = -\ln\left(\frac{1}{\lambda(S)}\right) = \ln[\lambda(S)] \]
Product Spaces
Suppose now that \( (S, \mathscr S, \lambda) \) and \( (T, \mathscr T, \mu) \) are finite, positive measure spaces, so that \( 0 \lt \lambda(S) \lt \infty \) and \( 0 \lt \mu(T) \lt \infty \). Recall the product space \( (S \times T, \mathscr S \otimes \mathscr T, \lambda \otimes \mu) \). The product \( \sigma \)-algebra \( \mathscr S \otimes \mathscr T \) is the \( \sigma \)-algebra of subsets of \( S \times T \) generated by product sets \( A \times B \) where \( A \in \mathscr S \) and \( B \in \mathscr T \). The product measure \( \lambda \otimes \mu \) is the unique positive measure on \( (S \times T, \mathscr S \otimes \mathscr T) \) that satisfies \( (\lambda \otimes \mu)(A \times B) = \lambda(A) \mu(B) \) for \( A \in \mathscr S \) and \( B \in \mathscr T \).
\( (X, Y) \) is uniformly distributed on \( S \times T \) if and only if \( X \) is uniformly distributed on \( S \), \( Y \) is uniformly distributed on \( T \), and \( X \) and \( Y \) are independent.
Proof
Suppose first that \( (X, Y) \) is uniformly distributed on \( S \times T\). If \( A \in \mathscr S \) and \( B \in \mathscr T \) then \[ \P(X \in A, Y \in B) = \P[(X, Y) \in A \times B] = \frac{(\lambda \otimes \mu)(A \times B)}{(\lambda \otimes \mu)(S \times T)} = \frac{\lambda(A) \mu(B)}{\lambda(S) \mu(T)} = \frac{\lambda(A)}{\lambda(S)} \frac{\mu(B)}{\mu(T)} \] Taking \( B = T \) in the displayed equation gives \( \P(X \in A) = \lambda(A) \big/ \lambda(S) \) for \( A \in \mathscr S \), so \( X \) is uniformly distributed on \( S \). Taking \( A = S \) in the displayed equation gives \( \P(Y \in B) = \mu(B) \big/ \mu(T) \) for \( B \in \mathscr T \), so \( Y \) is uniformly distributed on \( T \). Returning to the displayed equation generally gives \( \P(X \in A, Y \in B) = \P(X \in A) \P(Y \in B) \) for \( A \in \mathscr S \) and \( B \in \mathscr T \), so \( X \) and \( Y \) are independent.
Conversely, suppose that \( X \) is uniformly distributed on \( S \), \( Y \) is uniformly distributed on \( T \), and \( X \) and \( Y \) are independent. Then for \( A \in \mathscr S \) and \( B \in \mathscr T \), \[ \P[(X, Y) \in A \times B] = \P(X \in A, Y \in B) = \P(X \in A) \P(Y \in B) = \frac{\lambda(A)}{\lambda(S)} \frac{\mu(B)}{\mu(T)} = \frac{\lambda(A) \mu(B)}{\lambda(S) \mu(T)} = \frac{(\lambda \otimes \mu)(A \times B)}{(\lambda \otimes \mu)(S \times T)} \] It then follows (see the section on existence and uniqueness of measures) that \( \P[(X, Y) \in C] = (\lambda \otimes \mu)(C) / (\lambda \otimes \mu)(S \times T) \) for every \( C \in \mathscr S \otimes \mathscr T \), so \( (X, Y) \) is uniformly distributed on \( S \times T \).
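The product result can be checked exactly in a small discrete example. The Python sketch below starts with the uniform distribution on \( S \times T \) (with counting measure) and verifies that the marginals are uniform and that the joint probabilities factor. The particular sets `S` and `T` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import product

# Hypothetical finite sets, each with counting measure.
S = [1, 2, 3]
T = ["a", "b"]

# Uniform distribution on the product set S x T.
pairs = list(product(S, T))
joint = {pair: Fraction(1, len(pairs)) for pair in pairs}

# Marginal distributions, obtained by summing out the other coordinate.
marginal_x = {s: sum(joint[(s, t)] for t in T) for s in S}
marginal_y = {t: sum(joint[(s, t)] for s in S) for t in T}

# Independence: the joint probability equals the product of the marginals.
factorizes = all(joint[(s, t)] == marginal_x[s] * marginal_y[t] for s, t in pairs)
```

This only verifies the statement for one finite example; the general case relies on the uniqueness of the product measure, as in the proof.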