5.41: The Logarithmic Series Distribution
The logarithmic series distribution, as the name suggests, is based on the standard power series expansion of the natural logarithm function. It is also sometimes known more simply as the logarithmic distribution.
Basic Theory
Distribution Functions
The logarithmic series distribution with shape parameter \( p \in (0, 1) \) is a discrete distribution on \( \N_+ \) with probability density function \( f \) given by \[ f(n) = \frac{1}{-\ln(1 - p)} \frac{p^n}{n}, \quad n \in \N_+ \]
- \( f \) is decreasing with mode \( n = 1 \).
- When smoothed, \( f \) is concave upward.
Proof
Recall that the standard power series for \( -\ln(1 - p) \), obtained by integrating the geometric series \( \sum_{n=0}^\infty p^n = 1 \big/ (1 - p) \), is \[ -\ln(1 - p) = \sum_{n=1}^\infty \frac{p^n}{n}, \quad p \in (0, 1) \] Hence \( f \) is a valid probability density function. For the properties, consider the function \( x \mapsto p^x \big/ x \) on \( [1, \infty) \). The first derivative is \[ \frac{p^x [x \ln(p) - 1]}{x^2} \] which is negative since \( \ln(p) \lt 0 \), and the second derivative is \[ \frac{p^x \left[x^2 \ln^2(p) - 2 x \ln(p) + 2\right]}{x^3} \] which is positive for the same reason.
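The properties above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `log_series_pmf` and the choice \( p = 0.7 \) are illustrative assumptions, not part of the text.

```python
import math

def log_series_pmf(n, p):
    """f(n) = p^n / (n * (-ln(1 - p))) for n = 1, 2, ...; zero otherwise."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be in (0, 1)")
    if n < 1:
        return 0.0
    norm = -math.log1p(-p)          # -ln(1 - p), the normalizing constant
    return p ** n / (n * norm)

p = 0.7                              # illustrative parameter value
values = [log_series_pmf(n, p) for n in range(1, 201)]

total = sum(values)                  # partial sum of the series, close to 1
is_decreasing = all(a > b for a, b in zip(values, values[1:]))
# Positive second differences correspond to "concave upward when smoothed"
is_concave_up = all(
    values[i - 1] - 2 * values[i] + values[i + 1] > 0
    for i in range(1, len(values) - 1)
)
print(total, is_decreasing, is_concave_up)
```

The sum is truncated at 200 terms, so `total` is only approximately 1; the neglected tail is negligible for this value of \( p \) but grows as \( p \uparrow 1 \).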
Open the special distribution simulator and select the logarithmic series distribution. Vary the parameter and note the shape of the probability density function. For selected values of the parameter, run the simulation 1000 times and compare the empirical density function to the probability density function.
The distribution function and the quantile function do not have simple, closed forms in terms of the standard elementary functions.
Open the special distribution calculator and select the logarithmic series distribution. Vary the parameter and note the shape of the distribution and probability density functions. For selected values of the parameter, compute the median and the first and third quartiles.
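Since there is no closed form, quantiles can be found by summing the probability density function until the required probability is reached. The sketch below assumes the usual definition of the quantile of order \( q \) as the smallest \( n \) with \( F(n) \ge q \); the function name and the `max_n` safety cap are hypothetical.

```python
import math

def log_series_quantile(q, p, max_n=1_000_000):
    """Smallest n in {1, 2, ...} with F(n) >= q, by direct summation.

    max_n is a safety cap (an assumption of this sketch) so the loop
    terminates even if rounding keeps the running sum just below q.
    """
    if not (0.0 < p < 1.0 and 0.0 < q < 1.0):
        raise ValueError("need p and q in (0, 1)")
    norm = -math.log1p(-p)           # -ln(1 - p)
    n, term = 1, p                   # term holds p^n
    cdf = term / norm                # F(1) = f(1)
    while cdf < q and n < max_n:
        n += 1
        term *= p
        cdf += term / (n * norm)     # add f(n)
    return n

p = 0.9                              # illustrative parameter value
quartiles = [log_series_quantile(q, p) for q in (0.25, 0.5, 0.75)]
print(quartiles)
```

The number of terms needed grows quickly as \( p \uparrow 1 \) and \( q \uparrow 1 \), so this direct search is only practical for moderate values.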
Moments
Suppose again that random variable \( N \) has the logarithmic series distribution with shape parameter \( p \in (0, 1) \). Recall that the permutation formula is \( n^{(k)} = n (n - 1) \cdots (n - k + 1) \) for \( n \in \R \) and \( k \in \N \). The factorial moments of \( N \) are \( \E\left(N^{(k)}\right) \) for \( k \in \N \).
The factorial moments of \( N \) are given by \[ \E\left(N^{(k)}\right) = \frac{(k - 1)!}{-\ln(1 - p)} \left(\frac{p}{1 - p}\right)^k, \quad k \in \N_+\]
Proof
Recall that a power series can be differentiated term by term within the open interval of convergence. Hence \begin{align} \E\left(N^{(k)}\right) & = \sum_{n=1}^\infty n^{(k)} \frac{1}{-\ln(1 - p)} \frac{p^n}{n} = \frac{p^k}{-\ln(1 - p)} \sum_{n=k}^\infty n^{(k)} \frac{p^{n-k}}{n} \\ & = \frac{p^k}{-\ln(1 - p)} \sum_{n=k}^\infty \frac{d^k}{dp^k} \frac{p^n}{n} = \frac{p^k}{-\ln(1 - p)} \frac{d^k}{dp^k} \sum_{n=1}^\infty \frac{p^n}{n} \\ & = \frac{p^k}{-\ln(1 - p)} \frac{d^k}{dp^k} [-\ln(1 - p)] = \frac{p^k}{-\ln(1 - p)} (k - 1)! (1 - p)^{-k} \end{align}
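As a sanity check on the factorial moment formula, the closed form can be compared with a truncated version of the defining sum. This is a rough sketch; the function names, the truncation at 500 terms, and the value \( p = 0.4 \) are assumptions made for illustration.

```python
import math

def factorial_moment_formula(k, p):
    """(k - 1)! / (-ln(1 - p)) * (p / (1 - p))^k for k = 1, 2, ..."""
    norm = -math.log1p(-p)
    return math.factorial(k - 1) / norm * (p / (1 - p)) ** k

def factorial_moment_sum(k, p, terms=500):
    """Truncated sum of n^(k) f(n), with n^(k) = n (n - 1) ... (n - k + 1)."""
    norm = -math.log1p(-p)
    # math.perm(n, k) is the falling power n^(k) for integers n >= k
    return sum(math.perm(n, k) * p ** n / (n * norm)
               for n in range(k, terms + 1))

p = 0.4                              # illustrative parameter value
for k in (1, 2, 3, 4):
    print(k, factorial_moment_formula(k, p), factorial_moment_sum(k, p))
```

The two columns should agree to many decimal places when \( p \) is not too close to 1; for larger \( p \) more terms are needed in the sum.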
The mean and variance of \( N \) are
- \[ \E(N) = \frac{1}{-\ln(1 - p)} \frac{p}{1 - p} \]
- \[ \var(N) = \frac{1}{-\ln(1 - p)} \frac{p}{(1 - p)^2} \left[1 - \frac{p}{-\ln(1 - p)} \right] \]
Proof
These results follow easily from the factorial moments. For part (b), note first that \[ \E\left(N^2\right) = \E[N(N - 1)] + \E(N) = \frac{1}{-\ln(1 - p)} \frac{p}{(1 - p)^2} \] The result then follows from the usual computational formula \( \var(N) = \E\left(N^2\right) - [\E(N)]^2 \).
Open the special distribution simulator and select the logarithmic series distribution. Vary the parameter and note the shape of the mean \( \pm \) standard deviation bar. For selected values of the parameter, run the simulation 1000 times and compare the empirical mean and standard deviation to the distribution mean and standard deviation.
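The same comparison can be sketched offline. The code below draws values by inversion (walking up the probability density function until the running sum exceeds a uniform random number) and compares the empirical mean and standard deviation with the formulas above. The function names, the seed, and the sample size of 20,000 (larger than the 1000 runs suggested, to reduce sampling noise) are all assumptions of this sketch.

```python
import math
import random
import statistics

def sample_log_series(p, rng, max_n=10_000):
    """One draw by inversion; max_n caps the search against rounding issues."""
    norm = -math.log1p(-p)           # -ln(1 - p)
    u = rng.random()
    n, term = 1, p                   # term holds p^n
    cdf = term / norm
    while u > cdf and n < max_n:
        n += 1
        term *= p
        cdf += term / (n * norm)
    return n

def mean_and_sd(p):
    """Distribution mean and standard deviation from the formulas above."""
    norm = -math.log1p(-p)
    mean = p / ((1 - p) * norm)
    var = p / ((1 - p) ** 2 * norm) * (1 - p / norm)
    return mean, math.sqrt(var)

rng = random.Random(2024)            # fixed seed for reproducibility
p = 0.5                              # illustrative parameter value
draws = [sample_log_series(p, rng) for _ in range(20_000)]
empirical = (statistics.mean(draws), statistics.stdev(draws))
theoretical = mean_and_sd(p)
print(empirical, theoretical)
```

Sequential inversion is simple but slow when \( p \) is close to 1, since the distribution then has a long tail; other sampling methods exist but are not covered here.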
The probability generating function \( P \) of \( N \) is given by \[ P(t) = \E\left(t^N\right) = \frac{\ln(1 - p t)}{\ln(1 - p)}, \quad \left|t\right| \lt \frac{1}{p} \]
Proof
\[ P(t) = \sum_{n=1}^\infty t^n \frac{1}{-\ln(1 - p)} \frac{p^n}{n} = \frac{1}{-\ln(1 - p)} \sum_{n=1}^\infty \frac{(p t)^n}{n} = \frac{-\ln(1 - p t)}{-\ln(1 - p)} \] The factorial moments above can also be obtained from the probability generating function, since \( P^{(k)}(1) = \E\left(N^{(k)}\right) \) for \( k \in \N_+ \).
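A quick numerical check of the probability generating function: the closed form is compared with a truncated series, and a central difference at \( t = 1 \) is compared with the mean. The names, the step size `h`, and the values \( p = 0.3 \), \( t = 1.5 \) are illustrative assumptions.

```python
import math

def pgf(t, p):
    """Closed form P(t) = ln(1 - p t) / ln(1 - p), valid for |t| < 1/p."""
    return math.log1p(-p * t) / math.log1p(-p)

def pgf_series(t, p, terms=200):
    """Truncated defining series: sum of t^n f(n) for n = 1, ..., terms."""
    norm = -math.log1p(-p)
    return sum((p * t) ** n / (n * norm) for n in range(1, terms + 1))

p = 0.3                              # illustrative parameter value
closed = pgf(1.5, p)                 # t = 1.5 satisfies |t| < 1/p
series = pgf_series(1.5, p)

h = 1e-5                             # step for the central difference
slope_at_1 = (pgf(1 + h, p) - pgf(1 - h, p)) / (2 * h)   # approximates P'(1)
norm = -math.log1p(-p)
mean = p / ((1 - p) * norm)          # E(N) from the mean formula
print(closed, series, slope_at_1, mean)
```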
Related Distributions
Naturally, the limits of the logarithmic series distribution with respect to the parameter \( p \) are of interest.
The logarithmic series distribution with shape parameter \( p \in (0, 1) \) converges to point mass at 1 as \( p \downarrow 0 \).
Proof
An application of L'Hospital's rule to the PGF \( P \) above shows that \( \lim_{p \downarrow 0} P(t) = t \), which is the PGF of point mass at 1.
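The convergence can be seen numerically by evaluating \( f(1) = p \big/ [-\ln(1 - p)] \) and the PGF at a fixed point as \( p \) shrinks. This is only an illustration under assumed names and test values, not a substitute for the proof.

```python
import math

def pgf(t, p):
    """P(t) = ln(1 - p t) / ln(1 - p)."""
    return math.log1p(-p * t) / math.log1p(-p)

def prob_one(p):
    """f(1) = p / (-ln(1 - p)), the probability of the value 1."""
    return p / -math.log1p(-p)

# As p decreases, f(1) approaches 1 and P(0.5) approaches 0.5
for p in (0.1, 0.01, 0.001):
    print(p, prob_one(p), pgf(0.5, p))
```

Using `math.log1p` rather than `math.log(1 - p)` keeps the computation accurate for very small \( p \).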
The logarithmic series distribution is a power series distribution associated with the function \( g(p) = -\ln(1 - p) \) for \( p \in [0, 1) \).
Proof
This follows from the definition of a power series distribution, since as noted in the PDF proof, \[ \sum_{n=1}^\infty \frac{p^n}{n} = - \ln(1 - p), \quad p \in [0, 1) \]
The moment results above actually follow from general results for power series distributions. The compound Poisson distribution based on the logarithmic series distribution gives a negative binomial distribution.
Suppose that \( \bs{X} = (X_1, X_2, \ldots) \) is a sequence of independent random variables each with the logarithmic series distribution with parameter \( p \in (0, 1) \). Suppose also that \( N \) is independent of \( \bs{X} \) and has the Poisson distribution with rate parameter \( r \in (0, \infty) \). Then \( Y = \sum_{i = 1}^N X_i \) has the negative binomial distribution on \( \N \) with parameters \( 1 - p \) and \( -r \big/\ln(1 - p) \).
Proof
The PGF of \( Y \) is \( Q \circ P \), where \( P \) is the PGF of the logarithmic series distribution, and where \( Q \) is the PGF of the Poisson distribution so that \( Q(s) = e^{r(s - 1)} \) for \( s \in \R \). Thus we have \[ (Q \circ P)(t) = \exp \left(r \left[\frac{\ln(1 - p t)}{\ln(1 - p)} - 1\right]\right), \quad \left|t\right| \lt \frac{1}{p} \] Since \[ r \left[\frac{\ln(1 - p t)}{\ln(1 - p)} - 1\right] = \frac{r}{\ln(1 - p)} \ln\left(\frac{1 - p t}{1 - p}\right) = \frac{-r}{\ln(1 - p)} \ln\left(\frac{1 - p}{1 - p t}\right) \] this can be written in the form \[ (Q \circ P)(t) = \left(\frac{1 - p}{1 - p t}\right)^{-r / \ln(1 - p)}, \quad \left|t\right| \lt \frac{1}{p} \] which is the PGF of the negative binomial distribution with parameters \( 1 - p \) and \( -r \big/ \ln(1 - p) \).
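The result can also be checked directly, without generating functions. Because each \( X_i \ge 1 \), the event \( Y = y \) forces \( N \le y \), so \( \P(Y = y) \) is a finite sum over \( N \) of Poisson weights times convolution powers of \( f \). The sketch below computes this sum and compares it with the negative binomial density written in terms of the gamma function, \( \frac{\Gamma(k + y)}{\Gamma(k) \, y!} q^k (1 - q)^y \) with \( q = 1 - p \) and \( k = -r \big/ \ln(1 - p) \). All function names and the values \( p = 0.6 \), \( r = 2 \) are assumptions of this sketch.

```python
import math

def log_series_pmf_list(p, y_max):
    """[f(0), f(1), ..., f(y_max)] with f(0) = 0."""
    norm = -math.log1p(-p)
    return [0.0] + [p ** n / (n * norm) for n in range(1, y_max + 1)]

def convolve(a, b, y_max):
    """Convolution of two densities on {0, ..., y_max}, truncated at y_max."""
    out = [0.0] * (y_max + 1)
    for i, ai in enumerate(a):
        for j in range(y_max + 1 - i):
            out[i + j] += ai * b[j]
    return out

def compound_poisson_pmf(p, r, y_max):
    """P(Y = y) for y = 0, ..., y_max, where Y = X_1 + ... + X_N."""
    f = log_series_pmf_list(p, y_max)
    power = [1.0] + [0.0] * y_max            # density of the empty sum (N = 0)
    result = [0.0] * (y_max + 1)
    for n in range(y_max + 1):
        weight = math.exp(-r) * r ** n / math.factorial(n)   # P(N = n)
        for y in range(y_max + 1):
            result[y] += weight * power[y]
        power = convolve(power, f, y_max)    # density of X_1 + ... + X_{n+1}
    return result

def neg_binomial_pmf(y, k, q):
    """Gamma(k + y) / (Gamma(k) y!) * q^k * (1 - q)^y for real k > 0."""
    log_coef = math.lgamma(k + y) - math.lgamma(k) - math.lgamma(y + 1)
    return math.exp(log_coef + k * math.log(q) + y * math.log1p(-q))

p, r, y_max = 0.6, 2.0, 12                   # illustrative values
k = -r / math.log1p(-p)                      # -r / ln(1 - p)
compound = compound_poisson_pmf(p, r, y_max)
neg_bin = [neg_binomial_pmf(y, k, 1 - p) for y in range(y_max + 1)]
print(max(abs(a - b) for a, b in zip(compound, neg_bin)))
```

Truncating the convolutions at `y_max` loses nothing for the values \( y \le y_{\max} \), since a sum of nonnegative terms equal to \( y \) only involves terms at most \( y \).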