# 5: Special Distributions

In this chapter, we study several general families of probability distributions and a number of special parametric families of distributions. Unlike the other expository chapters in this text, the sections are not linearly ordered and so this chapter serves primarily as a reference. You may want to study these topics as the need arises.

First, we need to discuss what makes a probability distribution special in the first place. In some cases, a distribution may be important because it is connected with other special distributions in interesting ways (via transformations, limits, conditioning, etc.). In other cases, a parametric family may be important because it can be used to model a wide variety of random phenomena. This may be the case because of fundamental underlying principles, or simply because the family has a rich collection of probability density functions with a small number of parameters (usually three or fewer). As a general philosophical principle, we try to model a random process with as few parameters as possible; this is sometimes referred to as the principle of parsimony of parameters. In turn, this is a special case of Ockham's razor, named in honor of William of Ockham: the principle that one should use the simplest model that adequately describes a given phenomenon. Parsimony is important because often the parameters are not known and must be estimated.

In many cases, a special parametric family of distributions will have one or more distinguished standard members, corresponding to specified values of some of the parameters. Usually the standard distributions will be mathematically simplest, and often other members of the family can be constructed from the standard distributions by simple transformations on the underlying standard random variable.
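As a small illustration of this idea, the following sketch builds a non-standard member of a family from a standard one by a location-scale transformation. It assumes, purely for the sake of example, that the standard variable Z is standard normal; the function name `location_scale_sample` and the parameter names `a` (location) and `b` (scale) are hypothetical choices made here, not notation from the text.

```python
import random
import statistics


def location_scale_sample(a, b, n, seed=0):
    """Draw n values of X = a + b * Z, where Z is standard normal (an assumed choice)."""
    rng = random.Random(seed)  # seeded generator so the run is reproducible
    return [a + b * rng.gauss(0.0, 1.0) for _ in range(n)]


# Transform the standard member with location a = 10 and scale b = 2.
sample = location_scale_sample(a=10.0, b=2.0, n=100_000)
sample_mean = statistics.fmean(sample)
sample_sd = statistics.stdev(sample)
print(round(sample_mean, 2), round(sample_sd, 2))
```

Under these assumptions, the sample mean and sample standard deviation should be close to the location and scale parameters, since Z has mean 0 and standard deviation 1.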

An incredible variety of special distributions have been studied over the years, and new ones are constantly being added to the literature. To truly deserve the adjective special, a distribution should have a certain level of mathematical elegance and economy, and should arise in interesting and diverse applications.

• 5.1: Location-Scale Families
As usual, our starting point is a random experiment modeled by a probability space (Ω,F,P), so that Ω is the set of outcomes, F the collection of events, and P the probability measure on the sample space (Ω,F). In this section, we assume that we have a fixed random variable Z defined on the probability space, taking values in R.
• 5.2: General Exponential Families
Many of the special distributions studied in this chapter are general exponential families, at least with respect to some of their parameters. On the other hand, most commonly, a parametric family fails to be a general exponential family because the support set depends on the parameter. The following theorems give a number of examples. Proofs will be provided in the individual sections.
• 5.3: Stable Distributions
Stable distributions are an important general class of probability distributions on R that are defined in terms of location-scale transformations. Stable distributions occur as limits (in distribution) of scaled and centered sums of independent, identically distributed variables. Such limits generalize the central limit theorem, and so stable distributions generalize the normal distribution in a sense. The pioneering work on stable distributions was done by Paul Lévy.
• 5.4: Infinitely Divisible Distributions
A number of special distributions are infinitely divisible. Proofs of the results stated below are given in the individual sections.
• 5.5: Power Series Distributions
Power series distributions are discrete distributions on (a subset of) N constructed from power series. This class of distributions is important because most of the special discrete distributions are power series distributions.
• 5.6: The Normal Distribution
The normal distribution holds an honored role in probability and statistics, mostly because of the central limit theorem, one of the fundamental theorems that forms a bridge between the two subjects. In addition, as we will see, the normal distribution has many nice mathematical properties. The normal distribution is also called the Gaussian distribution, in honor of Carl Friedrich Gauss, who was among the first to use the distribution.
• 5.7: The Multivariate Normal Distribution
The multivariate normal distribution is among the most important of multivariate distributions, particularly in statistical inference and the study of Gaussian processes such as Brownian motion. The distribution arises naturally from linear transformations of independent normal variables. In this section, we consider the bivariate normal distribution first, because explicit results can be given and because graphical interpretations are possible.
• 5.8: The Gamma Distribution
In this section we will study a family of distributions that has special importance in probability and statistics. In particular, the arrival times in the Poisson process have gamma distributions, and the chi-square distribution in statistics is a special case of the gamma distribution. Also, the gamma distribution is widely used to model physical quantities that take positive values.
• 5.9: Chi-Square and Related Distributions
In this section we will study a distribution, and some relatives, that have special importance in statistics. In particular, the chi-square distribution will arise in the study of the sample variance when the underlying distribution is normal and in goodness of fit tests.
• 5.10: The Student t Distribution
In this section we will study a distribution that has special importance in statistics. In particular, this distribution will arise in the study of a standardized version of the sample mean when the underlying distribution is normal.
• 5.11: The F Distribution
In this section we will study a distribution that has special importance in statistics. In particular, this distribution arises from ratios of sums of squares when sampling from a normal distribution, and so is important in estimation and hypothesis testing in the two-sample normal model.
• 5.12: The Lognormal Distribution
The lognormal distribution is a continuous distribution on (0,∞) and is used to model random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables.
• 5.13: The Folded Normal Distribution
The folded normal distribution is the distribution of the absolute value of a random variable with a normal distribution. As has been emphasized before, the normal distribution is perhaps the most important in probability and is used to model an incredible variety of random phenomena.
• 5.14: The Rayleigh Distribution
The Rayleigh distribution, named for William Strutt, Lord Rayleigh, is the distribution of the magnitude of a two-dimensional random vector whose coordinates are independent, identically distributed, mean 0 normal variables. The distribution has a number of applications in settings where magnitudes of normal variables are important.
• 5.15: The Maxwell Distribution
The Maxwell distribution, named for James Clerk Maxwell, is the distribution of the magnitude of a three-dimensional random vector whose coordinates are independent, identically distributed, mean 0 normal variables. The distribution has a number of applications in settings where magnitudes of normal variables are important, particularly in physics. The Maxwell distribution is closely related to the Rayleigh distribution.
• 5.16: The Lévy Distribution
The Lévy distribution, named for the French mathematician Paul Lévy, is important in the study of Brownian motion, and is one of only three stable distributions whose probability density function can be expressed in a simple, closed form.
• 5.17: The Beta Distribution
In this section, we will study the beta distribution, the most important distribution that has bounded support. But before we can study the beta distribution we must study the beta function.
• 5.18: The Beta Prime Distribution
The beta prime distribution is the distribution of the odds ratio associated with a random variable with the beta distribution. Since variables with beta distributions are often used to model random probabilities and proportions, the corresponding odds ratios occur naturally as well.
• 5.19: The Arcsine Distribution
The arcsine distribution is important in the study of Brownian motion and prime numbers, among other applications.
• 5.20: General Uniform Distributions
This section explores uniform distributions in an abstract setting. If you are a new student of probability, or are not familiar with measure theory, you may want to skip this section and read the sections on the uniform distribution on an interval and the discrete uniform distributions.
• 5.21: The Uniform Distribution on an Interval
The continuous uniform distribution on an interval of R is one of the simplest of all probability distributions, but nonetheless very important. In particular, continuous uniform distributions are the basic tools for simulating other probability distributions. The uniform distribution corresponds to picking a point at random from the interval.
• 5.22: Discrete Uniform Distributions
The discrete uniform distribution is a special case of the general uniform distribution with respect to a measure, in this case counting measure. The distribution corresponds to picking an element of S at random. Most classical, combinatorial probability models are based on underlying discrete uniform distributions. The chapter on Finite Sampling Models explores a number of such models.
• 5.23: The Semicircle Distribution
• 5.24: The Triangle Distribution
Like the semicircle distribution, the triangle distribution is based on a simple geometric shape. The distribution arises naturally when uniformly distributed random variables are transformed in various ways.
• 5.25: The Irwin-Hall Distribution
The Irwin-Hall distribution, named for Joseph Irwin and Philip Hall, is the distribution that governs the sum of independent random variables, each with the standard uniform distribution. It is also known as the uniform sum distribution. Since the standard uniform is one of the simplest and most basic distributions (and corresponds in computer science to a random number), the Irwin-Hall is a natural family of distributions. It also serves as a nice example of the central limit theorem.
• 5.26: The U-Power Distribution
The U-power distribution is a U-shaped family of distributions based on a simple family of power functions.
• 5.27: The Sine Distribution
The sine distribution is a simple probability distribution based on a portion of the sine curve. It is also known as Gilbert's sine distribution, named for the American geologist Grove Karl (GK) Gilbert who used the distribution in 1892 to study craters on the moon.
• 5.28: The Laplace Distribution
The Laplace distribution, named for Pierre Simon Laplace, arises naturally as the distribution of the difference of two independent, identically distributed exponential variables. For this reason, it is also called the double exponential distribution.
• 5.29: The Logistic Distribution
The logistic distribution is used for various growth models, and is used in a certain type of regression, known appropriately as logistic regression.
• 5.30: The Extreme Value Distribution
Extreme value distributions arise as limiting distributions for maximums or minimums (extreme values) of a sample of independent, identically distributed random variables, as the sample size increases. Thus, these distributions are important in probability and mathematical statistics.
• 5.31: The Hyperbolic Secant Distribution
The hyperbolic secant distribution is a location-scale family with a number of interesting parallels to the normal distribution. As the name suggests, the hyperbolic secant function plays an important role in the distribution, so we should first review some definitions.
• 5.32: The Cauchy Distribution
The Cauchy distribution, named of course for the ubiquitous Augustin Cauchy, is interesting for a couple of reasons. First, it is a simple family of distributions for which the expected value (and other moments) do not exist. Second, the family is closed under the formation of sums of independent variables, and hence is an infinitely divisible family of distributions.
• 5.33: The Exponential-Logarithmic Distribution
The exponential-logarithmic distribution arises when the rate parameter of the exponential distribution is randomized by the logarithmic distribution. The exponential-logarithmic distribution has applications in reliability theory in the context of devices or organisms that improve with age, due to hardening or immunity.
• 5.34: The Gompertz Distribution
The Gompertz distribution, named for Benjamin Gompertz, is a continuous probability distribution on [0,∞) that has exponentially increasing failure rate. Unfortunately, the death rate of adult humans increases exponentially, so the Gompertz distribution is widely used in actuarial science.
• 5.35: The Log-Logistic Distribution
As the name suggests, the log-logistic distribution is the distribution of a variable whose logarithm has the logistic distribution. The log-logistic distribution is often used to model random lifetimes, and hence has applications in reliability.
• 5.36: The Pareto Distribution
The Pareto distribution is a skewed, heavy-tailed distribution that is sometimes used to model the distribution of incomes and other financial variables.
• 5.37: The Wald Distribution
The Wald distribution, named for Abraham Wald, is important in the study of Brownian motion. Specifically, the distribution governs the first time that a Brownian motion with positive drift hits a fixed, positive value. In Brownian motion, the random position at a fixed time has a normal (Gaussian) distribution, and thus the Wald distribution, which governs the random time at a fixed position, is sometimes called the inverse Gaussian distribution.
• 5.38: The Weibull Distribution
The Weibull distribution is named for Waloddi Weibull. Weibull was not the first person to use the distribution, but was the first to study it extensively and recognize its wide use in applications. The standard Weibull distribution is the same as the standard exponential distribution. But as we will see, every Weibull random variable can be obtained from a standard Weibull variable by a simple deterministic transformation, so the terminology is justified.
• 5.39: Benford's Law
Benford's law refers to probability distributions that seem to govern the significant digits in real data sets. The law is named for the American physicist and engineer Frank Benford, although the law was actually discovered earlier by the astronomer and mathematician Simon Newcomb.
• 5.40: The Zeta Distribution
The zeta distribution is used to model the sizes or ranks of certain types of objects randomly chosen from certain types of populations. Typical examples include the frequency of occurrence of a word randomly chosen from a text, or the population rank of a city randomly chosen from a country. The zeta distribution is also known as the Zipf distribution, in honor of the American linguist George Zipf.
• 5.41: The Logarithmic Series Distribution
The logarithmic series distribution, as the name suggests, is based on the standard power series expansion of the natural logarithm function. It is also sometimes known more simply as the logarithmic distribution.
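
Several of the summaries above describe a distribution by how it is constructed from simpler variables. The following is a minimal simulation sketch of three of those constructions: the Rayleigh distribution as the magnitude of a two-dimensional normal vector, the Laplace distribution as the difference of two independent exponential variables, and the Irwin-Hall distribution as a sum of standard uniform variables. The specific choices here (standard normal coordinates, rate 1 exponentials, 12 summands, the sample size, the seed, and all function names) are assumptions made for illustration only.

```python
import math
import random
import statistics

rng = random.Random(12345)  # seeded generator so the run is reproducible
N = 100_000                 # number of simulated values per distribution


def rayleigh_draw():
    """Magnitude of a 2D vector with independent standard normal coordinates."""
    return math.hypot(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))


def laplace_draw():
    """Difference of two independent exponential variables with rate 1."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


def irwin_hall_draw(n):
    """Sum of n independent standard uniform variables."""
    return sum(rng.random() for _ in range(n))


rayleigh_sample = [rayleigh_draw() for _ in range(N)]
laplace_sample = [laplace_draw() for _ in range(N)]
irwin_hall_sample = [irwin_hall_draw(12) for _ in range(N)]

# Compare sample moments with the known theoretical values.
print("Rayleigh mean:", statistics.fmean(rayleigh_sample), "theory:", math.sqrt(math.pi / 2))
print("Laplace variance:", statistics.variance(laplace_sample), "theory:", 2.0)
print("Irwin-Hall mean:", statistics.fmean(irwin_hall_sample), "theory:", 12 / 2)
print("Irwin-Hall variance:", statistics.variance(irwin_hall_sample), "theory:", 12 / 12)
```

With 12 summands the Irwin-Hall distribution has variance 1, so subtracting 6 gives a variable that is roughly standard normal, which is one way to see the connection to the central limit theorem mentioned above.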