
6.4: Estimating the Binomial with the Normal Distribution


    We found earlier that various probability density functions are the limiting distributions of others; thus, we can estimate one with another under certain circumstances. We will find here that the normal distribution can be used to estimate a binomial process. The Poisson was used to estimate the binomial previously, and the binomial was used to estimate the hypergeometric distribution.

    In the case of the relationship between the hypergeometric and binomial distributions, we had to recognize that a binomial process assumes the probability of a success remains constant from trial to trial: a head on the last flip has no effect on the probability of a head on the next flip. In the hypergeometric distribution this assumption is precisely what fails, because the experiment draws without replacement; every subsequent "draw" is a conditional probability. We found, however, that if the hypergeometric experiment draws only a small percentage of the total objects, we can ignore the changing probability from draw to draw.

    Imagine that there are 312 cards in a deck composed of 6 normal decks. If the experiment called for drawing only 10 cards, less than 5% of the total, then we will accept the binomial estimate of the probability, even though this is actually a hypergeometric distribution because the cards are presumably drawn without replacement.

    The Poisson likewise was considered an appropriate estimate of the binomial under certain circumstances. In Chapter 4 we found that if the number of trials of interest is large and the probability of success is small, such that \(\mu = np < 7\), the Poisson can be used to estimate the binomial with good results. Again, these rules of thumb do not in any way claim that the actual probability is what the estimate determines, only that the difference appears in the third or fourth decimal place and is thus de minimis.
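    This rule of thumb can be checked numerically. The sketch below (the parameters \(n = 60\) and \(p = 0.1\), giving \(\mu = np = 6 < 7\), are chosen purely for illustration) compares the binomial pmf against the Poisson pmf pointwise and reports the worst-case difference:

    ```python
    from math import comb, exp, factorial

    # Illustrative parameters: many trials, small p, so that mu = n*p = 6 < 7.
    n, p = 60, 0.1
    mu = n * p

    def binom_pmf(k, n, p):
        """Binomial probability of exactly k successes in n trials."""
        return comb(n, k) * p**k * (1 - p)**(n - k)

    def poisson_pmf(k, mu):
        """Poisson probability of exactly k events with mean mu."""
        return exp(-mu) * mu**k / factorial(k)

    # Largest pointwise difference between the two pmfs over all possible k.
    max_diff = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, mu)) for k in range(n + 1))
    print(f"mu = {mu:.1f}, worst-case pmf difference = {max_diff:.4f}")
    ```

    For these parameters the largest discrepancy sits in the third decimal place, consistent with the "de minimis" claim above.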

    Here, again, we find that the normal distribution makes particularly accurate estimates of a binomial process under certain circumstances. Figure \(\PageIndex{1}\) is a frequency distribution of a binomial process for the experiment of flipping three coins, where the random variable is the number of heads. The sample space is listed below the distribution. The experiment assumed that the probability of a success is 0.5; the probability of a failure, a tail, is thus also 0.5. In observing Figure \(\PageIndex{1}\) we are struck by the fact that the distribution is symmetrical. The root of this result is that the probabilities of success and failure are the same, 0.5. If the probability of success were smaller than 0.5, the distribution would be skewed right. Indeed, as the probability of success diminishes, the degree of skewness increases. If the probability of success increases from 0.5, then the skewness increases in the lower tail, resulting in a left-skewed distribution.

    Figure \(\PageIndex{1}\): Histogram of the binomial distribution for flipping three coins, where \(X\) is the number of heads. The bar heights are \(P(X=0) = 1/8\), \(P(X=1) = 3/8\), \(P(X=2) = 3/8\), and \(P(X=3) = 1/8\). The sample space is \(S = \{HHH, HHT, HTH, THH, TTT, TTH, THT, HTT\}\).
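    The three-coin distribution can be recovered directly by enumerating the sample space, which also confirms the binomial formula. This is a minimal sketch; the variable names are illustrative:

    ```python
    from itertools import product
    from math import comb

    # Enumerate all 8 outcomes of three coin flips and tally the number of heads.
    counts = {k: 0 for k in range(4)}
    for flips in product("HT", repeat=3):
        counts[flips.count("H")] += 1

    for k in range(4):
        empirical = counts[k] / 8                      # frequency over the 8 outcomes
        formula = comb(3, k) * 0.5**k * 0.5**(3 - k)   # binomial pmf with n = 3, p = 0.5
        print(f"P(X = {k}) = {counts[k]}/8 = {empirical:.3f}  (binomial formula: {formula:.3f})")
    ```

    Both routes give the symmetric 1/8, 3/8, 3/8, 1/8 pattern shown in the figure.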

    The reason the skewness of the binomial distribution is important is that if it is to be estimated with a normal distribution, we need to recognize that the normal distribution is symmetrical. The closer the underlying binomial distribution is to being symmetrical, the better the estimate produced by the normal distribution. Figure \(\PageIndex{2}\) shows a symmetrical normal distribution transposed on a graph of a binomial distribution where \(p = 0.2\) and \(n = 5\). The discrepancy between the estimated probability using a normal distribution and the probability of the original binomial distribution is apparent. The criterion for using a normal distribution to estimate a binomial thus addresses this problem by requiring that BOTH \(np\) AND \(n(1-p)\) be greater than five. Again, this is a rule of thumb, but it is effective and results in acceptable estimates of the binomial probability.

    Figure \(\PageIndex{2}\): Histogram of a binomial distribution with \(p = 0.2\) and \(n = 5\), with bar heights 0.3277, 0.4096, 0.2048, 0.0512, 0.0064, and 0.0003 at \(X = 0\) through \(5\), and a normal curve with mean \(\mu = 1\) superimposed.
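    The rule of thumb just stated is easy to encode. A small sketch (the function name is ours, not standard terminology) shows that the skewed case of Figure \(\PageIndex{2}\) fails the check while a case like \(n = 100\), \(p = 0.1\) passes it:

    ```python
    def normal_approx_ok(n, p):
        """Rule-of-thumb check: both np and n(1-p) must exceed 5."""
        return n * p > 5 and n * (1 - p) > 5

    # The visibly skewed distribution of Figure 2 fails the check (np = 1)...
    print(normal_approx_ok(5, 0.2))
    # ...while n = 100, p = 0.1 passes it (np = 10, n(1-p) = 90).
    print(normal_approx_ok(100, 0.1))
    ```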

    Exercise \(\PageIndex{1}\)

    Imagine that it is known that only 10% of Australian Shepherd puppies are born with what is called "perfect symmetry" in their three colors, black, white, and copper. Perfect symmetry is defined as equal coverage on all parts of the dog when looked at in the face and measuring left and right down the centerline. A kennel would have a good reputation for breeding Australian Shepherds if it had a high percentage of dogs that met this criterion. During the past 5 years, 16 of the 100 dogs born to Dundee Kennels had this coloring characteristic.

    What is the probability that, in 100 births, more than 16 would have this characteristic?

    Answer

    If we assume that one dog's coloring is independent of other dogs' coloring, a bit of a brave assumption, this becomes a classic binomial probability problem.

    The statement of the probability requested is 1 − [p(X = 0) + p(X = 1) + p(X = 2) + … + p(X = 16)]. This requires us to calculate 17 binomial formulas, add them together, and then subtract from one to get the right-hand tail of the distribution. Alternatively, we can use the normal distribution to get an acceptable answer in much less time.

    First, we need to check if the binomial distribution is symmetrical enough to use the normal distribution. We know that the binomial for this problem is skewed because the probability of success, 0.1, is not the same as the probability of failure, 0.9. Nevertheless, both \(np = 10\) and \(n(1-p) = 90\) are larger than 5, the cutoff for using the normal distribution to estimate the binomial.

    Figure \(\PageIndex{3}\) below shows the binomial distribution and marks the area we wish to know. The mean of the binomial, 10, is also marked, and the standard deviation is written on the side of the graph: \(\sigma = \sqrt{npq} = 3\). The area under the distribution beyond 16 is the probability requested, and has been shaded in. Below the binomial distribution is a normal distribution to be used to estimate this probability; the corresponding area has also been shaded.

    Figure \(\PageIndex{3}\): Histogram of a binomial distribution with \(p = 0.1\) and \(n = 100\), with the bars beyond 16 shaded. Below it is a normal distribution with mean \(\mu = 10\), with the area for \(x > 16\) shaded to match, and the standardizing calculation \(z_1 = (x - \mu)/\sigma = (16 - 10)/3 = 2\).

    Standardizing from the binomial to the normal distribution, as we have done before, shows that we are asking for the probability from 16 to positive infinity (or 100, the maximum possible, in this case). We need to calculate the number of standard deviations 16 is away from the mean of 10:
    \[
    Z=\frac{x-\mu}{\sigma}=\frac{16-10}{3}=2
    \]

    We are asking for the probability beyond two standard deviations, a very unlikely event. We look up two standard deviations in the standard normal table and find that the area from zero to two standard deviations is 0.4772. We are interested in the tail, however, so we subtract 0.4772 from 0.5 to find the area in the tail. Our conclusion is that the probability of a kennel having more than 16 dogs out of 100 with "perfect symmetry" is 0.0228. Dundee Kennels has an extraordinary record in this regard.

    Mathematically, we write this as:
    \[
    1-[p(X=0)+p(X=1)+p(X=2)+\ldots+p(X=16)]=p(X>16)=p(Z>2)=0.0228
    \]
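    Both routes described above, the 17-term exact binomial calculation and the normal shortcut, can be sketched in a few lines of pure Python to see how close the estimate comes (the standard normal cdf is built from `math.erf`; no statistics library is assumed):

    ```python
    from math import comb, erf, sqrt

    n, p = 100, 0.1
    mu, sigma = n * p, sqrt(n * p * (1 - p))   # mu = 10, sigma = 3

    # Exact binomial tail: P(X > 16) = 1 - sum of p(X = k) for k = 0..16.
    exact = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(17))

    # Normal estimate: P(Z > (16 - 10)/3) = P(Z > 2), via the standard normal cdf.
    z = (16 - mu) / sigma
    normal_est = 1 - 0.5 * (1 + erf(z / sqrt(2)))

    print(f"exact binomial tail  = {exact:.4f}")
    print(f"normal approximation = {normal_est:.4f}")
    ```

    The normal estimate reproduces the 0.0228 from the table, and the exact 17-term sum lands in the same neighborhood, which is exactly the "acceptable estimate" the rule of thumb promises.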


    6.4: Estimating the Binomial with the Normal Distribution is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
