Statistics LibreTexts

7: Estimation

    If we wish to estimate the mean μ of a population for which a census is impractical, say the average height of all 18-year-old men in the country, a reasonable strategy is to take a sample, compute its mean x̄, and estimate the unknown number μ by the known number x̄. For example, if the average height of 100 randomly selected men aged 18 is 70.6 inches, then we would say that the average height of all 18-year-old men is (at least approximately) 70.6 inches.

    Estimating a population parameter by a single number like this is called point estimation; in the case at hand the statistic x̄ is a point estimate of the parameter μ. The terminology arises because a single number corresponds to a single point on the number line.

    A problem with a point estimate is that it gives no indication of how reliable the estimate is. In contrast, in this chapter we learn about interval estimation. In brief, in the case of estimating a population mean μ we use a formula to compute from the data a number E, called the margin of error of the estimate, and form the interval [x̄ − E, x̄ + E]. We do this in such a way that a certain proportion, say 95%, of all the intervals constructed from sample data by means of this formula contain the unknown parameter μ. Such an interval is called a 95% confidence interval for μ.
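
    The "95% of all intervals" interpretation can be checked with a small simulation. The sketch below is illustrative only. It assumes a normal population with hypothetical values μ = 70 and σ = 1.7, and it assumes the large-sample margin of error E = z·s/√n, which this introduction does not state explicitly. It draws many samples, builds an interval from each, and reports the fraction of intervals that contain μ.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, stdev


def coverage(mu=70.0, sigma=1.7, n=100, trials=2000, confidence=0.95, seed=0):
    """Fraction of simulated confidence intervals that contain mu.

    mu and sigma are hypothetical population values chosen for illustration.
    The margin of error uses the assumed large-sample formula z * s / sqrt(n).
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        e = z * stdev(sample) / sqrt(n)  # margin of error from this sample
        if x_bar - e <= mu <= x_bar + e:
            hits += 1
    return hits / trials


print(f"fraction of intervals containing mu: {coverage():.3f}")
```

    Under these assumptions the printed fraction should land close to 0.95. Any single interval either contains μ or does not; the 95% describes the long-run behavior of the procedure.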

    Continuing with the example of the average height of 18-year-old men, suppose that the sample of 100 men mentioned above for which x̄ = 70.6 inches also had sample standard deviation s = 1.7 inches. It then turns out that E = 0.33 and we would state that we are 95% confident that the average height of all 18-year-old men is in the interval formed by 70.6 ± 0.33 inches, that is, the average is between 70.27 and 70.93 inches. If the sample statistics had come from a smaller sample, say a sample of 50 men, the lower reliability would show up in the 95% confidence interval being wider, hence less precise in its estimate. In this example the 95% confidence interval for the same sample statistics but with n = 50 is 70.6 ± 0.47 inches, or from 70.13 to 71.07 inches.
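
    The two margins of error quoted above can be reproduced with a few lines of code. This is a minimal sketch that assumes the large-sample formula E = z·s/√n with z ≈ 1.96 for 95% confidence. The introduction does not state the formula, but this one is consistent with the values 0.33 and 0.47. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(s, n, confidence=0.95):
    """Assumed large-sample margin of error: E = z * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return z * s / sqrt(n)


def confidence_interval(x_bar, s, n, confidence=0.95):
    """Return the endpoints (x_bar - E, x_bar + E)."""
    e = margin_of_error(s, n, confidence)
    return x_bar - e, x_bar + e


# Sample statistics from the height example: x_bar = 70.6, s = 1.7
for n in (100, 50):
    e = margin_of_error(1.7, n)
    low, high = confidence_interval(70.6, 1.7, n)
    print(f"n = {n}: E = {e:.2f}, interval = [{low:.2f}, {high:.2f}]")
```

    Halving the sample size multiplies E by √2 (about 1.41), which is why the interval for n = 50 is wider than the one for n = 100.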

    • 7.1: Large Sample Estimation of a Population Mean
      A confidence interval for a population mean is an estimate of the population mean together with an indication of reliability. There are different formulas for a confidence interval based on the sample size and whether or not the population standard deviation is known. The confidence intervals are constructed entirely from the sample data (or sample data and the population standard deviation, when it is known).
    • 7.2: Small Sample Estimation of a Population Mean
      In selecting the correct formula for construction of a confidence interval for a population mean ask two questions: is the population standard deviation σ known or unknown, and is the sample large or small? We can construct confidence intervals with small samples only if the population is normal.
    • 7.3: Large Sample Estimation of a Population Proportion
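      We have a single formula for a confidence interval for a population proportion, which is valid when the sample is large. Here "large" does not mean that the sample size n is at least 30. It means that the sampling distribution of the sample proportion p̂ fits inside the interval [0,1], which in practice is checked by verifying that the interval p̂ ± 3√(p̂(1 − p̂)/n) lies wholly within [0,1].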
    • 7.4: Sample Size Considerations
      Sampling is typically done with a set of clear objectives in mind. Since sampling costs time, effort, and money, it is useful to be able to estimate the smallest sample size that is likely to meet those objectives.
    • 7.E: Estimation (Exercises)
      These are homework exercises to accompany the Textmap created for "Introductory Statistics" by Shafer and Zhang.

    7: Estimation is shared under a CC BY-NC-SA 3.0 license and was authored, remixed, and/or curated by via source content that was edited to conform to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.