Statistical inference is concerned primarily with understanding the quality of parameter estimates. For example, a classic inferential question is, "How sure are we that the estimated mean, x̄, is near the true population mean, μ?" While the equations and details change depending on the setting, the foundations for inference are the same throughout all of statistics. We introduce these common themes in Sections 4.1-4.4 by discussing inference about the population mean, μ, and set the stage for other parameters and scenarios in Section 4.5. Some advanced considerations are discussed in Section 4.6. Understanding this chapter will make the rest of this book, and indeed the rest of statistics, seem much more familiar.
A plausible range of values for the population parameter is called a confidence interval. In this section, we will emphasize the special case where the point estimate is a sample mean and the parameter is the population mean.
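As a rough illustration of the special case described above, the following sketch computes an interval around a sample mean using the normal approximation. The function name `mean_confidence_interval`, the default 95% level, and the data are assumptions made for this example, and the approach presumes a sample large enough for the normal model to be reasonable.

```python
import math
import statistics


def mean_confidence_interval(sample, confidence=0.95):
    """Return (lower, upper) for the population mean via the normal approximation."""
    n = len(sample)
    x_bar = statistics.fmean(sample)              # point estimate of mu
    se = statistics.stdev(sample) / math.sqrt(n)  # estimated standard error
    # Critical value z* that leaves (1 - confidence) / 2 in each tail.
    z_star = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return x_bar - z_star * se, x_bar + z_star * se


# Hypothetical data, invented purely for illustration.
sample = [float(v) for v in range(1, 101)]
lower, upper = mean_confidence_interval(sample)
print(f"95% interval: ({lower:.2f}, {upper:.2f})")
```

The interval is centered on the point estimate, and asking for a higher confidence level widens it.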
Hypothesis testing involves formulating two hypotheses to test against the measured data: (1) the null hypothesis, which often represents either a skeptical perspective or a claim to be tested, and (2) the alternative hypothesis, which represents an alternative claim under consideration and is often represented by a range of possible parameter values. The skeptic will not reject the null hypothesis unless the evidence in favor of the alternative hypothesis is strong enough to warrant rejecting it.
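One way to make this framework concrete is a two-sided test of a population mean. The sketch below is a minimal illustration under the normal approximation, not a prescribed procedure: the function name `one_sample_z_test`, the data, and the null values are all hypothetical.

```python
import math
import statistics


def one_sample_z_test(sample, null_mean):
    """Return (z, p_value) for H0: mu = null_mean against HA: mu != null_mean."""
    n = len(sample)
    x_bar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    z = (x_bar - null_mean) / se  # standard errors between estimate and null value
    # Two-sided p-value: probability of a result at least this extreme if H0 holds.
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data, invented purely for illustration.
sample = [float(v) for v in range(1, 101)]
z, p_value = one_sample_z_test(sample, null_mean=45.0)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value indicates that the data would be unusual if the null hypothesis were true, which is the kind of strong evidence the skeptic requires before rejecting it.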
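The Central Limit Theorem states that the distribution of the sample mean is approximately normal when the sample consists of a sufficiently large number of independent observations. When the sample size is small, the normal approximation may not be very good. However, as the sample size becomes large, the normal approximation improves. We will investigate three cases to see roughly when the approximation is reasonable.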
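A small simulation can show the approximation improving with sample size. The sketch below draws repeated samples from a strongly skewed (exponential) population and measures the skewness of the resulting sample means, which should shrink toward zero, the value for a normal distribution. The choice of population, the three sample sizes, and the helper names are assumptions for illustration and are not the three cases examined in the text.

```python
import random
import statistics


def skewness(values):
    """Sample skewness: roughly 0 for a symmetric, bell-shaped distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])


def simulate_sample_means(n, reps=2000, seed=0):
    """Draw `reps` samples of size n from an exponential population; return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(reps)
    ]


# Illustrative sample sizes (an assumption of this sketch).
for n in (2, 10, 50):
    means = simulate_sample_means(n)
    print(f"n = {n:2d}: skewness of sample means = {skewness(means):.2f}")
```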
An unbiased estimate does not naturally over- or underestimate the parameter. Rather, it tends to provide a "good" estimate. The sample mean is an example of an unbiased point estimate, as is each of the examples we introduce in this section. Finally, we will discuss the general case where a point estimate may follow some distribution other than the normal distribution. We also provide guidance about how to handle scenarios where familiar statistical techniques are insufficient.
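The idea of unbiasedness can be checked informally by simulation: if the sample mean is unbiased, then the average of many sample means should land close to the population mean. The sketch below assumes a uniform population on [0, 10], whose mean is 5. The function name and all settings are hypothetical choices for illustration.

```python
import random
import statistics


def average_of_sample_means(n=10, reps=5000, seed=1):
    """Average the sample mean over many samples from a Uniform(0, 10) population."""
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.uniform(0, 10) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.fmean(means)


# The population mean is 5; the long-run average of sample means should be close to it.
print(f"average of sample means: {average_of_sample_means():.3f}")
```

Any single sample mean may miss the population mean, but the misses do not lean systematically high or low.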
The Type 2 Error rate and the magnitude of the error for a point estimate are controlled by the sample size. Real differences from the null value, even large ones, may be difficult to detect with small samples. If we take a very large sample, we might find a statistically significant difference, but the magnitude might be so small that it is of no practical value. In this section we describe techniques for selecting an appropriate sample size based on these considerations.
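To give a flavor of one such technique, the sketch below finds the smallest sample size for which the margin of error of a sample mean, z* × σ / √n, stays within a target value. It addresses only the margin of error, not the Type 2 Error rate, and it assumes the population standard deviation σ is known or can be estimated beforehand. The function name and the numbers are hypothetical.

```python
import math
import statistics


def sample_size_for_margin(sigma, margin, confidence=0.95):
    """Smallest n such that z* * sigma / sqrt(n) <= margin."""
    z_star = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    # Solve z* * sigma / sqrt(n) <= margin for n, then round up to a whole observation.
    return math.ceil((z_star * sigma / margin) ** 2)


# Hypothetical inputs: sigma = 10 and a target margin of error of 2.
print(sample_size_for_margin(sigma=10, margin=2))  # 97
```

Halving the target margin roughly quadruples the required sample size, because n grows with the square of σ / margin.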