2.3: Confidence Intervals
At the end of this section you should be able to answer the following questions:
- What is a Confidence Interval?
- How is standard error related to a Confidence Interval?
In many statistical papers, you will see Confidence Intervals (or CIs) reported, usually at the 95% threshold. But what does that actually mean?
To understand a CI, start with the “point estimate”: a statistic calculated from a sample. The statistic most often encountered first in research is the sample mean, although many other statistics, such as effect size estimates, can serve as the point estimate within a CI.
What is useful about a CI is that its range, from the lowest value to the highest value, indicates how much error is involved in estimating the population value from the “point estimate” or sample statistic.
Keep in mind that the standard error is a basic measure of variability that accounts for how much error exists when relating the sample results to the true value of a parameter in a population. In our example below, the standard error is the standard deviation divided by the square root of the sample size. Therefore, a larger sample will result in a smaller standard error when estimating the population value of a parameter. This makes sense because if you have more cases in a sample from a population, your estimate is likely to be closer to the true value of the parameter in that population.
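The relationship between sample size and standard error can be sketched in a few lines of Python. This is a minimal illustration, not part of the fathering example: the standard deviation of 4.5 and the function name `standard_error` are assumptions chosen only to show the calculation.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)


sd = 4.5  # hypothetical sample standard deviation (not taken from the text)

se_small = standard_error(sd, 42)   # a sample of 42 cases
se_large = standard_error(sd, 168)  # four times as many cases

print(f"n = 42:  SE = {se_small:.3f}")
print(f"n = 168: SE = {se_large:.3f}")
```

Because the sample size sits under a square root, quadrupling the number of cases halves the standard error rather than cutting it to a quarter.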
For example, let’s say we have a sample of data from 42 fathers, where they provide ratings of their confidence in fathering as well as the time spent with their children. Here are the “point estimates” with standard errors and CIs.
Confidence in Fathering where the mean = 32.50, standard error = .70, CI = [31.20, 33.90]
Time Spent With Children where the mean = 8.00, standard error = .20, CI = [7.70, 8.30]
Notice that the range in the first CI is 2.7, which comes from the distance between the upper and lower limits of 31.20 and 33.90.
In comparison, the range in the second CI is 0.6, which comes from the distance between the upper and lower limits of 7.70 and 8.30.
The basic idea here is that when the standard error is larger, the confidence interval is wider, which means that our point estimate is less precise.
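The two ranges quoted above can be checked directly from the reported limits. The short sketch below uses only the lower and upper limits given for the two fathering measures. The helper name `ci_width` is an assumption introduced for illustration.

```python
def ci_width(lower: float, upper: float) -> float:
    """Distance between the upper and lower limits of a confidence interval."""
    return upper - lower


# Limits as reported in the fathering example
confidence_width = ci_width(31.20, 33.90)  # Confidence in Fathering
time_width = ci_width(7.70, 8.30)          # Time Spent With Children

# Rounded to two decimals to hide floating-point noise
print(round(confidence_width, 2))  # 2.7
print(round(time_width, 2))        # 0.6
```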
CIs are built around a chosen level of confidence, that is, a set probability attached to the procedure used to construct the interval. This is needed because, from one sample alone, there is no way to know the value of the true population mean. Therefore, we state how confident we can be that an interval constructed in this way captures the true population value of the mean.
As a general rule, researchers like to state the confidence level as 95%, which corresponds to a significance level of .05. However, CIs can be based on other significance levels, such as .01 and .10, which give 99% and 90% CIs respectively. If 95% CIs were calculated from many repeated samples from the population, the population mean would fall within the limits of the CI in 95% of the samples, and the remaining 5% of the CIs would not capture the population mean. Put simply, if you were to sample the population 100 times and calculate a 95% CI each time, about 95 of those intervals would contain the population mean.
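The repeated-sampling idea can be illustrated with a small simulation. The sketch below rests on several assumptions that are not in the text: a normally distributed population with a hypothetical mean of 32.5 and standard deviation of 4.5, samples of 50 cases, and a normal (z) critical value in place of the slightly larger t value. Because of that last shortcut, the observed share may come out a little under 95%.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN = 32.5  # assumed population mean (hypothetical)
TRUE_SD = 4.5     # assumed population standard deviation (hypothetical)
N = 50            # cases per sample
REPEATS = 2000    # number of repeated samples

# Critical value for a 95% interval under the normal approximation (about 1.96)
Z = statistics.NormalDist().inv_cdf(0.975)

captured = 0
for _ in range(REPEATS):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(N)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)  # SD divided by sqrt(n)
    lower, upper = mean - Z * se, mean + Z * se
    if lower <= TRUE_MEAN <= upper:
        captured += 1  # this interval contains the population mean

coverage = captured / REPEATS
print(f"Share of intervals capturing the population mean: {coverage:.3f}")
```

Each pass through the loop stands for one study. The population mean stays fixed and the intervals move from sample to sample, which is why the 95% describes the intervals rather than the sample means.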
CIs can also be calculated for effect size measures such as a difference between means (as tested with a t-test) or a correlation coefficient. An interesting fact about effect size point estimates based on the normal distribution is that when the interval between the lower and upper limits of the CI includes 0, the effect measured by the point estimate is not statistically significant at the corresponding significance level (.05 for a 95% CI).
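The zero check is simple enough to express as a one-line test. In the sketch below, the two intervals are hypothetical 95% CIs for correlation coefficients, invented only for illustration, and `includes_zero` is an assumed helper name.

```python
def includes_zero(lower: float, upper: float) -> bool:
    """Return True when 0 lies between the lower and upper limits."""
    return lower <= 0.0 <= upper


# Hypothetical 95% CIs for two correlation coefficients (not from the text)
print(includes_zero(-0.05, 0.41))  # True: 0 lies inside the interval
print(includes_zero(0.12, 0.55))   # False: the whole interval is above 0
```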
To properly construct a CI, a sample should be randomly selected from the population, or the researchers must assume that a convenience sample adequately represents the population, with results similar to what would have been observed had a true random sample been used. In addition, it must be assumed that the population has a normal distribution for the variable of interest, or at least an approximately normal distribution.