7.2: The Student’s T-Distribution
In a previous lesson, we saw that under certain conditions, the sampling distribution of sample means is normally distributed. We can use this information to standardize the sampling distribution of sample means and find probabilities and Z-scores from the standard normal distribution. We can do this if we know the population standard deviation. However, it is not common to know a population standard deviation. What happens when we don’t know the value of the population standard deviation?
When the Population Standard Deviation, σ, is Unknown
In most cases, we do not know the population standard deviation, \(\sigma\). The only option we have is to approximate it with the sample standard deviation, \(s\).
- Review: What conditions should be met to verify that the sampling distribution of sample means is approximately normal?
Recall, the mean of the sampling distribution of sample means is \(\mu_{\bar{x}}=\mu\). The standard error of the sampling distribution of sample means is \(\sigma_{\bar{x}}=\dfrac{\sigma}{\sqrt{n}}\). We substitute \(s\) for \(\sigma\) in the standard error formula since \(\sigma\) is unknown. So, the standard error of the sampling distribution of sample means is estimated by \[\dfrac{s}{\sqrt{n}}\nonumber\]
The test statistic for a sample mean is \(\dfrac{\bar{x}-\mu}{\dfrac{s}{\sqrt{n}}}\). The sample standard deviation, \(s\), varies from sample to sample and it only approximates \(\sigma\). Therefore, this substitution introduces additional variability into the test statistic. This added variability means that the distribution of the test statistic is no longer normal. We instead use the Student’s T-distribution.
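The extra variability can be seen in a small simulation. The sketch below is only an illustration: it assumes a standard normal population, a sample size of 5, and a cutoff of 1.96, none of which come from this lesson, and the function name `exceed_rates` is made up for the example. It counts how often the standardized sample mean lands beyond the cutoff when \(\sigma\) is known and when \(s\) is substituted for it.

```python
import math
import random
import statistics

def exceed_rates(n=5, trials=20000, mu=0.0, sigma=1.0, cutoff=1.96, seed=0):
    """Fraction of simulated samples whose statistic is beyond +/- cutoff,
    first using the known sigma, then using the sample standard deviation s."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    z_hits = 0
    t_hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.mean(sample)
        s = statistics.stdev(sample)       # divides by n - 1
        z = (xbar - mu) / (sigma / math.sqrt(n))   # sigma known
        t = (xbar - mu) / (s / math.sqrt(n))       # sigma estimated by s
        z_hits += abs(z) > cutoff
        t_hits += abs(t) > cutoff
    return z_hits / trials, t_hits / trials

z_rate, t_rate = exceed_rates()
print(f"known sigma: {z_rate:.3f}   estimated with s: {t_rate:.3f}")
```

With \(\sigma\) known, about 5% of the statistics fall beyond \(\pm 1.96\), as the standard normal distribution predicts. With \(s\) in its place, the share is noticeably larger, which is why the normal distribution no longer describes the test statistic.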
The T-Distribution
The T-distribution describes the variability of the test statistic, \(T=\dfrac{\bar{x}-\mu}{\dfrac{s}{\sqrt{n}}}\), when the sampling distribution of sample means is normal, and the sample standard deviation, \(s\), is used to estimate an unknown population standard deviation, \(\sigma\).
This test statistic is an estimate of how many standard errors the sample mean is from the hypothesized population mean.
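As a rough sketch, the test statistic can be computed directly from raw data. The measurements and the hypothesized mean of 12.0 below are invented for illustration, and `t_statistic` is a hypothetical helper name, not part of any tool used in this lesson.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """Number of estimated standard errors between the sample mean and mu0."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)           # sample standard deviation (n - 1 in the denominator)
    standard_error = s / math.sqrt(n)      # estimated standard error s / sqrt(n)
    return (xbar - mu0) / standard_error

# Hypothetical measurements, made up for this example
sample = [12.1, 11.4, 12.9, 12.3, 11.8, 12.6]
t = t_statistic(sample, mu0=12.0)
print(f"T = {t:.2f} with {len(sample) - 1} degrees of freedom")
```

A positive value means the sample mean lies above the hypothesized mean, and a negative value means it lies below.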
A T-distribution is a member of a family of continuous probability distributions. The width of a T-distribution depends on how much a sample standard deviation can vary. The amount of variability in a sample standard deviation depends on how many deviations vary freely when it is computed. Recall, the sample standard deviation is calculated using the formula
\[s=\sqrt{\dfrac{\sum_i\left(x_i-\bar{x}\right)^2}{n-1}}\nonumber\]
The sample standard deviation averages the squared deviations from the mean. When the deviations themselves are added together, they always add to zero. Because of this, the last deviation in the sum is not free: it is always the value that makes the resulting sum zero. There are \(n\) deviations from the mean, but only \(n-1\) of these are free deviations.
The variability of a sample standard deviation depends on the number of free deviations used to compute it, \(n-1\). This quantity is known as the degrees of freedom (d.f.). Each value of the degrees of freedom defines its own T-distribution.
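The idea of a deviation that is not free can be checked numerically. The following sketch uses a made-up sample of four values: once the first three deviations are known, the fourth is forced to be whatever makes the sum zero.

```python
import math
import statistics

sample = [4.0, 7.0, 9.0, 12.0]             # hypothetical data
xbar = statistics.mean(sample)
deviations = [x - xbar for x in sample]
total = sum(deviations)                     # deviations from the mean add to zero

# The first n - 1 deviations can be anything; the last one is determined by them.
free = deviations[:-1]
forced_last = -sum(free)

degrees_of_freedom = len(sample) - 1
s = math.sqrt(sum(d * d for d in deviations) / degrees_of_freedom)

print("sum of deviations:", total)
print("last deviation:", deviations[-1], "forced value:", forced_last)
print("degrees of freedom:", degrees_of_freedom, "s:", round(s, 4))
```

The value of `s` computed this way agrees with `statistics.stdev`, which also divides by \(n-1\).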
T-distributions have the following characteristics:
- T-distributions are bell-shaped and symmetric with a mean of 0.
- Each T-distribution depends on the degrees of freedom, d.f. (\(d.f.=n-1\)).
- T-distributions have heavier tails and lower peaks than the standard normal distribution.
- The total area under each T-distribution curve is 1.
- As the degrees of freedom increase, the tails become thinner and the curve approaches the standard normal distribution.
Images are created with the graphing calculator, used with permission from Desmos Studio PBC.
The fewer the degrees of freedom, the more the sample standard deviation varies. In other words, a smaller sample size corresponds to greater variability in a T-distribution. With a small sample, extreme values of the test statistic are more likely to be observed. This is reflected in the heavier tails of the T-distribution. As the sample size increases, the T-distribution approaches the standard normal distribution.
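These shape properties can be checked numerically without a graphing tool. The sketch below is one possible approach, not the method Desmos uses: it writes out the T-distribution density from its standard formula and approximates the area beyond \(T=2\) with Simpson's rule. The helper names `t_pdf` and `t_upper_tail`, the cutoff of 2, and the chosen degrees of freedom are all assumptions made for the illustration.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of the T-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t): 0.5 minus the Simpson's-rule area between 0 and t.
    steps must be even."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

normal = NormalDist()                       # standard normal: mean 0, sd 1
print("d.f.   peak height   P(T >= 2)")
for df in (2, 5, 30, 100):
    print(f"{df:>4}   {t_pdf(0.0, df):.4f}        {t_upper_tail(2.0, df):.4f}")
print(f"norm   {normal.pdf(0.0):.4f}        {1 - normal.cdf(2.0):.4f}")
```

As the degrees of freedom grow, the peak height rises toward the normal curve's peak and the tail area shrinks toward the normal tail area.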
Check for Understanding
- The heights of students at Claremont High School are approximately normally distributed. Suppose the average height of a student there is 166 centimeters.
- Is the sampling distribution of sample means better represented by the normal distribution or the T-distribution? Justify your answer.
- You find the average height and sample standard deviation for 10 randomly selected Claremont High School students. The sample standard deviation is 4 cm. Compute the test statistic for a sample average height of 168 cm. Round your answer to two decimal places.
- We will now use Desmos to find the likelihood of seeing an average height of 168 cm or more.
- Go to https://www.desmos.com/calculator
- The degrees of freedom d.f. in this example is \(n-1=10-1=9\). Type tdist(9). In general, type tdist(degrees of freedom).
- Click the “Zoom Fit” button.
- Check the “Find Cumulative Probability (CDF)” box.
- The min and max will default to \(-\infty\) and \(\infty\), respectively. Enter the test statistic from b. into the min field. The probability \(P(\bar{x} \geq 168)=P(T \geq \underline{\quad \ \ }) \approx 0.0743\).
- You find the average height and sample standard deviation for 10 randomly selected Claremont High School students. The sample standard deviation is 4 cm. Use the process in c. to compute the probability of observing an average height of 170 cm or more.
- If the average height of a student at Claremont High School is 166 centimeters, which of the following is more likely:
- a random sample of 15 students, with the average height being greater than 170 centimeters? OR
- a random sample of 25 students, with the average height being greater than 170 centimeters?
- Is the sampling distribution of sample means better represented by the normal distribution or the T-distribution? Justify your answer.
Justify your answer using what you know about the T-distribution.
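The Desmos steps for the 168 cm question can be mirrored in code as a cross-check. The sketch below is an assumption-laden stand-in for `tdist(9)` with the CDF box checked, not a description of how Desmos works internally: it integrates the T-distribution density with Simpson's rule, and the helper names `t_pdf` and `t_upper_tail` are invented here.

```python
import math

def t_pdf(x, df):
    """Density of the T-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t): 0.5 minus the Simpson's-rule area between 0 and t.
    steps must be even."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Values from the Claremont High School example
n, s, mu, xbar = 10, 4.0, 166.0, 168.0
t = round((xbar - mu) / (s / math.sqrt(n)), 2)   # test statistic, two decimal places
p = t_upper_tail(t, df=n - 1)                    # area to the right of t with 9 d.f.
print(f"P(T >= t) is approximately {p:.4f}")
```

The result should agree with the probability of about 0.0743 reported in the Desmos steps. Changing `xbar` or `n` reuses the same process for the other questions.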
Finding a Critical Value from a T-Distribution
- We will use Desmos to find the T critical value that separates the top 10% from the lower 90% for a random sample of size 8.
- Go to https://www.desmos.com/calculator
- The degrees of freedom \(d.f.=n-1=8-1=\)__________. The area to the left of the critical value is __________ (90%). Type tdist(7).inversecdf(0.9).
The T critical value here is approximately 1.415.
- Use Desmos to find the T critical values that separate the middle 95% from the two tails for a random sample of size 8. Show your thinking by sketching a graph.
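The call `tdist(7).inversecdf(0.9)` can be imitated with a simple search, which may help show what an inverse CDF does. The sketch below is one possible approach, assuming a bisection search over the interval from -50 to 50 and a Simpson's-rule CDF. The helper names `t_pdf`, `t_cdf`, and `t_inverse_cdf` are hypothetical and are not Desmos functions.

```python
import math

def t_pdf(x, df):
    """Density of the T-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(t, df, steps=2000):
    """Approximate P(T <= t): 0.5 plus the Simpson's-rule area between 0 and t.
    steps must be even."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_inverse_cdf(area_left, df, lo=-50.0, hi=50.0, tol=1e-6):
    """Bisection: narrow [lo, hi] until it pins down the value with area_left to its left."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < area_left:
            lo = mid        # not enough area yet, move right
        else:
            hi = mid        # too much area, move left
    return (lo + hi) / 2

critical = t_inverse_cdf(0.9, df=8 - 1)     # sample of size 8, so 7 degrees of freedom
print(f"T critical value: {critical:.3f}")
```

The search works because the CDF only increases as the cutoff moves to the right, so each step can discard half of the remaining interval.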