8.2: Hypothesis Testing Framework
Now that we've built a foundation for understanding what a hypothesis is — and what it means to be wrong — we’re ready to outline the hypothesis testing process in full.
This process combines the ideas of Ronald Fisher (who emphasized using data to assess the evidence against a null hypothesis, \( H_0 \)) with those of Jerzy Neyman and Egon Pearson (who structured hypothesis testing as a formal decision-making process that balances Type I and Type II errors).
Here’s a five-step plan we’ll use for every hypothesis test. Whether we’re studying proportions, means, or comparing two groups, this structure remains the same — what will change is the type of test statistic we use.
An Outline of the Five Steps of a Hypothesis Test
- State the hypotheses
Clearly define the null hypothesis (\( H_0 \)) and the alternative hypothesis (\( H_A \)).
- \( H_0 \): often represents “no effect,” “no difference,” or “status quo.” This is always an "equals" statement.
- \( H_A \): what you are trying to find evidence for. This will be some kind of inequality (\( > \), \( < \), or \( \neq \)).
- Select the appropriate test statistic and distribution
Choose a statistical test that matches the structure of your question, define the relevant test statistic \( T \), and then calculate it using the sample data.
- Is this a test about one mean? Use a z or t test for a single sample mean.
- Is this comparing two means? Use a two-sample t test.
- Is this a test about one proportion or rate? Use a z test for proportions.
- For other types of tests, consult a list of test statistics.
- Your choice depends on the type of variable, the number of groups, the statistics you can calculate, and the sample size.
- Choose a significance level (α)
Set the allowable probability of making a Type I Error — in other words, the risk you're willing to take of rejecting the null when it's actually true.
Common values: \( \alpha = 0.05, 0.01, 0.10 \)
- Determine where the test statistic falls in the sampling distribution (calculate the p-value)
Determine how extreme the observed value is under the assumption that \( H_0 \) is true, either via a p-value or by using a critical value from a distribution table.
- Make a decision and state your conclusion
Use your comparison between \( t_{obs} \) and the distribution under \( H_0 \) to decide:
- Reject \( H_0 \) (if the evidence against \( H_0 \) is strong)
- Fail to reject \( H_0 \) (if the evidence against \(H_0\) isn't strong enough)
Always frame your decision in the context of the original research question.
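To make Steps 2 through 5 concrete before the worked example, here is a minimal Python sketch of a one-proportion z test. The counts (60 successes in 100 trials) and the hypotheses \( H_0: p = 0.50 \) versus \( H_A: p > 0.50 \) are invented for illustration:

```python
# Hypothetical setup: H0: p = 0.50 vs HA: p > 0.50,
# with 60 successes observed in 100 trials (made-up numbers).
import math

p0, n, successes = 0.50, 100, 60
p_hat = successes / n                       # sample proportion

# Step 2: test statistic for a one-proportion z test
z_obs = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Step 3: significance level
alpha = 0.05

# Step 4: one-sided p-value, P(Z >= z_obs) under the standard normal
p_value = 0.5 * math.erfc(z_obs / math.sqrt(2))

# Step 5: decision
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"z_obs = {z_obs:.2f}, p-value = {p_value:.4f}: {decision}")
```

With these numbers, \( z_{obs} = 2.0 \) and the p-value is about 0.023, so at \( \alpha = 0.05 \) we would reject \( H_0 \). The same five-step skeleton applies no matter which test statistic Step 2 selects.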
Example: “Winters are hotter today than when I was a kid”
A friend says, “I swear winters are warmer now than when we were kids.” Can we test this statistically?
You decide to gather historical winter temperatures from 20 years ago and compare them to modern-day winters in your region. You calculate the average January temperature today and compare it to 32 degrees F, the known average from the past.
Let’s walk through the structure:
- Step 1: \[ H_0: \mu = 32^\circ\text{F} \quad \text{(no warming)} \\ H_A: \mu > 32^\circ\text{F} \quad \text{(today’s winters are warmer)} \]
- Step 2: Use a one-sample t-test for a population mean (since you're comparing sample data to a known mean).
- Step 3: Choose \( \alpha = 0.05 \). Since a Type I error here (accidentally concluding that winters are warmer when they aren't) carries no serious consequences, we go with the default \( \alpha = 0.05 \).
- Step 4: Calculate \( t_{obs} \) from your sample data (mean, standard deviation, sample size). If we assume that the current mean temperature is still 32 degrees F, how likely or unlikely is our observed sample data?
- Step 5: Compare \( t_{obs} \) to the t-distribution and state your conclusion. If the probability of results at least this extreme is below \( \alpha = 0.05 \), we can conclude that winters are indeed warmer. If that probability is higher, we don't have strong enough evidence to say winters are warmer.
This structure helps us test the belief (“it’s hotter!”) using real evidence from data.
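The winter-temperature example can likewise be sketched in code. The January temperatures below are invented, and the one-sided critical value \( t_{0.05,\,9} \approx 1.833 \) comes from a standard t-table; this illustrates the mechanics, not real climate data:

```python
# Hypothetical January temperatures (°F) for n = 10 recent winters.
import math
import statistics

temps = [33.8, 35.1, 31.9, 36.2, 34.4, 33.0, 35.7, 32.8, 34.9, 33.5]
n = len(temps)
xbar = statistics.mean(temps)
s = statistics.stdev(temps)            # sample standard deviation

# Step 4: t statistic against H0: mu = 32
mu0 = 32.0
t_obs = (xbar - mu0) / (s / math.sqrt(n))

# Step 5: compare to the one-sided critical value t_{0.05, 9} from a t-table
t_crit = 1.833
if t_obs > t_crit:
    print(f"t_obs = {t_obs:.2f} > {t_crit}: reject H0, evidence winters are warmer")
else:
    print(f"t_obs = {t_obs:.2f} <= {t_crit}: fail to reject H0")
```

Here the sample mean is about 34.1 °F and \( t_{obs} \approx 4.9 \), well past the critical value, so under these made-up data we would reject \( H_0 \) and conclude the winters are warmer.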
What Comes Next
Now that we’ve outlined the general process, we’re going to build specific examples using data. We’ll start with hypothesis tests involving a single mean, and later move to tests comparing two means or working with proportions.
In each case, the framework we've outlined stays the same — we only change the specific test statistic and distribution used.