12.2: Hypotheses
In our earlier discussion of the scientific method, we presented the modern scientific method as an iterative five-step process based on observation, hypothesizing, testing, analysis, and drawing conclusions. As we shall see, statistical hypothesis testing encompasses the last four of these steps in a formal process. This section will focus on the second step of setting and identifying hypotheses.
A hypothesis in the abstract setting of the scientific method can be thought of as simply a statement about the universe that a researcher wishes to either prove or disprove. In the formal and practical setting of statistical hypothesis testing, such a hypothesis must be written in terms of a parameter of a population. Recall that a population is the collection of all the items or individuals that are of interest to the researcher. Each of the individuals has characteristics that can be measured, called variables, and a parameter was defined to be a summary of one or more of these characteristics computed on the measurements from the entire population. In statistics, a hypothesis is a specific statement about the value of an unknown parameter of a population.
A statistical hypothesis is a statement about the value of a parameter of a population.
As an example, let us once again consider the professor who has 250 exams. The population in this case corresponds to the grades of the 250 students on the exam. Recall that the professor needs to report to the chair of the department how well they think the class did on the exam. This is most succinctly reported as the mean of the 250 exam scores. The professor has not graded the exams and hence does not know what this mean value is. This unknown mean value is the parameter of the population that is of interest to the professor. Suppose the professor thinks that the students did well. For example, suppose they think that the mean grade on all 250 exams will be greater than 80 on a 100-point scale. This is an example of a statistical hypothesis.
Many researchers use standard statistical notation when specifying a hypothesis, which consists of the letter 𝐻, followed by a colon and then a statement of the hypothesis. For the example given above, the hypothesis would be represented as:
𝐻: The mean grade is greater than 80.
Other researchers use a more mathematical notation. In statistics the mean of a population is usually represented by the Greek letter mu, written \(\mu\). For the current example, the researcher would state that \(\mu\) represents the mean test grade, and the hypothesis would be represented as \(H:\mu>80\).
The scientific method emphasizes the statement of a hypothesis which is to be tested through observation. In the formal mathematical framework of statistics, two hypotheses are considered. These hypotheses are called the null and alternative hypotheses. The two hypotheses are set up as competitors to one another. That is, when one of the hypotheses is true, the other must be false, and in this sense statistical hypothesis testing is seen as a battle between the two that is decided by what is observed. This is a bit of a simplification of the process, and more details will be given as we proceed, but at this point it may be helpful to take a first look at setting up a statistical hypothesis test in order to understand how the null and alternative hypotheses are structured.
At the beginning of a statistical hypothesis test, the researcher sets the null hypothesis and the alternative hypothesis before any data has been observed. In the structure of the test, the null hypothesis is assumed to be true, then the data is observed. The alternative hypothesis will only be accepted as truth if substantial evidence is found in the data to indicate that the null hypothesis must be false. With this structure the null hypothesis has the comparative advantage. The researcher will assume the null is the truth, and will only abandon the null hypothesis if so much evidence is found against it that they must conclude that the alternative hypothesis is true.
Hence, at the end of a statistical hypothesis test there are two possible conclusions, and one of these conclusions is stronger and has more impact than the other. If the data are observed and substantial evidence is found against the null hypothesis, then it is rejected and the alternative hypothesis is accepted. This is a strong conclusion because the process started by assuming the null hypothesis was true, and so much evidence was found contradicting it that the researcher changed their mind and sided with the alternative hypothesis. On the other hand, if insufficient evidence is found in the observed data, then the researcher will stay with the null hypothesis. This is the weak conclusion. Essentially, the researcher is concluding that not enough evidence was found against the null, so they will stay with the null hypothesis.
In statistical hypothesis testing, the null hypothesis is initially assumed to be true. The statistical testing procedure will then determine how much evidence exists in the data contradicting the null hypothesis. The null hypothesis is usually denoted mathematically as \(H_0\).
In statistical hypothesis testing the alternative hypothesis is accepted if the null hypothesis is rejected. The alternative hypothesis is usually denoted mathematically as \(H_1\).
The structure described above provides some motivation for how a researcher usually chooses the null and alternative hypotheses. If a researcher is attempting to prove that a new theory is true, they would want to put that theory in the alternative hypothesis. In that way, if the null is rejected, their theory in the alternative hypothesis would be accepted as a strong conclusion. Essentially, they would be arguing that they began the study by assuming that the current theory was true and did an experiment. In the experiment they found so much evidence contradicting the old theory that they now believe the new theory is true. If the new theory had been put into the null hypothesis, the best they could conclude is that they did not find enough evidence against the new theory to reject it, which is a much weaker conclusion.
Returning to the example of the professor sampling five exams from the class to grade, we stated earlier that the professor believes and would like to show that the mean grade on the exam for the entire class is greater than 80. The professor would want to place this claim in the alternative hypothesis, and hence the null hypothesis would be that the mean grade for the entire class is less than or equal to 80. That is,
\(H_0:\text{The mean grade is less than or equal to 80},\)
and
\(H_1:\text{The mean grade is greater than 80},\)
or in mathematical notation
\(H_0:\mu\leq 80,\)
and
\(H_1:\mu > 80.\)
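As a rough illustration of how this pair of hypotheses leads to one of the two conclusions described earlier, the sketch below applies a one-sample, one-sided t-test to five exam grades. The grades, the function name `decide`, the 5% significance level, and the choice of a t-test are all assumptions made for illustration, since the text has not yet specified a testing procedure. The critical value 2.132 is the usual t-table entry for 4 degrees of freedom at a one-sided 5% level.

```python
import math
import statistics

MU_0 = 80.0         # boundary value in H0: mu <= 80
T_CRITICAL = 2.132  # one-sided 5% critical value, t distribution, 4 df (assumed level)


def decide(grades, mu_0=MU_0, t_critical=T_CRITICAL):
    """Return (t statistic, conclusion) for H0: mu <= mu_0 versus H1: mu > mu_0.

    The critical value must match len(grades) - 1 degrees of freedom;
    the default is only appropriate for samples of size five.
    """
    n = len(grades)
    sample_mean = statistics.mean(grades)
    standard_error = statistics.stdev(grades) / math.sqrt(n)
    t_statistic = (sample_mean - mu_0) / standard_error
    # H0 is assumed true and abandoned only if the evidence against it is large.
    if t_statistic > t_critical:
        return t_statistic, "reject H0 and accept H1 (strong conclusion)"
    return t_statistic, "fail to reject H0 (weak conclusion)"


# Hypothetical samples of five graded exams (invented for this sketch).
high_sample = [85, 90, 78, 92, 88]
borderline_sample = [82, 79, 85, 76, 84]

for sample in (high_sample, borderline_sample):
    t_stat, conclusion = decide(sample)
    print(f"sample mean = {statistics.mean(sample):.1f}: {conclusion}")
```

Both hypothetical samples have a mean above 80, but only the first lies far enough above 80, relative to its variability, for the null hypothesis to be rejected under these assumptions. The second sample illustrates the weak conclusion: the professor stays with \(H_0\) without having shown that it is true.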
Comparing an old belief or old theory in the null hypothesis to a new theory in the alternative hypothesis is just one intuitive way to think about the null hypothesis. One can also note that, in the null and alternative hypotheses specified for the exam scores, the concept of equality is contained in the null hypothesis. That is, the null hypothesis specifies that the mean is less than or equal to 80 (\(\mu\leq 80\)) while the alternative hypothesis specifies that the mean is strictly greater than 80 (\(\mu>80\)). This is consistent with the general rule that equality is always specified in the null hypothesis and not the alternative hypothesis. The reason for this has to do with the way risk is assessed in a statistical hypothesis test but is not important for our development here. The rule also applies when two or more groups are being compared. If a researcher is testing whether two gender identities have the same median salary in a geographical region, the hypothesis that the salaries are the same must be contained in the null hypothesis.
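The rule that equality belongs to the null hypothesis, together with the requirement that exactly one of the two hypotheses is true, can be sketched in a few lines of Python. The names `NULL_FOR_ALTERNATIVE`, `build_hypotheses`, and `holds` are hypothetical helpers introduced only for this illustration. They are one possible encoding and not a standard library interface.

```python
import operator

# Relation allowed in the alternative -> complementary relation for the null.
# Every null relation includes equality; no alternative relation does.
NULL_FOR_ALTERNATIVE = {">": "<=", "<": ">=", "!=": "=="}

RELATIONS = {
    ">": operator.gt, "<": operator.lt, "!=": operator.ne,
    "<=": operator.le, ">=": operator.ge, "==": operator.eq,
}


def build_hypotheses(alternative_relation, value):
    """Return (H0, H1) as (relation, value) pairs for a claim such as mu > 80."""
    if alternative_relation not in NULL_FOR_ALTERNATIVE:
        raise ValueError("the alternative must be one of '>', '<', '!='")
    null_relation = NULL_FOR_ALTERNATIVE[alternative_relation]
    return (null_relation, value), (alternative_relation, value)


def holds(hypothesis, mu):
    """Check whether a candidate parameter value mu satisfies the hypothesis."""
    relation, value = hypothesis
    return RELATIONS[relation](mu, value)


# The professor's claim, mu > 80, goes in the alternative.
h0, h1 = build_hypotheses(">", 80)
print("H0: mu", *h0)  # H0: mu <= 80
print("H1: mu", *h1)  # H1: mu > 80

# For any candidate mean, exactly one of the two hypotheses is true.
for mu in (60, 79.9, 80, 80.1, 95):
    assert holds(h0, mu) != holds(h1, mu)
```

Passing `"!="` instead of `">"` gives the two-sided case, where the null hypothesis is pure equality, as in the comparison of two groups' median salaries (with the parameter taken to be the difference between the two medians and the value set to 0).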
Now let us consider an example from a real study. As we have observed in several studies so far, social justice researchers have an interest in racial and ethnic disparities in the sentencing of individuals convicted of a crime. Recent research has started to consider how these disparities are shaped by the larger community. The racial threat theory states that defendants of color receive more severe punishments in places where they represent a larger or growing share of the population. According to the theory, this is due to their potential threat to white positions of power (Blalock 1967). A recent study explored the effects of racial, ethnic, and immigrant threat on sentence disposition and sentence length (Feldmeyer et al. 2015).
The study is based on individual-level sentencing data, county-level data, and sentencing data taken from the Florida Department of Corrections Sentencing Guidelines database and its offender-based information system. The final sample included 501,027 black, Latino, and white defendants convicted of felonies and sentenced in Florida from 2000 to 2006. The composition and socioeconomic characteristics of the 67 Florida counties are taken from the 1990 and 2000 U.S. Census and the Uniform Crime Report.
The study considers two hypotheses, as stated in the research article. The first hypothesis is stated as (Feldmeyer et al. 2015):
Hypothesis 1: Racial/ethnic threat effects—black–white and Latino–white disparities in sentencing are larger in places with growing black and Latino populations,
and the second hypothesis is stated as,
Hypothesis 2: Immigrant threat effects—Latino–white (and possibly black–white) disparities in sentencing are larger in places with growing immigrant populations.
These are the effects that the authors want to demonstrate, and therefore these are the alternative hypotheses in the statistical tests that they use in the research article. Note that both hypotheses also imply inequality, which is consistent with the rule that the null hypothesis always contains the concept of equality or no change.
It is worth noting that an analogous form of hypothesis testing operates in the United States justice system. When a defendant is brought to trial, the jury is told that they must assume that the defendant is innocent. Based on the evidence revealed at the trial, the jurors must decide whether there is enough evidence to conclude, beyond a reasonable doubt, that the defendant is guilty. At the end of the trial the defendant can be found guilty, in which case sufficient evidence was found, or not guilty, in which case not enough evidence was found. The null hypothesis can be considered to be that the defendant is innocent, and the alternative is that the defendant is guilty. The strong conclusion, which is that there is evidence beyond a reasonable doubt, corresponds to rejecting the null hypothesis and accepting the alternative hypothesis. If not enough evidence is found, the verdict is the analog of failing to reject the null hypothesis. Note that the strong conclusion is the one in which the defendant is found guilty. The idea is that someone will only be punished in cases where there is so much evidence that a reasonable person would believe the defendant committed the crime.

