Statistics LibreTexts

1.6: Chi-Squared Test


    Hypothesis Testing

    When evaluating social phenomena, especially regarding crime, people often make judgments. However, such judgments are not always correct. Researchers conducting empirical studies make tentative statements about phenomena and then decide whether those statements are valid. These tentative statements are called hypotheses, and a researcher who rejects a hypothesis that is in fact true commits an error in judgment.

    When researchers use sample statistics to draw conclusions about unknown attributes of a population, they must decide whether a claim about the population holds. Because a sample never represents the population perfectly, such decisions can sometimes be wrong. In other words, researchers may make errors in judgment.

    Specifically, hypothesis testing entails comparing empirically observed sample findings with the theoretically expected outcomes if the null hypothesis were true. The null hypothesis represents the hypothesis that a researcher aims to reject, thereby supporting its alternative. This hypothesis often posits that two or more variables are not related. To compare the null and alternative hypotheses, the researcher calculates the probability of the observed outcome occurring solely due to chance or random error (Vogt & Johnson, 2011).

    NHST Steps

    This approach to hypothesis testing is also known as null hypothesis significance testing, or NHST. When conducting NHST, it is recommended to follow these five steps:

    • Step 1: Formulate the null and alternative hypotheses.
    • Step 2: Calculate the test statistic.
    • Step 3: Determine the probability (p-value) of obtaining a test statistic at least as extreme as the observed value, assuming no relationship exists.
    • Step 4: If the p-value is very small, typically less than 5%, reject the null hypothesis.
    • Step 5: If the p-value is not small, typically 5% or greater, retain the null hypothesis.
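
The decision rule in Steps 4 and 5 can be sketched in R as follows. The helper name `nhst_decision` and its `alpha` argument are hypothetical, introduced here only for illustration; the 5% threshold is the conventional cutoff named in the steps above.

```r
# Hypothetical helper: apply the decision rule from Steps 4 and 5.
nhst_decision <- function(p_value, alpha = 0.05) {
  stopifnot(is.numeric(p_value), p_value >= 0, p_value <= 1)
  # Step 4: reject H0 when p is below alpha; Step 5: otherwise retain H0.
  if (p_value < alpha) "reject H0" else "retain H0"
}

nhst_decision(0.03)
nhst_decision(0.46)
```

Note that a p-value exactly equal to the threshold falls under Step 5 ("5% or greater"), so the comparison uses a strict `<`.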

    Chi-Squared Test

    The first hypothesis test covered in this book is the chi-squared test. In its one-sample form (also called the goodness-of-fit test), it compares the proportion of cases in each category of a sample with either hypothesized values or values previously obtained from a comparison population. This form requires only one categorical variable in the data file and a designated proportion against which to evaluate the observed frequencies. It may assess whether the proportions are equal across categories (e.g., 50%/50%) or whether they match specific proportions derived from a previous study. When two categorical variables are cross-tabulated, as in the example below, the chi-squared test instead assesses whether the two variables are associated.

    For instance, in our example, 44 male respondents and 62 female respondents used credit cards for online purchases. From this tabulation, we might be tempted to conclude that more female respondents than male respondents use credit cards for online purchases, that is, that usage differs between the two groups. However, the tabulation alone does not tell us whether this difference is statistically significant or merely due to chance. That is why we conduct a hypothesis test to evaluate the finding with statistical evidence. Below, we will compare men's and women's use of credit cards for online purchases in the past 12 months using the NCVS data from the last chapter.
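
As a minimal sketch of the one-sample form of the test, the two counts quoted above can be compared against a 50%/50% split. This is an illustration only, not the analysis carried out in this chapter: a 50%/50% benchmark is meaningful only if men and women are equally represented in the sample, which is an assumption made here and not stated in the text. The object names `observed` and `gof` are hypothetical.

```r
# Counts quoted in the text: credit card users among men and women.
observed <- c(men = 44, women = 62)

# One-sample (goodness-of-fit) test against an assumed 50%/50% split.
gof <- chisq.test(observed, p = c(0.5, 0.5))

gof$expected   # expected count in each category under H0
gof$statistic  # sum((observed - expected)^2 / expected)
gof$p.value    # probability of a statistic at least this large under H0
```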

    NHST Steps for Chi-Squared Test

    Step 1: Formulate the Null and Alternative Hypotheses.

    The first step in conducting the chi-squared test is to write the null and alternative hypotheses.

    • H0: The usage of credit cards for online purchases in the past 12 months is the same across men and women.
    • HA: The usage of credit cards for online purchases in the past 12 months is not the same across men and women.

    Step 2: Calculate the Test Statistic.

    The test statistic used to examine a relationship between two categorical variables is the chi-squared statistic, χ². First, download the revised data from the 2016 NCVS from the shared Google Drive. Then load the data using the following syntax.

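    # Load the haven package, which reads Stata (.dta) files
    library(haven)

    # Read the revised 2016 NCVS data (change the path to where you saved the file)
    rNCVS2016 <- read_dta("C:/Users/75JCHOI/OneDrive - West Chester University of PA/WCU Research/R/data/rNCVS2016.dta")

    # Open the data in the viewer to inspect it
    View(rNCVS2016)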

    We will perform a chi-squared test on a contingency table using R. The variable Male is coded 0 for women and 1 for men. The PayCred variable indicates whether the respondent used credit cards for online purchases in the past 12 months: 1 for those who did and 0 for those who did not.

    chisq.test(x = rNCVS2016$Male,
               y = rNCVS2016$PayCred)

    The test statistic was χ² = 0.54045.
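
To show what the statistic measures, the sketch below computes it by hand for a small 2 x 2 table. The counts are hypothetical and are not the NCVS data; the object names are likewise chosen only for illustration. Note that `chisq.test()` applies Yates' continuity correction to 2 x 2 tables by default, so the value reported above presumably includes that correction; the sketch turns it off with `correct = FALSE` so that the manual and built-in results can be compared directly.

```r
# Hypothetical 2 x 2 table of counts (NOT the NCVS data).
tab <- matrix(c(30, 20,
                25, 25),
              nrow = 2, byrow = TRUE,
              dimnames = list(Male = c("0", "1"), PayCred = c("0", "1")))

# Expected counts under independence: row total * column total / n.
expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)

# Pearson chi-squared statistic, without continuity correction.
chi_sq_manual <- sum((tab - expected)^2 / expected)

# Built-in test with the continuity correction switched off.
chi_sq_r <- unname(chisq.test(tab, correct = FALSE)$statistic)

all.equal(chi_sq_manual, chi_sq_r)
```

The larger the gap between observed and expected counts, the larger the statistic, and the less compatible the data are with the null hypothesis of no association.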

    Step 3: Determine the Probability (P-Value) of Obtaining a Test Statistic at Least as Extreme as the Observed Value, Assuming No Relationship Exists.

    The probability of observing a chi-squared value at least as large as 0.54 in our sample, assuming no association in the population between gender and using credit cards for online purchases in the past 12 months, is 0.4622. This p-value is greater than 0.05.
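
The p-value can be recovered from the test statistic with the upper tail of the chi-squared distribution. The sketch below assumes 1 degree of freedom, which is what a 2 x 2 table gives, (2 - 1) × (2 - 1), and which matches the degrees of freedom reported with the results later in this chapter. The variable names are hypothetical.

```r
chi_sq <- 0.54045  # test statistic reported above
df <- 1            # (rows - 1) * (columns - 1) for a 2 x 2 table

# Upper-tail probability: P(chi-squared with df degrees of freedom >= chi_sq).
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
round(p_value, 4)
```

The result should agree, up to rounding, with the p-value of 0.4622 printed by `chisq.test()`.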

    Step 4: If the P-Value Is Very Small, Typically Less Than 5%, Reject the Null Hypothesis.

    Step 4 does not apply in this situation: the p-value of 0.4622 is not less than 0.05, so we do not reject the null hypothesis.

    Step 5: If the P-Value Is Not Small, Typically 5% or Greater, Retain the Null Hypothesis.

    The p-value of 0.4622 is greater than 0.05, so we retain the null hypothesis, "The usage of credit cards for online purchases in the past 12 months is the same across men and women." Note that the p-value is not the probability that the null hypothesis is true. It is the probability of obtaining a test statistic at least as extreme as the one observed if the null hypothesis were true. A p-value this large means that our sample data are consistent with the null hypothesis and do not provide sufficient evidence to reject it; it does not prove that the null hypothesis is true.

    Reporting the Results

    So, how do we write up the results based on our test? We conducted a chi-squared test to examine the null hypothesis, which posited no association between using credit cards for online purchases in the past 12 months and gender. Our analysis failed to reject the null hypothesis, indicating no statistically significant association between the two variables [χ²(1) = 0.54, p = .46]. For a demonstration of how a chi-squared test is applied and reported, I recommend reviewing the work of Choi and Han (2022). Their study provides a clear example of applying and reporting a chi-squared test using data from the NCVS.
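
When writing up results, it can help to pull the statistic, degrees of freedom, and p-value directly from the object that `chisq.test()` returns instead of retyping them. The helper `report_chisq` below is hypothetical, and the table it is applied to contains made-up counts used only to demonstrate the call; the output format is one possible style, not a required one.

```r
# Hypothetical helper: format the result of chisq.test() for reporting.
report_chisq <- function(test) {
  sprintf("chi-squared(%d) = %.2f, p = %.3f",
          as.integer(test$parameter),  # degrees of freedom
          test$statistic,              # chi-squared statistic
          test$p.value)                # p-value
}

# Hypothetical 2 x 2 counts, used only to demonstrate the helper.
tab <- matrix(c(30, 20, 25, 25), nrow = 2, byrow = TRUE)
report_chisq(chisq.test(tab))
```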

    Conclusion

    In this chapter, we conducted our first hypothesis test using a chi-squared test. In the next chapter, we will use a different statistical test, called the t-test, to compare different groups.

    References

    Choi, J., & Han, S. (2022). Exploring gender disparity in capable guardianship against identity theft: A focus on internet-based behavior. International Journal of Criminal Justice, 4(1), 25-48.

    Vogt, W. P., & Johnson, R. B. (2011). Dictionary of statistics & methodology: A nontechnical guide for the social sciences (4th ed.). Sage.


    This page titled 1.6: Chi-Squared Test is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by Jaeyong Choi (The Pennsylvania Alliance for Design of Open Textbooks (PA-ADOPT)) via source content that was edited to the style and standards of the LibreTexts platform.