
The Chi Squared Test


    Chi Square and Two Way Tables


    Two Way Tables

    Is there any relationship between political affiliation and the type of car a person drives?

    A survey was done and the following data was collected:

     

                               Democrat    Republican   Other    Row marginal total
    American                    32 (35)      21 (24)    15 (9)           68
    Foreign                     55 (53)      40 (37)     8 (13)         103
    Column marginal total       87           61         23              171
    Column percent of total     51%          36%        13%

     

    The numbers in parentheses are the expected counts, obtained by multiplying each row marginal total by the corresponding column proportion.  Equivalently, we can compute

         \( \text{Expected Count} = \frac{\text{(Row Marginal Total)(Column Marginal Total)}}{\text{Grand Total}} \)
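The expected-count formula can be checked with a short pure-Python sketch (variable names here are illustrative; `scipy.stats.chi2_contingency` returns the same expected counts).  Note that the exact values differ slightly from the whole-number values in parentheses in the table, which were rounded.

```python
# Expected counts for the 2x3 table above, using
# E = (row total)(column total) / (grand total).
observed = [[32, 21, 15],   # American: Democrat, Republican, Other
            [55, 40, 8]]    # Foreign

row_totals = [sum(row) for row in observed]          # [68, 103]
col_totals = [sum(col) for col in zip(*observed)]    # [87, 61, 23]
grand = sum(row_totals)                              # 171

expected = [[r * c / grand for c in col_totals] for r in row_totals]
for row in expected:
    print([round(e, 1) for e in row])
# -> [34.6, 24.3, 9.1]
#    [52.4, 36.7, 13.9]
```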

     

     

    We have the hypothesis:

            H0: The true proportions are the same for all of the populations

            H1:  The true proportions are not the same for all of the populations.

    We compute:

         \( \chi^{2} = \sum \frac{(\text{observed} - \text{expected})^{2}}{\text{expected}} = \frac{(32 - 35)^{2}}{35} + \frac{(21 - 24)^{2}}{24} + \frac{(15 - 9)^{2}}{9} + \cdots = 6.87 \)

    The degrees of freedom are

            (number of rows - 1)(number of columns - 1) = (2 - 1)(3 - 1) = 2

    The critical value of \(\chi^{2}\) with 2 degrees of freedom and \(\alpha = 0.05\) is 5.99.  Since 6.87 > 5.99, we reject H0 and conclude that there is an association between political affiliation and the type of car a person drives.
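As a check on the arithmetic, here is a minimal sketch that reproduces the 6.87 statistic from the rounded expected counts shown in the table.  (With exact, unrounded expected counts — for example via `scipy.stats.chi2_contingency` — the statistic comes out slightly larger, but the conclusion is the same.)

```python
# Chi square statistic for the two-way table, using the rounded
# expected counts from the table above.
observed = [32, 21, 15, 55, 40, 8]
expected = [35, 24, 9, 53, 37, 13]

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(round(chi2, 2))        # -> 6.87

critical = 5.99              # chi-square critical value, df = 2, alpha = 0.05
print(chi2 > critical)       # -> True, so reject H0
```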


     


    Chi Square For Univariate Data

     Recall that we use a z-statistic for the difference between two proportions.  When there are three or more categories, we need a different test: the chi square goodness-of-fit test.

     

    Example  

    Suppose that we run a lunch special in our restaurant and want to determine whether it matters which day of the week we close.  In other words, are all days equally frequented by customers?  We tally the customers for each day and find:

     

                 Mon   Tue   Wed   Thu   Fri   Sat   Sun
    Customers     30    33    20    22    35    40    30

    We have the following hypotheses:

            H0:  \( p_{1} = p_{2} = p_{3} = p_{4} = p_{5} = p_{6} = p_{7} = \frac{1}{7} \)

            H1:  At least one of the p's is not 1/7

    Let \(\alpha\) = .05

    The test statistic we will use is again the chi square statistic, denoted by the Greek letter \(\chi^{2}\).  It is computed as follows:

    Notice that the total sample size is 210, hence if H0 is true, then the expected count for each day is 

         \( \frac{210}{7} = 30 \)

    For each day, we compute

            \( \frac{(\text{observed} - \text{expected})^{2}}{\text{expected}} \)

          \( \frac{(30 - 30)^2}{30} = 0  \hspace{1cm}   \frac{(33 - 30)^2}{30} = 0.30 \)

          \( \frac{(20 - 30)^2}{30} = 3.33 \hspace{1cm}     \frac{(22 - 30)^2}{30} = 2.13 \hspace{1cm}  \frac{(35 - 30)^2}{30} = 0.83  \)

          \( \frac{(40 - 30)^2}{30} = 3.33  \hspace{1cm}    \frac{(30 - 30)^2}{30} = 0 \)

    Now we add these numbers to get:

            0 + 0.30 + 3.33 + 2.13 + 0.83 + 3.33 + 0 = 9.93

    Hence we have

             \(\chi^{2} = 9.93\)

    The degrees of freedom are

            k - 1 = 7 - 1 = 6      (k is the number of categories)

    From the chi square table, the critical value of \(\chi^{2}\) with 6 degrees of freedom and \(\alpha = 0.05\) is 12.59.  Since

            9.93  <  12.59

    there is not enough evidence to conclude that the day of the week is a factor in lunch attendance.
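The whole goodness-of-fit calculation above can be sketched in a few lines of pure Python (`scipy.stats.chisquare` performs the same computation and also returns the p-value):

```python
# Chi square goodness-of-fit test: are all seven days equally busy?
counts = [30, 33, 20, 22, 35, 40, 30]   # Mon .. Sun
n = sum(counts)                         # 210
expected = n / len(counts)              # 30 customers per day under H0

chi2 = sum((c - expected) ** 2 / expected for c in counts)
print(round(chi2, 2))        # -> 9.93

critical = 12.59             # chi-square critical value, df = 6, alpha = 0.05
print(chi2 < critical)       # -> True, so fail to reject H0
```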


     



     

     

    The Chi Squared Test is shared under a not declared license and was authored, remixed, and/or curated by LibreTexts.
