Statistics LibreTexts

Common Formulas


    The following formulas appear in the order in which you learn about them in this textbook. Use the Table of Contents to look for a specific equation.

    Descriptive Statistics

    Mean

    \[ \displaystyle \bar{X} = \dfrac{\sum X}{N} \]

    Standard Deviation

    \[s=\sqrt{\dfrac{\sum(X-\overline {X})^{2}}{N-1}} \]

    Equivalently, in terms of the sum of squares (SS) and the degrees of freedom (df): \(s=\sqrt{\dfrac{\sum(X-\overline {X})^{2}}{N-1}}=\sqrt{\dfrac{S S}{d f}} \)

    Some instructors prefer this formula because it is easier to calculate (but more difficult to see what's happening):

    \[s=\sqrt{ \dfrac{\left(\sum(X^2) - \dfrac{(\sum{X})^2}{N}\right)}{(N-1)}}\]
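
As a rough check that the two standard deviation formulas agree, here is a minimal Python sketch. The function names and the scores are made up for illustration and are not part of the textbook.

```python
import math

def sd_definitional(scores):
    """Sample SD from squared deviations around the mean: sqrt(SS / df)."""
    n = len(scores)
    mean = sum(scores) / n
    ss = sum((x - mean) ** 2 for x in scores)  # sum of squares
    return math.sqrt(ss / (n - 1))

def sd_computational(scores):
    """Sample SD from raw-score sums: sqrt((sum(X^2) - (sum X)^2 / N) / (N - 1))."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x ** 2 for x in scores)
    return math.sqrt((sum_x_squared - sum_x ** 2 / n) / (n - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data
sd_a = sd_definitional(scores)
sd_b = sd_computational(scores)
print(sd_a, sd_b)  # the two values should match up to rounding error
```

For these scores the sum of squares is 32 with 7 degrees of freedom, so both functions should return the square root of 32/7.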

    z-score

    To find the z-score when you have a raw score:

    \[z=\frac{X-\bar{X}}{s}\]

    To find a raw score when you have a z-score:

    \[X=z s+\overline{X} \]
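
The two z-score formulas undo each other, which the following short Python sketch illustrates. The mean of 100 and standard deviation of 15 are hypothetical values chosen only for the example.

```python
def z_from_raw(x, mean, sd):
    """z = (X - mean) / s"""
    return (x - mean) / sd

def raw_from_z(z, mean, sd):
    """X = z * s + mean"""
    return z * sd + mean

mean, sd = 100, 15              # hypothetical sample mean and SD
z = z_from_raw(130, mean, sd)   # 2.0
raw = raw_from_z(z, mean, sd)   # 130.0, the score we started with
print(z, raw)
```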

    t-tests

    One-Sample t-test

    These are the same formulas, but formatted slightly differently.

    \[t = \cfrac{(\bar{X}-\mu)}{\left(\cfrac{s} {\sqrt{N}}\right)} \]

    Confidence Interval

    \[\text {Margin of Error }=t \times \left(\dfrac{s}{\sqrt{N}}\right) \nonumber \]

    \[\text { Confidence Interval }=\overline{X} \pm (t \times \left(\dfrac{s}{\sqrt{N}}\right)) \]
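
A minimal Python sketch of the one-sample t-test and confidence interval, assuming the critical t value is looked up in a t table and passed in by hand. The function names, the scores, and the population mean of 4 are illustrative assumptions.

```python
import math
import statistics

def one_sample_t(scores, mu):
    """t = (sample mean - mu) / (s / sqrt(N))"""
    n = len(scores)
    s = statistics.stdev(scores)  # sample SD (N - 1 in the denominator)
    return (statistics.mean(scores) - mu) / (s / math.sqrt(n))

def confidence_interval(scores, t_critical):
    """Sample mean plus or minus t_critical * (s / sqrt(N))."""
    n = len(scores)
    s = statistics.stdev(scores)
    margin_of_error = t_critical * (s / math.sqrt(n))
    mean = statistics.mean(scores)
    return mean - margin_of_error, mean + margin_of_error

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, N = 8
t = one_sample_t(scores, mu=4)     # mu = 4 is a made-up population mean
# 2.365 is the two-tailed critical t for df = 7 at the .05 level, taken from a t table
low, high = confidence_interval(scores, t_critical=2.365)
print(t, low, high)
```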

    Independent Sample t-test

    Unequal N

    You can always use this formula:

    \[t=\dfrac{(\bar{X}_{1}-\bar{X}_{2})}{\sqrt{\left[\dfrac{\left(n_{1}-1\right) \times s_{1}^{2} + \left(n_{2}-1\right) \times s_{2}^{2}}{n_{1}+n_{2}-2}\right] \times \left(\dfrac{1}{n_{1}} + \dfrac{1}{n_{2}}\right)}} \]

    Equal N

    You should only use this formula when your two independent groups are the same size (N), meaning the same number of people in each group.

    \[t=\dfrac{(\bar{X}_1 - \bar{X}_2)}{\sqrt{\left(\frac{s_1^2}{N_1}\right)+\left(\frac{s_2^2}{N_2}\right)}}\]
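
When both groups are the same size, the pooled formula and the equal-N formula give the same t. The Python sketch below checks this on made-up data; the function names are illustrative only.

```python
import math
import statistics

def t_pooled(group1, group2):
    """Independent-samples t using the pooled variance (works for any group sizes)."""
    n1, n2 = len(group1), len(group2)
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (statistics.mean(group1) - statistics.mean(group2)) / se

def t_equal_n(group1, group2):
    """Independent-samples t with the simpler formula; only for equal group sizes."""
    n1, n2 = len(group1), len(group2)
    if n1 != n2:
        raise ValueError("this formula assumes equal group sizes")
    var1, var2 = statistics.variance(group1), statistics.variance(group2)
    se = math.sqrt(var1 / n1 + var2 / n2)
    return (statistics.mean(group1) - statistics.mean(group2)) / se

group1 = [5, 7, 8, 6, 9]  # hypothetical scores
group2 = [3, 4, 6, 5, 2]
t_a = t_pooled(group1, group2)
t_b = t_equal_n(group1, group2)
print(t_a, t_b)  # both should be 3.0 up to rounding error
```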

    Dependent Sample t-test

    Conceptual Formula (symbols)

    \[ t = \cfrac{\overline{X}_{D}}{\left(\cfrac{s_{D}}{\sqrt{N}} \right)} \]

    Full Formula

    \[ t = \cfrac{ \left(\cfrac{\sum {D}}{N}\right)} { {\sqrt{\left(\cfrac{\sum\left((D-\overline{X}_{D})^{2}\right)}{(N-1)}\right)} } /\sqrt{N} } \]
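
A minimal Python sketch of the dependent-samples t-test, computed from the difference scores. The direction of subtraction (second score minus first) is an assumption made for this example, and the data and function name are made up.

```python
import math
import statistics

def dependent_t(first, second):
    """t = mean of the differences / (SD of the differences / sqrt(N pairs))."""
    # Assumption: each difference is second score minus first score
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    sd_diff = statistics.stdev(diffs)  # N - 1 in the denominator
    return mean_diff / (sd_diff / math.sqrt(n))

first = [10, 12, 9, 11, 13]    # hypothetical paired scores
second = [12, 15, 10, 14, 14]
t_dep = dependent_t(first, second)
print(t_dep)
```

Here the differences are 2, 3, 1, 3, 1, with a mean of 2 and a standard deviation of 1, so t should equal 2 times the square root of 5.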

    \[S S_{B}=\sum_{EachGroup} \left[ \left(\overline{X}_{group}-\overline{X}_{T}\right)^{2} \times (n_{group}) \right] \]

    \[S S_{W}=\sum_{EachGroup} \left[ \sum \left(\left(X-\overline{X}_{group}\right)^{2}\right) \right] \]

    \[S S_{T}=\sum \left[ \left(X - \overline{X}_{T}\right)^{2} \right] \]
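
The three sums of squares can be sketched in Python as follows; a useful check is that the between and within values add up to the total. The function name and the three small groups are hypothetical.

```python
def sums_of_squares(groups):
    """Return (SS_B, SS_W, SS_T) for a list of groups of scores."""
    all_scores = [x for group in groups for x in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between: squared distance of each group mean from the total mean, times group size
    ss_b = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within: squared distance of each score from its own group mean
    ss_w = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, group_means))
    # Total: squared distance of each score from the total mean
    ss_t = sum((x - grand_mean) ** 2 for x in all_scores)
    return ss_b, ss_w, ss_t

groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # hypothetical data
ss_b, ss_w, ss_t = sums_of_squares(groups)
print(ss_b, ss_w, ss_t)  # 54.0 6.0 60.0
```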

    \[ HSD = q \times \sqrt{\dfrac{MSw}{n_{group}}} \]
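
A short Python sketch of the HSD calculation. In practice q comes from a table of the studentized range statistic; the q, mean square, and group size used below are placeholder numbers chosen only to make the arithmetic easy to follow.

```python
import math

def tukey_hsd(q, ms_within, n_per_group):
    """HSD = q * sqrt(MS_W / n per group)"""
    return q * math.sqrt(ms_within / n_per_group)

# Placeholder inputs, not values from a real table or data set
hsd = tukey_hsd(q=3.5, ms_within=1.0, n_per_group=4)
print(hsd)  # 1.75
```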

    Same as above.

    \[SS_{Ps} = \left[\sum{\left(\dfrac{(\sum{X_{Ps}})^2}{k}\right)}\right] -\dfrac{\left((\sum{X})^2\right)}{N} \]

    \[SS_{WG} = SS_{T} - SS_{BG} - SS_{Ps} \nonumber \]

    Same as above.

    Pearson's r (Correlation)

    The following formulas are the same. Use the first one when you already have the standard deviation calculated.

    These are paired data, so N is the number of pairs.

    SD Already Calculated:

    \[ r= \cfrac{ \left( \cfrac{\sum ((x_{Each} - \bar{X_x})\times(y_{Each} - \bar{X_y}) ) }{(N-1)}\right) } {(s_x \times s_y)} \]

    SD Not Calculated:

    \[ r = \cfrac{ \left( \cfrac{\sum ((x - \bar{X_x})\times(y - \bar{X_y}) ) }{(N-1)}\right) } {\left( \sqrt{\dfrac{\sum\left((x-\overline {X_x})^{2}\right)}{N-1}} \right) \times \left( \sqrt{\dfrac{\sum\left((y-\overline {X_y})^{2}\right)}{N-1}} \right)} \]
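
A minimal Python sketch of Pearson's r, dividing the summed cross-products by N - 1 and then by the product of the two standard deviations. The paired scores and the function name are made up for illustration.

```python
import math
import statistics

def pearson_r(xs, ys):
    """r = [sum((x - mean_x) * (y - mean_y)) / (N - 1)] / (s_x * s_y)"""
    n = len(xs)  # number of pairs
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return (cross_products / (n - 1)) / (statistics.stdev(xs) * statistics.stdev(ys))

xs = [1, 2, 3, 4, 5]  # hypothetical paired data
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4))  # 0.7746
```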

    Regression Line Equation

    \[\widehat{\mathrm{Y}}=\mathrm{a}+(\mathrm{b}\times{X}) \]

    a (intercept):

    \[\mathrm{a}=\overline{X_y}- (\mathrm{b} \times \overline{X_x}) \]

    b (slope):

    \[ \mathrm{b} = \dfrac{\sum(Diff_{x} \times Diff_{y})}{\sum({Diff_{x}}^2)} \]

    In which "Diff" means the differences between each score and that variable's mean.
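
The slope and intercept formulas can be sketched in Python as follows. The function names and data are hypothetical; the slope is computed first because the intercept depends on it.

```python
def regression_line(xs, ys):
    """Return (a, b) for the line Y-hat = a + b * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    diff_x = [x - mean_x for x in xs]  # each score minus its variable's mean
    diff_y = [y - mean_y for y in ys]
    b = sum(dx * dy for dx, dy in zip(diff_x, diff_y)) / sum(dx ** 2 for dx in diff_x)
    a = mean_y - b * mean_x
    return a, b

def predict(a, b, x):
    """Y-hat = a + (b * X)"""
    return a + b * x

xs = [1, 2, 3, 4, 5]  # hypothetical paired data
ys = [2, 4, 5, 4, 5]
a, b = regression_line(xs, ys)
y_hat = predict(a, b, 6)
print(a, b, y_hat)  # roughly 2.2, 0.6, and 5.8
```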

    Pearson's \(\chi^2\) (Chi-Square)

    \[\chi^{2}=\sum_{Each}\left(\dfrac{\left(E-O\right)^{2}}{E} \right)\]

    Expected Frequencies

    Goodness of Fit:

    \[E=\dfrac{N}{k}\]

    Test of Independence:

    \[E_{EachCell}=\dfrac{RT \times CT}{N} \]

    In which RT = Row Total and CT = Column Total
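
A minimal Python sketch of both chi-square cases: a goodness-of-fit test with equal expected frequencies, where k is taken to be the number of categories, and a test of independence with expected frequencies computed from row and column totals. All counts and function names are made up for illustration.

```python
def chi_square(observed, expected):
    """Sum of (E - O)^2 / E over every category or cell."""
    return sum((e - o) ** 2 / e for o, e in zip(observed, expected))

def expected_goodness_of_fit(observed):
    """Equal expected frequencies: E = N / k for each of the k categories."""
    n, k = sum(observed), len(observed)
    return [n / k] * k

def expected_independence(table):
    """E for each cell = (row total * column total) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Goodness of fit, hypothetical counts in three categories
observed = [10, 20, 30]
chi2_gof = chi_square(observed, expected_goodness_of_fit(observed))  # 10.0

# Test of independence, hypothetical 2 x 2 table of counts
table = [[10, 20], [30, 40]]
expected = expected_independence(table)  # [[12.0, 18.0], [28.0, 42.0]]
flat_observed = [o for row in table for o in row]
flat_expected = [e for row in expected for e in row]
chi2_ind = chi_square(flat_observed, flat_expected)
print(chi2_gof, expected, chi2_ind)
```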
