
2.2: Statistics is About Finding Patterns


    Data often arrive as an array of numbers that looks like a mess. We organize and describe the data to find some order within those numbers; the next step is to look for patterns. Patterns help us make sense of the variation present in the data: with a pattern, we can say what is going on. The opposite of a pattern is randomness, which means we cannot tell what is going on with the variation we see.

    Statistics helps us find patterns in data, specifically patterns between variables. In a statistical analysis, you want your results to show a clear pattern rather than randomness, error, or chance.

    What is a "pattern"? Statistical tests produce an outcome, a value that tells you whether something is going on; that something is the pattern. A pattern is something discernible and predictable that occurs between two variables: something systematic is going on between them. By systematic, we mean that there is a relationship between the two variables that we can describe. Systematic variation is often referred to as a model or a prediction. When you hear the phrases "the model fits" or "we want to model this," we are saying that the variables we selected predict a relationship, or a pattern, among the variables.

    We need patterns, and we need predictions, because they give us a framework. If we know that the more therapy sessions a person attends, the more likely their mental health symptoms are to improve, then we can say something about the importance of attending therapy regularly and its impact on symptom reduction. Without a framework, we are ineffective because we cannot affect the outcome.

    What is the opposite of a pattern? The opposite of a pattern is randomness. Random means you cannot tell what is going on; you cannot predict anything. Whatever happened occurred by luck. Randomness stinks. Suppose we study a lot and get a good grade, then study a lot and not get a good grade, then not study and get a good grade, and then not study and not get a good grade. That is randomness: we cannot tell what is going on, at least between these two variables. The implication is that the number of hours spent studying does not make a difference in the outcome. Even this result is instructive, though. It means that studying a lot is not the key to obtaining good grades; something else is going on that leads to good grades.
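    The study/grades scenario can be simulated. If grades are generated independently of study hours (a hypothetical setup, with made-up ranges), the correlation between the two variables lands near zero, which is what randomness looks like numerically:

```python
import random

random.seed(0)

# Hypothetical data: grades are drawn independently of hours studied,
# so by construction there is no pattern between the two variables.
hours = [random.uniform(0, 10) for _ in range(200)]
grades = [random.uniform(50, 100) for _ in range(200)]

def correlation(xs, ys):
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# The correlation comes out close to 0: no pattern, just randomness.
print(round(correlation(hours, grades), 2))
```

    A correlation near zero is the numerical signature of "we can't tell what is going on between these two variables."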

    Another example is alcohol treatment. Pattern: "We know if you are motivated, plus have good friendship support, you will get sober faster, and stay sober longer, compared to someone who doesn't have those qualities. If you don't have those qualities, we need to alter our treatment approach." Random: "Good luck getting sober; we have no idea what will make you sober or not."

    When you look for a pattern, you look for it between at least two variables. There are two types of patterns: relationships and group comparisons. More on these in the chapters on correlations, t-tests, and F-tests. For now, a relationship pattern takes this form: increases in Variable X are associated with increases (or decreases) in Variable Y. A group pattern takes this form: Group A is lower (or higher) than Group B.
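    The two pattern forms can be sketched with made-up numbers (all values below are hypothetical):

```python
# Hypothetical data illustrating the two pattern types.

def mean(xs):
    return sum(xs) / len(xs)

# Relationship pattern: increases in X (therapy sessions attended) are
# associated with decreases in Y (symptom severity).
sessions = [1, 2, 3, 4, 5, 6, 7, 8]
symptoms = [9, 8, 8, 6, 5, 4, 3, 2]

# Group-comparison pattern: Group A is lower than Group B.
group_a = [62, 58, 65, 60, 63]
group_b = [78, 74, 80, 76, 79]

# Relationship: average symptom level falls as sessions rise.
print(mean(symptoms[:4]), mean(symptoms[4:]))   # prints 7.75 3.5
# Group comparison: Group A's mean sits below Group B's.
print(mean(group_a), mean(group_b))             # prints 61.6 77.4
```

    Both examples show the same underlying idea: something systematic, not random, connects the two variables.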

    The statistical value is a value that indicates whether a pattern is present. Every statistical test generates a value, and that value is a ratio. Ratios compare something to something else; here, the statistical value is a ratio of the pattern among the variables to the randomness among the variables. Ratios are numerators over denominators: the numerator is the actual difference we observe, and the denominator is error variance. If the statistical value is 0, there is no pattern, because the numerator is 0 — there is no actual difference, only error variance. If the statistical value is 1, that is still not great: the numerator equals the denominator, so the actual difference is no larger than the error variance, and a pattern among the variables is no more likely than randomness. The higher the statistical value, the more the numerator exceeds the denominator, and the more likely it is that a pattern, rather than randomness, exists among the variables. Conversely, the lower the statistical value, the smaller the numerator relative to the denominator, and the more randomness there is relative to pattern.
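    As a concrete sketch of the ratio idea, an independent-samples t statistic divides the observed difference between two group means (the numerator) by a standard error built from the pooled error variance (the denominator). The data below are hypothetical:

```python
import math

# Hypothetical scores for two groups.
group_a = [62, 58, 65, 60, 63]
group_b = [78, 74, 80, 76, 79]

def t_statistic(a, b):
    """Independent-samples t: (observed mean difference) / (error term)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    # Sample variances of each group.
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    # Pooled error variance forms the denominator (the "randomness" part).
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    # Numerator: the actual difference we observe between the groups.
    return (mb - ma) / se

# A large value means the observed difference dwarfs the error variance:
# the pattern is far more likely than randomness.
print(round(t_statistic(group_a, group_b), 2))  # prints 9.76
```

    Had the two groups overlapped heavily, the numerator would shrink toward the size of the denominator and the ratio would drift toward 1, i.e., pattern no more likely than randomness.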

    There is only one decision to make with a statistical test: am I seeing a pattern, or am I seeing randomness? More on why that is the only decision you make with a statistical test in the discussion of significance in Chapter Eight.


    This page titled 2.2: Statistics is About Finding Patterns is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Peter Ji.