Statistics LibreTexts

4.1: Distributions – A Picture of Variation

    A good place to start with descriptive statistics, and with checking the assumptions of later analyses, is to look at the variation in your variables. The easiest way to analyze variation is to look at it, and by that I do mean looking at a picture of it. In descriptive statistics, we start from a table of values and use it to generate a picture of the distribution, otherwise known as a distribution plot.

    Unless someone is setting you up on a blind date, you want to see who you are dating before you even go on a date. If you want to see who you are dating, you need a picture. You will scan that picture and look for something that you want to see based on whatever outcome you want out of your first date. Wow, this paragraph seems so serious in tone. But the paragraph is true, right?

    This visual examination process is how we examine distributions. Distributions are pictures of the variation we see in a variable. What are we looking for when we look at variation? We are looking for quality distributions. We want to see how much the values vary. Is there a wide spread in the variation, or is there no variation at all? Is the variation smooth, clumpy, or broken in spots? As always, our gauge for evaluating the quality of our variation depends on our research question.

    There are two types of distributions of variation.

    The first type, categorical distribution, describes the variation of a categorical or nominal variable.

    The second type, continuous distribution, describes the variation of a continuous variable, including ordinal, interval, and ratio variables. Each type of distribution is evaluated against different criteria.

    To evaluate distributions, you need two things. The first is a table. At a minimum, the table has two columns. The first column contains the variable’s values. The second column is a frequency count: the number of times you observe each value, which is the number of participants who provided that value. The tables in this section also report each count as a percentage of the total and as a cumulative percentage.
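
    As a minimal sketch of how such a two-column table could be built from raw data, the Python snippet below tallies a short list of responses. The responses list and the frequency_table helper are hypothetical and invented for illustration; they are not the data behind the tables in this section.

```python
from collections import Counter

# Hypothetical raw responses, invented for illustration only.
responses = ["Female", "Male", "Female", "Other", "Female", "Male"]

def frequency_table(values):
    """Return (value, count) rows, sorted by value."""
    counts = Counter(values)        # tally how often each value occurs
    return sorted(counts.items())   # one row per distinct value

table = frequency_table(responses)
for value, count in table:
    # First column: the value. Second column: its frequency count.
    print(f"{value:<10}{count:>5}")
```

    Each printed row pairs a value with the number of times it was observed, which is the same layout the frequency column of the tables uses.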

    Table One displays the frequency distribution for gender, a categorical variable.

    Table One - Frequency Chart for Gender Categorical Variable

    Group                     Frequency   Percent   Valid Percent   Cumulative Percent
    Male                             10        13              13                   13
    Female                           64        83              83                   96
    Other (please specify)            3         4               4                  100
    Total                            77       100             100
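
    The Percent and Cumulative Percent columns can be recomputed from the frequency counts alone. The sketch below does this for the Table One counts; the add_percentages helper is a hypothetical name chosen for illustration. It assumes each percent is the count divided by the total, and each cumulative percent is the running count divided by the total. The Valid Percent column matches Percent in Table One, so it is not recomputed here.

```python
# Frequencies copied from Table One.
frequencies = {"Male": 10, "Female": 64, "Other (please specify)": 3}

def add_percentages(freqs):
    """Return rows of (group, count, percent, cumulative percent)."""
    total = sum(freqs.values())
    rows = []
    running = 0
    for group, count in freqs.items():
        running += count  # running count up to and including this group
        rows.append((group, count, 100 * count / total, 100 * running / total))
    return rows

rows = add_percentages(frequencies)
for group, count, pct, cum in rows:
    # Rounded to whole percents, as in Table One.
    print(f"{group:<24}{count:>4}{pct:>7.0f}{cum:>7.0f}")
```

    The cumulative column is taken from the running count rather than by adding up the rounded percents, so rounding error does not pile up from row to row.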

    Table Two displays the frequency distribution for emotional expression, a continuous variable.

    Table Two - Frequency Chart for Emotional Expression Continuous Variable

    Value   Frequency   Percent   Valid Percent   Cumulative Percent
    34              1       1.3             1.3                  1.3
    43              1       1.3             1.3                  2.6
    44              2       2.6             2.6                  5.2
    45              1       1.3             1.3                  6.5
    46              4       5.2             5.2                 11.7
    47              2       2.6             2.6                 14.3
    48              5       6.5             6.5                 20.8
    49              6       7.8             7.8                 28.6
    50              4       5.2             5.2                 33.8
    51              1       1.3             1.3                 35.1
    52              7       9.1             9.1                 44.2
    53              4       5.2             5.2                 49.4
    54              6       7.8             7.8                 57.1
    55              5       6.5             6.5                 63.6
    56              3       3.9             3.9                 67.5
    57              4       5.2             5.2                 72.7
    58              4       5.2             5.2                 77.9
    59              1       1.3             1.3                 79.2
    60              2       2.6             2.6                 81.8
    61              3       3.9             3.9                 85.7
    62              2       2.6             2.6                 88.3
    63              1       1.3             1.3                 89.6
    66              1       1.3             1.3                 90.9
    68              3       3.9             3.9                 94.8
    69              1       1.3             1.3                 96.1
    71              1       1.3             1.3                 97.4
    72              1       1.3             1.3                 98.7
    73              1       1.3             1.3                100
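
    The earlier questions about spread and about variation that is broken in spots can be checked directly against Table Two. The sketch below is one possible reading of those questions, not a standard procedure. The summarize helper is a hypothetical name, and the code assumes emotional expression scores are whole numbers, so any whole number between the lowest and highest score that no one reported counts as a gap.

```python
# Value -> frequency pairs copied from Table Two.
table_two = {
    34: 1, 43: 1, 44: 2, 45: 1, 46: 4, 47: 2, 48: 5, 49: 6, 50: 4, 51: 1,
    52: 7, 53: 4, 54: 6, 55: 5, 56: 3, 57: 4, 58: 4, 59: 1, 60: 2, 61: 3,
    62: 2, 63: 1, 66: 1, 68: 3, 69: 1, 71: 1, 72: 1, 73: 1,
}

def summarize(freqs):
    """Return (number of observations, spread, unobserved whole-number values)."""
    values = sorted(freqs)
    total = sum(freqs.values())
    spread = values[-1] - values[0]  # highest score minus lowest score
    # Assumption: scores are whole numbers, so a gap is any integer
    # between the minimum and maximum that has no observations.
    gaps = [v for v in range(values[0], values[-1] + 1) if v not in freqs]
    return total, spread, gaps

total, spread, gaps = summarize(table_two)
print("observations:", total)
print("spread:", spread)
print("unobserved values:", gaps)
```

    Read this way, the scores run from 34 to 73, with a long empty stretch between the single score of 34 and the next score of 43, and a few smaller breaks near the top of the range.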

    The second thing you need is a plot, which lets you see the variation. The plot is an X-Y figure. The X-axis, the horizontal line, lists each of the variable’s values. The Y-axis, the vertical line, is the frequency count. The height of each column along the X-axis represents the number of times you observed that value. Figures One and Two present frequency plots for the gender and emotional expression variables.

    Figure One: Frequency Plot for Gender Categorical Variable

    Figure Two: Frequency Plot for Emotional Expression Continuous Variable
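
    For a rough picture without any plotting software, a frequency plot can be sketched in plain text. The snippet below is a minimal illustration using the Table One counts; text_frequency_plot is a hypothetical helper, not the tool used to produce the figures. Note that it turns the plot on its side: values run down the page and each bar runs horizontally, with one # per observation.

```python
def text_frequency_plot(freqs):
    """Return one line per value: the label, then one '#' per observation."""
    width = max(len(str(value)) for value in freqs)  # pad labels so bars align
    return [f"{str(value):<{width}} | {'#' * count}" for value, count in freqs.items()]

# Frequencies copied from Table One.
gender = {"Male": 10, "Female": 64, "Other (please specify)": 3}

lines = text_frequency_plot(gender)
print("\n".join(lines))
```

    The bar lengths play the same role as the column heights in the figures: the longer the bar, the more often that value was observed.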

    This page titled 4.1: Distributions – A Picture of Variation is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Peter Ji.