4.1: Distributions – A Picture of Variation
A good place to start with descriptive statistics, and with checking assumptions, is to look at the variation in your variables. It helps to analyze variation by looking at it. And by looking at it, I do mean looking at a picture of it. In descriptive statistics, we use a table of values to generate a picture of the distribution, otherwise known as a distribution plot.
Unless someone is setting you up on a blind date, you want to see who you are dating before you even go on a date. If you want to see who you are dating, you need a picture. You will scan that picture and look for something that you want to see based on whatever outcome you want out of your first date. Wow, this paragraph seems so serious in tone. But the paragraph is true, right?
This visual examination process is how we examine distributions. Distributions are pictures of the variations we see in a variable. What are we looking for when we look at variation? We are looking for quality distributions. We want to see how varied the variation is. Is there a wide spread in the variation, or is there no variation at all? Is the variation smooth, clumpy, or broken in spots? As always, our gauge for evaluating the quality of our variation depends on our research question.
There are two types of distributions of variation.
The first type, categorical distribution, describes the variation of a categorical or nominal variable.
The second type, continuous distribution, describes the variation of a continuous variable, including ordinal, interval, and ratio variables. Each type of distribution has its own criteria for evaluation.
To evaluate distributions, you need two things. The first is a table. At a minimum, the table has two columns. The first column contains the variable’s values. The second column is a frequency count: the number of times you observe each value, that is, the number of participants who provided that value or the number of counts recorded for it. The tables here also report each count as a percentage of the total.
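A two-column frequency table of this kind can be tallied directly from raw observations. Here is a minimal sketch in Python, assuming a short, made-up list of responses (the `responses` values and the `frequency_table` helper are hypothetical and are not taken from the chapter's data):

```python
from collections import Counter

# Hypothetical raw responses, invented for illustration only.
responses = ["Female", "Male", "Female", "Female", "Other", "Female", "Male"]

def frequency_table(values):
    """Return (value, count) rows: column one is the value, column two its frequency."""
    counts = Counter(values)
    # Sort by value so the rows appear in a predictable order.
    return sorted(counts.items())

table = frequency_table(responses)
for value, count in table:
    print(f"{value:<8}{count}")
```

Each row pairs one value with the number of times it was observed, and the counts add up to the number of responses.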
Table One displays the frequency distribution for a categorical variable, gender.
| Group | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| Male | 10 | 13 | 13 | 13 |
| Female | 64 | 83 | 83 | 96 |
| Other (please specify) | 3 | 4 | 4 | 100 |
| Total | 77 | 100 | 100 | |
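The Percent and Cumulative Percent columns in Table One follow from the frequency counts alone. As a rough sketch, assuming the percentages were rounded to whole numbers (the `percent_columns` helper is a hypothetical name used only for this illustration):

```python
# Frequencies copied from Table One.
counts = {"Male": 10, "Female": 64, "Other (please specify)": 3}

def percent_columns(counts):
    """Return rows of (group, frequency, percent, cumulative percent), rounded to whole numbers."""
    total = sum(counts.values())
    rows = []
    running = 0  # running total of frequencies, used for the cumulative column
    for group, freq in counts.items():
        running += freq
        percent = round(100 * freq / total)
        cumulative = round(100 * running / total)
        rows.append((group, freq, percent, cumulative))
    return rows

rows = percent_columns(counts)
for row in rows:
    print(row)
```

The cumulative column is computed from the running count rather than by adding the rounded percentages, so rounding error does not build up from row to row.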
Table Two displays the frequency distribution for a continuous variable.
| Value | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| 34 | 1 | 1.3 | 1.3 | 1.3 |
| 43 | 1 | 1.3 | 1.3 | 2.6 |
| 44 | 2 | 2.6 | 2.6 | 5.2 |
| 45 | 1 | 1.3 | 1.3 | 6.5 |
| 46 | 4 | 5.2 | 5.2 | 11.7 |
| 47 | 2 | 2.6 | 2.6 | 14.3 |
| 48 | 5 | 6.5 | 6.5 | 20.8 |
| 49 | 6 | 7.8 | 7.8 | 28.6 |
| 50 | 4 | 5.2 | 5.2 | 33.8 |
| 51 | 1 | 1.3 | 1.3 | 35.1 |
| 52 | 7 | 9.1 | 9.1 | 44.2 |
| 53 | 4 | 5.2 | 5.2 | 49.4 |
| 54 | 6 | 7.8 | 7.8 | 57.1 |
| 55 | 5 | 6.5 | 6.5 | 63.6 |
| 56 | 3 | 3.9 | 3.9 | 67.5 |
| 57 | 4 | 5.2 | 5.2 | 72.7 |
| 58 | 4 | 5.2 | 5.2 | 77.9 |
| 59 | 1 | 1.3 | 1.3 | 79.2 |
| 60 | 2 | 2.6 | 2.6 | 81.8 |
| 61 | 3 | 3.9 | 3.9 | 85.7 |
| 62 | 2 | 2.6 | 2.6 | 88.3 |
| 63 | 1 | 1.3 | 1.3 | 89.6 |
| 66 | 1 | 1.3 | 1.3 | 90.9 |
| 68 | 3 | 3.9 | 3.9 | 94.8 |
| 69 | 1 | 1.3 | 1.3 | 96.1 |
| 71 | 1 | 1.3 | 1.3 | 97.4 |
| 72 | 1 | 1.3 | 1.3 | 98.7 |
| 73 | 1 | 1.3 | 1.3 | 100 |
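The questions we ask of a distribution (how wide is the spread, and is it broken in spots?) can be answered from Table Two's counts before any picture is drawn. The following is only a sketch, and it assumes the variable takes whole-number values, so that any whole number missing between the lowest and highest value counts as a gap. The `describe_spread` helper is a hypothetical name.

```python
# Values and frequencies copied from Table Two.
frequencies = {
    34: 1, 43: 1, 44: 2, 45: 1, 46: 4, 47: 2, 48: 5, 49: 6, 50: 4, 51: 1,
    52: 7, 53: 4, 54: 6, 55: 5, 56: 3, 57: 4, 58: 4, 59: 1, 60: 2, 61: 3,
    62: 2, 63: 1, 66: 1, 68: 3, 69: 1, 71: 1, 72: 1, 73: 1,
}

def describe_spread(freqs):
    """Summarize the spread: sample size, lowest/highest value, unobserved values, and peak(s)."""
    lowest, highest = min(freqs), max(freqs)
    # Assumption: whole-number values, so any integer in range that is absent is a gap.
    gaps = [v for v in range(lowest, highest + 1) if v not in freqs]
    # The value (or values) observed most often.
    top = max(freqs.values())
    peaks = [v for v, f in freqs.items() if f == top]
    return {"n": sum(freqs.values()), "range": (lowest, highest), "gaps": gaps, "peaks": peaks}

summary = describe_spread(frequencies)
print(summary)
```

Under that assumption, the summary shows a long break between 34 and 43 and a few smaller breaks near the top of the range, which is the kind of clumpiness a plot makes visible at a glance.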
The second thing you need is a plot. The plot helps you observe the variation. The plot is an X-Y figure. The X-axis, the horizontal line, shows each of the variable’s values. The Y-axis, the vertical line, shows the frequency count. The height of each column along the X-axis represents the number of times you observe that value. Figures One and Two present frequency plots for the gender and emotional expression variables.
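To see how a frequency table turns into a picture, here is a rough text-only sketch using the counts from Table One. It is not how the chapter's figures were produced. The chart is turned on its side, so the values run down the page and the length of each bar stands for the frequency. The `text_bar_chart` helper is a hypothetical name.

```python
# Frequencies copied from Table One.
counts = {"Male": 10, "Female": 64, "Other (please specify)": 3}

def text_bar_chart(counts, symbol="#"):
    """Return one line per value: the label, a bar whose length is the frequency, and the count."""
    width = max(len(label) for label in counts)  # pad labels so the bars line up
    lines = []
    for label, freq in counts.items():
        lines.append(f"{label:<{width}} | {symbol * freq} {freq}")
    return lines

chart = text_bar_chart(counts)
print("\n".join(chart))
```

Even in this crude form, the much longer bar for one group shows how unevenly the responses are spread across the categories.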



