4.4: Getting an Overall Summary of a Variable
Up to this point, I’ve explained several different summary statistics that are commonly used when analyzing data. I presented values for each one of the stats as if I'd used SPSS to calculate each one of those individually. As luck would have it, however, SPSS is capable of producing all that stuff at once! So, without further ado...
Using SPSS to Summarize a Variable
First, let's start by opening the MLBGL2021.sav data file. Now that you've got that open, we can go to the Analyze menu, select Descriptive Statistics, and then Descriptives...
Once the Descriptives... dialog box is open, you should see something like this:
On the left side of the dialog is a list of all available variables in the data file. On the right is a list of variables to be included in the descriptive statistics. You can scroll down the left column and select Margin of Victory and then click the arrow between the columns to move that to the right column. That will tell SPSS you want to analyze the Margin of Victory variable. Should look like this:
Now we'll select the Options... button. You'll see a vaguely complicated window like this:
You can select all of the descriptive statistics you want or only a couple. Up to you. Have a party. Select the ones shown, which roughly correspond to the statistics discussed earlier in this chapter, then select Continue to return to the Descriptives... dialog. Once back there, click on the OK button and sit back and watch SPSS work its magic. Don't get too comfortable though, as SPSS will compute all of those things faster than you can even think about relaxing. SPSS will open the output window and you should see something like this:
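If you're curious what sits behind those checkboxes, here is a minimal Python sketch of the same kinds of summaries. It is only an illustration, not what SPSS runs internally: the margins list is made-up data (not values from MLBGL2021.sav), the function name describe is my own invention, and it uses nothing beyond Python's standard library.

```python
import statistics

def describe(values):
    """Return a few common descriptive statistics for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),      # sample SD (divides by n - 1)
        "variance": statistics.variance(values),  # sample variance
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }

# Hypothetical margins of victory (runs); NOT taken from the real data file
margins = [1, 2, 1, 5, 3, 1, 7, 2, 4, 4]

summary = describe(margins)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Each entry in the dictionary corresponds to one of the statistics discussed earlier in the chapter, computed all at once rather than one at a time.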
So, with but a few mouse clicks, SPSS has computed most of the major descriptive stats. Oh, wait, it looks like there are a couple missing, such as percentiles. Don't fret. If you think the Descriptives... function is amazing, let's look at Frequencies....
This command is everything that Descriptives... is but better.
Once we select Frequencies... you'll choose your variable(s) of interest, in our case here Margin of Victory, move that to the right column in the dialog, and then select Statistics... You'll see a plethora of options to choose from:
Now we can select Quartiles, which will allow you to figure out the IQR, and Range, although Maximum - Minimum gets you the same thing. So, click away and select the descriptive statistics of interest. Here are my choices, but you may have your own desires:
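To make the link between quartiles, the IQR, and the range concrete, here is a small Python sketch. The margins list is again made-up illustration data, not the real file. Different programs use slightly different interpolation rules for quartiles, so treat the numbers as an illustration of the idea rather than a guaranteed match to SPSS output.

```python
import statistics

# Hypothetical margins of victory (runs); NOT taken from the real data file
margins = [1, 2, 1, 5, 3, 1, 7, 2, 4, 4]

# n=4 asks for the three cut points that split the data into quarters
q1, q2, q3 = statistics.quantiles(margins, n=4)

iqr = q3 - q1                              # interquartile range
data_range = max(margins) - min(margins)   # same thing as Maximum - Minimum

print("Q1:", q1, "Median:", q2, "Q3:", q3)
print("IQR:", iqr)
print("Range:", data_range)
```

The IQR is just the third quartile minus the first, and the range is just the maximum minus the minimum, which is why asking for Range alongside Minimum and Maximum is a little redundant.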
Click Continue to go back to the Frequencies... dialog. Note that in the lower left of the dialog there is a box checked that says "Display frequency tables." As implied, if you leave that checked, you'll get a frequency table along with all the other statistics you've requested.
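A frequency table is nothing more than a count of how often each value occurs, usually with percentages alongside. Here is a rough Python sketch of that idea, using the same made-up margins list (not the real data file). The column layout is my own choice and is not meant to reproduce the SPSS output exactly.

```python
from collections import Counter

# Hypothetical margins of victory (runs); NOT taken from the real data file
margins = [1, 2, 1, 5, 3, 1, 7, 2, 4, 4]

counts = Counter(margins)   # maps each value to how many times it occurs
total = len(margins)

rows = []
cumulative = 0.0
for value in sorted(counts):               # go through the values in ascending order
    percent = 100 * counts[value] / total
    cumulative += percent                  # running total of the percentages
    rows.append((value, counts[value], percent, cumulative))

print(f"{'Value':>5} {'Freq':>5} {'Percent':>8} {'Cum %':>8}")
for value, freq, percent, cum in rows:
    print(f"{value:>5} {freq:>5} {percent:>8.1f} {cum:>8.1f}")
```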
One last item in the Frequencies dialog is of interest: Charts... If you click on Charts... you'll see:
As you might guess, this is where you can ask SPSS to make a pretty picture to go along with the other statistics. You can get a bar chart, pie chart, or histogram. When and why you would choose one over the other will be covered in the next chapter, but for now, we'll select Histograms. Then select Continue.
Now, back at the Frequencies dialog, click on OK, and wait for the amazing output. In just a second or two you'll see the following in the output window:
What more could you want from your descriptive statistics?
The Margin of Victory variable is a ratio scale variable, which SPSS really likes. What if we want to do descriptive statistics for a nominal variable?
Let's open WorldSeriesWinners.sav. This data file shows the baseball World Series winners for every year since 1903, except for a couple of years when there was no World Series. We can look at the Champion variable, which is nominal. Remember that when you have nominal data, there are many descriptive statistics, such as the mean, that cannot be used. So, let's open the Analyze, Descriptive Statistics, Frequencies dialog and select Champion as the variable to analyze:
Make sure Display frequency tables is checked, then select Charts.... In the Charts dialog, select Bar charts and click on Continue.
Back in the Frequencies dialog, select Format.... Select Descending counts and then click on Continue. Selecting Descending counts will order the frequency output from the most commonly occurring observation to the least, which makes the table and bar chart easier to interpret.
Now click OK and you should see the following appear in the output window:
As you can now see, the New York Yankees have won the most World Series, followed by the St. Louis Cardinals, then the Boston Red Sox, and your Los Angeles Dodgers. The table shows this in, uh, tabular form, and the bar chart makes it a pretty picture. Now, we don't know much else except the mode here, which is the New York Yankees with 27 World Series wins, but that's really all we need to know.
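For a nominal variable, counting and sorting is about all there is to do, and the mode is simply whatever lands at the top of the sorted list. Here is a minimal Python sketch of that idea. The team names and counts are placeholders I made up, not the contents of WorldSeriesWinners.sav.

```python
from collections import Counter

# Hypothetical list of champions; NOT the real contents of WorldSeriesWinners.sav
champions = ["Team A", "Team B", "Team A", "Team C", "Team A", "Team B"]

counts = Counter(champions)

# most_common() sorts from the highest count to the lowest (descending counts)
descending = counts.most_common()

# The mode is the category at the top of the descending list
mode_team, mode_wins = descending[0]

for team, wins in descending:
    print(f"{team:<8} {wins:>2} {'#' * wins}")   # crude text version of a bar chart
print("Mode:", mode_team, "with", mode_wins, "wins")
```

Sorting by descending count is what makes the most frequent category easy to spot, both in the table and in the bar chart.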