
3.4: Entering Data Into SPSS

    In order to properly use SPSS you have to have data. And, often, to get data into SPSS you have to create a data file with variables and then enter data. We now begin that fun journey of creating variables in SPSS.

    The Variable View

    Before we input data into the data editor, we need to create the variables using the variable view. To access this view click the ‘Variable View’ tab at the bottom of the data editor. The view will now look something like this:

    [Screenshot: the Variable View of the SPSS data editor]

    Each row in the variable view is a variable and you can set the characteristics of each variable by entering information into the following labeled columns:

    • Name - Enter a name in this column for each variable. This name will appear at the top of the corresponding column in the data view, and helps you to identify variables. You can use almost whatever name you'd like, although there are some symbols you cannot use. For instance, you can't use math symbols in variable names, and you cannot use spaces. If you'd like a space, you might consider using an underscore (such as total_deaths) or you can use something called camel case, where capital letters mark the start of each word, such as TotalDeaths. If you use a name SPSS does not like, it will yell at you and call you stupid.
    • Type - Here you can define the type of the variable. In most cases the variable type will be numeric (i.e., numbers) because that's what SPSS eats for breakfast. You may have other types of variables as well, such as string variables (strings of letters), currency variables (money), and date variables.
    • Width - The width of a variable defines how many spaces are allotted in the data file for the variable. By default, SPSS assigns a width of 8 numbers/characters. This is usually perfectly fine since you don't often use numbers larger than 8 digits, but you may have longer strings of text, in which case you would want to increase the width.
    • Decimals - By default, 2 decimal places are displayed. If you want to change the number of decimal places for a given variable, replace the 2 with a new value or increase or decrease the values using the arrows.
    • Label - Here you can enter a longer variable description. This is one of the best habits you can get into.
    • Values - This column is for assigning numbers to represent groups of people. It's incredibly useful and we'll look at how to use this later.
    • Missing - This column is for assigning numbers to missing data.
    • Columns - While on the surface this seems just like Width described above, it is not. Columns defines how many columns of the variable are displayed in the data view even if the width of the variable is larger.
    • Align - You can use this column to select the alignment of the data in the corresponding column of the data editor, either Left, Right, or Center.
    • Measure - Use this column to define the scale of measurement for the variable. You get to choose from Nominal, Ordinal, or Scale. More on this later.
    • Role - SPSS has some functions that try to run analyses automatically without input from you, the user. In order to do that, SPSS needs to know what role the variable plays in the data. You get choices such as Input, Target, Both, None, Partition, and Split. I'd love to tell you more about these options, but we won't be using any of the automatic analyses because I'd rather eat a shoe than allow SPSS to decide what to do with my data.
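
    To make these columns a little more concrete, here is a minimal Python sketch (not SPSS syntax) that stores a few of a variable's characteristics and applies a simplified name check. The names `VariableSpec` and `is_plausible_name` are hypothetical, and the naming rule used (a letter first, then letters, digits, or underscores) is an assumption that only mirrors the advice above; it is not SPSS's full set of naming rules.

```python
import re
from dataclasses import dataclass, field

# Simplified, assumed rule: a letter first, then letters, digits, or underscores.
# SPSS's real naming rules differ in detail; this only mirrors the advice above
# (no spaces, no math symbols).
_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_plausible_name(name: str) -> bool:
    """Return True if the name passes the simplified check."""
    return _NAME_PATTERN.fullmatch(name) is not None


@dataclass
class VariableSpec:
    """Hypothetical container mirroring some Variable View columns."""
    name: str
    type: str = "Numeric"      # default type described in the text
    width: int = 8             # default width described in the text
    decimals: int = 2          # default number of decimals described in the text
    label: str = ""
    values: dict = field(default_factory=dict)   # code -> group label
    measure: str = "Scale"     # assumed default; one of Nominal, Ordinal, Scale

    def __post_init__(self):
        if not is_plausible_name(self.name):
            raise ValueError(f"Unusable variable name: {self.name!r}")


total = VariableSpec(name="total_deaths", label="Total number of deaths")
camel = VariableSpec(name="TotalDeaths")
print(is_plausible_name("total deaths"))  # False: contains a space
print(is_plausible_name("total+deaths"))  # False: contains a math symbol
```

    Putting the check in `__post_init__` means a bad name is rejected the moment the variable is defined, which is roughly the experience SPSS gives you in the Name column.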

    Creating Variables

    Now let's take a look at how to set up variables. Below you'll find a table with data in it, and we are going to create that data file in SPSS.

    Name     Birthdate   Job        # of Friends  Caffeinated Drinks per Day  Annual Income  Neuroticism
    Amy      11/03/1973  Professor  13            2                           120,000        14
    Bob      01/14/1962  Professor  7             1                           100,000        3
    George   07/16/1960  Professor  3             0                           23,000         6
    Diane    03/27/1972  Professor  2             3                           50,000         18
    Julie    05/18/1980  Professor  9             2                           90,000         12
    Ammarah  04/30/1983  Student    10            5                           13,000         3
    Conrad   08/29/1998  Student    18            5                           12,300         14
    Lauryn   09/25/2000  Student    8             8                           7,200          14
    Kwame    10/23/2002  Student    3             3                           27,100         10
    Damien   01/11/2001  Student    2             5                           1,600          8
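
    If it helps to think of the table as a data file before touching SPSS, the sketch below holds the same ten cases in plain Python, one record per person. The short keys `Caffeine` and `Income` and the helper `parse_row` are hypothetical names chosen for this illustration; birth dates are left as text here.

```python
# Rows copied from the table above, with every value still written as text.
RAW_ROWS = [
    ("Amy", "11/03/1973", "Professor", "13", "2", "120,000", "14"),
    ("Bob", "01/14/1962", "Professor", "7", "1", "100,000", "3"),
    ("George", "07/16/1960", "Professor", "3", "0", "23,000", "6"),
    ("Diane", "03/27/1972", "Professor", "2", "3", "50,000", "18"),
    ("Julie", "05/18/1980", "Professor", "9", "2", "90,000", "12"),
    ("Ammarah", "04/30/1983", "Student", "10", "5", "13,000", "3"),
    ("Conrad", "08/29/1998", "Student", "18", "5", "12,300", "14"),
    ("Lauryn", "09/25/2000", "Student", "8", "8", "7,200", "14"),
    ("Kwame", "10/23/2002", "Student", "3", "3", "27,100", "10"),
    ("Damien", "01/11/2001", "Student", "2", "5", "1,600", "8"),
]


def parse_row(row):
    """Turn one row of text into a record with numeric fields as integers."""
    name, birthdate, job, friends, drinks, income, neuroticism = row
    return {
        "Name": name,
        "Birthdate": birthdate,                    # kept as text in this sketch
        "Job": job,
        "Friends": int(friends),
        "Caffeine": int(drinks),
        "Income": int(income.replace(",", "")),    # drop the thousands separator
        "Neuroticism": int(neuroticism),
    }


cases = [parse_row(row) for row in RAW_ROWS]
print(len(cases))            # 10 cases, one per row of the table
print(cases[0]["Income"])    # 120000
```

    Each key plays the part of a variable (a column in the data view) and each record plays the part of a case (a row).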

     

    Creating a String Variable

    Our first variable is the name of the subject. To create it:

    • Click in the first white cell in the column labeled Name.
    • Type the word ‘Name’.
    • Move from this cell using the arrow keys on the keyboard (you can also just click in a different cell).

    You've now created your first variable. Your screen should look something like this:

    [Screenshot: the Variable View with the new variable Name in the first row]

    Now it's time to define the characteristics of the variable called Name. First, move the cursor to the Type column:

    [Screenshot: the Type cell of the Name variable selected]

    You will see that the default variable type is Numeric. However, the Name variable is a string variable. Click on the cell and the type window will open:

    [Screenshot: the Variable Type dialog box]

    Select String and click on OK. You've now defined the variable Name as a String.

    Next, move to the cell in the Label column and type a description of the variable, such as ‘First Name’. Finally, we can specify the scale of measurement for the variable by going to the column labeled Measure and selecting Nominal, Ordinal, or Scale from the drop-down list. In the case of a string variable, it represents a description of the case and provides no information about the order of cases or the magnitude of one case compared to another. Therefore, select Nominal.

    Once the Name variable has been created, return to the data view by clicking on the ‘Data View’ tab at the bottom of the data editor. Notice that the first column now has the variable label Name at the top.

    [Screenshot: the Data View with Name at the top of the first column]

    Creating a Date Variable

    Next up is the birth date variable, which is a date. Shocking, right? Move the cursor to the Name column and the second row and type Birthdate. Move the cursor and now you've created another variable!

    [Screenshot: the Variable View with Birthdate added in the second row]

    With this variable, move to the Type column, click on the cell and select Date as the type. You can choose any of a number of display types for the date.

    [Screenshot: the Variable Type dialog box with Date selected and a list of date formats]

    Click OK to continue. Now move to the Label column and type in "Birth Date". That's it for this variable.
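
    The idea behind choosing a display type is that the date itself is stored once and only its appearance changes. Here is a small Python sketch of that idea using a birth date from the table (written month/day/year). The format strings are illustrative choices, not a list of the formats SPSS offers.

```python
from datetime import date, datetime

# Amy's birth date from the table, written month/day/year.
stored = datetime.strptime("11/03/1973", "%m/%d/%Y").date()

# One stored date, several ways of displaying it.
displays = {
    "mm/dd/yyyy": stored.strftime("%m/%d/%Y"),
    "dd.mm.yyyy": stored.strftime("%d.%m.%Y"),
    "yyyy-mm-dd": stored.isoformat(),
}

for style, text in displays.items():
    print(f"{style}: {text}")
```

    Whichever style is shown, `stored` is still the same date, so switching display types later does not alter the data.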

    Creating Coding Variables

    Coding variables use numeric values to represent different predictor (independent) groups in a study. SPSS likes numbers way more than it likes text, which is why we use these coding variables. In our example, we have two groups of participants, professors and students. Let's create the next variable and call it Job. When creating coding variables, you assign each group a number. Commonly you would use numbers like 1, 2, 3, etc. to represent group membership. If you are feeling like you want to be different, you can use whatever number you'd like. Maybe the Professor group might be represented by 666 and the student group by 934. It matters not as long as you know what each group is. For our example, let's use 1 for professors and 2 for students.

    In the Variable view, go to the next empty space in the Name column and type Job. Move the cursor to the next cell. SPSS, as usual, has filled in the default values for the remainder of the columns. The variable has been defined as type Numeric, which is great since that's what our variable will be. Because we are using integer values for our coding (I've never seen anyone use decimals, but I suppose it's possible), we'll move to the Decimals column and type 0. Then head on over to the Label column and type something like "Participant's Job," or whatever else you might want to label the variable as. Again, this is always a good habit to have. Sometimes variable names are not clear and the label can help with that. The next column is Values. This is important when using coding variables. This column is where we give labels to each value of the variable. Click on the three dots in the cell and you'll see this window:

    [Screenshot: the Value Labels dialog box]

    In the Value: box, type 1 and move to the Label: box. Here type in "Professor". Then click the Add button. This will add your entry to the list of value labels. Next go back to the Value: box, type 2 and then to the Label: box and type "Student". Then click the Add button. By the time you're done, you should see:

    [Screenshot: the Value Labels dialog box listing 1 = Professor and 2 = Student]

    Click the OK button.

    Next, move to the Measure column. Because this is a categorical variable, click on the drop-down menu and select Nominal. Now you should see something like this:

    [Screenshot: the Variable View with the Name, Birthdate, and Job variables defined]
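
    The coding scheme we just set up for Job (1 for professors, 2 for students) can be sketched in Python as a pair of lookups. The names `JOB_LABELS` and `JOB_CODES` are hypothetical; the point is only that the data hold numbers while the labels live in a separate list.

```python
# Value labels as entered in the dialog: code -> label.
JOB_LABELS = {1: "Professor", 2: "Student"}

# Reverse lookup, used to turn the text in the table into codes.
JOB_CODES = {label: code for code, label in JOB_LABELS.items()}

# The Job column of the table: five professors followed by five students.
jobs_as_text = ["Professor"] * 5 + ["Student"] * 5
jobs_as_codes = [JOB_CODES[job] for job in jobs_as_text]

print(jobs_as_codes)                        # [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
print([JOB_LABELS[code] for code in jobs_as_codes[4:6]])  # ['Professor', 'Student']
```

    Swapping 1 and 2 for 666 and 934 would only mean editing `JOB_LABELS`, which is why the particular numbers don't matter as long as you know what each one stands for.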

     

    Creating a Numeric Variable

     

    Our next variable is Friends, which is numeric. Numeric variables are the easiest ones to create because they are the default format in SPSS. Move back to the variable view using the tab at the bottom of the data editor. Go to the cell in row 4 of the column labeled Name (under the previous variable you created). Type the word ‘Friends’. Move into the column labeled Type using the → key on the keyboard. As with the previous variables we have created, SPSS has assumed that our new variable is Numeric, and because our variable is numeric we don’t need to change this setting.

    The scores for the number of friends have no decimal places (unless you are a very strange person indeed, you can’t have 0.23 of a friend). Move to the Decimals column and type ‘0’ (or decrease the value from 2 to 0 using the arrows) to tell SPSS that you don’t want to display decimal places.

    Let’s continue our good habit of labeling variables and move to the cell in the column labeled Label and type ‘Number of Friends’. Finally, number of friends is measured on the ratio scale of measurement (see Section 1.6.2) and we can specify this by going to the column labeled Measure and selecting Scale from the drop-down list (this will have been done automatically, but it’s worth checking).

    [Figure 4.11: Coding values in the data editor with the value labels switched off and on]

    SPSS Tip 4.4: Copying and pasting into the data editor and variable viewer

    Often (especially with coding variables), you need to enter the same value lots of times into the data editor. Similarly, in the variable view, you might have a series of variables that all have the same value labels (e.g., variables representing questions on a questionnaire might all have value labels of 0 = never, 1 = sometimes, 2 = always to represent responses to those questions). Rather than typing the same number lots of times or entering the same value labels multiple times, you can use the copy and paste functions to speed things up. All you need to do is to select the cell containing the information that you want to copy (whether that is a number or text in the data view or a set of value labels or another characteristic within the variable view) and click with the right mouse button to activate a menu within which you can click (with the left mouse button) on Copy (top of Figure 4.12). Next, highlight any cells into which you want to place what you have copied by dragging the mouse over them while holding down the left mouse button. These cells will be highlighted in orange. While the pointer is over the highlighted cells, click with the right mouse button to activate a menu from which you should click Paste (bottom left of Figure 4.12). The highlighted cells will be filled with the value that you copied (bottom right of Figure 4.12). Figure 4.12 shows the process of copying the value ‘1’ and pasting it into four blank cells in the same column.

    [Figure 4.12: Copying and pasting into empty cells]

    Why is the ‘Number of Friends’ variable a ‘scale’ variable?

    Once the variable has been created, you can return to the data view by clicking on the ‘Data View’ tab at the bottom of the data editor. The contents of the window will change, and you’ll notice that the fourth column now has the label Friends. To enter the data, click the white cell at the top of the column labeled Friends and type the first value from the table, 13. Because we’re entering scores down the column, the most sensible way to record this value in this cell is to press the ↓ key on the keyboard. This action moves you down to the next cell, and the number 13 is stored in the cell above. Enter the next number, 7, and then press ↓ to move down to the next cell, and so on.

    Having created the first four variables with a bit of guidance, try to enter the rest of the variables in Table 4.1 yourself.

    Missing Values

    Although we strive to collect complete sets of data, often scores are missing. Missing data can occur for a variety of reasons: in long questionnaires participants accidentally (or, depending on how paranoid you’re feeling, deliberately to irritate you) miss out questions; in experimental procedures mechanical faults can lead to a score not being recorded; and in research on delicate topics (e.g., sexual behavior) participants may exert their right not to answer a question. However, just because we have missed out on some data for a participant, that doesn’t mean that we have to ignore the data we do have (although it creates statistical difficulties). The simplest way to record a missing score is to leave the cell in the data editor empty, but it can be helpful to tell SPSS explicitly that a score is missing. We do this, much like a coding variable, by choosing a number to represent the missing data point. You then tell SPSS to treat that number as missing. For obvious reasons, it is important to choose a code that cannot also be a naturally occurring data value. For example, if we use the value 9 to code missing values and several participants genuinely scored 9, then SPSS will wrongly treat those scores as missing. You need an ‘impossible’ value, so people usually pick a score greater than the maximum possible score on the measure. For example, in an experiment in which attitudes are measured on a 100-point scale (so scores vary from 1 to 100) a good code for missing values might be something like 101, 999 or, my personal favorite, 666 (because missing values are the devil’s work).
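
    To see why the choice of code matters, here is a minimal Python sketch (not SPSS) with a handful of made-up scores on a 1 to 100 scale, one of which was never recorded. The scores and the helper `mean_ignoring` are hypothetical and exist only to contrast a poor code (9) with an ‘impossible’ one (999).

```python
from statistics import mean


def mean_ignoring(scores, missing_code):
    """Average the scores, skipping any that equal the missing-value code."""
    valid = [score for score in scores if score != missing_code]
    return mean(valid)


# Hypothetical attitude scores; None marks the score that was not recorded.
recorded = [9, 50, None, 9, 72]

# Poor choice: 9 is also a genuine score, so real data get thrown away.
coded_with_9 = [9 if score is None else score for score in recorded]
print(mean_ignoring(coded_with_9, missing_code=9))      # 61: both genuine 9s were dropped

# Better choice: 999 cannot occur on a 1 to 100 scale.
coded_with_999 = [999 if score is None else score for score in recorded]
print(mean_ignoring(coded_with_999, missing_code=999))  # 35: only the real gap is skipped
```

    With 9 as the code, two genuine scores vanish along with the real gap and the mean jumps from 35 to 61.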

    Labcoat Leni’s Real Research 4.1: Gonna be a rock ‘n’ roll singer

    Oxoby, R. J. (2008). Economic Inquiry, 47(3), 598–602.

    AC/DC are one of the best-selling hard rock bands in history, with around 100 million certified sales, and an estimated 200 million actual sales. In 1980 their original singer Bon Scott died of alcohol poisoning and choking on his own vomit. He was replaced by Brian Johnson, who has been their singer ever since (well, until all that weird stuff with W. Axl Rose in 2016, which I’m trying to pretend didn’t happen). Debate rages with unerring frequency within the rock music press over who is the better frontman. The conventional wisdom is that Bon Scott was better, although personally, and I seem to be somewhat in the minority here, I prefer Brian Johnson. Anyway, Robert Oxoby, in a playful paper, decided to put this argument to bed once and for all (Oxoby, 2008).

    Using a task from experimental economics called the ultimatum game, individuals are assigned the role of either proposer or responder and paired randomly. Proposers are allocated $10 from which they have to make a financial offer to the responder (e.g., $2). The responder can accept or reject this offer. If the offer is rejected neither party gets any money, but if the offer is accepted the responder keeps the offered amount (e.g., $2), and the proposer keeps the original amount minus what they offered (e.g., $8). For half of the participants the song ‘It’s a long way to the top’ sung by Bon Scott was playing in the background; for the remainder ‘Shoot to thrill’ sung by Brian Johnson was playing. Oxoby measured the offers made by proposers, and the minimum offers that responders accepted (called the minimum acceptable offer). He reasoned that people would accept lower offers and propose higher offers when listening to something they like (because of the ‘feel-good factor’ the music creates). Therefore, by comparing the value of offers made and the minimum acceptable offers in the two groups, he could see whether people have more of a feel-good factor when listening to Bon or Brian. The offers made (in $) are as follows (there were 18 people per group; these data are estimated from Figures 1 and 2 in the paper because I couldn’t get hold of the author to get the original data files):

        Bon Scott group: 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5
        Brian Johnson group: 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5

    Enter these data into the SPSS Data Editor, remembering to include value labels, to set the measure property, to give each variable a proper label, and to set the appropriate number of decimal places. Answers are on the companion website, and my version of how this file should look can be found in Oxoby (2008) Offers.sav.
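
    One way to picture the layout this exercise is after is one row per person, with a coding variable for the singer and a numeric variable for the offer. The Python sketch below arranges the offers listed above that way. The codes 1 and 2 and the name `SINGER_LABELS` are assumptions for illustration, and this is not the companion website's answer file.

```python
from statistics import mean

# Offers (in $) exactly as listed above, 18 per group.
bon_scott = [1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5]
brian_johnson = [2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 5]

# Assumed coding scheme for the grouping variable (any two distinct codes work).
SINGER_LABELS = {1: "Bon Scott", 2: "Brian Johnson"}

# One row per person: (singer code, offer in $).
rows = [(1, offer) for offer in bon_scott] + [(2, offer) for offer in brian_johnson]

print(len(rows))  # 36 rows, one per person

# Group sizes and mean offers, as a quick check that the data went in correctly.
for code, label in SINGER_LABELS.items():
    offers = [offer for singer, offer in rows if singer == code]
    print(label, len(offers), round(mean(offers), 2))
```

    Checking that each group has 18 rows is a cheap way to catch a dropped or doubled value before running any analysis.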

    To specify missing values, click in the column labeled Missing in the variable view and then click the three dots that appear in the cell to activate the Missing Values dialog box in Figure 4.13. By default, SPSS assumes that no missing values exist, but you can define them in one of two ways. The first is to select discrete values (by clicking on the radio button next to where it says Discrete missing values), which are single values that represent missing data. SPSS allows you to specify up to three values to represent missing data. The reason why you might choose to have several numbers to represent missing values is that you can assign a different meaning to each discrete value. For example, you could have the number 8 representing a response of ‘not applicable’, a code of 9 representing a ‘don’t know’ response, and a code of 99 meaning that the participant failed to give any response. SPSS treats these values in the same way (it ignores them), but different codes can be helpful to remind you of why a particular score is missing. The second option is to select a range of values to represent missing data, and this is useful in situations in which it is necessary to exclude data falling between two points. So, we could exclude all scores between 5 and 10. With this last option you can also (but don’t have to) specify one discrete value.
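
    The two options described above (up to three discrete values, or a range plus at most one discrete value) can be sketched as a small Python class. `MissingSpec` is a hypothetical name, and treating the range endpoints as included is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MissingSpec:
    """Hypothetical description of the missing-value codes for one variable."""
    discrete: Tuple[float, ...] = ()                    # single codes such as 8, 9, 99
    value_range: Optional[Tuple[float, float]] = None   # (low, high), endpoints included

    def __post_init__(self):
        # Up to three discrete codes alone, or one alongside a range.
        limit = 1 if self.value_range is not None else 3
        if len(self.discrete) > limit:
            raise ValueError(f"at most {limit} discrete value(s) allowed here")
        if self.value_range is not None and self.value_range[0] > self.value_range[1]:
            raise ValueError("range must be given as (low, high)")

    def is_missing(self, value: float) -> bool:
        """Return True if the value should be treated as missing."""
        if value in self.discrete:
            return True
        if self.value_range is not None:
            low, high = self.value_range
            return low <= value <= high
        return False


# Option 1: three discrete codes with different meanings.
codes = MissingSpec(discrete=(8, 9, 99))
# Option 2: a range of excluded scores plus one discrete code.
ranged = MissingSpec(discrete=(99,), value_range=(5, 10))

print(codes.is_missing(9))    # True
print(ranged.is_missing(7))   # True
print(ranged.is_missing(11))  # False
```

    Note that `is_missing` gives the same answer for 8, 9, and 99; the different meanings are for your benefit, not the software's.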


    This page titled 3.4: Entering Data Into SPSS is shared under a CC BY-SA 4.0 license and was authored, remixed, and/or curated by Danielle Navarro.