
Introduction To SPSS

Here are the variables we can identify from this questionnaire: - Age (numeric) - Height (numeric) - Weight (numeric) - Cigarettes per day (categorical) - Own bathroom scales (categorical) - Own rowing machine (categorical) - Own exercise bike (categorical) - Own punch bag (categorical) - Own other sports equipment (categorical) - Hours exercising per week (numeric) - Member of sporting team (categorical) - Number of books owned (numeric) - Own video recorder (categorical) - Own DVD player (categorical) - Other variables may be identified from question 15 These variables will need to be created in SPSS

Uploaded by

TheGimhan123

INTRODUCTION TO

SPSS v15
CONTENTS

LESSON 1: SPSS BASICS

1.1 SPSS FILES
1.2 STARTING SPSS
1.3 OPENING AN EXISTING DATA FILE
1.4 SAVING A DATA FILE
1.5 CREATING A NEW DATA FILE
1.6 CREATING SPSS DATA
1.6.1 SPSS VARIABLES
1.6.2 IDENTIFYING VARIABLES FROM QUESTIONNAIRES
1.6.3 CREATING SPSS VARIABLES
1.6.4 DELETING A VARIABLE
1.6.5 EXCLUDING DATA FROM CALCULATIONS: DEFINING MISSING VALUES
1.6.6 REDEFINING DATA
1.7 GETTING AROUND IN SPSS

LESSON 2: SUMMARIZING DATA

2.1 SUMMARIZING DATA NUMERICALLY
2.1.1 CALCULATING THE FREQUENCY OF A CATEGORICAL VARIABLE
2.1.2 CALCULATING FREQUENCIES FOR TABULATED CATEGORICAL VARIABLES (CROSSTABS)
2.1.3 CALCULATING THE MEAN, MAXIMUM AND MINIMUM OF A SCALAR (NON-CATEGORICAL) VARIABLE
2.1.4 THE STANDARD DEVIATION
2.2 SUMMARIZING DATA GRAPHICALLY
2.2.1 CREATING A BAR CHART
2.2.2 CREATING A PIE CHART
2.2.3 CREATING A LINE CHART

PROCEDURE 0: GETTING DATA
Click on Start > JMU Applications > System Utilities > Change Library.
This displays the Connect Library screen.

Expand the 'Change To' dropdown list and select 'Avril Robarts LRC' from the options displayed.
A message is displayed.

Click on 'OK'.
Minimise all programs so you can see your desktop.
Click on the 'My Computer' icon.
You will see an icon for your L: drive under Network Drives.

Double-click on this icon. This opens the library drive.

Double-click on the 'SPSS' folder to open it.
Copy the contents to a folder on your M: drive.

LESSON 1 –
SPSS BASICS
In this lesson, you will learn how to:
• Start SPSS
• Open an existing data file
• Save a data file
• Create a new data file
• Create SPSS variables
• Enter SPSS data
• Navigate SPSS

Lesson 1: SPSS Basics
1.1 SPSS Files
There are three types of file used by SPSS:

o *.sav files (data files):
Spreadsheet-like files that contain the data to be analysed. It is also possible to import files generated by other packages, such as Microsoft Excel spreadsheets and plain text files, into SPSS.

o *.spo files (output files):
Most output generated by SPSS (e.g. results of data analysis, graphs, charts, and any errors the program may have encountered) is written to these files. They are viewed in the SPSS Output Viewer.

o *.sps files (syntax files):
These files contain text commands that may be run on the data files. They are not covered in this course.

1.2 Starting SPSS

PROCEDURE 1: STARTING SPSS


Select the Start button at the bottom left corner of the screen.
Select 'JMU Applications' from the displayed menu.
Select 'Analysis Tools' from the displayed menu.
Select SPSS v15 from the displayed menu.
This starts SPSS. The Welcome Screen is often opened at this point, displaying a list of the most recently opened data files.
o To open an existing file from the list, highlight that file in the list and select OK.
o To create a new file, select Cancel.
To work with existing data files, see later sections of this document.

Figure 1: SPSS Welcome Screen

1.3 Opening an Existing Data File


PROCEDURE 2: OPENING AN EXISTING DATA FILE
If the Welcome Screen is visible:
o If the required file is displayed on the list: highlight the required file on the list, then select the OK button. The selected data file is opened (see Figure 3).
o If the required file is not displayed on the list: select the Cancel button. The Welcome Screen closes. Proceed as described below for the case where the Welcome Screen is not visible.
If the Welcome Screen is not visible:
o Select "File/Open/Data" from the Menu Bar. The Open screen is displayed (see Figure 2).
o Select the required data file.
o Select the Open button. The selected data file is opened (see Figure 3).

Figure 2: The Open Screen

Figure 3: An Example Data File

1.4 Saving a Data File


PROCEDURE 3: SAVING A DATA FILE
Either: From the menu bar, select File/Save.
Or: Select the Save icon from the toolbar.

1.5 Creating a New Data File
PROCEDURE 4: CREATING A NEW DATA FILE
If the Welcome Screen is visible: Select the 'Cancel' button.
If the Welcome Screen is not visible: Select "File/New/Data" from the Menu Bar.
The SPSS data screen is shown (see Figure 4).

Figure 4: The SPSS Data Screen

We are now ready to create our data file.

1.6 Creating SPSS Data
1.6.1 SPSS Variables
SPSS requires variables to hold its data, so we must create these variables before we
can enter any data.

SPSS variables can be one of three kinds:


1. Numeric. Can only contain numbers. Examples of numeric data may include:
o Age,
o Height,
o Weight,
o Distance etc.
2. Text. Can contain any alphanumeric characters. Examples of text data may
include:
o Name,
o Address,
o Make of Car
o Hair Colour etc.
3. Categorical. Can be text or numeric. Categorical data (also called Case
data) can only take one of a predetermined set of values. For example:
o Married? (Y/N)
o Salary Range (Less than £20,000/£20,000-£40,000/More than £40,000)
o Eye Colour (Blue/Brown/Green)
o Age Group (0-12/13-19/20-29/30-49/50-69/70+)
The predetermined values that the user has to choose from are set up when the
variable is created.

1.6.2 Identifying Variables from Questionnaires
Since it is common to obtain data for our statistical experiments from questionnaires,
it is important to understand how to use our questionnaires to identify the variables we
will need.

Consider the following example questionnaire:

EXAMPLE 1: QUESTIONNAIRE
QUESTIONNAIRE
No.  Question                                            Answer
1    Age (in years)
2    Height (in meters)
3    Weight (in kilogrammes)
4    How many cigarettes do you smoke per day            None / 1 - 10 / 11 - 20 / 21 - 30 / 31 - 40 / 41 or more
     (please tick as appropriate)?
5    Do you own a set of Bathroom Scales?                Y/N
6    Do you own a Rowing Machine?                        Y/N
7    Do you own an Exercise Bike?                        Y/N
8    Do you own a Punch Bag?                             Y/N
9    Do you own any other Sports Equipment?              Y/N
10   How many hours a week (approx) do you spend
     exercising?
11   Are you a member of a sporting team?                Y/N
12   How many books do you own?
13   Do you own a video recorder?                        Y/N
14   Do you own a DVD player?                            Y/N
15   Do you own a PC?                                    Y/N
16   Do you own a Hi Fi?                                 Y/N
17   How many computer games do you own?
18   Salary (in thousands of pounds). Please tick        < 20 / 20.01 - 30 / 30.01 - 40 / 40.01 plus
     one as appropriate.
There are 18 questions, and we will need one variable per question. So we will need
18 variables:

QUESTION REQUIRED VARIABLE TYPE


1 Numeric
2 Numeric
3 Numeric
4 Categorical (6 categories: one for each possible answer)
5 Categorical (2 categories: one for each possible answer)
6 Categorical (2 categories: one for each possible answer)
7 Categorical (2 categories: one for each possible answer)
8 Categorical (2 categories: one for each possible answer)
9 Categorical (2 categories: one for each possible answer)
10 Numeric
11 Categorical (2 categories: one for each possible answer)
12 Numeric
13 Categorical (2 categories: one for each possible answer)
14 Categorical (2 categories: one for each possible answer)
15 Categorical (2 categories: one for each possible answer)
16 Categorical (2 categories: one for each possible answer)
17 Numeric
18 Categorical (4 categories: one for each possible answer)
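
The table above can also be written down as a small lookup before any variables are created in SPSS. The following Python sketch is for illustration only and is not SPSS syntax; the names NUMERIC_QUESTIONS, CATEGORY_COUNTS and variable_type are assumptions made for this example. It records the type of each of the 18 questions and counts how many variables of each type are needed.

```python
from collections import Counter

# Questions answered with a free number (from the table above).
NUMERIC_QUESTIONS = {1, 2, 3, 10, 12, 17}
# Categorical questions with more than two categories; all others are Y/N.
CATEGORY_COUNTS = {4: 6, 18: 4}

def variable_type(question):
    """Return (type, number_of_categories) for a question number from 1 to 18."""
    if question in NUMERIC_QUESTIONS:
        return ("numeric", None)
    return ("categorical", CATEGORY_COUNTS.get(question, 2))

# Count how many variables of each type the questionnaire needs.
type_counts = Counter(variable_type(q)[0] for q in range(1, 19))
print(type_counts)
```

Under these assumptions the questionnaire needs 6 numeric and 12 categorical variables, 18 in total.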

1.6.3 Creating SPSS Variables


Variables are created using the SPSS Variable View form.

PROCEDURE 5: OPENING THE VARIABLE VIEW FORM


Select the Variable View tab.

Figure 5: The Variable View Tab

The Variable View form is displayed.

Figure 6: The Variable View Form Column Headers

Each row in the Variable View form represents one variable.

1.6.3.1 Rules for Variable Names
The following rules apply to variable names:

• The name must begin with a letter. The remaining characters can be any letter, any digit, a period, or the symbols @, #, _, or $.
• Variable names cannot end with a period.
• Variable names that end with an underscore should be avoided (to avoid conflict with variables automatically created by some procedures).
• Blanks and special characters (for example, !, ?, ', and *) cannot be used.
• Each variable name must be unique; duplication is not allowed. Variable names are not case sensitive. The names NEWVAR, NewVar, and newvar are all considered identical.
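
It can be handy to check a list of planned names against these rules before typing them into SPSS. The following Python sketch is one possible check, written outside SPSS; the function name is_valid_name is an assumption for this example, and it covers only the rules listed above (SPSS may apply further restrictions that are not described here).

```python
import re

# First character: a letter. Remaining characters: letters, digits,
# period, @, #, _ or $. The final character must not be a period.
_NAME_PATTERN = re.compile(r"[A-Za-z]([A-Za-z0-9.@#_$]*[A-Za-z0-9@#_$])?")

def is_valid_name(name, existing_names=()):
    """Check a proposed variable name against the rules listed above."""
    if not _NAME_PATTERN.fullmatch(name):
        return False
    # Names are not case sensitive, so compare them in lower case.
    taken = {n.lower() for n in existing_names}
    return name.lower() not in taken

print(is_valid_name("age"))                   # True
print(is_valid_name("1age"))                  # False: does not begin with a letter
print(is_valid_name("NewVar", ["NEWVAR"]))    # False: duplicates an existing name
```

The sketch does not flag names ending in an underscore, because the rule above only advises against them rather than forbidding them.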

1.6.3.2 Creating a Numeric Variable


PROCEDURE 6: CREATING A NUMERIC VARIABLE
Open the Variable View form, if it is not already visible (see section 1.6.3).
On the first available blank row:
o Enter a variable name in the 'Name' column. This name will be used to identify your variable within SPSS. See section 1.6.3.1 for the Variable Name Rules.
o Press the 'tab' or 'return' key. This fills the row with default settings for that variable.
o In the 'Decimals' column, enter the number of decimal places you require.
o In the 'Label' column, enter a description for that variable.
o Select any other row on the Variable View form to start creating a new variable, or select the Data View tab to start entering data.

1.6.3.3 Creating a Text Variable
PROCEDURE 7: CREATING A TEXT VARIABLE
Open the Variable View form, if it is not already visible (see section 1.6.3).
On the first available blank row:
o Enter a variable name in the 'Name' column. This name will be used to identify your variable within SPSS. See section 1.6.3.1 for the Variable Name Rules.
o Press the 'tab' or 'return' key. This fills the row with default settings for that variable.
o Select the cell in the 'Type' column. Click on the button that appears on the right hand side of the field. This displays the 'Variable Type' box.

Figure 7: The Variable Type Box

o Select the 'String' option, and then the OK button. This displays the Variable View form again.
o In the 'Width' field, set the maximum number of characters for your data.
o In the 'Label' column, enter a description for that variable.
o Select any other row on the Variable View form to start creating a new variable, or select the Data View tab to start entering data.

1.6.3.4 Creating a Categorical Variable


A categorical variable is created with values to represent each category. These values
can be either numeric or strings.

For example, a Yes/No variable could use the value Y to represent 'Yes' and N to
represent 'No'. These are string values.

Alternatively, you might want to use the value 1 to represent 'Yes' and 0 to represent
'No'. These are numeric values.

PROCEDURE 8: CREATING A CATEGORICAL VARIABLE
Open the Variable View form, if it is not already visible (see section 1.6.3).
On the first available blank row:
o Create either a numeric or a string variable, depending on the type required for the category values.
o Select the cell in the 'Values' column. Click on the button that appears on the right hand side of the field. This displays the 'Value Labels' box.

Figure 8: The Value Labels Box

o Enter the required value in the 'Value' field.
o Enter the category description in the 'Value Label' field.

Figure 9: Creating Category Data

o Select the 'Add' button.
o Keep adding Values and Value Labels until all categories have been included.

Figure 10: The Value Labels Box

o Select OK. This closes the Value Labels box and returns focus to the Variable View form.
o Select any other row on the Variable View form to start creating a new variable, or select the Data View tab to start entering data.
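
The relationship between stored values and their value labels can be pictured as a simple lookup table. The following Python sketch is illustrative only and is not how SPSS stores labels internally; the names VALUE_LABELS and label_for are assumptions for this example, using the numeric Yes/No coding described in section 1.6.3.4.

```python
# Hypothetical value labels for a Yes/No question coded numerically.
VALUE_LABELS = {1: "Yes", 0: "No"}

def label_for(value, labels=VALUE_LABELS):
    """Return the category description for a stored value.

    Values without a defined label are shown as plain text.
    """
    return labels.get(value, str(value))

responses = [1, 0, 1, 1]
print([label_for(r) for r in responses])  # ['Yes', 'No', 'Yes', 'Yes']
```

The data file holds only the short values (1 and 0); the labels are attached to the variable and are used when results are displayed.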

1.6.4 Deleting a variable
PROCEDURE 9: DELETING A VARIABLE
Open the Variable View form.
Right-click on the variable number.
Select 'Clear' from the drop-down menu.
The variable is deleted.

EXERCISE 1
You have been handed the following questionnaire, which needs to be coded into SPSS:

QUESTIONNAIRE
No.  Question             Answer
1    Age (in years)
2    Height (in meters)
3    Name
4    Gender               Male [ ]  Female [ ]

Create an SPSS data file (called Q1.sav) to hold the data.
Before you create the variables, decide:
o How many variables are needed?
o What type (numeric, string, categorical) does each variable need to be?

EXERCISE 2
The following 2 completed questionnaires have been submitted.
Enter the data into your SPSS data file Q1.sav.

QUESTIONNAIRE
No.  Question             Answer
1    Age (in years)       32
2    Height (in meters)   1.56
3    Name                 Julie Jones
4    Gender               Male [ ]  Female [X]

QUESTIONNAIRE
No.  Question             Answer
1    Age (in years)       41
2    Height (in meters)   1.92
3    Name                 Andy Andrews
4    Gender               Male [X]  Female [ ]

1.6.5 Excluding Data from Calculations: Defining Missing
Values
In SPSS there are no empty cells within the data file (which is assumed to be
rectangular). If no value has been entered, the system supplies the system-missing
value (represented on the screen as a dot). SPSS will automatically exclude system-
missing values from its statistical calculations.

It may be, however, that the user wishes SPSS to treat certain responses present within the data as if they were missing. For example, one question on our survey might have five possible answers (A, B, C, D, E). Suppose that category E indicates that the respondent refused to answer the question, while category D indicates that the respondent failed to understand the question. We might wish to complete an analysis of our data without including categories D and E in the calculations, even though we do want to see their frequencies and counts in the output. In this case we could define categories D and E as user-missing.

SPSS allows us to define special values (missing values) which will be used to indicate that the data is user-missing. Data values specified as user-missing are flagged for special treatment and are excluded from most calculations.

For each variable, you can choose one of the following options:
o Define no missing values.
o Define up to three discrete values that the system will treat as 'user-missing'. In our example above, we could define categories D and E as missing values.
o Define a range plus one discrete value. For example, in an analysis of examination results we might want to exclude results less than 20% from our analysis, but still have the counts of those results displayed on our output. We might also want to exclude those students who left the examination without submitting a paper. A walk-out could be coded as an arbitrary number, such as -9; the minus sign makes it obvious that it is not an examination result. Here we could define the range 0%-20% and the discrete value -9 as missing values.

All string values, including null or blank values, are considered valid unless you
explicitly define them as missing. To define null or blank values as missing for a
string variable, enter a single space in one of the fields for Discrete missing values.
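
The effect of user-missing values on a calculation can be sketched outside SPSS. The following Python example is an illustration only, not SPSS behaviour in every detail; the function name is_user_missing and the marks are hypothetical, and the range is assumed to be inclusive at both ends. It applies the examination example above (range 0 to 20 plus the discrete value -9) and averages only the remaining marks.

```python
def is_user_missing(value, discrete=(), low=None, high=None):
    """Return True if value is a discrete missing value or lies in [low, high]."""
    if value in discrete:
        return True
    if low is not None and high is not None:
        return low <= value <= high
    return False

# Hypothetical examination marks; -9 is the code for a walk-out.
marks = [55, 12, -9, 73, 20, 41]

# Keep only the marks that are not flagged as user-missing.
valid = [m for m in marks if not is_user_missing(m, discrete=(-9,), low=0, high=20)]

print(valid)                    # [55, 73, 41]
print(sum(valid) / len(valid))  # mean of the valid marks only
```

The flagged marks remain in the data, so they can still be counted, but they no longer distort the mean.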

PROCEDURE 10: DEFINING MISSING VALUES
Open the Variable View form, if it is not already visible (see section 1.6.3).
On the row representing the variable to have missing values defined:
Select the cell in the 'Missing' column. Click on the button that appears on the right hand side of the field. This displays the 'Missing Values' box.

Figure 11: The Missing Values Box

To define 'No missing values', either:
o Select the 'Cancel' button, or
o Ensure that 'No missing values' is checked, then select the 'OK' button.
To define 'Discrete missing values':
o Check on 'Discrete missing values'.
o Enter up to three separate values in the fields provided.
o Select the 'OK' button.
Any datum with a value corresponding to one of these entered values will be marked user-missing and ignored for most calculations.
To define 'Range plus one optional discrete missing value':
o Check on 'Range plus one optional discrete missing value'.
o Enter the limits of the range in the 'Low' and 'High' fields.
o If required, enter a further discrete value in the 'Discrete value' field.
Any datum having a value within the entered range, or corresponding to the discrete value entered, will be marked user-missing and ignored for most calculations.

1.6.6 Redefining Data


Many statistical tests require data that has been formatted in a particular way. For
example, some tests require numerical data, while others require categorical data.
Once data has been collected and coded into SPSS, it is often necessary to change the
way in which it is represented in order to perform these tests. It is very common to
have to transform data from one form to another once it has been coded.

1.6.6.1 Converting Numeric Data to Categorical Data


We can translate a numeric variable into a categorical variable by creating categories
corresponding to different values of that variable.

For example, a variable might contain respondents' ages, held as numerical data.
However, we may decide to redefine this data so it has the following categories:
19 or below, 20 to 29, 30 to 39, 40 to 49, 50 to 59, 60 or above.

When categorising data, the results can be placed in the same variable, or a new
variable can be created. It is more common to create a new variable:

PROCEDURE 11: CONVERTING NUMERIC DATA TO CATEGORICAL DATA
From the 'Transform' menu, select 'Recode into Different Variables…'.
This displays the 'Recode into Different Variables' dialog box.

Figure 12: The Recode into Different Variables Dialog Box

Select the variable you want to transform in the left hand box.
Select the arrow to move the variable into the 'Input Variable -> Output Variable' box.
Enter a name for the new variable in the 'Name' field.
Enter a label for the new variable in the 'Label' field.
Select the 'Change' button.
Select the 'Old and New Values…' button.
This displays the 'Recode into Different Variables: Old and New Values' dialog box.

Figure 13: The Old and New Values Dialog Box

To enter the lowest category (e.g. 19 or below):
o Ensure that the 'Range: LOWEST through value:' checkbox is checked 'on'.
o Enter the corresponding value (19, in our example) in the 'LOWEST through value' field.
o Ensure that the 'Value' check box is set to 'on' in the 'New Value' frame.
o Enter the category value in the 'Value' field of the 'New Value' frame.
o Select the 'Add' button.
To enter the highest category (e.g. 60 or above):
o Ensure that the 'Range: value through HIGHEST:' checkbox is checked 'on'.
o Enter the corresponding value (60, in our example) in the 'value through HIGHEST:' field.
o Ensure that the 'Value' check box is set to 'on' in the 'New Value' frame.
o Enter the category value in the 'Value' field of the 'New Value' frame.
o Select the 'Add' button.
To enter central categories (e.g. 20 to 29):
o Ensure that the 'Range: through' checkbox is checked 'on'.
o Enter the lowest and highest values of the range in the corresponding boxes (20 and 29, in our example).

Figure 14: Adding a Range

o Ensure that the 'Value' check box is set to 'on' in the 'New Value' frame.
o Enter the category value in the 'Value' field of the 'New Value' frame.
o Select the 'Add' button.
NOTE: Ensure that all ranges are mutually exclusive.
Repeat for all other ranges.
When all ranges have been added, select the 'Continue' button.
This redisplays the 'Recode into Different Variables' dialog box.
Select the 'OK' button.
The new variable is now created, and the correct values are computed.
Define the new variable as a Categorical Variable (see Procedure 8).
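
The logic of this recode can be sketched in a few lines of Python. This is an illustration of the idea rather than SPSS syntax; the function name recode_age and the category codes 1 to 6 are assumptions for this example, and ages are assumed to be whole years so that the ranges leave no gaps.

```python
# Hypothetical category codes and their labels for the age groups above.
AGE_LABELS = {
    1: "19 or below", 2: "20 to 29", 3: "30 to 39",
    4: "40 to 49", 5: "50 to 59", 6: "60 or above",
}

def recode_age(age):
    """Map an age in whole years to a category code from 1 to 6."""
    if age <= 19:
        return 1   # LOWEST through 19
    if age >= 60:
        return 6   # 60 through HIGHEST
    # Central ranges 20-29, 30-39, 40-49, 50-59 become codes 2 to 5.
    return age // 10

ages = [17, 20, 29, 45, 59, 60]
age_groups = [recode_age(a) for a in ages]   # results go into a new list
print(age_groups)                            # [1, 2, 2, 4, 5, 6]
print([AGE_LABELS[g] for g in age_groups])
```

As in the procedure, the original ages are left untouched and the categories are placed in a new variable.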

1.6.6.2 Regrouping Categorical Variables
Sometimes it is necessary to combine the categories of a categorical variable to form a
new set of categories. For example, we might have a categorical variable with 5
categories, which we want to recode into 3 categories, as follows:

Existing Variable    Required Variable
1=Very Poor          1=Below Average
2=Poor               1=Below Average
3=Average            2=Average
4=Good               3=Above Average
5=Very Good          3=Above Average

SPSS provides a mechanism for recoding variables. As before, when regrouping categorical data, the results can be placed in the same variable, or a new variable can be created. It is more common to create a new variable:

PROCEDURE 12: REGROUPING A CATEGORICAL VARIABLE


From the 'Transform' menu, select 'Recode into Different Variables…'.
This displays the 'Recode into Different Variables' dialog box.

Figure 15: The Recode into Different Variables Dialog Box

Select the categorical variable you want to regroup in the left hand box.
Select the arrow to move the variable into the 'Input Variable -> Output Variable' box.
Enter a name for the new variable in the 'Name' field.
Enter a label for the new variable in the 'Label' field.
Select the 'Change' button.
Select the 'Old and New Values…' button.
This displays the 'Recode into Different Variables: Old and New Values' dialog box.

Figure 16: The Old and New Values Dialog Box

To group the lowest categories (e.g. 1 and 2):
o Ensure that the 'Range: Lowest through' checkbox is checked 'on'.
o Enter the corresponding value (2, in our example) in the 'Lowest through' field.
o Ensure that the 'Value' check box is set to 'on' in the 'New Value' frame.
o Enter the new value (1, in our example) in the 'Value' field of the 'New Value' frame.
o Select the 'Add' button.
To group the highest categories (e.g. 4 and 5):
o Ensure that the 'Range: through highest' checkbox is checked 'on'.
o Enter the corresponding value (4, in our example) in the 'through highest' field.
o Ensure that the 'Value' check box is set to 'on' in the 'New Value' frame.
o Enter the new value (3, in our example) in the 'Value' field of the 'New Value' frame.
o Select the 'Add' button.
To include the central category:
o Ensure that the 'Value' checkbox is checked 'on' in the 'Old Value' frame.
o Enter the value to be recoded (in this example, 3) in the 'Value' field in the 'Old Value' frame.
o Ensure that the 'Value' check box is set to 'on' in the 'New Value' frame.
o Enter the new value (2, in this example) in the 'Value' field of the 'New Value' frame.
o Select the 'Add' button.
Repeat for all other values and ranges.
When all values and ranges are assigned recoded values, select the 'Continue' button.
This redisplays the 'Recode into Different Variables' dialog box.
Select the 'OK' button.
The new variable is now created, and the correct values are computed.
Define the new variable as a Categorical Variable (see Procedure 8).
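
The regrouping itself amounts to a lookup from old values to new values. The following Python sketch illustrates this with the five-to-three category example above; it is not SPSS syntax, and the names REGROUP, NEW_LABELS and regroup, as well as the sample ratings, are assumptions for this example.

```python
# Old value -> new value, following the table in section 1.6.6.2.
REGROUP = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}
NEW_LABELS = {1: "Below Average", 2: "Average", 3: "Above Average"}

def regroup(values):
    """Return a new list of recoded values; the original list is unchanged."""
    return [REGROUP[v] for v in values]

ratings = [1, 5, 3, 2, 4]          # hypothetical responses
regrouped = regroup(ratings)
print(regrouped)                   # [1, 3, 2, 1, 3]
print([NEW_LABELS[v] for v in regrouped])
```

Each old value appears exactly once in the lookup, which is the same requirement as entering every value or range in the Old and New Values dialog box.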

1.6.6.3 Calculating New Variables from Existing Data
It is possible to create new variables by performing calculations on existing data. For
example, in your questionnaire you may have two variables, called 'before' and
'after', describing respondents' weight before and after a slimming treatment.
However, once the data has been gathered, you may decide that you need to perform
some tests on the actual difference in weight. It is necessary to calculate this value
from the 'before' and 'after' variables.

PROCEDURE 13: CALCULATING NEW VARIABLES FROM EXISTING DATA
On the menu bar select Transform/Compute Variable.
This displays the 'Compute Variable' dialog box.

Figure 17: The Compute Variable Dialog Box

Enter the name of the new variable in the 'Target Variable:' field.
Build the calculation:
o Select a variable from the left hand box.
o Select the arrow to move it into the 'Numeric Expression' box.
o Select an arithmetic operator from the displayed keypad (+, -, *, / etc.).
o Select another variable from the left hand box.
o Select the arrow to move it into the 'Numeric Expression' box.
o Continue until the complete expression is built.
NOTE: VARIABLES USED IN CALCULATIONS MUST BE NUMERIC.
Select 'OK'.
Open the Variable View form (see Procedure 5).
Create a label for the new variable.
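
For the slimming example, the expression built in the dialog box would be the 'before' value minus the 'after' value, evaluated once per respondent. The following Python sketch shows the same calculation outside SPSS; the function name compute_difference and the weights are hypothetical.

```python
def compute_difference(before, after):
    """Return before - after for each respondent (positive means weight lost)."""
    return [b - a for b, a in zip(before, after)]

before = [82.0, 95.5, 70.0]   # hypothetical weights in kilogrammes
after = [78.5, 96.0, 70.0]

difference = compute_difference(before, after)
print(difference)             # [3.5, -0.5, 0.0]
```

The result is a new variable with one value per respondent, which can then be analysed like any other numeric variable.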

1.7 Getting Around in SPSS
To change any part of a variable, double-click on the horizontal grey bar at the top of
the spreadsheet. This brings you to the Define Variable window.
Use the arrow keys on the keyboard to move forward and backward.
To go all the way to the end of the data, press <control> and the right arrow key.
To go all the way to the beginning of the data, press <control> and the left arrow key.

LESSON 2 –
SUMMARISING DATA
In this lesson, you will learn how to:
• Use SPSS to summarise data
• Calculate frequencies (i.e. number of occurrences) of data
• Calculate maximum, minimum and mean values
• Calculate Standard Deviation
• Display data in graphical format

Lesson 2: Summarizing Data
2.1 Summarizing Data Numerically
Once your data has been entered, it is often easy to spot broad trends by inspecting
summaries of that data. In SPSS these summaries are called Descriptive Statistics.

The most useful descriptive statistics for our purposes are:

Frequencies
With a categorical variable it is often useful to know how many responses each
category has received. These can be expressed as straightforward counts or as
percentages.

Means
It is often useful to know the average values of our variables, particularly when
comparing one subset of data against another.

Maxima and Minima


It can also be useful to know the least and greatest values of our variables. These can
also be useful when comparing subsets of data.

2.1.1 Calculating the Frequency of a Categorical Variable


PROCEDURE 14: CALCULATING THE FREQUENCY OF A CATEGORICAL VARIABLE
On the menu bar, select Analyze/Descriptive Statistics/Frequencies…
This displays the 'Frequencies' dialog box.

Figure 18: The Frequencies Dialog Box

Select the variable(s) you want to examine in the left-hand panel (to select more than one variable, hold down the Control key while left-clicking on the required variables).
Select the arrow button to move the selected variable(s) into the 'Variable(s):' box.
Ensure that the 'Display frequency tables' checkbox is checked 'on'.
Select the OK button. The SPSS Output Viewer is displayed.

2.1.1.1 SPSS Output
The following output is derived from a frequency analysis (Procedure 14) of the dataset demo.sav. The particular variable being analysed is carcat (label: 'Primary vehicle price category').

OUTPUT 1: Frequency Analysis

Frequencies

Statistics
Primary vehicle price category
N    Valid      6400
     Missing    0

Primary vehicle price category
                   Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Economy    1841        28.8      28.8            28.8
        Standard   2275        35.5      35.5            64.3
        Luxury     2284        35.7      35.7            100.0
        Total      6400        100.0     100.0

Notes
o Labels are displayed instead of variable names or category values.
o In the 'Statistics' table we are shown the number of valid responses (6400), and the number of missing responses (0) for that variable.
o We are shown the frequency (i.e. the number of responses) for each category of the variable.
o We are shown the percentages for each category of the variable: 28.8% of respondents said they had an Economy car.
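
The Percent and Cumulative Percent columns can be reproduced by hand from the frequencies. The following Python sketch is a check on the arithmetic only, not a description of how SPSS computes its tables; the function name frequency_table is an assumption for this example, and the counts are those shown in Output 1.

```python
def frequency_table(counts):
    """Return rows of (category, frequency, percent, cumulative percent)."""
    total = sum(counts.values())
    rows, running = [], 0
    for category, n in counts.items():
        running += n   # running total for the cumulative column
        rows.append((category, n,
                     round(100 * n / total, 1),
                     round(100 * running / total, 1)))
    return rows

counts = {"Economy": 1841, "Standard": 2275, "Luxury": 2284}
table = frequency_table(counts)
for row in table:
    print(row)
```

Because there are no missing values in this variable, Percent and Valid Percent are the same, so only one of them is calculated here.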

2.1.2 Calculating Frequencies for Tabulated Categorical Variables
(Crosstabs)

2.1.2.1 Discussion
We may want to calculate how the categories of one variable break down against the
categories of another. For example, the data set demo.sav contains two variables:
1. gender, and
2. carcat (Primary vehicle price category).
Say that we would like to determine how car ownership within the vehicle price
categories is broken down between the sexes (i.e. the gender categories).

For this we can calculate a crosstab, displaying the relevant data in the cells of a table

PROCEDURE 15: CALCULATING FREQUENCIES FOR TABULATED CATEGORICAL DATA (CROSSTABS)
On the menu bar, select Analyze/Descriptive Statistics/Crosstabs…
This displays the 'Crosstabs' dialog box.

Figure 19: The Crosstabs Dialog Box

Select one of the required variables from the left-hand panel.
Press the topmost arrow button to move this variable into the 'Row(s)' panel.
Select the second required variable from the left-hand panel.
Press the second arrow button to move this variable into the 'Column(s)' panel. (Note: it doesn't matter which variable is which.)
Select the 'Cells…' button.
This displays the 'Crosstabs: Cell Display' dialog box.

Figure 20: The Crosstabs Cell Display Dialog Box

Ensure that the 'Row' and 'Column' checkboxes (in the 'Percentages' frame) are switched on.
Select the 'Continue' button to re-display the 'Crosstabs' dialog box.
Select the OK button.

2.1.2.2 SPSS Output

The following output is derived from a crosstab calculation (see Procedure 15) on the variables 'gender' (label: 'Gender') and 'carcat' (label: 'Primary vehicle price category').

The row variable was gender.
The column variable was carcat.

OUTPUT 2: CROSSTAB CALCULATION

Crosstabs

Case Processing Summary

                                                               Cases
                                          Valid             Missing           Total
                                          N       Percent   N       Percent   N       Percent
Gender * Primary vehicle price category   6400    100.0%    0       .0%       6400    100.0%

Gender * Primary vehicle price category Crosstabulation

                                                          Primary vehicle price category
                                                          Economy   Standard  Luxury    Total
Gender  Female  Count                                     909       1137      1133      3179
                % within Gender                           28.6%     35.8%     35.6%     100.0%
                % within Primary vehicle price category   49.4%     50.0%     49.6%     49.7%
        Male    Count                                     932       1138      1151      3221
                % within Gender                           28.9%     35.3%     35.7%     100.0%
                % within Primary vehicle price category   50.6%     50.0%     50.4%     50.3%
Total           Count                                     1841      2275      2284      6400
                % within Gender                           28.8%     35.5%     35.7%     100.0%
                % within Primary vehicle price category   100.0%    100.0%    100.0%    100.0%

Notes
o The variable labels are used throughout
o The top table tells us that there are 6400 respondents in the data set, with no
missing values
o The second table tells us, for example, that 28.6% of females have an
Economy car, while 28.9% of males have Economy cars (the '% within Gender' rows).
o 49.6% of the owners of luxury cars are female, while the other 50.4% are
male.
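
The percentages in OUTPUT 2 can be checked by hand from the counts. The following is a minimal Python sketch of that arithmetic, not part of SPSS itself: the counts are copied from OUTPUT 2, and the function names row_percentages and column_percentages are made up for this illustration.

```python
# Counts copied from OUTPUT 2 (rows: gender, columns: vehicle price category).
counts = {
    "Female": {"Economy": 909, "Standard": 1137, "Luxury": 1133},
    "Male": {"Economy": 932, "Standard": 1138, "Luxury": 1151},
}


def row_percentages(table):
    """Share of each row total in each column (the '% within Gender' rows)."""
    result = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        result[row] = {col: round(100 * n / row_total, 1) for col, n in cells.items()}
    return result


def column_percentages(table):
    """Share of each column total in each row (the '% within Primary vehicle price category' rows)."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        row: {col: round(100 * n / col_totals[col], 1) for col, n in cells.items()}
        for row, cells in table.items()
    }


row_pct = row_percentages(counts)
col_pct = column_percentages(counts)
print(row_pct["Female"]["Economy"])  # percentage of females with an Economy car
print(col_pct["Female"]["Luxury"])   # percentage of Luxury car owners who are female
```

Row percentages divide each count by its row total (3179 females, 3221 males), while column percentages divide by the column total (for example 2284 Luxury car owners). This is why the same cell carries two different percentages in the table.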

2.1.3 Calculating the Mean, Maximum and Minimum of a Scalar
(Non-Categorical) Variable

PROCEDURE 16: CALCULATING THE MEAN, MAXIMUM AND MINIMUM OF A VARIABLE
On the menu bar, select Analyze/Descriptive Statistics/Descriptives…
This displays the 'Descriptives' dialog box.

Figure 21: The Descriptives Dialog Box

Select the variable(s) you want to examine in the left-hand box (to select more than
one variable, hold down the Ctrl key while left-clicking on the required variables).
Select the arrow button to move the selected variable(s) into the 'Variable(s):'
box.
Select the OK button.
The SPSS Output Viewer is displayed.

2.1.3.1 SPSS Output

The following output is derived from a descriptive analysis (PROCEDURE 16) of the
dataset demo.sav. The particular variable being analysed is income.

OUTPUT 3: Descriptive Statistics

Descriptives

Descriptive Statistics

                                N       Minimum   Maximum   Mean      Std. Deviation
Household income in thousands   6400    9.00      1116.00   69.4748   78.71856
Valid N (listwise)              6400

Notes
o The variable's label is displayed instead of its name.
o We can see that there were 6400 respondents.
o The minimum household income is 9.00 (in thousands)
o The maximum household income is 1116.00 (in thousands)
o The mean household income is 69.4748 (in thousands)
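
For readers who want to see what these summary figures mean, here is a small Python sketch that computes the same five statistics for a short list of numbers. The values in incomes are invented for illustration and are not taken from demo.sav. The sketch uses the sample standard deviation (dividing by n - 1), which is the form SPSS reports in this table.

```python
import statistics

# Hypothetical household incomes in thousands (NOT taken from demo.sav).
incomes = [9.0, 25.0, 31.0, 47.0, 72.0, 116.0]

summary = {
    "N": len(incomes),
    "Minimum": min(incomes),
    "Maximum": max(incomes),
    "Mean": statistics.mean(incomes),
    "Std. Deviation": statistics.stdev(incomes),  # sample standard deviation (n - 1)
}

# Print one statistic per line, in the same order as the SPSS table columns.
for name, value in summary.items():
    print(f"{name:<15}{value:>10.4f}")
```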

2.1.4 The Standard Deviation

Referring back to OUTPUT 3 (produced by PROCEDURE 16), we can see that another
number, the Standard Deviation, is also calculated.

Standard Deviation is a measure of the 'spread' of the data. If all the numbers in your
sample are close to the mean value, then there is a 'low spread,' and the sample has a
low Standard Deviation.

If the numbers in your sample are not all close to the mean value, then there is a 'high
spread,' and the sample is said to have a high Standard Deviation.

EXAMPLE 2: STANDARD DEVIATION

Consider the following sets of numbers:

SET 1
1.3 1.4 1.2 1.1 1.3 1.5 1.2 1.3 1.2 1.1

We can calculate the mean of this set of data to be 1.26.

We can see that all the numbers in the set are close to this value (the minimum value is 1.1, and the
maximum value is 1.5), so the set has a 'low spread,' and a low standard deviation. The sample
standard deviation of this set works out to about 0.13.

SET 2
170 10 110 210 70 65 400 253 200 326

We can calculate the mean of this set of data to be 181.4.

We can see that the numbers in the set are not all close to this value (the minimum value is 10, and
the maximum value is 400), so the set has a 'high spread,' and a high standard deviation. The sample
standard deviation of this set works out to about 122.72.

The Standard Deviation is an important factor to consider when designing our
statistical tests. Fortunately, SPSS does most of this consideration for us.
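
The figures quoted in EXAMPLE 2 can be reproduced outside SPSS. The sketch below uses Python's standard statistics module and assumes the sample form of the standard deviation (dividing by n - 1), which matches the values quoted in the example.

```python
import statistics

# The two data sets from EXAMPLE 2.
set_1 = [1.3, 1.4, 1.2, 1.1, 1.3, 1.5, 1.2, 1.3, 1.2, 1.1]
set_2 = [170, 10, 110, 210, 70, 65, 400, 253, 200, 326]

for name, values in (("SET 1", set_1), ("SET 2", set_2)):
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation, divides by n - 1
    print(f"{name}: mean = {mean:.2f}, standard deviation = {sd:.2f}")
```

Both sets contain ten numbers, yet the second has a standard deviation roughly a thousand times larger, which reflects how much more widely its values are scattered around the mean.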

2.2 Summarizing Data Graphically

One of the best ways of displaying your data is to display it as a Graph.

Graphs are useful when you want to be able to assimilate your data in one glance.
They provide an easy way of summarizing data in a way that almost anyone can
understand.

There are many types of graph. The most common are:

• Bar Charts: these are used to display comparisons between variables, or between
  different cases within the same variable. (A histogram looks similar but is a different
  chart: it shows the distribution of a numeric variable grouped into intervals.)
  [Figure: bar chart of Count against Income category in thousands (Under $25, $25 - $49, $50 - $74, $75+)]

• Pie Charts: alternatives to bar charts, used to display comparisons between variables,
  or between different cases within the same variable.
  [Figure: pie chart with one slice per income category (Under $25, $25 - $49, $50 - $74, $75+)]

• Line Charts: another alternative to bar charts. Often used to display changes in a
  variable over time (e.g. temperature, profit and loss etc.)
  [Figure: line chart of Count against Income category in thousands (Under $25, $25 - $49, $50 - $74, $75+)]

2.2.1 Creating a Bar Chart

There are many options available when creating a bar chart. In this example we will
use the dataset demo.sav to create a simple bar chart showing the number of
respondents in each income category. The required variable is called inccat.

PROCEDURE 17: CREATING A SIMPLE BAR CHART
On the menu bar, select Graphs/Legacy Dialogs/Bar…
This displays the 'Bar Charts' dialog box.

Figure 22: The Bar Chart Dialog Box

Click on the 'Simple' option to select it.
Ensure that 'Summaries for groups of cases' is clicked ON.
Select the 'Define' button.
This displays the 'Define Simple Bar: Summaries for Groups of Cases' dialog box.

Figure 23: The Define Simple Bar: Summaries for Groups of Cases Dialog Box

Select the variable you want to examine in the left-hand box. In this case we
select the variable 'Income category in thousands [inccat]'.
Select the arrow button to move the selected variable into the 'Category
Axis' field.
Select the OK button.
The SPSS Output Viewer is displayed.

2.2.1.1 SPSS Output
The following output is the bar chart derived from PROCEDURE 17. The
particular variable being graphed is inccat.

OUTPUT 4: Simple Bar Chart

Graph

[Figure: simple bar chart. Vertical axis: Count (scale 800 to 2600). Horizontal axis: Income category in thousands (Under $25, $25 - $49, $50 - $74, $75+)]

Here, you can easily see that more respondents were in the $25 - $49 income
category than in any other.
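
A simple bar chart of this kind is a picture of a frequency count: one bar per category, with the bar's height equal to the number of cases in that category. The Python sketch below illustrates the idea with a text-only chart. The responses list is invented for illustration and is not taken from demo.sav.

```python
from collections import Counter

# Hypothetical responses (NOT taken from demo.sav); one entry per respondent.
responses = [
    "Under $25", "$25 - $49", "$25 - $49", "$50 - $74", "$75+",
    "$25 - $49", "Under $25", "$75+", "$25 - $49", "$50 - $74",
]

# Fixed category order, so the bars appear in the same order as on the chart axis.
categories = ["Under $25", "$25 - $49", "$50 - $74", "$75+"]

counts = Counter(responses)

# Draw one row of '#' characters per category; its length is the category count.
for category in categories:
    print(f"{category:<10} {'#' * counts[category]} {counts[category]}")

# The category with the tallest bar.
largest = max(categories, key=lambda c: counts[c])
print("Largest category:", largest)
```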

2.2.2 Creating a Pie Chart

In this example we will use the dataset demo.sav to create a simple pie chart showing
the number of respondents in each income category. The required variable is called
inccat.

PROCEDURE 18: CREATING A SIMPLE PIE CHART


On the menu bar, select Graphs/Legacy Dialogs/Pie…
This displays the 'Pie Charts' dialog box.

Figure 24: The Pie Chart Dialog Box

Ensure that 'Summaries for groups of cases' is clicked ON.
Select the 'Define' button.
This displays the 'Define Pie: Summaries for Groups of Cases' dialog box.

Figure 25: The Define Pie: Summaries for Groups of Cases Dialog Box

Select the variable you want to examine in the left-hand box. In this case we
select the variable 'Income category in thousands [inccat]'.
Select the arrow button to move the selected variable into the 'Define
Slices by:' field.
Select the OK button.
The SPSS Output Viewer is displayed.

2.2.2.1 SPSS Output
The following output is the pie chart derived from PROCEDURE 18. The particular
variable being graphed is inccat.

OUTPUT 5: Simple Pie Chart

Graph

[Figure: simple pie chart with one slice per income category (Under $25, $25 - $49, $50 - $74, $75+)]

Here, you can easily see that more respondents were in the $25 - $49 income
category than in any other.

2.2.3 Creating a Line Chart

There are many options available when creating a line chart. In this example we will
use the dataset demo.sav to create a simple line chart showing the number of
respondents in each income category. The required variable is called inccat.

PROCEDURE 19: CREATING A SIMPLE LINE CHART


On the menu bar, select Graphs/Legacy Dialogs/Line…
This displays the 'Line Charts' dialog box.

Figure 26: The Line Chart Dialog Box

Click on the 'Simple' option to select it.
Ensure that 'Summaries for groups of cases' is clicked ON.
Select the 'Define' button.
This displays the 'Define Simple Line: Summaries for Groups of Cases' dialog box.

Figure 27: The Define Simple Line: Summaries for Groups of Cases Dialog Box

Select the variable you want to examine in the left-hand box. In this case we
select the variable 'Income category in thousands [inccat]'.

Select the arrow button to move the selected variable into the 'Category
Axis' field.
Select the OK button.
The SPSS Output Viewer is displayed.

2.2.3.1 SPSS Output

The following output is the line chart derived from PROCEDURE 19. The
particular variable being graphed is inccat.

OUTPUT 6: Simple Line Chart

Graph

[Figure: simple line chart. Vertical axis: Count (scale 1000 to 2600). Horizontal axis: Income category in thousands (Under $25, $25 - $49, $50 - $74, $75+)]
