Basics To Intermediate of SPSS
Six different windows can be opened when using SPSS. Each of them is described
below.
The Data Editor
The Data Editor is a spreadsheet in which you define your variables and enter data.
Each row corresponds to a case while each column represents a variable. The title bar
displays the name of the open data file or "Untitled" if the file has not yet been saved.
This window opens automatically when SPSS is started.
The Output Navigator
The Output Navigator window displays the statistical results, tables, and charts from
the analysis you performed. An Output Navigator window opens automatically when
you run a procedure that generates output. In an Output Navigator window, you
can edit, move, delete, and copy your results in a Microsoft Explorer-like
environment.
The Pivot Table Editor
Output displayed in pivot tables can be modified in many ways with the Pivot Table
Editor. You can edit text, swap data in rows and columns, add color, create
multidimensional tables, and selectively hide and show results.
The Chart Editor
You can modify and save high-resolution charts and plots by invoking the Chart
Editor for a certain chart (by double-clicking the chart) in an Output Navigator
window. You can change the colors, select different type fonts or sizes, switch the
horizontal and vertical axes, rotate 3-D scatterplots, and change the chart type.
The Text Output Editor
Text output not displayed in pivot tables can be modified with the Text Output Editor.
You can edit the output and change font characteristics (type, style, color, size).
The Syntax Editor
You can paste your dialog box selections into a Syntax Editor window, where your
selections appear in the form of command syntax.
Class 2: Starting an SPSS Session
2. Select the Contents tab. This will give a set of books to look under for the
required information.
3. Type a word in the text box describing the information to search for. This will
give a list of headings on the desired information.
1. Select Exit SPSS from the File menu on the Data Editor.
Creating and Manipulating Data in SPSS
When creating or accessing data in SPSS, the Data Editor window is used.
There are three steps that must be followed to create a new data set in SPSS. The
following Class will list the steps needed and will give an example of creating a new
data set.
Variables are defined one at a time using the Define Variable dialog box. This box
assigns data definition information to variables. To access the Define Variable
dialog box, double-click on the top of a column where the word var appears or select
Define Variable from the Data menu.
Variable Name: This field describes the name of the variable being defined. To
change the name, place the cursor in this field and type the name.
The variable name must begin with a letter of the alphabet and
cannot exceed 8 characters. Spaces are not allowed within the
variable name. Each variable name must be unique.
Type: This field describes the type of variable that is being defined.
To change this field, click on the Type… button. This will open the Define Variable
Type: dialog box. Select the appropriate type of data. When done, click on the Continue
button.
Variable Label: A name for the variable that can be up to 120 characters long and can
include spaces (which variable names cannot). If a variable label is entered, the label will
be printed on charts and reports instead of the name, making them easier to understand.
Value Label: Provides a key for translating numeric data.
To change the variable label, click on the Labels… button. This will open the Define
Labels: dialog box. Enter the appropriate information into the fields. When done, click on
the Continue button.
Missing Values:
This field indicates which data values are treated as missing and are therefore excluded
from calculations. To change this field, click on the Missing Values… button. This will
open the Define Missing Values: dialog box. Enter the appropriate information into the
fields. When done, click on the Continue button.
Alignment:
This field indicates column alignment and width. To change this field, click on the
Column Format… button. This will open the Define Column Format: dialog box.
Enter the appropriate information into the fields. When done, click on the
Continue button.
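The naming rules listed under Variable Name can be checked before any variable is defined. The following Python sketch is not part of SPSS: the function name check_variable_names is hypothetical, and treating names that differ only in letter case as duplicates is an assumption.

```python
def check_variable_names(names):
    """Return (name, problem) pairs; an empty list means every name passes."""
    problems = []
    seen = set()
    for name in names:
        # Rule: the name must begin with a letter of the alphabet.
        if not name or not name[0].isalpha():
            problems.append((name, "must begin with a letter"))
        # Rule: the name cannot exceed 8 characters.
        if len(name) > 8:
            problems.append((name, "exceeds 8 characters"))
        # Rule: spaces are not allowed within the name.
        if " " in name:
            problems.append((name, "contains a space"))
        # Rule: each name must be unique (case-insensitive here, by assumption).
        key = name.lower()
        if key in seen:
            problems.append((name, "duplicate name"))
        seen.add(key)
    return problems


# The three names used in the example later in this Class pass all rules.
print(check_variable_names(["Name", "Age", "Weight"]))
```

A descriptive phrase such as "per capita" fails two rules at once, which is why such text belongs in the Variable Label rather than the Variable Name.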
Once all of the variables are defined, enter the data manually (assuming that the data
is not already in an external file). The data is typed into the spreadsheet one cell at
a time. Each cell represents an observation.
When information is typed into a cell, it appears in the edit area at the top of the
window. The information is entered into the cell when the active cell is changed. The
mouse and the tab, enter, and cursor keys can be used to enter data.
To indicate a cell that does not have a data value, a period is entered. A period
represents the system-missing value.
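The role of the period can be illustrated outside SPSS. The following Python sketch assumes numeric cells typed as text and maps a lone period to None; the name parse_cell and the sample values are hypothetical.

```python
def parse_cell(text):
    """Convert one typed cell to a float, or None for the system-missing period."""
    text = text.strip()
    if text == ".":
        return None  # a lone period stands for "no data value"
    return float(text)


# Hypothetical row in which the second cell has no data value.
row = [parse_cell(cell) for cell in ["34", ".", "180"]]
print(row)  # [34.0, None, 180.0]
```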
STEP 3: Saving a New Data Set
Work performed on a data set only lasts during the current session. To retain the current
dataset, it must be saved to a file.
1. Select Save from the File menu. The Save Data As dialog box opens.
3. From the Save in drop-down list, select the path where the file will be saved.
4. In the File name box, enter a name for the file. SPSS automatically adds the
extension .sav.
5. Click Save.
Problem
The following data regarding a person’s name, age and weight must be entered into
a dataset using SPSS.
Solution
1. Double click on the top of the first column in the Data Editor window. This will
open the Define Variable dialog box. Type Name in the Variable Name box.
2. Select Type… in the Change Settings area. This will open the Define
Variable Type dialog box. Left click on String.
3. Select Continue. This will close the Define Variable Type dialog box and will re-
open the Define Variable dialog box.
4. Click OK. This will define the first column as a string variable called Name.
5. Double click on the top of the second column. This will open the Define Variable
dialog box. Type Age in the Variable Name box.
6. Select Type… in the Change Settings area. This will open the Define
Variable Type dialog box. Left click on Numeric. In the Width box, set it to 3. In
the Decimal Places box, set it to 0.
7. Select Continue. This will close the Define Variable Type dialog box and will re-
open the Define Variable dialog box.
8. Click OK. This will define the second column as a numeric variable called Age.
9. Double click on the top of the third column. This will open the Define Variable
dialog box. Type Weight in the Variable Name box.
10. Select Type… in the Change Settings area. This will open the Define
Variable Type dialog box. Left click on Numeric. In the Width box, set it to 3. In
the Decimal Places box, set it to 0.
11. Select Continue. This will close the Define Variable Type dialog box and will re-
open the Define Variable dialog box.
12. Click OK. This will define the third column as a numeric variable called Weight.
13. Enter the above information into the cells of the spreadsheet. The Data Editor
should look like the following.
16. Type temp in the File name box and click Save. SPSS will save this file as
temp.sav in the specified directory.
The following Class shows how to read a spreadsheet or text file into a data
set in SPSS. Examples are given of each method.
Reading Spreadsheet Files (Lotus 1-2-3 and Excel)
Problem
2. Change the path name to your home directory and open the SPSS folder. This
is where the file to be opened should be.
3. Select Excel(*.xls) (or Lotus(*.w*) for Lotus files) from the Files of type box.
4. Select nba.xls.
5. Click Open. This will open the Opening File Options dialog box. Select the
Read variable names check box. Click OK. This will close the Opening File
Options dialog box and will open nba.xls in the Data Editor. The Output
Navigator will also be opened.
NOTE:
If only a partial file is to be read into SPSS, the following steps are taken.
For Lotus files, in the Range box, specify the beginning column letter and
row number followed by two periods followed by the ending column letter
and row number, for example A1..C12.
For Excel files, in the Range box, specify the beginning column letter and
row number followed by a colon followed by the ending column letter and
row number, for example A1:C12.
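The two range notations differ only in the separator. As a minimal sketch (plain Python, not an SPSS feature; the function name parse_range is hypothetical), both forms can be split into their beginning and ending cells like this:

```python
import re

# One cell reference: column letters followed by a row number, e.g. A1 or C12.
_CELL = r"([A-Za-z]+)(\d+)"


def parse_range(spec):
    """Split 'A1..C12' (Lotus) or 'A1:C12' (Excel) into ((col, row), (col, row))."""
    match = re.fullmatch(_CELL + r"(?:\.\.|:)" + _CELL, spec.strip())
    if match is None:
        raise ValueError(f"not a recognised range: {spec!r}")
    col1, row1, col2, row2 = match.groups()
    return (col1.upper(), int(row1)), (col2.upper(), int(row2))


print(parse_range("A1..C12"))  # Lotus style
print(parse_range("A1:C12"))   # Excel style
```

Both calls describe the same block of cells: columns A through C, rows 1 through 12.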
Window Output
Reading Text Files
A text file can be read in two ways: freefield or fixed columns.
Freefield
This method is used if the variables are recorded in the same order for each case
but not necessarily in the same column locations.
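To make the freefield idea concrete, here is a minimal Python sketch. It assumes values are separated by whitespace and contain no embedded spaces; the function name read_freefield, the variable names, and the sample values are all hypothetical and are not taken from citydata.txt.

```python
def read_freefield(text, names):
    """Group whitespace-separated values into cases of len(names) values each."""
    tokens = text.split()  # column positions are ignored; only the order matters
    width = len(names)
    if len(tokens) % width != 0:
        raise ValueError("number of values is not a multiple of the variable count")
    return [
        dict(zip(names, tokens[i:i + width]))
        for i in range(0, len(tokens), width)
    ]


# Hypothetical data: the values line up differently on each line.
sample = "Springfield   30000\nShelbyville 28000\n"
for case in read_freefield(sample, ["city", "income"]):
    print(case)
```

Values are returned as text; converting them according to each variable's data type is left out to keep the sketch short.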
Problem
2. Specify the variable name and data type. The following gives a description of
each of these fields.
Name: Variable names must begin with a letter and cannot exceed eight
characters. Each variable name must be unique.
3. Click Add for each separate variable. This will enter the variable name and
data type onto the Defined Variables list.
4. Once all variables are defined, click Browse to specify the name of the file to
be read. This will open the Define Freefield Variables: Browse dialog box.
Change the path name to your home directory and open the SPSS folder. This
is where the file to be opened should be.
5. Select citydata.txt and click Open. The Define Freefield Variables dialog box
will be returned.
6. Click OK. This will close the Define Freefield Variables dialog box and will
open citydata.txt in the Data Editor.
Window Output
Fixed Columns
This method is used if each variable is recorded in the same column location for
each case in the data file.
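By contrast with freefield input, fixed-column input slices each line at known positions. The Python sketch below assumes one line per case, so the record number is not modelled, and uses 1-based inclusive column positions; the function name read_fixed, the layout, and the sample values are hypothetical and are not taken from nba.txt.

```python
def read_fixed(lines, layout):
    """layout maps a variable name to its (start, end) columns, 1-based inclusive."""
    cases = []
    for line in lines:
        case = {}
        for name, (start, end) in layout.items():
            # Shift the 1-based start column to Python's 0-based slicing.
            case[name] = line[start - 1:end].strip()
        cases.append(case)
    return cases


# Hypothetical file: name in columns 1-10, height in 11-12, weight in 14-16.
sample_lines = ["Smith".ljust(10) + "78 210", "Jones".ljust(10) + "81 235"]
sample_layout = {"name": (1, 10), "height": (11, 12), "weight": (14, 16)}
cases = read_fixed(sample_lines, sample_layout)
for case in cases:
    print(case)
```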
Problem
2. Specify the variable name, record, column locations, and data type. The
following gives a description of each of these fields.
Name: Variable names must begin with a letter and cannot exceed eight
characters. Each variable name must be unique.
Record: A case can have data on more than one line. The record number
indicates the line within the case where the variable is located.
3. When all information is added for a variable, click Add. This will enter the
record number, start and end columns, variable name, and data type onto the
Defined Variables list.
4. Once all variables are defined, click Browse to specify the name of the file to
be read. This will open the Define Fixed Variables: Browse dialog box.
Change the path name to your home directory and open the SPSS folder. This
is where the file to be opened should be.
5. Select nba.txt and click Open. The Define Fixed Variables dialog box will be
returned.
6. Click OK. This will close the Define Fixed Variables dialog box and will
open nba.txt in the Data Editor.
Window Output
Class 3: Opening an Existing SPSS Data Set
1. Select Open from the File menu. This will open the Open File dialog box.
2. From the Files of type drop-down list, select .sav.
3. From the Look in drop-down list, select the appropriate drive where the file is
located.
4. In the File name box, type in the name of the file to be opened.
5. Click Open.
1. Highlight the data that will be printed. To print all of the data, ignore this
step and continue to step 2.
2. Select Print from the File menu. The Print dialog box opens. Change the options
where appropriate.
3. Click OK.
Generating Descriptive Statistics in SPSS
The following Classes will demonstrate how to generate descriptive statistics in SPSS.
Class 1: Mean, Sum, Standard Deviation, Variance, Minimum Value,
Maximum Value, and Range
When generating these statistics, the Data Editor must be open with the appropriate
data set before continuing.
Problem
Using the data in the file nba.txt that is located in ~/SPSS/, determine the
mean, sum, standard deviation, variance, minimum value, maximum value, and range
for height only.
Solution
1. From the Statistics menu, select Summarize. From the Summarize drop down
menu, select Descriptives. This will open the Descriptives dialog box.
2. In the variable list, select the variable height. Left click on the right arrow button
between the boxes to move this variable over to the Variable(s) box. To calculate
statistics for many variables, simultaneously add variables to the Variable(s) box.
3. Click on the Options button. This will open the Descriptives: Options dialog box.
Click on mean, sum, standard deviation, variance, minimum value, maximum
value, and range.
Click on the Continue button when done.
4. Click OK. The Descriptives dialog box closes and SPSS activates the Output
Navigator to illustrate the statistics.
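For readers who want to check the reported figures by hand, the same seven statistics can be computed with Python's standard library. This is a sketch, not SPSS output: the heights are made-up values rather than the contents of nba.txt, and the function name describe is hypothetical. The standard deviation and variance use the sample (n - 1) formulas.

```python
import statistics


def describe(values):
    """Return the seven statistics selected in the Descriptives: Options dialog box."""
    return {
        "mean": statistics.mean(values),
        "sum": sum(values),
        "std_dev": statistics.stdev(values),       # sample standard deviation (n - 1)
        "variance": statistics.variance(values),   # sample variance (n - 1)
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
    }


heights = [72, 75, 78, 80, 81]  # hypothetical heights, not from nba.txt
for name, value in describe(heights).items():
    print(name, value)
```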
Window Output
Class 2: Correlation
Two or more variables may be included in a correlation matrix. When generating the
correlation matrix, the Data Editor must be open with the appropriate data set before
continuing.
Problem
Using the data in the file nba.txt that is located in ~/SPSS/, determine the correlation
between a player’s height and weight.
Solution
1. From the Statistics menu, select Correlate. From the Correlate drop down menu,
select Bivariate. This will open the Bivariate Correlations dialog box.
2. In the variable list, select height and weight. Left click on the right arrow button
between the boxes to move a variable over to the Variable(s) box.
3. Select the type of correlation coefficients that will be generated. In this case, use
Pearson.
To display the mean and standard deviation for each variable, select Means and
standard deviations. In this case, this option is not used.
To display cross-product deviations and covariances for each pair of variables,
select Cross-product deviations and covariances. In this case, this option will not be
used.
When done, click the Continue button.
7. Click OK. The Bivariate Correlations dialog box closes and SPSS activates the
Output Navigator. The correlation coefficient for each pair of variables is
displayed. The number of cases appears at the bottom.
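The Pearson coefficient itself is the sum of cross-product deviations divided by the square root of the product of the two sums of squared deviations. The Python sketch below spells that out; the function name pearson_r and the sample values are hypothetical and are not taken from nba.txt.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equally long lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length with at least 2 cases")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-product deviations and the two sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


heights = [72, 75, 78, 80, 81]       # hypothetical values
weights = [185, 200, 215, 240, 250]  # hypothetical values
print(round(pearson_r(heights, weights), 3))
```

A value near 1 indicates that taller players in the sample also tend to be heavier; a value near 0 indicates no linear relationship.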
Window Output
The following Classes show how to create scatter plots, histograms, stem and leaf
plots, and box plots using the SPSS Graphs menu located on the Data Editor menu bar.
Problem
Using the data in ~/SPSS/nba.txt, create an x-y plot of a player’s weight versus
height.
Solution
1. From the Graphs menu, select Scatter… This will open the Scatterplot dialog box.
2. Select the Simple icon and click Define. This will open the Simple Scatterplot
dialog box.
3. From the variable list, select weight. Left click on the right arrow button
between the variable list and the Y Axis box to move the variable, weight, to this
box.
4. From the variable list, select height. Left click on the right arrow button
between the variable list and the X Axis box to move the variable, height, to this
box.
5. Click on the Options… button. This will open the Options dialog box.
To display a report of missing values, select Display groups defined by missing
values. In this case, this option will not be used.
When done, click the Continue button.
6. To display titles, subtitles, or footnotes on the scatterplot, click on the Titles… button.
Class 2: How to Generate a Histogram
Problem
2. From the variable list, select income. Left click on the right arrow button
between the variable list and the Variable box to move the variable, income, to
this box.
3. Select Display normal curve box to show a normal curve on the histogram.
4. To display titles, subtitles, or footnotes on the histogram, click on the Titles… button.
Window Output
Problem
Using the data in ~/SPSS/statdata.txt, create a stem and leaf plot of per capita
income.
Solution
1. From the Statistics menu, select Summarize. From the Summarize drop-down
menu, select Explore… This will open the Explore dialog box.
2. From the variable list, select income. Left click on the right arrow button
between the variable list and the Dependent List box to move the variable,
income, to this box.
3. Click on the Statistics… button. This will open the Explore: Statistics dialog box.
To display cases with the five largest and smallest values, select Outliers.
To display percentiles, select Percentiles.
In this case, none of these options are used.
When done, click on the Continue button.
4. In the Display area, select Plots. This will display the specified plot only (i.e. no
statistics are given).
5. Click on the Plots… button. This opens the Explore: Plots dialog box.
To exclude cases that have missing values for any of the variables used in any of
the analyses, select Exclude cases listwise. In this case, this option is used.
To exclude cases that have missing values for either or both of the pair of variables
in a specific correlation coefficient, select Exclude cases pairwise.
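For readers unfamiliar with the plot itself, a stem and leaf plot splits each value into a leading part (the stem) and its final digit (the leaf), then lists the leaves beside each stem. The Python sketch below builds a basic version for non-negative values. It is only an illustration: the layout SPSS prints may differ, and the function name stem_and_leaf, the leaf_unit parameter, and the income figures are hypothetical rather than taken from statdata.txt.

```python
from collections import defaultdict


def stem_and_leaf(values, leaf_unit=1):
    """Return one text line per stem, for example '  2 | 158'."""
    if not values:
        return []
    stems = defaultdict(list)
    for value in sorted(values):
        scaled = int(value // leaf_unit)   # drop digits below the leaf unit
        stem, leaf = divmod(scaled, 10)    # last remaining digit is the leaf
        stems[stem].append(leaf)
    lines = []
    # Empty stems between the smallest and largest are kept so gaps stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(leaf) for leaf in stems.get(stem, []))
        lines.append(f"{stem:>3} | {leaves}")
    return lines


incomes = [12400, 15900, 21000, 34200, 38700, 39100]  # hypothetical values
for line in stem_and_leaf(incomes, leaf_unit=1000):
    print(line)
```

With leaf_unit=1000, a value of 34200 contributes leaf 4 to stem 3, so each row of the plot covers a band of 10,000.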
Window Output
Class 4: How to Generate a Box Plot
Problem
Using the data in the file ~/SPLUS/statdata.dat, produce a boxplot of per capita
income.
Solution
1. From the Graphs menu, select Boxplot… This will open the Boxplot dialog box.
4. Click on the Define button. This will open the Define Simple Boxplot:
Summaries of Separate Variables dialog box.
5. From the variable list, select income. Left click on the right arrow button
between the variable list and the Boxes Represent box to move the variable,
income, to this box.
6. Click on the Options… button. This will open the Options dialog box.
To display a report of missing values, select Display groups defined by missing
values. In this case, this option will not be used.
7. Click OK. This will close the Define Simple Boxplot: Summaries of Separate
Variables dialog box and SPSS activates the Output Navigator to display the box
plot.
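A box plot is drawn from a handful of summary values: the quartiles, the whisker ends, and any outlying points. The Python sketch below computes them with the standard library. The quartile convention (the "inclusive" method of statistics.quantiles) and the 1.5 x IQR fence rule are assumptions and may not match the hinges SPSS uses; the function name box_summary and the data are hypothetical.

```python
import statistics


def box_summary(values):
    """Quartiles, whisker ends, and outliers under a 1.5 * IQR fence rule."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),    # smallest value still inside the fences
        "whisker_high": max(inside),   # largest value still inside the fences
        "outliers": sorted(v for v in values if v < low_fence or v > high_fence),
    }


data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # hypothetical values with one extreme case
print(box_summary(data))
```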
Window Output
Statistical Models in SPSS
The Regression submenu on the Statistics menu of the Data Editor provides
regression techniques. The following Class will introduce how to perform linear
regression using SPSS. The output contains goodness of fit statistics and the
coefficients for the variables.
Problem
Solution
1. From the Statistics menu, select Regression. From the Regression drop down
menu, select Linear… This will open the Linear Regression dialog box.
2. From the variable list, select weight. Left click on the right arrow button between
the variable list and the Dependent box to move the variable, weight, to this box.
3. From the variable list, select height. Left click on the right arrow button between
the variable list and the Independent(s) box to move the variable, height, to this
box.
4. Select the method by which the independent variables are entered into the analysis.
From the Method drop-down menu, there is a choice of enter, stepwise, remove,
backward, and forward. In this case, we will use the enter method.
6. Determine the variable that will identify the points on plots. Select the variable
and left click on the right arrow between the variable list and the Case Labels box.
In this case, this option is not used.
7. To display statistics, click on the Statistics… button. This will open the Linear
Regression: Statistics dialog box.
Select the appropriate statistics to be displayed and click on the Continue button
when done. In this case, this option is not used.
8. To display specific plots, click on the Plots… button. This will open the
Linear Regression: Plots dialog box.
From the variable list, select the variable that will be displayed on the Y axis. Left
click on the right arrow button between the variable list and the Y box. Do this
also for the X axis. When done, click on the Next button. If more plots are
needed, follow the same procedure. In this case, this option is not used.
9. To indicate which statistics should be displayed, click on the Save button. This
will open the Linear Regression: Save dialog box.
Select the appropriate statistics. To save the coefficient statistics, click on the
box and indicate the file to which you want them saved. In this case, this option is
not used.
10. To indicate the stepping method criteria, click the Options… button. This will
open the Linear Regression: Options dialog box.
Select the method to be used. When the selection is made, click on the Continue
button.
11. Click OK. This will close the Linear Regression dialog box. SPSS activates the
Output Navigator to display the results of the analysis.
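With one dependent and one independent variable, the coefficients and the main goodness of fit statistic (R squared) come from ordinary least squares. The Python sketch below shows the arithmetic; it is not SPSS output, the function name fit_line is hypothetical, and the height and weight values are made up rather than read from a data file.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = intercept + slope * x; returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R squared: share of the variation in y that the fitted line accounts for.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


heights = [72, 75, 78, 80, 81]       # independent variable (hypothetical values)
weights = [185, 200, 215, 240, 250]  # dependent variable (hypothetical values)
slope, intercept, r_squared = fit_line(heights, weights)
print(slope, intercept, r_squared)
```

Here weight plays the role of the Dependent variable and height the Independent variable, matching steps 2 and 3.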
Window Output
Class 2: Analysis of Variance
Problem
Using the data in ~/SPSS/teller1.txt, test if the mean number of customers served per
hour by each of the four tellers is the same.
Solution
1. From the Statistics menu, select Compare Means. From the Compare Means drop
down menu, select One-Way ANOVA… This will open the One-Way ANOVA
dialog box.
2. From the variable list, select num_cus. Left click on the right arrow button
between the variable list and the Dependent List box to move the variable,
num_cus, to this box.
3. From the variable list, select teller. Left click on the right arrow button
between the variable list and the Factor box to move the variable, teller, to this
box.
4. Click on the Contrasts… button. This will open the One-Way ANOVA:
Contrasts dialog box.
To enter a numeric coefficient value for each level, click Add. However, the
number of coefficients must equal the number of groups or the analysis is not
performed. Because the levels in this problem are already numeric, this option
does not need to be used.
5. Click on the Post Hoc… button. This will open the One-Way ANOVA: Post Hoc
Multiple Comparisons dialog box.
If equal variances are assumed between the different factor levels, select the
type of comparison method to be used.
If equal variances are not assumed between the different factor levels, select the
type of comparison method to be used.
To get a description of each of the methods listed, right click on the word. A
description window will appear.
6. Click on the Options… button. This will open the One-Way ANOVA: Options
dialog box.
To exclude cases that have missing values for the variable involved in that test,
select Exclude cases analysis by analysis. In this case, select this option.
However, to exclude cases that have missing values for any of the variables used
in any of the analyses, select Exclude cases listwise.
7. Click OK. The One-Way ANOVA dialog box closes and SPSS activates the
Output Navigator. The means of the dependent variable for each category of the
independent variable can be found under "Descriptives".
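The test behind a one-way ANOVA compares the variation between group means with the variation within groups, giving an F statistic. The Python sketch below computes F and its degrees of freedom. The p-value is omitted because it needs the F distribution, which the standard library does not provide. The function name one_way_anova and the customer counts are hypothetical and are not taken from teller1.txt.

```python
def one_way_anova(groups):
    """groups maps each factor level to its values; returns (F, df_between, df_within)."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # Between-groups: distance of each group mean from the grand mean.
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        # Within-groups: spread of the values around their own group mean.
        ss_within += sum((v - group_mean) ** 2 for v in values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Hypothetical customers served per hour for four tellers.
customers = {
    1: [19, 21, 26, 24, 18],
    2: [14, 16, 14, 13, 17],
    3: [11, 14, 21, 13, 16],
    4: [24, 19, 21, 26, 20],
}
print(one_way_anova(customers))
```

A large F relative to its degrees of freedom suggests that at least one teller's mean differs from the others.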
Window Output