Chapter 2 Introduction To SPSS
By: Dr. Nurhayani Romeo
2.1 BASIC STEPS IN DATA ANALYSIS
SPSS is one of the statistical packages for data analysis. It provides a number of statistical procedures, graphical techniques and data management facilities. If you are familiar with basic statistical concepts and with Windows, you will understand this chapter. The basic steps that you have to consider are:
1. Get your data into SPSS
You have two choices to get your data into SPSS:
i. you can open a previously saved SPSS data file or read a spreadsheet; or
ii. you can directly enter your data in the Data Editor.
2. Select a procedure
Select a procedure from the menus to run statistical techniques or to create a graphical display.
3. Select the variables for the analysis
The variables in the data file are displayed in a dialog box for the procedure.
Data Editor
SPSS Viewer
Allows you to access information on how to use the many features of SPSS.
Data View: Many of the features of the Data View are very similar to those found in a spreadsheet. There are several important things you have to know.
The Variable View contains descriptions of the attributes of each variable in the data file. In the Variable View:
▪ Rows are variables
▪ Columns are variable attributes
While IBM® SPSS® Statistics includes tools for data preparation, sometimes you need
more specialized techniques to get your data ready.
With IBM SPSS Data Preparation, you can easily identify suspicious or invalid cases,
variables and data values; view patterns of missing data; summarize variable distributions;
and more accurately work with algorithms designed for nominal attributes.
Variable name
The first column in the Variable View is “Name”. We can create a variable name that represents the information we collected. If we collect data about gender, we can name it “gender,” “sex,” or “gender_identity1”. Once we create a variable name in the Variable View, the variable name is displayed at the top of the “Data View” screen.
Variable type
Information on the type of each variable is displayed in the Variable View tab. Under the “Type” column, simply click the cell associated with the variable of interest. A blue “…” button will appear.
Variable types
Comma
• A numeric variable whose values are displayed with commas delimiting every three places and the period as a decimal delimiter.
• For example, 100,000.00 or 999,988,565.21.
Dot
• A numeric variable whose values are displayed with periods delimiting every three places and the comma as a decimal delimiter.
• The Data Editor accepts numeric values for dot variables with or without periods or in scientific notation.
Date
• A numeric variable whose values are displayed in one of several calendar-date or clock-time formats.
• Select the format from the list.
• The century range for two-digit year values is determined by your Options settings (from the Edit menu, choose Options and click the Data tab).
Custom currency
• A numeric variable whose values are displayed in one of the custom currency formats that you have defined in the Currency tab of the Options dialog box.
• Defined custom currency characters cannot be used in data entry but are displayed in the Data Editor.
String
• Values of a string variable are not numeric and hence are not used in calculations.
• They can contain any characters up to the defined length.
• Uppercase and lowercase letters are considered distinct.
• A string variable is also known as an alphanumeric variable.
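To make the difference between the Comma and Dot display formats concrete, here is a minimal Python sketch. It is only an illustration of the two conventions, not how SPSS formats values internally; the function names `format_comma` and `format_dot` are hypothetical.

```python
def format_comma(value: float) -> str:
    """Comma style: commas every three places, period as the decimal delimiter."""
    return f"{value:,.2f}"


def format_dot(value: float) -> str:
    """Dot style: periods every three places, comma as the decimal delimiter."""
    comma_style = f"{value:,.2f}"
    # Swap the two delimiters using a temporary placeholder character.
    return comma_style.replace(",", "#").replace(".", ",").replace("#", ".")


print(format_comma(999988565.21))
print(format_dot(999988565.21))
```

The same underlying number is stored in both cases; only the way it is displayed changes.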
Variable labels
You can write descriptive variable labels up to 256 characters long. Variable labels can contain spaces and reserved characters that are not allowed in variable names.
Value labels
You can assign descriptive value labels for each value of a variable. This is particularly useful if your data file uses
numeric codes to represent non-numeric categories.
Example: code 1 for male and code 2 for female.
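The idea of value labels can be sketched in Python as a simple lookup from numeric code to label. The list `gender_codes` is made-up data for illustration, and the "Unlabelled" fallback is an assumption of this sketch rather than SPSS behaviour.

```python
# Value labels: numeric code -> descriptive label (from the example above).
value_labels = {1: "Male", 2: "Female"}

# Hypothetical coded responses, as they would be typed into the data file.
gender_codes = [1, 2, 2, 1, 2]

# Translate each code into its label; codes without a label are flagged.
labelled = [value_labels.get(code, "Unlabelled") for code in gender_codes]
print(labelled)
```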
Measurement level
Measure refers to the property that we are focusing on and want to measure.
Example:
1. If we are from an anthropology background, we may be interested in measuring blood glucose levels.
2. If we are from a psychology background, we may be interested in measuring the motivation and personality of the individual.
3. If we are from a management background, we may be interested in the management process.
Role
- Role indicates what role the variable plays in your data analysis, such as independent, dependent, or both.
- It offers six options: Input, Target, Both, None, Partition, and Split.
Let’s practice!
Table 1 shows five different variables and their descriptions. The data are students’ final examination results in three different subjects. All of the students were chosen at random from one selected school in Malaysia.
Important Notes!
For first-time users: if your Measure is defined as Nominal or Ordinal, you are required to fill in Value Labels. Compulsory! If it is defined as Scale, you are not required to fill in Value Labels. Without this information, the data will be harder to understand. When the data are defined, anyone else who works with the data can easily tell the values apart.
Data screening is an important stage to ensure the best results from data analysis. Checking for typing errors, transforming reverse-scored questions, handling missing data, removing outliers and checking the normality assumption are a must.
The data in Table 3 have been manipulated for the purpose of explaining this section.
Annotations on Table 3: a value of 11 is not possible; two pieces of information are missing.
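The kind of screening described here, spotting an impossible value such as 11 and counting missing entries, can be sketched in Python. The data and the set of valid codes below are hypothetical, since Table 3 is not reproduced; `screen` is an illustrative helper, not an SPSS function.

```python
# Assumption: the variable is coded 1 or 2, so 11 is most likely a typing error.
VALID_CODES = {1, 2}

# Hypothetical column of raw entries; None stands for a blank cell.
raw_values = [1, 2, 11, None, 2, None, 1]


def screen(values, valid_codes):
    """Return (position, value) pairs for invalid entries and positions of blanks."""
    invalid = [(i, v) for i, v in enumerate(values)
               if v is not None and v not in valid_codes]
    missing = [i for i, v in enumerate(values) if v is None]
    return invalid, missing


invalid, missing = screen(raw_values, VALID_CODES)
print("Invalid entries:", invalid)
print("Missing positions:", missing)
```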
Option 1:
Replace values with numbers that are known from prior knowledge or from an educated guess. Easily done but can lead
to researcher bias if you’re not careful.
Option 2:
Replace missing values with the variable mean. This is the simplest option, but it lowers variability and in turn can bias results.
Option 3:
Use regression to predict the missing values. Other variables act as IVs predicting the variable with missing values (which acts as the DV). This method only works if the other variables significantly predict the variable with the missing values. Variability is still reduced, but the method is more objective than a guess and not as blind as inserting the mean.
Step 7: From the Coefficients table, predict the missing value using the following formula:
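In general form, a linear regression prediction can be written as below. This is the standard textbook equation, given here as an assumption about the intended formula: $b_0$ is taken to be the constant and $b_1, \dots, b_k$ the unstandardized coefficients of the predictors $X_1, \dots, X_k$ read from the table.

```latex
\[
\hat{Y} = b_0 + b_1 X_1 + b_2 X_2 + \dots + b_k X_k
\]
```

The missing value is then replaced by $\hat{Y}$, computed from that case's values on the predictors.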
✓ The behaviour of the distribution can be seen from the mean, median and mode. If these three measures of central tendency are equal, the data are said to be symmetric or normal. If they are not equal, the data are not symmetric and not normal.
✓ Normal distribution: symmetrical bell-shaped curve.
✓ Skewness: the curve appears distorted or skewed either to the left or to the right.
✓ Kurtosis: the sharpness of the peak of a distribution curve.
\[ \text{Skewness (Malay)} = \frac{-0.344}{0.434} = -0.7926 \qquad \text{Kurtosis (Malay)} = \frac{-1.333}{0.845} = -1.5775 \]
\[ \text{Skewness (Mathematics)} = \frac{0.912}{0.434} = 2.1014 \qquad \text{Kurtosis (Mathematics)} = \frac{0.548}{0.845} = 0.6485 \]
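The ratios above can be reproduced with a short Python sketch. It assumes that each denominator is the standard error of the corresponding statistic, which the text does not state explicitly, and it uses ±1.96 as a rule-of-thumb cutoff for flagging departures from normality; that cutoff is an assumption of this sketch, not a value given in this chapter.

```python
# (statistic, assumed standard error) pairs taken from the calculations above.
results = {
    "Malay": {"skewness": (-0.344, 0.434), "kurtosis": (-1.333, 0.845)},
    "Mathematics": {"skewness": (0.912, 0.434), "kurtosis": (0.548, 0.845)},
}

CUTOFF = 1.96  # assumption: common rule of thumb at the 5% level


def ratio(statistic, standard_error):
    """Divide a statistic by its (assumed) standard error."""
    return statistic / standard_error


ratios = {
    subject: {name: round(ratio(*pair), 4) for name, pair in measures.items()}
    for subject, measures in results.items()
}

for subject, measures in ratios.items():
    for name, value in measures.items():
        verdict = "within" if abs(value) <= CUTOFF else "outside"
        print(f"{subject} {name}: {value} ({verdict} +/-{CUTOFF})")
```

Under this assumed cutoff, only the skewness ratio for Mathematics falls outside the range.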
2.6.3 Normality Assumption (cont)
How to Run a Normality Plot and Test
Step 1: Select the Analyze menu, click Descriptive Statistics, then click Explore.
SPSS transformation commands (or simply “transformations”) can be loosely defined as commands that are not
immediately carried out when you run them. Instead, they are kept in mind by SPSS and executed only when necessary.
The table below gives an overview of SPSS' main transformation commands.
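The idea of commands that are remembered first and executed only when necessary can be illustrated with a small Python analogy. This is not how SPSS is implemented; the class `LazyDataset` and its methods `compute` and `execute` are hypothetical names used only to show the deferred-execution pattern.

```python
class LazyDataset:
    """Toy dataset that queues transformations instead of running them at once."""

    def __init__(self, rows):
        self._rows = [dict(row) for row in rows]
        self._pending = []  # transformations waiting to be carried out

    def compute(self, name, func):
        """Register a new variable; nothing is calculated yet."""
        self._pending.append((name, func))

    def execute(self):
        """Run all pending transformations, then return the data."""
        for name, func in self._pending:
            for row in self._rows:
                row[name] = func(row)
        self._pending.clear()
        return self._rows


data = LazyDataset([{"score": 40}, {"score": 75}])
data.compute("passed", lambda row: row["score"] >= 50)
pending_before = len(data._pending)  # the transformation is only queued here

rows = data.execute()                # now it is actually carried out
print(pending_before, rows)
```

The design point is that queuing is cheap: the data are passed over only once, when a result is actually needed.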