Basic Training in Spss WFP

BASIC TRAINING IN SPSS

May 2018
Introduction
• The Statistical Package for the Social Sciences (SPSS) is a powerful software package used in statistical production in a number of countries.
• Prior experience with any Windows-based application is sufficient for this short introduction to the basic features of SPSS.
Overview
• There are a number of different types of windows in
SPSS:
• Data Editor. This window displays the contents of the
data file. You can create new data files or modify existing
ones with the Data Editor. The Data Editor window
opens automatically when you start an SPSS session.
• In older versions of SPSS you can have only one data file open at a time, but in later versions you can have several open.
• Viewer. All statistical results, tables, and charts are
displayed in the Viewer. You can edit the output and
save it for later use. A Viewer window opens
automatically the first time you run a procedure that
generates output….
Overview
• Syntax Editor. You can paste your dialog
box choices into a syntax window, where
your selections appear in the form of
command syntax. You can then edit the
command syntax to use special features
of SPSS not available through dialog
boxes. You can save these commands in
a file for use in subsequent SPSS
sessions. …
Status Bar
• The status bar at the bottom of each SPSS window
provides the following information
• Command status. For each procedure or command that
you run, a case counter indicates the number of cases
processed so far. For statistical procedures that require
iterative processing, the number of iterations is
displayed.
• Filter status. If you have selected a random sample or a
subset of cases for analysis, the message Filter on
indicates that some type of case filtering is currently in
effect and not all cases in the data file are included in the
analysis.
Status Bar cont..
• Weight status. The message Weight on
indicates that a weight variable is being
used to weight cases for analysis.
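The Filter on and Weight on messages correspond to the FILTER and WEIGHT commands. A minimal sketch, assuming a hypothetical numeric variable area (coded 2 for rural) and a hypothetical weight variable hh_weight, neither of which comes from the training dataset:

```spss
* Keep only rural households in subsequent analyses (status bar shows Filter on).
COMPUTE rural_only = (area = 2).
FILTER BY rural_only.

* Return to using all cases.
FILTER OFF.

* Weight cases by a sampling weight (status bar shows Weight on).
WEIGHT BY hh_weight.

* Switch weighting off again.
WEIGHT OFF.
```

Filtering hides cases from analysis without deleting them, so switching the filter off restores the full file.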
Showing and Hiding the Status
Bar

• From the menus choose:


• View
• Status Bar
Basic steps in Data analysis

• Analyzing data with SPSS is easy. All you have to do is:
• Get your data into SPSS. You can open a previously
saved SPSS data file; read a spreadsheet, database,
or text data file; or enter your data directly in the Data
Editor.
• Select a procedure. Select a procedure from the
menus to calculate statistics or to create a chart.
• Select the variables for the analysis. The variables
in the data file are displayed in a dialog box for the
procedure.
• Run the procedure and look at the results. Results
are displayed in the Viewer.
Statistics coach and finding out
more about SPSS
• If you are unfamiliar with SPSS or with the statistical procedures available in SPSS, the Statistics Coach can help you get started by prompting you with simple questions, nontechnical language, and visual examples that help you select the basic statistical and charting features that are best suited for your data.
Statistics coach and finding out
more about SPSS
• For a comprehensive overview of SPSS basics, see the online tutorial. From any SPSS menu choose:
• Help
• Tutorial
Getting Help
• Online Help is provided in several ways:
• Help menu. Every window has a Help
menu on the menu bar. Topics provides
access to the Contents and Index tabs,
which you can use to find specific Help
topics.
• Tutorial provides access to the
introductory tutorial.
Getting Help
• Dialog box Help buttons.
Most dialog boxes have a Help button that
takes you directly to a Help topic for that
dialog box.
The Help topic provides general information
and links to related topics.
Getting Help
• Case Studies.
• The Case Studies item on the Help menu in the Viewer window provides hands-on examples of how to create various types of statistical analyses and how to interpret the results. The sample data files used in the examples are also provided so that you can work through the examples to see exactly how the results were produced.
Data files
• Data files come in a wide variety of formats, and this software is
designed to handle many of them, including:
• Spreadsheets created with Excel
• Database files created with dBASE and various SQL formats
• Tab-delimited and other types of ASCII text files
• Data files in SPSS format created on other operating systems
• SAS data files
• From the menus choose:
• File
• Open
• Data...CFSS
• Paste
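Choosing Paste instead of OK in the Open Data dialog writes the equivalent command to a syntax window. A minimal sketch of what that syntax looks like for an SPSS data file, assuming a hypothetical file path and dataset name:

```spss
* Hypothetical file location. Replace it with the path to your own data file.
GET FILE='C:\survey\data\household.sav'.
DATASET NAME household WINDOW=FRONT.
```

Saving this syntax means the same file can be reopened in later sessions without going through the menus.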
File information
• A data file contains much more than raw
data. It also contains any variable
definition information, including:
• Variable names
• Variable formats
• Descriptive variable and value labels
To Obtain Data File Information
• From the menus in the Data Editor window
choose:
• File
• Display Data File Information
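The same information can be requested from a syntax window. A minimal sketch for the working data file:

```spss
* List names, formats, labels, and missing values for all variables in the active file.
DISPLAY DICTIONARY.
```

The result appears in the Viewer, where it can be saved as documentation of the file.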
To Save a Subset of Variables

• Make the Data Editor the active window (click anywhere in the window to make it active).
• From the menus choose:
• File
• Save As...
• Select SPSS (*.sav) from the list of file types.
• Click Variables.
• Select the variables that you want to save.
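The steps above can also be written as a SAVE command with a KEEP subcommand. A minimal sketch, assuming a hypothetical output path and using variable names that appear elsewhere in this training:

```spss
* Hypothetical output path. The variables are the ones used elsewhere in this training.
SAVE OUTFILE='C:\survey\data\household_subset.sav'
  /KEEP=sex_1 FCS_1 FC_group.
```

Only the variables listed after KEEP are written to the new file; the working data file itself is unchanged.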
Protecting original data
• To prevent the accidental modification/deletion of your
original data, you can mark the file as read-only.
• From the Data Editor menus choose:
• File
• Mark File Read Only
• If you make subsequent modifications to the data and then try to save the data file, you can save the data only with a different filename, so the original data are not affected.
• You can change the file permissions back to read/write by selecting Mark File Read Write from the File menu.
Data Editor
• The Data Editor provides a convenient,
spreadsheet-like method for creating and
editing data files. The Data Editor window
opens automatically when you start a
session.
• The Data Editor provides two views of
your data:
Data Editor
• Data view. Displays the actual data values or
defined value labels.
• Variable view. Displays variable definition information, including defined variable and value labels, data type (for example, string, date, and numeric), measurement level (nominal, ordinal, or scale), and user-defined missing values.
• In both views, you can add, change, and delete
information contained in the data file.
Features of the Data view
• Rows are cases. Each row represents a case or an
observation. For example, each
individual respondent to a questionnaire is a case.
• Columns are variables. Each column represents a
variable or characteristic being measured.
• For example, each item on a questionnaire is a
variable.
• Cells contain values. Each cell contains a single value of
a variable for a case.
• The cell is the intersection of the case and the variable.
Cells contain only data values.
• Unlike spreadsheet programs, cells in the Data Editor
cannot contain formulas.
Features of the Data view
• The data file is rectangular. The dimensions of the data
file are determined by the number of cases and
variables.
• You can enter data in any cell. If you enter data in a cell
outside the boundaries of the defined data file, the data
rectangle is extended to include any rows and/or
columns between that cell and the file boundaries.
• There are no “empty” cells within the boundaries of the
data file. For numeric variables, blank cells are
converted to the system-missing value.
Features of the Variable view
• The Variable view contains descriptions of the attributes of each variable in the data file. In the Variable view:
• Rows are variables.
• Columns are variable attributes.
• You can add or delete variables and modify attributes of variables,
including:
• Variable name
• Data type
• Number of digits or characters
• Number of decimal places
• Descriptive variable and value labels
• User-defined missing values
• Column width
Variable names
• The following rules apply to variable names:
• The name must begin with a letter. The
remaining characters can be any letter,
any digit, a period, or the symbols @, #, _, or $.
• Variable names that end with an underscore should be avoided (to avoid conflict with variables automatically created by some procedures).
Variable names
• The length of the name cannot exceed 64 bytes. Sixty-four bytes typically
means 64 characters in single-byte languages (for example, English,
French, German, Spanish, Italian, Hebrew, Russian, Greek, Arabic, Thai)
and 32 characters in double-byte languages (for example, Japanese,
Chinese, Korean).
• Blanks and special characters (for example, !, ?, ', and *) cannot be used.
• Each variable name must be unique; duplication is not allowed.
• Reserved keywords cannot be used as variable names. Reserved
keywords are: ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO,
WITH.
• Variable names can be defined with any mixture of upper and lower case
characters, and case is preserved for display purposes.
• When long variable names need to wrap on to multiple lines in output,
SPSS attempts to break lines at underscores, periods, and a change from
lower to upper case.
Variable types

• The Type column shows Numeric for all rows. This means that numeric (number) values are expected in the dataset for these variables.
• This is correct for all the variables except sex, where we have collected data in the form of F for female and M for male.
Variable types
• The common items are Numeric, Date, Custom
currency and String.
• To enter alpha characters into a variable, e.g. sex, select data type String by clicking on its selection button.
• Notice that a string of 8 characters is the default. Whilst this would cause no problem, it is more efficient to reduce it to the actual number of characters you are going to input.
• Therefore, for the variable sex, change the default of 8 in the box to 1.
Variable width and decimal places

• The next column is headed Width and deals with the maximum number of characters that will be displayed for a particular variable in all output relating to this variable.
• It does not control the display in the Data View window, which is determined by Columns - see later.
• For a numeric variable it needs to be considered
alongside the next column labelled Decimals. The value
in this column indicates the number of decimal places
that will be displayed in all output relating to this variable.
Variable width and decimal places

• By default the Width value is set to 8 and Decimals to 2.
• This may be felt to be appropriate and not
need changing.
• For finer control of your output, you can
alter values as necessary.
• For a string or date variable, the Decimals
column has no meaning.
Variable labels

• The next column is headed Label and is used to inform SPSS about the details associated with each variable name.
• The maximum length of any label is 256
characters and there are no restrictions on
what may appear.
• Spaces are entered just as typed.
Syntax variable label
• VARIABLE LABELS sex_1 'Sex of household head'.
• relationship_1?
• marital_1?
• qualification_1?
Value labels

• The next task is to enter Value Labels for each variable if appropriate.
• These will appear in the Values column.
• The default entry is None.
• For sex you can indicate that M is male and F is female. Alternatively, make the variable numeric and use the codes 0 and 1, or 1 and 2.
Value label syntax
• VALUE LABELS sex_1 1 'Male' 2 'Female'.

• relationship_1?
• marital_1
• qualification_1?
Copy Value labels
• The variables sex_1, sex_2, and so on all need the same value labels.
• Either repeat the above instructions for each variable in turn or take advantage of a useful shortcut:
• Click in the Values cell for sex_1. Copy this cell by using Edit / Copy from the main menu, by clicking the right mouse button and selecting Copy, or by pressing <Ctrl + C>.
• Point at the Values cell for each of the variables sex_2, sex_3, and so on, and choose Paste or press <Ctrl + V>.
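The copy-and-paste shortcut has a syntax equivalent: VALUE LABELS accepts a list of variables that share the same labels. A minimal sketch, assuming the numeric coding 1 = Male, 2 = Female and only the three variables named above:

```spss
* Assign the same value labels to several variables in one command.
VALUE LABELS sex_1 sex_2 sex_3
  1 'Male'
  2 'Female'.
```

VALUE LABELS replaces any value labels already defined for the listed variables; ADD VALUE LABELS can be used instead to keep existing labels and add new ones.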
Missing values

• The next column of the Variable View sheet is Missing Values.
• In the statistical analysis of any dataset it is sometimes necessary to exclude cases where the information is not known or not appropriate.
• An example of this occurs in the variable Q3.5 in this dataset. The information is missing in two situations:
• The question was not appropriate because the respondent didn't grow, and the corresponding cells in the data have been left blank.
• The respondent failed to give an answer to the question. A number that could not be expected as a genuine response is selected to represent this circumstance.
• However, that number should not be included in any analysis, as it would seriously distort many statistical procedures.
Missing values
• Within SPSS there are two types of missing
value - system-missing values and user-
defined missing values.
• By default, for non-string variables, an empty
cell is defined as a system missing value and
does not need to be further declared.
• For user-defined missing values this column of
the Variable View has to be used.
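User-defined missing values can also be declared in syntax with the MISSING VALUES command. A minimal sketch for the variable Q3.5, assuming 99 is the code chosen for "no answer" (the actual code is not given in this training):

```spss
* 99 is a hypothetical code for "no answer". Use the code from your own codebook.
MISSING VALUES Q3.5 (99).
```

Blank cells in numeric variables are already system-missing, so only the special code needs to be declared.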
Data display
(columns and align)
• The Columns and Align columns control how values are displayed in the Data View window. For the purposes of this example dataset, the default values of a column 8 characters wide with the values right-aligned are fine.
• When you have entered your data return
to the Variable View window and change
one or more of these values.
• Then flip to the Data View window and
see the effect your choice has made.
Measurement scale of variables
• You can specify the level of measurement
as scale (numeric data on an interval or
ratio scale), ordinal, or nominal.
• Nominal and ordinal data can be either
string (alphanumeric) or numeric.
Measurement scale of variables
• Measurement specification is relevant only
for:
Custom Tables procedure and chart
procedures that identify variables as scale
or categorical. Nominal and ordinal are
both treated as categorical.
Measurement scale of variables
Scale. Data values are numeric values on an interval or ratio scale—for example, age or
income. Scale variables must be numeric.
Ordinal. Data values represent categories with some intrinsic order (for example,
low, medium, high; strongly agree, agree, disagree, strongly disagree). Ordinal
variables can be either string (alphanumeric) or numeric values that represent distinct
categories (for example, 1 = low, 2 = medium, 3 = high).
Note: For ordinal string variables, the alphabetic order of string values is assumed
to reflect the true order of the categories. For example, for a string variable with
the values of low, medium, high, the order of the categories is interpreted as high,
low, medium—which is not the correct order. In general, it is more reliable to use
numeric codes to represent ordinal data.
Nominal. Data values represent categories with no intrinsic order—for example, job
category or company division. Nominal variables can be either string (alphanumeric)
or numeric values that represent distinct categories—for example, 1 = Male, 2 =
Female.
• scale to represent a numeric variable that can take discrete or continuous values along a range
• ordinal to represent values that, although numeric, only represent an ordered listing of such values
• nominal to represent values that are simply names
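The measurement level can also be set in syntax with the VARIABLE LEVEL command. A minimal sketch using variable names from this training; which level suits each variable is an assumption made for illustration:

```spss
* Set the measurement level for three variables in one command.
VARIABLE LEVEL FCS_1 (SCALE)
  /FC_group (ORDINAL)
  /sex_1 (NOMINAL).
```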
To Enter Numeric Data

• Select a cell in the Data view.


• Enter the data value. The value is
displayed in the cell editor at the top of the
Data Editor.
• Press Enter or select another cell to
record the value.
Editing Data

• With the Data Editor, you can modify data values in the Data view in many ways.
• You can:
• Change data values.
• Cut, copy, and paste data values.
• Add and delete cases.
• Add and delete variables.
• Change the order of variables.
Identifying Duplicate Cases

• “Duplicate” cases may occur in your data for many reasons, including:
• Data entry errors in which the same case is accidentally entered more than once.
• Multiple cases share a common primary ID value but have different secondary ID values, such as family members who all live in the same house.
• Multiple cases represent the same case but with different values for variables other than those that identify the case, such as multiple purchases made by the same person or company for different products or at different times.
• Identify Duplicate Cases allows you to define duplicate almost any way that you want and provides some control over the automatic determination of primary versus duplicate cases.
To identify and flag duplicate
cases
• From the menus choose:
• Data
• Identify Duplicate Cases...
• Select one or more variables that identify
matching cases.
• Select one or more of the options in the
Variables to Create group.
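Pasting from the Identify Duplicate Cases dialog produces syntax built on SORT CASES and MATCH FILES. A simplified sketch of the same idea, assuming a hypothetical household identifier hh_id as the matching variable:

```spss
* hh_id is a hypothetical household identifier.
SORT CASES BY hh_id (A).
MATCH FILES
  /FILE=*
  /BY hh_id
  /FIRST=PrimaryFirst
  /LAST=PrimaryLast.
VARIABLE LABELS PrimaryLast 'Indicator of each last matching case as Primary'.
VALUE LABELS PrimaryLast 0 'Duplicate case' 1 'Primary case'.
FREQUENCIES VARIABLES=PrimaryLast.
```

The frequency table shows how many cases were flagged as duplicates; no cases are deleted by these commands.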
Data Transformations
• In an ideal situation, your raw data are perfectly suitable for the type of analysis you want to perform.
• In practice, preliminary analysis may reveal inconvenient coding schemes or coding errors, or data transformations may be required in order to expose the true relationship between variables.
• You can perform data transformations ranging from
simple tasks, such as collapsing categories for analysis,
to more advanced tasks, such as creating new variables
based on complex equations and conditional statements.
Computing Variables
• To Compute Variables
• From the menus choose:
• Transform
• Compute...
• Type the name of a single target variable.
It can be an existing variable or a new
variable to be added to the working data
file.
Expenditure
• Food expenditure
• Nonfood expenditure
• Total expenditure
• Food expenditure as a percentage of total
• Per capita expenditure per month
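The expenditure indicators listed above could be computed as follows. This is a minimal sketch, assuming hypothetical variables food_exp and nonfood_exp (monthly expenditure) and hh_size (number of household members); none of these names comes from the training dataset:

```spss
* Total monthly expenditure.
COMPUTE total_exp = food_exp + nonfood_exp.
* Food expenditure as a percentage of total expenditure.
COMPUTE food_share = 100 * food_exp / total_exp.
* Per capita expenditure per month.
COMPUTE exp_percapita = total_exp / hh_size.
EXECUTE.
```

With the + operator, total_exp is system-missing whenever either component is missing, which avoids reporting a partial total as if it were complete.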
Reduced Coping Strategy Index (rCSI)

• Composite indicators
• A single numeric indicator that summarizes a larger set of sub-indicators
• Coping Strategies Index: summarizes the consumption-related coping strategies a household (HH) has engaged in during the last month
• COMPUTE CSI=(lessExpenpast7days * 1) +
(borrowpast7days * 2) + (reduceportionsize7days * 1)
+ (reducefoodquant7days * 3) + (reducemeals7days * 1).
EXECUTE.
Syntax rCSI
• COMPUTE CSI=
(reducefoodquant7days * 3) +
(borrowpast7days * 2) +
(reduceportionsize7days * 1) +
(reducemeals7days * 1) +
(lessExpenpast7days * 1).
EXECUTE.
Recoding Values
• You can modify data values by recoding them.
• This is particularly useful for collapsing or
combining categories.
• You can recode the values within existing variables,
or you can create new variables based on the
recoded values of existing variables.
• For example: age, district, education, etc.
• From the menus choose:
• Transform
• Recode into Different Variables
• Recode into Same Variables
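As an illustration of collapsing categories, the sketch below recodes age into three groups in a new variable. The variable names age and age_group and the cut-points are assumptions chosen for the example, not values from the training dataset:

```spss
* Collapse age in completed years into three groups, stored in a new variable.
RECODE age (Lowest thru 17=1) (18 thru 59=2) (60 thru Highest=3) INTO age_group.
VALUE LABELS age_group 1 'Under 18' 2 '18 to 59' 3 '60 and over'.
EXECUTE.
```

Recoding into a different variable keeps the original age values intact, which makes it easy to change the grouping later.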
Matching data

– Add variables
• Data
• Merge Files
• Add Variables
– Add cases
• Data
• Merge Files
• Add Cases
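The two merge operations correspond to the MATCH FILES and ADD FILES commands. A minimal sketch, assuming a hypothetical key variable hh_id and hypothetical file paths:

```spss
* Add variables. Both files must already be sorted by the key variable.
SORT CASES BY hh_id (A).
MATCH FILES
  /FILE=*
  /FILE='C:\survey\data\household_extra.sav'
  /BY hh_id.
EXECUTE.

* Add cases from a second file with the same variables.
ADD FILES
  /FILE=*
  /FILE='C:\survey\data\household_round2.sav'.
EXECUTE.
```

In both commands the asterisk refers to the active data file.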
Food Consumption Score
• Dietary diversity and food frequency
• Analysis of 7-day food/food group consumption recall
• Food consumption score calculated using weights assigned to foods based upon their nutritional value.
• Cut-points or thresholds established to enable trends analysis and cross-country comparisons and to provide a benchmark for success.
• Households were classified as having either ‘poor’, ‘borderline’ or ‘good’ consumption based on the food consumption score.
Food Consumption Score

• A composite score based on:
• dietary diversity = number of food groups consumed over a reference period (the past week) by the household
• food frequency = number of days (in the past week) that a specific food group has been consumed by the household
• relative nutritional importance of different food groups (food groups have weights)
Syntax FCS
• COMPUTE FC_main_staple_days_1=sum(xxx,xxxx).
EXECUTE.

• COMPUTE FC_pulses_days_1=xxxx.
EXECUTE.

• COMPUTE FC_vegetable_days_1=sum(xxxxx,xxxx,xxxxx).
EXECUTE.
• COMPUTE FC_fruit_days_1=sum(xxxxx,xxxxx).
EXECUTE.

• COMPUTE FC_meat_days_1=sum(xxxxxx,xxxxx,XXXXXX).
EXECUTE.

• COMPUTE FC_milk_dairies_days_1=XXXXXX.
EXECUTE.

• COMPUTE FC_sugar_days_1=XXXXXXX.
EXECUTE.

• COMPUTE FC_oil_days_1=XXXX.
EXECUTE.
Syntax FCS
• RECODE FC_main_staple_days_1 FC_pulses_days_1
FC_vegetable_days_1 FC_fruit_days_1 FC_meat_days_1
FC_milk_dairies_days_1 FC_sugar_days_1 FC_oil_days_1
(7 thru Highest=7).
EXECUTE.

• COMPUTE FCS_1=sum((FC_main_staple_days_1*2),
(FC_pulses_days_1*3), (FC_vegetable_days_1*1), (FC_fruit_days_1*1),
(FC_meat_days_1*4), (FC_milk_dairies_days_1*4), (FC_oil_days_1*0.5),
(FC_sugar_days_1*0.5)).
EXECUTE.

• RECODE FCS_1 (Lowest thru 21.00=1) (21 thru 35=2) (35 thru Highest=3)
INTO FC_group.
EXECUTE.

• VALUE LABELS FC_group 3.00 'acceptable' 2.00 'borderline' 1.00 'poor'.


EXECUTE.
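After running the FCS syntax it is worth checking the results. A minimal sketch of such a check, using the variables created above:

```spss
* The maximum possible score is 7 * (2 + 3 + 1 + 1 + 4 + 4 + 0.5 + 0.5) = 112.
DESCRIPTIVES VARIABLES=FCS_1
  /STATISTICS=MEAN STDDEV MIN MAX.
FREQUENCIES VARIABLES=FC_group.
```

A minimum below 0 or a maximum above 112 would point to a problem in the food group variables. Because RECODE applies the first matching range, a score of exactly 21 falls in group 1 and a score of exactly 35 in group 2.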
Save the data set
• Important to save the data set
• Choose
– File
• Save as
• Make sure to store in the right folder
• Paste the syntax
• Run the syntax
Documentation
• Important to have a good structure
• A main folder for the survey
• A folder for the spss data files
• A folder for the syntax files
• A folder for documentation
• A folder for output listings etc.
• A folder for codelists
