
Quantitative Data Analysis in SPSS: Formatting, Handling, & Manipulation

This document provides an overview of quantitative data analysis and handling missing data in SPSS. It discusses topics such as data cleaning, formatting, merging, recoding variables, creating new variables, and working with syntax. Modern techniques for handling missing data like maximum likelihood estimation and multiple imputation are presented as superior to older methods like deletion, mean substitution, and last observation carried forward. Social scientists' lack of familiarity with and use of modern missing data techniques is also examined.

Quantitative Data Analysis in SPSS

Formatting, Handling, & Manipulation

Jamil A. Malik (PhD)


National Institute of Psychology
Quaid-e-Azam University

Data Cleaning in SPSS


1. Data labeling and formatting
2. Data merging
3. Recoding existing variables
4. Data manipulation
   Computation
   Creating new variables from existing variables
5. Working with syntax

Data labeling and formatting


Specifying Type of Variable

Example column: HT (numeric, two decimal places), e.g. 61.00, 68.00, 47.00, 66.00, 72.00

Data Labeling

Variable Formatting

Specifying missing values

Measurement category

Data merging in SPSS


1. Make sure that both files are sorted by the key variable in ascending order.
2. In SPSS, open the Data menu.
3. Select Merge Files, then Add Variables.
4. Select the dataset you want to merge into the working file.
5. Select Match cases on key variables in sorted files.
6. Select Both files provide cases.
7. Highlight ID in the Excluded Variables box, then click the arrow button next to the Key Variables box.
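
The merge steps above amount to matching cases on a key variable while keeping cases from either file. The following is a minimal Python sketch of that logic, not SPSS syntax; the two small datasets, the key name `id`, and the variable names `ht` and `wt` are hypothetical.

```python
# Sketch of an "Add Variables" merge in which both files provide cases.
# The datasets and variable names below are made up for illustration.

def merge_on_key(left, right, key):
    """Match cases on `key`; keep cases found in either file."""
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    # Collect every variable from both files, key first, in first-seen order.
    variables = [key]
    for row in left + right:
        for name in row:
            if name not in variables:
                variables.append(name)
    merged = []
    for k in sorted(set(left_by_key) | set(right_by_key)):  # ascending key order
        case = {name: None for name in variables}  # None marks a missing value
        case.update(left_by_key.get(k, {}))
        case.update(right_by_key.get(k, {}))
        merged.append(case)
    return merged

working_file = [{"id": 1, "ht": 61.0}, {"id": 2, "ht": 68.0}, {"id": 4, "ht": 66.0}]
second_file = [{"id": 1, "wt": 140.0}, {"id": 3, "wt": 155.0}, {"id": 4, "wt": 150.0}]

merged = merge_on_key(working_file, second_file, "id")
for case in merged:
    print(case)
```

Cases that appear in only one file are kept, with the other file's variables left missing, which is why merged data often need a missing data check afterwards.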

Recoding existing variables


From the SPSS menus, go to Transform > Recode > Into Same Variables

1. Move Group from the variable list into the String Variables box.
2. Click Old and New Values to proceed.
3. Type the old value and the new value it should be converted into.
4. Click Add (to change or remove an entry, select it and click Change or Remove).
5. Enter all value pairs in the Old --> New box, then click Continue.
6. Click OK to run the recode.
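
The recoding steps above can be pictured as a lookup from old values to new values that overwrites the variable in place. A minimal Python sketch follows; it is not SPSS syntax, and the variable name `group` and the old/new value pairs are hypothetical.

```python
# Sketch of recoding into the same variable: old values are overwritten in place.
# The variable name `group` and the old -> new pairs are hypothetical.

old_to_new = {"control": "0", "treatment": "1"}

cases = [
    {"id": 1, "group": "control"},
    {"id": 2, "group": "treatment"},
    {"id": 3, "group": "other"},
]

for case in cases:
    # Values without an old -> new pair are left unchanged.
    case["group"] = old_to_new.get(case["group"], case["group"])

recoded = [case["group"] for case in cases]
print(recoded)
```

Because the original values are overwritten, it is usually worth keeping a copy of the data file before recoding.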

Computing New Variables


Computing a patient's age from the date of birth and the date of enrollment in the study.
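
As a rough illustration of this computation, the sketch below derives age in completed years from two dates. It is written in Python rather than SPSS syntax, and the function name `age_in_years` and the example dates are assumptions made for illustration.

```python
from datetime import date

def age_in_years(birth, enrolled):
    """Completed years between date of birth and date of enrollment."""
    years = enrolled.year - birth.year
    # Subtract one year if the birthday had not yet occurred at enrollment.
    if (enrolled.month, enrolled.day) < (birth.month, birth.day):
        years -= 1
    return years

# Hypothetical patient: enrolled one day before the 30th birthday.
age = age_in_years(date(1980, 6, 15), date(2010, 6, 14))
print(age)
```

Counting completed years avoids the small errors that come from dividing a day count by 365.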

Handling Missing Data

What is certain in life?


Death
Taxes

What is certain in research?


Measurement error
Missing data

Missing data can be:


Due to preventable errors, mistakes, or lack of foresight by the researcher
Due to problems outside the control of the researcher
Deliberate, intended, or planned by the researcher to reduce cost or respondent burden
Due to differential applicability of some items to subsets of respondents
Etc.

Some Characteristics of Missing Data

Facets of missing data


Persons
Variables
Occasions

Type of non-response

Block non-response
Wave non-response
Item non-response

Special non-response problems in longitudinal and clustered data

Attrition/drop-out
Group (e.g. family) member non-response
Missing Data in Research Studies

Missing data mechanism


Missing completely at random (MCAR): ignorable
Missing at random (MAR): conditionally ignorable
Missing not at random (MNAR): nonignorable

Amount of missing data


Percent of cases with missing data
Percent of variables having missing data
Percent of data values that are missing
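
The three percentages above can differ widely on the same dataset. The sketch below computes each of them in Python for a small hypothetical dataset in which `None` marks a missing value; the variable names are made up for illustration.

```python
# None marks a missing value; the dataset is hypothetical.
data = [
    {"ht": 61.0, "wt": 140.0, "age": 25},
    {"ht": None, "wt": 155.0, "age": 31},
    {"ht": 66.0, "wt": None, "age": 47},
    {"ht": 72.0, "wt": 180.0, "age": 52},
]
variables = ["ht", "wt", "age"]

# Count cases, variables, and individual values affected by missingness.
cases_with_missing = sum(any(row[v] is None for v in variables) for row in data)
variables_with_missing = sum(any(row[v] is None for row in data) for v in variables)
missing_values = sum(row[v] is None for row in data for v in variables)

pct_cases = 100 * cases_with_missing / len(data)
pct_variables = 100 * variables_with_missing / len(variables)
pct_values = 100 * missing_values / (len(data) * len(variables))

print(round(pct_cases, 1), round(pct_variables, 1), round(pct_values, 1))
```

In this example only two data values are missing, yet half of the cases are incomplete, which is the situation in which deleting incomplete cases becomes costly.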

Older Missing Data Treatments (1)

Deletion methods
Listwise deletion (complete case analysis)
Pairwise deletion (available case analysis)

Cold deck imputation


Deterministic, logical, or rule-based imputation
Treat missing data for nominal predictors as an additional category

Hot deck (donor case) imputation


Cluster based methods
Distance based (e.g. nearest neighbor) methods

Mean substitution
(Variable) mean substitution
Mean substitution with added random error
Predictor mean substitution with missing data dichotomy
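
To make the deletion and mean substitution methods listed above concrete, here is a minimal Python sketch on a hypothetical two-variable dataset (`ht`, `wt`), with `None` marking a missing value. It also illustrates one known drawback of variable mean substitution, namely that it shrinks the variance of the filled-in variable.

```python
from statistics import mean, variance

# Hypothetical (ht, wt) cases; None marks a missing value.
data = [(61.0, 140.0), (None, 155.0), (66.0, None), (72.0, 180.0), (65.0, 150.0)]

# Listwise deletion: drop any case with a missing value on any variable.
listwise = [case for case in data if None not in case]

# Available case logic for a single variable: use every observed value.
ht_available = [ht for ht, _ in data if ht is not None]

# Variable mean substitution: replace each missing ht with the observed mean.
ht_mean = mean(ht_available)
ht_filled = [ht_mean if ht is None else ht for ht, _ in data]

print(len(listwise), len(ht_available))
print(variance(ht_available), variance(ht_filled))
```

Listwise deletion keeps fewer cases than the available case approach, and the filled-in variable has a smaller variance than the observed values alone.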

Older Missing Data Treatments (2)

Regression imputation
Regression predicted value imputation
Regression imputation with added random error

Special methods for longitudinal studies and randomized controlled trials

Endpoint only analysis


Last observation carried forward (LOCF)
Intent to treat worst (best) case imputation
Summary growth parameters
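
Of the longitudinal methods listed above, last observation carried forward (LOCF) is the simplest to show in code. The Python sketch below is an illustration only; the function name `locf` and the repeated-measures scores are hypothetical.

```python
def locf(series):
    """Replace each missing value with the most recent observed value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)  # stays None until a first value has been observed
    return filled

# Hypothetical scores for one participant across six waves; None = missed wave.
scores = [12, 14, None, None, 18, None]
filled_scores = locf(scores)
print(filled_scores)
```

LOCF assumes that a participant's score stays flat after the last observation, an assumption that is often hard to defend when scores are expected to change over time.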

Special methods for multi-item scales

Available item method of scale construction


Person mean imputation
Two-way imputation
Two-way imputation with added random error

Modern Missing Data Treatments

Maximum likelihood (ML)


Estimates summary statistics or statistical models using all available data
Available in modern structural equation modeling software (Amos, EQS, LISREL, Mplus, Mx, etc.)
The ML covariance matrix and mean vector can also be obtained from SPSS MVA and used for standard Regression, Factor Analysis, Reliability, and other procedures
There are also freeware and open source programs that can produce the ML covariance matrix and mean vector, usually by using the Expectation Maximization (EM) algorithm (e.g. EMCOV)

Multiple imputation
Imputes individual data values in multiple complete datasets, averaging the results of the statistical analyses across these datasets
Available in the current versions of certain SEM software (Amos, Mplus)
Also available in SPSS (MVA), SAS (Proc MI and MIANALYZE), Stata (mi impute and mi estimate), and stand-alone missing data packages such as SOLAS
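
The averaging step in multiple imputation is commonly done with Rubin's rules, which the slides do not name explicitly. The Python sketch below shows the pooling arithmetic only, under the assumption that a parameter estimate and its squared standard error are already available from each imputed dataset; all numbers are made up for illustration.

```python
from statistics import mean, variance

def pool(estimates, squared_ses):
    """Pool m estimates and their squared standard errors (Rubin's rules)."""
    m = len(estimates)
    pooled = mean(estimates)        # average of the m estimates
    within = mean(squared_ses)      # average within-imputation variance
    between = variance(estimates)   # between-imputation variance
    total = within + (1 + 1 / m) * between
    return pooled, total

# Hypothetical results from m = 5 imputed datasets.
estimates = [2.0, 2.5, 3.0, 2.5, 2.5]
squared_ses = [0.25, 0.25, 0.25, 0.25, 0.25]

pooled_estimate, total_variance = pool(estimates, squared_ses)
print(pooled_estimate, total_variance)
```

The pooled variance is larger than the average within-imputation variance because it also reflects how much the estimates vary across imputed datasets, which is how the uncertainty due to missing data enters the final standard error.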

Why do social scientists use modern missing data treatments so infrequently?

Lack of awareness or familiarity
They are not convinced of the problems with older methods
The statistical literature on missing data is technically daunting
The techniques aren't incorporated into the standard statistical analysis procedures used by social scientists
Journal reviewers and editors have not required them

Working with SPSS Syntax

Demonstration
