
Data Preparation - 2

This document discusses the process of data preparation for statistical analysis. It covers checking questionnaires, editing data, coding responses, transcribing data, cleaning data by checking for inconsistencies, treating missing data, weighting data for representativeness, and selecting appropriate univariate and multivariate analysis strategies based on the characteristics and properties of the data.

Uploaded by

udaylpu9

Chapter Fourteen

Data Preparation

Data Preparation Process


Fig. 14.1

Prepare Preliminary Plan of Data Analysis

Check Questionnaire

Edit

Code

Transcribe

Clean Data

Statistically Adjust the Data

Select Data Analysis Strategy



Questionnaire Checking
A questionnaire returned from the field may be unacceptable for several reasons.
• Parts of the questionnaire may be incomplete.
• The pattern of responses may indicate that the respondent did not understand or follow the instructions.
• The responses show little variance.
• One or more pages are missing.
• The questionnaire is received after the preestablished cutoff date.
• The questionnaire is answered by someone who does not qualify for participation.
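
As a rough illustration, most of the checks above could be automated along the following lines. This is only a sketch: the record layout (`answers`, `pages_received`, `received`, `qualifies`) and the "all answers identical" rule for little variance are assumptions made for the example, not part of the original material. Whether a respondent misunderstood the instructions usually needs a human editor and is not covered.

```python
from datetime import date

def check_questionnaire(q, n_questions, n_pages, cutoff):
    """Return the reasons a returned questionnaire may be unacceptable.

    `q` is a hypothetical record: a dict with 'answers' (None = left blank),
    'pages_received' (int), 'received' (date) and 'qualifies' (bool).
    """
    problems = []
    answered = [a for a in q["answers"] if a is not None]
    if len(answered) < n_questions:
        problems.append("incomplete")
    # Assumed rule of thumb: identical answers throughout means little variance.
    if len(answered) > 1 and len(set(answered)) == 1:
        problems.append("little variance")
    if q["pages_received"] < n_pages:
        problems.append("missing pages")
    if q["received"] > cutoff:
        problems.append("received after cutoff")
    if not q["qualifies"]:
        problems.append("respondent does not qualify")
    return problems

# Made-up questionnaire used only to exercise the checks.
sample = {
    "answers": [4, 4, 4, None, 4],
    "pages_received": 3,
    "received": date(2024, 3, 20),
    "qualifies": True,
}
issues = check_questionnaire(sample, n_questions=5, n_pages=4,
                             cutoff=date(2024, 3, 15))
print(issues)
```

A questionnaire with an empty list of problems would pass on to editing; anything else would be set aside for the treatments described on the next slide.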



Editing
Treatment of Unsatisfactory Results
• Returning to the Field – The questionnaires with unsatisfactory responses may be returned to the field, where the interviewers recontact the respondents.
• Assigning Missing Values – If returning the questionnaires to the field is not feasible, the editor may assign missing values to unsatisfactory responses.
• Discarding Unsatisfactory Respondents – In this approach, the respondents with unsatisfactory responses are simply discarded.

Coding
Coding means assigning a code, usually a number, to each possible response to each question. The code includes an indication of the column position (field) and data record it will occupy.

Coding Questions
• Fixed field codes, which mean that the number of records for each respondent is the same and the same data appear in the same column(s) for all respondents, are highly desirable.
• If possible, standard codes should be used for missing data.
• Coding of structured questions is relatively simple, since the response options are predetermined.
• In questions that permit a large number of responses, each possible response option should be assigned a separate column.
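
A minimal sketch of these ideas follows. The variable names, the column layout, the media question, and the use of 9 as the missing-data code are all assumptions chosen for illustration. The sketch shows a fixed-field record in which every respondent occupies the same columns, a standard missing code, and a multiple-response question where each option gets its own column.

```python
MISSING_DIGIT = "9"  # assumed standard missing-data code (9, 99, 999, ...)

# Hypothetical fixed-field layout: (variable name, start column, width).
# Columns are 1-based, as in a codebook.
LAYOUT = [
    ("resp_id", 1, 3),
    ("gender", 4, 1),
    ("uses_tv", 5, 1),
    ("uses_radio", 6, 1),
    ("uses_web", 7, 1),
]
MEDIA = ["tv", "radio", "web"]  # options of a made-up multiple-response question

def code_multiple_response(selected):
    """One 0/1 column per response option."""
    return {f"uses_{m}": int(m in selected) for m in MEDIA}

def encode(record):
    """Build one fixed-width line; the same data land in the same columns."""
    fields = []
    for name, _start, width in LAYOUT:
        value = record.get(name)
        if value is None:
            fields.append(MISSING_DIGIT * width)
        else:
            fields.append(str(value).zfill(width))
    return "".join(fields)

def decode(line):
    """Slice a fixed-width line back into named fields (as strings)."""
    return {name: line[start - 1:start - 1 + width]
            for name, start, width in LAYOUT}

record = {"resp_id": 17, "gender": None, **code_multiple_response({"tv", "web"})}
line = encode(record)
print(line)          # 0179101
print(decode(line))
```

Because the layout is fixed, any analysis program can read column 4 of every record and know it holds the same variable.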

Coding
Guidelines for coding unstructured questions:
• Only a few (10% or less) of the responses should fall into the “other” category.
• Data should be coded to retain as much detail as possible.

Example: Suggestion for improvement
• Infra (1)
• Parking (2)
• Display (3)
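
The 10% guideline can be checked mechanically once responses have been categorized. The sketch below reuses the three codes listed for "Suggestion for improvement"; the catch-all code 9 for "other", the sample responses, and the assumption that a human coder has already reduced each verbatim answer to a category label are all illustrative.

```python
from collections import Counter

# Codes taken from the slide; OTHER = 9 is an assumed catch-all code.
CODES = {"infra": 1, "parking": 2, "display": 3}
OTHER = 9

def code_response(category):
    """Map a category label to its numeric code, falling back to 'other'."""
    return CODES.get(category.strip().lower(), OTHER)

def other_share(coded):
    """Fraction of coded responses that landed in the 'other' category."""
    return Counter(coded)[OTHER] / len(coded)

# Made-up category labels for ten respondents.
raw = ["Infra", "Parking", "display", "Parking", "Staff",
       "Infra", "Display", "Parking", "Infra", "Parking"]
coded = [code_response(r) for r in raw]
share = other_share(coded)
print(coded, share)
if share > 0.10:
    print("More than 10% fall into 'other'; consider adding categories.")
```

If the share exceeds the guideline, the usual remedy is to inspect the "other" answers and promote recurring themes to their own codes.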

Codebook
A codebook contains coding instructions and the necessary information about variables in the data set. A codebook generally contains the following information:
• column number
• record number
• variable number
• variable name
• question number
• instructions for coding
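
One way to picture a codebook is as a list of structured entries, one per variable, holding the six items above. The sketch below is an assumption about how such entries might be stored; the two example variables and their coding instructions are invented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CodebookEntry:
    column: str            # column number(s) the variable occupies, e.g. "1-3"
    record: int            # record number
    variable_number: int
    variable_name: str
    question_number: str   # empty when the variable is not tied to a question
    instructions: str      # instructions for coding

# Hypothetical entries for illustration only.
CODEBOOK = [
    CodebookEntry("1-3", 1, 1, "resp_id", "", "Respondent ID, 001 to 999"),
    CodebookEntry("4", 1, 2, "gender", "Q1", "1 = male, 2 = female, 9 = missing"),
]

def lookup(variable_name):
    """Return the codebook entry for a variable, or raise KeyError."""
    for entry in CODEBOOK:
        if entry.variable_name == variable_name:
            return entry
    raise KeyError(variable_name)

print(lookup("gender"))
```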

Data Transcription
Fig. 14.4

Raw Data

CATI/CAPI | Keypunching via CRT Terminal | Mark Sense Forms | Optical Scanning | Computerized Sensory Analysis

Verification: Correct Keypunching Errors

Computer Memory | Disks | Magnetic Tapes

Transcribed Data

Data Cleaning

Consistency Checks

Consistency checks identify data that are out of range, logically inconsistent, or have extreme values.
• Computer packages like SPSS, SAS, EXCEL and MINITAB can be programmed to identify out-of-range values for each variable and print out the respondent code, variable code, variable name, record number, column number, and out-of-range value.
• Extreme values should be closely examined.
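
The same kind of check can be sketched in a few lines of code. In the example below, the variable names, the valid ranges, the missing code 9, and the logical rule (a respondent who never visited should not have a satisfaction rating) are assumptions for illustration; a statistical package would report more detail, such as record and column numbers.

```python
# Assumed valid ranges for two hypothetical variables.
VALID_RANGES = {"visited": (0, 1), "satisfaction": (1, 7)}
MISSING = 9  # assumed standard missing-data code

def out_of_range(rows):
    """List (respondent code, variable name, value) for out-of-range values."""
    flagged = []
    for row in rows:
        for var, (low, high) in VALID_RANGES.items():
            value = row[var]
            if value != MISSING and not low <= value <= high:
                flagged.append((row["resp_id"], var, value))
    return flagged

def logically_inconsistent(rows):
    """Respondents who rate satisfaction although they report no visit."""
    return [row["resp_id"] for row in rows
            if row["visited"] == 0 and row["satisfaction"] != MISSING]

# Made-up data set.
rows = [
    {"resp_id": 1, "visited": 1, "satisfaction": 5},
    {"resp_id": 2, "visited": 1, "satisfaction": 8},   # 8 is outside 1-7
    {"resp_id": 3, "visited": 0, "satisfaction": 6},   # rated without visiting
    {"resp_id": 4, "visited": 0, "satisfaction": 9},   # correctly coded missing
]
range_errors = out_of_range(rows)
logic_errors = logically_inconsistent(rows)
print(range_errors)
print(logic_errors)
```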
Data Cleaning

Treatment of Missing Responses

• Substitute a Neutral Value – A neutral value, typically the mean response to the variable, is substituted for the missing responses.
• Substitute an Imputed Response – The respondent's pattern of responses to other questions is used to impute or calculate a suitable response to the questions with missing responses.
• In casewise deletion, cases, or respondents, with any missing responses are discarded from the analysis.
• In pairwise deletion, instead of discarding all cases with any missing values, the researcher uses only the cases or respondents with complete responses for each calculation.
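
The difference between these treatments is easiest to see on a small example. The sketch below uses made-up responses, with `None` standing for a missing response, and shows mean substitution, casewise deletion, and pairwise deletion. Imputation from a respondent's other answers depends on the model chosen and is not shown.

```python
from statistics import mean

# Made-up responses of four respondents to three questions; None = missing.
data = {
    "q1": [5, 3, None, 4],
    "q2": [2, None, 4, 4],
    "q3": [1, 2, 3, None],
}

def substitute_mean(values):
    """Neutral value: replace each missing response with the variable's mean."""
    m = mean([v for v in values if v is not None])
    return [m if v is None else v for v in values]

def casewise_deletion(data):
    """Discard every respondent with a missing response on any variable."""
    n = len(next(iter(data.values())))
    keep = [i for i in range(n)
            if all(col[i] is not None for col in data.values())]
    return {name: [col[i] for i in keep] for name, col in data.items()}

def pairwise_deletion(data, a, b):
    """For one calculation on a and b, keep respondents complete on both."""
    return [(x, y) for x, y in zip(data[a], data[b])
            if x is not None and y is not None]

q1_filled = substitute_mean(data["q1"])
complete = casewise_deletion(data)
pairs = pairwise_deletion(data, "q1", "q2")
print(q1_filled)   # [5, 3, 4, 4]
print(complete)    # {'q1': [5], 'q2': [2], 'q3': [1]}
print(pairs)       # [(5, 2), (4, 4)]
```

In this example casewise deletion leaves one respondent, while pairwise deletion keeps two for the q1/q2 calculation. That is the trade-off: pairwise deletion retains more data, but different calculations then rest on different subsets of respondents.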
Statistically Adjusting the Data

Weighting
• In weighting, each case or respondent in the database is assigned a weight to reflect its importance relative to other cases or respondents.
• Weighting is most widely used to make the sample data more representative of a target population on specific characteristics.
• Yet another use of weighting is to adjust the sample so that greater importance is attached to respondents with certain characteristics.

Statistically Adjusting the Data


Use of Weighting for Representativeness

Years of Education      Sample Percentage   Population Percentage   Weight

Elementary School
  0 to 7 years                 2.49                  4.23            1.70
  8 years                      1.26                  2.19            1.74

High School
  1 to 3 years                 6.39                  8.65            1.35
  4 years                     25.39                 29.24            1.15

College
  1 to 3 years                22.33                 29.42            1.32
  4 years                     15.02                 12.01            0.80
  5 to 6 years                14.94                  7.36            0.49
  7 years or more             12.18                  6.90            0.57

Totals                       100.00                100.00
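
The weights in the table are consistent with dividing each population percentage by the corresponding sample percentage. The sketch below recomputes them from the table's figures. The dictionary keys are shorthand labels introduced here, and reading the weight as population / sample is an inference from the numbers rather than something the slide states.

```python
# Percentages copied from the table; keys are shorthand labels.
sample = {
    "elem_0_7": 2.49, "elem_8": 1.26,
    "hs_1_3": 6.39, "hs_4": 25.39,
    "col_1_3": 22.33, "col_4": 15.02, "col_5_6": 14.94, "col_7_plus": 12.18,
}
population = {
    "elem_0_7": 4.23, "elem_8": 2.19,
    "hs_1_3": 8.65, "hs_4": 29.24,
    "col_1_3": 29.42, "col_4": 12.01, "col_5_6": 7.36, "col_7_plus": 6.90,
}

# Weight = population share / sample share, rounded as in the table.
weights = {k: round(population[k] / sample[k], 2) for k in sample}

# Applying the unrounded weights reproduces the population distribution.
weighted_total = sum(sample[k] * (population[k] / sample[k]) for k in sample)

for group, w in weights.items():
    print(f"{group:12s} {w:.2f}")
print(round(weighted_total, 2))
```

Groups that are underrepresented in the sample, such as respondents with 0 to 7 years of education, receive weights above 1, and overrepresented groups receive weights below 1.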



Selecting a Data Analysis Strategy


Fig. 14.5

Earlier Steps (1, 2, & 3) of the Marketing Research Process

Known Characteristics of the Data

Properties of Statistical Techniques

Background and Philosophy of the Researcher

Data Analysis Strategy



A Classification of Univariate Techniques


Fig. 14.6

Univariate Techniques

Metric Data
  One Sample
    * t test
    * Z test
  Two or More Samples
    Independent
      * Two-Group t test
      * Z test
      * One-Way ANOVA
    Related
      * Paired t test

Non-numeric Data
  One Sample
    * Frequency
    * Chi-Square
    * K-S
    * Runs
    * Binomial
  Two or More Samples
    Independent
      * Chi-Square
      * Mann-Whitney
      * Median
      * K-S
      * K-W ANOVA
    Related
      * Sign
      * Wilcoxon
      * McNemar
      * Chi-Square
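
The classification in Fig. 14.6 can be read as a lookup: the type of data, the number of samples, and (for two or more samples) whether the samples are independent or related together point to a set of candidate techniques. The sketch below encodes that reading. The key names and the function are hypothetical, and "Two-Group t test" assumes the figure's "Two-Group test" refers to the t test.

```python
# (data type, number of samples, relation between samples) -> candidates.
UNIVARIATE = {
    ("metric", "one", None): ["t test", "Z test"],
    ("metric", "two_or_more", "independent"):
        ["Two-Group t test", "Z test", "One-Way ANOVA"],
    ("metric", "two_or_more", "related"): ["Paired t test"],
    ("non_numeric", "one", None):
        ["Frequency", "Chi-Square", "K-S", "Runs", "Binomial"],
    ("non_numeric", "two_or_more", "independent"):
        ["Chi-Square", "Mann-Whitney", "Median", "K-S", "K-W ANOVA"],
    ("non_numeric", "two_or_more", "related"):
        ["Sign", "Wilcoxon", "McNemar", "Chi-Square"],
}

def candidate_techniques(data_type, samples, relation=None):
    """Return the techniques the figure lists for this combination."""
    return UNIVARIATE[(data_type, samples, relation)]

print(candidate_techniques("metric", "two_or_more", "related"))
print(candidate_techniques("non_numeric", "one"))
```

The lookup only narrows the choice. Picking among the candidates still depends on the considerations in Fig. 14.5, such as the known characteristics of the data and the properties of each technique.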

A Classification of Multivariate Techniques


Fig. 14.7

Multivariate Techniques

Dependence Technique
  One Dependent Variable
    * Cross-Tabulation
    * Analysis of Variance and Covariance
    * Multiple Regression
    * Conjoint Analysis
  More Than One Dependent Variable
    * Multivariate Analysis of Variance and Covariance
    * Canonical Correlation
    * Multiple Discriminant Analysis

Interdependence Technique
  Variable Interdependence
    * Factor Analysis
  Interobject Similarity
    * Cluster Analysis
    * Multidimensional Scaling
