Chap13 - Quantitative Data Analysis - Revised - Jan2021
WHAT IS DATA ANALYSIS?
Editing
The process of checking the completeness, consistency, and legibility of data and
making the data ready for coding and transfer to storage.
Item Nonresponse
The technical term for an unanswered question on an otherwise complete
questionnaire resulting in missing data.
Coding
The process of assigning a numerical score or other character symbol to
previously edited data.
Codes
Rules for interpreting, classifying, and recording data in the coding
process.
The actual numerical or other character symbols assigned to raw data.
Data File
The way a data set is stored electronically in spreadsheet-like form in
which the rows represent sampling units and the columns represent
variables.
QUANTIFYING DATA/CODING
Before we can do any kind of analysis, we need to
quantify our data
Managerial = 2, etc.
Classify the response: “Secretary” is “clerical” and is coded as “3”
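The classify-then-code step can be sketched in Python. This is only an illustration: the codes "managerial" = 2 and "clerical" = 3 and the "Secretary" example come from the slide, while the lookup table, the "Office Manager" entry, and the code 0 for unclassifiable answers are assumptions made for the example.

```python
# Codebook: category -> numeric code. "managerial" = 2 and "clerical" = 3 come
# from the slide; the code 0 for unclassifiable answers is an assumption.
CODEBOOK = {"managerial": 2, "clerical": 3}
UNCLASSIFIED = 0

# Classification rules: raw answer -> category (hypothetical, apart from "secretary").
CATEGORY_OF = {"secretary": "clerical", "office manager": "managerial"}


def code_response(raw_answer):
    """Classify a raw open-ended answer, then return its numeric code."""
    category = CATEGORY_OF.get(raw_answer.strip().lower())
    return CODEBOOK.get(category, UNCLASSIFIED)


coded = [code_response(r) for r in ["Secretary", "Office Manager", "Astronaut"]]
print(coded)
```

Keeping the classification rules separate from the codebook mirrors the two meanings of "codes" given earlier: the rules for classifying, and the symbols actually assigned.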
Two Basic Rules for Coding Categories:
Any given variable will have a specified set of answer choices and codes to match each answer choice.
For example, the variable gender will have three answer choices and codes for each: 1 for male, 2 for
female, and 0 for no answer. If you have a respondent coded as 6 for this variable, it is clear that an error
has been made since that is not a possible answer code. Possible-code cleaning is the process of checking
to see that only the codes assigned to the answer choices for each question (possible codes) appear in
the data file.
If you are not using a computer program that checks for coding errors during the data entry process, you
can locate some errors simply by examining the distribution of responses to each item in the data set.
For example, you could generate a frequency table for the variable gender and here you would see the
number 6 that was mis-entered. You could then search for that entry in the data file and correct it.
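Possible-code cleaning as described above can be sketched in a few lines of Python. The possible codes for gender (1, 2, 0) are taken from the text; the small data file and the helper names `frequency_table` and `impossible_entries` are hypothetical.

```python
from collections import Counter

# Possible codes for gender, as in the text: 1 = male, 2 = female, 0 = no answer.
POSSIBLE_CODES = {"gender": {0, 1, 2}}

# Hypothetical data file: one dict per respondent (row), keyed by variable (column).
data_file = [
    {"id": 1, "gender": 1},
    {"id": 2, "gender": 2},
    {"id": 3, "gender": 6},  # mis-entered code
    {"id": 4, "gender": 0},
    {"id": 5, "gender": 2},
]


def frequency_table(rows, variable):
    """Count how often each code appears for one variable."""
    return Counter(row[variable] for row in rows)


def impossible_entries(rows, variable, possible_codes):
    """Return (respondent id, code) pairs whose code is not a possible code."""
    return [
        (row["id"], row[variable])
        for row in rows
        if row[variable] not in possible_codes[variable]
    ]


freq = frequency_table(data_file, "gender")
errors = impossible_entries(data_file, "gender", POSSIBLE_CODES)
print(sorted(freq.items()))  # the frequency table exposes the stray code 6
print(errors)                # which respondent to look up and correct
```

The frequency table shows that an impossible code exists; the second helper tells you which row to go back to in the data file.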
• Data integrity is essential to successful research and
decision making.
F-Statistic 18.695
Significant 0.000
Adjusted R² 0.373
PARAMETRIC STATISTICAL TEST
In the literal meaning of the terms, a parametric statistical test is one that
makes assumptions about the parameters (defining properties) of the
population distribution(s) from which one's data are drawn.
Such tests generally also assume that one's measures derive from an
equal-interval scale (interval or ratio variables).
Variable: Purchase

Gender    Kolmogorov-Smirnov (a)        Shapiro-Wilk
          Statistic   df    Sig.        Statistic   df    Sig.
Male      0.105       127   0.002       0.984       127   0.141
Female    0.067       122   0.200*      0.986       122   0.223

*. This is a lower bound of the true significance.
a. Lilliefors Significance Correction
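The Sig. values in the normality output above can be read with a simple decision rule, sketched here in Python. The significance level of 0.05 is the conventional choice and is an assumption, since the output does not state one; the helper name `normality_rejected` is hypothetical.

```python
ALPHA = 0.05  # conventional significance level (assumed; not stated in the output)

# Sig. values copied from the normality output for Purchase.
sig_values = {
    ("Male", "Kolmogorov-Smirnov"): 0.002,
    ("Male", "Shapiro-Wilk"): 0.141,
    ("Female", "Kolmogorov-Smirnov"): 0.200,
    ("Female", "Shapiro-Wilk"): 0.223,
}


def normality_rejected(sig, alpha=ALPHA):
    """Null hypothesis: the data are normally distributed. Reject when sig < alpha."""
    return sig < alpha


decisions = {key: normality_rejected(sig) for key, sig in sig_values.items()}
for (group, test), rejected in decisions.items():
    verdict = "departs from normality" if rejected else "no evidence against normality"
    print(f"{group:6s} {test:19s} -> {verdict}")
```

Under this rule the two tests disagree for the male group, while both tests retain normality for the female group.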
Female Purchase: symmetric, bell-shaped curve, indicating normally distributed data.
Q-Q PLOT
Univariate Analysis
CS – Interval variable
Bivariate analysis.
Gender = Nominal
CS = Interval
Bivariate analysis.
Performance: Interval
Training: Nominal
SPSS steps: Analyze > Regression > Linear > move performance to the
Dependent box > move involvement and welfare to the Independent(s) box >
Statistics > tick Estimates, Model fit, Descriptives, and Collinearity
diagnostics > Continue > OK
SPSS output:
- Refer to the Model Summary table: adjusted R square = 0.45 indicates that about 45% of the variance in the
dependent variable (performance) can be explained by the independent variables.
- Refer to the ANOVA table: p-value = 0.00 (< 0.05) indicates that the regression model as a whole is statistically significant.
- Refer to the Coefficients table: the standardized coefficients of involvement (0.435, p = 0.00) and welfare (0.314, p = 0.00)
indicate that both are significantly related to performance.
Thus H6 and H7 are supported.
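To show how the adjusted R square and the ANOVA F-statistic relate to the plain R square, here is a minimal Python sketch of the standard formulas. The inputs (R square = 0.46, n = 100 respondents, k = 2 predictors) are hypothetical and are not taken from the SPSS output, which does not report the sample size.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Hypothetical inputs: two predictors (e.g. involvement and welfare), 100 cases.
r2, n, k = 0.46, 100, 2
adj = adjusted_r_squared(r2, n, k)
f = f_statistic(r2, n, k)
print(f"adjusted R^2 = {adj:.3f}, F({k}, {n - k - 1}) = {f:.2f}")
```

The adjustment penalises each added predictor, so adjusted R square is always slightly below R square when k is at least 1 and R square is below 1.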
LOGISTIC REGRESSION
a
Y
X1
b c
X2
MEDIATING VARIABLE
A mediating (intervening) variable is one that surfaces between the time
the independent variables start operating to influence the dependent
variable and the time their impact is felt on it.
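One common way to quantify mediation, which the slide itself does not specify, is the product-of-coefficients approach: estimate path a (X to M), path b (M to Y, controlling for X), and compare the total effect c with the direct effect c'. The sketch below assumes that approach, uses ordinary least squares, and runs on hypothetical scores; all names are illustrative.

```python
from statistics import mean


def cross_products(u, v):
    """Sum of cross-products of deviations from the means."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


def mediation_paths(x, m, y):
    """OLS path estimates for X -> M -> Y.

    a        slope of M on X
    b        slope of Y on M, controlling for X
    c        total effect: slope of Y on X alone
    c_prime  direct effect: slope of Y on X, controlling for M
    """
    sxx, smm = cross_products(x, x), cross_products(m, m)
    sxm, sxy, smy = cross_products(x, m), cross_products(x, y), cross_products(m, y)
    det = sxx * smm - sxm ** 2  # zero only if X and M are perfectly collinear
    a = sxm / sxx
    c = sxy / sxx
    b = (sxx * smy - sxm * sxy) / det
    c_prime = (smm * sxy - sxm * smy) / det
    return {"a": a, "b": b, "c": c, "c_prime": c_prime, "indirect": a * b}


# Hypothetical scores for independent (x), mediating (m), and dependent (y) variables.
x = [1, 2, 3, 4, 5, 6, 7, 8]
m = [2, 2, 4, 5, 5, 7, 8, 8]
y = [1, 3, 4, 4, 6, 7, 7, 9]
paths = mediation_paths(x, m, y)
print({name: round(value, 3) for name, value in paths.items()})
```

With least-squares estimates the total effect splits exactly into direct plus indirect effect (c = c' + a*b), which is why a large indirect share is read as evidence of mediation.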
Example
MODERATION
Y
X
M
MODERATORS
Moderating variable
A moderator is a qualitative (e.g., gender, race, class) or
quantitative (e.g., level of reward) variable that affects
the direction and/or strength of the relation between the
independent and dependent variables.
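A simple way to see a qualitative moderator at work is to estimate the X to Y slope separately within each moderator group and compare. The Python sketch below does this with hypothetical data in which gender moderates the strength of the relation; the variable values and the helper name `slope` are assumptions for illustration.

```python
from statistics import mean


def slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


# Hypothetical data: X = independent variable, Y = dependent variable,
# split by the moderator (gender).
data = {
    "male": {"x": [1, 2, 3, 4, 5], "y": [2, 4, 6, 8, 10]},
    "female": {"x": [1, 2, 3, 4, 5], "y": [3, 3.5, 4, 4.5, 5]},
}

slopes = {group: slope(d["x"], d["y"]) for group, d in data.items()}
print(slopes)
```

Clearly different slopes across groups suggest moderation; in practice the difference would be tested formally, for example with an interaction term in a regression.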
Example
7. STRUCTURAL EQUATION MODELLING (SEM)
Structural equation modelling (SEM) is used when a researcher is
faced with a set of interrelated variables, yet none of the other multivariate
techniques allows the researcher to address the issues. SEM is widely
used for the following:
Time Management
PLANNING YOUR ANALYSIS
Leave enough time for data entry and data formatting
Can take much longer than you expect
This will allow you to plan the proper levels and types of
analysis
If your research question requires a level of analysis
your variables won’t allow, you’ll need to transform
them
Create 'dummy' variables
Collapse categories
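Both transformations can be sketched in plain Python. The education categories, the grouping into "school" and "tertiary", and the helper names `collapse` and `make_dummies` are hypothetical and serve only to illustrate the idea.

```python
def collapse(values, mapping):
    """Recode detailed categories into broader ones; unmapped values pass through."""
    return [mapping.get(v, v) for v in values]


def make_dummies(values, reference):
    """Build one 0/1 indicator per category, leaving out the reference category."""
    categories = sorted(set(values) - {reference})
    return {f"is_{c}": [1 if v == c else 0 for v in values] for c in categories}


# Hypothetical nominal variable with five categories.
education = ["primary", "secondary", "diploma", "degree", "masters", "secondary"]

# Collapse five categories into two broader ones.
grouping = {
    "primary": "school",
    "secondary": "school",
    "diploma": "tertiary",
    "degree": "tertiary",
    "masters": "tertiary",
}
collapsed = collapse(education, grouping)

# Dummy-code the collapsed variable, with "school" as the reference category.
dummies = make_dummies(collapsed, reference="school")
print(collapsed)
print(dummies)
```

Leaving out one reference category avoids perfect collinearity when the dummies are used as predictors in a regression.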