c11 - Quantitative Data Analysis and Interpretation

This chapter discusses quantitative data analysis and interpretation tools and methods. It covers editing raw data to detect errors, coding data, categorizing data files, and testing data goodness and normality. Statistical analysis methods like t-tests, ANOVA, and MANOVA are discussed. Assumptions for parametric tests like the t-test are outlined, including random sampling, normality, and homogeneity of variance. Interpreting computer results and recommendations is also mentioned.

Uploaded by

Nur Amirah

CHAPTER 11

QUANTITATIVE DATA ANALYSIS AND INTERPRETATION
Research Methodology: Tools, Methods and Techniques

Sundram, V.P.K., Chandran, V.G.R., Atikah, S.B., Rohani, M., Nazura, M.S., Akmal, A.O., & Krishnasamy, T.
Learning Objectives
After completing this chapter, you should be able to:
 Understand the importance of editing the collected raw data to detect errors and omissions
 Set up the coding key for the data set and code the data
 Categorize data and create data files
 Get a ‘feel’ for the data
 Test the goodness of data
 Understand the use of content analysis to interpret and summarize open questions
 Understand the problems and solutions for “don’t know” responses
 Understand the options for data entry and manipulation
 Interpret the computer results and prepare recommendations based on the quantitative data analysis

Table of Contents
11.1 DATA DIAGNOSIS AND TREATMENT
11.2 APPROPRIATE STATISTICAL ANALYSIS
11.3 INTERPRETING SELECTED DATA ANALYSIS

11.1 DATA DIAGNOSIS AND TREATMENT

11.1.1 Missing Data
 Missing data are a serious problem in multivariate
analysis, and it is important to have some tools for
dealing with them.
 A single missing value for a variable can cause either
the variable or the case to be excluded.
 When dealing with missing data, you may leave the
cell blank or assign value codes. If you choose the
latter, then a number of rules apply:
 Missing value codes must be of the same data type as the
data they represent.
 Missing value codes cannot occur as data in the data set.
 By convention, the code is usually built from the digit 9
(for example, 9 or 99, depending on the width of the variable).
Dealing With Missing Data
 Case substitution – a case with missing data is replaced by another
case that is not yet in the sample. An example of this is to replace a
sampled country with another country not yet included in the sample.
 Mean substitution – missing data points are replaced with the mean
value of the variable. This is done by computing the variable's mean
from the available cases and using it to fill in the missing values on
the remaining cases.
 Cold deck substitution – the missing value is replaced by a constant
value from an external source (for example, from a previous survey).
 Regression substitution – the missing value is predicted from its
relationship with other variables. This method works best if you have
strong relationships and a moderate amount of missing data.
 Multiple methods – a composite estimate based on several methods.
For example, if you have multiple linear relationships, you could
estimate the variable value from regressions on many different
variables and take the mean of those estimates.
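Mean substitution, as described above, can be sketched in a few lines. This is a minimal illustration only: the missing-value code 99, the function name mean_substitute and the age data are assumptions made for the example and do not come from the chapter.

```python
from statistics import mean

# Hypothetical missing-value code; it must never occur as real data.
MISSING = 99


def mean_substitute(values, missing_code=MISSING):
    """Replace missing entries with the mean of the available cases."""
    observed = [v for v in values if v is not None and v != missing_code]
    if not observed:
        raise ValueError("no observed values to compute a mean from")
    fill = mean(observed)
    return [fill if (v is None or v == missing_code) else v for v in values]


# Illustrative ages: one coded as missing, one left blank (None).
ages = [23, 31, MISSING, 27, None, 35]
filled = mean_substitute(ages)
print(filled)
```

Note that mean substitution leaves the variable's mean unchanged but shrinks its variance, which is one reason to compare it with the other methods in the list.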
11.1.2 Outliers
 An outlier is a value that lies outside the normal
range of data.
 In a box and whisker plot, the data values for the outliers are
plotted individually, and identifiers may be provided for
interesting values.
 Box and whisker plots are particularly useful for
comparing group categories (e.g., men versus
women) or several variables (e.g., relative
importance levels of product attributes).


Boxplot Components
[Figure: annotated boxplot. Labels: largest observed value, upper hinge, median, lower hinge, smallest observed value, whiskers, outside values or outliers.]
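One common way to decide which points a boxplot draws as outside values is the 1.5 × IQR rule. The slides do not state which rule is used, so the rule, the function name tukey_outliers and the sample scores below are all assumptions; the sketch only shows the idea.

```python
from statistics import quantiles


def tukey_outliers(values, k=1.5):
    """Return values lying more than k * IQR beyond the quartiles."""
    q1, _median, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1                             # interquartile range
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Illustrative scores with one extreme value.
scores = [12, 14, 15, 15, 16, 17, 18, 19, 45]
outliers = tukey_outliers(scores)
print(outliers)
```

Statistical packages may compute the hinges slightly differently, so borderline points can be classified differently from one tool to another.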


11.1.3 Normality Tests
 The assumption of normality is a prerequisite for many
inferential statistical techniques.
 There are a number of different ways to explore this
assumption, graphically and statistically:
 Histogram
 Stem-and-leaf plot
 Box plot
 Normal probability plot
 Detrended normal plot
 Skewness
 Kurtosis
 Kolmogorov-Smirnov statistic, with a Lilliefors significance level,
and the Shapiro-Wilk statistic
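Skewness and kurtosis can be computed directly from the data. The sketch below uses the simple moment-based definitions, which is an assumption: statistical packages usually report bias-corrected versions, so their output will differ slightly on small samples. The function name and data are illustrative.

```python
from statistics import fmean


def skewness_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal curve)."""
    n = len(values)
    m = fmean(values)
    m2 = sum((v - m) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - m) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - m) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


skew, kurt = skewness_and_excess_kurtosis([1, 2, 3, 4, 5])
print(skew, kurt)
```

Values close to zero on both measures are consistent with normality; a symmetric but flat set of scores like the one above has zero skewness and negative excess kurtosis.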
11.1.4 Feel of Data

 Mean – the mean of a quantitative variable is defined as the sum of
all entries divided by their number.
 Variance – the square of the standard deviation.
 Standard deviation – the most widely used measure of dispersion for
quantitative data. It permits very sophisticated descriptions of
various distributions.
 Median – another measure of central tendency for quantitative
variables. It is defined as the value that sits right in the middle of
all data entries when they are listed in ascending order.
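The four measures above can be obtained with Python's standard statistics module. The scores are made up for illustration, and the sample (n - 1) versions of the variance and standard deviation are used, which is an assumption about the intended definition.

```python
from statistics import mean, median, stdev, variance

# Illustrative scores, not taken from the chapter.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "mean": mean(scores),          # sum of entries divided by their number
    "median": median(scores),      # middle value of the sorted entries
    "variance": variance(scores),  # sample variance (divides by n - 1)
    "std_dev": stdev(scores),      # square root of the sample variance
}
print(summary)
```

With an even number of entries, as here, the median is the average of the two middle values.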
11.1.5 Goodness of Fit
 Reliability – established by testing for both
consistency and stability.
 Consistency indicates how well the items measuring a
concept hang together as a set. Another measure of
consistency reliability used in specific situations is the
split-half reliability coefficient.
 The stability measure can be assessed through:
 parallel-form reliability – established when a high correlation
between two similar forms of a measure is obtained
 test-retest reliability – established when a group of people
(preferably 30 or more) complete the questionnaire twice, with a
reasonable time period (e.g. a week) between the completions, and
the two sets of scores correlate highly.
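A widely used index of how well items "hang together" is Cronbach's alpha. The slides do not name it, so treating it as the consistency measure here is an assumption; the function name and the questionnaire scores are illustrative.

```python
from statistics import variance


def cronbach_alpha(items):
    """items: one list of scores per questionnaire item, same respondents in each."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # total score per respondent
    item_var_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Three items answered by five respondents (made-up data).
items = [
    [4, 3, 5, 2, 4],
    [5, 3, 4, 2, 4],
    [4, 2, 5, 3, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3))
```

The closer alpha is to 1, the more consistently the items measure the same concept.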
 Validity
 Factorial validity – established by submitting the data
for factor analysis. The results of factor analysis (a
multivariate technique) will confirm whether or not the
theorized dimensions emerge.
 Criterion-related validity – established by testing for
the power of the measure to differentiate individuals
who are known to be different.

 Convergent validity – established when there is a high
degree of correlation between two different sources
responding to the same measure.
Example: both supervisors and subordinates respond in a similar way to a
perceived reward system measure administered to them.
 Discriminant validity – established when two distinctly
different concepts are not correlated with each other.
Example: courage and honesty; leadership and motivation; attitudes and behaviour.

11.2 APPROPRIATE STATISTICAL ANALYSIS

11.2.1 Parametric
 A t-test is used to determine whether a set or sets
of scores are from the same population.
 Three main types of t-test may be applied:
 One sample
 Independent groups
 Repeated measures
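As an illustration of the independent groups case, the t statistic can be computed by hand. This sketch assumes equal variances in the two groups (the pooled-variance form); the function name and the two groups of scores are made up.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two groups."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / std_error, df


t_value, df = independent_t([5, 6, 7, 8, 9], [1, 2, 3, 4, 5])
print(t_value, df)
```

The resulting t value is then compared with the critical value of the t distribution for the given degrees of freedom, which statistical software reports as the significance (p) value.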



11.2.2 Assumption Testing
 Each statistical test has certain assumptions that must be
met prior to analysis.
 These assumptions need to be evaluated, because the
accuracy of test interpretation depends on whether
assumptions have been violated.
 The generic assumptions underlying all types of t-test are:
1. Scale of Measurement – the data should be at the interval or
ratio level of measurement.
2. Random sampling – the scores should be randomly sampled
from the population of interest.
3. Normality – the scores should be normally distributed in the
population.
 Additional assumptions by test:
 Paired-samples t-test (dependent t-test)
   Normality of population difference scores
 One-sample t-test
   No additional assumptions listed
 Independent t-test
   Independence of groups
   Homogeneity of variance
 MANOVA
   Cell sizes
   Univariate and multivariate normality
   Linearity
   Homogeneity of regression
   Homogeneity of variance-covariance matrices
   Multicollinearity and singularity
 One-way ANOVA
   Independence of groups
   Homogeneity of variance
 Pearson correlation
   Related pairs
   Linearity
   Homoscedasticity
 Multiple regression
   Ratio of cases to independent variables
   Outliers
   Multicollinearity and singularity
   Normality, linearity, homoscedasticity and independence of residuals
11.2.3 Non-parametric Test
 Wilcoxon – used when you would otherwise use a repeated measures or
paired t-test, that is, when the same participants perform under each
level of the independent variable.
 Friedman – used to compare two or more related samples; it is
equivalent to the repeated measures or within-subjects ANOVA.
 Mann-Whitney – tests the hypothesis that two independent samples come
from populations having the same distribution. This test is equivalent
to the independent groups t-test.
 Kruskal-Wallis – equivalent to the one-way between-groups ANOVA, and
thus allows us to examine possible differences between two or more groups.
 Spearman rho – a non-parametric alternative to the parametric bivariate
correlation (Pearson's r).
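Spearman's rho illustrates what these tests have in common: they work on ranks rather than raw scores. The sketch below assumes there are no tied values, which keeps the ranking step and the shortcut formula simple; the function names and data are illustrative.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho from squared rank differences (no-ties formula)."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


rho = spearman_rho([10, 20, 30, 40, 50], [1, 3, 2, 5, 4])
print(rho)
```

When ties are present, statistical packages assign average ranks and compute Pearson's r on the ranks instead of using this shortcut.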


11.3 INTERPRETING SELECTED DATA ANALYSIS


11.3.1 Interpretation of Descriptive Analysis
 Descriptive statistics are used to describe, examine
and summarize the main features of the collected
data quantitatively.

 Model case 1

The table below shows the descriptive statistics from a study of the
relationship between economic fundamentals and the money supply.
Table 1: Results of Descriptive Analysis (Money Supply)

Variable                 Mean        Std Dev   Max         Min
Money Supply             38180.5     1580      43235.5     33862.7
Inflation (CPI)          101.2       3.5       112.5       88.9
Government Debt          11264.58    1871      11578.86    9435.17
National Income (GDP)    20359.8     2002      21802.6     19189.3


(i) Mean
 The mean measures central tendency as the arithmetic average of the
scores. To compute the mean, all the values are added up and divided
by the number of values.
 Money supply has a maximum of RM43,235.50 and a minimum of
RM33,862.70, with a mean of RM38,180.50.
 Inflation (CPI) has a maximum score of 112.5% and a minimum score of
88.9%, with an average score of 101.2%.
 Government debt lies within the range of RM9,435.17 to RM11,578.86,
with a mean of RM11,264.58.
 National income has a minimum score of RM19,189.30 and a maximum of
RM21,802.60, with a mean of RM20,359.80.
(ii) Standard deviation
 The standard deviation is the square root of the variance and
provides an index of variability in the distribution of scores.
 The standard deviations for money supply, inflation, government
debt and national income are RM1,580, 3.5%, RM1,871 and RM2,002
respectively.
 In the case of the inflation variable, the standard deviation is
3.5/101.2, or 3.46% of the mean, which can be considered small. For
the government debt variable, on the other hand, the standard
deviation is 16.61% of the mean (1871/11264.58), which is perceived
as a large deviation.
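The ratio of the standard deviation to the mean used above is often called the coefficient of variation (a name the slides do not use). A short sketch, with the means and standard deviations copied from Table 1 and the variable names table1 and cv chosen for illustration, computes it for all four variables:

```python
# (mean, standard deviation) pairs copied from Table 1.
table1 = {
    "Money Supply": (38180.5, 1580),
    "Inflation (CPI)": (101.2, 3.5),
    "Government Debt": (11264.58, 1871),
    "National Income (GDP)": (20359.8, 2002),
}

# Standard deviation expressed as a share of the mean.
cv = {name: sd / mean for name, (mean, sd) in table1.items()}

for name, value in cv.items():
    print(f"{name}: {value:.2%}")
```

Expressing dispersion relative to the mean makes variables measured on very different scales, such as an index and a currency amount, directly comparable.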
11.3.2 Interpretation of Correlation Analysis
 The correlation analysis determines whether and to
what degree a relationship exists between two or
more quantifiable variables.
 For example, it is used to measure the relationship
strength between the dependent and independent
variables.

 Model case 1

The table below shows the correlation coefficients for the
variables average employee salary, total expenditure and number of
employees.
Table 2: Results of Correlation Analysis (Firm's Employees in SME Malaysia)

                                                Average           Total          Number of
                                                Employee Salary   expenditure    Employee
Average Employee Salary   Pearson Correlation   1                 0.539**        0.293**
                          Sig (2 tailed)                          0.000          0.034
Total expenditure         Pearson Correlation   0.539**           1              0.373**
                          Sig (2 tailed)        0.000                            0.000
Number of Employee        Pearson Correlation   0.293**           0.373**        1
                          Sig (2 tailed)        0.034             0.000
** correlations are significant
(i) Definition of correlation coefficient
 Correlation is used to look at the 'net strength' of the relationship
between two continuous variables (Sweet and Martin, 2008).
 A correlation coefficient shows the direction, strength, and
significance of the bivariate relationship between variables
that were measured at an interval or ratio level.
 There could be a perfect positive correlation between two
variables, represented by 1.0 (plus 1) or a perfect negative
correlation, which would be -1.0 (minus 1).
 It does not tell us which variable causes which, but it tells us that
the two variables are associated with each other.

(ii) Explanation of the study’s correlation analysis
 There is a moderate positive correlation (r = 0.539), or
substantial relationship, between average employee salary and
total expenditure. This relationship is significant at the 0.01
level.
 Average employee salary has a low correlation (r = 0.293), a
definite but small relationship, with number of employees. With
p = 0.034, this relationship is significant at the 0.05 level but
not at the 0.01 level.
 Total expenditure and number of employees also have a low
correlation (r = 0.373), a definite but small relationship, which
is significant at the 0.01 level.


Correlation Strength Based on Guilford’s Law

r              Strength of relationship
< 0.20         Almost negligible relationship
0.20 – 0.40    Low correlation; definite but small relationship
0.40 – 0.70    Moderate correlation; substantial relationship
0.70 – 0.90    High correlation; marked relationship
> 0.90         Very high correlation; very dependable relationship
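The table above can be turned into a small lookup function. The bands in the table share their end points (0.40 and 0.70 each appear in two rows), so this sketch assumes a boundary value belongs to the higher band, except that exactly 0.90 stays in the "high" band because the last row reads "> 0.90". The function name is illustrative.

```python
def guilford_strength(r):
    """Describe the strength of a correlation coefficient using the table above."""
    size = abs(r)  # strength ignores the direction of the relationship
    if size < 0.20:
        return "Almost negligible relationship"
    if size < 0.40:
        return "Low correlation; definite but small relationship"
    if size < 0.70:
        return "Moderate correlation; substantial relationship"
    if size <= 0.90:
        return "High correlation; marked relationship"
    return "Very high correlation; very dependable relationship"


# Coefficients taken from Table 2.
for r in (0.539, 0.293, 0.373):
    print(r, "->", guilford_strength(r))
```

The label describes only the size of the coefficient; whether the relationship is statistically significant must still be read from the Sig (2 tailed) row.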


11.3.3 Interpretation of Regression Analysis
 Regression analysis is used to measure what percentage of the
variance in the dependent variable can be explained by the
independent variables.

 Model case 1

The table below shows the result of regression analysis of
four independent variables regressed against customer
satisfaction.
Table 3: Results of Regression Analysis (Customer Satisfaction)

                    Unstandardized Coefficients   Standardized Coefficients
Model               B        Std. Error           Beta                        t       Sig
(Constant)          1.483    .290                                             5.114   .000
Product Quality     .235     .069                 .277                        3.432   .001
Customer Service    .024     .082                 .026                        .285    .776
Pricing             .198     .076                 .223                        2.620   .010
Promotion           .351     .080                 .161                        1.977   .025

F value             9.349
Sig                 .000
Adjusted R²         .181
R²                  .203
(i) Model fit / Coefficient of determination (R²)
 R² indicates the percentage of variance in the dependent variable
that is explained by the variation in the independent variables.
 The R² of 0.203 implies that the independent variables together
explain 20.3 percent of the variance in the dependent variable.
 The remaining 79.7 percent of the variance in the dependent
variable is not explained by the independent variables in this
study. This indicates that there are other independent variables,
not included in this study, that could further strengthen the
regression equation.


(ii) Adjusted R²
 The adjusted R² is a modification of R² that penalizes the addition
of independent variables (IVs) to the model.
 The adjusted R² of 0.181 indicates that, after this penalty is
applied, the independent variables explain about 18.1 percent of
the variance in the dependent variable.

(iii) Model significance
 The F-test (F = 9.349) is significant, with a p-value reported as
0.000 (that is, p < 0.001). Hence the independent variables, taken
together, significantly explain the dependent variable.
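The adjusted R² and the F value can both be computed from R², the number of cases n and the number of independent variables k. The slides do not report the sample size, so n = 152 below is purely an assumption, chosen because it approximately reproduces the figures in Table 3; the function names are illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared for n cases and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic of the regression, computed from R-squared."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


N_ASSUMED = 152  # NOT given in the slides; hypothetical sample size
R2, K = 0.203, 4  # R-squared and number of predictors from Table 3

adj = adjusted_r2(R2, N_ASSUMED, K)
f_value = f_statistic(R2, N_ASSUMED, K)
print(round(adj, 3), round(f_value, 2))
```

The formula shows why the adjusted R² is always lower than R² once predictors are added: each extra independent variable reduces n - k - 1 and so increases the penalty.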


(iv) Parameter significance (t-test)
 Product quality has a p-value of 0.001 (0.1%), which is below the 5%
significance level. Product quality is therefore significant, and its
positive coefficient indicates that it is positively related to the
dependent variable.
 Customer service is not significant: its p-value of 0.776 (77.6%) is
above the 5% significance level. There is therefore no statistically
significant evidence that customer service is related to the
dependent variable.
 Pricing has a p-value of 0.010 (1%), which is below the 5%
significance level. Pricing is therefore significant, and its positive
coefficient indicates that it is positively related to the dependent
variable.
 Promotion is significant with a p-value of 0.025 (2.5%), which is
below the 5% significance level. Its positive coefficient indicates
that promotion is positively related to the dependent variable.
(v) Unstandardized Beta Coefficients
 These are the values in the regression equation used for predicting
the dependent variable from the independent variables.
 The B column provides the values of β0, β1, β2 and so on for this
equation.
 Customer Satisfaction = 1.483 + 0.235 Product Quality + 0.024
Customer Service + 0.198 Pricing + 0.351 Promotion
 For each one-unit increase in product quality, customer satisfaction
will increase by 0.235 units, holding the other independent variables
constant.
 For each one-unit increase in customer service, customer satisfaction
will increase by 0.024 units, holding the other independent variables
constant (although this coefficient is not statistically significant).
 For each one-unit increase in pricing, customer satisfaction will
increase by 0.198 units, holding the other independent variables
constant.
 For each one-unit increase in promotion, customer satisfaction will
increase by 0.351 units, holding the other independent variables
constant.
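The regression equation above can be written as a small prediction function. The intercept and coefficients are copied from Table 3; the function name, the argument names and the input scores of 3 and 4 are illustrative assumptions.

```python
INTERCEPT = 1.483
COEFFICIENTS = {
    "product_quality": 0.235,
    "customer_service": 0.024,
    "pricing": 0.198,
    "promotion": 0.351,
}


def predict_satisfaction(product_quality, customer_service, pricing, promotion):
    """Predicted customer satisfaction from the unstandardized equation."""
    return (
        INTERCEPT
        + COEFFICIENTS["product_quality"] * product_quality
        + COEFFICIENTS["customer_service"] * customer_service
        + COEFFICIENTS["pricing"] * pricing
        + COEFFICIENTS["promotion"] * promotion
    )


# Raising product quality by one unit, holding the other variables constant.
base = predict_satisfaction(3, 3, 3, 3)
higher_quality = predict_satisfaction(4, 3, 3, 3)
print(round(higher_quality - base, 3))
```

The difference between the two predictions equals the product quality coefficient, which is exactly what "holding the other independent variables constant" means.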
(vi) Standardized Beta Coefficients
 The beta uses a standard unit that is the same for all variables in
the equation.
 It conveys the same information as the unstandardized coefficient but
is expressed in standard deviations.
 As product quality increases by one standard deviation, customer
satisfaction increases by 0.277 of a standard deviation.
 As customer service increases by one standard deviation, customer
satisfaction increases by 0.026 of a standard deviation.
 As pricing increases by one standard deviation, customer satisfaction
increases by 0.223 of a standard deviation.
 As promotion increases by one standard deviation, customer
satisfaction increases by 0.161 of a standard deviation.
 Therefore, the strongest predictor is product quality, with a beta
weight of 0.277. The second is pricing, with a beta weight of 0.223.
The weakest significant predictor is promotion, with a beta weight of
0.161. Customer service does not explain the variance in customer
satisfaction significantly.
(vii) Recommendation
 The company should ensure that employees continuously produce
high-quality products in order to maintain customer satisfaction.
 The company needs to set competitive prices and run effective
promotions to attract customers.

(viii) Future research
 Future studies should include other variables that may contribute
to customer satisfaction.
 Future studies could also examine moderating and mediating
variables that may influence the relationship between the
independent variables and the dependent variable.