Chapter 10: Data Analysis - Quantitative

Quantitative Analysis
Descriptive Statistics
• Numeric data collected in a research project can be analyzed quantitatively using statistical tools in two different ways.
• Descriptive analysis refers to statistically describing, aggregating, and presenting the constructs of interest or the associations between these constructs.
• Inferential analysis refers to the statistical testing of hypotheses (theory testing).
• Much of today's quantitative data analysis is conducted using statistical software packages such as SPSS, SAS, or WEKA.
• Readers are advised to familiarize themselves with one of these programs to follow the concepts described in this discussion.
Data Preparation
• In research projects, data may be collected from a variety of sources:
• Mail-in surveys,
• Interviews,
• Pretest or posttest experimental data,
• Observational data, and so forth.
• These data must be converted into a machine-readable, numeric format, such as a spreadsheet or a text file, so that they can be analyzed by computer programs like MS Excel, SPSS, SAS, or WEKA.
Data preparation usually involves the following steps.
Data Coding
• Coding is the process of converting data into numeric
format.
• A codebook should be created to guide the coding
process.
• A codebook is a comprehensive document containing:
• A detailed description of each variable in the research study,
• The items or measures for that variable,
• The format of each item (numeric, text, etc.),
• The response scale for each item (i.e., whether it is measured on a nominal, ordinal, interval, or ratio scale, and whether that scale is a five-point, seven-point, or some other type of scale), and
• How to code each value into a numeric format.
For Instance,
• If we have a measurement item on a
seven-point Likert scale with anchors
ranging from “strongly disagree” to
“strongly agree”, we may code that item as
1 for strongly disagree, 4 for neutral, and
7 for strongly agree.
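A minimal sketch of such a coding rule in Python (the intermediate labels between the two anchors are assumptions for illustration, not taken from the study):

```python
# Hypothetical coding scheme for a seven-point Likert item, following the rule
# "1 = strongly disagree ... 4 = neutral ... 7 = strongly agree".
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,           # assumed intermediate label
    "somewhat disagree": 3,  # assumed intermediate label
    "neutral": 4,
    "somewhat agree": 5,     # assumed intermediate label
    "agree": 6,              # assumed intermediate label
    "strongly agree": 7,
}

def code_response(label: str) -> int:
    """Convert a verbal Likert response into its numeric code."""
    return LIKERT_CODES[label.strip().lower()]

print(code_response("Strongly Agree"))  # -> 7
```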
Data Entry
• Coded data can be entered into a spreadsheet,
database, text file, or directly into a statistical
program like SPSS.
• Most statistical programs provide a data editor for
entering data.
• However, these programs store data in their own
native format (e.g., SPSS stores data as .sav files),
which makes it difficult to share that data with
other statistical programs.
• Hence, it is often better to enter data into a spreadsheet or database, where they can be reorganized as needed and shared across programs, and where subsets of the data can be extracted for analysis.
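As a rough sketch (the file name and column below are hypothetical), data entered into a spreadsheet and exported as CSV can be pulled into an analysis environment such as Python's pandas:

```python
import pandas as pd

# Hypothetical file exported from the spreadsheet of coded responses.
df = pd.read_csv("survey.csv")

print(df.head())    # inspect the first few records
print(df.dtypes)    # check that each variable was read in the intended format

# Extract a subset of the data for analysis (assumes an 'age' column exists).
adults = df[df["age"] >= 18]
```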
Univariate Analysis
• Univariate Analysis, or analysis of a single
variable, refers to a set of statistical
techniques that can describe the general
properties of one variable.
Univariate statistics include:
1. Frequency distribution,
2. Central tendency, and
3. Dispersion.
i) Frequency Distribution
• The frequency distribution of a variable is a summary of
the frequency (or percentages) of individual values or
ranges of values for that variable.
For Instance,
• We can measure how often respondents in a sample attend religious services (as a measure of their "religiosity") using a categorical scale with the values:
• Never,
• Once per year,
• Several times per year,
• About once a month,
• Several times per month,
• Several times per week, and
• An optional category for "did not answer."
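A small sketch of how such a frequency distribution could be tabulated with pandas (the responses below are made-up illustrations, not the study's data):

```python
import pandas as pd

# Hypothetical coded responses for the "religiosity" item.
responses = pd.Series([
    "Never", "Once per year", "Several times per year", "Never",
    "About once a month", "Several times per week", "Never",
    "Several times per month", "Once per year", "Did not answer",
])

# Frequency distribution: counts and percentages of each category.
counts = responses.value_counts()
percents = (responses.value_counts(normalize=True) * 100).round(1)
print(pd.DataFrame({"count": counts, "percent": percents}))
```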
Figure 10.1: Frequency Distribution
• With very large samples where observations are independent and random, the frequency distribution tends to resemble a bell-shaped curve (a smoothed bar chart of the frequency distribution).
• Most observations are clustered toward the center of the range of values, with fewer and fewer observations toward the extreme ends of the range.
• Such a curve is called a normal distribution.
ii. Central Tendency
• It is an estimate of the center of a distribution of
values.
• There are three major estimates of central
tendency: Mean, Median, and Mode.
• The arithmetic mean (often simply called
the “mean”) is the simple average of all
values in a given distribution.
• Consider a set of eight test scores: 15, 22, 21, 18, 36, 15, 25, and 15.
• The arithmetic mean of these values is (15 + 22 + 21 + 18 + 36 + 15 + 25 + 15)/8 = 167/8 = 20.875.
• Two other measures of central tendency are the Geometric Mean (the nth root of the product of the n values in a distribution) and the Harmonic Mean (the reciprocal of the arithmetic mean of the reciprocals of the values in a distribution).
• However, these types of means are not popular in the statistical analysis of social research data.
• GM = (X1 × X2 × ... × Xn)^(1/n)
• HM = n / (1/X1 + 1/X2 + ... + 1/Xn)
The Median
• The second measure of central tendency, the
median, is the middle value within a range of
values in a distribution.
• This is computed by sorting all values in a
distribution in increasing order and
selecting the middle value.
• In case there are two middle values (i.e., there is an even number of values in the distribution), the average of the two middle values represents the median.
• In the above example, the sorted values are: 15, 15, 15, 18, 21, 22, 25, 36.
• The two middle values are 18 and 21, and hence the median is (18 + 21)/2 = 19.5.
The Mode
• Lastly, the mode is the most frequently
occurring value in a distribution of values.
• In the previous example, the most
frequently occurring value is 15, which is
the mode of the above set of test scores.
• Note that any value estimated from a sample, such as the mean, median, mode, or any of the other estimates discussed later, is called a statistic.
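These three estimates can be reproduced for the test scores above with Python's standard statistics module:

```python
import statistics

scores = [15, 22, 21, 18, 36, 15, 25, 15]

print(statistics.mean(scores))    # arithmetic mean -> 20.875
print(statistics.median(scores))  # middle of the sorted scores -> 19.5
print(statistics.mode(scores))    # most frequent value -> 15
```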
iii) Dispersion
• Dispersion refers to the way values are
spread around the central tendency, for
example, how tightly or how widely the
values are clustered around the mean.
• Two common measures of dispersion
are:
• the range and
• standard deviation.
a) Range
• The range is the difference between the
highest and lowest values in a distribution.
• The range in our previous example is 36 - 15 = 21.
• If the maximum value were raised to 85 in the above distribution while the other values remained the same, the range would be 85 - 15 = 70.
• The range is therefore highly sensitive to such extreme values (outliers).
b) Standard Deviation
• The standard deviation, the second measure of dispersion, corrects for such outliers by using a formula that takes into account how close or how far each value is from the distribution mean:
• σ = √( Σ (xi - μ)² / n )
where
• σ is the standard deviation,
• xi is the ith observation (or value),
• μ is the arithmetic mean,
• n is the total number of observations, and
• Σ means summation across all observations.
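A short sketch computing the range and the standard deviation defined above for the same test scores (pstdev divides by n, matching the population formula with μ and n):

```python
import statistics

scores = [15, 22, 21, 18, 36, 15, 25, 15]

value_range = max(scores) - min(scores)   # range: 36 - 15 = 21
sigma = statistics.pstdev(scores)         # population standard deviation (divides by n)
variance = statistics.pvariance(scores)   # variance = sigma ** 2

print(value_range, round(sigma, 2), round(variance, 2))
```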
c) Variance
• The square of the standard deviation is called the variance of a distribution.
• In a normally distributed frequency distribution, it is seen that:
• 68% of the observations lie within one standard deviation of the mean (μ ± 1σ),
• 95% of the observations lie within two standard deviations (μ ± 2σ), and
• 99.7% of the observations lie within three standard deviations (μ ± 3σ).
Bivariate Analysis
• Bivariate Analysis examines how two
variables are related to each other.
• The most common bivariate statistic is the
bivariate correlation (often, simply called
“correlation”), which is a number between
-1 and +1 denoting the strength of the
relationship between two variables.
• Let's say that we wish to study how age is related to self-esteem in a sample of 20 respondents, i.e., as age increases, does self-esteem increase, decrease, or remain unchanged?
• If self-esteem increases, then we have a
positive correlation between the two variables,
• if self-esteem decreases, we have a negative
correlation, and
• if it remains the same, we have a zero
correlation.
• To calculate the value of this correlation, consider
the hypothetical dataset shown below.
Table: Hypothetical Data on Age and Self-Esteem
• The bivariate (Pearson) correlation is computed as:
• rxy = Σ (xi - x̄)(yi - ȳ) / ((n - 1) · sx · sy)
• where rxy is the correlation, x̄ and ȳ are the sample means of x and y, and sx and sy are the standard deviations of x and y.
• The manually computed value of correlation
between age and self-esteem, using the
above formula is 0.79.
• This figure indicates that age has a
strong positive correlation with self-
esteem, i.e., self-esteem tends to
increase with increasing age, and
decrease with decreasing age.
• A bivariate scatter plot of this data plots self-esteem on the vertical axis against age on the horizontal axis.
• This plot roughly resembles an upward sloping line (i.e., positive slope), which is also indicative of a positive correlation.
• If the two variables were negatively correlated, the scatter plot would slope downward (negative slope), implying that an increase in age would be related to a decrease in self-esteem and vice versa.
• If the two variables were uncorrelated, the scatter plot would approximate a horizontal line (zero slope), implying that an increase in age would have no systematic bearing on self-esteem.
• To test whether this correlation is statistically significant, we set up the hypotheses:
• H0: r = 0
• H1: r ≠ 0
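Since the original table of 20 respondents is not reproduced here, the sketch below uses made-up age and self-esteem values; the computed r will therefore differ from the 0.79 reported above, but the procedure (correlation plus a test of H0: r = 0) is the same:

```python
from scipy import stats

# Hypothetical age and self-esteem scores (illustrative values only).
age         = [21, 25, 28, 32, 35, 38, 42, 45, 50, 55]
self_esteem = [2.8, 3.0, 3.2, 3.1, 3.5, 3.6, 3.8, 3.7, 4.0, 4.2]

r, p_value = stats.pearsonr(age, self_esteem)
print(f"r = {r:.2f}, p = {p_value:.4f}")
# A p-value below 0.05 would lead us to reject H0: r = 0.
```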
Cross-Tab/Contingency Table
• A cross-tabulation (cross-tab) presents the joint frequency counts of two categorical variables. In the example discussed below, the grades (A, B, C) of 20 students are cross-tabulated by gender (male, female) in a 2 x 3 table.
Is this pattern real or "statistically significant"?
• In other words, do the above frequency counts differ from what may be expected from pure chance?
• To answer this question, we compute the expected count of observations in each cell of the 2 x 3 cross-tab matrix.
• This is done by multiplying the marginal column total and the marginal row total for each cell and dividing it by the total number of observations.
For Example,
• For the male/A grade cell, expected count = (row total × column total) / grand total = (10 × 5)/20 = 2.5.
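A sketch of the same calculation in Python; the observed counts are hypothetical (they only respect the marginal totals used above: 10 males, 5 A grades, 20 students in all):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2 x 3 observed counts (rows: male, female; columns: A, B, C).
observed = np.array([
    [3, 4, 3],
    [2, 5, 3],
])

chi2, p_value, dof, expected = chi2_contingency(observed)
print(expected)            # expected counts; male/A cell = (10 * 5) / 20 = 2.5
print(chi2, p_value, dof)  # chi-square statistic, p-value, degrees of freedom
```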
Inferential Statistics
• Inferential statistics are the statistical
procedures that are used to reach conclusions
about associations between variables.
• They differ from descriptive statistics in that
they are explicitly designed to test hypotheses.
• Numerous statistical procedures fall in this
category, most of which are supported by
modern statistical software such as SPSS and
SAS.
• Readers are advised to consult a formal text on
statistics or take a course on statistics for
more advanced procedures.
Testing for Significance
• Having formulated the hypothesis, the next step is to test its validity at a certain level of significance.
• The confidence with which a null hypothesis is accepted or rejected depends upon the significance level.
• A significance level of, say, 5% means that the risk of making a wrong decision is 5%: the researcher is likely to be wrong in accepting a false hypothesis or rejecting a true hypothesis in 5 out of 100 occasions.
• A significance level of, say, 1% means that the researcher runs the risk of being wrong in accepting or rejecting the hypothesis in only one out of every 100 occasions.
• Therefore, a 1% significance level provides greater confidence in the decision than a 5% significance level.
One-Tailed and Two-Tailed Tests
• A hypothesis test may be one-tailed or two-tailed.
a) One-Tailed Test
• In a one-tailed test, the test statistic leads to rejection of the null hypothesis only when it falls in one specified tail of the sampling distribution curve.
Example 2
• A tyre company claims that the mean life of its new tyre is 15,000 km. If the concern is only whether the tyres fall short of this claim, the alternative hypothesis is one-sided (H1: μ < 15,000 km), and the entire rejection region lies in one tail.
b) Two-Tailed Test
• A two-tailed test is one in which the test statistic leads to rejection of the null hypothesis when it falls in either tail of the sampling distribution curve.
a) Degrees of Freedom
• Degrees of freedom are the number of values in a calculation that are free to vary. For example, if (a + b)/2 = 5 and we fix a = 3, then b has to be 7; only one value was free to vary, so the degrees of freedom is 1.
b) Select the Test Criterion
• If the hypothesis pertains to a large sample (30 or more), the Z-test is used.
• When the sample is small (less than 30), the t-test is used.
c) Carry Out the Computation
• Compute the value of the chosen test statistic from the sample data.
d) Make the Decision
• Compare the computed value of the test statistic with its critical (table) value at the chosen level of significance, and accept or reject the null hypothesis accordingly.
Assumptions of Parametric and Non-Parametric Tests
1) Observations in the population are normally distributed.
2) Observations in the population are independent of each other.
3) The population should possess homogeneous characteristics.
4) Samples should be drawn using simple random sampling techniques.
5) To use the t-test, the sample size should be less than 30.
6) To use the F-test, the sample size should be less than 30.
7) To use the Z-test, the sample size should be more than 30.
8) To use the chi-square test, the minimum expected frequency in each cell should be 5.
a) Parametric Tests
• Parametric tests are more powerful.
• The data used in these tests are derived from interval and ratio measurements.
• In parametric tests, it is assumed that the data follow a normal distribution. Examples of parametric tests are (a) the Z-test, (b) the t-test, and (c) the F-test.
• Observations must be independent, i.e., the selection of any one item should not affect the chances of any other item being included in the sample.
b) Non-Parametric Tests
• Non-parametric tests are used to test hypotheses with nominal and ordinal data.
• They do not make assumptions about the shape of the population distribution.
• These are distribution-free tests.
• The hypothesis of a non-parametric test is concerned with something other than the value of a population parameter.
• Non-parametric tests are easy to compute. There are certain situations, particularly in marketing research, where the assumptions of parametric tests (e.g., that the data follow a normal distribution) are not valid; in such cases, non-parametric tests are used.
• Examples of non-parametric tests are:
a) Binomial test
b) Chi-square test
c) Mann-Whitney U test
d) Sign test
Binomial Test
• A binomial test is used when the population has only two classes, such as male/female, buyers/non-buyers, success/failure, etc.
• All observations made about the population must fall into one of the two classes.
• The binomial test is used when the sample size is small.
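A minimal sketch with hypothetical counts, testing whether the proportion of one class (say, buyers) differs from an assumed 50%:

```python
from scipy.stats import binomtest

# Hypothetical small sample: 9 buyers out of 15 respondents.
# H0: the true proportion of buyers is 0.5.
result = binomtest(k=9, n=15, p=0.5, alternative="two-sided")
print(result.pvalue)  # a large p-value gives no evidence against H0
```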
Advantages of Non-Parametric Tests
i) They are quick and easy to use.
Disadvantages of Non-Parametric Tests
• Non-parametric tests involve a greater risk of accepting a false hypothesis and thus committing a Type II error.
Examples of Parametric Tests
• T-Test (Parametric Test)
• The t-test is used in the following circumstances:
• When the sample size is small (n < 30), and
• When the population standard deviation is not known.
Example:
• A certain pesticide is packed into bags by a machine. A random sample of 10 bags is drawn, and their contents (in kg) are found to be: 50, 49, 52, 44, 45, 48, 46, 45, 49, 45. Confirm whether the average packaging weight can be taken to be 50 kg.
• Here the sample size is less than 30 and the population standard deviation is not known, so a one-sample t-test is appropriate.
• The t-test can also be used to find out whether there is a significant difference between two sample means, i.e., whether two population means are equal.
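A sketch of this one-sample t-test using the bag weights given in the example:

```python
from scipy import stats

# Contents (kg) of the 10 sampled bags from the example above.
weights = [50, 49, 52, 44, 45, 48, 46, 45, 49, 45]

# One-sample t-test of H0: population mean = 50 kg.
t_stat, p_value = stats.ttest_1samp(weights, popmean=50)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# If p < 0.05, we reject the claim that the average packaging weight is 50 kg.
```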
Illustration
• There are two nourishment programmes, 'A' and 'B'. Two groups of children are subjected to these programmes, and their weights are measured after six months.
• The first group of children, subjected to programme 'A', weighed 44, 37, 48, 60, and 41 kg at the end of the programme.
• The second group of children, subjected to programme 'B', weighed 42, 42, 58, 64, 64, 67, and 62 kg at the end of the programme.
• From the above, can we conclude that nourishment programme 'B' increased the weight of the children significantly, at a 5% level of significance?
Solution
• H0: the two programmes lead to the same mean weight; H1: programme 'B' leads to a higher mean weight.
• Here, n1 = 5 and n2 = 7.
• Mean of group A: X̄ = (44 + 37 + 48 + 60 + 41)/5 = 230/5 = 46.
• Mean of group B: Ȳ = ΣY/n2 = 399/7 = 57.

Nourishment Programme A              Nourishment Programme B
X     X - X̄    (X - X̄)²              Y     Y - Ȳ    (Y - Ȳ)²
44     -2         4                  42    -15        225
37     -9        81                  42    -15        225
48      2         4                  58      1          1
60     14       196                  64      7         49
41     -5        25                  64      7         49
                                     67     10        100
                                     62      5         25
Σ(X - X̄)² = 310                      Σ(Y - Ȳ)² = 674

• Pooled variance: S² = [Σ(X - X̄)² + Σ(Y - Ȳ)²] / (n1 + n2 - 2) = (310 + 674)/10 = 98.4.
• t = (X̄ - Ȳ) / √(S² (1/n1 + 1/n2)) = (46 - 57) / √(98.4 × (1/5 + 1/7)) = -11/5.81 = -1.89.
• With n1 + n2 - 2 = 10 degrees of freedom, the one-tailed critical value of t at the 5% level is about 1.81. Since |-1.89| > 1.81, we reject H0 and conclude that nourishment programme 'B' increased the weight of the children significantly.
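The hand computation above can be checked with a pooled-variance (equal variances assumed) two-sample t-test in scipy:

```python
from scipy import stats

programme_a = [44, 37, 48, 60, 41]
programme_b = [42, 42, 58, 64, 64, 67, 62]

# Independent two-sample t-test with pooled variance (equal_var=True),
# matching the hand computation above; t should come out near -1.89.
t_stat, p_value = stats.ttest_ind(programme_a, programme_b, equal_var=True)
print(f"t = {t_stat:.2f}, two-tailed p = {p_value:.4f}")
# For the one-sided question ("did programme B increase weight?"),
# halve the two-tailed p-value or, in newer scipy, pass alternative="less".
```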
Analysis of Variance (ANOVA)
a) ANOVA
• ANOVA is a statistical technique used to test the equality of three or more sample means.
• Based on the sample means, an inference is drawn as to whether the samples belong to the same population or not.
b) Conditions for using ANOVA
1. Data should be quantitative in nature.
(d) Compare the value of F obtained above in (c) with the critical value of F at, say, the 5% level of significance for the applicable degrees of freedom.
Example: ANOVA is useful, for instance, for testing whether several training methods, machines, or advertising campaigns lead to the same mean outcome.
Two-Way ANOVA
• The procedure followed to calculate the variance is the same as for the one-way classification. An example of a two-way classification ANOVA is as follows:
Example:
• A firm has four types of machines: A, B, C, and D. It has put four of its workers on each machine for a specified period, say one week. At the end of the week, the average output of each worker on each type of machine was calculated. These data are given below:
Average production by type of machine
Worker      A    B    C    D
Worker 1   25   26   23   28
Worker 2   23   22   24   27
Worker 3   27   30   26   32
Worker 4   29   34   27   33
The firm is interested in knowing:
(a) whether the mean productivity is the same for the four types of machines, and
(b) whether the four workers differ with respect to mean productivity.
A sketch of the computation for this two-way layout follows.
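A sketch of the two-way computation under the usual assumptions for this layout (one observation per worker-machine cell, no interaction term); the F-ratios for machines and for workers are built from sums of squares in the same spirit as the one-way case:

```python
import numpy as np
from scipy import stats

# Rows = workers 1-4, columns = machines A-D (data from the table above).
data = np.array([
    [25, 26, 23, 28],
    [23, 22, 24, 27],
    [27, 30, 26, 32],
    [29, 34, 27, 33],
], dtype=float)

r, c = data.shape
grand_mean = data.mean()

# Sums of squares for a two-way layout without replication.
ss_rows  = c * ((data.mean(axis=1) - grand_mean) ** 2).sum()   # between workers
ss_cols  = r * ((data.mean(axis=0) - grand_mean) ** 2).sum()   # between machines
ss_total = ((data - grand_mean) ** 2).sum()
ss_error = ss_total - ss_rows - ss_cols

ms_rows  = ss_rows / (r - 1)
ms_cols  = ss_cols / (c - 1)
ms_error = ss_error / ((r - 1) * (c - 1))
df_error = (r - 1) * (c - 1)

f_machines = ms_cols / ms_error
f_workers  = ms_rows / ms_error
print(f"F(machines) = {f_machines:.2f}, p = {stats.f.sf(f_machines, c - 1, df_error):.4f}")
print(f"F(workers)  = {f_workers:.2f}, p = {stats.f.sf(f_workers, r - 1, df_error):.4f}")
```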
Illustration
• Company 'X' wants its employees to undergo three different types of training programme with a view to obtaining improved productivity. After completion of the training, 16 new employees are assigned at random to the three training methods, and their production performance is recorded. The training manager's problem is to find out whether there are any differences in the effectiveness of the training methods. The data recorded are as under:
Daily output of new employees
Method 1: 15, 18, 19, 22, 11
Method 2: 22, 27, 18, 21, 17
Method 3: 18, 24, 19, 16, 22, 15
The computation proceeds in the following steps:
1. Calculate the mean of each sample.
2. Calculate the grand mean of all observations.
3. Calculate the between-column (between-samples) variance: Σ ni(x̄i - grand mean)² / (k - 1), where k is the number of samples.
4. Calculate each sample variance using the formula: si² = Σ(X - x̄i)² / (ni - 1).
5. Calculate the within-column variance as a weighted average of the sample variances: Σ [(ni - 1)/(N - k)] si², where N = n1 + n2 + n3, so that N - k = (n1 + n2 + n3 - 3) is its degrees of freedom.
6. Compute the F-ratio = between-column variance / within-column variance and compare it with the critical value of F.
Solution
1. Sample values, totals, and means:

         Method 1   Method 2   Method 3
            15         22         18
            18         27         24
            19         18         19
            22         21         16
            11         17         22
                                  15
Total       85        105        114
Mean        17         21         19
2. Grand mean = (85 + 105 + 114)/16 = 304/16 = 19.

3. Between-column variance:

n    Sample mean   Grand mean   Deviation   Deviation²   n × Deviation²
5        17            19          -2           4          5 × 4 = 20
5        21            19           2           4          5 × 4 = 20
6        19            19           0           0          6 × 0 = 0

Between-column variance = Σ n(Deviation)² / (k - 1) = (20 + 20 + 0)/(3 - 1) = 40/2 = 20.
4. Sample variances (sum of squared deviations of each observation from its own sample mean):
Training Method 1: Σ(X - x̄1)² = 70, so s1² = 70/(5 - 1) = 17.5
Training Method 2: Σ(X - x̄2)² = 62, so s2² = 62/(5 - 1) = 15.5
Training Method 3: Σ(X - x̄3)² = 60, so s3² = 60/(6 - 1) = 12
5. Within-column variance (weighted average of the sample variances):
   = (4/13) × 17.5 + (4/13) × 15.5 + (5/13) × 12 = (70 + 62 + 60)/13 = 192/13 ≈ 14.77.

6. F = between-column variance / within-column variance = 20/14.77 = 1.354.

7. Degrees of freedom of the numerator = (3 - 1) = 2; degrees of freedom of the denominator = (16 - 3) = 13.

8. The critical value of F(2, 13) at the 5% level of significance is 3.81. This is the upper limit of the acceptance region. Since the calculated value 1.354 lies within it, we accept H0, the null hypothesis.
Conclusion:
• Since the calculated F-value (1.354) is less than the table value (3.81), there is no significant difference in the effectiveness of the three training methods. A quick verification of this computation is sketched below.
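The same one-way ANOVA can be verified with scipy:

```python
from scipy import stats

method_1 = [15, 18, 19, 22, 11]
method_2 = [22, 27, 18, 21, 17]
method_3 = [18, 24, 19, 16, 22, 15]

# One-way ANOVA; F should come out near the 1.354 computed by hand above.
f_stat, p_value = stats.f_oneway(method_1, method_2, method_3)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
# A p-value above 0.05 means we do not reject H0: no significant difference
# in the effectiveness of the three training methods.
```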
Statistical Approaches
Regression Analysis
• Regression analysis is a statistical procedure for analyzing associative relationships between a metric dependent variable and one or more independent variables.
• It can be used in the following ways:
• Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.
• Determine how much of the variation in the dependent variable can be explained by the independent variables: the strength of the relationship.
• Determine the structure or form of the relationship: the mathematical equation relating the independent and dependent variables.
• Predict the values of the dependent variable.
• Control for other independent variables when evaluating the contribution of a specific variable or set of variables.
Bivariate Regression
• Bivariate regression is a procedure for deriving a mathematical relationship, in the form of an equation, between a single metric dependent variable and a single metric independent variable.
• This analysis is similar in many ways to determining the simple correlation between two variables.
• However, because an equation has to be derived, one variable must be identified as the dependent variable and the other as the independent variable.
• For Example,
• Can variation in sales be explained in terms of variation in advertising expenditures? What is the structure and form of this relationship, and can it be modeled mathematically by an equation describing a straight line?
• Can the variation in market share be accounted for by the size of the sales force?
• Are consumers' perceptions of quality determined by their perceptions of price?
Bivariate Regression Model
• The basic regression equation is:
• Yi = β0 + β1Xi + ei
• where Yi is the dependent (criterion) variable, Xi is the independent (predictor) variable, β0 is the intercept of the line, β1 is the slope of the line, and ei is the error term associated with the ith observation.
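A minimal sketch of fitting such a line; the advertising and sales figures below are hypothetical, used only to illustrate the mechanics:

```python
from scipy import stats

# Hypothetical data: advertising expenditure (X) and sales (Y).
advertising = [10, 12, 15, 17, 20, 22, 25, 27]
sales       = [44, 47, 55, 60, 68, 70, 79, 82]

fit = stats.linregress(advertising, sales)
print(f"intercept (b0) = {fit.intercept:.2f}")
print(f"slope (b1)     = {fit.slope:.2f}")
print(f"R-squared      = {fit.rvalue ** 2:.3f}")

# Predicted sales at a new advertising level of 30 (illustrative).
print(fit.intercept + fit.slope * 30)
```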
Multiple Regression
• Multiple regression is a technique that simultaneously develops a mathematical relationship between two or more independent variables and an interval-scaled dependent variable.
• For Example,
• Can variation in sales be explained in terms of variation in advertising expenditure, prices, and level of distribution?
• Can variation in market share be accounted for by the size of the sales force, advertising expenditure, and sales promotion budget?
• Are consumers' perceptions of quality determined by their perceptions of prices, brand image, and brand attributes?
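A sketch of estimating such a model by ordinary least squares with numpy; the predictors (advertising and price) and all values are again hypothetical:

```python
import numpy as np

# Hypothetical data: sales explained by advertising expenditure and price.
advertising = np.array([10, 12, 15, 17, 20, 22, 25, 27], dtype=float)
price       = np.array([9.0, 8.5, 8.0, 8.2, 7.5, 7.0, 6.8, 6.5])
sales       = np.array([44, 47, 55, 60, 68, 70, 79, 82], dtype=float)

# Design matrix with a leading column of ones for the intercept b0.
X = np.column_stack([np.ones_like(advertising), advertising, price])
coeffs, *_ = np.linalg.lstsq(X, sales, rcond=None)

b0, b1, b2 = coeffs
print(f"sales = {b0:.2f} + {b1:.2f} * advertising + {b2:.2f} * price")
```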