Correlation Analysis and Regression 22
ARMANDO C. MANZANO
Correlation
◦ The coefficient of correlation, denoted by ρ (the Greek
letter rho) for a population or 𝑟 for a sample, measures the
strength and direction of the linear relationship between
x and y. Its range is
−𝟏 ≤ 𝒓 ≤ +𝟏
◦ If y tends to increase when x increases, 𝑟 is positive. If y
tends to decrease when x increases, 𝑟 is negative. If there is
no linear relationship between x and y, then 𝒓 = 𝟎.
▪ Only concerned with strength of the relationship
▪ No causal effect is implied
Bivariate data
Data sets in which each subject has two
observations associated with it.
TYPES
POSITIVE CORRELATION – exists when high scores in
one variable are associated with high scores in the second
variable or low scores in one variable are associated with
low scores in the other
NEGATIVE CORRELATION – exists when high scores in
one variable are associated with low scores in the second
or vice versa.
ZERO CORRELATION – exists when the points on the
scatter diagram are spread in a random manner.
PERFECT CORRELATION – exists when all points lie on a
straight line.
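The types above can be checked numerically. The following is a minimal sketch, assuming the standard raw-sum formula for the sample correlation coefficient; the function name `pearson_r` and the three small data sets are made up for illustration and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient computed from raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data: points on a rising line, a falling line, and a symmetric U shape
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
r_zero = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_positive, r_negative, r_zero)
```

The third data set follows a curve, yet its coefficient is zero, which is a reminder that 𝑟 only measures linear association.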
THE STRENGTH OR DEGREE OF THE
RELATIONSHIP IS BASED ON THE FOLLOWING
RANGES OF THE CORRELATION COEFFICIENT:
Patterns of Scatter Diagrams…
◦ Linearity and Direction are two concepts we are interested in
[Figure: scatter diagrams of y versus x]
SCATTER PLOT EXAMPLES
[Figure: scatter plot of y versus x showing no relationship]
CORRELATION COEFFICIENT
$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} $$
◦ where:
r = Sample correlation coefficient
n = Sample size
x = Value of the independent variable
y = Value of the dependent variable
◦ This formula is the Pearson Product Moment Coefficient of Correlation.
CALCULATION EXAMPLE
Tree Height (y)   Trunk Diameter (x)   xy      y²       x²
35                8                    280     1225     64
49                9                    441     2401     81
27                7                    189     729      49
33                6                    198     1089     36
60                13                   780     3600     169
21                7                    147     441      49
45                11                   495     2025     121
51                12                   612     2601     144
Σy = 321          Σx = 73              Σxy = 3142   Σy² = 14111   Σx² = 713
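CALCULATION EXAMPLE
$$ r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}} $$
$$ = \frac{8(3142) - (73)(321)}{\sqrt{\left[8(713) - (73)^2\right]\left[8(14111) - (321)^2\right]}} $$
$$ = 0.886 $$
[Figure: scatter plot of Tree Height, y (axis 0 to 70) against Trunk Diameter, x (axis 0 to 14)]
r = 0.886 → relatively strong positive
linear association between x and y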
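The hand calculation can be reproduced in a few lines. This is a minimal sketch that applies the same raw-sum formula to the tree height and trunk diameter data from the table; the variable names are chosen here for illustration.

```python
import math

# Data from the calculation example
height = [35, 49, 27, 33, 60, 21, 45, 51]   # y
diameter = [8, 9, 7, 6, 13, 7, 11, 12]      # x

n = len(height)
sum_x = sum(diameter)
sum_y = sum(height)
sum_xy = sum(x * y for x, y in zip(diameter, height))
sum_x2 = sum(x * x for x in diameter)
sum_y2 = sum(y * y for y in height)

# Numerator and denominator of the correlation formula
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(sum_x, sum_y, sum_xy, sum_x2, sum_y2)
print(round(r, 3))
```

The printed sums should match the totals row of the table, and the rounded coefficient should match the value obtained by hand.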
Example: The data below summarize midterm grades and
final grades. Let us try to predict a student's final
grade from a given midterm grade.
Let x = midterm grade
    y = final grade
x   75  70  65  90  85  85  80  70  65  90
y   80  75  65  95  90  85  90  75  70  90
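One way to approach this example is sketched below. The slides do not show the fitting method at this point, so the sketch assumes an ordinary least-squares line ŷ = a + bx, with slope b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and intercept a = ȳ − b·x̄. The midterm grade of 80 used for the prediction is a hypothetical input, not a value taken from the slides.

```python
import math

# Data from the example
midterm = [75, 70, 65, 90, 85, 85, 80, 70, 65, 90]   # x
final = [80, 75, 65, 95, 90, 85, 90, 75, 70, 90]     # y

n = len(midterm)
sum_x, sum_y = sum(midterm), sum(final)
sum_xy = sum(x * y for x, y in zip(midterm, final))
sum_x2 = sum(x * x for x in midterm)
sum_y2 = sum(y * y for y in final)

# Correlation coefficient (same formula as in the tree example)
numerator = n * sum_xy - sum_x * sum_y
r = numerator / math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))

# Assumed least-squares line: y_hat = a + b * x
b = numerator / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n

# Hypothetical midterm grade used only to illustrate a prediction
predicted_final = a + b * 80

print(round(r, 3), round(b, 3), round(a, 3), round(predicted_final, 2))
```

A prediction like this is only reasonable for midterm grades inside the observed range (65 to 90 here).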
EXERCISE
Identify the correlation given a pair of variables
INTRODUCTION TO REGRESSION ANALYSIS
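COEFFICIENT OF DETERMINATION, R²
The coefficient of determination is the proportion of the total
variation in y that is explained by the linear relationship with x.
For a regression with a single independent variable:
R² = r²
where:
R² = Coefficient of determination
r = Simple correlation coefficient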
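The relationship between R² and r can be checked on the tree data from the earlier calculation example. The sketch below computes R² in two ways: by squaring r, and as 1 − SSE/SST from a fitted line. The least-squares fit and the names `sse` and `sst` are assumptions introduced here for illustration.

```python
import math

# Tree data from the calculation example
diameter = [8, 9, 7, 6, 13, 7, 11, 12]      # x
height = [35, 49, 27, 33, 60, 21, 45, 51]   # y

n = len(diameter)
sum_x, sum_y = sum(diameter), sum(height)
sum_xy = sum(x * y for x, y in zip(diameter, height))
sum_x2 = sum(x * x for x in diameter)
sum_y2 = sum(y * y for y in height)

# Way 1: square the simple correlation coefficient
numerator = n * sum_xy - sum_x * sum_y
r = numerator / math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r2_from_r = r ** 2

# Way 2: explained share of variation from an assumed least-squares line
b = numerator / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * sum_x / n
mean_y = sum_y / n
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(diameter, height))  # unexplained variation
sst = sum((y - mean_y) ** 2 for y in height)                         # total variation
r2_from_fit = 1 - sse / sst

print(round(r2_from_r, 3), round(r2_from_fit, 3))
```

Both values agree, so roughly 79% of the variation in tree height is explained by its linear relationship with trunk diameter in this sample.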
EXAMPLES OF APPROXIMATE R² VALUES
◦ R² = 1 [Figure: scatter plots of y versus x]
◦ 0 < R² < 1 [Figure: scatter plot of y versus x]
◦ R² = 0: No linear relationship between x and y [Figure: scatter plot of y versus x]