Chapter 3
[Figure: scatter diagrams illustrating types of relationship between two variables. Surviving panel titles: Strong negative; Weak negative relationship; No relationship]
Eg: This table shows the test scores for finance and mathematics tests for seven students in a faculty.

Finance      62  65  72  80  85  86  90
Mathematics  40  55  60  77  80  82  88

Plot a scatter diagram and determine whether there is a relationship between finance and mathematics test scores.

[Scatter diagram: Mathematics (vertical axis, 0 to 100) against Finance (horizontal axis, 0 to 100)]

There is a strong positive relationship between finance and mathematics test scores.
Linear Correlation Coefficient

Pearson's product moment correlation coefficient (r)
To measure the strength of the relationship between 2 quantitative variables
-1 ≤ r ≤ 1

Spearman's rank correlation coefficient (ρ)
To measure the strength of the relationship between 2 qualitative variables (variables that can be ranked)
For quantitative data, the data must be ranked first; the correlation coefficient is then calculated from these rankings (less accurate)
-1 ≤ ρ ≤ 1

The sign (-) or (+) of r / ρ identifies the kind of relationship between the 2 variables
The value of r / ρ describes the strength of the relationship
If r or ρ is close to -1, there is a strong negative relationship between the 2 variables
If r or ρ is close to 1, there is a strong positive relationship between the 2 variables
If r or ρ is close to 0, there is little or no linear relationship between the 2 variables
The strength and direction of the correlation coefficient
-1.00 0 1.00
r / ρ ≤ -0.7 : strong negative correlation
-0.69 ≤ r / ρ ≤ -0.5 : moderate negative correlation
-0.49 ≤ r / ρ ≤ -0.1 : weak negative correlation
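The bands above can be turned into a small lookup. The sketch below is illustrative only: the slide lists just the negative bands, so the mirrored positive bands, the treatment of values between the listed bands, and the label for values closer to 0 than 0.1 are all assumptions. The function name `describe_correlation` is hypothetical.

```python
def describe_correlation(value):
    """Label a correlation coefficient (r or rho) by strength and direction.

    Assumptions not stated in the source: positive bands mirror the
    negative ones, gaps between the listed bands are closed by using
    cut-offs at 0.7, 0.5 and 0.1, and anything nearer to 0 than 0.1 is
    labelled "little or no correlation".
    """
    if not -1 <= value <= 1:
        raise ValueError("a correlation coefficient must lie between -1 and 1")

    size = abs(value)
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.5:
        strength = "moderate"
    elif size >= 0.1:
        strength = "weak"
    else:
        return "little or no correlation"

    direction = "negative" if value < 0 else "positive"
    return f"{strength} {direction} correlation"


print(describe_correlation(-0.85))
print(describe_correlation(-0.3))
```

Working with the absolute value keeps the cut-offs in one place; the sign is only used to pick the direction.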
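Pearson's product moment correlation coefficient:

$$ r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}} $$

Spearman's rank correlation coefficient:

$$ \rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} $$

where n is the number of pairs of observations and d_i is the difference between the two ranks of the i-th pair.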
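As a rough illustration, both coefficients can be computed for the finance and mathematics scores from the earlier example. This is a minimal sketch in plain Python; the function names are hypothetical, and the simple ranking helper assumes there are no tied values (true for this data set).

```python
import math

# Test scores of the seven students from the example table.
finance = [62, 65, 72, 80, 85, 86, 90]
mathematics = [40, 55, 60, 77, 80, 82, 88]


def pearson_r(x, y):
    """Pearson's r from the sums of x, y, xy, x^2 and y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = math.sqrt(
        (sum_x2 - sum_x ** 2 / n) * (sum_y2 - sum_y ** 2 / n)
    )
    return numerator / denominator


def ranks(values):
    """Rank 1 is the smallest value. Assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rho from the squared rank differences d_i^2."""
    n = len(x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


r = pearson_r(finance, mathematics)
rho = spearman_rho(finance, mathematics)
print(f"r = {r:.3f}, rho = {rho:.3f}")
```

For this data r comes out at roughly 0.98, in line with the strong positive relationship seen in the scatter diagram. The two sets of ranks are identical, so every d_i is 0 and ρ equals 1.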
Regression line / equation

An equation that represents the linear relationship between 2 variables
The accurate method: Least squares method (LSM)

The general form of the simple linear regression equation:
y = a + bx

x = independent variable
y = dependent variable
a = y-intercept (when x is 0 units, y is 'a' units)
b = slope of the line (when x increases by 1 unit, y changes by b units; it increases if b is positive)

[Figure: graph of the line y = a + bx, with intercept a and slope b marked]
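Find the regression line by using these formulas:

Regression line: y = a + bx

$$ b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}} $$

$$ a = \frac{\sum y}{n} - b\,\frac{\sum x}{n} $$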
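The least squares formulas for a and b can be checked numerically. The sketch below applies them to the finance (x) and mathematics (y) scores from the earlier example; treating finance as the independent variable, the function name `least_squares` and the finance score of 75 used for the prediction are assumptions made for illustration.

```python
# Test scores of the seven students from the example table.
finance = [62, 65, 72, 80, 85, 86, 90]      # x, taken here as independent
mathematics = [40, 55, 60, 77, 80, 82, 88]  # y, taken here as dependent


def least_squares(x, y):
    """Return (a, b) of the least squares line y = a + bx."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi * xi for xi in x)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x ** 2 / n)
    a = sum_y / n - b * sum_x / n
    return a, b


a, b = least_squares(finance, mathematics)
print(f"y = {a:.3f} + {b:.3f}x")

# Hypothetical use: estimated mathematics score for a finance score of 75.
estimate = a + b * 75
print(f"estimate at x = 75: {estimate:.1f}")
```

With these scores the slope is about 1.57 and the intercept about -52.1. The intercept falls far outside the observed range of finance scores, so it is best read as a fitting constant rather than a realistic score.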
Coefficient of determination (R²)

The proportion of the total variation of Y that is explained by the regression line using X
R² = r²
The higher the value of R², the more helpful the X variable is in predicting Y

Eg: Examination score and time taken to revise statistics lesson
r = 0.891
R² = r²
   = (0.891)²
   = 0.7939

Comment: 79.39% of the total variation in examination score is explained by the regression line using the time taken to revise the statistics lesson. The remaining 20.61% is explained by other factors.
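The arithmetic in this example can be reproduced in a few lines. This is only a check of the figures quoted above; the variable names are hypothetical.

```python
r = 0.891  # correlation between examination score and revision time

r_squared = r ** 2                        # coefficient of determination
explained = round(r_squared * 100, 2)     # % of variation explained by the line
unexplained = round(100 - explained, 2)   # % left to other factors

print(f"R^2 = {r_squared:.4f}")
print(f"explained: {explained}%, other factors: {unexplained}%")
```

Squaring r discards its sign, so R² reports only how much variation is explained, not the direction of the relationship.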