
Chapter 3

Chapter 4 discusses correlation and regression analysis, explaining the relationships between independent and dependent variables. It details methods for determining correlation, including scatter diagrams and correlation coefficients, and introduces regression equations to model these relationships. The chapter emphasizes the importance of understanding the strength and direction of correlations to predict outcomes effectively.

CHAPTER 4

Correlation and Regression


By
Noor Fadhilah Mohd Ramlan
Introduction
Two variables can be related to each other in different ways
When an increase in one variable is accompanied by an increase in another variable, the 2 variables are said to have a positive relationship
Eg: sugar price and food price
When an increase in one variable is accompanied by a decrease in another variable, the 2 variables are said to have a negative relationship
Eg: food price and restaurant profits
Correlation Analysis: To measure the strength of the relationship between 2 variables
Regression Analysis: To obtain the equation relating the 2 variables
Independent & Dependent Variable
Independent / explanatory variable (X): A variable that provides the information for estimation
Dependent / response variable (Y): A variable that is being estimated
Y depends on X
Eg:
Age of respondent and blood pressure
X : Age Y : Blood pressure
The association between income and expenditure
X : Income Y : Expenditure
Hours of study and statistics test score
X : Hours of study Y : Test score
Method to Determine Correlation Between 2 Variables
Scatter Diagram
Graphical method
Plot the available data to determine whether a relationship exists between the 2 variables or not
Linear Correlation Coefficient
Calculation method
To determine the strength of the relationship
Pearson’s product moment correlation coefficient (r)
Spearman’s rank correlation coefficient (ρ)
Scatter Diagram
The independent variable (X) is plotted on the horizontal axis
The dependent variable (Y) is plotted on the vertical axis
If the points in the scatter diagram form a pattern (increasing or decreasing), this indicates that there is a relationship between the 2 variables
If the scatter diagram does not show any pattern (the points are randomly scattered), we can assume that the 2 variables do not have a relationship between them
[Figure: example scatter diagrams showing a perfect positive relationship, a perfect negative relationship, a strong positive relationship, a strong negative relationship, a weak positive relationship, a weak negative relationship, and no relationship]
Eg: this table shows the test scores for finance and mathematics tests for seven students in a faculty

Finance      62  65  72  80  85  86  90
Mathematics  40  55  60  77  80  82  88

Plot a scatter diagram and determine whether there is a relationship between finance and mathematics test scores
[Figure: scatter diagram with Finance on the horizontal axis and Mathematics on the vertical axis]
There is a strong positive relationship between finance and mathematics test scores
Linear Correlation Coefficient
Pearson’s product moment correlation coefficient (r)
To measure the strength of the relationship between 2 quantitative variables
-1 ≤ r ≤ 1
Spearman’s rank correlation coefficient (ρ)
To measure the strength of the relationship between 2 qualitative (ranked) variables
For quantitative data, the data must be ranked first, and only then is the coefficient calculated from these rankings (less accurate)
-1 ≤ ρ ≤ 1
The sign (-) or (+) of r / ρ identifies the direction of the relationship between the 2 variables
The value of r / ρ describes the strength of the relationship
If r or ρ is close to -1, there is a strong negative relationship between the 2 variables
If r or ρ is close to 1, there is a strong positive relationship between the 2 variables
If r or ρ is close to 0, the 2 variables have little or no linear relationship
The strength and direction of the correlation coefficient
r / ρ = -1.00: perfect negative correlation
r / ρ = 0: no correlation
r / ρ = 1.00: perfect positive correlation
r / ρ ≤ -0.7: strong negative correlation
-0.69 ≤ r / ρ ≤ -0.5: moderate negative correlation
-0.49 ≤ r / ρ ≤ -0.1: weak negative correlation
0.1 ≤ r / ρ ≤ 0.49: weak positive correlation
0.5 ≤ r / ρ ≤ 0.69: moderate positive correlation
r / ρ ≥ 0.7: strong positive correlation
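The cut-offs above can be sketched as a small lookup function. This is only an illustration: the function name `describe_correlation` is hypothetical, values are compared against 0.1, 0.5 and 0.7 (so a value such as 0.495 falls in the lower band), and the label for values between -0.1 and 0.1 is an assumption because the chapter does not name that range.

```python
def describe_correlation(coef: float) -> str:
    """Label r or rho using the chapter's cut-offs (0.1, 0.5, 0.7)."""
    if not -1.0 <= coef <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")

    size = abs(coef)  # strength depends only on the magnitude
    if size >= 0.7:
        strength = "strong"
    elif size >= 0.5:
        strength = "moderate"
    elif size >= 0.1:
        strength = "weak"
    else:
        # Assumption: the chapter gives no label for -0.1 < coef < 0.1.
        return "little or no correlation"

    # The sign gives the direction of the relationship.
    direction = "positive" if coef > 0 else "negative"
    return f"{strength} {direction} correlation"


print(describe_correlation(0.891))
print(describe_correlation(-0.6))
```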
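Pearson’s product moment correlation coefficient (r)

$$ r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left[\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}\right]\left[\sum y^2 - \dfrac{\left(\sum y\right)^2}{n}\right]}} $$

Spearman’s rank correlation coefficient (ρ)

$$ \rho = 1 - \frac{6 \sum d_i^2}{n\left(n^2 - 1\right)} $$

where n is the number of pairs of observations and d_i is the difference between the two ranks of the i-th pair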
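As a rough check of the two formulas, the sketch below applies them to the finance and mathematics scores from the earlier example. The function names (`pearson_r`, `ranks`, `spearman_rho`) are hypothetical, and the ranking helper assumes there are no tied values, which holds for this data set.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's r from the sums of x, y, xy, x^2 and y^2."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = sum_xy - sum_x * sum_y / n
    denominator = sqrt((sum_x2 - sum_x**2 / n) * (sum_y2 - sum_y**2 / n))
    return numerator / denominator


def ranks(values):
    """Rank values from smallest (1) to largest (n).

    Assumption: no ties, so each value has a unique position.
    """
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman's rho from the squared rank differences d_i^2."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n**2 - 1))


finance = [62, 65, 72, 80, 85, 86, 90]
mathematics = [40, 55, 60, 77, 80, 82, 88]

r = pearson_r(finance, mathematics)
rho = spearman_rho(finance, mathematics)
print(f"r = {r:.3f}, rho = {rho:.3f}")
```

For this data r comes out at roughly 0.98, and ρ is exactly 1 because the two tests rank the seven students in the same order. Both values agree with the strong positive relationship seen in the scatter diagram.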
Regression line / equation
An equation that represents the linear relationship between 2 variables
The accurate method: Least squares method (LSM)
The general form of the simple linear regression equation:
y = a + bx
x = independent variable
y = dependent variable
a = y-intercept (when x is 0 units, y is a units)
b = slope of the line (when x increases by 1 unit, y changes by b units: an increase if b is positive, a decrease if b is negative)
Find the regression line by using these formulas:

Regression line: y = a + bx

$$ b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{\left(\sum x\right)^2}{n}} $$

$$ a = \frac{\sum y}{n} - b\left(\frac{\sum x}{n}\right) $$
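The least squares formulas for a and b can be tried on the finance and mathematics scores from the earlier example. This is a minimal sketch: the function name `least_squares_line` is hypothetical, and treating finance as X and mathematics as Y is an assumption taken from the axes of the scatter diagram.

```python
def least_squares_line(x, y):
    """Return (a, b) for the line y = a + bx fitted by least squares."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_x2 = sum(p * p for p in x)

    # Slope: b = (sum(xy) - sum(x)sum(y)/n) / (sum(x^2) - (sum(x))^2/n)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x**2 / n)
    # Intercept: a = mean(y) - b * mean(x)
    a = sum_y / n - b * (sum_x / n)
    return a, b


finance = [62, 65, 72, 80, 85, 86, 90]      # X (assumed independent variable)
mathematics = [40, 55, 60, 77, 80, 82, 88]  # Y (assumed dependent variable)

a, b = least_squares_line(finance, mathematics)
print(f"y = {a:.3f} + {b:.3f}x")

# Estimate the mathematics score for a finance score of 75.
predicted = a + b * 75
print(f"predicted mathematics score: {predicted:.1f}")
```

With this data the slope is about 1.57, so each extra finance mark goes with roughly 1.57 extra mathematics marks. The intercept is about -52.1; it is only the point where the line crosses the y-axis and is not a realistic score, since x = 0 lies far outside the observed range.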
Coefficient of determination (R²)
The proportion of the total variation of Y that is explained by the regression line using X
R² = r²
The higher the value of R², the more helpful the X variable is in predicting Y
Eg: Examination score and time taken to revise statistics lesson
r = 0.891
R² = r² = (0.891)² = 0.7939
Comment: 79.39% of the total variation of examination score is explained by the regression line using time taken to revise statistics lesson
20.61% is explained by other factors
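The arithmetic in this example can be reproduced with a few lines of Python. The function name `variation_split` is hypothetical; it only squares r and converts the result to percentages.

```python
def variation_split(r: float):
    """Return (explained %, unexplained %) of the variation in Y, using R^2 = r^2."""
    r_squared = r**2
    explained = round(r_squared * 100, 2)
    unexplained = round(100 - explained, 2)
    return explained, unexplained


explained, unexplained = variation_split(0.891)
print(f"explained by the regression line: {explained}%")
print(f"explained by other factors: {unexplained}%")
```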
