Chapter 3 - Correlation & Regression
CHAPTER 3
CORRELATION & REGRESSION ANALYSIS
• For example, we may study the relationship between sales and profits, or between
advertising expenditure and sales. Basically, we would say that sales determine profits.
In this case, profit is the dependent variable (y) and sales is the independent variable (x).
Similarly, if we have production cost and production units, then we would say that
production cost depends on the production units. Thus, production cost is the
dependent variable (y) while production units is the independent variable (x).
1) Scatter diagram
→ Graphical method
P r e p a r e d B y : N O R A S L I L Y S A R K A M © | 81
_______________________________________________Chapter 3: Correlation & Regression Analysis
Value of 𝑟          Strength
𝑟 = 1               Perfect
0.80 ≤ 𝑟 ≤ 0.99     Strong
0.50 ≤ 𝑟 ≤ 0.79     Moderate
0 < 𝑟 ≤ 0.49        Weak
𝑟 = 0               No linear correlation
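The strength bands in the table can be sketched as a small lookup. This is only an illustration: the function name `describe_correlation` is hypothetical, and two assumptions are made that the table does not state explicitly, namely that the bands apply to the magnitude of 𝑟 (with the sign giving the direction) and that the bands are contiguous (for example, 0.795 is treated as moderate).

```python
def describe_correlation(r):
    """Describe strength and direction of r using the bands in the table.

    Assumptions (not stated in the table): bands are applied to |r|,
    and gaps such as 0.79 < |r| < 0.80 fall into the lower band.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    size = abs(r)
    if size == 0:
        return "no linear correlation"
    if size == 1:
        strength = "perfect"
    elif size >= 0.80:
        strength = "strong"
    elif size >= 0.50:
        strength = "moderate"
    else:
        strength = "weak"
    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} correlation"


print(describe_correlation(0.95))   # strong positive correlation
print(describe_correlation(-0.60))  # moderate negative correlation
```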
• When interpreting the value of correlation, the strength & direction of the correlation
must be stated.
• For example:
• One variable is plotted on the horizontal axis (independent variable, x) and the other is
plotted on the vertical axis (dependent variable, y).
• The pattern of their intersecting points can graphically show relationship patterns.
• If the diagram does not show any pattern or is randomly scattered, we can assume that
there is no correlation between the two variables.
between the two variables and also the strength or degree of correlation.
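$$r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}}$$

or

$$r = \frac{S_{XY}}{\sqrt{S_{XX} \cdot S_{YY}}}$$

Where:

$$S_{XY} = \sum xy - \frac{\sum x \sum y}{n}$$

$$S_{XX} = \sum x^2 - \frac{(\sum x)^2}{n}$$

$$S_{YY} = \sum y^2 - \frac{(\sum y)^2}{n}$$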
• Note that:
∑x² ≠ (∑x)²
∑y² ≠ (∑y)²
∑xy ≠ ∑x∑y
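As a minimal sketch, the computation of 𝑟 through S_XY, S_XX and S_YY can be written as follows. The function name `pearson_r` is illustrative, and the data are the advertising and sales figures from Exercise 1 below. The comments mark where a sum of squares is used rather than a squared sum.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's product moment correlation via S_XY, S_XX and S_YY."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))  # sum of products, not sum_x * sum_y
    sum_x2 = sum(xi ** 2 for xi in x)              # sum of squares, not sum_x ** 2
    sum_y2 = sum(yi ** 2 for yi in y)
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    return s_xy / sqrt(s_xx * s_yy)


# Data from Exercise 1 (RM million)
advertising = [2, 1, 4, 3, 2, 4, 5, 3]
sales = [5, 3, 6, 5, 4, 7, 8, 6]

r = pearson_r(advertising, sales)
print(round(r, 4))  # 0.9526
```

Here S_XY = 14, S_XX = 12 and S_YY = 18, so 𝑟 = 14 / √216 ≈ 0.95, which the strength table would read as a strong positive correlation.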
EXERCISE 1
A marketing officer in a company wants to know the relationship between annual advertising
expenditures (RM million) and annual sales (RM million) of the company. For the study, he
collected data on advertising expenditures and annual sales of the company for the last 8 years.
Annual Advertising Expenditure (RM million)   2  1  4  3  2  4  5  3
Annual Sales (RM million)                     5  3  6  5  4  7  8  6
b) Draw a scatter plot to show the relationship between the annual advertising expenditures
and annual sales. What conclusion can be made from the plot?
c) Calculate the Pearson’s product moment coefficient of correlation and explain its meaning.
EXERCISE 2
An economist wants to study a relationship between family income and food expenditure. The
following table shows the result of the study based on 8 families that had been chosen
randomly.
Annual Income (RM ‘0000)      8     12    9     24    13    37    19    16
Food Expenditure (RM ‘0000)   2.88  3.00  2.97  3.60  3.64  7.03  3.80  3.52
b) By calculating the product moment correlation coefficient, determine and explain the
correlation of annual income and food expenditure.
variables that are at least of ordinal scale, which means suitable for qualitative data. It
$$r_s = 1 - \left[\frac{6 \sum d_i^2}{n(n^2 - 1)}\right]$$
• Computation of 𝑟ₛ is simple since it does not use the actual data values; instead, it
uses their ranks.
• We usually give rank 1 to the smallest data value and the highest rank to the largest data
value.
• For tied observations, that is, two or more observations receiving the same score on the
same variable, each of them is assigned the average of the ranks which would have been
assigned had there been no ties.
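The ranking rules above, including the average rank for ties, can be sketched as follows. The helper names `average_ranks` and `spearman_rs` are illustrative, and the marks are taken from the table in Exercise 4, where two students share an Accounting mark of 82. The sketch applies the rank correlation formula exactly as given in these notes.

```python
def average_ranks(values):
    """Rank values with 1 = smallest; tied values share the average rank."""
    ordered = sorted(values)
    ranks = {}
    for v in set(values):
        first = ordered.index(v) + 1        # first 1-based position of v
        count = ordered.count(v)            # number of tied observations
        ranks[v] = first + (count - 1) / 2  # average of positions first..first+count-1
    return [ranks[v] for v in values]


def spearman_rs(x, y):
    """Rank correlation: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Marks from Exercise 4
statistics_marks = [87, 65, 46, 95, 54, 60, 79, 48]
accounting_marks = [82, 72, 65, 82, 61, 68, 60, 52]

print(average_ranks(accounting_marks))  # [7.5, 6.0, 4.0, 7.5, 3.0, 5.0, 2.0, 1.0]
rs = spearman_rs(statistics_marks, accounting_marks)
print(round(rs, 4))  # 0.6607
```

The two marks of 82 would have taken ranks 7 and 8, so each receives 7.5. With ∑d² = 28.5 and n = 8, the formula gives 𝑟ₛ = 1 − 171/504 ≈ 0.66.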
EXERCISE 3
The grades of Mathematics and Accounting of 10 students were taken randomly to study the
relationship between the grades of Mathematics and Accounting. The following information was
obtained:
Mathematics (x)   A  C  D  B  C  A  B  E  B  A
Accounting (y)    B  D  D  A  C  A  C  D  B  B
Using the rank correlation, what conclusion can be made about the grades of Mathematics and
Accounting?
EXERCISE 4
The data below show the marks obtained by 8 students in Statistics test and Accounting test.
Is there any relationship between the marks in the two tests using rank correlation?
STUDENT     STATISTICS (x)   ACCOUNTING (y)
Farrish     87               82
Khairina    65               72
Aiman       46               65
Marissa     95               82
Adam        54               61
Athirah     60               68
Farid       79               60
Suri        48               52
• Regression analysis is a statistical technique to estimate the best fitted line to show the
relationship between dependent and independent variables. This best fitted line is also
called the regression line.
𝒚 = 𝒂 + 𝒃𝒙
Where:
𝑎 - is the y-intercept
𝑏 - is the slope of the line
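• The values of 𝑎 and 𝑏 can be obtained by using the least squares method. Using this
method:

$$b = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2} \quad\text{OR}\quad b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}}$$

$$a = \frac{\sum y}{n} - b\,\frac{\sum x}{n}$$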
𝒃 → For every one-unit increase in 𝑥, 𝑦 will increase (if 𝑏 is positive) or decrease (if 𝑏 is
negative) by |𝑏| units.
Example: 𝑏 = 32 means that for every one-unit increase in 𝑥, 𝑦 will increase by 32 units.
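A minimal sketch of the least squares computation is given here, reusing the advertising and sales data from Exercise 1 purely as an illustration (that exercise does not itself ask for a regression line). The names `least_squares_line` and `predict` are hypothetical.

```python
def least_squares_line(x, y):
    """Return (a, b) for the fitted line y = a + b*x by least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # slope
    a = sum_y / n - b * sum_x / n                                 # y-intercept
    return a, b


def predict(a, b, x_value):
    """Estimate y at a given x from the fitted line."""
    return a + b * x_value


# Data from Exercise 1 (RM million)
advertising = [2, 1, 4, 3, 2, 4, 5, 3]
sales = [5, 3, 6, 5, 4, 7, 8, 6]

a, b = least_squares_line(advertising, sales)
print(f"y = {a:.4f} + {b:.4f}x")  # y = 2.0000 + 1.1667x
```

Read with the interpretation of 𝑏 above: for every additional RM 1 million spent on advertising, fitted annual sales increase by about RM 1.17 million. Estimates from `predict` are most trustworthy for 𝑥 values within the range of the observed data.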
𝑅² = (𝑟)²
• INTERPRETATION OF 𝑅²
𝑅² = 0.83 means that 83% of the total variation in y can be explained by x using the
regression line.
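As a hedged illustration of this interpretation, the sketch below computes 𝑅² as the square of 𝑟 and then cross-checks it against the share of variation explained by the fitted line, 1 − SSE/SST. That identity is a standard result for a straight-line fit with an intercept; it is not stated in these notes. The function names are illustrative and the data come from Exercise 1.

```python
def r_squared(x, y):
    """R^2 computed as r^2 = S_XY^2 / (S_XX * S_YY)."""
    n = len(x)
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    return s_xy ** 2 / (s_xx * s_yy)


def explained_share(x, y):
    """Share of the variation in y explained by the least squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))  # unexplained
    sst = sum((b - mean_y) ** 2 for b in y)                              # total
    return 1 - sse / sst


# Data from Exercise 1 (RM million)
advertising = [2, 1, 4, 3, 2, 4, 5, 3]
sales = [5, 3, 6, 5, 4, 7, 8, 6]

r2 = r_squared(advertising, sales)
share = explained_share(advertising, sales)
print(round(r2, 4), round(share, 4))  # 0.9074 0.9074
```

Both routes give about 0.91, meaning roughly 91% of the total variation in annual sales can be explained by advertising expenditure using the regression line.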
EXERCISE 5
A lecturer wants to know the relationship between the number of study hours in a week and
GPA obtained by 10 students selected randomly from a class. The data below gives the following
results.
b) Find the Pearson’s product moment coefficient of correlation and explain its meaning.
c) Find the regression equation of GPA based on the number of study hours.
f) Estimate the GPA obtained by Lisa if she studies for 11 hours in a week.
EXERCISE 6
A supervisor of a factory that produces electrical appliances finds that there exists a
relationship between the age of a worker and the number of absent days. He then collected the
following data:
Age (Years)          42  27  36  25  22  39  57  19  33  30
No. of Absent Days    2   7   5   9  10   4   4   8   6   5
b) By calculating the product moment coefficient of correlation, determine and explain the
correlation between age and the number of absent days.
c) Obtain a regression equation of the number of absent days with respect to the ages of
the workers.
d) If Harez is 28 years old, what would be the expected number of absent days?
EXERCISE 7
The following statistic was obtained from a survey:
Press                  Function
SHIFT CLR 1 =          To clear all memory
MODE MODE 2 1          Regression
SHIFT 1 1 =            ∑x²
SHIFT 1 2 =            ∑x
SHIFT 1 3 =            n
SHIFT 1 ▶ 1 =          ∑y²
SHIFT 1 ▶ 2 =          ∑y
SHIFT 1 ▶ 3 =          ∑xy
SHIFT 2 ▶ ▶ 3 =        𝑟 (Pearson’s product moment)
SHIFT 2 ▶ ▶ 2 =        B (Regression slope, 𝑏)
SHIFT 2 ▶ ▶ 1 =        A (Regression intercept, 𝑎)
TUTORIAL 3
REVIEW QUESTIONS 6
Please do all the questions listed below & show your calculations clearly.
QUESTION QUESTION
Question 1 Question 21
Question 2 Question 22
Question 3 Question 25
Question 4
Question 5
Question 6
Question 7
Question 12
Question 13
Question 14
Question 15
Question 16
Question 17
Question 18