Chapter 3 - Correlation & Regression

This chapter discusses correlation and regression analysis. Correlation analysis measures the direction and strength of the relationship between two variables, usually labelled the independent and dependent variables. Pearson's product-moment coefficient and Spearman's rank coefficient are two methods to determine correlation. Pearson's coefficient measures correlation between quantitative variables, while Spearman's is used for ranked (ordinal) data, whether qualitative or quantitative. The correlation coefficient r ranges from -1 to 1, with 0 indicating no linear correlation and values closer to 1 or -1 indicating stronger positive or negative correlation. Scatter plots are also used to examine relationships between variables visually.

_______________________________________________Chapter 3: Correlation & Regression Analysis

CHAPTER 3
CORRELATION & REGRESSION ANALYSIS

3.1 WHAT IS CORRELATION ANALYSIS?

• Correlation analysis is a statistical technique used to measure the direction and strength of the relationship between two variables, commonly labelled the independent variable (x) and the dependent variable (y).

• Determining which variable is dependent and which is independent is a very important step in correlation analysis.

• For example:

In many business situations, managers want to find the relationship between sales and profits, or between advertising expenditure and sales. We would say that sales determine profits. In this case, profit is the dependent variable (y) and sales is the independent variable (x).

Similarly, for production cost and production units, we would say that production cost depends on the number of units produced. Thus, production cost is the dependent variable (y) and production units is the independent variable (x).

3.2 METHODS TO DETERMINE CORRELATION BETWEEN 2 VARIABLES

1) Scatter diagram

→ Graphical method

2) Pearson’s product moment coefficient of correlation (𝑟)

→ For quantitative variables

3) Spearman’s rank coefficient of correlation (𝑟ₛ)

→ For qualitative and quantitative variables

P r e p a r e d B y : N O R A S L I L Y S A R K A M © | 81

3.3 STRENGTH OF CORRELATION

Value of |𝑟|             Strength
|𝑟| = 1                  Perfect
0.80 ≤ |𝑟| ≤ 0.99        Strong
0.50 ≤ |𝑟| ≤ 0.79        Moderate
0 < |𝑟| ≤ 0.49           Weak
𝑟 = 0                    No linear correlation

• The value of the correlation coefficient (𝑟) ranges from −1 to 1 (−1 ≤ 𝑟 ≤ 1).

• When interpreting the value of correlation, both the strength and the direction of the correlation must be stated.

• For example:

1) If 𝑟 = 0, there is no linear relationship between x and y.

2) If 𝑟 = 1, there is a perfect positive linear relationship between x and y.

3) If 𝑟 = 0.83, there is a strong positive linear relationship between x and y.

4) If 𝑟 = −0.61, there is a moderate negative linear relationship between x and y.

5) If 𝑟 = −0.23, there is a weak negative linear relationship between x and y.
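
The table and examples above can be expressed as a small lookup. The Python sketch below is illustrative only: the function name `describe_correlation` is hypothetical, and rounding the size of 𝑟 to two decimal places before classifying is an assumption, made so that values falling between the table's two-decimal bands still receive a label.

```python
def describe_correlation(r: float) -> str:
    """Return the strength and direction of r using the bands in Section 3.3."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")

    magnitude = abs(r)
    if magnitude == 0:
        return "no linear relationship"

    if magnitude == 1:
        strength = "perfect"
    else:
        # Assumption: round to 2 decimal places so the bands have no gaps.
        size = round(magnitude, 2)
        if size >= 0.80:
            strength = "strong"
        elif size >= 0.50:
            strength = "moderate"
        else:
            strength = "weak"

    direction = "positive" if r > 0 else "negative"
    return f"{strength} {direction} linear relationship"


# The five example values from the list above.
for value in (0, 1, 0.83, -0.61, -0.23):
    print(value, "->", describe_correlation(value))
```

The strength is read from the size of 𝑟 and the direction from its sign, which is why both must be stated in an interpretation.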

3.4 SCATTER DIAGRAM

• A scatter diagram is a tool for analyzing the relationship between two variables.

• One variable is plotted on the horizontal axis (independent variable, x) and the other on the vertical axis (dependent variable, y).

• The pattern formed by the plotted points shows the relationship graphically.

• If the points show no pattern, that is, they are randomly scattered, we can assume that the two variables have no relationship.


• Examples of scatter diagrams:

[Figure: seven scatter diagram panels, titled STRONG POSITIVE CORRELATION, STRONG NEGATIVE CORRELATION, WEAK POSITIVE CORRELATION, WEAK NEGATIVE CORRELATION, NO CORRELATION, NO LINEAR CORRELATION and PERFECT POSITIVE CORRELATION]


3.5 PEARSON’S PRODUCT MOMENT COEFFICIENT OF CORRELATION

• Pearson’s product moment coefficient of correlation measures both the direction of the relationship between two variables and its strength (degree of correlation).

• Two variables can be perfectly correlated, strongly correlated, moderately correlated, weakly correlated or not correlated.

• Pearson’s product moment coefficient of correlation is given by:

$$ r = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \dfrac{(\sum x)^2}{n}\right)\left(\sum y^2 - \dfrac{(\sum y)^2}{n}\right)}} $$

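• The above formula can also be written as:

$$ r = \frac{S_{XY}}{\sqrt{S_{XX} \cdot S_{YY}}} $$

Where:

$$ S_{XY} = \sum xy - \frac{\sum x \sum y}{n} $$

$$ S_{XX} = \sum x^2 - \frac{(\sum x)^2}{n} $$

$$ S_{YY} = \sum y^2 - \frac{(\sum y)^2}{n} $$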

• Note that:

∑x² ≠ (∑x)²

∑y² ≠ (∑y)²

∑xy ≠ ∑x ∑y
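
The formula above can be checked with a short program. The following Python sketch computes 𝑟 from S_XY, S_XX and S_YY exactly as they are defined in this section. The function name `pearson_r` and the five data pairs are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson's r computed from S_XY, S_XX and S_YY as defined above."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)  # sum of squares, not (sum of x) squared
    sum_y2 = sum(yi ** 2 for yi in y)

    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    if s_xx == 0 or s_yy == 0:
        raise ValueError("r is undefined when x or y is constant")
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(pearson_r(x, y), 4))  # 0.7746
```

For these values S_XY = 6, S_XX = 10 and S_YY = 6, so 𝑟 = 6 / √60 ≈ 0.77, a moderate positive linear relationship.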


EXERCISE 1
A marketing officer in a company wants to know the relationship between annual advertising expenditure (RM million) and annual sales (RM million) of the company. For the study, he collected data on the company's advertising expenditure and annual sales for the last 8 years.

Annual Advertising Expenditure (RM million)   2   1   4   3   2   4   5   3
Annual Sales (RM million)                     5   3   6   5   4   7   8   6

a) Determine the independent and dependent variables.

b) Draw a scatter plot to show the relationship between annual advertising expenditure and annual sales. What conclusion can be made from the plot?

c) Calculate Pearson’s product moment coefficient of correlation and explain its meaning.


EXERCISE 2

An economist wants to study the relationship between family income and food expenditure. The following table shows the results of the study, based on 8 families that were chosen randomly.

Annual Income (RM ‘0000)       8      12     9      24     13     37     19     16
Food Expenditure (RM ‘0000)    2.88   3.00   2.97   3.60   3.64   7.03   3.80   3.52

a) Name the dependent and independent variables used in this study.

b) By calculating the product moment correlation coefficient, determine and explain the correlation between annual income and food expenditure.

3.6 SPEARMAN’S COEFFICIENT OF CORRELATION


• Spearman’s rank coefficient of correlation is a measure of association between two variables that are measured on at least an ordinal scale, which makes it suitable for ranked qualitative data. It can also be used when both variables are quantitative.

• Spearman’s rank coefficient of correlation is given by:

$$ r_s = 1 - \left[ \frac{6 \sum d_i^2}{n(n^2 - 1)} \right] $$

Where n = number of observations (pairs)

d_i = difference between the two ranks of the i-th observation

• Computing 𝑟ₛ is simple because it does not use the actual data values; instead, it uses the ranks that represent them.

• We usually give rank 1 to the smallest data value and the highest rank to the largest data value.

• For tied observations, that is, two or more observations with the same score on the same variable, each is assigned the average of the ranks they would have received had no ties occurred.
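
The ranking rules and the formula above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names `average_ranks` and `spearman_rs` and the four data pairs are hypothetical. Letter grades would first have to be converted to numbers that preserve their order before being passed in.

```python
def average_ranks(values: list[float]) -> list[float]:
    """Rank values from smallest (rank 1) to largest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the block of tied values starting at this sorted position.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # average of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rs(x: list[float], y: list[float]) -> float:
    """Spearman's r_s = 1 - 6 * sum(d_i^2) / (n(n^2 - 1)), as given above."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical scores; the two 20s are tied and share rank (2 + 3) / 2 = 2.5.
x = [10, 20, 20, 40]
y = [1, 3, 2, 4]
print(average_ranks(x))             # [1.0, 2.5, 2.5, 4.0]
print(round(spearman_rs(x, y), 2))  # 0.95
```

Here the rank differences are 0, −0.5, 0.5 and 0, so ∑d² = 0.5 and 𝑟ₛ = 1 − 3/60 = 0.95.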

EXERCISE 3


The Mathematics and Accounting grades of 10 randomly selected students were recorded to study the relationship between the grades in the two subjects. The following grades were obtained in an examination.

Mathematics (x)   A   C   D   B   C   A   B   E   B   A
Accounting (y)    B   D   D   A   C   A   C   D   B   B

Using rank correlation, what conclusion can be made about the students’ grades in Mathematics and Accounting?

EXERCISE 4

The data below show the marks obtained by 8 students in a Statistics test and an Accounting test. Using rank correlation, is there any relationship between the marks in the two tests?

STUDENT      STATISTICS (x)   ACCOUNTING (y)
Farrish      87               82
Khairina     65               72
Aiman        46               65
Marissa      95               82
Adam         54               61
Athirah      60               68
Farid        79               60
Suri         48               52

3.7 SIMPLE LINEAR REGRESSION


• Regression analysis is a statistical technique for estimating the best-fitting line that shows the relationship between the dependent and independent variables. This line is also known as the regression line or regression equation.

• The regression equation has the form:

𝒚 = 𝒂 + 𝒃𝒙

Where:
𝑦 - Is the dependent variable

𝑎 - Is the y-intercept

𝑏 - Is the slope of the line

𝑥 - Is the independent variable

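• The values of 𝑎 and 𝑏 can be obtained by using the least squares method. Using this method, 𝑎 and 𝑏 are given by the following formulas.

$$ b = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} \quad \text{OR} \quad b = \frac{\sum xy - \dfrac{\sum x \sum y}{n}}{\sum x^2 - \dfrac{(\sum x)^2}{n}} $$

$$ a = \frac{\sum y}{n} - b \, \frac{\sum x}{n} $$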

• INTERPRETATION OF VALUES OF 𝒂 AND 𝒃.

𝒂 → When 𝑥 = 0, 𝑦 = 𝑎.

𝒃 → For every one-unit increase in 𝑥, 𝑦 will increase (if 𝑏 is positive) or decrease (if 𝑏 is negative) by |𝑏| units.

Example: 𝑏 = 32 means that for every one-unit increase in 𝑥, 𝑦 will increase by 32 units.
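
The least squares formulas for 𝑎 and 𝑏 can be sketched in Python as below. The function name `least_squares_line` and the data are hypothetical and serve only to illustrate the calculation. The sketch uses the first form of the formula for 𝑏.

```python
def least_squares_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Return (a, b) of the regression line y = a + bx by least squares."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        raise ValueError("the slope is undefined when all x values are equal")
    b = (n * sum_xy - sum_x * sum_y) / denominator  # slope
    a = sum_y / n - b * sum_x / n                   # y-intercept
    return a, b


# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = least_squares_line(x, y)
print(f"y = {a:.2f} + {b:.2f}x")                     # y = 2.20 + 0.60x
print(f"estimated y at x = 3.5: {a + b * 3.5:.2f}")  # 4.30
```

With these values, 𝑏 = 0.60 means that 𝑦 increases by 0.60 units for every one-unit increase in 𝑥, and 𝑎 = 2.20 is the value of 𝑦 when 𝑥 = 0.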

3.8 COEFFICIENT OF DETERMINATION (𝑅²)


• The coefficient of determination measures the proportion of the variation in the dependent variable (y) that can be explained by the independent variable (x).

• The coefficient of determination is the square of the correlation coefficient, 𝑟². It is usually expressed as a percentage, where:

𝑹² = 𝒓²

• For example, if 𝑟 = 0.91, then:

𝑅² = 𝑟² = (0.91)² = 0.8281 ≈ 0.83

• INTERPRETATION OF 𝑅²

𝑅² = 0.83 means that 83% of the total variation in y can be explained by x using the regression line.
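
The phrase "proportion of variation explained" can be made concrete with a short sketch. The Python code below is illustrative only: the function name `r_squared_from_fit` and the data are hypothetical. It fits the least squares line, then compares the variation left unexplained by the line with the total variation in y. It uses the deviation form of S_XY and S_XX, which is algebraically the same as the sums used earlier in this chapter.

```python
def r_squared_from_fit(x: list[float], y: list[float]) -> float:
    """Proportion of the variation in y explained by the least squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    sst = sum((yi - mean_y) ** 2 for yi in y)  # total variation in y
    if s_xx == 0 or sst == 0:
        raise ValueError("R^2 is undefined when x or y is constant")

    b = s_xy / s_xx
    a = mean_y - b * mean_x
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
    return 1 - sse / sst


# The example in the text: r = 0.91.
print(round(0.91 ** 2, 2))  # 0.83

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(round(r_squared_from_fit(x, y), 2))  # 0.6
```

For a simple linear regression the two routes agree: the hypothetical data give 𝑟 ≈ 0.7746, and 0.7746² ≈ 0.60, so about 60% of the variation in y is explained by x.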


EXERCISE 5
A lecturer wants to know the relationship between the number of study hours in a week and the GPA obtained by 10 students selected randomly from a class. The data are given below.

Number of Study Hours   9      7      10     6      7      8      12     4      5      6
GPA                     3.20   3.00   3.15   2.84   2.98   3.05   3.48   2.01   2.28   2.90

a) State the independent and dependent variables.

b) Find Pearson’s product moment coefficient of correlation and explain its meaning.


c) Find the regression equation of GPA based on the number of study hours.

d) Explain the meaning of regression coefficients obtained in (c).


e) Calculate the coefficient of determination and explain its meaning.

f) Estimate the GPA obtained by Lisa if she studies for 11 hours in a week.

EXERCISE 6

A supervisor of a factory that produces electrical appliances finds that there is a relationship between the age of a worker and the number of days absent. He collected the following data from 10 production operators selected at random.

Age (Years)          42   27   36   25   22   39   57   19   33   30
No. of Absent Days   2    7    5    9    10   4    4    8    6    5

a) Name the independent and dependent variables.

b) By calculating the product moment coefficient of correlation, determine and explain the correlation between age and the number of absent days.

c) Obtain the regression equation of the number of absent days on the age of workers using the least squares method.


d) If Harez is 28 years old, what would be the expected number of absent days?

e) Calculate the coefficient of determination and explain its meaning.

EXERCISE 7
The following statistics were obtained from a survey:


𝑛 = 19,  X̄ = 1.87,  Ȳ = 80.37,  ∑XY = 2901.7,  ∑X² = 70.83,  ∑Y² = 124 561

a) Determine the strength of correlation between X and Y.

b) Find the least squares equation of Y based on X.
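
When only summary statistics are given, as in this exercise, the sums ∑X and ∑Y can be recovered from the means because X̄ = ∑X / n. The Python sketch below illustrates that approach under stated assumptions: the function name `from_summary` is hypothetical, and the values passed to it are illustrative, not the exercise data.

```python
from math import sqrt


def from_summary(n: int, mean_x: float, mean_y: float,
                 sum_xy: float, sum_x2: float, sum_y2: float):
    """Return (r, a, b) using only summary statistics."""
    sum_x = n * mean_x  # because mean = sum / n
    sum_y = n * mean_y

    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n

    r = s_xy / sqrt(s_xx * s_yy)  # Pearson's coefficient
    b = s_xy / s_xx               # regression slope
    a = mean_y - b * mean_x       # regression intercept
    return r, a, b


# Hypothetical summary values (not the exercise data).
r, a, b = from_summary(n=5, mean_x=3, mean_y=4, sum_xy=66, sum_x2=55, sum_y2=86)
print(round(r, 4), round(a, 2), round(b, 2))  # 0.7746 2.2 0.6
```

The slope is written here as S_XY / S_XX, which is the second form of the formula for 𝑏 in Section 3.7.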

USING A CALCULATOR TO COMPUTE PEARSON’S PRODUCT MOMENT COEFFICIENT OF CORRELATION, REGRESSION INTERCEPT (𝒂) AND SLOPE (𝒃)

Calculator Model Casio fx-570MS


Enter data into the calculator in the format: 𝒙, 𝒚

Press                  Function
SHIFT CLR 1 =          To clear all memory
MODE MODE 2 1          Regression

Then input each data pair as x , y and press M+

Press                  Function
SHIFT 1 1 =            ∑x²
SHIFT 1 2 =            ∑x
SHIFT 1 3 =            n
SHIFT 1 ▶ 1 =          ∑y²
SHIFT 1 ▶ 2 =          ∑y
SHIFT 1 ▶ 3 =          ∑xy
SHIFT 2 ▶ ▶ 3 =        𝑟 (Pearson’s product moment coefficient)
SHIFT 2 ▶ ▶ 2 =        B (Regression slope, 𝑏)
SHIFT 2 ▶ ▶ 1 =        A (Regression intercept, 𝑎)

TUTORIAL 3
REVIEW QUESTIONS 6


Text Book Page 163

Please do all the questions listed below & show your calculations clearly.

QUESTION

Questions 1, 2, 3, 4, 5, 6 and 7

Questions 12, 13, 14, 15, 16, 17 and 18

Questions 21, 22 and 25
