Stat Chapter 6

This document discusses simple linear regression and correlation. It defines key terms like regression, dependent and independent variables. It explains that regression attempts to determine the relationship between one dependent variable and one or more independent variables. Correlation describes the strength and direction of the linear relationship between two variables. The coefficient of correlation r ranges from -1 to 1, where values closer to these extremes indicate a stronger relationship. An example calculates r between household income and consumption as 0.973, showing a strong positive correlation.

Uploaded by

temesgenalemayehu488


Unit six(6)

SIMPLE LINEAR REGRESSION AND CORRELATION

Unit Objectives
After completing this unit, you will be able to:
• Describe the meaning of regression and correlation
• Demonstrate the procedures for computing descriptive
measures of the strength of linear relationship between
two variables.
• Explain how to find a ‘best fitting’ line relating two
variables.
• Outline the computation of rank correlation which is a
measure of association between two rankings
• Demonstrate the procedure of test statistics for
analyzing analytical data
6.1 Definition of key terms

• Regression is a statistical technique that attempts to determine the relationship between one dependent variable and one or more independent variables.
• Regression analysis is the estimation or prediction of the unknown values of one variable from known values of the other variable.
• In regression analysis there are two types of variables:
• The variable whose value is influenced or is to be predicted is called the dependent (regressed or explained) variable.
• The variable that influences the values or is used for prediction is called the independent variable (regressor, predictor or explanatory variable).
• The mathematical equation (or mathematical model) relating the
dependent variable and the independent variable(s) is called a
regression model.
• There are two types of regression: simple and multiple regression.
• When there is only one independent variable, the regression is called simple regression. When two or more independent variables are involved, it is called multiple regression.
• In simple regression, the relationship between the dependent variable (Y) and the independent variable (X) may take various forms:
1. Linear relationship: $Y = a + bX$
2. Exponential relationship: $Y = ab^{X}$
3. Quadratic relationship: $Y = aX^{2} + bX + c$
6.2 Correlation

• Correlation is used to describe the degree of relationship (association or interdependence) between two variables.
• The relationship between the two variables may be:
• Positive (direct),
• Negative (inverse), or
• Absent (no relationship between them).
We can identify these relationships by plotting a scatter diagram.
a) Positive or direct linear relationship
• The points cluster around a line that runs from the lower left to the upper right of the graph area.
• An increase in the value of X is more likely to be associated with an increase in the value of Y, and vice versa.
• The closer the points are to the line, the stronger the relationship.

Figure 1: Positive (direct) linear relationship between variables

b) Negative or inverse linear relationship
• The points cluster around a line that runs from the upper left to the lower right of the graph area.
• An increase in the value of X is more likely to be associated with a decrease in the value of Y, and vice versa.
• The closer the points are to the line, the stronger the relationship.

Figure 2: Negative (inverse) linear relationship between variables

c) No linear relationship
• If the data points are randomly scattered, there is no linear relationship between the two variables. This means there is a low or zero correlation between the variables.

Figure 3: No linear relationship between variables

• A measure of the strength and direction of the linear relationship between two variables X and Y is called the coefficient of correlation (r), which is defined as:

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^{2} - (\sum x)^{2}\right]\left[n\sum y^{2} - (\sum y)^{2}\right]}}$$
Properties of the Correlation Coefficient
• The coefficient of correlation lies in the range −1 ≤ r ≤ 1.
• The sign of r indicates the direction of the relationship.
• The magnitude of r indicates the strength of the linear relationship between the two variables X and Y.
• r = 0 indicates that there is no linear relationship between the two variables.
• r = −1 indicates that there is a perfect negative (inverse) linear relationship between the two variables.
• r = 1 indicates that there is a perfect positive (direct) linear relationship between the two variables.
• A coefficient of correlation close to zero shows that the linear relationship is quite weak.
• A coefficient of correlation close to +1 or −1 shows that the linear relationship is strong.
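The properties above can be checked numerically. The following is a minimal sketch, assuming a helper named `pearson_r` (a hypothetical name, not from the notes) that applies the raw-sum formula for r; the three small data sets are made up purely for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Coefficient of correlation r computed from the raw-sum formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Illustrative (made-up) data sets.
xs = [1, 2, 3, 4, 5]
r_pos = pearson_r(xs, [2 * x + 1 for x in xs])   # points exactly on a rising line
r_neg = pearson_r(xs, [10 - 3 * x for x in xs])  # points exactly on a falling line
r_none = pearson_r(xs, [4, 1, 0, 1, 4])          # symmetric U shape, no linear trend

print(r_pos, r_neg, r_none)
```

The third data set has a clear (quadratic) pattern but r = 0, which is a reminder that r only measures linear association.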
The following table shows the summary of these relationships.

| What happens to variable X | What happens to variable Y | Type of correlation | Value | Example |
|---|---|---|---|---|
| X increases in value | Y increases in value | Direct or positive | Positive, ranging from 0 to +1 | The more time you spend studying, the higher your test score will be. |
| X decreases in value | Y decreases in value | Direct or positive | Positive, ranging from 0 to +1 | The less money you put in the bank, the less interest you will earn. |
| X increases in value | Y decreases in value | Indirect or negative | Negative, ranging from −1 to 0 | The more you exercise, the less you will weigh. |
| X decreases in value | Y increases in value | Indirect or negative | Negative, ranging from −1 to 0 | The less time you take to complete the exam, the more you will get wrong. |
Example 1:

A researcher who is concerned about the consumption rate of households took a sample of 10 households and observed their consumption and income (both in tens of Birr) for one month. The results are given in Table 1 below.

Table 1: Monthly income and consumption of 10 households

| Household | Income (x) | Consumption (y) |
|---|---|---|
| 1 | 15 | 15 |
| 2 | 35 | 30 |
| 3 | 42 | 30 |
| 4 | 60 | 50 |
| 5 | 72 | 48 |
| 6 | 128 | 100 |
| 7 | 98 | 93 |
| 8 | 35 | 33 |
| 9 | 15 | 14 |
| 10 | 50 | 50 |

Calculate the coefficient of correlation and interpret it.
Solution:
Table 2: Calculation of the necessary summary statistics

| Income (x) | Consumption (y) | xy | x² | y² |
|---|---|---|---|---|
| 15 | 15 | 15(15) = 225 | 15² = 225 | 15² = 225 |
| 35 | 30 | 35(30) = 1050 | 35² = 1225 | 30² = 900 |
| 42 | 30 | 42(30) = 1260 | 42² = 1764 | 30² = 900 |
| 60 | 50 | 60(50) = 3000 | 60² = 3600 | 50² = 2500 |
| 72 | 48 | 72(48) = 3456 | 72² = 5184 | 48² = 2304 |
| 128 | 100 | 128(100) = 12800 | 128² = 16384 | 100² = 10000 |
| 98 | 93 | 98(93) = 9114 | 98² = 9604 | 93² = 8649 |
| 35 | 33 | 35(33) = 1155 | 35² = 1225 | 33² = 1089 |
| 15 | 14 | 15(14) = 210 | 15² = 225 | 14² = 196 |
| 50 | 50 | 50(50) = 2500 | 50² = 2500 | 50² = 2500 |
| Σx = 550 | Σy = 463 | Σxy = 34770 | Σx² = 41936 | Σy² = 29263 |
• The coefficient of correlation is then computed as:

$$r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{\left[n\sum x^{2} - (\sum x)^{2}\right]\left[n\sum y^{2} - (\sum y)^{2}\right]}} = \frac{10(34770) - (550)(463)}{\sqrt{\left[10(41936) - (550)^{2}\right]\left[10(29263) - (463)^{2}\right]}} = 0.973$$

Here, since the value of r is very close to 1, we can conclude that there is a strong direct (positive) linear relationship between income and consumption.
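The arithmetic in Example 1 can be reproduced with a short script. This is a sketch only: the variable names are illustrative choices, and the data are the ten households from Table 1.

```python
from math import sqrt

# Income (x) and consumption (y) for the 10 households in Table 1.
income = [15, 35, 42, 60, 72, 128, 98, 35, 15, 50]
consumption = [15, 30, 30, 50, 48, 100, 93, 33, 14, 50]

n = len(income)
sum_x = sum(income)                                       # 550
sum_y = sum(consumption)                                  # 463
sum_xy = sum(x * y for x, y in zip(income, consumption))  # 34770
sum_x2 = sum(x * x for x in income)                       # 41936
sum_y2 = sum(y * y for y in consumption)                  # 29263

# Raw-sum formula for the coefficient of correlation.
r = (n * sum_xy - sum_x * sum_y) / sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)
print(round(r, 3))  # 0.973
```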
6.3 Coefficient of Determination
• Another measure of the goodness of fit of the regression line is the coefficient of determination, which is the square of the coefficient of correlation; i.e., coefficient of determination = r².
• The coefficient of determination measures how much of the variability in one variable is explained by its linear relationship with another variable.
• The value of the coefficient of determination (r²) lies between 0 and 1, inclusive.
• If r² is close to 1, the dependent variable is well predicted by the independent variable,
• while a value of r² close to 0 indicates that the dependent variable is poorly predicted by the independent variable.
• The total variation in the dependent variable (Y) can be divided into two parts:
1. Explained variation is the variation in the dependent variable (Y) that is explained by changes (or variation) in the independent variable (X). The proportion of explained variation is r² × 100%.
2. Unexplained variation is the variation in the dependent variable (Y) that is caused by factors other than X (such as chance, excluded variables, etc.). The proportion of unexplained variation is (1 − r²) × 100%.
Example 2: Consider the data on consumption expenditure and income of households in Table 1. Find
1. the coefficient of determination,
2. the proportion of explained variation, and
3. the proportion of unexplained variation, and interpret each result.
Solution: The data are the same as in Example 1, so the summary statistics in Table 2 apply (Σx = 550, Σy = 463, Σxy = 34770, Σx² = 41936, Σy² = 29263), and the coefficient of correlation was computed as r = 0.973.

• Coefficient of determination = r² = (0.973)² ≈ 0.95
• The proportion of explained variation is r² × 100% = 0.95 × 100% = 95%. Thus, about 95% of the variation in the monthly consumption expenditure of households is explained by variation in their income.
• The proportion of unexplained variation is (1 − r²) × 100% = 0.05 × 100% = 5%. Thus, about 5% of the variation in the monthly consumption expenditure of households is due to factors other than income.
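The split into explained and unexplained variation can also be verified directly. The sketch below uses the Table 1 data and assumes the fitted line is the least squares line of section 6.4; it computes r in deviation form (algebraically equivalent to the raw-sum formula) and checks that 1 − SSE/SST equals r². Variable names are illustrative.

```python
from math import sqrt

income = [15, 35, 42, 60, 72, 128, 98, 35, 15, 50]
consumption = [15, 30, 30, 50, 48, 100, 93, 33, 14, 50]

n = len(income)
mean_x = sum(income) / n
mean_y = sum(consumption) / n

# Sums of squared deviations and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in income)
syy = sum((y - mean_y) ** 2 for y in consumption)  # total variation in Y (SST)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(income, consumption))

r = sxy / sqrt(sxx * syy)
r_squared = r ** 2

# Least squares line, then the unexplained variation (SSE) around it.
b = sxy / sxx
a = mean_y - b * mean_x
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(income, consumption))

explained_pct = 100 * (1 - sse / syy)  # equals r^2 * 100%
unexplained_pct = 100 * sse / syy      # equals (1 - r^2) * 100%
print(round(r_squared, 4), round(explained_pct, 1), round(unexplained_pct, 1))
```

Without intermediate rounding the explained share comes out slightly below 95%, which the worked solution rounds to 95%.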
6.4 Regression and the method of least squares

• Once we have a clear understanding of the strength of the linear relationship between the dependent and independent variables, the next step is to determine a mathematical model (a linear equation) relating the two.
• The most common technique for obtaining such an equation is the method of least squares.
• The linear least squares fitting technique is the simplest and most commonly applied form of linear regression, and provides a solution to the problem of finding the best fitting straight line through a set of points.
• If the relationship between two variables X and Y is linear, we express this as:

$$Y = \alpha + \beta X$$

where
Y = dependent variable
X = independent variable
α = y-intercept
β = slope
• Here y represents the individual values of the actual observed points.
• So we use ŷ to symbolize the individual values of the estimated points, i.e., those points that lie on the estimating line. Accordingly, we write the equation of the estimating line as:

$$\hat{y} = a + bx$$

The sum of squares of the errors (SSE) is:

$$\mathrm{SSE} = \sum (y - \hat{y})^{2} = \sum (y - a - bx)^{2}$$

The estimating line will have a ‘good fit’ if it
minimizes the error between the estimated points
on the line and the actual observed points that
were used to draw it.
The 'best' fitting line is the line for which the SSE is a minimum. Applying differential calculus to the SSE gives the slope and the intercept of the best fitting line:

$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^{2} - (\sum x)^{2}}, \qquad a = \bar{y} - b\bar{x}$$
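The calculus step is not shown in the notes. One standard way to carry it out, sketched here under the assumption that the estimating line is $\hat{y} = a + bx$ and the SSE is the sum of squared vertical errors, is to set both partial derivatives of the SSE to zero and solve the resulting pair of equations (often called the normal equations):

```latex
\begin{aligned}
\mathrm{SSE}(a, b) &= \sum (y - a - bx)^{2} \\
\frac{\partial\,\mathrm{SSE}}{\partial a} = -2\sum (y - a - bx) = 0
  &\;\Longrightarrow\; \sum y = na + b\sum x \\
\frac{\partial\,\mathrm{SSE}}{\partial b} = -2\sum x\,(y - a - bx) = 0
  &\;\Longrightarrow\; \sum xy = a\sum x + b\sum x^{2} \\
\text{Solving the two equations:}\quad
b &= \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^{2} - (\sum x)^{2}},
\qquad a = \bar{y} - b\,\bar{x}
\end{aligned}
```

Dividing the first normal equation by n gives $\bar{y} = a + b\bar{x}$, so the least squares line always passes through the point of means $(\bar{x}, \bar{y})$.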
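Example: Table 5 shows the number of items produced (X) and the cost (Y) incurred in producing them (in Birr) at a certain factory.

$$b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^{2} - (\sum x)^{2}} = \frac{5(616) - (32)(93)}{5(222) - (32)^{2}} = \frac{104}{86} \approx 1.21$$

$$a = \bar{y} - b\bar{x} = \frac{93}{5} - 1.21\left(\frac{32}{5}\right) \approx 10.86$$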
Therefore, the equation of the least squares line is:

$$\hat{y} = a + bx \;\Longrightarrow\; \hat{y} = 10.86 + 1.21x$$

• The y-intercept is a = 10.86. This value tells us that, even if no item is produced, there will be a fixed cost of 10.86 Birr (such as insurance cost, maintenance cost, etc.).
• The slope is b = 1.21. This figure indicates that for a unit increase (decrease) in the number of items produced, the cost increases (decreases) by 1.21 Birr.
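The least squares computation for this example can be sketched in a few lines. Only the summary sums that appear in the worked formula (n = 5, Σx = 32, Σy = 93, Σxy = 616, Σx² = 222) are used; the `predict` helper and the choice of 10 items are illustrative assumptions, not part of the notes.

```python
# Summary sums taken from the worked formula for the items/cost example.
n = 5
sum_x, sum_y = 32, 93
sum_xy, sum_x2 = 616, 222

# Slope and intercept of the least squares line.
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = sum_y / n - b * (sum_x / n)

def predict(items):
    """Estimated cost in Birr for a given number of items (hypothetical helper)."""
    return a + b * items

print(round(a, 2), round(b, 2))  # 10.86 1.21
print(round(predict(10), 2))     # estimated cost of producing 10 items
```

The prediction uses the unrounded coefficients, so it can differ in the second decimal place from a hand calculation with a = 10.86 and b = 1.21.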
6.5 Rank correlation
Rank correlation is used to measure the strength of the association between two ranked variables. Spearman's rank correlation coefficient is denoted by r_s and given by

$$r_s = 1 - \frac{6\sum d^{2}}{n(n^{2} - 1)}$$

where n = number of paired observations
d = difference between the ranks for each pair of observations
• The steps involved in computing Spearman's rank correlation coefficient are as follows:
Step 1: Rank the x's among themselves, giving rank 1 to the largest (or smallest) observation, rank 2 to the second largest (or second smallest) observation, and so on. Rank the y's similarly.
Step 2: Find d = rank of x − rank of y for each pair of observations.
Step 3: Find Σd² (the sum of the squares of the differences between each pair of ranks).
Step 4: Compute the rank correlation coefficient using the above formula.
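The four steps can be sketched as follows. The scores are hypothetical values invented for illustration, the function names are arbitrary, and the sketch assumes there are no tied observations (ties would need averaged ranks, which the formula above does not cover).

```python
def ranks(values):
    """Step 1: rank 1 goes to the largest value. Assumes no ties."""
    ordered = sorted(values, reverse=True)
    return [ordered.index(v) + 1 for v in values]

def spearman_rs(xs, ys):
    """Spearman's rank correlation coefficient for untied paired data."""
    n = len(xs)
    rank_x, rank_y = ranks(xs), ranks(ys)
    d = [rx - ry for rx, ry in zip(rank_x, rank_y)]  # Step 2: rank differences
    sum_d2 = sum(di ** 2 for di in d)                # Step 3: sum of squared differences
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))       # Step 4: apply the formula

# Hypothetical scores of five students on two tests.
test_1 = [86, 70, 92, 65, 78]
test_2 = [80, 75, 88, 60, 71]

rs = spearman_rs(test_1, test_2)
print(round(rs, 2))  # 0.9
```

Here the ranks are (2, 4, 1, 5, 3) and (2, 3, 1, 5, 4), so Σd² = 2 and r_s = 1 − 12/120 = 0.9, indicating strong agreement between the two rankings.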
THANK YOU
