
Topic 6 Simple Linear Regression


Uploaded by

milk tea
Copyright
© © All Rights Reserved

Chapter 6:

SIMPLE LINEAR REGRESSION


SIMPLE LINEAR REGRESSION ANALYSIS
• Scatter Diagram
• Least Squares Line
• Interpretation of a and b

2
Scatter Diagram
Definition
A plot of paired observations is called a
scatter diagram.

3
Table 13.1 Incomes (in hundreds of dollars) and Food Expenditures of Seven Households

Income    Food Expenditure
35        9
49        15
21        7
39        11
15        5
28        8
25        9
4
Figure 13.4 Scatter diagram.
[Scatter plot of food expenditure (vertical axis) against income (horizontal axis), with the first and seventh households labeled.]
5
Least Squares Line
Figure 13.6 Regression line and random errors.
[Plot of food expenditure (vertical axis) against income (horizontal axis), showing the regression line and a random error e.]
6
Linear Regression
Definition
A (simple) regression model that gives a
straight-line relationship between two
variables is called a linear regression
model.

7
SIMPLE LINEAR REGRESSION
ANALYSIS cont.

y = A + Bx

where y is the dependent variable, x is the independent variable, A is the constant term or y-intercept, and B is the slope.

8
Figure 13.1 Relationship between food expenditure and income. (a) Linear relationship. (b) Nonlinear relationship.
[Two panels, each plotting food expenditure (vertical axis) against income (horizontal axis).]

9
Figure 13.2 Plotting a linear equation.
[Graph of y = 50 + 5x passing through the points (x = 0, y = 50) and (x = 10, y = 100).]
10
Figure 13.3 y-intercept and slope of a line.
[Line with y-intercept 50; each change of 1 in x corresponds to a change of 5 in y.]
11
The Least Squares Line
For the least squares regression line ŷ = a + bx,

$$b = \frac{SS_{xy}}{SS_{xx}} \quad \text{and} \quad a = \bar{y} - b\bar{x}$$

12
The Least Squares Line cont.

where

$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} \quad \text{and} \quad SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n}$$

and SS stands for “sum of squares”. The least squares regression line ŷ = a + bx is also called the regression of y on x.

13
Example 13-1
Find the least squares regression line for the data on incomes and food expenditures of the seven households given in Table 13.1. Use income as the independent variable and food expenditure as the dependent variable.

14
Table 13.2

Income    Food Expenditure
x         y           xy           x²
35        9           315          1225
49        15          735          2401
21        7           147          441
39        11          429          1521
15        5           75           225
28        8           224          784
25        9           225          625
Σx = 212  Σy = 64     Σxy = 2150   Σx² = 7222
15
Solution 13-1

$$\sum x = 212 \qquad \sum y = 64$$
$$\bar{x} = \sum x / n = 212 / 7 = 30.2857$$
$$\bar{y} = \sum y / n = 64 / 7 = 9.1429$$

16
Solution 13-1

$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 2150 - \frac{(212)(64)}{7} = 211.7143$$
$$SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 7222 - \frac{(212)^2}{7} = 801.4286$$

17
Solution 13-1

$$b = \frac{SS_{xy}}{SS_{xx}} = \frac{211.7143}{801.4286} = .2642$$
$$a = \bar{y} - b\bar{x} = 9.1429 - (.2642)(30.2857) = 1.1414$$

Thus,
ŷ = 1.1414 + .2642x
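
The arithmetic in Solution 13-1 can be checked with a short script. The following is a minimal Python sketch, not part of the original slides; the function name `least_squares_line` is a hypothetical helper. It applies the SSxy and SSxx formulas to the Table 13.1 data. Because the slides round b to .2642 before computing a, the intercept computed at full precision differs slightly in the later decimals (about 1.1422 rather than 1.1414).

```python
def least_squares_line(x, y):
    """Return (a, b) for y-hat = a + b*x using the sum-of-squares formulas.

    Hypothetical helper name; it does not come from the slides.
    """
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    # SSxy = sum(xy) - (sum x)(sum y)/n
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    # SSxx = sum(x^2) - (sum x)^2/n
    ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    b = ss_xy / ss_xx
    a = sum_y / n - b * (sum_x / n)
    return a, b


# Table 13.1 data, in hundreds of dollars
income = [35, 49, 21, 39, 15, 28, 25]
food = [9, 15, 7, 11, 5, 8, 9]

a, b = least_squares_line(income, food)
print(f"y-hat = {a:.4f} + {b:.4f}x")
```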

18
Figure 13.7 Error of prediction.
[Plot of food expenditure against income with the regression line ŷ = 1.1414 + .2642x. For the marked household, Predicted = $1038.84, Actual = $900, and Error e = -$138.84.]
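
The values in Figure 13.7 appear to correspond to the first household in Table 13.1 (income 35, food expenditure 9, both in hundreds of dollars). That pairing is inferred from the numbers and is not stated on the slide. A minimal Python sketch of the calculation, using the rounded coefficients from Solution 13-1:

```python
# Rounded coefficients from Solution 13-1
a, b = 1.1414, 0.2642

# Assumed to be the first household in Table 13.1 (hundreds of dollars)
income, actual = 35, 9

predicted = a + b * income   # predicted food expenditure, hundreds of dollars
error = actual - predicted   # error of prediction e = y - y-hat

print(f"Predicted = ${predicted * 100:.2f}")
print(f"Error = ${error * 100:.2f}")
```

A negative error means the regression line overestimates this household's food expenditure.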
19
Interpretation of a and b
Interpretation of a
• Consider a household with zero income:
  ŷ = 1.1414 + .2642(0) = $1.1414 hundred
• Thus, we can state that a household with no income is expected to spend $114.14 per month on food.
• However, the regression line is valid only for values of x between 15 and 49, the range of incomes in the sample, so this prediction for x = 0 should not be relied on.
20
Interpretation of a and b cont.
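Interpretation of b
• The value of b in the regression model gives the change in y due to a change of one unit in x.
• We can state that, on average, a $1 increase in the income of a household will increase its food expenditure by $.2642. Equivalently, since x is measured in hundreds of dollars, a $100 increase in income increases food expenditure by $26.42 on average.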

21
Figure 13.8 Positive and negative linear relationships between x and y.
(a) Positive linear relationship, b > 0. (b) Negative linear relationship, b < 0.

22
Figure 13.13 Nonlinear relations between x and y.
[Two panels, (a) and (b), each plotting y against x.]

23
Figure 13.16 regression model.
[Plot of food expenditure against income with the regression line ŷ = 1.1414 + .2642x.]
24
LINEAR CORRELATION
 Linear Correlation Coefficient

25
Linear Correlation Coefficient
Value of the Correlation Coefficient
The value of the correlation coefficient
always lies in the range of –1 to 1; that is,
-1 ≤ ρ ≤ 1 and -1 ≤ r ≤ 1

26
Figure 13.18 Linear correlation between two variables.
(a) Perfect positive linear correlation, r = 1
27
Figure 13.18 Linear correlation between two variables.
(b) Perfect negative linear correlation, r = -1
28
Figure 13.18 Linear correlation between two variables.
(c) No linear correlation, r ≈ 0
29
Figure 13.19 Linear correlation between variables.
(a) Strong positive linear correlation (r is close to 1)
30
Figure 13.19 Linear correlation between variables.
(b) Weak positive linear correlation (r is positive but close to 0)
31
Figure 13.19 Linear correlation between variables.
(c) Strong negative linear correlation (r is close to -1)
32
Figure 13.19 Linear correlation between variables.
(d) Weak negative linear correlation (r is negative and close to 0)
33
Linear Correlation Coefficient cont.
Linear Correlation Coefficient
The simple linear correlation coefficient, denoted by r, measures the strength of the linear relationship between two variables for a sample and is calculated as

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}}$$

where

$$SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}$$
34
Example 13-6
Calculate the correlation coefficient for the
example on incomes and food expenditures
of seven households.

35
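Solution 13-6

From Solution 13-1, SSxy = 211.7143 and SSxx = 801.4286. For the food expenditures in Table 13.1, Σy² = 646, so

$$SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 646 - \frac{(64)^2}{7} = 60.8571$$

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \frac{211.7143}{\sqrt{(801.4286)(60.8571)}} = .96$$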
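
The result of Solution 13-6 can be reproduced directly from the Table 13.1 data. This is a minimal Python sketch, not part of the original slides; the function name `correlation` is a hypothetical helper that follows the SSxy, SSxx and SSyy formulas.

```python
from math import sqrt


def correlation(x, y):
    """Return r = SSxy / sqrt(SSxx * SSyy).

    Hypothetical helper name; it does not come from the slides.
    """
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    ss_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    ss_xx = sum(xi ** 2 for xi in x) - sum_x ** 2 / n
    ss_yy = sum(yi ** 2 for yi in y) - sum_y ** 2 / n
    return ss_xy / sqrt(ss_xx * ss_yy)


# Table 13.1 data, in hundreds of dollars
income = [35, 49, 21, 39, 15, 28, 25]
food = [9, 15, 7, 11, 5, 8, 9]

r = correlation(income, food)
print(f"r = {r:.2f}")
```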

36
REGRESSION ANALYSIS:
COMPLETE EXAMPLE
Example 13-8
A random sample of eight drivers insured
with a company and having similar auto
insurance policies was selected. The
following table lists their driving experience
(in years) and monthly auto insurance
premiums.

37
Example 13-8

Driving Experience (years)    Monthly Auto Insurance Premium ($)
5                             64
2                             87
12                            50
9                             71
15                            44
6                             56
25                            42
16                            60
38
Example 13-8
a) Does the insurance premium depend on
the driving experience or does the driving
experience depend on the insurance
premium? Do you expect a positive or a
negative relationship between these two
variables?

39
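Solution 13-8
a) The insurance premium depends on driving experience.
• The insurance premium is the dependent variable.
• Driving experience is the independent variable.
• A negative relationship is expected, since premiums are expected to fall as driving experience increases.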

40
Example 13-8

b) Compute SSxx, SSyy, and SSxy.

41
Table 13.5

Experience  Premium
x           y          xy           x²           y²
5           64         320          25           4096
2           87         174          4            7569
12          50         600          144          2500
9           71         639          81           5041
15          44         660          225          1936
6           56         336          36           3136
25          42         1050         625          1764
16          60         960          256          3600
Σx = 90     Σy = 474   Σxy = 4739   Σx² = 1396   Σy² = 29,642
42
Solution 13-8
b)
$$\bar{x} = \sum x / n = 90 / 8 = 11.25$$
$$\bar{y} = \sum y / n = 474 / 8 = 59.25$$
$$SS_{xy} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 4739 - \frac{(90)(474)}{8} = -593.5000$$
$$SS_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 1396 - \frac{(90)^2}{8} = 383.5000$$
$$SS_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 29{,}642 - \frac{(474)^2}{8} = 1557.5000$$
43
Example 13-8
c) Find the least squares regression line by
choosing appropriate dependent and
independent variables based on your
answer in part a.

44
Solution 13-8
c)
$$b = \frac{SS_{xy}}{SS_{xx}} = \frac{-593.5000}{383.5000} = -1.5476$$
$$a = \bar{y} - b\bar{x} = 59.25 - (-1.5476)(11.25) = 76.6605$$

$$\hat{y} = 76.6605 - 1.5476x$$
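
As a cross-check on parts b and c, the sums of squares can also be computed from deviations about the means, SSxy = Σ(x - x̄)(y - ȳ) and SSxx = Σ(x - x̄)², which are algebraically equivalent to the shortcut formulas used in the slides. The following minimal Python sketch is not part of the original slides, and its variable names are illustrative only.

```python
# Data from Example 13-8 (variable names are illustrative, not from the slides)
experience = [5, 2, 12, 9, 15, 6, 25, 16]    # years
premium = [64, 87, 50, 71, 44, 56, 42, 60]   # dollars per month

n = len(experience)
x_bar = sum(experience) / n
y_bar = sum(premium) / n

# Deviation forms of the sums of squares
ss_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(experience, premium))
ss_xx = sum((x - x_bar) ** 2 for x in experience)

b = ss_xy / ss_xx        # slope
a = y_bar - b * x_bar    # y-intercept

print(f"SSxy = {ss_xy:.4f}, SSxx = {ss_xx:.4f}")
print(f"y-hat = {a:.4f} + ({b:.4f})x")
```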

45
Example 13-8

d) Interpret the meaning of the values of a and b calculated in part c.

46
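Solution 13-8
d) The value of a = 76.6605 gives the value of ŷ for x = 0; that is, a driver with no driving experience is expected to pay a monthly auto insurance premium of $76.66.
Here, b = -1.5476 indicates that, on average, for every extra year of driving experience, the monthly auto insurance premium decreases by $1.55.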

47
Example 13-8

e) Plot the scatter diagram and the regression line.

48
Figure 13.21 Scatter diagram and the regression line.
e)
[Scatter plot of insurance premium (vertical axis) against experience (horizontal axis) with the regression line ŷ = 76.6605 - 1.5476x.]

49
Example 13-8

f) Calculate r and explain what it means.

50
Solution 13-8
f)

$$r = \frac{SS_{xy}}{\sqrt{SS_{xx}\,SS_{yy}}} = \frac{-593.5000}{\sqrt{(383.5000)(1557.5000)}} = -.77$$

51
Solution 13-8
f) The value of r = -0.77 indicates that
• Driving experience and monthly auto insurance premium are negatively related
• The (linear) relationship is strong but not very strong
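
The value of r in part f follows from the sums of squares found in part b. A minimal Python sketch of that last step, not part of the original slides:

```python
from math import sqrt

# Sums of squares from Solution 13-8, part b
ss_xy, ss_xx, ss_yy = -593.5, 383.5, 1557.5

r = ss_xy / sqrt(ss_xx * ss_yy)
print(f"r = {r:.2f}")
```

The sign of r always matches the sign of SSxy, and therefore the sign of the slope b, because SSxx and SSyy are never negative.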

52
Solution 13-8
g) The predicted value of y for x = 10 is

ŷ = 76.6605 – 1.5476(10) = $61.18
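
The prediction in part g is a direct substitution into the fitted line. The following minimal Python sketch is not part of the original slides; `predict_premium` is a hypothetical function name, and the rounded coefficients from part c are assumed as defaults. Note that x = 10 lies inside the sample range of driving experience (2 to 25 years).

```python
def predict_premium(years, a=76.6605, b=-1.5476):
    """Predicted monthly premium in dollars for a given driving experience.

    Hypothetical helper; defaults are the rounded coefficients from part c.
    """
    return a + b * years


predicted = predict_premium(10)
print(f"Predicted premium: ${predicted:.2f}")
```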

53
A fire insurance company wishes to find out the relationship between the amount of fire damage in a residential area and the distance between the burning house and the nearest fire station. A sample of 10 recent fires in the area is selected. The amount of damage (RM ‘000) and the distance (km) are recorded.

Distance (km)    Fire Damage (RM ‘000)
1.8              17.8
6.1              36.4
4.3              32.0
2.6              19.5
3.4              26.2
5.5              36.0
1.1              17.3
4.8              43.1
3.0              22.3
2.1              24.1
54
Exercise

Suppose that a retail outlet suspects that its sales depend on the prices of its goods. The following table shows the relationship between the prices and the sales of 10 types of 1 kg tins of milk powder during the first quarter of 2001.

Price of 1 kg tin of milk powder (RM)    Sales (number of tins)
15.5                                     195
16.7                                     178
15.9                                     193
18.5                                     125
16.3                                     180
14.5                                     205
11.9                                     250
12.5                                     230
13.9                                     200
17.5                                     150
55
Exercise

Alan, a Biology student, conducted a laboratory experiment to investigate the relationship between weight and height. In his experiment, 8 male college students were randomly selected and the data are tabulated below:

Height (cm)    Weight (kg)
140            51
163            68
171            66
156            59
149            56
136            49
137            52
151            64

56
Exercise

A firm has the following data on the age of a piece of machinery and how much it costs to repair.

Age of Machine (years)    Cost of Repairs (RM ‘000)
1                         2
2                         2.5
3                         3.1
4                         3.8
5                         4.6
6                         5.4
7                         6.3

57
