
Linear Regression

Linear regression involves calculating a least squares regression line to make predictions from explanatory and response variables. Residuals measure the difference between observed and predicted values. Transforming either variable changes the regression equation.


Linear Regression
• In this lesson you will learn:
– How to calculate a least squares regression line and use it to make predictions.
– How to use residuals.
– How a regression equation is affected by a linear transformation of either of the variables.
Linear regression
[Figure: scatter graph of two variables with the least squares regression line]

• At GCSE you drew a line of best fit to describe correlation, which was a bit of a guessing game and a bit hit and miss.
• But now you will learn a slightly more accurate method, ‘the method of least squares’, to calculate the line of regression.
Calculating the Line of Regression
Using the equation of a straight line, $y = mx + c$
• Get the mean of all the x values, $\bar{x}$
• Get the mean of all the y values, $\bar{y}$, and use the following equation (from C1):
$y - \bar{y} = m(x - \bar{x})$
$y = mx + (\bar{y} - m\bar{x})$
where $(\bar{y} - m\bar{x})$ is the intercept
• Plot the point $(\bar{x}, \bar{y})$; this is the only point that we know on the line of regression.
• The only thing to do now is work out the gradient (m).
Find the gradient m
$m = \frac{S_{xy}}{S_{xx}}$
$S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y}) = \sum x_i y_i - \frac{\sum x_i \sum y_i}{n} = \sum x_i y_i - n\bar{x}\bar{y}$
$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} = \sum x_i^2 - n\bar{x}^2$
You need the different forms as problems will be presented in different ways.
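The three forms of each sum can be checked against one another numerically. The sketch below is only an illustration: the function names and the five data points are made up for the example and are not from the lesson.

```python
def s_xy_forms(xs, ys):
    """Return S_xy computed with each of the three equivalent forms."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    deviation_form = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sums_form = sum_xy - sum(xs) * sum(ys) / n
    means_form = sum_xy - n * x_bar * y_bar
    return deviation_form, sums_form, means_form


def s_xx_forms(xs):
    """Return S_xx computed with each of the three equivalent forms."""
    n = len(xs)
    x_bar = sum(xs) / n
    sum_x2 = sum(x * x for x in xs)
    deviation_form = sum((x - x_bar) ** 2 for x in xs)
    sums_form = sum_x2 - sum(xs) ** 2 / n
    means_form = sum_x2 - n * x_bar ** 2
    return deviation_form, sums_form, means_form


# Hypothetical data, chosen only so the arithmetic is easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

sxy = s_xy_forms(xs, ys)
sxx = s_xx_forms(xs)

# Gradient m = S_xy / S_xx; intercept = y_bar - m * x_bar.
gradient = sxy[0] / sxx[0]
intercept = sum(ys) / len(ys) - gradient * sum(xs) / len(xs)

print("S_xy forms:", sxy)
print("S_xx forms:", sxx)
print("gradient:", gradient, "intercept:", intercept)
```

Whichever form a question supplies the totals for, the gradient should come out the same, up to rounding.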
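In a graphics calculator
Set up columns for $x$, $y$, $(x - \bar{x})$, $(y - \bar{y})$ and $(x - \bar{x})(y - \bar{y})$:
$\frac{\sum x}{n} = \bar{x}$, $\frac{\sum y}{n} = \bar{y}$, $\sum (x - \bar{x})(y - \bar{y}) = S_{xy}$
Then set up columns for $x_i$, $(x_i - \bar{x})$ and $(x_i - \bar{x})^2$:
$\sum (x_i - \bar{x})^2 = S_{xx}$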
Task
• Exercise A
– page 127
Explanatory variables
• Which makes more sense to you?
– The sale of ice-cream affects the temperature
– The temperature affects the sales of ice-cream
• Temperature is the ‘Explanatory Variable’ and therefore the one we call x; ice-cream sales is the response variable, y.
Task
• Exercise B
Scaling: you could convert each value or just apply the conversion to the whole equation

Temp (°C), x: 25 26 16 23 24 18 21 18 19 28
Ice-cream sales, y: 305 373 77 162 316 132 184 148 178 402

Line of regression for ice-cream sales:
$y = 26.2x - 344$
If the temperature were given in Fahrenheit (t), the values would need converting or the equation would need rewriting in terms of Fahrenheit (t).
If $t = \frac{9}{5}x + 32$
So $x = \frac{5}{9}(t - 32)$
$\therefore$ the line of regression is:
$y = 26.2 \times \frac{5}{9}(t - 32) - 344 = 14.6t - 809.8$
Task
• Exercise C
– page 130
Residuals
The difference between practice and theory
• A residual is the difference between the observed y-value and the y-value predicted by the line of regression.
• Residuals are shown by a vertical line from each data point to the line of regression.
http://www.math.csusb.edu/faculty/stanton/m262/regress/regress.html

http://www.math.csusb.edu/faculty/stanton/probstat/regression.html

Both of these show how each item of data will have an effect on the line of regression.
Can you work out what it is?
Calculating the residuals
x  y  (x - xmean)  (y - ymean)  (x - xmean)(y - ymean)  (x - xmean)^2
3.60 181.00 1.54 58.60 90.44 2.38
3.25 159.00 1.19 36.60 43.68 1.42
3.50 134.00 1.44 11.60 16.74 2.08
3.00 149.00 0.94 26.60 25.09 0.89
2.50 124.00 0.44 1.60 0.71 0.20
2.00 137.00 -0.06 14.60 -0.83 0.00
2.00 106.00 -0.06 -16.40 0.93 0.00
1.80 121.00 -0.26 -1.40 0.36 0.07
1.60 117.00 -0.46 -5.40 2.47 0.21
1.40 118.00 -0.66 -4.40 2.89 0.43
1.30 102.00 -0.76 -20.40 15.44 0.57
1.50 101.00 -0.56 -21.40 11.91 0.31
1.30 96.00 -0.76 -26.40 19.98 0.57
1.10 101.00 -0.96 -21.40 20.47 0.92
1.00 90.00 -1.06 -32.40 34.24 1.12
Mean 2.06 122.40 Sum 284.51 11.17
Gradient of regression line is: Sxy / Sxx = 284.51 / 11.17 = 25.5
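The column totals can be reproduced from the fifteen (x, y) pairs in the table. This is a minimal sketch with variable names chosen for illustration. It works from unrounded means, so the results match the table's totals only after rounding.

```python
xs = [3.60, 3.25, 3.50, 3.00, 2.50, 2.00, 2.00, 1.80,
      1.60, 1.40, 1.30, 1.50, 1.30, 1.10, 1.00]
ys = [181, 159, 134, 149, 124, 137, 106, 121,
      117, 118, 102, 101, 96, 101, 90]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Sum of the (x - xmean)(y - ymean) column and of the (x - xmean)^2 column.
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)

gradient = s_xy / s_xx

print(f"means: {x_mean:.2f}, {y_mean:.2f}")
print(f"Sxy = {s_xy:.2f}, Sxx = {s_xx:.2f}, gradient = {gradient:.1f}")
```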
Calculating the residuals
y - y1 = m(x - x1)
y = m(x - x1) + y1, with (x1, y1) = (xmean, ymean) = (2.06, 122.40) and m = 25.5
expected y  actual - expected
161.7  19.3
152.8  6.2
159.1  -25.1
146.4  2.6
133.7  -9.7
121.0  16.0
121.0  -15.0
115.9  5.1
110.8  6.2
105.7  12.3
103.1  -1.1
108.2  -7.2
103.1  -7.1
98.0  3.0
95.5  -5.5
Total  0.00
The residuals total zero because we have calculated the regression line using the method of least squares.
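The zero total can be checked numerically. The sketch below recomputes the expected y-values and the residuals (actual minus expected) from the same fifteen data points. The variable names are illustrative, and the line is fitted from unrounded values, so individual figures agree with the table only after rounding.

```python
xs = [3.60, 3.25, 3.50, 3.00, 2.50, 2.00, 2.00, 1.80,
      1.60, 1.40, 1.30, 1.50, 1.30, 1.10, 1.00]
ys = [181, 159, 134, 149, 124, 137, 106, 121,
      117, 118, 102, 101, 96, 101, 90]

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
m = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
     / sum((x - x_mean) ** 2 for x in xs))

# Expected y from y = m(x - x1) + y1 with (x1, y1) = (x_mean, y_mean).
expected = [m * (x - x_mean) + y_mean for x in xs]

# Residual = observed (actual) y minus expected y.
residuals = [y - e for y, e in zip(ys, expected)]

for e, r in zip(expected, residuals):
    print(f"{e:6.1f} {r:6.1f}")
print(f"Total {sum(residuals):.2f}")
```

Because the line passes through the point (xmean, ymean), the positive and negative residuals cancel, apart from floating-point error.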
TASK
• Exercise D
– Page 132
Calculating the residuals
Can you think of any problems we might have with our line of regression?
What can we do to improve our line of regression?
Tip: you may have already been told what to do with a line of best fit at GCSE maths.
Linear Regression
• The main points of the lesson were:
– How to calculate a least squares regression line and use it to make predictions.
– How to use residuals.
– How a regression equation is affected by a linear transformation of either of the variables.
