Lecture 04
An Overview
• Introduction to Regression
• Types of Regression
• Key Concepts
• Applications
• Conclusion
Function Approximation
Many models could be used – the simplest is linear regression
Fit data with the best hyper-plane which "goes through" the points
For each point, the difference between the predicted value and the actual
observation is the residual
School of Energy Science & Engineering
Linear Regression
Linear regression is like fitting a line or (hyper)plane to a set of points
x    y
2    50
3    60
5    80
7    90
8    95
To find the values of the coefficients (weights) that minimize the objective function,
we take the partial derivatives of the objective function (SSE) with respect to the
coefficients, set these to 0, and solve:
β1 = (n∑xy − ∑x∑y) / (n∑x² − (∑x)²)

β0 = (∑y − β1∑x) / n
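These closed-form formulas can be checked directly on the small data set above; a minimal sketch in plain Python (the variable names are mine, not from the slide):

```python
# Closed-form simple linear regression using the summation formulas above
x = [2, 3, 5, 7, 8]
y = [50, 60, 80, 90, 95]
n = len(x)

sx, sy = sum(x), sum(y)
sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)

beta1 = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
beta0 = (sy - beta1 * sx) / n                      # intercept

print(beta0, beta1)  # 37.5 7.5
```

So the fitted line is y = 37.5 + 7.5x, which predicts 52.5 at x = 2 and 97.5 at x = 8, consistent with the data.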
Remember that for regression we use an output node which is not thresholded (it just
computes a linear sum) and iteratively apply the delta rule – thus the net is the output.
What are the new weights after one iteration through the following training set using
the delta rule with a learning rate c = 1? How does it generalize for the novel input
(−0.3, 0)?
x1     x2     Target y
0.5   -0.2    1
1      1      0
Initial Setup
Initial Weights: w1=1, w2=1
Learning Rate: c=1
Input (x1, x2) | Target (t) | Predicted output net = w1x1 + w2x2 | Error (t − net) | Δw1 = c(t − net)x1 | Δw2 = c(t − net)x2 | Updated w1 | Updated w2
Initial weights | | | | | | 1.00 | 1.00
(0.5, −0.2) | 1 | 0.5·1 + (−0.2)·1 = 0.3 | 1 − 0.3 = 0.7 | 0.7·0.5 = 0.35 | 0.7·(−0.2) = −0.14 | 1 + 0.35 = 1.35 | 1 − 0.14 = 0.86
(1, 1) | 0 | 1.35·1 + 0.86·1 = 2.21 | 0 − 2.21 = −2.21 | −2.21·1 = −2.21 | −2.21·1 = −2.21 | 1.35 − 2.21 = −0.86 | 0.86 − 2.21 = −1.35
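The worked example above can be reproduced with a short script; this is a minimal sketch of the delta rule for a linear (unthresholded) unit, using the initial weights and learning rate from the slide:

```python
# Delta rule for a linear unit: w_i <- w_i + c * (t - net) * x_i
w = [1.0, 1.0]  # initial weights w1, w2
c = 1.0         # learning rate

# Training set from the slide: ((x1, x2), target)
data = [((0.5, -0.2), 1.0), ((1.0, 1.0), 0.0)]

for x, t in data:
    net = w[0] * x[0] + w[1] * x[1]  # linear output, no threshold
    err = t - net
    w = [w[0] + c * err * x[0], w[1] + c * err * x[1]]

print(w)  # final weights after one epoch, approx [-0.86, -1.35]

# Generalization for the novel input (-0.3, 0)
print(w[0] * -0.3 + w[1] * 0.0)  # approx 0.258
```

After one pass the weights are w1 = −0.86, w2 = −1.35, and the novel input (−0.3, 0) yields a net (and hence output) of about 0.258.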
x1     x2     Target
0.3    0.8    0.7
-0.3   1.6   -0.1
0.9    0      1.3
[Figure: mouse weight vs. obesity; Obese = 1.0, Not Obese = 0.]
R² compares a measure of a good fit, SS(fit),
to a measure of a bad fit, SS(mean)
y = 1 / (1 + e^(−x))

[Figure: logistic curve of the probability a mouse is obese vs. weight; the curve goes from 0 (Not Obese) to 1 (Obese).]
Log(odds)
The odds in favor of my team winning the game are 5 to 3:

log(odds) = log(p / (1 − p)) = log((5/8) / (1 − 5/8)) = log(5/3)
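The 5-to-3 example works out as follows (a quick check in Python; the natural log is used, matching the worked values later in the lecture):

```python
import math

# Odds of 5 to 3 in favor mean p = 5/8 (5 wins out of 8 outcomes)
p = 5 / 8
odds = p / (1 - p)            # = 5/3
log_odds = math.log(odds)     # natural log of the odds

print(odds, log_odds)         # approx 1.667 and 0.511
```

Note that the odds (5/3) and the probability (5/8) are different numbers describing the same situation.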
[Figure: probability of obesity vs. weight; Not Obese = 0.]

Log(odds of obesity) = log(odds) = log(p / (1 − p))
[Figure: log(odds of obesity) vs. weight.]

log(0.5 / (1 − 0.5)) = 0
log(0.731 / (1 − 0.731)) = 1
log(0.88 / (1 − 0.88)) = 2
log(0.95 / (1 − 0.95)) = 3
The coefficients of the line in logistic regression:

Log(odds of obesity) = −3.48 + 1.83 × weight
p = e^(log(odds)) / (1 + e^(log(odds)))
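Putting the fitted line and the back-transformation together, the probability for any weight can be computed; a minimal sketch (the weight value 2 here is just an illustration, not from the slide):

```python
import math

def prob_obese(weight):
    # Fitted line from the slide: log(odds of obesity) = -3.48 + 1.83 * weight
    log_odds = -3.48 + 1.83 * weight
    # Back-transform log(odds) to a probability
    return math.exp(log_odds) / (1 + math.exp(log_odds))

print(prob_obese(2))  # approx 0.545
```

For weight = 2 the log(odds) is 0.18, giving a probability of about 0.54, just above the 0.5 decision boundary.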
(a) Calculate the log(likelihood) of all the given data points when fitting line X
(b) Calculate the log(likelihood) of all the given data points when fitting line Y
(c) Which line can be considered the best fitting line for the above scenario and why?
p = e^(log(odds)) / (1 + e^(log(odds)))
We can call this LL(fit), the log-likelihood of the fitted line,
and use it as a substitute for SS(fit):

LL(fit) = −2.1813
We need a measure of a poorly fitted line
that is analogous to SS(mean)
log(odds) = log(no. of obese mice / no. of mice not obese) = log(5/4) = 0.22

p = e^0.22 / (1 + e^0.22) = 0.56
R² = (LL(mean) − LL(fit)) / LL(mean) = (−6.18 − (−2.1813)) / (−6.18) ≈ 0.647
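This pseudo-R² (McFadden style) is a one-liner given the two log-likelihoods quoted on the slide:

```python
# Pseudo R^2 from the two log-likelihoods on the slide
ll_mean = -6.18     # log-likelihood of the "mean" model, the analogue of SS(mean)
ll_fit = -2.1813    # log-likelihood of the fitted line, LL(fit)

r2 = (ll_mean - ll_fit) / ll_mean
print(r2)  # approx 0.647
```

As with ordinary R², a value near 1 means the fitted line explains the data far better than the mean model.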
Numerical
Calculate the log-likelihood of the data given the best-fitting
squiggle for malignant tumours. Then calculate R².
Malignant Non-Malignant
0.45 0.001
0.9 0.002
0.91 0.005
0.95 0.2
0.99 0.34
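One way to set up the computation – a sketch, assuming each listed value is the squiggle's predicted probability of malignancy for that tumour, and that the "mean" model predicts the overall malignant fraction (5 of 10 tumours, i.e. 0.5):

```python
import math

# Predicted probability of malignancy from the best-fitting squiggle
p_malignant = [0.45, 0.9, 0.91, 0.95, 0.99]        # truly malignant tumours
p_nonmalignant = [0.001, 0.002, 0.005, 0.2, 0.34]  # truly non-malignant tumours

# Log-likelihood of the fit: log(p) for malignant, log(1 - p) for non-malignant
ll_fit = (sum(math.log(p) for p in p_malignant)
          + sum(math.log(1 - p) for p in p_nonmalignant))

# "Mean" model: every tumour gets the overall malignant fraction, 5/10 = 0.5
p_mean = 0.5
ll_mean = 5 * math.log(p_mean) + 5 * math.log(1 - p_mean)

r2 = (ll_mean - ll_fit) / ll_mean
print(ll_fit, ll_mean, r2)  # approx -1.706, -6.931, 0.754
```

Under these assumptions LL(fit) ≈ −1.706, LL(mean) ≈ −6.931, and R² ≈ 0.75, so the squiggle fits the tumour data much better than the mean model.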