Unit 4 - Linear Regression

Linear regression with one variable predicts a continuous output value from a single input feature by finding the linear relationship between a dependent (output) variable and an independent (input) variable. The gradient descent algorithm minimizes the cost function to find the optimal parameters θ0 (intercept) and θ1 (slope) that best fit the linear model to the training data: it iteratively updates the parameters in the direction opposite the gradient of the cost function until it reaches a minimum.
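The procedure summarized above can be sketched in a few lines. A minimal illustration, where the dataset (generated from y = 1 + 2x), the learning rate, and the iteration count are all assumptions chosen for the example:

```python
# Minimal sketch: fit h(x) = theta0 + theta1 * x by batch gradient descent.
# The data (from y = 1 + 2x), alpha, and iteration count are assumptions.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
m = len(xs)

theta0, theta1 = 0.0, 0.0
alpha = 0.05  # learning rate

for _ in range(5000):
    # gradients of J = (1/2m) * sum((h(x) - y)^2)
    errs = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    grad0 = sum(errs) / m
    grad1 = sum(e * x for e, x in zip(errs, xs)) / m
    # simultaneous update, stepping opposite the gradient
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

print(theta0, theta1)  # approaches the generating intercept 1 and slope 2
```

With enough iterations the parameters recover the intercept and slope used to generate the data.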


Linear Regression with one Variable/Feature


Supervised Learning

Regression:
- Continuous values as output
- Example: for a house of size = 1100 sq.ft., selling price = ?

Classification:
- Discrete values as output (binary or multiple classes)
- Examples: Email: spam or not spam; Image classification: car, bike, truck, tree
Questions

• Finding birds based on their whistle is a
a) Classification problem
b) Regression problem
c) Clustering problem
d) All of the above

• Finding oilfields, gold mines, or archaeological sites is a
a) Classification problem
b) Regression problem
c) Both
d) None
Regression Model

Dependent vs Independent Variables:

Dependent variable:
- The variable being tested and measured in a scientific experiment.
- This is the main factor that you're trying to understand or predict.

Independent variables:
- The variables that are changed or controlled in a scientific experiment to test their effects on the dependent variable.
- These are the factors that you hypothesize have an impact on your dependent variable.

For example, a scientist wants to see if the brightness of light has any effect on a moth being attracted to the light. The brightness of the light is controlled by the scientist; this is the independent variable. How the moth reacts to the different light levels (distance to the light source) is the dependent variable.
Regression Model

A regression model is an approach to find the relationships between predictors (independent variables) and a response (dependent variable), for example:

y = a + b*x1 + c*x2 + d*x1*x2 + e*x1^2

Regression analysis is a reliable method of identifying which variables (x1 and x2) have an impact on a topic of interest y.

The process of performing a regression allows you to confidently determine which factors matter most, which factors can be ignored, and how these factors influence each other.
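As a concrete reading of the example equation, the sketch below evaluates a model of that form; the coefficient values are assumptions chosen only for illustration:

```python
# Evaluating the example form y = a + b*x1 + c*x2 + d*x1*x2 + e*x1^2.
# The coefficient values here are assumptions, chosen for illustration only.
def predict(x1, x2, a=2.0, b=0.5, c=-1.0, d=0.25, e=0.1):
    # d*x1*x2 is an interaction term: the effect of x1 depends on x2;
    # e*x1^2 adds curvature in x1.
    return a + b * x1 + c * x2 + d * x1 * x2 + e * x1 ** 2

print(predict(2.0, 3.0))  # 2 + 1 - 3 + 1.5 + 0.4 = 1.9 (up to float rounding)
```

The interaction and squared terms are what make this more expressive than a plain line, while the model stays linear in the coefficients a through e.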
Types of Regression Model

Regression models, by number of features and form:

- One feature: Linear, Non-Linear
- Multiple features: Linear, Non-Linear


Linear Regression with one Feature

[Figure: Housing prices. Price (in 1000s of dollars) versus size (feet^2), with a fitted line; the line predicts a price of about 220 (i.e. $220,000) for a house of size 1250 feet^2.]
Supervised Learning: given the "right answer" for each example in the data.
Regression Problem: predict real-valued output.

Training set of housing prices:

Size in feet^2 (x)   Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Notation:
m = number of training examples
x's = "input" variable / features
y's = "output" variable / "target" variable
(x, y) – one training example
(x(i), y(i)) – the i-th training example

For example: x(1) = 2104, x(2) = 1416, y(1) = 460, and (x(2), y(2)) = (1416, 232).
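The notation can be made concrete with the table's values (the variable names below are illustrative):

```python
# The slide's notation, made concrete with the table values.
training_set = [(2104, 460), (1416, 232), (1534, 315), (852, 178)]

m = len(training_set)               # m = number of training examples shown
x = [ex[0] for ex in training_set]  # "input" variable: size in feet^2
y = [ex[1] for ex in training_set]  # "output"/"target" variable: price in $1000s

# x(i), y(i) denote the i-th example (1-indexed on the slide, 0-indexed here):
print(m)             # 4 rows shown (the full set may be larger)
print(x[0], y[0])    # x(1) = 2104, y(1) = 460
print((x[1], y[1]))  # (x(2), y(2)) = (1416, 232)
```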
How do we represent h?

Training Set → Learning Algorithm → hypothesis h
Size of house x → h → estimated price (estimated value)

h maps from x's to y's:

h(x) = θ0 + θ1·x

Linear regression with one variable is also called univariate linear regression ("univariate" = one variable).
How to choose θ0, θ1?

Training set: the same housing data as above (size in feet^2 as x, price in $1000's as y).

Hypothesis: h(x) = θ0 + θ1·x
θ0, θ1: parameters

Examples of different parameter choices:
- θ0 = 1.5, θ1 = 0:   h(x) = 1.5 + 0·x (horizontal line at 1.5)
- θ0 = 0,   θ1 = 0.5: h(x) = 0.5·x (line through the origin)
- θ0 = 1,   θ1 = 0.5: h(x) = 1 + 0.5·x
Cost Function

The squared error function measures how far the predictions h(x(i)) are from the targets y(i):

J(θ0, θ1) = (1/2m) · Σ_{i=1..m} (h(x(i)) - y(i))^2

Idea: choose θ0, θ1 so that h(x(i)) is close to y(i) for our training examples (x(i), y(i)).
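A direct transcription of J as code, with toy data (an assumption) whose points lie exactly on the line y = x, so the best-fitting line has zero cost:

```python
# Direct transcription of J(theta0, theta1) = (1/2m) * sum_i (h(x(i)) - y(i))^2.
# The toy data are an assumption: the points lie exactly on the line y = x.
def cost(theta0, theta1, xs, ys):
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs = [1.0, 2.0, 3.0]
ys = [1.0, 2.0, 3.0]

print(cost(0.0, 1.0, xs, ys))  # 0.0: the line y = x fits perfectly
print(cost(0.0, 0.5, xs, ys))  # positive for any other line
```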
Cost Function Minimization
(Assume θ0 = 0)

With θ0 = 0 the hypothesis reduces to h(x) = θ1·x, a line through the origin, and the cost J(θ1) is a function of the single parameter θ1.

[Figure: left, h(x) versus x for a fixed θ1; right, J(θ1) versus θ1. For training data lying exactly on the line y = x, the cost is minimized at θ1 = 1, where J(1) = 0.]
Cost Function Minimization
(for both parameters)

Hypothesis: h(x) = θ0 + θ1·x
Parameters: θ0, θ1
Cost Function: J(θ0, θ1) = (1/2m) · Σ_{i=1..m} (h(x(i)) - y(i))^2
Goal: minimize J(θ0, θ1) over θ0, θ1
Contour plot: on each slide, the left panel shows h(x) as a function of x for fixed parameters, and the right panel shows J(θ0, θ1) as a contour plot over the parameters. Examples:

- h(x) = 360 + 0·x (θ0 = 360, θ1 = 0)
- h(x) = 500 - 0.05·x (θ0 = 500, θ1 = -0.05)
- h(x) = 100 + 0.13·x (θ0 = 100, θ1 = 0.13)

Parameter pairs nearer the center of the contours (the minimum of J) correspond to lines that fit the training data better.
Gradient Descent Algorithm

Have some function J(θ0, θ1).
Want: min over θ0, θ1 of J(θ0, θ1).

Outline:
• Start with some θ0, θ1
• Keep changing θ0, θ1 to reduce J(θ0, θ1) until we hopefully end up at a minimum

For a single parameter θ1, the update is θ1 := θ1 - α · dJ(θ1)/dθ1:

• If the slope dJ/dθ1 ≥ 0, then θ1 := θ1 - α·(positive number), so θ1 decreases; θ1 is moving towards the local minimum.
• If the slope dJ/dθ1 ≤ 0, then θ1 := θ1 - α·(negative number), so θ1 increases; again, θ1 is moving towards the local minimum.
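Both cases can be checked numerically. A tiny sketch on an assumed one-parameter cost J(t) = (t - 3)^2, whose minimum is at t = 3:

```python
# The sign of the derivative sets the step direction.
# Assumed one-parameter cost: J(t) = (t - 3)^2, minimum at t = 3.
def step(t, alpha=0.1):
    grad = 2 * (t - 3)       # dJ/dt
    return t - alpha * grad  # move opposite the gradient

print(step(5.0))  # grad > 0, so t decreases toward 3
print(step(1.0))  # grad < 0, so t increases toward 3
```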

α: Learning Rate

• If α is too small, gradient descent can be slow.
• If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge, i.e. at every iteration it may step past the local minimum.
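A numeric illustration of both failure modes, using an assumed one-dimensional cost J(t) = t^2 with its minimum at 0:

```python
# Assumed cost J(t) = t^2 (minimum at t = 0), so dJ/dt = 2t.
def run(alpha, steps=20, t=1.0):
    for _ in range(steps):
        t = t - alpha * 2 * t  # one gradient descent step
    return t

small = run(0.01)  # after 20 steps still far from 0: too slow
good = run(0.4)    # converges close to 0
big = run(1.5)     # |t| grows every step: overshoots and diverges

print(small, good, abs(big))
```

With α = 1.5 each step multiplies t by (1 - 3) = -2, so the iterate jumps over the minimum and grows without bound.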
At a local optimum, the derivative of the cost is zero, so:

θ1 := θ1 - α · 0 = θ1

The parameter values will not be changed: gradient descent has converged.
Gradient descent algorithm

":=" denotes assignment, as in a := b.

Simultaneously update θ0 and θ1:

Correct (simultaneous update):
temp0 := θ0 - α · ∂J(θ0, θ1)/∂θ0
temp1 := θ1 - α · ∂J(θ0, θ1)/∂θ1
θ0 := temp0
θ1 := temp1

Incorrect:
temp0 := θ0 - α · ∂J(θ0, θ1)/∂θ0
θ0 := temp0
temp1 := θ1 - α · ∂J(θ0, θ1)/∂θ1   (this gradient already sees the updated θ0)
θ1 := temp1
Gradient descent algorithm for the Linear Regression Model

Repeat until convergence, updating θ0 and θ1 simultaneously:

θ0 := θ0 - α · (1/m) · Σ_{i=1..m} (h(x(i)) - y(i))
θ1 := θ1 - α · (1/m) · Σ_{i=1..m} (h(x(i)) - y(i)) · x(i)

For linear regression, J(θ0, θ1) is a convex (bowl-shaped) function, so it has a single global minimum and gradient descent cannot get stuck in a different local minimum.
[Figure sequence: successive gradient descent iterations, each shown as the current h(x) over the training data (left, a function of x for fixed parameters) and as the corresponding point on the contour plot of J(θ0, θ1) (right, a function of the parameters), converging to the minimum.]
Linear Regression with multiple Features
Gradient Descent Algorithm

With n features, the hypothesis becomes h(x) = θ0 + θ1·x1 + θ2·x2 + … + θn·xn, and the same gradient descent algorithm applies, with one simultaneous update per parameter. Suppose we have 2 features; then h(x) = θ0 + θ1·x1 + θ2·x2.

Polynomial Regression

Polynomial regression reuses the same machinery: powers of a feature (x, x^2, x^3, …) are treated as extra features, so the model stays linear in the parameters θ.
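A sketch of polynomial regression via the same gradient descent, treating x and x^2 as two features; the data (points generated from y = 1 + x^2) and hyperparameters are assumptions for illustration:

```python
# Polynomial regression as linear regression on engineered features x and x^2.
# Data (from y = 1 + x^2) and hyperparameters are assumptions for illustration.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 5.0, 10.0]
m = len(xs)

def h(theta, x):
    return theta[0] + theta[1] * x + theta[2] * x * x

theta = [0.0, 0.0, 0.0]
alpha = 0.02

for _ in range(20000):
    errs = [h(theta, x) - y for x, y in zip(xs, ys)]
    grads = [sum(errs) / m,
             sum(e * x for e, x in zip(errs, xs)) / m,
             sum(e * x * x for e, x in zip(errs, xs)) / m]
    theta = [t - alpha * g for t, g in zip(theta, grads)]  # simultaneous update

print([round(t, 3) for t in theta])  # near [1, 0, 1]
```

Because the model is still linear in θ, nothing in the algorithm changes; only the feature vector did.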
