

Introduction to Machine Learning


Course 3 - Supervised Learning: Linear Regression
4th year Statistics and Data Science

Ayoub Asri

12 February 2025


Section 1

Supervised Learning


Supervised Learning


Supervised Learning Process I

A simplified process can be presented as:


Supervised Learning Process II

After some adjustments, it can be redefined as:


Supervised Learning Process III

The final process is then defined as:


Section 2

Supervised Learning: Linear Regression


Theory 1

The basic intuitive idea behind linear regression is to find a constant linear relationship between the variables.

Ideally, we would find the relationship y = x.


Theory 2

After finding the relationship between y and x, we can use it to estimate the value of a new observation, i.e., one that was not present in the original data and was therefore not used to build the regression line.


Theory 3

The real question is where to place the regression line for real-life data.


Theory 4

Could this line be a good fit?


Theory 5

Or even this one!


Theory 6

The fundamental idea is to minimize the overall distance between the points and the regression line.


Theory 7

This distance (measure) is called the residual error.


Theory 8

The method that provides the best solution to this problem is Ordinary Least Squares (OLS).
N.B. The details can be found in the econometrics course.


Linear Regression: Example

To better understand linear regression, we can use a real-life data set as an example.


Linear Regression: General formulation

For a more general case:


OLS
For each variable of the data set, we associate a coefficient β:

$$\hat{y} = \beta_0 x_0 + \cdots + \beta_n x_n$$

or

$$\hat{y} = \sum_{i=0}^{n} \beta_i x_i$$


OLS solution formulation I

OLS gives an algebraic (closed-form) formulation of the solution, which means the solution is unique.

Univariate case:

For example, for a simple regression problem:

$$y = b_0 + b_1 x$$

The solution is given by:

$$b_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$$

$$b_0 = \bar{y} - b_1 \bar{x}$$
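As an illustration (not from the original slides; the variable names are ours), a minimal sketch of these two formulas in Python with NumPy:

```python
import numpy as np

def ols_univariate(x, y):
    """Closed-form OLS for y = b0 + b1 * x (simple regression)."""
    x_bar, y_bar = x.mean(), y.mean()
    # b1 = sum((x_i - x_bar)(y_i - y_bar)) / sum((x_i - x_bar)^2)
    b1 = np.sum((x - x_bar) * (y - y_bar)) / np.sum((x - x_bar) ** 2)
    # b0 = y_bar - b1 * x_bar
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Tiny usage example on synthetic data:
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])
b0, b1 = ols_univariate(x, y)  # roughly b0 = 0, b1 = 2
```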


OLS solution formulation II

Multivariate case

For the multivariate case, where we have k explanatory variables, the solution is given by:

$$\beta = (\beta_0, \cdots, \beta_k)' = (X'X)^{-1}(X'Y)$$
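A minimal sketch of this formula as well (again ours, not the slides'); np.linalg.solve is applied to the normal equations rather than forming the inverse explicitly, which is the standard numerically stable choice:

```python
import numpy as np

def ols_multivariate(X, y):
    """Closed-form OLS: beta = (X'X)^{-1} X'y.
    X is the m x (k+1) design matrix whose first column is all ones."""
    # Solve (X'X) beta = X'y instead of inverting X'X.
    return np.linalg.solve(X.T @ X, X.T @ y)
```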


OLS: drawbacks!

Since OLS provides a theoretical, algebraic solution, does it always provide a good solution in practice?
What happens to this solution when we have many observations, or many variables?
Will there be an effect?
What is the alternative?


The alternative

The alternative to OLS, for large data sets or in ML more generally, is to use iterative methods.

Iterative methods can easily be implemented and are much easier to handle.
To introduce the iterative solution, we first need to introduce some concepts.


Section 3

Gradient Descent


The cost function

The main goal of this problem is to find the value of β that minimizes the residual error:

$$\sum_{j=1}^{m} (y^j - \hat{y}^j)^2$$

Or we can instead use the mean of the squared errors:

$$\frac{1}{m} \sum_{j=1}^{m} (y^j - \hat{y}^j)^2$$


Minimization Problem

The goal of the problem is then: find the values of β that minimize the mean of the squared residuals.
This is called a minimization problem with respect to a cost function:

$$J(\beta) = \frac{1}{2m} \sum_{j=1}^{m} (y^j - \hat{y}^j)^2$$

(The extra factor of 1/2 is conventional: it simplifies the derivative and does not change the minimizer.)
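As a sketch, the cost function translates directly into Python; the vectorized prediction X @ beta anticipates the matrix notation introduced in the Gradient Descent section below:

```python
import numpy as np

def cost(beta, X, y):
    """J(beta) = (1 / 2m) * sum_j (y_j - y_hat_j)^2."""
    m = len(y)
    y_hat = X @ beta          # predictions for all m observations
    residuals = y - y_hat
    return (residuals ** 2).sum() / (2 * m)
```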


The cost function: simplification

Substituting ŷ, we can rewrite the cost function as:

$$J(\beta) = \frac{1}{2m} \sum_{j=1}^{m} (y^j - \hat{y}^j)^2 = \frac{1}{2m} \sum_{j=1}^{m} \left( y^j - \sum_{i=0}^{n} \beta_i x_i^j \right)^2$$


Derivative of the cost function

To find the minimum, we need to calculate the derivative:

$$\frac{\partial J}{\partial \beta_k}(\beta) = \frac{\partial}{\partial \beta_k} \left[ \frac{1}{2m} \sum_{j=1}^{m} \left( y^j - \sum_{i=0}^{n} \beta_i x_i^j \right)^2 \right] = \frac{1}{m} \sum_{j=1}^{m} \left( y^j - \sum_{i=0}^{n} \beta_i x_i^j \right) (-x_k^j)$$


The cost function 2

The analytical solution is very complex and takes a lot of computing power and time to execute.
We propose instead to determine the solution by Gradient Descent.


Gradient Descent 1
We will present the different steps of the Gradient Descent algorithm applied to this problem.
We start by calculating the derivative:

$$\frac{\partial J}{\partial \beta_k}(\beta) = \frac{1}{m} \sum_{j=1}^{m} \left( y^j - \sum_{i=0}^{n} \beta_i x_i^j \right) (-x_k^j)$$

For simplicity and more general use, we present the matrix form of the partial derivatives:

$$\nabla_\beta J = \begin{pmatrix} \frac{\partial J}{\partial \beta_0} \\ \vdots \\ \frac{\partial J}{\partial \beta_n} \end{pmatrix}$$


Gradient Descent 2

The matrix form of the data is given by:

$$X = \begin{pmatrix} 1 & x_1^1 & x_2^1 & \ldots & x_n^1 \\ 1 & x_1^2 & x_2^2 & \ldots & x_n^2 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_1^m & x_2^m & \ldots & x_n^m \end{pmatrix} \qquad y = \begin{pmatrix} y^1 \\ y^2 \\ \vdots \\ y^m \end{pmatrix} \qquad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_n \end{pmatrix}$$
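Building this design matrix in NumPy amounts to prepending a column of ones to the raw feature matrix. A small sketch (the `features` array here is a hypothetical stand-in for a real data set):

```python
import numpy as np

features = np.random.rand(100, 3)          # hypothetical m x n data (m=100, n=3)
ones = np.ones((features.shape[0], 1))     # column of ones for the intercept beta_0
X = np.hstack([ones, features])            # m x (n + 1) design matrix
```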


Gradient Descent 3

Now, we calculate the gradient in matrix form:

$$\nabla_\beta J = \begin{pmatrix} -\frac{1}{m} \sum_{j=1}^{m} \left( y^j - \sum_{i=0}^{n} \beta_i x_i^j \right) x_0^j \\ \vdots \\ -\frac{1}{m} \sum_{j=1}^{m} \left( y^j - \sum_{i=0}^{n} \beta_i x_i^j \right) x_n^j \end{pmatrix}$$


Gradient Descent 3

We can simplify this form of the gradient:

$$\nabla_\beta J = -\frac{1}{m} \begin{pmatrix} \sum_{j=1}^{m} y^j x_0^j \\ \vdots \\ \sum_{j=1}^{m} y^j x_n^j \end{pmatrix} + \frac{1}{m} \begin{pmatrix} \sum_{j=1}^{m} \left( \sum_{i=0}^{n} \beta_i x_i^j \right) x_0^j \\ \vdots \\ \sum_{j=1}^{m} \left( \sum_{i=0}^{n} \beta_i x_i^j \right) x_n^j \end{pmatrix}$$

Equivalently, using the matrices above: $\nabla_\beta J = -\frac{1}{m} X'(y - X\beta)$.
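This matrix form translates into a one-line NumPy function (a sketch using the X, y, β layout defined above):

```python
import numpy as np

def gradient(beta, X, y):
    """Vectorized gradient of J: -(1/m) * X'(y - X beta)."""
    m = len(y)
    return -(X.T @ (y - X @ beta)) / m
```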


Gradient Descent 4

Q. Can you determine what the unknowns are in the last formula?


Gradient Descent 5

The only unknown is β


Gradient Descent 5

Goal of Gradient Descent

We have to find a method that allows us to “guess” the correct values of β that minimize the cost function, i.e., the values at which the gradient vanishes.


Gradient Descent 6

Given a cost function J(β), how can we computationally search for the value of β that minimizes that function?
What would the search process look like in the case of a single value of β?


Gradient Descent 7

A common answer to the second question is the “common mountain analogy”.


The Common mountain analogy I


The Common mountain analogy II


The Common mountain analogy III


The Common mountain analogy IV


The Common mountain analogy V


The Common mountain analogy VI


Gradient Descent 8

This is exactly what gradient descent does.
It even looks similar in the case of a single-coefficient search.


Example of 1 dimensional cost function 1

This is the case of a regression with only one explanatory variable.


Example of 1 dimensional cost function 2

We start by choosing a starting point.


Example of 1 dimensional cost function 3

Then, we calculate the gradient at that point


Example of 1 dimensional cost function 3

Then we take a step forward, proportional to the negative gradient.
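Written out, the update applied at each step is the standard rule below; the learning-rate symbol α is our notation, as the slides do not name it:

$$\beta_k \leftarrow \beta_k - \alpha \, \frac{\partial J}{\partial \beta_k}(\beta), \qquad \alpha > 0$$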


Example of 1 dimensional cost function 4

Repeat the steps


Example of 1 dimensional cost function 5

Repeat the steps


Example of 1 dimensional cost function 6

What we are essentially doing is following the gradient downhill.

If we keep stepping against the gradient, we will eventually reach the value that minimizes the cost function.


Gradient Descent 8

Some remarks about Gradient Descent:

Since the steps are proportional to the negative of the gradient, steeper slopes at the start give larger steps, and smaller gradients near the end give smaller steps.
For a convex cost function such as ours, this practically assures convergence to the minimum.


Example of 2 dimensional cost function 1


We can apply the same principle to a 2-D cost function (two variables).


Example of 2 dimensional cost function 2


Example of 2 dimensional cost function 3

We can show the contour plot of this solution


Gradient Descent 9

Finally, we can state the Gradient Descent algorithm, which can be applied to any minimization problem:

Gradient descent algorithm
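As a concrete sketch of that algorithm (ours, under the assumptions used earlier: X carries a leading column of ones, and the learning rate and iteration count are illustrative choices, not values from the slides):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, n_iters=1000):
    """Minimize J(beta) by batch gradient descent."""
    beta = np.zeros(X.shape[1])                    # starting point
    for _ in range(n_iters):
        grad = -(X.T @ (y - X @ beta)) / len(y)    # gradient of J at beta
        beta = beta - alpha * grad                 # step along the negative gradient
    return beta

# Usage: beta_hat = gradient_descent(X, y), with X built as in the matrix form above.
```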

