ML Lec8
Lecture 8: Linear Model for Regression
► Batch methods
  ► Ordinary least squares (OLS)
  ► Maximum likelihood estimates
► Sequential methods
  ► Least mean squares (LMS)
  ► Recursive (sequential) least squares (RLS)
Problem Setup
What is regression analysis? Given pairs of inputs and outputs, regression analysis finds a function f which associates x with y, such that we can make a prediction about y when a new input x is provided.
Regression
► Regression aims at modeling the dependence of a response Y on a covariate X. In other words, the goal of regression is to predict the value of one or more continuous target variables y given the value of input vector x.
► The regression model is described by y = f(x) + ε.
► Terminology:
  ► x: input, independent variable, predictor, regressor, covariate
  ► y: output, dependent variable, response
► The dependence of a response on a covariate is captured via a conditional probability distribution, p(y | x).
► Depending on f(x),
  ► Linear regression with basis functions: f(x) = w^T φ(x).
  ► Linear regression with kernels.

Regression Function: Conditional Mean
We consider the mean squared error and find the MMSE estimate:
f*(x) = argmin_f E[(y - f(x))^2] = E[y | x].
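That the conditional mean is the MMSE solution follows from a standard one-step decomposition, sketched here:

```latex
\mathbb{E}\big[(y - f(x))^2\big]
  = \underbrace{\mathbb{E}\big[(y - \mathbb{E}[y \mid x])^2\big]}_{\text{irreducible noise}}
  + \mathbb{E}\big[\big(\mathbb{E}[y \mid x] - f(x)\big)^2\big],
```

where the cross term vanishes by the tower property of conditional expectation. The first term does not depend on f; the second is minimized (to zero) by choosing f(x) = E[y | x].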
Linear Regression
The model is linear in the parameters w:
y(x, w) = w_0 + Σ_j w_j φ_j(x) = w^T φ(x),
where φ_0(x) = 1 and the φ_j are fixed basis functions.
Basis Functions
► Polynomial regression: φ_j(x) = x^j.
► Gaussian basis functions: φ_j(x) = exp(-(x - μ_j)^2 / (2s^2)).
► Spline basis functions: piecewise polynomials (divide the input space up into regions and fit a different polynomial in each region).
► Many other possible basis functions: sigmoidal basis functions, hyperbolic tangent basis functions, Fourier basis, wavelet basis, and so on.
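The polynomial and Gaussian expansions above can be sketched in NumPy as follows (a minimal illustration; the function names are mine, not from the lecture):

```python
import numpy as np

def polynomial_design_matrix(x, degree):
    """Phi[n, j] = x_n**j for j = 0..degree (column j = 0 is the bias)."""
    return np.vander(x, N=degree + 1, increasing=True)

def gaussian_design_matrix(x, centers, s):
    """Phi[n, j] = exp(-(x_n - mu_j)**2 / (2 s**2)), plus a bias column."""
    radial = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * s**2))
    return np.hstack([np.ones((len(x), 1)), radial])

x = np.linspace(0.0, 1.0, 5)
print(polynomial_design_matrix(x, 2).shape)                          # (5, 3)
print(gaussian_design_matrix(x, np.linspace(0, 1, 4), s=0.2).shape)  # (5, 5)
```

Either design matrix can then be plugged into the least-squares machinery below; only φ changes, the estimator does not.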
Ordinary Least Squares
Loss function view
Least Squares Method
Given a set of training data {(x_n, y_n)}, n = 1, ..., N, we determine the weight vector w which minimizes
E(w) = (1/2) Σ_n (y_n - w^T φ(x_n))^2.
Find the estimate ŵ such that ŵ = argmin_w E(w). Solve ∇_w E(w) = 0 for w.
Note that
∇_w E(w) = -Σ_n (y_n - w^T φ(x_n)) φ(x_n).
Note that E(w) = (1/2) ‖y - Φw‖^2, where Φ is the N×M design matrix whose n-th row is φ(x_n)^T.
Therefore, ∇_w E(w) = 0 leads to the normal equation that
is of the form Φ^T Φ w = Φ^T y.
Then, we have
ŵ = (Φ^T Φ)^{-1} Φ^T y = Φ^† y,
where Φ^† is the Moore-Penrose pseudo-inverse of Φ.
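A short NumPy sketch of the closed-form solution (synthetic data of my own choosing): the normal equation can be solved directly, though in practice an SVD-based least-squares routine is numerically safer for ill-conditioned Φ.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
x = rng.uniform(0, 1, N)
Phi = np.vander(x, N=4, increasing=True)        # cubic polynomial features
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = Phi @ w_true + 0.1 * rng.standard_normal(N)

# Normal equation: (Phi^T Phi) w = Phi^T y
w_normal = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# Equivalent, numerically safer: least squares via SVD (pseudo-inverse)
w_lstsq, *_ = np.linalg.lstsq(Phi, y, rcond=None)

print(np.allclose(w_normal, w_lstsq))  # True
```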
Least Squares
Probabilistic model view with MLE

Maximum Likelihood
We consider a linear model where the target variable y_n is assumed to be generated by a deterministic function f(x_n, w) = w^T φ(x_n) with additive Gaussian noise:
y_n = w^T φ(x_n) + ε_n,
for n = 1, ..., N and ε_n ~ N(0, σ^2).
In a compact form, we have
y = Φw + ε, ε ~ N(0, σ^2 I).
The MLE is given by maximizing the log-likelihood
ln p(y | w, σ^2) = -(1/(2σ^2)) Σ_n (y_n - w^T φ(x_n))^2 - (N/2) ln(2πσ^2),
leading to
w_ML = (Φ^T Φ)^{-1} Φ^T y,
which coincides with the least-squares solution. The MLE of the noise variance is the mean squared residual, σ^2_ML = (1/N) Σ_n (y_n - w_ML^T φ(x_n))^2.
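The equivalence can be checked numerically; a minimal sketch on synthetic data of my own choosing, computing both w_ML and the noise-variance MLE:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 200
x = rng.uniform(-1, 1, N)
Phi = np.column_stack([np.ones(N), x, x**2])     # quadratic features
w_true, sigma_true = np.array([0.5, 1.0, -2.0]), 0.3
y = Phi @ w_true + sigma_true * rng.standard_normal(N)

# MLE for w: identical to the least-squares / normal-equation solution
w_ml = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# MLE for the noise variance: mean squared residual
sigma2_ml = np.mean((y - Phi @ w_ml) ** 2)
print(w_ml)       # close to w_true
print(sigma2_ml)  # close to sigma_true**2 = 0.09
```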
Sequential Methods
LMS and RLS
Online Learning
[Figure: schematic of online (sequential) learning; source: Wikipedia]

Mean Squared Error (MSE)
Least Mean Squares (LMS)
LMS is a gradient-descent method which minimizes the instantaneous squared error
E_n(w) = (1/2) (y_n - w^T φ(x_n))^2.
The gradient descent method leads to the updating rule for w that is of the form
w_{n+1} = w_n + η (y_n - w_n^T φ(x_n)) φ(x_n),
where η > 0 is the step size.

Recursive Least Squares (RLS)
We introduce the forgetting factor λ ∈ (0, 1] to de-emphasize old samples, leading to the following error function
E_n(w) = (1/2) Σ_{i=1}^n λ^{n-i} (y_i - w^T φ(x_i))^2,
where λ = 1 recovers ordinary least squares over all samples seen so far. Solving ∇_w E_n(w) = 0 for w_n leads to
w_n = ( Σ_{i=1}^n λ^{n-i} φ(x_i) φ(x_i)^T )^{-1} Σ_{i=1}^n λ^{n-i} y_i φ(x_i).
We define
P_n = ( Σ_{i=1}^n λ^{n-i} φ(x_i) φ(x_i)^T )^{-1}.
Applying the matrix inversion lemma to P_n yields the recursive updates
k_n = P_{n-1} φ(x_n) / (λ + φ(x_n)^T P_{n-1} φ(x_n)),
w_n = w_{n-1} + k_n (y_n - w_{n-1}^T φ(x_n)),
P_n = (P_{n-1} - k_n φ(x_n)^T P_{n-1}) / λ.
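The two sequential updates above can be sketched side by side in NumPy (synthetic data and hyperparameter values are my own choices, not from the lecture):

```python
import numpy as np

rng = np.random.default_rng(2)
N, d = 500, 3
Phi = np.column_stack([np.ones(N), rng.standard_normal((N, d - 1))])
w_true = np.array([1.0, -0.5, 2.0])
y = Phi @ w_true + 0.1 * rng.standard_normal(N)

# --- LMS: one gradient step per sample on the instantaneous error ---
eta = 0.05                       # step size; must be small enough for stability
w_lms = np.zeros(d)
for phi_n, y_n in zip(Phi, y):
    e_n = y_n - w_lms @ phi_n    # instantaneous prediction error
    w_lms += eta * e_n * phi_n   # w <- w + eta * e_n * phi(x_n)

# --- RLS with forgetting factor lam, via the recursion above ---
lam = 0.99
w_rls = np.zeros(d)
P = 1e3 * np.eye(d)              # P_0: large initial value (weak prior)
for phi_n, y_n in zip(Phi, y):
    k = P @ phi_n / (lam + phi_n @ P @ phi_n)    # gain vector k_n
    w_rls += k * (y_n - w_rls @ phi_n)           # weight update
    P = (P - np.outer(k, phi_n) @ P) / lam       # covariance update

print(w_lms, w_rls)  # both should approach w_true
```

RLS converges in far fewer samples than LMS at the cost of O(d^2) work per update versus O(d) for LMS.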