ML Lecture Linear Regression 3

Learning Objectives

1. How do we perform linear regression using basis functions?
2. What are the relationships between maximum likelihood and least squares, between maximum a posteriori and regularization, and among expected loss, bias, variance, and noise?
3. What are the common regularization methods for regression?
4. How do we perform Bayesian linear regression?
5. What is the kernel for regression?
6. How do we choose the model complexity?
7. What are the evidence approximation and evidence maximization?
Outline

• Linear Basis Function Models
• Maximum Likelihood and Least Squares
• Bias-Variance Decomposition
• Bayesian Linear Regression
• Predictive Distribution
• Bayesian Model Comparison
• Evidence Approximation and Maximization
Bayesian Model Comparison (1)
• How do we choose the ‘right’ model?
• Assume we want to compare models M_i, i = 1, …, L, using data D; this requires computing the posterior

  p(M_i|D) ∝ p(M_i) p(D|M_i)

  i.e. posterior ∝ prior × model evidence (marginal likelihood).

• Bayes factor: the ratio of the evidences for two models, p(D|M_i) / p(D|M_j).
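A minimal numerical sketch (not part of the lecture) of how these quantities combine: given hypothetical log evidences ln p(D|M_i) and a prior over models, compute the posterior model probabilities and a Bayes factor. All numbers below are made up.

import numpy as np

# Hypothetical log evidences ln p(D|M_i) for three candidate models
log_evidence = np.array([-105.2, -98.7, -101.3])
log_prior = np.log(np.ones(3) / 3)               # uniform prior p(M_i)

# p(M_i|D) ∝ p(M_i) p(D|M_i); normalize in log space for numerical stability
log_post = log_prior + log_evidence
log_post -= np.logaddexp.reduce(log_post)
posterior = np.exp(log_post)

# Bayes factor comparing model 1 against model 0
bayes_factor = np.exp(log_evidence[1] - log_evidence[0])
print(posterior, bayes_factor)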


Bayesian Model Comparison (2)
• Having computed the posteriors p(M_i|D), we can compute the predictive (mixture) distribution

  p(t|x, D) = Σ_i p(t|x, M_i, D) p(M_i|D)

• A simpler approximation, known as model selection, is to use only the single model with the highest evidence.
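A brief sketch contrasting the two options, assuming each model object exposes a (hypothetical) predictive_density(t, x) method and that the posterior probabilities p(M_i|D) have already been computed as above.

def predictive_mixture(t, x, models, model_posterior):
    # p(t|x, D) = sum_i p(t|x, M_i, D) p(M_i|D)
    return sum(p_Mi * m.predictive_density(t, x)
               for m, p_Mi in zip(models, model_posterior))

def select_model(models, model_posterior):
    # Model selection: keep only the single most probable model
    best = max(range(len(models)), key=lambda i: model_posterior[i])
    return models[best]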
Bayesian Model Comparison (3)
• For a model with parameters w, we get the model evidence by marginalizing over w:

  p(D|M_i) = ∫ p(D|w, M_i) p(w|M_i) dw

• Note that the evidence is precisely the normalizing constant in Bayes’ theorem for the parameter posterior:

  p(w|D, M_i) = p(D|w, M_i) p(w|M_i) / p(D|M_i)
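One way to make the marginalization concrete is a plain Monte Carlo estimate that averages the likelihood over samples from the prior; this is an illustrative sketch (not the lecture's method) and is only practical when w is low-dimensional.

import numpy as np

def log_evidence_mc(log_likelihood, sample_prior, n_samples=100_000, seed=0):
    # p(D|M) = E_{w ~ p(w|M)}[ p(D|w, M) ], estimated from prior samples
    rng = np.random.default_rng(seed)
    ws = sample_prior(rng, n_samples)            # caller-supplied prior sampler
    log_liks = np.array([log_likelihood(w) for w in ws])
    # log of the sample mean, computed stably in log space
    return np.logaddexp.reduce(log_liks) - np.log(n_samples)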
Bayesian Model Comparison (4)
For a given model with a single parameter w, consider the approximation

  p(D) = ∫ p(D|w) p(w) dw ≈ p(D|w_MAP) Δw_posterior / Δw_prior

where the posterior is assumed to be sharply peaked with width Δw_posterior, and the prior is taken to be flat with width Δw_prior, so that

  p(w) = 1 / Δw_prior
Bayesian Model Comparison (5)
• Taking logarithms, we obtain

  ln p(D) ≈ ln p(D|w_MAP) + ln(Δw_posterior / Δw_prior)

  The second term is negative, since the posterior is narrower than the prior.

• With M parameters, all assumed to have the same ratio Δw_posterior / Δw_prior, we get

  ln p(D) ≈ ln p(D|w_MAP) + M ln(Δw_posterior / Δw_prior)

  The penalty term is negative and linear in M.
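A small numeric illustration of the trade-off (all widths and likelihood values are made up): the fit term improves with M while the Occam penalty M ln(Δw_posterior/Δw_prior) falls linearly, so the approximate log evidence peaks at an intermediate M.

import numpy as np

def approx_log_evidence(log_lik_at_map, M, dw_posterior, dw_prior):
    # ln p(D) ≈ ln p(D|w_MAP) + M ln(Δw_posterior / Δw_prior)
    return log_lik_at_map + M * np.log(dw_posterior / dw_prior)

for M in (1, 3, 6, 9):
    fit = -80.0 + 15.0 * np.log1p(M)   # made-up fit quality, diminishing returns
    print(M, approx_log_evidence(fit, M, dw_posterior=0.05, dw_prior=1.0))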


Bayesian Model Comparison (6)
Matching data and model complexity
[Figure: model evidence for models of different complexity; models that are too simple under-fit, models that are too complex over-fit, and the evidence favours intermediate complexity.]
Outline

• Linear Basis Function Models
• Maximum Likelihood and Least Squares
• Bias-Variance Decomposition
• Bayesian Linear Regression
• Predictive Distribution
• Bayesian Model Comparison
• Evidence Approximation and Maximization*
The Evidence Approximation (1)*
The fully Bayesian predictive distribution is given by

  p(t|𝐭) = ∫∫∫ p(t|w, β) p(w|𝐭, α, β) p(α, β|𝐭) dw dα dβ

but this integral is intractable. Approximate it with

  p(t|𝐭) ≈ p(t|𝐭, α̂, β̂) = ∫ p(t|w, β̂) p(w|𝐭, α̂, β̂) dw

where (α̂, β̂) is the mode of p(α, β|𝐭), which is assumed to be sharply peaked. This approach is also known as empirical Bayes, type II (or generalized) maximum likelihood, or the evidence approximation.
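A minimal sketch of the approximate predictive distribution for this linear-Gaussian model, using the standard Bayesian linear regression posterior N(w | m_N, S_N); the design matrix Phi, target vector t, basis vector phi_x, and the point estimates of α and β are assumed to be given.

import numpy as np

def posterior_w(Phi, t, alpha, beta):
    # p(w|t, α, β) = N(w | m_N, S_N) with S_N⁻¹ = α I + β Φᵀ Φ
    M = Phi.shape[1]
    S_N = np.linalg.inv(alpha * np.eye(M) + beta * Phi.T @ Phi)
    m_N = beta * S_N @ Phi.T @ t
    return m_N, S_N

def predictive(phi_x, m_N, S_N, beta):
    # Gaussian predictive: mean m_Nᵀ φ(x), variance 1/β + φ(x)ᵀ S_N φ(x)
    mean = phi_x @ m_N
    var = 1.0 / beta + phi_x @ S_N @ phi_x
    return mean, var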
The Evidence Approximation (2)*
From Bayes’ theorem we have

  p(α, β|𝐭) ∝ p(𝐭|α, β) p(α, β)

and if we assume p(α, β) to be flat, we see that maximizing the posterior over α and β amounts to maximizing the evidence

  p(𝐭|α, β) = ∫ p(𝐭|w, β) p(w|α) dw

General results for Gaussian integrals give

  ln p(𝐭|α, β) = (M/2) ln α + (N/2) ln β − E(m_N) − (1/2) ln|A| − (N/2) ln 2π
The Evidence Approximation (3)*

Here

  E(m_N) = (β/2) ‖𝐭 − Φ m_N‖² + (α/2) m_Nᵀ m_N

with m_N = β A⁻¹ Φᵀ 𝐭, and A is the posterior precision:

  A = α I + β Φᵀ Φ
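A sketch of this log evidence in code, reusing m_N and A exactly as defined above (Phi is the N×M design matrix, t the target vector). Evaluating it for several candidate models, e.g. different basis sets or polynomial orders, and keeping the maximum reproduces the model-comparison procedure described earlier.

import numpy as np

def log_evidence(Phi, t, alpha, beta):
    N, M = Phi.shape
    A = alpha * np.eye(M) + beta * Phi.T @ Phi           # posterior precision
    m_N = beta * np.linalg.solve(A, Phi.T @ t)           # posterior mean
    E_mN = (0.5 * beta * np.sum((t - Phi @ m_N) ** 2)
            + 0.5 * alpha * (m_N @ m_N))
    _, logdet_A = np.linalg.slogdet(A)
    return (0.5 * M * np.log(alpha) + 0.5 * N * np.log(beta)
            - E_mN - 0.5 * logdet_A - 0.5 * N * np.log(2.0 * np.pi))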
The Evidence Approximation (4)*
• Example: sinusoidal data, Mth-degree polynomial.
[Figure: model evidence plotted against the polynomial order M.]
Maximizing the Evidence Function (1)*
• To maximize the evidence ln p(𝐭|α, β) w.r.t. α and β, we define the eigenvector equation

  (β Φᵀ Φ) u_i = λ_i u_i

• Thus the posterior precision

  A = α I + β Φᵀ Φ

  has eigenvalues λ_i + α.
Maximizing the Evidence Function (2)*

Differentiating the log evidence gives

  ∂ ln p(𝐭|α, β)/∂α = M/(2α) − (1/2) m_Nᵀ m_N − (1/2) Σ_i 1/(λ_i + α)

  ∂ ln p(𝐭|α, β)/∂β = N/(2β) − (1/2) Σ_n {t_n − m_Nᵀ φ(x_n)}² − (1/(2β)) Σ_i λ_i/(λ_i + α)

using d ln|A|/dα = Σ_i 1/(λ_i + α) and d ln|A|/dβ = (1/β) Σ_i λ_i/(λ_i + α).
Maximizing the Evidence Function (3)*
• Setting these derivatives to zero, we get the re-estimation equations

  α = γ / (m_Nᵀ m_N)

  1/β = (1/(N − γ)) Σ_{n=1}^{N} {t_n − m_Nᵀ φ(x_n)}²

  where γ = Σ_i λ_i / (λ_i + α).

• γ depends on both α and β (recall that the λ_i are the eigenvalues of β Φᵀ Φ), so the two equations are solved by iterating to convergence.
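A sketch of that iteration, alternating the two re-estimation equations with γ recomputed from the current α and β; the initial values and iteration count are arbitrary choices, not part of the lecture.

import numpy as np

def maximize_evidence(Phi, t, alpha=1.0, beta=1.0, n_iter=200):
    N, M = Phi.shape
    eig0 = np.linalg.eigvalsh(Phi.T @ Phi)        # eigenvalues of Φᵀ Φ
    for _ in range(n_iter):
        lam = beta * eig0                          # eigenvalues of β Φᵀ Φ
        A = alpha * np.eye(M) + beta * Phi.T @ Phi
        m_N = beta * np.linalg.solve(A, Phi.T @ t)
        gamma = np.sum(lam / (lam + alpha))        # effective number of parameters
        alpha = gamma / (m_N @ m_N)
        beta = (N - gamma) / np.sum((t - Phi @ m_N) ** 2)
    return alpha, beta, gamma, m_N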
Effective Number of Parameters (1)*

[Figure: contours of the likelihood and the prior in w-space, with axes aligned to the eigenvectors u_i of β Φᵀ Φ.]

• w₁ lies along a direction with small eigenvalue (λ₁ ≪ α): it is not well determined by the likelihood, and its posterior value stays close to the prior.
• w₂ lies along a direction with large eigenvalue (λ₂ ≫ α): it is well determined by the likelihood and moves little from the maximum-likelihood value.
• γ = Σ_i λ_i / (λ_i + α) is the number of well-determined parameters: each term is close to 1 when λ_i ≫ α and close to 0 when λ_i ≪ α.
Effective Number of Parameters (2)*
• Example: sinusoidal data, 9 Gaussian basis functions, β fixed at its true value 11.1.
[Figure: quantities used to choose the optimal α plotted against ln α.]
Effective Number of Parameters (3)*
• Example: sinusoidal data, 9 Gaussian basis functions, β fixed at its true value 11.1.
[Figure: test set error plotted against ln α; the evidence-optimal α lies close to the minimum of the test error.]
Effective Number of Parameters (4)*
• Example: sinusoidal data, 9 Gaussian basis functions, β fixed at its true value 11.1.
[Figure: parameter values plotted against the effective number of parameters γ, over the range 0 ≤ γ ≤ 10.]
Effective Number of Parameters (5)*
• In the limit N ≫ M, we have γ = M, and we can consider using the easy-to-compute approximations

  α = M / (2 E_W(m_N)),  β = N / (2 E_D(m_N))

  where E_W(m_N) = (1/2) m_Nᵀ m_N and E_D(m_N) = (1/2) Σ_n {t_n − m_Nᵀ φ(x_n)}².
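A sketch of this large-N shortcut, with E_W and E_D as defined above and m_N taken from a previous fit.

import numpy as np

def alpha_beta_large_N(Phi, t, m_N):
    # Valid when N >> M, so that γ ≈ M
    N, M = Phi.shape
    E_W = 0.5 * (m_N @ m_N)
    E_D = 0.5 * np.sum((t - Phi @ m_N) ** 2)
    return M / (2.0 * E_W), N / (2.0 * E_D)        # α, β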
Limitations of Fixed Basis Functions
• Using M basis functions along each dimension of a D-dimensional input space requires M^D basis functions in total: the curse of dimensionality (see the small count sketched after this list).
• In later chapters, we shall see how we can get away with far fewer basis functions by choosing them using the training data.
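The count below (numbers arbitrary) shows how quickly a full grid of basis functions grows with the input dimension.

M = 10                                  # basis functions per input dimension
for D in (1, 2, 3, 5, 10):
    print(f"D = {D:2d}: {M ** D} basis functions")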
