ML Lecture Linear Regression 3
Bayesian Model Comparison (4)

For a given model with a single parameter, $w$, consider the approximation

$$p(\mathcal{D}) \simeq p(\mathcal{D}\mid w_{\mathrm{MAP}})\,\frac{\Delta w_{\mathrm{posterior}}}{\Delta w_{\mathrm{prior}}},$$

where the posterior is assumed to be sharply peaked. Taking logarithms,

$$\ln p(\mathcal{D}) \simeq \ln p(\mathcal{D}\mid w_{\mathrm{MAP}}) + \ln\!\left(\frac{\Delta w_{\mathrm{posterior}}}{\Delta w_{\mathrm{prior}}}\right).$$

Note that the second term is negative, since the posterior is narrower than the prior: it penalizes model complexity. Too simple a model under-fits the data, too complex a model over-fits it, and the evidence favours models of intermediate complexity.
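As a concrete illustration, here is a minimal Python sketch of this approximation, assuming the standard $M$-parameter extension in which all parameters share the same width ratio $\Delta w_{\mathrm{posterior}}/\Delta w_{\mathrm{prior}}$; the function name and the numbers in the usage lines are hypothetical.

```python
import numpy as np

def approx_log_evidence(log_lik_at_map, width_ratio, n_params=1):
    """ln p(D) ~ ln p(D|w_MAP) + n_params * ln(width_ratio).

    log_lik_at_map : best-fit log likelihood ln p(D|w_MAP)
    width_ratio    : Delta w_posterior / Delta w_prior (< 1), so the
                     second term is negative and penalizes complexity
    n_params       : number of parameters, all assumed to share the ratio
    """
    return log_lik_at_map + n_params * np.log(width_ratio)

# A flexible model fits better (-40 > -55) but pays a larger complexity
# penalty; the evidence can still favour the simpler model.
print(approx_log_evidence(-55.0, 0.1, n_params=2))   # about -59.6
print(approx_log_evidence(-40.0, 0.1, n_params=10))  # about -63.0
```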
Outline
The Evidence Approximation (4)*
Example: sinusoidal data, $M$th-degree polynomial. [Figure: model evidence $\ln p(\mathbf{t}\mid\alpha,\beta)$ plotted against the polynomial order $M$.]
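As a sketch of how such a comparison can be carried out in practice, the following Python snippet evaluates the log evidence $\ln p(\mathbf{t}\mid\alpha,\beta) = \tfrac{M}{2}\ln\alpha + \tfrac{N}{2}\ln\beta - E(\mathbf{m}_N) - \tfrac12\ln|\mathbf{A}| - \tfrac{N}{2}\ln 2\pi$ for polynomial models of increasing degree fitted to synthetic sinusoidal data; the data-generation settings and the fixed values of $\alpha$ and $\beta$ are illustrative assumptions, not taken from the slide.

```python
import numpy as np

def log_evidence(Phi, t, alpha, beta):
    """Log marginal likelihood ln p(t | alpha, beta) of a linear basis model."""
    N, M = Phi.shape
    A = alpha * np.eye(M) + beta * Phi.T @ Phi         # posterior precision
    m_N = beta * np.linalg.solve(A, Phi.T @ t)          # posterior mean
    E = 0.5 * beta * np.sum((t - Phi @ m_N) ** 2) + 0.5 * alpha * m_N @ m_N
    return (0.5 * M * np.log(alpha) + 0.5 * N * np.log(beta)
            - E - 0.5 * np.linalg.slogdet(A)[1] - 0.5 * N * np.log(2 * np.pi))

# Synthetic sinusoidal data (illustrative settings)
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 20)
t = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.3, 20)

# Evidence for polynomial degrees 0..9; it typically peaks at an
# intermediate degree rather than at the most flexible model.
for degree in range(10):
    Phi = np.vander(x, degree + 1, increasing=True)     # columns 1, x, x^2, ...
    print(degree, round(log_evidence(Phi, t, alpha=5e-3, beta=11.1), 2))
```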
Maximizing the Evidence Function (1)*
To maximise $p(\mathbf{t}\mid\alpha,\beta)$ w.r.t. $\alpha$ and $\beta$, we define the eigenvector equation

$$\left(\beta\boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi}\right)\mathbf{u}_i = \lambda_i\mathbf{u}_i.$$

Thus the posterior precision

$$\mathbf{A} = \alpha\mathbf{I} + \beta\boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi}$$

has eigenvalues $\lambda_i + \alpha$.
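A small numerical sanity check of this relationship (a sketch; the design matrix and the values of $\alpha$ and $\beta$ here are arbitrary):

```python
import numpy as np

# The eigenvector equation (beta * Phi^T Phi) u_i = lambda_i u_i implies that
# A = alpha*I + beta*Phi^T Phi has the same eigenvectors, with eigenvalues
# shifted to lambda_i + alpha.
rng = np.random.default_rng(1)
Phi = rng.normal(size=(50, 4))      # arbitrary N x M design matrix
alpha, beta = 0.5, 2.0

lam = np.linalg.eigvalsh(beta * Phi.T @ Phi)              # lambda_i (ascending)
A = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi     # posterior precision
print(np.allclose(np.linalg.eigvalsh(A), lam + alpha))    # True
```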
Maximizing the Evidence Function (2)*
Working with the log evidence, the derivatives are

$$\frac{\partial}{\partial\alpha}\ln p(\mathbf{t}\mid\alpha,\beta) = \frac{M}{2\alpha} - \frac{1}{2}\mathbf{m}_N^{\mathrm T}\mathbf{m}_N - \frac{1}{2}\sum_i \frac{1}{\lambda_i+\alpha}$$

$$\frac{\partial}{\partial\beta}\ln p(\mathbf{t}\mid\alpha,\beta) = \frac{N}{2\beta} - \frac{1}{2}\sum_{n=1}^{N}\left\{t_n - \mathbf{m}_N^{\mathrm T}\boldsymbol{\phi}(\mathbf{x}_n)\right\}^2 - \frac{1}{2\beta}\sum_i \frac{\lambda_i}{\lambda_i+\alpha}.$$
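A brief worked step connecting these derivatives to the results on the next slide (the rearrangement itself is not shown on the slide): setting the $\alpha$ derivative to zero gives

$$0 = \frac{M}{2\alpha} - \frac{1}{2}\mathbf{m}_N^{\mathrm T}\mathbf{m}_N - \frac{1}{2}\sum_i \frac{1}{\lambda_i+\alpha}
\;\;\Longrightarrow\;\;
\alpha\,\mathbf{m}_N^{\mathrm T}\mathbf{m}_N = M - \sum_i \frac{\alpha}{\lambda_i+\alpha} = \sum_i \frac{\lambda_i}{\lambda_i+\alpha} \equiv \gamma,$$

so that $\alpha = \gamma/(\mathbf{m}_N^{\mathrm T}\mathbf{m}_N)$; the same manipulation applied to the $\beta$ derivative gives $1/\beta = \frac{1}{N-\gamma}\sum_n \{t_n - \mathbf{m}_N^{\mathrm T}\boldsymbol{\phi}(\mathbf{x}_n)\}^2$.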
Maximizing the Evidence Function (3)*
We can now differentiate $\ln p(\mathbf{t}\mid\alpha,\beta)$ w.r.t. $\alpha$ and $\beta$, and set the results to zero, to get

$$\alpha = \frac{\gamma}{\mathbf{m}_N^{\mathrm T}\mathbf{m}_N}, \qquad \frac{1}{\beta_{\mathrm{MAP}}} = \frac{1}{N-\gamma}\sum_{n=1}^{N}\left\{t_n - \mathbf{m}_N^{\mathrm T}\boldsymbol{\phi}(\mathbf{x}_n)\right\}^2,$$

where

$$\gamma = \sum_i \frac{\lambda_i}{\alpha + \lambda_i}.$$

Note that $\gamma$ depends on both $\alpha$ and $\beta$. Recall also that $\mathbf{m}_N = \beta\mathbf{S}_N\boldsymbol{\Phi}^{\mathrm T}\mathbf{t}$ depends on $\alpha$ and $\beta$, so these equations are solved iteratively: update $\mathbf{m}_N$ and $\gamma$, re-estimate $\alpha$ and $\beta$, and repeat until convergence.
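The iterative scheme can be written compactly in code. The following Python sketch (the function name and convergence settings are assumptions, not from the slide) alternates between updating the posterior and re-estimating $\alpha$ and $\beta$:

```python
import numpy as np

def reestimate_hyperparameters(Phi, t, alpha=1.0, beta=1.0, n_iters=200, tol=1e-8):
    """Maximize the evidence p(t|alpha, beta) by iterative re-estimation.

    Phi : (N, M) design matrix, t : (N,) targets.
    Returns alpha, beta, the posterior mean m_N, and gamma.
    """
    N, M = Phi.shape
    eig = np.linalg.eigvalsh(Phi.T @ Phi)             # eigenvalues of Phi^T Phi
    for _ in range(n_iters):
        lam = beta * eig                               # lambda_i of beta*Phi^T Phi
        A = alpha * np.eye(M) + beta * Phi.T @ Phi     # posterior precision
        m_N = beta * np.linalg.solve(A, Phi.T @ t)     # m_N = beta * S_N * Phi^T t
        gamma = np.sum(lam / (lam + alpha))            # well-determined parameters
        alpha_new = gamma / (m_N @ m_N)
        beta_new = (N - gamma) / np.sum((t - Phi @ m_N) ** 2)
        converged = abs(alpha_new - alpha) < tol and abs(beta_new - beta) < tol
        alpha, beta = alpha_new, beta_new
        if converged:
            break
    return alpha, beta, m_N, gamma
```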
Effective Number of Parameters (1)*
[Figure: contours of the likelihood and the prior in $(w_1, w_2)$ space, with axes aligned to the eigenvectors of $\beta\boldsymbol{\Phi}^{\mathrm T}\boldsymbol{\Phi}$.]

When $\lambda_2 \gg \alpha$, $\lambda_2(\lambda_2+\alpha)^{-1} \approx 1$: $w_2$ is well determined by the likelihood.

When $\lambda_1 \ll \alpha$, $\lambda_1(\lambda_1+\alpha)^{-1} \approx 0$: $w_1$ is less disturbed from the prior and remains close to zero.

Hence $\gamma = \sum_i \lambda_i/(\lambda_i+\alpha)$ is the effective number of well-determined parameters.
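The per-direction ratios can be inspected directly; a minimal sketch (the helper name is an assumption):

```python
import numpy as np

def well_determined_ratios(Phi, alpha, beta):
    """Return lambda_i / (lambda_i + alpha) per eigen-direction and their sum gamma.

    Ratios near 1 correspond to parameters pinned down by the likelihood;
    ratios near 0 correspond to parameters left close to the prior.
    """
    lam = beta * np.linalg.eigvalsh(Phi.T @ Phi)
    ratios = lam / (lam + alpha)
    return ratios, ratios.sum()
```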
Effective Number of Parameters (2)*
Example: sinusoidal data, 9 Gaussian basis functions, $\beta = 11.1$ (the true value).
Effective Number of Parameters (3)*
Example: sinusoidal data, 9 Gaussian basis functions, $\beta = 11.1$ (the true value). [Figure: the individual parameters $w_i$ plotted against the effective number of parameters $\gamma$, for $0 \le \gamma \le 10$.]
Effective Number of Parameters (5)*
In the limit $N \gg M$, $\gamma = M$, and we can consider using the easy-to-compute approximations

$$\alpha = \frac{M}{2 E_W(\mathbf{m}_N)}, \qquad \beta = \frac{N}{2 E_D(\mathbf{m}_N)},$$

where $E_W(\mathbf{m}_N) = \tfrac12\mathbf{m}_N^{\mathrm T}\mathbf{m}_N$ and $E_D(\mathbf{m}_N) = \tfrac12\sum_{n=1}^{N}\left\{t_n - \mathbf{m}_N^{\mathrm T}\boldsymbol{\phi}(\mathbf{x}_n)\right\}^2$.
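A sketch of the resulting shortcut (the function name is assumed), using the definitions of $E_W$ and $E_D$ above:

```python
import numpy as np

def easy_hyperparameters(Phi, t, m_N):
    """Approximate alpha and beta when N >> M (so gamma ~ M and N - gamma ~ N)."""
    N, M = Phi.shape
    E_W = 0.5 * m_N @ m_N                          # weight (prior) energy
    E_D = 0.5 * np.sum((t - Phi @ m_N) ** 2)       # data (error) energy
    return M / (2.0 * E_W), N / (2.0 * E_D)
```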
Limitations of Fixed Basis Functions
Using $M$ basis functions along each dimension of a $D$-dimensional input space requires $M^D$ basis functions: the curse of dimensionality. For example, $M = 10$ basis functions per dimension in $D = 10$ dimensions already requires $10^{10}$ basis functions.
In later chapters, we shall see how we can
get away with fewer basis functions, by
choosing these using the training data.