E0 270: Machine Learning (Spring 2025)
Department of CSA, IISc
Instructor: Prof. Ambedkar Dukkipati
Practice Problem Sheet
Problem 1: Generative Classification Model
Consider a generative classification model for K classes defined by prior class
probabilities p(Ck) = πk and general class-conditional densities p(ϕ|Ck). Suppose
we are given a dataset {ϕn, tn} where tn follows a 1-of-K coding scheme.
(a) Maximum Likelihood Estimation
Show that the MLE for the prior probabilities is

    \pi_k = \frac{N_k}{N},

where Nk is the number of data points in class Ck. (For a related problem, see
Question 8 of https://people.eecs.berkeley.edu/~jrs/189/exam/mids14.pdf.)
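
As a quick numerical sanity check of this result (not part of the derivation), the
sketch below estimates the priors from synthetic 1-of-K labels; the names K, N,
and T are illustrative choices, not fixed by the problem.

import numpy as np

rng = np.random.default_rng(0)
K, N = 3, 1000
labels = rng.integers(0, K, size=N)   # class index for each data point
T = np.eye(K)[labels]                 # 1-of-K (one-hot) coding, shape (N, K)

N_k = T.sum(axis=0)                   # class counts N_k = sum_n t_nk
pi_mle = N_k / N                      # the claimed MLE: pi_k = N_k / N
print(pi_mle, pi_mle.sum())           # estimated priors sum to 1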
(b) Gaussian Class-Conditional Model
Assume p(ϕ|Ck) is Gaussian with shared covariance:
p(ϕ|Ck) = N(ϕ|µk, Σ).
Show that:
    \mu_k = \frac{1}{N_k} \sum_{n=1}^{N} t_{nk}\, \phi_n, \qquad
    \Sigma = \sum_{k=1}^{K} \frac{N_k}{N}\, S_k,

where S_k = \frac{1}{N_k} \sum_{n=1}^{N} t_{nk} (\phi_n - \mu_k)(\phi_n - \mu_k)^\top
is the covariance matrix of the data in class Ck.
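
A minimal sketch of these estimators on toy data, assuming NumPy and synthetic
two-class Gaussian features; all variable names are illustrative, not part of the
problem statement.

import numpy as np

rng = np.random.default_rng(1)
K, N, D = 2, 500, 2
labels = rng.integers(0, K, size=N)
T = np.eye(K)[labels]                                  # one-hot targets, shape (N, K)
Phi = rng.normal(size=(N, D)) + 3.0 * labels[:, None]  # toy class-shifted features

N_k = T.sum(axis=0)                                    # class counts
mu = (T.T @ Phi) / N_k[:, None]                        # mu_k = (1/N_k) sum_n t_nk phi_n

# Pooled covariance: Sigma = sum_k (N_k / N) S_k, with S_k the
# within-class covariance of class k.
Sigma = np.zeros((D, D))
for k in range(K):
    Xc = Phi[labels == k] - mu[k]
    S_k = (Xc.T @ Xc) / N_k[k]
    Sigma += (N_k[k] / N) * S_k
print(mu)
print(Sigma)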
Problem 2: Weighted Least Squares
Given a dataset where each observation has an associated weight rn > 0, the
sum-of-squares error function is:

    E_D(w) = \frac{1}{2} \sum_{n=1}^{N} r_n \left( t_n - w^\top \phi(x_n) \right)^2.
Find the optimal w⋆ that minimizes this function.
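
Setting the gradient of ED(w) to zero yields a set of weighted normal equations;
the sketch below solves them numerically on synthetic data, which can serve as a
check against whatever closed form you derive. The names Phi, A, and b are
illustrative.

import numpy as np

rng = np.random.default_rng(2)
N, M = 200, 3
Phi = rng.normal(size=(N, M))                # design matrix, rows phi(x_n)^T
w_true = np.array([1.0, -2.0, 0.5])
t = Phi @ w_true + 0.1 * rng.normal(size=N)  # noisy targets
r = rng.uniform(0.5, 2.0, size=N)            # per-point weights r_n > 0

# Stationarity gives (Phi^T R Phi) w = Phi^T R t with R = diag(r_n); solve it:
A = Phi.T @ (r[:, None] * Phi)
b = Phi.T @ (r * t)
w_star = np.linalg.solve(A, b)
print(w_star)                                # should be close to w_true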
Problem 3: Noisy Labels in Classification
For a binary classification task with label noise, each sample xn has a probability
πn of being assigned label tn = 1. Given a probabilistic model p(t = 1|ϕ), derive
the appropriate log-likelihood function.
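
One way to sanity-check a derivation here: assuming the expected log-likelihood
takes the soft cross-entropy form \sum_n [\pi_n \ln y_n + (1 - \pi_n) \ln(1 - y_n)]
with yn = p(t = 1|ϕn), the sketch below evaluates it and confirms that setting
πn ∈ {0, 1} recovers the usual binary log-likelihood. The function name is
hypothetical.

import numpy as np

def noisy_log_likelihood(y, pi, eps=1e-12):
    # y:  model probabilities y_n = p(t=1 | phi_n)
    # pi: probability that sample n carries label t_n = 1
    y = np.clip(y, eps, 1 - eps)  # guard against log(0)
    return np.sum(pi * np.log(y) + (1 - pi) * np.log(1 - y))

y = np.array([0.9, 0.2, 0.7])
pi = np.array([1.0, 0.0, 0.5])    # pi_n in {0, 1} recovers the noiseless case
print(noisy_log_likelihood(y, pi))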
Problem 4: Information Theory and Gaussian Distributions
(a) Entropy of a Multivariate Gaussian: Show that the entropy of a Gaussian
N(x|µ, Σ) is:

    H[x] = \frac{1}{2} \ln |\Sigma| + \frac{D}{2} \left( 1 + \ln(2\pi) \right).
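
A numerical check of this closed form against SciPy's built-in entropy for a
frozen multivariate normal; the covariance below is a random SPD matrix chosen
only for illustration.

import numpy as np
from scipy.stats import multivariate_normal

D = 3
rng = np.random.default_rng(3)
A = rng.normal(size=(D, D))
Sigma = A @ A.T + D * np.eye(D)   # random symmetric positive-definite covariance

closed_form = 0.5 * np.log(np.linalg.det(Sigma)) + 0.5 * D * (1 + np.log(2 * np.pi))
print(closed_form, multivariate_normal(cov=Sigma).entropy())  # should agree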
(b) KL Divergence Between Two Gaussians: Compute the Kullback-Leibler
divergence between:
p(x) = N(x|µ1, Σ1), q(x) = N(x|µ2, Σ2).
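
Assuming the standard closed form
KL(p‖q) = \frac{1}{2}\left[ \ln\frac{|\Sigma_2|}{|\Sigma_1|} - D
+ \operatorname{tr}(\Sigma_2^{-1}\Sigma_1)
+ (\mu_2 - \mu_1)^\top \Sigma_2^{-1} (\mu_2 - \mu_1) \right],
the sketch below implements it as a numerical check; it should return 0 when p = q.

import numpy as np

def gauss_kl(mu1, S1, mu2, S2):
    # KL(N(mu1, S1) || N(mu2, S2)) via the standard closed form.
    D = mu1.shape[0]
    S2_inv = np.linalg.inv(S2)
    d = mu2 - mu1
    return 0.5 * (np.log(np.linalg.det(S2) / np.linalg.det(S1)) - D
                  + np.trace(S2_inv @ S1) + d @ S2_inv @ d)

mu = np.zeros(2)
S = np.eye(2)
print(gauss_kl(mu, S, mu, S))                         # 0 when p = q
print(gauss_kl(mu, S, np.array([1.0, 0.0]), 2 * S))   # positive otherwise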
(c) Maximum Entropy Distribution: Show that the multivariate distribution
with maximum entropy, for a given covariance, is Gaussian.
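
As a one-dimensional illustration (not a proof), the following compares the
differential entropy of a Gaussian with that of a uniform distribution matched to
the same variance; the Gaussian's is larger, consistent with the claim.

import numpy as np

sigma = 1.5
h_gauss = 0.5 * np.log(2 * np.pi * np.e * sigma**2)   # Gaussian entropy
h_unif = np.log(np.sqrt(12) * sigma)                  # uniform of width sqrt(12)*sigma
print(h_gauss, h_unif, h_gauss > h_unif)              # True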