Set3sol 2022
Lecture 14 to 21
Problem Set 3
5. What extra conditions on z do we need to obtain the correct objective for k-means clustering?
$z_{i,j} \in \{0,1\}$, $\sum_{j=1}^{k} z_{i,j} = 1$ [ac: this one; see the sketch below]
$z_{i,j} \in [0,1]$, $\sum_{j=1}^{k} z_{i,j} = 1$
$z_{i,j} \in [0,1]$, $\sum_{j=1}^{k} z_{i,j} = 0$
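For context, a hedged sketch of the kind of objective that Equation (1) presumably refers to (the standard k-means loss written with assignment variables $z_{i,j}$; the exact notation in the problem set may differ):
$$
L_k \;=\; \min_{\mu_1,\dots,\mu_k}\; \min_{z}\; \sum_{i=1}^{N} \sum_{j=1}^{k} z_{i,j}\,\lVert x_i - \mu_j \rVert^2 ,
\qquad z_{i,j} \in \{0,1\}, \quad \sum_{j=1}^{k} z_{i,j} = 1 .
$$
With $z_{i,j} \in \{0,1\}$ and $\sum_{j=1}^{k} z_{i,j} = 1$, every point is hard-assigned to exactly one centroid, which is the k-means objective. Relaxing to $z_{i,j} \in [0,1]$ allows fractional (soft) assignments, and requiring $\sum_{j=1}^{k} z_{i,j} = 0$ would force all $z_{i,j} = 0$ and make the loss trivially zero.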
6. Consider the loss in Equation (1) with the appropriate conditions on $z_{i,j}$ specified in the previous question. The statements below are all true except one. Which of the following statements is false?
For $2 \leq k < N$ and with the initial means randomly chosen as $k$ data points, the k-means algorithm with $k$ clusters is not guaranteed to reach the optimal $L_k$ loss value.
For $2 \leq k < N$, $L_k$ is computationally hard to compute.
For $k \geq N$, $L_k = 0$.
The sequence $(L_k)_{1 \leq k \leq N}$ is not necessarily strictly decreasing. [ac: this one is false. Since all the data points are distinct, starting from an optimal solution with $k$ means we can always strictly improve the loss by adding $\hat\mu_{k+1} = x_i$ for some $x_i \neq \hat\mu_j$ for all $1 \leq j \leq k$: the contribution of $x_i$ drops to zero while no other term increases; see the sketch below.]
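A short sketch of the strict-decrease argument, assuming the data points $x_1,\dots,x_N$ are distinct and $\hat\mu_1,\dots,\hat\mu_k$ achieve the optimal loss $L_k$: pick any $x_i$ that is not already one of the means and add it as the new centroid $\hat\mu_{k+1} = x_i$. Then
$$
L_{k+1} \;\le\; \sum_{i'} \min_{1\le j\le k+1} \lVert x_{i'} - \hat\mu_j \rVert^2
\;\le\; \sum_{i' \ne i} \min_{1\le j\le k} \lVert x_{i'} - \hat\mu_j \rVert^2 + \underbrace{\lVert x_i - \hat\mu_{k+1} \rVert^2}_{=\,0}
\;<\; L_k ,
$$
where the first inequality holds because the optimal $(k+1)$-mean loss is at most the loss of this particular choice of means, and the last inequality is strict because $x_i \neq \hat\mu_j$ for all $j \le k$, so its contribution to $L_k$ is strictly positive.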
7. You are given the data in $\mathbb{R}^2$ illustrated in the following figure, which you want to cluster into an inner ring and an outer ring (hence a number of clusters $k = 2$). Which of the following statements is/are correct?
(a) There exists some initialization such that k-means clustering succeeds.
(b) There exists an appropriate feature expansion such that k-means (with standard initialization)
succeeds.
(c) There exists an appropriate feature expansion such that the Expectation Maximization algorithm
(with standard initialization) for a Gaussian Mixture Model succeeds.
Only a and c
Only a
All of them
Only b and c [ac: this one; see the sketch after this list]
None of them
Only a and b
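A minimal sketch of option (b), assuming NumPy and scikit-learn are available; the two-ring dataset below is synthetic and only meant to illustrate the idea. Appending the squared radius as an extra feature makes the two rings separable by k-means, whereas on the raw coordinates the k = 2 partition is cut by a hyperplane and mixes the rings regardless of initialization.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Two concentric rings in R^2 (radii ~1 and ~3, with small noise).
def make_ring(radius, n, noise=0.05):
    angles = rng.uniform(0, 2 * np.pi, n)
    r = radius + noise * rng.standard_normal(n)
    return np.column_stack([r * np.cos(angles), r * np.sin(angles)])

X = np.vstack([make_ring(1.0, 200), make_ring(3.0, 200)])
y = np.array([0] * 200 + [1] * 200)

# Raw features: k-means with k=2 splits the plane with a hyperplane,
# so it cannot separate the inner ring from the outer ring.
labels_raw = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Feature expansion: append x1^2 + x2^2 (the squared radius).
# In this space the two rings sit at different "heights" and become separable.
X_expanded = np.column_stack([X, (X ** 2).sum(axis=1)])
labels_exp = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_expanded)

def accuracy(labels):
    # Cluster labels are arbitrary, so take the better of the two matchings.
    acc = (labels == y).mean()
    return max(acc, 1 - acc)

print("raw features      :", accuracy(labels_raw))  # rings mixed
print("expanded features :", accuracy(labels_exp))  # rings recovered (~1.0)
```

The same expanded features also make the Gaussian Mixture Model fit by Expectation Maximization succeed (option c), since each ring becomes a compact blob in the expanded space.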
2 Open Questions
1. Describe the main ideas behind autoencoders.
2. Illustrate the AdaBoost algorithm, explain its purpose, and describe in which cases it is useful.
3. State whether the following statements about the Principal Component Analysis (PCA) procedure are true or false. Motivate your answers.
(a) The set of Principal Component vectors provides an orthonormal basis for the original feature space.
(b) Using the projections of the original features onto the principal components provided by PCA as features for regression/classification problems reduces the phenomenon of overfitting.
(c) The percentage of the variance explained by a Principal Component is inversely proportional to
the value of the corresponding eigenvalue.
(d) PCA might get stuck in local optima, so trying multiple random initializations might help.
(e) Even if all the input features have similar scales, we should still perform mean normalization (so
that each feature has zero mean) before running PCA.
[ac: (a) TRUE: PCA looks for the direction along which the dataset has the most variance and extracts the (first) Principal Component (PC) as the unit vector identifying that direction. Iteratively, it finds the direction with the most variance among those orthogonal to all previous ones and extracts the next PC. The process is repeated for a number of PCs equal to the number of dimensions of the dataset, producing an orthonormal basis for the original feature space. (b) FALSE: only by selecting the first K PCs does one have a chance to remove noise from the data and avoid overfitting. If we keep all the PCs, we obtain a linear transformation of the original dataset, which is likely to perform as well as the original one on the supervised task. (c) FALSE: the variance explained by each PC is directly proportional to the corresponding eigenvalue. (d) FALSE: there is no source of stochasticity in the PCA procedure, so using multiple initializations would not produce different results. (e) TRUE: the procedure assumes the points are centered at the origin (have zero mean). If one of the components had an average value far from zero, it would bias the direction of the first eigenvector towards that dimension.]
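A minimal numerical sketch of the points above, assuming NumPy and a toy synthetic dataset: PCA via eigendecomposition of the covariance matrix of the mean-centered data. The eigenvectors form an orthonormal basis (point a), the variance explained by each component is directly proportional to its eigenvalue (point c), the procedure is deterministic so there is nothing to re-initialize (point d), and the mean-centering step is what point (e) refers to.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data in R^3 with different variances along each axis.
X = rng.standard_normal((500, 3)) @ np.diag([3.0, 1.0, 0.3])

# 1) Mean normalization: center each feature at zero (point e).
X_centered = X - X.mean(axis=0)

# 2) Eigendecomposition of the covariance matrix (deterministic, point d).
cov = np.cov(X_centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Sort components by decreasing eigenvalue.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Point (a): the principal components are an orthonormal basis of R^3.
assert np.allclose(eigenvectors.T @ eigenvectors, np.eye(3), atol=1e-8)

# Point (c): the fraction of variance explained by each PC is proportional
# to its eigenvalue.
explained_ratio = eigenvalues / eigenvalues.sum()
print("explained variance ratios:", explained_ratio)

# Keeping only the first K components projects onto a lower-dimensional
# subspace (point b); keeping all of them is just a change of basis.
K = 2
X_projected = X_centered @ eigenvectors[:, :K]
print("projected data shape:", X_projected.shape)
```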