
CSE343/CSE543/ECE363/ECE563: Machine Learning Sec A (Monsoon 2024)

Quiz - 4

Date of Examination: 12/12/2024 Duration: 40 mins Total Marks: 8

Instructions –

• Attempt all questions.


• MCQs may have multiple correct options.
• State any assumptions you have made clearly.
• Standard institute plagiarism policy holds.
• No evaluation without suitable justification.
• 0 marks if either the option or the justification of an MCQ is incorrect.

1. [1 mark] The Gaussian Mixture Model (GMM) and the k-means algorithm are closely related; the latter is a special case
of GMM. The likelihood of a GMM with Z denoting the latent components can typically be expressed as
P(X) = Σ_Z P(X|Z) P(Z),

where P (X|Z) is the (multivariate) Gaussian likelihood conditioned on the mixture component, and P (Z) is the prior on
the components.
Such a likelihood formulation can also be used to describe a k-means clustering model. Which of the following statements
is/are true? Choose all correct options if there are multiple ones.

(A) P (Z) is uniform in k-means but this is not necessarily true in GMM.
(B) The values in the covariance matrix in P (X|Z) tend towards zero in k-means, but this is not so in GMM.
(C) The values in the covariance matrix in P (X|Z) tend towards infinity in k-means, but this is not so in GMM.
(D) The covariance matrix in P (X|Z) in k-means is diagonal, but this is not necessarily the case in GMM.

Correct Answer: A and B. 1 mark for correct answer and correct reason.
Explanation:
(A): In k-means, P(Z) is uniform because all clusters are equally likely, while in GMM, P(Z) is determined by the mixing
coefficients, which are not necessarily uniform.
(B): In k-means, the covariance matrix values conceptually tend to zero because it assumes clusters are concentrated at
their centers. In GMM, the covariance matrix explicitly models the spread and orientation of clusters and does not shrink
to zero.
Options (C) and (D) are incorrect because the covariance does not tend to infinity in k-means, and k-means does not
explicitly use a diagonal covariance matrix.
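
To see the limiting relationship concretely, here is a minimal sketch (Python/NumPy; the helper name responsibilities is ours, purely for illustration). It computes the GMM posterior P(Z | x) under a uniform prior and a shared spherical covariance σ²I, and shows the soft responsibilities hardening into nearest-centre (k-means) assignments as σ → 0:

import numpy as np

def responsibilities(X, centers, sigma):
    # Posterior P(Z = k | x) for a GMM with uniform prior and shared
    # spherical covariance sigma^2 * I (the uniform prior cancels out).
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    log_p = -d2 / (2 * sigma**2)
    log_p -= log_p.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)

X = np.array([[1.9, 0.97], [5.74, 3.84], [3.0, 3.0]])
centers = np.array([[1.886, 1.528], [4.178, 3.778]])

for sigma in (2.0, 0.5, 0.05):
    print(sigma, responsibilities(X, centers, sigma).round(3))
# As sigma -> 0 each row approaches a one-hot vector: every point is
# assigned entirely to its nearest centre, exactly as in k-means.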

2. [1 mark] Which of the following can act as possible termination conditions in K-Means?

1. For a fixed number of iterations.


2. The assignment of observations to clusters does not change between iterations.
3. Centroids do not change between successive iterations.
4. Terminate when RSS (Residual Sum of Squares) falls below a threshold.

(A) 1, 3 and 4
(B) 1, 2 and 3
(C) 1, 2 and 4
(D) All of the above
Correct Answer: D. 1 mark for correct answer and correct reason.
Reason:
All four conditions can serve as termination conditions in K-Means clustering (a combined sketch follows below):
1. A fixed iteration budget bounds the runtime of the algorithm, but the clustering quality may be poor if the budget is too small.
2. Stable assignments produce a good clustering except for cases with a bad local minimum, though runtimes may be unacceptably long.
3. Unchanged centroids likewise indicate that the algorithm has converged to a (local) minimum.
4. Terminating when RSS (Residual Sum of Squares) falls below a threshold ensures the clustering has the desired quality; in practice it is combined with a bound on the number of iterations to guarantee termination.
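
As a minimal sketch (Python/NumPy; the function name kmeans and the parameter rss_tol are our own, and the code assumes no cluster ever becomes empty), the loop below wires all four stopping rules into one implementation:

import numpy as np

def kmeans(X, k, max_iter=100, rss_tol=None, seed=0):
    # Minimal k-means illustrating the four stopping rules discussed above.
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    labels = None
    for it in range(max_iter):                       # rule 1: fixed budget
        d2 = ((X[:, None] - centroids[None]) ** 2).sum(-1)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                                    # rule 2: assignments stable
        labels = new_labels
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):
            break                                    # rule 3: centroids stable
        centroids = new_centroids
        rss = d2[np.arange(len(X)), labels].sum()    # residual sum of squares
        if rss_tol is not None and rss < rss_tol:
            break                                    # rule 4: RSS below threshold
    return labels, centroids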

3. [1 mark] Which of the following is NOT a desirable property of a distance measure in clustering?

(A) Symmetry
(B) Positivity
(C) Triangle inequality
(D) Dependency on labels

Correct Answer: D: Dependency on labels, 1 mark for correct answer and correct reason.
Reason:
Clustering is unsupervised: a distance measure must be computable from the features alone, so dependency on labels contradicts the setting. Symmetry, positivity, and the triangle inequality, by contrast, are the standard metric axioms that make a distance measure well behaved.
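
A quick numerical spot-check (Python/NumPy, purely illustrative) of the three metric axioms for Euclidean distance; note that none of the checks involve labels:

import numpy as np

rng = np.random.default_rng(0)
d = lambda a, b: np.linalg.norm(a - b)

for _ in range(1000):
    a, b, c = rng.standard_normal((3, 2))            # three random 2-D points
    assert d(a, b) == d(b, a)                        # symmetry
    assert d(a, b) >= 0                              # positivity
    assert d(a, c) <= d(a, b) + d(b, c) + 1e-12      # triangle inequality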

4. [1 mark] In which of the following cases will K-Means clustering fail to give good results?

1. Data points with outliers


2. Data points with different densities
3. Data points with round shapes
4. Data points with non-convex shapes

Options:

A. 1 and 2
B. 2 and 3
C. 2 and 4
D. 1, 2 and 4
E. 1, 2, 3 and 4

Correct Answer: D: 1, 2 and 4, 1 mark for correct answer and correct reason.
Reason:
The K-Means clustering algorithm fails to give good results when the data contains outliers (which drag centroids away from cluster centres), when the density of points differs across the data space, and when the clusters have non-convex shapes: the nearest-centroid rule always induces convex (Voronoi) cells. An illustration of the non-convex case follows below.
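
As a sketch of the non-convex failure mode (assuming scikit-learn is available; make_moons and KMeans are standard scikit-learn utilities), k-means on the "two moons" data splits each crescent in half instead of recovering the two moons:

from sklearn.cluster import KMeans
from sklearn.datasets import make_moons

X, true_labels = make_moons(n_samples=200, noise=0.05, random_state=0)
pred = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Agreement with the true moons is far from perfect because the crescents
# are non-convex while k-means can only carve out convex (Voronoi) cells.
agreement = max((pred == true_labels).mean(), (pred != true_labels).mean())
print(f"label agreement: {agreement:.2f}")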
Figure 1: Question 5

Data # x y
1 1.90 0.97
2 1.76 0.84
3 2.32 1.63
4 2.31 2.09
5 1.14 2.11
6 5.02 3.02
7 5.74 3.84
8 2.25 3.47
9 4.71 3.60
10 3.17 4.96

Table 1: ⟨x, y⟩ Pairs

5. [4 marks] Suppose you are given the following ⟨x, y⟩ pairs. You will simulate the k-means algorithm and Gaussian Mixture
Models (GMM) learning algorithm to identify two clusters in the data.
Suppose you are given the initial assignment of cluster centres as:

Cluster 1: #1, Cluster 2: #10.

The first data point is used as the centre for the first cluster, and the 10th data point is used as the centre for the second
cluster.
1. Please simulate the k-means algorithm (k = 2) for one iteration. What are the cluster assignments after one iteration?
Assume k-means uses Euclidean distance.
2. What are the cluster assignments until convergence?
Correct Answer:
Initialization:
Number of clusters K = 2; centroid of Cluster0: C0 = (1.90, 0.97); centroid of Cluster1: C1 = (3.17, 4.96). We use Euclidean distance to assign each point to its closest centroid.
Record Number (x, y) Dist. to C0 (1.90, 0.97) Dist. to C1 (3.17, 4.96) Assign to Cluster
1 (1.90, 0.97) dist(1, C0) = 0.00 dist(1, C1) = 4.19 Cluster0
2 (1.76, 0.84) dist(2, C0) = 0.19 dist(2, C1) = 4.35 Cluster0
3 (2.32, 1.63) dist(3, C0) = 0.78 dist(3, C1) = 3.44 Cluster0
4 (2.31, 2.09) dist(4, C0) = 1.19 dist(4, C1) = 3.00 Cluster0
5 (1.14, 2.11) dist(5, C0) = 1.37 dist(5, C1) = 3.50 Cluster0
6 (5.02, 3.02) dist(6, C0) = 3.73 dist(6, C1) = 2.68 Cluster1
7 (5.74, 3.84) dist(7, C0) = 4.79 dist(7, C1) = 2.80 Cluster1
8 (2.25, 3.47) dist(8, C0) = 2.52 dist(8, C1) = 1.75 Cluster1
9 (4.71, 3.60) dist(9, C0) = 3.85 dist(9, C1) = 2.05 Cluster1
10 (3.17, 4.96) dist(10, C0) = 4.19 dist(10, C1) = 0.00 Cluster1

Table 2: After Iteration 1 (1 mark)

Thus, we obtain two clusters (0.5 marks each):


Cluster0 {1, 2, 3, 4, 5} and Cluster1 {6, 7, 8, 9, 10}.
For the updated clusters, we recompute the centroids:

C0 = ((1.90 + 1.76 + 2.32 + 2.31 + 1.14)/5, (0.97 + 0.84 + 1.63 + 2.09 + 2.11)/5) = (1.886, 1.528)
C1 = ((5.02 + 5.74 + 2.25 + 4.71 + 3.17)/5, (3.02 + 3.84 + 3.47 + 3.60 + 4.96)/5) = (4.178, 3.778)

Part 2 (2 marks): Run until convergence.

Total iterations required: 2 or 3.
Final cluster assignments:

Record Number (x, y) Dist. to C0 (1.886, 1.528) Dist. to C1 (4.178, 3.778) Assign to Cluster
1 (1.90, 0.97) dist(1, C0) = 0.56 dist(1, C1) = 3.62 Cluster0
2 (1.76, 0.84) dist(2, C0) = 0.70 dist(2, C1) = 3.81 Cluster0
3 (2.32, 1.63) dist(3, C0) = 0.45 dist(3, C1) = 2.84 Cluster0
4 (2.31, 2.09) dist(4, C0) = 0.70 dist(4, C1) = 2.52 Cluster0
5 (1.14, 2.11) dist(5, C0) = 0.95 dist(5, C1) = 3.47 Cluster0
6 (5.02, 3.02) dist(6, C0) = 3.47 dist(6, C1) = 1.13 Cluster1
7 (5.74, 3.84) dist(7, C0) = 4.49 dist(7, C1) = 1.56 Cluster1
8 (2.25, 3.47) dist(8, C0) = 1.98 dist(8, C1) = 1.95 Cluster1
9 (4.71, 3.60) dist(9, C0) = 3.50 dist(9, C1) = 0.56 Cluster1
10 (3.17, 4.96) dist(10, C0) = 3.66 dist(10, C1) = 1.55 Cluster1

Table 3: After Convergence (1 mark)

Final Centroids (0.5 marks each):

C0 = ((1.90 + 1.76 + 2.32 + 2.31 + 1.14)/5, (0.97 + 0.84 + 1.63 + 2.09 + 2.11)/5) = (1.886, 1.528)
C1 = ((5.02 + 5.74 + 2.25 + 4.71 + 3.17)/5, (3.02 + 3.84 + 3.47 + 3.60 + 4.96)/5) = (4.178, 3.778)

Since the assignments do not change between successive iterations, these centroids are identical to those after iteration 1, and the algorithm has converged.
2 marks for simulating the k-means algorithm (k = 2) for one iteration, and 2 marks for the cluster assignments until convergence.
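
The whole simulation can be reproduced with a short script (a minimal sketch in Python/NumPy, not part of the official solution):

import numpy as np

# The ten <x, y> pairs from Table 1; initial centres are point #1
# (Cluster0) and point #10 (Cluster1), as specified in the question.
X = np.array([
    [1.90, 0.97], [1.76, 0.84], [2.32, 1.63], [2.31, 2.09], [1.14, 2.11],
    [5.02, 3.02], [5.74, 3.84], [2.25, 3.47], [4.71, 3.60], [3.17, 4.96],
])
centroids = X[[0, 9]].copy()

for it in range(1, 10):
    d = np.linalg.norm(X[:, None] - centroids[None], axis=-1)  # 10x2 distances
    labels = d.argmin(axis=1)                                  # nearest-centre rule
    new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(2)])
    print(f"iteration {it}: assignments {labels}")
    print(f"centroids:\n{new_centroids.round(3)}")
    if np.allclose(new_centroids, centroids):                  # converged
        break
    centroids = new_centroids

# Iteration 1 already yields Cluster0 = {1..5} and Cluster1 = {6..10};
# the assignments then stay fixed, so the loop terminates after 2 iterations.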
