Machine Learning 1 - Programming Assignment 1

This document provides instructions for a machine learning assignment involving clustering algorithms. Students are asked to: 1) Implement k-means and spectral clustering on a dataset, plot the results, and report the accuracy of each. 2) Generate two datasets from Gaussian distributions, use EM to estimate the distributions' parameters without prior knowledge, and use EM and k-means for clustering. Plot and compare the results. Students are provided details on file naming, submitting code and a report, using only specified functions, and avoiding plagiarism. The preferred language is Python. Accurately completing both questions is required.

Uploaded by

Rimpesh Katiyar

Indian Institute of Technology Jodhpur

Machine Learning I: Fractal 2


Programming Assignment 1
Maximum Marks: 30

Instructions:
• Submit a .zip file named "RollNo_PA1.zip". Before compressing, name your
folder "RollNo_PA1".
• Submit a report containing a description of the algorithms, the results, and the evaluation. Name it
"RollNo_PA1.pdf". Also, name the code file for each question "RollNo_PA1_QNo.py".
• Your code should be executable by the command python RollNo_PA1_QNo.py <path to the dataset
file> and should plot the results and print the accuracy.
• Inbuilt functions cannot be used to solve the problems. For example, an inbuilt function for spectral
clustering from any library cannot be used; you must implement it yourself. You may use an inbuilt
function for computing the eigenvalue decomposition of a matrix.
• Plagiarism will not be tolerated and will result in severe consequences if found.
• The preferred programming language is Python.
• Please attempt both questions.
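The command-line requirement above could be met with a small entry point like the following. This is only a sketch: the helper names `load_dataset` and `main` are hypothetical, and it assumes the dataset is a whitespace-separated text file whose last column holds the ground-truth label.

```python
import sys
import numpy as np

def load_dataset(path):
    """Load a whitespace-separated text file (assumed layout:
    feature columns first, ground-truth label in the last column)."""
    data = np.loadtxt(path)
    points = data[:, :-1]
    labels = data[:, -1].astype(int)
    return points, labels

def main(path):
    points, labels = load_dataset(path)
    print(f"loaded {points.shape[0]} points with {points.shape[1]} features")
    # The clustering, plotting, and accuracy printing for the question go here.

# Run only when exactly one argument (the dataset path) is supplied:
#   python RollNo_PA1_QNo.py <path to the dataset file>
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv[1])
```

Keeping the loading step in its own function makes it easy to reuse the same file for both questions.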
1. Implement the k-means and spectral clustering algorithms for clustering the points given in the dataset
http://cs.joensuu.fi/sipu/datasets/jain.txt. Plot the obtained results. To evaluate the
performance of these algorithms, find the percentage of points for which the estimated cluster label is
correct. Report the accuracy of both algorithms. The ground-truth clustering is given in the third
column of the text file. [4+4+2 Marks]
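One possible shape for the pieces of Question 1 is sketched below. Several choices here are assumptions, not requirements of the assignment: the Gaussian similarity with width `sigma`, the unnormalized graph Laplacian, and the random-point initialization of k-means. Because cluster indices are arbitrary, the accuracy helper tries every assignment of predicted cluster ids to ground-truth labels and keeps the best one; it assumes there are no more predicted clusters than true classes.

```python
import itertools
import numpy as np

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid, shape (n, k).
        d2 = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    return labels, centroids

def spectral_embedding(points, k, sigma=1.0):
    """Rows are the k eigenvectors of the unnormalized Laplacian
    with the smallest eigenvalues (sigma is an assumed tuning knob)."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-d2 / (2.0 * sigma ** 2))   # Gaussian similarity
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W         # L = D - W
    _, eigvecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    return eigvecs[:, :k]

def clustering_accuracy(pred, truth):
    """Percentage of correctly labelled points under the best
    one-to-one matching of predicted ids to ground-truth labels."""
    pred_ids = np.unique(pred)
    true_ids = np.unique(truth)
    best = 0.0
    for perm in itertools.permutations(true_ids, len(pred_ids)):
        mapping = dict(zip(pred_ids, perm))
        mapped = np.array([mapping[p] for p in pred])
        best = max(best, float(np.mean(mapped == truth)))
    return 100.0 * best
```

Spectral clustering then amounts to running `kmeans` on the rows of `spectral_embedding(points, k)` instead of on the raw points; only the eigendecomposition relies on an inbuilt function.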
2. Draw 1000 samples from a multivariate Gaussian distribution with the mean vector equal to $\begin{bmatrix} 2.2 \\ 0 \end{bmatrix}$ and
the covariance matrix equal to $\begin{bmatrix} 1 & 1.5 \\ 1.5 & 3 \end{bmatrix}$. Let $X_1$ be the set of these points. Now, draw 500 samples
from a multivariate Gaussian distribution with the mean vector equal to $\begin{bmatrix} 0 \\ 2.2 \end{bmatrix}$ and the covariance matrix
equal to $\begin{bmatrix} 1 & 1.5 \\ 1.5 & 3 \end{bmatrix}$. Let $X_2$ be the set of these points. [2 Marks]
Now, consider that you are only given the dataset $X = X_1 \cup X_2$.
(a) Implement the expectation maximization (EM) algorithm to estimate the mean vectors, the covariance
matrices, and the weights of both Gaussian distributions. Initialize the parameters
randomly, not with the ground-truth values. [12 Marks]
(b) Now, use the EM algorithm as a clustering algorithm to cluster these points. Also, use the k-means
algorithm (implemented in Question 1) to cluster these points. [4 Marks]
(c) Plot the results and report the accuracies of both algorithms. Justify your results. [2 Marks]
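A minimal sketch of the sampling step and of EM for a Gaussian mixture follows. It reads the two mean vectors as (2.2, 0) and (0, 2.2) with the shared covariance [[1, 1.5], [1.5, 3]]. The function names, the stopping tolerance, the small diagonal regularizer, and the initialization (means taken from random data points, identity covariances, equal weights) are all assumptions; adapt the initialization if every parameter has to be drawn at random.

```python
import numpy as np

def sample_dataset(seed=0):
    """Draw X1 (1000 points) and X2 (500 points) and stack them into X."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, 1.5], [1.5, 3.0]])
    x1 = rng.multivariate_normal([2.2, 0.0], cov, size=1000)
    x2 = rng.multivariate_normal([0.0, 2.2], cov, size=500)
    truth = np.array([0] * 1000 + [1] * 500)
    return np.vstack([x1, x2]), truth

def gaussian_pdf(points, mean, cov):
    """Multivariate normal density evaluated at every row of points."""
    d = points.shape[1]
    diff = points - mean
    expo = -0.5 * np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    norm = np.sqrt((2.0 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm

def em_gmm(points, k=2, n_iter=200, tol=1e-6, seed=0):
    """EM for a k-component Gaussian mixture.
    Returns (weights, means, covs, resp), where resp holds the
    responsibilities from the last E-step that was run."""
    rng = np.random.default_rng(seed)
    n, d = points.shape
    means = points[rng.choice(n, size=k, replace=False)].astype(float)
    covs = np.array([np.eye(d) for _ in range(k)])
    weights = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        dens = np.column_stack(
            [weights[j] * gaussian_pdf(points, means[j], covs[j]) for j in range(k)]
        )
        total = dens.sum(axis=1, keepdims=True)
        resp = dens / total
        # M-step: re-estimate weights, means, and covariances.
        nk = resp.sum(axis=0)
        weights = nk / n
        means = (resp.T @ points) / nk[:, None]
        for j in range(k):
            diff = points - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j]
            covs[j] += 1e-6 * np.eye(d)  # assumed regularizer for stability
        # Stop when the log-likelihood no longer improves noticeably.
        ll = float(np.log(total).sum())
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return weights, means, covs, resp
```

Hard cluster labels for part (b) follow from `resp.argmax(axis=1)`. Since cluster indices are arbitrary, the accuracy in part (c) has to be computed under the best matching of cluster indices to ground-truth labels. When justifying the results, it may help to note that the two Gaussians share a strongly correlated covariance (correlation 1.5/√3 ≈ 0.87), a shape that EM can model but that k-means, which assigns points by Euclidean distance to a centroid, does not account for.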