This document provides instructions for an assignment involving classification of iris flowers using machine learning techniques. Students are asked to use a publicly available iris dataset containing 150 samples of 3 iris types with 4 features each. The data should be split into two parts, with part I used to determine class priors and part II used for classification. First, students should classify the part II data using just the sepal length feature, assuming it follows a Gaussian distribution for each class. Next, students should classify the part II data using all 4 features, again assuming a Gaussian distribution. The assignment should be submitted as a Word or PDF document.
Assignment: Due Date: March 31, 2014
For this assignment, you will use the "Iris" data set from the UCI Machine Learning Repository. You can access this data set at http://archive.ics.uci.edu/ml/datasets/Iris

The data set covers three classes of flowers (Iris Setosa, Iris Versicolour, and Iris Virginica) and has 50 samples of each class, for a total of 150 samples. For each sample, it records 4 features: sepal length in cm, sepal width in cm, petal length in cm, and petal width in cm.

1. Divide the data into 2 parts: part I containing a random set of 30 samples, and part II containing the rest.
2. Use part I to get the prior probabilities of each class.
3. First, use only the first feature to classify the flowers into these 3 classes. Assume that the class-conditional distribution of "sepal length" is Gaussian for each of the 3 classes. Estimate the parameters of the three distributions using maximum likelihood estimation. Use the data in part II.
4. Using the parameters obtained above, classify the data in part II. Report the number of errors.
5. Now use all four features in Step 3 above: assume that the class-conditional distribution is a 4-dimensional Gaussian. Repeat Step 4.

Note:
1. If your answers to the above problems draw on information from a friend, the internet, or other reference material, please acknowledge that source by name.
2. You may submit your assignment report as a Word or PDF document.
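
The procedure described above (random split, class priors from part I, maximum likelihood Gaussian fit on part II, classification of part II) could be sketched roughly as follows. This is a minimal illustration and not a reference solution. It assumes a local copy of the UCI file with rows such as 5.1,3.5,1.4,0.2,Iris-setosa, and every function name here (load_iris, split_parts, class_priors, cholesky, fit_gaussians, log_density, classify, count_errors) is hypothetical. It uses only the Python standard library and treats the single-feature case as a 1-dimensional instance of the same multivariate Gaussian code.

```python
import csv
import math
import random
from collections import Counter, defaultdict

def load_iris(path):
    """Read rows like '5.1,3.5,1.4,0.2,Iris-setosa'; blank lines are skipped."""
    with open(path, newline="") as f:
        return [([float(v) for v in row[:4]], row[4])
                for row in csv.reader(f) if row]

def split_parts(samples, n_part1=30, seed=0):
    """Randomly split (features, label) pairs into part I and part II."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_part1], shuffled[n_part1:]

def class_priors(part1):
    """Prior of each class = its relative frequency in part I."""
    counts = Counter(label for _, label in part1)
    return {label: n / len(part1) for label, n in counts.items()}

def cholesky(a):
    """Lower-triangular L with L L^T = a (a must be positive definite)."""
    d = len(a)
    L = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def fit_gaussians(part2, feature_idx):
    """Per class: ML mean vector and Cholesky factor of the ML covariance."""
    by_class = defaultdict(list)
    for x, label in part2:
        by_class[label].append([x[i] for i in feature_idx])
    d = len(feature_idx)
    params = {}
    for label, rows in by_class.items():
        n = len(rows)
        mean = [sum(r[j] for r in rows) / n for j in range(d)]
        # The maximum likelihood covariance divides by n, not n - 1.
        cov = [[sum((r[j] - mean[j]) * (r[k] - mean[k]) for r in rows) / n
                for k in range(d)] for j in range(d)]
        params[label] = (mean, cholesky(cov))
    return params

def log_density(x, mean, L):
    """Log of the Gaussian density at x, given the Cholesky factor L."""
    d = len(mean)
    z = []  # forward substitution solves L z = x - mean
    for i in range(d):
        z.append((x[i] - mean[i]
                  - sum(L[i][k] * z[k] for k in range(i))) / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(d))
    maha = sum(v * v for v in z)  # squared Mahalanobis distance
    return -0.5 * (d * math.log(2 * math.pi) + log_det + maha)

def classify(x, priors, params, feature_idx):
    """Pick the class maximising log prior + log likelihood."""
    xs = [x[i] for i in feature_idx]
    scores = {c: math.log(priors[c]) + log_density(xs, *params[c])
              for c in params if priors.get(c, 0) > 0}
    return max(scores, key=scores.get)

def count_errors(part2, priors, params, feature_idx):
    """Number of part II samples whose predicted class differs from the label."""
    return sum(classify(x, priors, params, feature_idx) != label
               for x, label in part2)
```

To use it, one would call load_iris on the downloaded file, pass the result to split_parts, compute class_priors on part I, and then call fit_gaussians and count_errors on part II twice: once with feature_idx set to [0] (sepal length only) and once with [0, 1, 2, 3] (all four features). A class that happens to be absent from part I gets no prior and is never predicted, so the error count depends on the random split.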