
Iris Project Presentation

This document analyzes iris flower data using several machine learning techniques. A team of students analyzed physical attributes of three iris species (Iris Setosa, Iris Versicolor, Iris Virginica) to predict the flower class. They found that four attributes (sepal length, sepal width, petal length, and petal width) were sufficient to distinguish the species. The analyses include a bivariate analysis using pair plots, a univariate analysis of the attribute distributions, and a KNN classification model with K values of 3 and 5. An ANOVA test was also performed to validate the model predictions.

Uploaded by

Rajeev Mukhesh
Copyright © All Rights Reserved

An Analysis of Iris Features and Their Significance with Respect to the Iris Species

Prof. Mark V. Albert

TEAM
Sanam Rajeev Mukhesh
Maanas Katta
Chennuri Aravind
Sri Ram Reddy Koteru
Aiswarya Marapatla
ABSTRACT

This project deals with the Iris dataset, which covers three iris
flower species (Iris Setosa, Iris Versicolor, Iris Virginica). The
dataset contains 4 physical parameters (attributes/dimensions) that
can accurately predict the class of a flower. We believe that these
measurements are sufficient to distinguish between the three types
of iris flowers. We also look at the relationships between the
characteristics of each bloom and determine how important each one
is for representing the species. The dataset is multivariate.
DATA DESCRIPTION

• The dataset consists of 4 features: sepal length, sepal width, petal length, and petal width.
• The full dataset consists of 150 data points (50 for each species).
• An early exploratory data analysis was carried out on the set, as shown in the picture.
• The dataset was taken from Kaggle.
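The data layout described above can be sketched with the standard library alone. This is a minimal sketch, not the team's loading code: the column names and species labels are assumptions about how the CSV file is laid out, and the six inline rows are only a small illustrative sample standing in for the full 150-row file.

```python
import csv
import io
from collections import Counter

# Assumed column layout (hypothetical; check it against the real CSV header).
SAMPLE_CSV = """Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,Iris-setosa
2,4.9,3.0,1.4,0.2,Iris-setosa
51,7.0,3.2,4.7,1.4,Iris-versicolor
52,6.4,3.2,4.5,1.5,Iris-versicolor
101,6.3,3.3,6.0,2.5,Iris-virginica
102,5.8,2.7,5.1,1.9,Iris-virginica
"""

FEATURES = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]


def load_rows(handle):
    """Parse CSV rows into (feature_values, species) pairs."""
    rows = []
    for record in csv.DictReader(handle):
        values = [float(record[name]) for name in FEATURES]
        rows.append((values, record["Species"]))
    return rows


# With the real file, io.StringIO(SAMPLE_CSV) would be replaced by open(path).
rows = load_rows(io.StringIO(SAMPLE_CSV))
species_counts = Counter(species for _, species in rows)
print(len(rows), "rows;", dict(species_counts))
```

On the full dataset, the same count should report 50 rows for each of the three species.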
Bivariate Analysis:

• As it is hard to visualize 4-dimensional data, we chose a pair plot, which plots
every pairwise combination of the available features, to find the pair that
best separates the classes.
• Pairwise scatter plot for different species.
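To make concrete what a pair plot covers, the short sketch below enumerates the feature pairs that would appear as scatter panels. It only lists the combinations and does not draw anything; plotting libraries such as seaborn offer a `pairplot` function for the drawing itself, and which tool the team used is not stated here.

```python
from itertools import combinations

FEATURES = ["sepal length", "sepal width", "petal length", "petal width"]

# Each unordered pair of distinct features corresponds to one scatter panel
# of the pair plot (the mirrored panel shows the same pair with axes swapped).
feature_pairs = list(combinations(FEATURES, 2))

for x_name, y_name in feature_pairs:
    print(f"{x_name} vs {y_name}")
```

With 4 features this gives 6 distinct pairs, which is why a pair plot is a practical substitute for a single 4-dimensional view.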
Univariate Analysis:
• Probability density functions (PDFs) for each feature.
• Plot of petal length for different species.
• Plot of petal width for different species.
• Plot of sepal length for different species.
• Plot of sepal width for different species.
Calculation of Mean, Variance and Standard Deviation
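A minimal sketch of this calculation for one feature, using Python's `statistics` module. The values are a small illustrative sample of petal lengths (five per species), not the full 150-point dataset, so the printed numbers are not the project's results. The sketch also assumes the sample (n - 1) versions of variance and standard deviation; the population versions would use `pvariance` and `pstdev` instead.

```python
from statistics import mean, stdev, variance

# Illustrative petal lengths in cm, five values per species (not the full data).
PETAL_LENGTH = {
    "Iris Setosa": [1.4, 1.4, 1.3, 1.5, 1.4],
    "Iris Versicolor": [4.7, 4.5, 4.9, 4.0, 4.6],
    "Iris Virginica": [6.0, 5.1, 5.9, 5.6, 5.8],
}


def summarize(values):
    """Return (mean, sample variance, sample standard deviation)."""
    return mean(values), variance(values), stdev(values)


summary = {species: summarize(values) for species, values in PETAL_LENGTH.items()}

for species, (m, v, s) in summary.items():
    print(f"{species}: mean={m:.2f} variance={v:.3f} std={s:.3f}")
```

The same `summarize` helper (a hypothetical name) can be applied to each of the four features in turn.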
KNN ALGORITHM

• We chose the K-Nearest Neighbours (KNN) algorithm to train our model. It is a
simple supervised machine learning algorithm that can be used to solve both
classification and regression problems.
• We chose 5 and 3 as the values of K.
KNN Algorithm with K = 3
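The idea behind KNN classification can be sketched in a few lines of plain Python. This is a hedged illustration rather than the team's model: it assumes Euclidean distance and a simple majority vote, uses nine illustrative training points instead of the full dataset, and classifies a hypothetical query flower with both K = 3 and K = 5.

```python
from collections import Counter
from math import dist

# (sepal length, sepal width, petal length, petal width) in cm, with labels.
# A tiny illustrative training set, not the full 150-point dataset.
TRAIN = [
    ((5.1, 3.5, 1.4, 0.2), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((4.7, 3.2, 1.3, 0.2), "setosa"),
    ((7.0, 3.2, 4.7, 1.4), "versicolor"),
    ((6.4, 3.2, 4.5, 1.5), "versicolor"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((6.3, 3.3, 6.0, 2.5), "virginica"),
    ((5.8, 2.7, 5.1, 1.9), "virginica"),
    ((7.1, 3.0, 5.9, 2.1), "virginica"),
]


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to the query."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    # On a tie, the label seen first among the nearest neighbours wins.
    return votes.most_common(1)[0][0]


query = (5.0, 3.4, 1.5, 0.2)  # hypothetical unseen flower
for k in (3, 5):
    print(f"K={k}: {knn_predict(TRAIN, query, k)}")
```

A larger K smooths the decision by consulting more neighbours, at the cost of letting more distant points from other classes take part in the vote.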
STATISTICAL TEST: ANOVA

• We chose to conduct a simple ANOVA test on our model.
• As p < 0.05, we reject the null hypothesis and accept the alternative
hypothesis.
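The exact quantity that was tested is not spelled out here, so the following is only a sketch under an assumption: a one-way ANOVA that compares the mean petal length across the three species, with the null hypothesis that all three means are equal. It computes the F statistic by hand on a small illustrative sample and compares it with an approximate 5% critical value taken from standard F tables. A statistics library such as SciPy (`scipy.stats.f_oneway`) would return the p-value directly.

```python
from statistics import mean

# Illustrative petal lengths in cm, five values per species (not the full data).
GROUPS = {
    "setosa": [1.4, 1.4, 1.3, 1.5, 1.4],
    "versicolor": [4.7, 4.5, 4.9, 4.0, 4.6],
    "virginica": [6.0, 5.1, 5.9, 5.6, 5.8],
}


def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    samples = [list(g) for g in groups]
    all_values = [x for g in samples for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in samples)
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in samples for x in g)
    df_between = len(samples) - 1
    df_within = len(all_values) - len(samples)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova_f(GROUPS.values())

# Approximate 5% critical value of F(2, 12) from standard tables (assumption:
# valid only for this sample size; the full dataset has different df).
F_CRITICAL = 3.89
reject_null = f_stat > F_CRITICAL
print(f"F({df_between}, {df_within}) = {f_stat:.1f}; reject null: {reject_null}")
```

An F statistic above the critical value corresponds to p < 0.05, which is the condition under which the null hypothesis is rejected.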
