100% found this document useful (1 vote)

30 views14 pages

Iris Flower Classification Project

This project report details the Iris Flower Classification using machine learning, specifically employing a supervised learning approach with the Support Vector Machine (SVM) algorithm. It outlines the steps taken to classify iris flowers based on features such as sepal and petal dimensions, including data loading, analysis, model training, evaluation, and testing. The project concludes with insights gained from the process, emphasizing the importance of model selection and evaluation in machine learning tasks.

Uploaded by

meowyoongi159

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

100% found this document useful (1 vote)

30 views14 pages

Iris Flower Classification Project

Uploaded by

meowyoongi159

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 14

A Project Report

Iris Flower Classification

Submitted in partial fulfillment of the

requirement for the award of the degree of

MASTER OF COMPUTER APPLICATION

BACHELOR'S OF COMPUTER APPLICATION

Session 2024-25

[…Name of discipline…]

By
Sona Halder(23SCSE1041009)

Dhruv Gupta(23SCSE1040993)

Under the guidance of

Dr. Sanjiv Sarma
SCHOOL OF COMPUTER APPLICATIONS AND TECHNOLOGY

GALGOTIAS UNIVERSITY, GREATER NOIDA

INDIA

June, 2025

pg. 1
Table of content

S.no. content page no.

1. Introduction 3
• machine learning
• Categories
• Application

2. Steps to classify iris flowers 7

• Step 1 – Load the data
• Step 2 – Analyse and visualize the
dataset
• Step 3 – Model training
• Step 4 – Model Evaluation
• Step 5 – Testing the model

3. Conclusion 14

pg. 2
Chapter -1 Introduction
Machine learning is almost everywhere nowadays. It has become increasingly
necessary day by day. From recommending what to buy to recognizing a
person, robotics everywhere relies on machine learning. So in this project,
we’ll create the “Hello World” of machine learning, which means Iris flower
classification.

Iris flower classification is a very popular machine learning project. The iris
dataset contains three classes of flowers, Versicolor, Setosa, Virginica, and
each class contains 4 features, ‘Sepal length’, ‘Sepal width’, ‘Petal length’, and
‘Petal width’. The iris flower classification aims to predict flowers based on
their specific features.

What is machine learning?

Machine learning is about learning to predict something or extracting

knowledge from data. ML is a part of artificial intelligence. ML algorithms build
a model based on sample data or known as training data and based upon the
training data the algorithm can predict something on new data.

Categories of Machine Learning :

• Supervised machine learning: Supervised machine learning are types of

machine learning that are trained on well-labeled training data. Labeled
data means the training data is already tagged with the correct output.

• Unsupervised machine learning: Unlike supervised learning,

unsupervised learning doesn’t have any tagged data. It learned patterns
from untagged data. Basically, it creates a group of objects based on the
input data/features.

pg. 3
• Semi-supervised machine learning: Semi-supervised learning falls
between supervised and unsupervised learning. It has a small amount of
tagged data and a large amount of untagged data.

Applications of Machine Learning:

1. Speech Recognition: Speech recognition uses NLP (Natural Language

Processing) to process human speech into written format and vice versa. Some
examples are – Google Assistant, Alexa, Siri.

2. Recommendation Engine: Using the past behavior of a human’s search data

the recommendation engine can produce new data to cross-sell products to
customers. For example – Amazon product recommendations, Spotify music
recommendations.

3. Chatbot: Chatbots are used to give customer services without any human
agent. It takes questions from users and based on the question it gives an
answer as a response.

In this project, we’ll solve the problem using a supervised learning approach.
We’ll use an algorithm called “Support vector machine”.

Support vector machine: A support vector machine (also known as a support

vector network) is a supervised machine learning algorithm that analyzes data
for classification and regression. SVMs are one of the most robust
classifications methods.

pg. 4
pg. 5
SVM approximates a separate line (Hyperplane) between tSVM algorithm finds
the points closest to the line from both classes. These points are known as
support vectors. Then it computes the distance between the line and support
vectors. This distance is called the margin. The main goal is to maximize the
margin. The hyperplane which has the maximum margin is known as the
optimal hyperplane.

SVM mainly supports binary classification natively. For multiclass classification,

It separates the data for binary classification and utilizes the same principle by
breaking down multi-classification problems into multiple binary classification
problems.

pg. 6
Chapter-2 :Steps to classify iris flower
Steps to Classify Iris Flower:

1. Load the data

2. Analyze and visualize the dataset
3. Model training.
4. Model Evaluation.
5. Testing the model.

Step 1 – Load the data:

# DataFlair Iris Flower Classification
# Import Packages
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
%matplotlib inline

First, we’ve imported some necessary packages for the project.

• Numpy will be used for any computational operations.

• We’ll use Matplotlib and seaborn for data visualization.

• Pandas help to load data from various sources like local storage,
database, excel file, CSV file, etc.
columns = ['Sepal length', 'Sepal width', 'Petal length', 'Petal width', 'Class_labels']
# Load the data
df = pd.read_csv('iris.data', names=columns)
df.head()

pg. 7
• Next, we load the data using pd.read_csv() and set the column name as
per the iris data information.

• Pd.read_csv reads CSV files. CSV stands for comma separated value.

• df.head() only shows the first 5 rows from the data set table.

• All the numerical values are in centimeters.

Step 2 – Analyze and visualize the dataset:

Let’s see some information about the dataset.
# Some basic statistical analysis about the data
df.describe()

From this description, we can see all the descriptions about the data, like
average length and width, minimum value, maximum value, the 25%, 50%, and
75% distribution value, etc.

Let’s visualize the dataset.

# Visualize the whole dataset
sns.pairplot(df, hue='Class_labels')

pg. 8
• To visualize the whole dataset we used the seaborn pair plot method. It
plots the whole dataset’s information.

• From this visualization, we can tell that iris-setosa is well separated from
the other two flowers.

• And iris virginica is the longest flower and iris setosa is the shortest.

Now let’s plot the average of each feature of each class.

# Separate features and target
data = df.values
X = data[:,0:4]
Y = data[:,4]

• Here we separated the features from the target value.

pg. 9
# Calculate average of each features for all classes
Y_Data = np.array([np.average(X[:, i][Y==j].astype('float32')) for i in range (X.shape[1])
for j in (np.unique(Y))])
Y_Data_reshaped = Y_Data.reshape(4, 3)
Y_Data_reshaped = np.swapaxes(Y_Data_reshaped, 0, 1)
X_axis = np.arange(len(columns)-1)
width = 0.25

• Np.average calculates the average from an array.

• Here we used two for loops inside a list. This is known as list
comprehension.

• List comprehension helps to reduce the number of lines of code.

• The Y_Data is a 1D array, but we have 4 features for every 3 classes. So

we reshaped Y_Data to a (4, 3) shaped array.

• Then we change the axis of the reshaped matrix.

# Plot the average
plt.bar(X_axis, Y_Data_reshaped[0], width, label = 'Setosa')
plt.bar(X_axis+width, Y_Data_reshaped[1], width, label = 'Versicolour')
plt.bar(X_axis+width*2, Y_Data_reshaped[2], width, label = 'Virginica')
plt.xticks(X_axis, columns[:4])
plt.xlabel("Features")
plt.ylabel("Value in cm.")
plt.legend(bbox_to_anchor=(1.3,1))
plt.show()

• We used matplotlib to show the averages in a bar plot.

pg. 10
• Here we can clearly see the verginica is the longest and setosa is the
shortest flower.

Step 3 – Model training:

# Split the data to train and test dataset.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)

• Using train_test_split we split the whole data into training and testing
datasets. Later we’ll use the testing dataset to check the accuracy of the
model.
# Support vector machine algorithm
from sklearn.svm import SVC
svn = SVC()
svn.fit(X_train, y_train)

• Here we imported a support vector classifier from the scikit-learn

support vector machine.

• Then, we created an object and named it svn.

• After that, we feed the training dataset into the algorithm by using the
svn.fit() method.

Step 4 – Model Evaluation:

# Predict from the test dataset
predictions = svn.predict(X_test)
# Calculate the accuracy
from sklearn.metrics import accuracy_score
accuracy_score(y_test, predictions)

• Now we predict the classes from the test dataset using our trained
model.

• Then we check the accuracy score of the predicted classes.

• accuracy_score() takes true values and predicted values and returns the
percentage of accuracy.

pg. 11
Output:
0.9666666666666667

The accuracy is above 96%.

Now let’s see the detailed classification report based on the test dataset.
# A detailed classification report
from sklearn.metrics import classification_report
print(classification_report(y_test, predictions))
precision recall f1-score support
Iris-setosa 1.00 1.00 1.00 9
Iris-versicolor 1.00 0.83 0.91 12
Iris-virginica 0.82 1.00 0.90 9
accuracy 0.93 30
macro avg 0.94 0.94 0.94 30
weighted avg 0.95 0.93 0.93 30

• The classification report gives a detailed report of the prediction.

• Precision defines the ratio of true positives to the sum of true positive
and false positives.

• Recall defines the ratio of true positive to the sum of true positive and
false negative.

• F1-score is the mean of precision and recall value.

• Support is the number of actual occurrences of the class in the specified

dataset.

Step 5 – Testing the model:

X_new = np.array([[3, 2, 1, 0.2], [ 4.9, 2.2, 3.8, 1.1 ], [ 5.3, 2.5, 4.6, 1.9 ]])
#Prediction of the species from the input vector
prediction = svn.predict(X_new)

print("Prediction of Species: {}".format(prediction))

• Here we take some random values based on the average plot to see if
the model can predict accurately.

pg. 12
Output:

Prediction of Species: [‘Iris-setosa’ ‘Iris-versicolor’ ‘Iris-virginica’]

It looks like the model is predicting correctly because the setosa is shortest and
virginica is the longest and versicolor is in between these two.
# Save the model
import pickle
with open('SVM.pickle', 'wb') as f:
pickle.dump(svn, f)
# Load the model
with open('SVM.pickle', 'rb') as f:
model = pickle.load(f)
model.predict(X_new)

• We can save the model using pickle format.

• And again we can load the model in any other program using pickle and
use it using model.predict to predict the iris data.

pg. 13
Chapter-3: Conclusion
In this project, we explored the Iris Flower Classification using supervised
machine learning techniques. The main objective was to build a model that can
accurately classify iris flowers into one of three species: Setosa, Versicolor, or
Virginica, based on features such as sepal length, sepal width, petal length, and
petal width. We began by loading and understanding the dataset, followed by
thorough data preprocessing and exploration to identify patterns and
relationships among features. Data visualization techniques, including
histograms, pair plots, and correlation matrices, were used to better
understand the distribution and correlation of variables.

We then applied various supervised machine learning algorithms such as

Logistic Regression, K-Nearest Neighbors (KNN), Support Vector Machine
(SVM), and Decision Tree to train the model. The models were evaluated using
accuracy scores and confusion matrices to measure their performance. Cross-
validation techniques were also implemented to ensure model reliability and
to avoid overfitting. Among the models tested, some delivered higher accuracy
and better generalization.

Through this project, we gained hands-on experience in data analysis, model

selection, training, and evaluation. We also learned the importance of
choosing the right algorithm and tuning hyperparameters for optimal
performance. Overall, this project provided valuable insights into the workflow
of a machine learning task from start to finish and helped strengthen our
foundational understanding of classification problems.

pg. 14

Glencoe Mcgraw Hill Pre Algebra Homework Practice Workbook Answer Key
50% (2)
Glencoe Mcgraw Hill Pre Algebra Homework Practice Workbook Answer Key
5 pages
Exploring The Excel Environment
No ratings yet
Exploring The Excel Environment
4 pages
BT-2016 SEM-IV Project Report (Review 1)
No ratings yet
BT-2016 SEM-IV Project Report (Review 1)
42 pages
Iris Flower Classification Final
No ratings yet
Iris Flower Classification Final
15 pages
JAYESH BANSAL - FinalProjectReport - Jayesh Bansal
No ratings yet
JAYESH BANSAL - FinalProjectReport - Jayesh Bansal
38 pages
Iris Dataset Project Report - Compress
No ratings yet
Iris Dataset Project Report - Compress
16 pages
Iris Classification
No ratings yet
Iris Classification
6 pages
An Approach Based Iris Flower Species Recognition Using Machine Learning Classifiers
No ratings yet
An Approach Based Iris Flower Species Recognition Using Machine Learning Classifiers
7 pages
Lab 6
No ratings yet
Lab 6
4 pages
Machine Learning: Lecture 7: Create Your First Project
No ratings yet
Machine Learning: Lecture 7: Create Your First Project
17 pages
Classification of Iris Flower Species Updated
100% (1)
Classification of Iris Flower Species Updated
5 pages
Amber Iris
No ratings yet
Amber Iris
23 pages
Types of ML Systems
No ratings yet
Types of ML Systems
5 pages
Task 1 Iris Flower Classification Using Machine Learning
No ratings yet
Task 1 Iris Flower Classification Using Machine Learning
10 pages
Machine Learning Project
No ratings yet
Machine Learning Project
9 pages
61 JBS1753
No ratings yet
61 JBS1753
13 pages
SUMITs MINOR REPORT
No ratings yet
SUMITs MINOR REPORT
16 pages
ML Lecture 10 Project
No ratings yet
ML Lecture 10 Project
20 pages
Iris Flower Classification
No ratings yet
Iris Flower Classification
3 pages
ST1 4483 8995 Capstone PPT Template
No ratings yet
ST1 4483 8995 Capstone PPT Template
10 pages
Assignment 4 R Program1
No ratings yet
Assignment 4 R Program1
11 pages
IJARESM June2021
No ratings yet
IJARESM June2021
10 pages
Practical 6
No ratings yet
Practical 6
4 pages
Shelly
No ratings yet
Shelly
15 pages
Attiq Ahmad Afsar MLAssignment 3 Flask
No ratings yet
Attiq Ahmad Afsar MLAssignment 3 Flask
9 pages
Major Project (Kartik Joshi)
No ratings yet
Major Project (Kartik Joshi)
4 pages
Practical File DL
No ratings yet
Practical File DL
14 pages
DATA MINING and MACHINE LEARNING. PREDICTIVE TECHNIQUES: ENSEMBLE METHODS, BOOSTING, BAGGING, RANDOM FOREST, DECISION TREES and REGRESSION TREES.: Examples with MATLAB
From Everand
DATA MINING and MACHINE LEARNING. PREDICTIVE TECHNIQUES: ENSEMBLE METHODS, BOOSTING, BAGGING, RANDOM FOREST, DECISION TREES and REGRESSION TREES.: Examples with MATLAB
César Pérez López
No ratings yet
Classification of Flower Species Final
No ratings yet
Classification of Flower Species Final
32 pages
Understanding-Code-for A-Classifier
No ratings yet
Understanding-Code-for A-Classifier
15 pages
DS Report
No ratings yet
DS Report
11 pages
Introduction To ML
No ratings yet
Introduction To ML
80 pages
Wa0001
No ratings yet
Wa0001
39 pages
Lab Report 10 FDS
No ratings yet
Lab Report 10 FDS
7 pages
Ludic - Workshop - Iris - Copie
No ratings yet
Ludic - Workshop - Iris - Copie
5 pages
Project Template
No ratings yet
Project Template
15 pages
R Course - Part7 ML - Exercise Sheet 2024
No ratings yet
R Course - Part7 ML - Exercise Sheet 2024
8 pages
Iris Flower Classification Using ML - by Modassir - Medium
No ratings yet
Iris Flower Classification Using ML - by Modassir - Medium
21 pages
Iris Classification
No ratings yet
Iris Classification
8 pages
Shelly Mehndiratta IrisFlowerClassification
No ratings yet
Shelly Mehndiratta IrisFlowerClassification
15 pages
DTS 101 Lecture 4
No ratings yet
DTS 101 Lecture 4
27 pages
Iris Flower Classification Project
No ratings yet
Iris Flower Classification Project
9 pages
Fo DS
No ratings yet
Fo DS
9 pages
AML Lab3 2021wb15156
No ratings yet
AML Lab3 2021wb15156
13 pages
DATA MINING and MACHINE LEARNING. CLASSIFICATION PREDICTIVE TECHNIQUES: SUPPORT VECTOR MACHINE, LOGISTIC REGRESSION, DISCRIMINANT ANALYSIS and DECISION TREES: Examples with MATLAB
From Everand
DATA MINING and MACHINE LEARNING. CLASSIFICATION PREDICTIVE TECHNIQUES: SUPPORT VECTOR MACHINE, LOGISTIC REGRESSION, DISCRIMINANT ANALYSIS and DECISION TREES: Examples with MATLAB
César Pérez López
No ratings yet
Iris Flower Classification
No ratings yet
Iris Flower Classification
1 page
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
From Everand
DATA MINING AND MACHINE LEARNING. PREDICTIVE TECHNIQUES: REGRESSION, GENERALIZED LINEAR MODELS, SUPPORT VECTOR MACHINE AND NEURAL NETWORKS
César Pérez López
No ratings yet
A Study of Pattern Recognition of Iris Flower Based On Machine Learning
No ratings yet
A Study of Pattern Recognition of Iris Flower Based On Machine Learning
43 pages
Assessment 1
No ratings yet
Assessment 1
4 pages
Data Science Project
No ratings yet
Data Science Project
31 pages
DEEP LEARNING TECHNIQUES: CLUSTER ANALYSIS and PATTERN RECOGNITION with NEURAL NETWORKS. Examples with MATLAB
From Everand
DEEP LEARNING TECHNIQUES: CLUSTER ANALYSIS and PATTERN RECOGNITION with NEURAL NETWORKS. Examples with MATLAB
César Pérez López
No ratings yet
Iris Flower Classification
No ratings yet
Iris Flower Classification
47 pages
Data Mining Models: Techniques and Applications
From Everand
Data Mining Models: Techniques and Applications
Ravi Deshpande
No ratings yet
Flower Detection
No ratings yet
Flower Detection
9 pages
IRIS Species Predictor
No ratings yet
IRIS Species Predictor
8 pages
ML Lab1 PGM
No ratings yet
ML Lab1 PGM
4 pages
DATA MINING and MACHINE LEARNING: CLUSTER ANALYSIS and kNN CLASSIFIERS. Examples with MATLAB
From Everand
DATA MINING and MACHINE LEARNING: CLUSTER ANALYSIS and kNN CLASSIFIERS. Examples with MATLAB
César Pérez López
No ratings yet
Iris Classification: It Workshop Report (BTCS 305-18)
No ratings yet
Iris Classification: It Workshop Report (BTCS 305-18)
13 pages
KNN Datacamp
No ratings yet
KNN Datacamp
31 pages
Sridevi Women'S Engineering College: Mini Project Seminar On
No ratings yet
Sridevi Women'S Engineering College: Mini Project Seminar On
23 pages
Module 4
No ratings yet
Module 4
30 pages
IRIS - Flower
No ratings yet
IRIS - Flower
13 pages
Realestate Quiz Part1
No ratings yet
Realestate Quiz Part1
27 pages
1ST Cot in Mathematics 5 Sy 2020-2021
No ratings yet
1ST Cot in Mathematics 5 Sy 2020-2021
31 pages
911 Calls Data Capstone Project .HTML
No ratings yet
911 Calls Data Capstone Project .HTML
15 pages
Problems Set 1
No ratings yet
Problems Set 1
6 pages
Trigonometry p2 Revision
No ratings yet
Trigonometry p2 Revision
30 pages
Chapter 7 Partial Redundancy Analysis - Workshop 10 - Advanced Multivariate Analyses in R
No ratings yet
Chapter 7 Partial Redundancy Analysis - Workshop 10 - Advanced Multivariate Analyses in R
8 pages
Problem Set - VI
No ratings yet
Problem Set - VI
2 pages
1222201922027PM-Class 9 Maths Worksheet - HERONS FORMULA
100% (1)
1222201922027PM-Class 9 Maths Worksheet - HERONS FORMULA
2 pages
Tut Sheets Unit-1 22MA11C
No ratings yet
Tut Sheets Unit-1 22MA11C
4 pages
Golden Ratio
No ratings yet
Golden Ratio
49 pages
Som Notes New Civil Engineering
No ratings yet
Som Notes New Civil Engineering
160 pages
Birt Cross Tab Tutorial Advanced
No ratings yet
Birt Cross Tab Tutorial Advanced
7 pages
Electromagnetic Fields R 22 - Hyd ECE Course Structure & Syllabus
No ratings yet
Electromagnetic Fields R 22 - Hyd ECE Course Structure & Syllabus
2 pages
Mechatronics
No ratings yet
Mechatronics
26 pages
Simple Stresses
No ratings yet
Simple Stresses
12 pages
CG Assignment Questions With Answers
No ratings yet
CG Assignment Questions With Answers
124 pages
Magi Astrology Lesson Four: Everyone Has Two Birth Charts
100% (1)
Magi Astrology Lesson Four: Everyone Has Two Birth Charts
6 pages
Assessment Test Test (Physics, Chemistry & Mathematics) Physics
No ratings yet
Assessment Test Test (Physics, Chemistry & Mathematics) Physics
13 pages
Optimal and Systematic Design of Current Controller For Grid-Connected Inverters
No ratings yet
Optimal and Systematic Design of Current Controller For Grid-Connected Inverters
13 pages
Union: Course Materials May Not Be Reproduced in Whole or in Part Without The Prior Written Permission of IBM
No ratings yet
Union: Course Materials May Not Be Reproduced in Whole or in Part Without The Prior Written Permission of IBM
16 pages
A I T S 16 - PAPER - 2 - ADVANCED - 01 05 2020 AnswerKey - HTML
No ratings yet
A I T S 16 - PAPER - 2 - ADVANCED - 01 05 2020 AnswerKey - HTML
15 pages
Course Outcomes: Automata Theory and Compiler Design (21CS51) Module-5
No ratings yet
Course Outcomes: Automata Theory and Compiler Design (21CS51) Module-5
53 pages
Lesson Plan - Math8 - Q3 - Day 1 Correspondence
No ratings yet
Lesson Plan - Math8 - Q3 - Day 1 Correspondence
7 pages
E0 - 270 (On-Campus) - Practice Set
No ratings yet
E0 - 270 (On-Campus) - Practice Set
2 pages
ProblemsChapter 05 Cables PDF
No ratings yet
ProblemsChapter 05 Cables PDF
4 pages
Physics Solutions CH 7 Serway Algebra/Trigonometry Physics
No ratings yet
Physics Solutions CH 7 Serway Algebra/Trigonometry Physics
81 pages
Problems
No ratings yet
Problems
8 pages