Iris Flower Classification

The Iris Flower Classification project aims to classify iris flowers into three species using a dataset of 150 instances with four features. Three classification algorithms, Logistic Regression, K-Nearest Neighbors, and Random Forest, were implemented, all achieving an accuracy of 97%, with Random Forest selected as the final model. The trained Random Forest model is saved for future predictions, highlighting its robustness and effectiveness in handling non-linearity.

Iris Flower Classification

1. Introduction
The Iris dataset is a well-known dataset in the field of machine learning, commonly used for
classification tasks. The dataset consists of 150 instances with four features: sepal length, sepal width,
petal length, and petal width. The goal of this project is to develop machine learning models that can
accurately classify iris flowers into one of three species: Setosa, Versicolor, and Virginica.
This project employs three classification algorithms: Logistic Regression, K-Nearest Neighbors
(KNN), and Random Forest. The trained models are evaluated based on their accuracy, and the best-
performing model is saved for future predictions.

2. Data Preprocessing
2.1 Dataset Overview
The dataset consists of the following features:
• sepal_length (continuous variable)
• sepal_width (continuous variable)
• petal_length (continuous variable)
• petal_width (continuous variable)
• species (categorical target variable with three classes: Setosa, Versicolor, Virginica)

2.2 Loading the Data


The dataset is loaded into a Pandas DataFrame using the read_csv() function. The info() method is
used to inspect data types and check for missing values. Additionally, describe() is used to understand
statistical properties such as mean, standard deviation, and percentiles of each feature.
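The loading and inspection steps above can be sketched as follows. Since no CSV file accompanies this report, the sketch builds the same DataFrame from scikit-learn's bundled copy of the dataset instead of read_csv(); the column renaming mirrors the feature names listed in Section 2.1.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Stand-in for pd.read_csv("iris.csv"): build the DataFrame from
# scikit-learn's bundled copy of the dataset.
iris = load_iris(as_frame=True)
df = iris.frame.rename(columns={
    "sepal length (cm)": "sepal_length",
    "sepal width (cm)": "sepal_width",
    "petal length (cm)": "petal_length",
    "petal width (cm)": "petal_width",
})
df["species"] = iris.target_names[iris.target]
df = df.drop(columns="target")

df.info()              # data types and non-null counts (no missing values expected)
print(df.describe())   # mean, standard deviation, and percentiles per feature
```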

2.3 Exploratory Data Analysis (EDA)


Outlier Detection
To identify potential outliers in the dataset, a box plot is created for the sepal_length feature. Outliers
can affect model performance and may require handling through techniques such as removal,
transformation, or imputation.
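The whisker rule a box plot visualises can also be applied numerically. As a sketch (again using scikit-learn's bundled copy of the dataset), Tukey's rule flags values outside 1.5 IQR of the quartiles:

```python
from sklearn.datasets import load_iris

df = load_iris(as_frame=True).frame
col = df["sepal length (cm)"]

# Tukey's rule: the same 1.5 * IQR criterion a box plot's whiskers show.
q1, q3 = col.quantile(0.25), col.quantile(0.75)
iqr = q3 - q1
mask = (col < q1 - 1.5 * iqr) | (col > q3 + 1.5 * iqr)
print(f"{int(mask.sum())} potential outliers in sepal length")
```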
Feature Relationship Analysis
A scatter plot is generated to analyze the relationship between sepal_length and sepal_width, with
species color-coded. This visualization helps identify patterns or clusters in the data that may be
useful for classification.
Correlation Analysis
A correlation matrix is computed to understand the interdependence of the features. Strong correlations between features suggest that some variables are redundant, while features that vary strongly with the target classes carry significant predictive power.
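Computing the matrix is a one-liner with Pandas; the sketch below uses scikit-learn's bundled copy of the dataset. In the iris data, petal length and petal width are very strongly correlated.

```python
from sklearn.datasets import load_iris

# Pairwise Pearson correlations between the four measurements.
df = load_iris(as_frame=True).frame.drop(columns="target")
corr = df.corr()
print(corr.round(2))
```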

3. Data Splitting and Preprocessing


3.1 Splitting the Dataset
To ensure unbiased model evaluation, the dataset is split into training and testing sets using
train_test_split() from Scikit-learn. A 70-30 split is used, where 70% of the data is used for training,
and 30% is used for testing.
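A minimal sketch of the split; stratify and random_state are assumptions added here (the report does not specify them) to keep class proportions balanced and the split reproducible.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# 70-30 split: 105 training rows, 45 test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)
```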

3.2 Feature Scaling


Scaling is crucial for models like Support Vector Machines (SVM) and KNN, which are sensitive to
feature magnitudes. The StandardScaler from Scikit-learn is used to normalize the training and
testing datasets.
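A sketch of the scaling step. The scaler is fitted on the training set only and then applied to the test set, so no information from the test set leaks into the preprocessing.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit on the training data only; reuse the same transform on the test data.
scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)
```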

4. Model Implementation
4.1 Logistic Regression
Logistic Regression is a widely used classification algorithm that works well for linearly separable
data.
• Model Training: The model is trained with a maximum of 200 iterations.
• Prediction: The trained model predicts the species of the test dataset.
• Evaluation: The accuracy score is computed using accuracy_score().

Results:
The Logistic Regression model achieved an accuracy of 97%.
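The training and evaluation steps above can be sketched as follows; the split and random_state are assumptions, so the exact accuracy may differ slightly from the reported figure.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
scaler = StandardScaler().fit(X_train)

# max_iter=200 as in the report; all other settings are scikit-learn defaults.
log_reg = LogisticRegression(max_iter=200)
log_reg.fit(scaler.transform(X_train), y_train)
acc = accuracy_score(y_test, log_reg.predict(scaler.transform(X_test)))
print("accuracy:", acc)
```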

4.2 K-Nearest Neighbors (KNN)


KNN is a distance-based algorithm that classifies new points based on their nearest neighbors.
• Model Training: A KNN classifier with 3 neighbors is used.
• Prediction: The trained model predicts species on the test data.
• Evaluation: The accuracy score is computed.
Results:
The KNN model achieved an accuracy of 97%.
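A sketch of the KNN step, with k=3 as stated above. Because KNN is distance-based, it is fitted on the scaled features; the split parameters are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)
scaler = StandardScaler().fit(X_train)

# Distance-based model, so it uses the standardised features.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(scaler.transform(X_train), y_train)
acc = accuracy_score(y_test, knn.predict(scaler.transform(X_test)))
print("accuracy:", acc)
```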

4.3 Random Forest Classifier


Random Forest is an ensemble learning method that builds multiple decision trees and merges their
results to improve accuracy and reduce overfitting.
• Model Training: A Random Forest classifier with 100 estimators is trained.
• Prediction: The model predicts test set species without requiring feature scaling.
• Evaluation: The accuracy score is calculated.
Results:
The Random Forest model achieved an accuracy of 97%.
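A sketch of the Random Forest step, with 100 estimators as stated above. Tree ensembles are insensitive to feature scale, so the raw features are used directly; the split and random_state are assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# 100 trees as in the report; no feature scaling is required.
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)
acc = accuracy_score(y_test, rf.predict(X_test))
print("accuracy:", acc)
```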
5. Model Comparison
Model                  Accuracy

Logistic Regression    97%

K-Nearest Neighbors    97%

Random Forest          97%

Since all three models achieved the same accuracy on the test set, the Random Forest model was selected as the final model for its ability to handle non-linearity and its robustness against overfitting.

6. Model Deployment
To ensure the model's reusability, the trained Random Forest model is saved using the pickle
module.
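The save-and-reload cycle can be sketched as follows; the filename is illustrative, and the model here is trained on the full dataset for brevity.

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# Serialise the fitted model to disk.
with open("iris_rf_model.pkl", "wb") as f:
    pickle.dump(model, f)

# Reload it later to make predictions without retraining.
with open("iris_rf_model.pkl", "rb") as f:
    loaded = pickle.load(f)

print(loaded.predict(X[:5]))
```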

7. Conclusion and Future Work


7.1 Conclusion
This project successfully implemented and evaluated three classification models for predicting iris
species. The Random Forest model provided the best accuracy, making it the most suitable model
for deployment.
