Iris Flower Classification
1. Introduction
The Iris dataset is a well-known benchmark in machine learning, commonly used for
classification tasks. It consists of 150 instances described by four features: sepal length, sepal width,
petal length, and petal width. The goal of this project is to develop machine learning models that
accurately classify iris flowers into one of three species: Setosa, Versicolor, and Virginica.
This project employs three classification algorithms: Logistic Regression, K-Nearest Neighbors
(KNN), and Random Forest. The trained models are evaluated based on their accuracy, and the best-
performing model is saved for future predictions.
2. Data Preprocessing
2.1 Dataset Overview
The dataset consists of the following features:
• sepal_length (continuous variable)
• sepal_width (continuous variable)
• petal_length (continuous variable)
• petal_width (continuous variable)
• species (categorical target variable with three classes: Setosa, Versicolor, Virginica)
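As a quick check of this structure, the sketch below loads the copy of the dataset bundled with scikit-learn. This is an assumption: the report does not say where its data file comes from, and scikit-learn labels the columns differently (for example "sepal length (cm)" rather than sepal_length).

```python
from sklearn.datasets import load_iris

# Assumption: use scikit-learn's bundled copy of the Iris data,
# since the report does not name its data source.
iris = load_iris()

X = iris.data    # the four continuous features, one row per flower
y = iris.target  # integer class labels, one per flower

print(X.shape)  # (150, 4)
print(iris.feature_names)
print([str(name) for name in iris.target_names])
```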
4. Model Implementation
4.1 Logistic Regression
Logistic Regression is a widely used classification algorithm that works well for linearly separable
data.
• Model Training: The model is trained with a maximum of 200 iterations.
• Prediction: The trained model predicts the species of the test dataset.
• Evaluation: The accuracy score is computed using accuracy_score().
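The three steps above can be sketched with scikit-learn roughly as follows. The 80/20 stratified split, the random seed, and the use of the bundled dataset are assumptions made for illustration; only the 200-iteration limit and accuracy_score() come from the report.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Assumption: 80/20 stratified split with a fixed seed; the report's
# actual split is not given in this section.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Model training: cap the solver at 200 iterations, as stated in the report.
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)

# Prediction: predict the species of each test sample.
y_pred = model.predict(X_test)

# Evaluation: fraction of test samples classified correctly (0.0 to 1.0).
accuracy = accuracy_score(y_test, y_pred)
print(f"Logistic Regression accuracy: {accuracy:.2f}")
```

Note that accuracy_score() returns a fraction between 0 and 1; multiply by 100 to report it as a percentage.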
Results:
The Logistic Regression model achieved an accuracy of 0.97 (97%).
The Random Forest model outperformed the other models due to its ability to handle non-linearity
and its robustness against overfitting.
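A comparison of this kind can be sketched by training all three classifiers on the same split and collecting their test accuracies. Everything except max_iter=200 is an assumption here: the split, the seed, n_neighbors=5, and n_estimators=100 are illustrative defaults, not values taken from the report.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Assumption: 80/20 stratified split with a fixed seed.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Assumption: hyperparameters other than max_iter=200 are illustrative.
models = {
    "Logistic Regression": LogisticRegression(max_iter=200),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model on the training set and score it on the same test set.
scores = {}
for name, clf in models.items():
    clf.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, clf.predict(X_test))

for name, acc in scores.items():
    print(f"{name}: {acc:.2f}")

# Pick the highest-scoring model; on a tie, the first one listed wins.
best_name = max(scores, key=scores.get)
print(f"Best model on this split: {best_name}")
```

With only 30 test samples, ties are common and the ranking can change with the seed, so the result of a single split should be read with that in mind.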
6. Model Deployment
To ensure the model's reusability, the trained Random Forest model is saved using the pickle
module.
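A minimal sketch of saving and reloading a model with pickle is shown below. The file name iris_random_forest.pkl, the temporary directory, and the Random Forest settings are hypothetical choices for illustration; the report does not state them.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)

# Assumption: illustrative hyperparameters, trained on the full dataset
# only to keep the example short.
model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X, y)

# Hypothetical file name and location; not taken from the report.
model_path = Path(tempfile.gettempdir()) / "iris_random_forest.pkl"

# Save the trained model to disk.
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Later (for example in another script), load it back for predictions.
with open(model_path, "rb") as f:
    restored = pickle.load(f)

# One flower: sepal length, sepal width, petal length, petal width (cm).
sample = [[5.1, 3.5, 1.4, 0.2]]
predicted_class = restored.predict(sample)[0]
print(f"Predicted class index: {predicted_class}")
```

Pickle files should only be loaded from trusted sources, and a saved model is best reloaded with the same scikit-learn version that produced it.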