Mini Project With Output

Uploaded by

sushantsx8.nemesis

Step 1: Data Acquisition and Understanding

1. Load required Libraries and Dataset

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score
from sklearn.preprocessing import StandardScaler

df = pd.read_csv('Dry_Bean_Dataset.csv')
df.head()

2. Perform initial Exploratory Data Analysis (EDA) to understand basic statistics:

df.info()
df.describe()

3. Check for missing values:

df.isnull().sum()

OUTPUT:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 13611 entries, 0 to 13610
Data columns (total 17 columns):
 #   Column           Non-Null Count  Dtype
---  ------           --------------  -----
 0   Area             13611 non-null  int64
 1   Perimeter        13611 non-null  float64
 2   MajorAxisLength  13611 non-null  float64
 3   MinorAxisLength  13611 non-null  float64
 4   AspectRation     13611 non-null  float64
 5   Eccentricity     13611 non-null  float64
 6   ConvexArea       13611 non-null  int64
 7   EquivDiameter    13611 non-null  float64
 8   Extent           13611 non-null  float64
 9   Solidity         13611 non-null  float64
 10  roundness        13611 non-null  float64
 11  Compactness      13611 non-null  float64
 12  ShapeFactor1     13611 non-null  float64
 13  ShapeFactor2     13611 non-null  float64
 14  ShapeFactor3     13611 non-null  float64
 15  ShapeFactor4     13611 non-null  float64
 16  Class            13611 non-null  object
dtypes: float64(14), int64(2), object(1)
memory usage: 1.8+ MB

                 0
Area             0
Perimeter        0
MajorAxisLength  0
MinorAxisLength  0
AspectRation     0
Eccentricity     0
ConvexArea       0
EquivDiameter    0
Extent           0
Solidity         0
roundness        0
Compactness      0
ShapeFactor1     0
ShapeFactor2     0
ShapeFactor3     0
ShapeFactor4     0
Class            0
dtype: int64

Step 2: Data Preprocessing and Transformation

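1. Handling Missing Values: The check in Step 1 shows zero missing values in every column, so no
imputation is needed for this dataset. If missing values were present, they could be handled by
imputing the mean for numerical columns and forward-filling categorical columns.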
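
The imputation strategy described above can be sketched on a small toy frame. This is only an illustration: the frame `toy` and its values are made up for the example and are not part of the Dry Bean data, which has no missing values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one numeric and one categorical gap
toy = pd.DataFrame({
    "Area": [100.0, np.nan, 300.0],
    "Class": ["SEKER", None, "BOMBAY"],
})

# Numerical columns: replace missing values with the column mean
num_cols = toy.select_dtypes(include="number").columns
toy[num_cols] = toy[num_cols].fillna(toy[num_cols].mean())

# Categorical columns: carry the previous valid value forward
cat_cols = toy.select_dtypes(exclude="number").columns
toy[cat_cols] = toy[cat_cols].ffill()

print(toy)
```

Forward-fill depends on row order, so it is only reasonable when neighbouring rows are related; for an unordered table, filling with the most frequent category may be a safer choice.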
2. Splitting the dataset using train_test_split:

X = df.drop(columns=['Class'])
y = df['Class']
X_train, X_test, y_train, y_test = train_test_split(X, y,
test_size=0.2, random_state=42)

3. Apply feature scaling (fit on the training set, then transform both sets):

scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
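
To make explicit what the scaler does, here is a minimal sketch on made-up numbers. It checks that `transform` applies z = (x - mean) / std using the mean and standard deviation learned from the training data only. The arrays `train` and `test` are illustrative assumptions, not bean measurements.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up one-feature data for illustration
train = np.array([[1.0], [2.0], [3.0]])
test = np.array([[4.0]])

# The scaler learns mean and standard deviation from the training data only
demo_scaler = StandardScaler().fit(train)

# Manual standardization with the training statistics (population std, ddof=0)
manual = (test - train.mean(axis=0)) / train.std(axis=0)

# Both routes give the same scaled test value
assert np.allclose(demo_scaler.transform(test), manual)
print(demo_scaler.transform(test))
```

Fitting on the training set alone keeps information about the test set out of the preprocessing step.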


Step 3: Data Visualization

1. Correlation Matrix to see relationships between features:

import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(10,8))
# 'Class' is a text column, so restrict the correlation to numeric columns
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm')

OUTPUT:
<Axes: >
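
With 16 numeric features the annotated heatmap is dense, so it can help to list the strongest pairs directly. The sketch below does this on a synthetic frame; the column names are borrowed from the dataset but the values are randomly generated for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: ConvexArea is built to track Area closely, Extent is independent
rng = np.random.default_rng(0)
area = rng.normal(size=200)
toy = pd.DataFrame({
    "Area": area,
    "ConvexArea": area + rng.normal(scale=0.01, size=200),
    "Extent": rng.normal(size=200),
})

# Absolute correlations, keeping only the upper triangle so each pair appears once
corr = toy.corr().abs()
mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
pairs = corr.where(mask).stack().dropna().sort_values(ascending=False)

print(pairs.head())
```

On the real data the same lines would apply to `df.corr(numeric_only=True)`.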

2. Distribution Plots of key features to understand the spread of values:

sns.displot(df['Area'])
sns.displot(df['MajorAxisLength'])

OUTPUT:
<seaborn.axisgrid.FacetGrid at 0x77fc9ff6ad70>

3. Pair Plots to visualize relationships between input features:

# 'Class' must be among the selected columns for hue='Class' to work
sns.pairplot(df[['Area', 'Perimeter', 'MajorAxisLength',
                 'MinorAxisLength', 'AspectRation', 'Class']], hue='Class')

OUTPUT:
<seaborn.axisgrid.PairGrid at 0x77fce876bee0>

Step 4: Model Building

Implement multiple models for comparison:

1. Logistic Regression:
from sklearn.linear_model import LogisticRegression
logreg = LogisticRegression()
logreg.fit(X_train, y_train)

2. Support Vector Machine (SVM):

from sklearn.svm import SVC

svm = SVC()
svm.fit(X_train, y_train)

3. K-Nearest Neighbors (KNN):

from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier()
knn.fit(X_train, y_train)
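
The three fits above follow the same pattern, so they can also be written as one loop over a dictionary of models. The sketch below runs on synthetic data from `make_classification` (an assumption made so the example runs without the CSV file), and `max_iter=1000` for Logistic Regression is likewise an assumption to avoid convergence warnings, not a setting taken from this project.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in shaped like the bean data: 16 features, 7 classes
X_demo, y_demo = make_classification(
    n_samples=700, n_features=16, n_informative=8,
    n_classes=7, n_clusters_per_class=1, random_state=42,
)
Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# Scale with statistics from the training split only
demo_scaler = StandardScaler()
Xtr = demo_scaler.fit_transform(Xtr)
Xte = demo_scaler.transform(Xte)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(),
}

# Fit each model and record its test accuracy
scores = {}
for name, model in models.items():
    model.fit(Xtr, ytr)
    scores[name] = accuracy_score(yte, model.predict(Xte))
    print(f"{name}: {scores[name]:.4f}")
```

Keeping the models in a dictionary makes it easy to add or remove a candidate without repeating the fit and predict calls.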

Step 5: Model Evaluation

1. Accuracy:

Logistic Regression:

from sklearn.metrics import accuracy_score

y_pred_logreg = logreg.predict(X_test)
print("Accuracy for Logistic Regression:", accuracy_score(y_test,
y_pred_logreg))

OUTPUT:
Accuracy for Logistic Regression: 0.9265515975027543

Support Vector Machine (SVM):

from sklearn.metrics import accuracy_score

y_pred_svm = svm.predict(X_test)
print("Accuracy for SVM:", accuracy_score(y_test, y_pred_svm))

OUTPUT:
Accuracy for SVM: 0.9338964377524789

K-Nearest Neighbors (KNN):

from sklearn.metrics import accuracy_score

y_pred_knn = knn.predict(X_test)
print("Accuracy for KNN:", accuracy_score(y_test, y_pred_knn))

OUTPUT:

Accuracy for KNN: 0.9232464193903782

2. Compare accuracy across models to select the best one:

o Logistic Regression: 92.66% accuracy
o SVM: 93.39% accuracy
o K-Nearest Neighbors (KNN): 92.32% accuracy
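
The accuracies above come from a single 80/20 split and differ by about one percentage point, so a ranking based on them may not be stable. One possible check is k-fold cross-validation with the scaler placed inside a pipeline. The sketch below uses synthetic data as a stand-in for the bean dataset, and the choice of 5 folds is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in shaped like the bean data: 16 features, 7 classes
X_demo, y_demo = make_classification(
    n_samples=700, n_features=16, n_informative=8,
    n_classes=7, n_clusters_per_class=1, random_state=42,
)

candidates = {"SVM": SVC(), "KNN": KNeighborsClassifier()}

cv_means = {}
for name, estimator in candidates.items():
    # The pipeline refits the scaler inside each training fold
    pipe = make_pipeline(StandardScaler(), estimator)
    fold_scores = cross_val_score(pipe, X_demo, y_demo, cv=5, scoring="accuracy")
    cv_means[name] = fold_scores.mean()
    print(f"{name}: mean={fold_scores.mean():.4f} std={fold_scores.std():.4f}")
```

If the gap between two models is smaller than the spread across folds, the single-split ranking should be read with caution.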

Conclusion and Insights

The best-performing model is the Support Vector Machine (SVM), with a test accuracy of 93.39%,
ahead of Logistic Regression (92.66%) and KNN (92.32%).
The most important features contributing to the prediction of Dry Beans are Area,
MajorAxisLength, and Perimeter.
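
One way to check which features a fitted model relies on is permutation importance: shuffle one feature at a time and measure the drop in test accuracy. The sketch below is an illustration on synthetic data. The feature names are placeholders, and this is not necessarily the method used to arrive at the features listed above.

```python
from sklearn.datasets import make_classification
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in: 6 features, of which 3 carry class information
X_demo, y_demo = make_classification(
    n_samples=500, n_features=6, n_informative=3, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X_demo.shape[1])]  # placeholder names

Xtr, Xte, ytr, yte = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
model = make_pipeline(StandardScaler(), SVC()).fit(Xtr, ytr)

# Shuffle each feature 10 times on the test set and average the accuracy drop
result = permutation_importance(model, Xte, yte, n_repeats=10, random_state=0)

ranking = sorted(zip(feature_names, result.importances_mean),
                 key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.4f}")
```

Strongly correlated features can share importance between them, so the ranking is best read alongside the correlation matrix from Step 3.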
