45 AIML Practical 09

Name of Student: Ahmed Mobin Ahmed Shaikh
Roll Number: 45
Lab Practical Number: 09
Title of Lab Assignment: Implementation of Bagging Algorithm: Decision Tree / Random Forest
DOP: 19/03/24
DOS: 27/03/24
CO Mapped: CO5
PO Mapped: PO2, PO3, PO4, PO5, PO6, PO7, PSO1, PSO2
Signature:

Aim: Implementation of Bagging Algorithm: Decision Tree / Random Forest.


Bagging, short for Bootstrap Aggregating, is a popular ensemble learning technique in machine learning. It trains multiple models independently on bootstrap samples of the training data (random subsets drawn with replacement) and then combines their predictions, by averaging for regression or majority voting for classification. The basic idea is to reduce the variance of a single model and thereby improve overall performance.

Bagging offers several advantages:

- Reduction of variance: training multiple models on different subsets of the data reduces the variance of the final prediction, which improves the generalization performance of the ensemble.

- Improved stability: because predictions are combined across multiple models, the ensemble is more robust to outliers and noisy data.

- Parallelizable: each model in a bagging ensemble is trained independently, so training can easily be parallelized for efficient use of computational resources.

- Works with any base learner: bagging can wrap any base learning algorithm, making it a versatile technique applicable to a wide range of problems.

However, bagging does not always lead to improvements, especially if the base learning algorithm already has low variance and is robust to noise. It also increases computational cost and memory requirements, since multiple models must be trained and stored.
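To make this concrete before turning to the dataset below, here is a minimal sketch of bagging decision trees with scikit-learn's BaggingClassifier. The synthetic data and all parameter values are illustrative assumptions, not part of this practical; on scikit-learn versions older than 1.2 the keyword is base_estimator rather than estimator.

# Illustrative sketch only: the synthetic dataset and parameters are assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# Each tree is fitted on a bootstrap sample (drawn with replacement) of the
# training set; the ensemble predicts by majority vote over the 10 trees.
bag = BaggingClassifier(estimator=DecisionTreeClassifier(),
                        n_estimators=10, bootstrap=True, random_state=0)
bag.fit(X_tr, y_tr)
print(f"Bagged-tree test accuracy: {bag.score(X_te, y_te):.4f}")

A Random Forest, used in the rest of this practical, is the same recipe with one extra source of randomness: each split within each tree considers only a random subset of the features.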

Bagging on Dataset - Chronic Kidney Disease


Import Libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

Get the Data


# Load the dataset
ahmed_ds = pd.read_csv('/content/kidney_disease_train.csv')
# target class label : classification
# (ckd - chronic kidney disease, notckd - no chronic kidney disease)

ahmed_ds.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 280 entries, 0 to 279
Data columns (total 26 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 id 280 non-null int64
1 age 275 non-null float64
2 bp 271 non-null float64
3 sg 244 non-null float64
4 al 245 non-null float64
5 su 242 non-null float64
6 rbc 173 non-null object
7 pc 230 non-null object
8 pcc 276 non-null object
9 ba 276 non-null object
10 bgr 247 non-null float64
11 bu 266 non-null float64
12 sc 268 non-null float64
13 sod 213 non-null float64
14 pot 212 non-null float64
15 hemo 241 non-null float64
16 pcv 229 non-null float64
17 wc 203 non-null object
18 rc 187 non-null object
19 htn 279 non-null object
20 dm 279 non-null object
21 cad 279 non-null object
22 appet 280 non-null object
23 pe 280 non-null object
24 ane 280 non-null object
25 classification 280 non-null object
dtypes: float64(12), int64(1), object(13)
memory usage: 57.0+ KB

ahmed_ds.isnull().sum()

id 0
age 5
bp 9
sg 36
al 35
su 38
rbc 107
pc 50
pcc 4
ba 4
bgr 33
bu 14
sc 12
sod 67
pot 68
hemo 39
pcv 51
wc 77
rc 93
htn 1
dm 1
cad 1
appet 0
pe 0
ane 0
classification 0
dtype: int64

ahmed_ds = ahmed_ds.dropna()  # drop every row with a missing value (280 -> 107 rows)

ahmed_ds.isnull().sum()

id 0
age 0
bp 0
sg 0
al 0
su 0
rbc 0
pc 0
pcc 0
ba 0
bgr 0
bu 0
sc 0
sod 0
pot 0
hemo 0
pcv 0
wc 0
rc 0
htn 0
dm 0
cad 0
appet 0
pe 0
ane 0
classification 0
dtype: int64
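Dropping rows with any missing value keeps only 107 of the 280 records, as the outputs below confirm. A hedged alternative, not used in this practical, is to impute the missing values instead; the sketch below assumes the raw file is reloaded into a hypothetical ahmed_ds_raw.

# Hypothetical alternative (not used in this practical): impute numeric
# columns instead of dropping rows that contain missing values.
from sklearn.impute import SimpleImputer

ahmed_ds_raw = pd.read_csv('/content/kidney_disease_train.csv')
num_cols = ahmed_ds_raw.select_dtypes(include='number').columns
imputer = SimpleImputer(strategy='median')  # median is robust to outliers
ahmed_ds_raw[num_cols] = imputer.fit_transform(ahmed_ds_raw[num_cols])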

ahmed_ds.describe()


       id         age        bp         sg        al        su        bgr        bu         sc        sod        pot
count  107.000000 107.000000 107.000000 107.000000 107.000000 107.000000 107.000000 107.000000 107.000000 107.000000 107.000000
mean   273.551402 49.682243  73.084112  1.020047   0.794393   0.233645   130.130841 51.635514  1.971963  138.869159 4.812150
std    99.999362  16.377964  10.764303  0.005429   1.419130   0.759586   54.841123  45.669525  2.600101  7.287990   4.175122
min    11.000000  6.000000   50.000000  1.005000   0.000000   0.000000   70.000000  10.000000  0.400000  114.000000 2.900000
25%    235.500000 38.000000  60.000000  1.020000   0.000000   0.000000   99.000000  27.000000  0.700000  135.000000 3.800000
50%    298.000000 52.000000  70.000000  1.020000   0.000000   0.000000   118.000000 39.000000  1.000000  139.000000 4.600000
75%    352.000000 61.500000  80.000000  1.025000   1.000000   0.000000   131.000000 49.500000  1.250000  144.000000 4.900000
max    399.000000 83.000000  100.000000 1.025000   4.000000   4.000000   380.000000 309.000000 13.300000 150.000000 47.000000

ahmed_ds.columns

Index(['id', 'age', 'bp', 'sg', 'al', 'su', 'rbc', 'pc', 'pcc', 'ba', 'bgr',
'bu', 'sc', 'sod', 'pot', 'hemo', 'pcv', 'wc', 'rc', 'htn', 'dm', 'cad',
'appet', 'pe', 'ane', 'classification'],
dtype='object')

Select features and target variable


x = ahmed_ds.iloc[:, 11:13]  # Features: bu (blood urea) and sc (serum creatinine)
x

     bu     sc
0    42.0   1.7
3    25.0   1.0
6    49.0   0.9
10   18.0   1.1
12   20.0   0.5
..   ...    ...
272  18.0   1.1
273  148.0  3.9
275  92.0   3.3
277  34.0   1.1
278  19.0   0.5

107 rows × 2 columns


y = ahmed_ds.iloc[:, 25]  # Target: 'classification' (column index 25)


y

0 ckd
3 notckd
6 notckd
10 notckd
12 notckd
...
272 notckd
273 ckd
275 ckd
277 notckd
278 notckd
Name: classification, Length: 107, dtype: object

Split the dataset into train and test sets


x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)

Feature scaling


# Note: tree ensembles are insensitive to feature scaling; it is applied here
# mainly so the decision-boundary plot below uses comparable axes.
scaler = StandardScaler()
x_train = scaler.fit_transform(x_train)
x_test = scaler.transform(x_test)

Initialize and train the Random Forest classifier


classifier = RandomForestClassifier(n_estimators=5, random_state=0)
classifier.fit(x_train, y_train)

RandomForestClassifier(n_estimators=5, random_state=0)
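As an aside not in the original notebook, the bagging idea can be made visible by aggregating the five fitted trees by hand. Note that RandomForestClassifier.predict averages the trees' predicted class probabilities rather than taking a strict hard vote, so the two can differ on ties; with fully grown trees they normally agree.

# Sketch: recover the forest's prediction as a majority vote over its trees.
# Each fitted tree predicts encoded class indices; classifier.classes_
# maps them back to the original labels ('ckd' / 'notckd').
per_tree = np.array([est.predict(x_test).astype(int)
                     for est in classifier.estimators_])
votes = np.apply_along_axis(lambda col: np.bincount(col).argmax(),
                            axis=0, arr=per_tree)
manual_pred = classifier.classes_[votes]
print("Agreement with classifier.predict:",
      (manual_pred == classifier.predict(x_test)).mean())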

Make predictions on the test set


y_pred = classifier.predict(x_test)

Evaluate the model


cm = confusion_matrix(y_test, y_pred)
print("CONFUSION MATRIX:\n", cm)

CONFUSION MATRIX:
[[ 5 1]
[ 0 21]]

from sklearn.metrics import classification_report, accuracy_score

# Build the classification report as a DataFrame for readable printing
clf_report = pd.DataFrame(classification_report(y_test, y_pred, output_dict=True))

# Print accuracy score
print(f"ACCURACY SCORE:\n{accuracy_score(y_test, y_pred):.4f}")

# Print classification report
print(f"CLASSIFICATION REPORT:\n{clf_report}")

ACCURACY SCORE:
0.9630
CLASSIFICATION REPORT:
ckd notckd accuracy macro avg weighted avg
precision 1.000000 0.954545 0.962963 0.977273 0.964646
recall 0.833333 1.000000 0.962963 0.916667 0.962963
f1-score 0.909091 0.976744 0.962963 0.942918 0.961710
support 6.000000 21.000000 0.962963 27.000000 27.000000
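As a sanity check (an addition, not in the original notebook), the headline figures in the report follow directly from the confusion matrix printed above, whose rows are the true classes and columns the predicted classes, in alphabetical order ('ckd', 'notckd').

# Hand-check the report from cm = [[5, 1], [0, 21]] (rows: true, cols: predicted)
accuracy = (cm[0, 0] + cm[1, 1]) / cm.sum()   # (5 + 21) / 27 ≈ 0.9630
recall_ckd = cm[0, 0] / cm[0, :].sum()        # 5 / 6  ≈ 0.8333
precision_ckd = cm[0, 0] / cm[:, 0].sum()     # 5 / 5  = 1.0000
print(accuracy, recall_ckd, precision_ckd)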

Plotting the decision boundary for training set


# Train-set plot: shade the forest's decision regions and overlay the points
from matplotlib.colors import ListedColormap
x_set, y_set = x_train, y_train
x1, x2 = np.meshgrid(np.arange(start=x_set[:, 0].min() - 1, stop=x_set[:, 0].max() + 1, step=0.01),
                     np.arange(start=x_set[:, 1].min() - 1, stop=x_set[:, 1].max() + 1, step=0.01))
# Encode the string labels ('ckd'/'notckd') as 0/1 so contourf can shade the regions
z = (classifier.predict(np.array([x1.ravel(), x2.ravel()]).T)
     == classifier.classes_[1]).astype(int).reshape(x1.shape)
plt.contourf(x1, x2, z, alpha=0.4, cmap=ListedColormap(('red', 'green')))
plt.xlim(x1.min(), x1.max())
plt.ylim(x2.min(), x2.max())
for i, j in enumerate(np.unique(y_set)):
    plt.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                color=('red', 'green')[i], label=j)
plt.title('Random Forest (Training set)')
plt.xlabel('Blood Urea')
plt.ylabel('Serum creatinine')
plt.legend()
plt.show()


Trees of a RandomForestClassifier


# Plot the RandomForestClassifier's five trees
from sklearn import tree
fig, axes = plt.subplots(nrows=1, ncols=5, figsize=(10, 2), dpi=900)
for index in range(5):
    tree.plot_tree(classifier.estimators_[index], feature_names=x.columns,
                   class_names=classifier.classes_, filled=True, ax=axes[index])
    axes[index].set_title('Estimator: ' + str(index + 1), fontsize=11)
fig.savefig('Random Forest 5 Trees.png')

Conclusion
In conclusion, bagging is a versatile and effective ensemble learning technique that improves model performance by combining the predictions of multiple base learners trained on bootstrapped subsets of the data. In this practical, a Random Forest of just five trees classified chronic kidney disease from two features (blood urea and serum creatinine) with roughly 96% accuracy on the held-out test set.

