Random Forest - Car - Jupyter Notebook

The document is a Jupyter Notebook detailing the implementation of a Random Forest classifier using a car evaluation dataset. It includes data loading, preprocessing, model training, and evaluation, achieving an accuracy score of 0.9649 with 100 decision trees. Additionally, it visualizes feature importance scores for the model's predictors.

Uploaded by

Aastha Mehta

7/24/24, 1:07 PM Random forest - Jupyter Notebook

In [6]: import numpy as np # linear algebra

import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt # data visualization
import seaborn as sns # statistical data visualization
%matplotlib inline

In [7]: df = pd.read_csv("C:\\Users\\Welcome\\Downloads\\car_evaluation.csv")

In [8]: df.shape

Out[8]: (1727, 7)

In [9]: df.head()

Out[9]:
   vhigh vhigh.1  2 2.1  small   low  unacc
0  vhigh   vhigh  2   2  small   med  unacc
1  vhigh   vhigh  2   2  small  high  unacc
2  vhigh   vhigh  2   2    med   low  unacc
3  vhigh   vhigh  2   2    med   med  unacc
4  vhigh   vhigh  2   2    med  high  unacc

In [10]: col_names = ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety', 'class']

df.columns = col_names

col_names

Out[10]: ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety', 'class']

In [11]: df.head()

Out[11]:
  buying  maint doors persons lug_boot safety  class
0  vhigh  vhigh     2       2    small    med  unacc
1  vhigh  vhigh     2       2    small   high  unacc
2  vhigh  vhigh     2       2      med    low  unacc
3  vhigh  vhigh     2       2      med    med  unacc
4  vhigh  vhigh     2       2      med   high  unacc
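
The column labels in Out[9] (`vhigh`, `vhigh.1`, `2`, `2.1`, ...) look like data values rather than names, which suggests the CSV has no header row and pandas used the first record as the header. That would also explain why `df.shape` reports 1727 rows. A minimal sketch of reading a headerless file with explicit names follows; the inline sample rows stand in for the real file and are an assumption made only to keep the example self-contained.

```python
import io
import pandas as pd

col_names = ['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety', 'class']

# Hypothetical stand-in for car_evaluation.csv: three headerless records.
sample_csv = io.StringIO(
    "vhigh,vhigh,2,2,small,low,unacc\n"
    "vhigh,vhigh,2,2,small,med,unacc\n"
    "vhigh,vhigh,2,2,small,high,unacc\n"
)

# header=None tells pandas not to treat the first line as column labels;
# names= supplies the labels directly, so no record is lost.
df_full = pd.read_csv(sample_csv, header=None, names=col_names)

print(df_full.shape)
print(df_full.head())
```

Applied to the real file, the same two arguments would keep the first record as data instead of consuming it as a header.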

In [12]: X = df.drop(['class'], axis=1)

y = df['class']

In [13]: # split data into training and testing sets

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.33, random_state = 42)

In [14]: # check the shape of X_train and X_test

X_train.shape, X_test.shape

Out[14]: ((1157, 6), (570, 6))
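
The shapes in Out[14] follow from how `train_test_split` sizes the split when `test_size` is a float: the test count is the ceiling of `test_size * n_samples`, and the remaining rows go to training. A short arithmetic check, assuming that rounding rule (the helper name `split_sizes` is hypothetical):

```python
import math

def split_sizes(n_samples, test_size):
    """Return (n_train, n_test) for a float test_size, rounding the test count up."""
    n_test = math.ceil(test_size * n_samples)
    return n_samples - n_test, n_test

n_train, n_test = split_sizes(1727, 0.33)
print(n_train, n_test)  # 1157 570
```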

In [15]: from tensorflow.keras.utils import to_categorical


In [19]: conda install -c conda-forge category_encoders

Collecting package metadata (current_repodata.json): ...working... done

Solving environment: ...working... done

## Package Plan ##

environment location: C:\Users\Welcome\anaconda3

added / updated specs:

- category_encoders

The following packages will be downloaded:

package | build
---------------------------|-----------------
category_encoders-2.2.2 | pyhd3eb1b0_0 58 KB
conda-4.14.0 | py37h03978a9_0 1018 KB conda-forge
python_abi-3.7 | 2_cp37m 4 KB conda-forge
------------------------------------------------------------
Total: 1.1 MB

The following NEW packages will be INSTALLED:

category_encoders pkgs/main/noarch::category_encoders-2.2.2-pyhd3eb1b0_0
python_abi conda-forge/win-64::python_abi-3.7-2_cp37m

The following packages will be UPDATED:

conda pkgs/main::conda-4.8.2-py37_0 --> conda-forge::conda-4.14.0-py37h03978a9_0

Downloading and Extracting Packages

python_abi-3.7 | 4 KB | ########## | 100%
conda-4.14.0 | 1018 KB | ########## | 100%
category_encoders-2. | 58 KB | ########## | 100%
Preparing transaction: ...working... done
Verifying transaction: ...working... done
Executing transaction: ...working... done

Note: you may need to restart the kernel to use updated packages.

==> WARNING: A newer version of conda exists. <==

current version: 4.8.2
latest version: 24.5.0

Please update conda by running

$ conda update -n base -c defaults conda

In [20]: import category_encoders as ce

In [21]: encoder = ce.OrdinalEncoder(cols=['buying', 'maint', 'doors', 'persons', 'lug_boot', 'safety'])

X_train = encoder.fit_transform(X_train)

X_test = encoder.transform(X_test)

C:\Users\Welcome\anaconda3\lib\site-packages\category_encoders\utils.py:21: FutureWarning: is_categorical is deprecated and will be removed in a future version. Use is_categorical_dtype instead
  elif pd.api.types.is_categorical(cols):


In [22]: X_train.head()

Out[22]:
      buying  maint  doors  persons  lug_boot  safety
83         1      1      1        1         1       1
48         1      1      2        2         1       2
468        2      1      2        3         2       2
155        1      2      2        2         1       1
1043       3      2      3        2         2       1

In [23]: X_test.head()

Out[23]:
      buying  maint  doors  persons  lug_boot  safety
599        2      2      3        1         3       1
932        3      1      3        3         3       1
628        2      2      1        1         3       3
1497       4      2      1        3         1       2
1262       3      4      3        2         1       1
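
In Out[22] the first training row (index 83) is encoded as all 1s, which is consistent with integer codes being handed out in order of first appearance in the training data rather than by the natural ranking of the levels. If a ranked encoding is wanted, one option is to map each level explicitly. A minimal sketch using plain pandas follows; the level orderings and the `encode_ordered` helper are assumptions for illustration, not part of the notebook.

```python
import pandas as pd

# Assumed natural orderings for three of the columns (lowest to highest).
LEVEL_ORDER = {
    'buying': ['low', 'med', 'high', 'vhigh'],
    'lug_boot': ['small', 'med', 'big'],
    'safety': ['low', 'med', 'high'],
}

def encode_ordered(frame, level_order):
    """Return a copy of frame with each listed column mapped to 1..k by rank."""
    out = frame.copy()
    for col, levels in level_order.items():
        codes = {level: rank for rank, level in enumerate(levels, start=1)}
        out[col] = out[col].map(codes)
    return out

# Toy rows, not taken from the dataset.
toy = pd.DataFrame({
    'buying': ['vhigh', 'low', 'med'],
    'lug_boot': ['small', 'big', 'med'],
    'safety': ['med', 'high', 'low'],
})
encoded = encode_ordered(toy, LEVEL_ORDER)
print(encoded)
```

Levels missing from the mapping become NaN under `map`, which makes unexpected categories easy to spot before training.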

In [24]: # import Random Forest classifier

from sklearn.ensemble import RandomForestClassifier

# instantiate the classifier

rfc = RandomForestClassifier(random_state=0)

# fit the model

rfc.fit(X_train, y_train)

# Predict the Test set results

y_pred = rfc.predict(X_test)

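# Check accuracy score

from sklearn.metrics import accuracy_score

# n_estimators is left at its default here; Out[26] shows that the default in this environment is 100, not 10
print('Model accuracy score with the default number of decision-trees : {0:0.4f}'.format(accuracy_score(y_test, y_pred)))

Model accuracy score with the default number of decision-trees : 0.9649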

In [25]: # instantiate the classifier with n_estimators = 100

rfc_100 = RandomForestClassifier(n_estimators=100, random_state=0)

# fit the model to the training set

rfc_100.fit(X_train, y_train)

# Predict on the test set results

y_pred_100 = rfc_100.predict(X_test)

# Check accuracy score

print('Model accuracy score with 100 decision-trees : {0:0.4f}'.format(accuracy_score(y_test, y_pred_100)))

Model accuracy score with 100 decision-trees : 0.9649
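
Accuracy here is the fraction of test rows whose predicted class matches the true class. With 570 test rows, a score of 0.9649 corresponds to 550 correct predictions (550/570 ≈ 0.96491). A small sketch of the calculation in plain Python follows; the `accuracy` helper and the toy labels are illustrative only.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)

# Toy labels, not taken from the notebook's predictions.
toy_true = ['unacc', 'acc', 'good', 'unacc']
toy_pred = ['unacc', 'acc', 'acc', 'unacc']
print(accuracy(toy_true, toy_pred))  # 0.75

# Arithmetic check against the reported score: 550 of 570 correct.
print('{0:0.4f}'.format(550 / 570))  # 0.9649
```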


In [26]: # create the classifier with n_estimators = 100

clf = RandomForestClassifier(n_estimators=100, random_state=0)

# fit the model to the training set

clf.fit(X_train, y_train)

Out[26]: RandomForestClassifier(random_state=0)

In [27]: # view the feature scores

feature_scores = pd.Series(clf.feature_importances_, index=X_train.columns).sort_values(ascending=False)

feature_scores

Out[27]: safety      0.291657
         persons     0.235380
         buying      0.160692
         maint       0.134143
         lug_boot    0.111595
         doors       0.066533
         dtype: float64
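
The importances in Out[27] are normalized, so they sum to 1 and each value can be read as a share of the total. A quick check with the printed values, also showing the cumulative share (the dictionary name `printed_scores` is hypothetical; the numbers are copied from Out[27]):

```python
printed_scores = {
    'safety': 0.291657,
    'persons': 0.235380,
    'buying': 0.160692,
    'maint': 0.134143,
    'lug_boot': 0.111595,
    'doors': 0.066533,
}

total = sum(printed_scores.values())
print(round(total, 6))  # 1.0

# Running share of importance, from most to least important.
running = 0.0
for name, score in printed_scores.items():
    running += score
    print(f'{name:<9} {score:.6f}  cumulative {running:.4f}')
```

Read this way, `safety` and `persons` together account for roughly 53% of the total importance, while `doors` contributes under 7%.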

In [28]: # Creating a seaborn bar plot

sns.barplot(x=feature_scores, y=feature_scores.index)

# Add labels to the graph

plt.xlabel('Feature Importance Score')

plt.ylabel('Features')

# Add title to the graph

plt.title("Visualizing Important Features")

# Visualize the graph

plt.show()
