
Random Forest Classifier

(i) https://www.datacamp.com/tutorial/random-forests-classifier-python
(ii) The code below is for practice.
About dataset
The dataset used here is 'titanic.csv', which is freely available on Kaggle.com. It
includes the features PassengerId, Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare,
Cabin, and Embarked, along with the target column Survived.
1. Importing Libraries and reading dataset
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Read the Titanic dataset into a DataFrame and display it
df = pd.read_csv("titanic.csv")
df
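Before preprocessing, it helps to check the shape, the column types, and which columns
contain missing values (in the standard Kaggle file these are Age, Cabin, and Embarked).
A quick inspection sketch:
# Inspect shape, column types, and per-column missing-value counts
print(df.shape)
df.info()
print(df.isnull().sum())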

2. Data preprocessing
# Drop columns with little predictive signal, then fill missing values with 0
df.drop(['Cabin','PassengerId','Name','Ticket'],axis=1,inplace=True)
df = df.fillna(0)
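Note that df.fillna(0) also writes 0 into the string column Embarked. A hedged
alternative sketch (assuming the standard Kaggle columns Age and Embarked) fills each
column with a more typical value instead:
# Alternative imputation: median for the numeric Age, most frequent value for Embarked
df['Age'] = df['Age'].fillna(df['Age'].median())
df['Embarked'] = df['Embarked'].fillna(df['Embarked'].mode()[0])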

3. Handling categorical data

# Encode the string columns Sex and Embarked as integers
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
df['Sex'] = le.fit_transform(df['Sex'])
df['Embarked'] = le.fit_transform(df['Embarked'])
df
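LabelEncoder maps Embarked to arbitrary integers (0, 1, 2), which implies an ordering
that does not exist in the data. A hedged alternative sketch, to be used instead of the
LabelEncoder line for Embarked, one-hot encodes that column:
# Alternative: one-hot encode Embarked so no artificial ordering is implied
df = pd.get_dummies(df, columns=['Embarked'], drop_first=True)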
4. Dependent and independent variables
# Putting feature variable to X
X = df.drop('Survived',axis=1)
# Putting response variable to y
y = df['Survived']

5. Splitting dataset into Training and Testing Set

# Splitting the data into train and test sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=42)
Next, split both X and y into training and testing sets with the help of the
train_test_split() function. Here train_size is 0.7, which means 70% of the data is used
for training and the remaining 30% for testing.
6. Implementing a Random forest classifier
# Import Random Forest model
from sklearn.ensemble import RandomForestClassifier
# Create a random forest classifier with 100 trees
clf = RandomForestClassifier(n_estimators=100)
# Train the model using the training set
clf.fit(X_train, y_train)
Different parameters are used in the Random forest algorithm:
1. n_estimators (int, default=100)
The number of decision trees in the forest.
Note: Increasing the number of trees can increase accuracy, but be careful that it does
not lead to overfitting.
2. criterion ({"gini", "entropy"}, default="gini")
The function to measure the quality of a split; this is the criterion by which the
decision trees actually split on the variables:
 "gini" for the Gini impurity
 "entropy" for the information gain
3. max_depth (int, default=None)
The maximum depth of the tree (root node to terminal node).
Note: A high value means you are overcomplicating things, which can lead to overfitting,
so be careful while choosing the value.
4. min_samples_split (int or float, default=2)
The minimum number of samples required to split an internal node.
Remember: the lower the value, the higher the chance of overfitting; but that does not
mean you should choose a very high value, because that will over-generalize the model
and lead to underfitting. So choose the value accordingly.
5. min_samples_leaf (int or float, default=1)
The minimum number of samples required to be at a leaf node.
6. max_features ({"sqrt", "log2"}, int or float, default="sqrt")
The maximum number of features the random forest considers when looking for the best
split.
7. n_jobs (int, default=None)
The number of jobs to run in parallel. This is useful when you have the capability for
parallel processing: n_jobs=-1 means use all processors, while n_jobs=1 uses only one
processor.
8. random_state (int, RandomState instance or None, default=None)
Controls the randomness of the bootstrap sampling used when building trees (and of the
features considered at each split).
9. verbose (int, default=0)
Controls the verbosity when fitting and predicting; it prints run-time information.
You can hyper-tune these parameters by changing their values, as shown in the sketch
below. You can read my blog on hyper-tuning.
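As a quick sketch (reusing the clf and X_train names from above; the values shown are
illustrative defaults, not tuned choices), the parameters can be set explicitly when
constructing the classifier:
# Illustrative instantiation with the parameters described above (example values)
clf = RandomForestClassifier(n_estimators=100,    # number of trees in the forest
                             criterion='gini',    # split-quality measure
                             max_depth=None,      # grow each tree until leaves are pure
                             min_samples_split=2, # min samples to split an internal node
                             min_samples_leaf=1,  # min samples required at a leaf
                             max_features='sqrt', # features considered per split
                             n_jobs=-1,           # use all processors
                             random_state=42,     # reproducible randomness
                             verbose=0)           # no run-time messages
clf.fit(X_train, y_train)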
7. Predicting test cases using random forest
# Predicting the test set results
y_pred = clf.predict(X_test)
print(y_pred)
Output:
[0 1 1 0 1 1 1 0 0 0 1 0 1 0 1 1 1 0 0 0 1 0 1 1 1 1 1 1 0 1 0 0 0 0 1 0 1
1 1 1 1 1 1 1 1 0 1 1 0 0 0 0 1 1 0 0 0 1 0 0 0 1 0 1 1 0 0 1 1 1 1 1 1 1
0 1 1 0 0 0 1 0 1 1 0 0 0 1 0 0 1]

8. Checking the accuracy score

from sklearn.metrics import classification_report
# Accuracy on the test set; equivalently, accuracy_score(y_test, y_pred) from sklearn.metrics
rand_score = clf.score(X_test, y_test)
classification_report_rf = classification_report(y_test, y_pred)
print("Accuracy score:", rand_score)
Output:
Accuracy score: 0.8268156424581006
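To hyper-tune the parameters from section 6 systematically rather than by hand, a
minimal cross-validated grid search sketch (the grid values here are illustrative):
# Minimal grid search over a few of the hyperparameters from section 6
from sklearn.model_selection import GridSearchCV
param_grid = {'n_estimators': [100, 200],
              'max_depth': [None, 5, 10],
              'min_samples_leaf': [1, 3]}
grid = GridSearchCV(RandomForestClassifier(random_state=42),
                    param_grid, cv=5, scoring='accuracy')
grid.fit(X_train, y_train)
print("Best parameters:", grid.best_params_)
print("Best CV accuracy:", grid.best_score_)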
