2. Random Forest Algorithm

Uploaded by

nicolaas.ryota

Random Forest Algorithm (for Crab Age Prediction)

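How it Works: Random Forest is an ensemble learning algorithm that builds many decision trees. Each tree is trained
on a bootstrap sample of the training data and considers a random subset of the features at each split. For regression
tasks such as predicting the age of crabs, the forest's prediction is the average of the individual trees' predictions.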
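The averaging idea can be sketched by hand with plain decision trees. This is a minimal illustration, not the internals of any library: the data is synthetic, the two features stand in for hypothetical crab measurements, and the tree count of 25 is an arbitrary choice.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for two crab measurements (hypothetical features)
n_samples = 300
X = rng.uniform(0.0, 1.0, size=(n_samples, 2))
# Hypothetical non-linear "age" signal plus noise
y = 10.0 * X[:, 0] ** 2 + 3.0 * np.sin(6.0 * X[:, 1]) + rng.normal(0.0, 0.5, n_samples)

n_trees = 25
trees = []
for i in range(n_trees):
    # Bootstrap sample: draw n_samples row indices with replacement
    idx = rng.integers(0, n_samples, size=n_samples)
    # max_features=1 makes each split consider one randomly chosen feature
    tree = DecisionTreeRegressor(max_features=1, random_state=i)
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# Regression forest prediction: the mean of the individual tree predictions
X_new = np.array([[0.5, 0.5], [0.9, 0.1]])
per_tree = np.stack([t.predict(X_new) for t in trees])  # shape (n_trees, 2)
forest_pred = per_tree.mean(axis=0)
print(forest_pred)
```

Individual trees disagree because each one sees a different resample of the data; averaging them smooths out that disagreement.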

Steps:

1. Collect Data: Gather crab data (e.g., size, weight, shell dimensions) and their ages.

2. Preprocess Data: Handle missing data and split the data into training and testing sets.

3. Train Model: Build a Random Forest model using the training data.

4. Evaluate: Use metrics like Mean Absolute Error (MAE) and R² to assess the model’s performance.
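The four steps above can be sketched end to end as a regression workflow. Everything about the data here is an assumption made for illustration: the column names (length, weight, age), the formulas that generate them, and the artificially blanked weight values are placeholders, not a real crab dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 500

# 1. Collect data: synthetic placeholder with hypothetical columns
length = rng.uniform(0.5, 2.0, n)
weight = 20.0 * length ** 3 + rng.normal(0.0, 2.0, n)
age = 5.0 * length + 0.02 * weight + rng.normal(0.0, 0.5, n)
data = pd.DataFrame({"length": length, "weight": weight, "age": age})
data.loc[data.index[:10], "weight"] = np.nan  # simulate missing measurements

# 2. Preprocess: split first, then fill gaps using training-set medians only
X = data[["length", "weight"]]
y = data["age"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
train_medians = X_train.median()
X_train = X_train.fillna(train_medians)
X_test = X_test.fillna(train_medians)

# 3. Train model
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# 4. Evaluate with MAE and R²
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"MAE: {mae:.3f}  R^2: {r2:.3f}")
```

Computing the medians on the training split alone keeps information from the test rows out of the preprocessing step.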

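Advantages:

• Can capture complex, non-linear relationships.

• Less prone to overfitting than a single decision tree, because averaging many trees reduces variance. Missing values may still need to be imputed, depending on the implementation.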
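A small comparison can make these advantages concrete. The sketch below uses synthetic, deliberately non-linear data (an assumption chosen for illustration) and compares a linear model, a single fully grown tree, and a forest on held-out data; the exact scores depend on the data and seeds.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 600

# Synthetic non-linear target with substantial noise (illustrative only)
X = rng.uniform(-3.0, 3.0, size=(n, 2))
y = 3.0 * np.sin(X[:, 0]) + X[:, 1] ** 2 + rng.normal(0.0, 1.5, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "linear": LinearRegression(),
    "single_tree": DecisionTreeRegressor(random_state=0),
    "forest": RandomForestRegressor(n_estimators=200, random_state=0),
}

# Fit each model and record its R² on the held-out test set
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: test R^2 = {scores[name]:.3f}")
```

The linear model cannot represent the curved signal, and the single tree fits the noise in its training rows, so on data like this the forest is expected to score highest.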

CODE
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.preprocessing import StandardScaler

# Load your dataset (replace 'your_dataset.csv' with the actual file path)
dataset = pd.read_csv('your_dataset.csv')

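# Assume the last column is the target variable and every other column is a
# numeric feature (encode any categorical columns before this step)
X = dataset.iloc[:, :-1] # Features
y = dataset.iloc[:, -1] # Target variable

# Standardize the features. Tree-based models are insensitive to feature
# scaling, so this step is optional for a Random Forest.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)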

# Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

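# Create the Random Forest model with default parameters.
# Note: RandomForestClassifier treats each distinct age value as a class label;
# for age as a continuous target, RandomForestRegressor with MAE and R² applies instead.
model = RandomForestClassifier(random_state=42)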

# Hyperparameter tuning using GridSearchCV to find the best parameters

param_grid = {
    'n_estimators': [50, 100, 200],    # Number of trees
    'max_depth': [None, 10, 20, 30],   # Maximum depth of trees
    'min_samples_split': [2, 5, 10],   # Minimum samples required to split a node
    'min_samples_leaf': [1, 2, 4],     # Minimum samples required at a leaf node
    'bootstrap': [True, False]         # Whether to use bootstrap sampling
}
# Set up GridSearchCV with cross-validation
grid_search = GridSearchCV(estimator=model, param_grid=param_grid, cv=5, n_jobs=-1, verbose=2)

# Fit the GridSearchCV model on the training data

grid_search.fit(X_train, y_train)

# Get the best parameters from the grid search

best_params = grid_search.best_params_
print(f"Best Hyperparameters: {best_params}")

# Retrieve the best model; GridSearchCV has already refit it on the full
# training set using the best parameters
best_model = grid_search.best_estimator_

# Predict on the test set

y_pred = best_model.predict(X_test)

# Evaluate the model's accuracy

accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy of Random Forest model: {accuracy * 100:.2f}%")

# Print a classification report for more detailed performance analysis

print("\nClassification Report:")
print(classification_report(y_test, y_pred))

# Perform cross-validation to assess the model's stability. This uses the full
# dataset, including the test rows, so it is not an independent estimate.
cv_scores = cross_val_score(best_model, X_scaled, y, cv=5)
print(f"Cross-Validation Accuracy: {cv_scores.mean() * 100:.2f}% ± {cv_scores.std() * 100:.2f}%")

Accuracy of Random Forest model: 80.00%
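One possible variation on the script above is to wrap the scaler and the model in a scikit-learn Pipeline, so that the scaler is refit inside each cross-validation fold rather than on the whole dataset. The sketch below is an assumption-laden illustration, not the original workflow: it uses synthetic classification data from make_classification and a much smaller grid so that it runs quickly.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (not the crab dataset)
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Scaling and the forest are fitted together, fold by fold
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("forest", RandomForestClassifier(random_state=42)),
])

# Parameters of a pipeline step are addressed as "<step name>__<parameter>"
param_grid = {
    "forest__n_estimators": [25, 50],
    "forest__max_depth": [None, 5],
}

search = GridSearchCV(pipeline, param_grid=param_grid, cv=3)
search.fit(X_train, y_train)

print(f"Best Hyperparameters: {search.best_params_}")
test_accuracy = search.score(X_test, y_test)
print(f"Test accuracy: {test_accuracy * 100:.2f}%")
```

The full grid in the script above has 3 × 4 × 3 × 3 × 2 = 216 combinations, which at cv=5 means 1,080 model fits, so trimming the grid as done here is a practical way to shorten tuning time.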
