
RANDOM FOREST ALGORITHM

Machine Learning | Supervised Learning Technique
What is Random Forest?
 Random Forest builds multiple decision trees and merges them to get a more accurate and stable prediction.
 Each tree is trained on a random subset of the data and features.
 The final prediction is based on a majority vote (classification) or an average (regression), as shown in the sketch below.
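A minimal classification sketch of this idea, assuming scikit-learn is available (the iris dataset and all hyperparameters here are illustrative choices, not part of the slides):

```python
# Minimal Random Forest classification sketch (assumes scikit-learn is installed).
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load a small benchmark dataset and hold out a test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# n_estimators = number of decision trees; each tree is trained on a
# bootstrap sample of rows and considers a random subset of features
# at every split.
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# For classification, the forest's prediction is the majority vote of its trees.
print("Test accuracy:", clf.score(X_test, y_test))
```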
What is Ensemble Learning?
• Ensemble Learning is a technique that combines multiple machine learning models to improve performance.
• The main goal is to create a strong learner from several weak learners (a simple voting ensemble is sketched below).
• Ensemble methods help improve prediction accuracy and reduce overfitting.
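As a hedged illustration of combining several weak learners into one stronger learner, the sketch below uses scikit-learn's VotingClassifier; the three base models are arbitrary example choices, not prescribed by the slides:

```python
# Illustrative voting ensemble: three different learners combined by majority
# vote (assumes scikit-learn is installed; the model choices are arbitrary).
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Each base model is a comparatively weak learner on its own; hard voting
# lets the majority class across models win, yielding a stronger learner.
ensemble = VotingClassifier(
    estimators=[
        ("tree", DecisionTreeClassifier(max_depth=3)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
    ],
    voting="hard",
)
print("Cross-validated accuracy:", cross_val_score(ensemble, X, y, cv=5).mean())
```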
Bagging vs Boosting
What is Bagging Technique?
Bagging, or Bootstrap Aggregating, is an ensemble learning method used to reduce error by training homogeneous weak learners on different random samples from the training set, in parallel. The results of these base learners are then combined through a voting or averaging approach to produce an ensemble model that is more robust and accurate.

What is Boosting Technique?
Boosting is an ensemble learning method that involves training homogeneous weak learners sequentially, such that each base model depends on the previously fitted base models. All these base learners are then combined in an adaptive way to obtain an ensemble model. Both techniques are sketched in code below.
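A hedged side-by-side sketch of the two techniques, assuming scikit-learn 1.2 or later (for the `estimator` keyword); the synthetic dataset and all hyperparameters are illustrative assumptions:

```python
# Side-by-side sketch: bagging (parallel bootstrap samples, vote at the end)
# vs. boosting (sequential learners, each correcting its predecessors).
# Assumes scikit-learn >= 1.2; dataset and hyperparameters are illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Bagging: identical weak learners trained in parallel on bootstrap samples,
# combined by voting.
bagging = BaggingClassifier(
    estimator=DecisionTreeClassifier(max_depth=3),
    n_estimators=50,
    random_state=0,
)

# Boosting: weak learners fitted one after another, each base model
# depending on (reweighting the errors of) the previously fitted ones.
boosting = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=50,
    random_state=0,
)

for name, model in [("Bagging", bagging), ("Boosting", boosting)]:
    print(name, "accuracy:", cross_val_score(model, X, y, cv=5).mean())
```

Random Forest is the bagging recipe applied to decision trees, with extra per-split feature randomness added on top.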


Random Forest Algorithm
 Random Forest works in two phases: the first is to create the random forest by combining N decision trees, and the second is to make predictions from the trees created in the first phase.

The working process can be explained in the steps below (a code sketch of these steps follows the list):

Step-1: Select K random data points from the training set.
Step-2: Build the decision tree associated with the selected data points (subset).
Step-3: Choose the number N of decision trees you want to build.
Step-4: Repeat Steps 1 & 2 until N trees are built.
Step-5: For a new data point, find the prediction of each decision tree, and assign the new data point to the category that wins the majority vote.
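A from-scratch sketch of these steps, assuming NumPy and scikit-learn are available; K is taken equal to the training-set size (the usual bootstrap convention), and all names and numbers are illustrative:

```python
# From-scratch sketch of Steps 1-5 above (illustrative only; assumes NumPy
# and scikit-learn are installed).
from collections import Counter

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

N = 25  # Step-3: the number of decision trees to build
trees = []
for _ in range(N):  # Step-4: repeat Steps 1 & 2 until N trees exist
    # Step-1: select K random data points (a bootstrap sample, with replacement)
    idx = rng.integers(0, len(X), size=len(X))
    # Step-2: build the decision tree for this subset; max_features adds the
    # per-split feature randomness that distinguishes a forest from plain bagging
    tree = DecisionTreeClassifier(max_features="sqrt", random_state=0)
    tree.fit(X[idx], y[idx])
    trees.append(tree)

# Step-5: collect each tree's prediction for a new point, then take the majority vote
new_point = X[0].reshape(1, -1)
votes = [tree.predict(new_point)[0] for tree in trees]
print("Majority-vote prediction:", Counter(votes).most_common(1)[0][0])
```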
The working of the algorithm can be better understood by the following example:

Suppose there is a dataset that contains multiple fruit images, and this dataset is given to the Random Forest classifier. The dataset is divided into subsets, and each subset is given to a decision tree. During the training phase, each decision tree produces a prediction result, and when a new data point arrives, the Random Forest classifier predicts the final decision based on the majority of those results.
Advantages and Disadvantages
Advantages:
• High accuracy.
• Handles large datasets and missing values.
• Reduces overfitting compared to single decision trees.
• Works well for both classification and regression (a regression sketch follows the lists below).

Disadvantages:
• Slower for real-time predictions.
• Complex and difficult to interpret.
• Requires more memory and computation.
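A minimal regression sketch, assuming scikit-learn is available; the diabetes dataset is an illustrative choice:

```python
# Minimal Random Forest regression sketch (assumes scikit-learn is installed;
# the diabetes dataset is an illustrative choice).
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# For regression, the forest averages its trees' predictions
# instead of taking a majority vote.
reg = RandomForestRegressor(n_estimators=200, random_state=0)
reg.fit(X_train, y_train)
print("Test R^2:", reg.score(X_test, y_test))
```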
Applications of Random Forest
o Medical Diagnosis (e.g., disease prediction).
o Fraud Detection in banking.
o Customer Segmentation in marketing.
o Product Recommendation Systems.
o Credit Scoring and Risk Analysis.
