
Week #8

Random Forest

Artificial Intelligence & Intelligent Systems
Slides by Dr. Rami Ibrahim
Objectives
• Introduction to Decision Trees.
• Decision Tree components.
• Building a Decision Tree (entropy calculation).
• Decision Tree advantages and disadvantages.
• Random Forest.
Decision Trees
• Suppose a kid wants to play soccer. However, we need to check the weather before going out!
Decision Trees
• The root of this tree splits on Outlook into three branches: Sunny, Overcast, and Rain.
• This structure is called a decision tree and is used here for classification (Yes, No).
Decision Trees
• We can answer sequential questions before making a decision (outcome) by following a route through the tree from top to bottom.
• Each node encodes an "If this, then that" condition (see the sketch below).
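As a rough illustration, the tree above can be written as nested if/else rules in Python. The Outlook root, the Wind split, and the Yes/No leaves come from the slides; the Sunny-branch outcome and the Wind values are assumptions for illustration only.

```python
def play_soccer(outlook, wind):
    """Sketch of the play-soccer tree as nested if/else rules.

    The Sunny-branch outcome and the Wind values are illustrative assumptions.
    """
    if outlook == "Overcast":
        return "Yes"                                  # leaf node
    if outlook == "Rain":
        return "Yes" if wind == "Weak" else "No"      # internal node: split on Wind
    return "No"                                       # Sunny branch (assumed outcome)

print(play_soccer("Rain", "Weak"))    # Yes
print(play_soccer("Rain", "Strong"))  # No
```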
Decision Tree Components
• Root Node: the starting point of the tree (Outlook).
• Branches: lines connecting the nodes.
• Leaf Nodes: terminal nodes that give the predicted outcome (Yes, No).
• Internal Nodes: nodes that split on a feature value (e.g., Wind).
Decision Tree Components
• The depth of a decision tree is the number of questions asked before reaching a leaf node (class).
• The tree depth is given by its longest root-to-leaf route.
• The depth of our tree is 2.
Decision Tree Example
• Assume we have the following table with three features A, B, and C, and two target classes, Green and Blue.

  A  B  C  Target
  1  1  1  Green
  1  1  0  Green
  0  0  1  Blue
  1  0  0  Blue

• To decide which feature to start the decision tree with, we first need a few mathematical concepts.
Decision Tree Example
• Entropy, H, is the level of impurity in a group of examples:

  H = -Σ p_i log2(p_i)

  where p_i is the probability (proportion) of class i in the group.

• Information Gain, IG, decides which feature to split on at each step when building the decision tree:

  IG = H(parent node) - Σ w_j H(child node j)

  where w_j is the fraction of the parent's examples that end up in child j.
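These two quantities can be written directly in Python; a minimal standalone sketch (the function and variable names are illustrative):

```python
from math import log2

def entropy(labels):
    """H = -sum of p_i * log2(p_i), where p_i is the proportion of class i."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n) for c in set(labels))

def information_gain(parent_labels, child_groups):
    """IG = H(parent) - weighted sum of H(child), weighted by each child's share."""
    n = len(parent_labels)
    weighted = sum(len(g) / n * entropy(g) for g in child_groups)
    return entropy(parent_labels) - weighted

targets = ["Green", "Green", "Blue", "Blue"]   # the 4-row example used below
print(entropy(targets))                        # 1.0: two classes, two examples each
print(information_gain(targets, [["Green", "Green", "Blue"], ["Blue"]]))  # split on A: ~0.31
```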


Decision Tree Example
• Case #1: Split at A

  A  B  C  Target
  1  1  1  Green
  1  1  0  Green
  0  0  1  Blue
  1  0  0  Blue

  H(Parent) = -(2/4) log2(2/4) - (2/4) log2(2/4) = -(0.5)(-1) - (0.5)(-1) = 1

  Child #1 (A = 1): 2 Green, 1 Blue
  H(Child#1) = -(1/3) log2(1/3) - (2/3) log2(2/3) = -(1/3)(-1.58) - (2/3)(-0.58) = 0.92

  Child #2 (A = 0): 1 Blue
  H(Child#2) = -(1) log2(1) = 0

  IG = H(Parent) - (3/4) H(Child#1) - (1/4) H(Child#2)
     = 1 - (3/4)(0.92) - (1/4)(0) = 1 - 0.69 - 0 = 0.31
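A quick standalone check of these numbers in Python, using only math.log2:

```python
from math import log2

h_parent = -(2/4) * log2(2/4) - (2/4) * log2(2/4)   # = 1.0
h_child1 = -(1/3) * log2(1/3) - (2/3) * log2(2/3)   # A = 1: 2 Green, 1 Blue -> ~0.918
h_child2 = 0.0                                      # A = 0: a single Blue example -> pure
ig_a = h_parent - (3/4) * h_child1 - (1/4) * h_child2
print(round(ig_a, 2))                               # 0.31
```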
Decision Tree Example
• Case #2: Split at B

  A  B  C  Target
  1  1  1  Green
  1  1  0  Green
  0  0  1  Blue
  1  0  0  Blue

  H(Parent) = -(2/4) log2(2/4) - (2/4) log2(2/4) = -(0.5)(-1) - (0.5)(-1) = 1

  Child #1 (B = 1): 2 Green
  H(Child#1) = -(2/2) log2(2/2) = 0

  Child #2 (B = 0): 2 Blue
  H(Child#2) = -(2/2) log2(2/2) = 0

  IG = H(Parent) - (2/4) H(Child#1) - (2/4) H(Child#2) = 1 - 0 - 0 = 1
Decision Tree Example
• Case #3: Split at C

  A  B  C  Target
  1  1  1  Green
  1  1  0  Green
  0  0  1  Blue
  1  0  0  Blue

  H(Parent) = -(2/4) log2(2/4) - (2/4) log2(2/4) = -(0.5)(-1) - (0.5)(-1) = 1

  Child #1 (C = 1): 1 Green, 1 Blue
  H(Child#1) = -(1/2) log2(1/2) - (1/2) log2(1/2) = 1

  Child #2 (C = 0): 1 Green, 1 Blue
  H(Child#2) = -(1/2) log2(1/2) - (1/2) log2(1/2) = 1

  IG = H(Parent) - (2/4) H(Child#1) - (2/4) H(Child#2) = 1 - (2/4)(1) - (2/4)(1) = 1 - 0.5 - 0.5 = 0
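Pulling the three cases together, the following standalone loop reproduces all three IG values (the row tuples mirror the table above):

```python
from math import log2

rows = [  # (A, B, C, Target)
    (1, 1, 1, "Green"),
    (1, 1, 0, "Green"),
    (0, 0, 1, "Blue"),
    (1, 0, 0, "Blue"),
]

def entropy(labels):
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n) for c in set(labels))

targets = [r[3] for r in rows]
for i, feature in enumerate("ABC"):
    # Group the targets by the feature's value, then subtract the weighted child entropies.
    children = [[r[3] for r in rows if r[i] == v] for v in (0, 1)]
    ig = entropy(targets) - sum(len(c) / len(rows) * entropy(c) for c in children if c)
    print(feature, round(ig, 2))   # A 0.31, B 1.0, C 0.0
```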
Decision Tree Example
• Summarizing the IG values for the three features A, B, and C:

  Feature           A     B     C
  IG                0.31  1     0
  Quality of split  Poor  Best  Worst

• A higher IG value means a higher-quality split on that feature (here, feature B). We want to maximize the IG while building the decision tree.
• It is good to know how these calculations work, but in practice they are performed internally, e.g. by scikit-learn in Python (see the sketch below).
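A minimal scikit-learn version of the example; with criterion="entropy" the tree uses the same information-gain measure and ends up splitting on feature B at the root, matching the hand calculation:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[1, 1, 1], [1, 1, 0], [0, 0, 1], [1, 0, 0]]   # columns: A, B, C
y = ["Green", "Green", "Blue", "Blue"]

tree = DecisionTreeClassifier(criterion="entropy")
tree.fit(X, y)

# The printed rules split on B at the root, the feature with the highest IG.
print(export_text(tree, feature_names=["A", "B", "C"]))
```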
Decision Tree Example
Decision Tree Advantages/Disadvantages
• Advantages:
  - Easy to follow and understand (follow the route from the root node to a leaf node).
  - Fast, and can handle both numerical and categorical datasets.
• Disadvantages:
  - Training decision trees can be computationally expensive.
  - They can overfit: an over-complex tree does not generalize well and performs poorly on unseen data, even though the training error keeps dropping as the depth grows.
• Decision trees can be improved by:
  - Pruning (see the sketch below).
  - Bagging (Random Forest).
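As an illustration of pruning, a minimal sketch with scikit-learn on a synthetic placeholder dataset: max_depth caps the tree depth (pre-pruning) and ccp_alpha applies cost-complexity pruning.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
pruned = DecisionTreeClassifier(max_depth=3, ccp_alpha=0.01, random_state=0).fit(X_train, y_train)

# The unpruned tree typically fits the training set almost perfectly but
# generalizes worse; the pruned tree trades training accuracy for test accuracy.
for name, model in [("full", full), ("pruned", pruned)]:
    print(name, model.score(X_train, y_train), model.score(X_test, y_test))
```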
Overfitting & Underfitting
Random Forest
• Random Forest (RF) is one of the most powerful ML algorithms; it corrects the overfitting of decision trees during training.
• RF assembles multiple trees, each built on a randomly selected subset of the data, and outputs the mode (majority) class.
• A single decision tree is a greedy algorithm: it always puts the feature that minimizes the error (maximizes the information gain IG) at the top split.
• RF de-correlates the trees by excluding some candidate features at each split, so a few strong features do not dominate every tree's predictions.
Random Forest
• To avoid the overfitting issue in decision trees, we can apply a bagging approach, which does the following:
  - Create many bootstrap subsets of the dataset (e.g., 500), sampled with replacement.
  - Train a decision tree on each subset.
  - Combine the trees' predictions by majority vote. For example, if 5 bagged trees predict the classes G, G, B, G, B, then the most frequent class, G, is the final prediction (see the sketch below).
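A minimal bagging sketch in Python with scikit-learn; the tiny feature table reuses the slides' A/B/C example, and the query point [1, 0, 1] is an arbitrary illustration:

```python
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = np.array([[1, 1, 1], [1, 1, 0], [0, 0, 1], [1, 0, 0]])   # features A, B, C
y = np.array(["G", "G", "B", "B"])                            # target classes

n_trees = 5
predictions = []
for _ in range(n_trees):
    idx = rng.integers(0, len(X), size=len(X))                # bootstrap sample (with replacement)
    tree = DecisionTreeClassifier().fit(X[idx], y[idx])
    predictions.append(tree.predict([[1, 0, 1]])[0])          # each tree predicts one new example

print(predictions)                          # e.g. ['G', 'G', 'B', 'G', 'B']
print(Counter(predictions).most_common(1))  # the mode class is the final prediction
```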
Random Forest
• Random Forest assembles multiple trees, each trained on a randomly selected subset of the data, and returns the class that is the mode of the trees' predicted classes.
• Random Forest makes the trees more independent by minimizing the correlation among them, so that a few strong features do not dominate the predictions.
• Given P features, the number of randomly selected candidate features at each split, m, is typically sqrt(P).
• If the dataset has 25 features, m is set to 5 (see the sketch below).
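In scikit-learn this rule corresponds to the max_features parameter of RandomForestClassifier; a short sketch assuming P = 25:

```python
from math import isqrt
from sklearn.ensemble import RandomForestClassifier

P = 25                      # total number of features in the dataset
m = isqrt(P)                # candidate features considered at each split
print(m)                    # 5

# max_features="sqrt" applies this rule automatically for every split.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt")
print(rf.max_features)      # 'sqrt'
```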
Random Forest

(Diagram: a dataset of N examples and P features is split into random subsets; one tree is trained per subset, each considering m randomly selected features; the final prediction is the mode class, i.e., the majority vote of the trees.)
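Mirroring the diagram end to end on a synthetic placeholder dataset: fit a forest, inspect the individual trees' votes for one example, and compare with the forest's prediction. Note that scikit-learn averages the trees' class probabilities rather than taking a strict hard vote, which usually coincides with the majority class:

```python
from collections import Counter
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=9, random_state=1)
rf = RandomForestClassifier(n_estimators=25, max_features="sqrt", random_state=1)
rf.fit(X, y)

example = X[:1]                          # one example to classify
votes = [int(tree.predict(example)[0]) for tree in rf.estimators_]
print(Counter(votes))                    # e.g. Counter({1: 23, 0: 2})
print(rf.predict(example)[0])            # the forest's final (majority) class
```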
References
Géron, A., Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, 2nd Edition.
