Lecture 2: Decision Tree and Random Forest

The document discusses Decision Trees and Random Forests in machine learning, highlighting their structure, capabilities, and applications. It emphasizes the importance of tree depth and size to balance overfitting and underfitting, and outlines how Random Forests mitigate overfitting through randomness in data and feature selection. Applications include motion capture systems and medical image analysis.


DTS304TC: Machine Learning

Lecture 2: Decision Tree and Random Forest

Dr Kang Dang
D-5032, Taicang Campus
[email protected]
Tel: 88973341
1
Decision Trees
A decision tree is a hierarchical data structure that implements a divide-and-conquer strategy.
• Internal nodes test attributes
• Branching is determined by attribute value
• Leaf nodes are outputs (class assignments)
Image source: https://www.linkedin.com/pulse/using-classification-tree-detect-spam-mails-stefano-stompanato/
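For illustration (a minimal sketch, not part of the original slide; it assumes scikit-learn and its bundled iris dataset), fitting a small tree and printing its structure makes the attribute tests at internal nodes and the class assignments at the leaves visible:

```python
# Fit a small decision tree and print its structure.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Each "|---" line is an internal node testing one attribute against a threshold;
# "class:" lines are leaf nodes holding the output class assignment.
print(export_text(tree, feature_names=iris.feature_names))
```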

2
Question – XOR problem

Which types of machine learning algorithm could be used to classify this pattern? Candidates include: Linear SVM, Decision Tree, Nearest Neighbor, and Neural Network.

3
Decision Tree for XOR problem

• XOR Classification Capability:
  • Decision trees can model complex feature interactions.
  • Unlike linear classifiers, decision trees can successfully classify XOR patterns.
• Axis-Aligned Decision Boundaries:
  • Decision trees create decision boundaries that align with the feature axes.
  • This contrasts with the more flexible boundaries found in neural networks.

Image source: https://datascience.stackexchange.com/questions/27960/what-are-examples-for-xor-parity-and-multiplexer-problems-in-decision-tree-lear


https://www.hackerearth.com/practice/machine-learning/machine-learning-algorithms/ml-decision-tree/tutorial/
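A minimal sketch (a toy setup of my own, not from the slides) of the claim above: a depth-2 decision tree separates the four XOR points, while a linear classifier cannot:

```python
# Decision tree vs. linear classifier on the XOR pattern.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])  # XOR labels

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
linear = LogisticRegression().fit(X, y)

print(tree.score(X, y))    # 1.0 - two axis-aligned splits capture XOR exactly
print(linear.score(X, y))  # at most 0.75 - no single linear boundary separates XOR
```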

4
Decision Tree Representation Power
Discrete Inputs/Outputs:
• Can represent any discrete function of the input attributes. (Decision Tree)
Continuous Inputs/Outputs:
• Can approximate any continuous function arbitrarily closely. (Regression Tree)

Overfitting Concern:
• It is possible to construct a tree that fits the training data perfectly.
• Such trees may fail to generalize to new data due to overfitting.
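To illustrate the overfitting concern (a sketch on assumed synthetic data, not from the slides), an unconstrained tree can fit the training set perfectly yet generalize worse than a smaller tree:

```python
# Perfect training fit vs. generalization for full and depth-limited trees.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
small = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

print(full.score(X_tr, y_tr), full.score(X_te, y_te))    # typically 1.0 on train, lower on test
print(small.score(X_tr, y_tr), small.score(X_te, y_te))  # imperfect train fit, often better test score
```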

5
Learning Decision Tree
• Decision tree learning is a challenging problem: the number of possible trees grows exponentially with the number of features.
• To address this complexity, we employ a practical approach: a greedy heuristic.
  • Start with an empty decision tree.
  • Identify the "best" attribute for splitting.
  • Recurse on the resulting subsets.
• The selection of the "best" attribute is crucial. Common criteria: Gini Impurity, Information Gain (based on Entropy), Chi-Square.
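For concreteness (a minimal sketch, not from the slides), the entropy-based information gain that the greedy heuristic evaluates for one candidate binary split:

```python
# Entropy and information gain for a single candidate split.
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children  # reduction in entropy from the split

parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])
left, right = parent[:3], parent[3:]          # split the examples after the third one
print(information_gain(parent, left, right))  # the greedy step picks the split with the highest gain
```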

6
Learning Decision Tree – YouTube Video

• Decision Tree: Important things to know – YouTube

7
What Makes A Good Tree?

The right balance is key: neither too small nor too big.

• Not too small: it needs to handle important but possibly subtle distinctions in the data.
• Not too big:
  • Computational efficiency (avoid redundant, spurious attributes).
  • Avoid over-fitting the training examples.
  • Typical controls: setting a maximum depth for the tree, minimum samples for a node split, minimum samples for a leaf node, ensemble methods (Random Forest), pruning, etc. (see the sketch below).
• Find the simplest hypothesis (smallest tree) that fits the observations.
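These controls map directly onto scikit-learn's pre-pruning parameters; the sketch below uses illustrative values (my assumptions, not recommendations from the slides):

```python
# Pre-pruning controls that keep a decision tree from growing too big.
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(
    max_depth=5,           # maximum depth of the tree
    min_samples_split=20,  # minimum samples required to split a node
    min_samples_leaf=10,   # minimum samples required at a leaf node
    ccp_alpha=0.001,       # cost-complexity pruning strength
    random_state=0,
)
```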
8
Tree Overfitting and Underfitting Issues (see Lab 1 this week)

When the tree becomes too deep, even slight changes in the data can cause the decision tree's feature space to change dramatically, as it starts capturing noise in the data.
9
How to overcome Tree Overfitting Issues?
• It is necessary to keep the tree compact.
• Options: setting a maximum depth for the tree, minimum samples for a node split, minimum samples for a leaf node.
• What are the issues with limiting tree depth?

10
How to overcome Tree Overfitting Issues?
• What are the issues with limiting tree depth?
  • A shallow tree fails to capture the complexity of the data distribution.

• A better approach: Random Forest

11
Random Forest

• Tree Construction: Trees are built from random subsets of the training data. At each node, a random subset of features is chosen to make the decision split.
• Reduction of Overfitting: By averaging multiple decision trees, random forests reduce the risk of overfitting common to individual trees.
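A minimal sketch (assumed synthetic data, not from the slides) comparing a single fully grown tree with a random forest by cross-validation:

```python
# Single decision tree vs. random forest on the same data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, flip_y=0.1, random_state=0)

single_tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0)

print(cross_val_score(single_tree, X, y).mean())  # typically lower
print(cross_val_score(forest, X, y).mean())       # typically higher - averaging reduces overfitting
```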
12
Randomness is Key to Random Forests
Analogy:
• Building a Random Forest is like constructing a diversified investment
portfolio.
• Why? Instead of relying on similar "stocks" (decisions), the goal is to
diversify by creating a mix of "assets" (trees).

Why Randomness?
• Randomness helps reduce overfitting by ensuring that each tree in the
forest is different.
• This diversity among trees improves the overall performance of the
model.

13
How Randomness is Introduced in Random Forests
• Randomness 1: Random Subsets of the Training Data
• Each tree is built using a random subset of the data.
• Why? This ensures that no single tree relies on the entire dataset, promoting
variety across the trees.
• Randomness 2: Random Subset of Features at Each Decision Node
• At every decision node, only a random subset of features is considered for
splitting the data.
• Why? This reduces correlation between trees, making them more independent
and improving the model’s robustness.
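For reference (a sketch of how these two sources of randomness surface in scikit-learn; the parameter values are illustrative assumptions):

```python
# The two sources of randomness as RandomForestClassifier parameters.
from sklearn.ensemble import RandomForestClassifier

forest = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,       # Randomness 1: each tree sees a bootstrap sample of the rows
    max_samples=0.8,      # ...optionally limited to a fraction of the training data
    max_features="sqrt",  # Randomness 2: each split considers a random subset of features
    random_state=0,
)
```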

14
Bias and Variance in Random Forest
Bias:
• Definition: Error due to overly simplistic models that don’t capture the true data
pattern.
• In Random Forest: Shallow trees (e.g., low depth) might have high bias. They can't
capture complex patterns in data, leading to underfitting.
Variance:
• Definition: Error due to model sensitivity to small fluctuations in the training data.
• In Random Forest: Deep trees (e.g., high depth) may have high variance. They
memorize the training data too well, leading to overfitting.
Random Forest Balances Bias and Variance: By using random subsets of data and
features and averaging, Random Forest reduces variance (overfitting).
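A minimal sketch of this trade-off (assumed synthetic data, not from the slides): a shallow tree underfits, an unconstrained tree memorizes the training set, and the forest usually generalizes best:

```python
# Bias (shallow tree) vs. variance (deep tree) vs. random forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = [
    DecisionTreeClassifier(max_depth=2, random_state=0),     # high bias: underfits
    DecisionTreeClassifier(max_depth=None, random_state=0),  # high variance: overfits
    RandomForestClassifier(n_estimators=200, random_state=0),
]
for model in models:
    model.fit(X_tr, y_tr)
    print(type(model).__name__, model.score(X_tr, y_tr), model.score(X_te, y_te))
```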

15
Application 1: Random Decision Trees in Microsoft Kinect

Technology behind Kinect – Random Decision Trees by MSRC, 2011
16
Decision trees are in Xbox: Classifying body parts
1. Inference: Body part distribution per pixel is derived from the depth image.
2. Estimation: Local signal modes are calculated to generate proposals.
3. Result: Precise 3D joint location proposals for body parts.

Handling Complexity:
• Multiple Users: Capable of distinguishing between different individuals in the same image.
• Diverse Poses: Effective even with various body poses.

17
Lesson 1: Use Computer Graphics to Generate a Lot of Data

3D Model: Existing computer graphics with a pre-defined 3D model. Body parts are explicitly labeled corresponding to the 3D model.
Deformation: Manipulate standard 3D body models to introduce variations. Generate extensive synthetic data through controlled model alterations.
Benefits: Enhances the robustness of machine learning models through diverse training data.
18
Lesson 2: Depth Features
• Use simple depth features with random decision tree algorithms.

• For each pixel x in image I, compute the feature

  f_θ(I, x) = d_I(x + u / d_I(x)) − d_I(x + v / d_I(x))

  where f_θ is the depth feature, d_I(x) is the depth at pixel x, and the parameter θ = (u, v) describes the offsets u and v.

• The normalization of the offsets by the depth d_I(x) ensures the features are depth invariant: at any given point in the image, a fixed world-space offset is probed whether the pixel is close to or far from the camera.
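A minimal sketch of this feature as I read the formula above (my own toy implementation, not the authors' code; image bounds checking is omitted):

```python
# Depth-invariant pixel feature: offsets are scaled by the depth at x.
import numpy as np

def depth_feature(depth_image, x, u, v):
    """depth_image: 2D array of depths; x, u, v: (row, col) integer pairs."""
    d_x = depth_image[x]
    probe_u = (x[0] + int(u[0] / d_x), x[1] + int(u[1] / d_x))
    probe_v = (x[0] + int(v[0] / d_x), x[1] + int(v[1] / d_x))
    return depth_image[probe_u] - depth_image[probe_v]

depth = np.full((480, 640), 2.0)  # toy depth map: 2 metres everywhere
print(depth_feature(depth, (240, 320), u=(40, 0), v=(0, 40)))  # 0.0 on a flat scene
```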
19
Lesson 3: Random Forest

Random Forest is widely considered a top-tier machine learning algorithm, and a consistent favorite in Kaggle competitions.

20
Application 2: Decision Forests with Long-Range Spatial Context for Organ Localization in CT Volumes

Simple spatial features + random decision tree (Random Forest) algorithms.

MSRC builds upon Kinect's approach for medical image analysis. Check the Criminisi paper for details.
21
Decision Tree Pro and Cons
• Pros:
• Interpretability: Decision trees are highly interpretable. They mimic
human decision-making more closely than other classifiers and provide
a clear decision path.
• Handles Mixed Data: Can handle both numerical and categorical data.
• No Scaling Required: Unlike algorithms like SVM or neural networks,
decision trees do not require feature scaling.
• Cons:
• Overfitting: Prone to overfitting, especially with many features.
• Instability: Small changes in the data can lead to a very different tree.
• Random Forests are much stronger, but lose the interpretability advantage.

22
Summary
Motivation for Decision Trees: Hierarchical partitioning of the feature space using a divide-and-conquer approach.

Tree Construction:
• Utilizes a recursive algorithm.
• Employs information gain to determine optimal splits.
Parameter Tuning:
• Tree depth and size are pivotal for balance.
• Aims to mitigate overfitting and underfitting.

Applications: Kinect motion capture systems and many other applications.
23