Lecture 2: Decision Tree and Random Forest

The document discusses Decision Trees and Random Forests in machine learning, highlighting their structure, capabilities, and applications. It emphasizes the importance of tree depth and size to balance overfitting and underfitting, and outlines how Random Forests mitigate overfitting through randomness in data and feature selection. Applications include motion capture systems and medical image analysis.


DTS304TC: Machine Learning

Lecture 2: Decision Tree and Random Forest

Dr Kang Dang
D-5032, Taicang Campus
[email protected]
Tel: 88973341
1
Decision Trees
A decision tree is a hierarchical data structure that implements a divide-and-conquer strategy.
• Internal nodes test attributes
• Branching is determined by attribute value
• Leaf nodes are outputs (class assignments)
Image source: https://www.linkedin.com/pulse/using-classification-tree-detect-spam-mails-stefano-stompanato/
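For illustration (a minimal sketch, not part of the original slide; it assumes scikit-learn and its bundled iris dataset), fitting a small tree and printing its structure makes the attribute tests at internal nodes and the class assignments at the leaves visible:

```python
# Fit a small decision tree and print its structure.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(iris.data, iris.target)

# Each "|---" line is an internal node testing one attribute against a threshold;
# "class:" lines are leaf nodes holding the output class assignment.
print(export_text(tree, feature_names=iris.feature_names))
```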

2
Question – XOR problem

Which types of machine learning algorithm could be used to classify this pattern? Candidates include: Linear SVM, Decision Tree, Nearest Neighbor, and Neural Network.

3
Decision Tree for XOR problem

• XOR Classification Capability:
  • Decision trees can model complex feature interactions.
  • Unlike linear classifiers, decision trees can successfully classify XOR patterns.
• Axis-Aligned Decision Boundaries:
  • Decision trees create decision boundaries that align with the feature axes.
  • This contrasts with the more flexible boundaries found in neural networks.

Image source: https://datascience.stackexchange.com/questions/27960/what-are-examples-for-xor-parity-and-multiplexer-problems-in-decision-tree-lear


https://www.hackerearth.com/practice/machine-learning/machine-learning-algorithms/ml-decision-tree/tutorial/
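A minimal sketch (a toy setup of my own, not from the slides) of the claim above: a depth-2 decision tree separates the four XOR points, while a linear classifier cannot:

```python
# Decision tree vs. linear classifier on the XOR pattern.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 0])  # XOR labels

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
linear = LogisticRegression().fit(X, y)

print(tree.score(X, y))    # 1.0 - two axis-aligned splits capture XOR exactly
print(linear.score(X, y))  # at most 0.75 - no single linear boundary separates XOR
```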

4
Decision Tree Representation Power
Discrete Inputs/Outputs:
• Can represent any discrete function of the input attributes. (Decision Tree)
Continuous Inputs/Outputs:
• Can approximate any continuous function arbitrarily closely. (Regression Tree)

Overfitting Concern:
• It is possible to construct a tree that fits the training data perfectly.
• Such trees may fail to generalize to new data due to overfitting.
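To illustrate the overfitting concern (a sketch on assumed synthetic data, not from the slides), an unconstrained tree can fit the training set perfectly yet generalize worse than a smaller tree:

```python
# Perfect training fit vs. generalization for full and depth-limited trees.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
small = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

print(full.score(X_tr, y_tr), full.score(X_te, y_te))    # typically 1.0 on train, lower on test
print(small.score(X_tr, y_tr), small.score(X_te, y_te))  # imperfect train fit, often better test score
```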

5
Learning Decision Tree
• Decision tree learning is a challenging problem: the number of possible trees grows exponentially with the number of features.
• To address this complexity, we employ a practical approach: a greedy heuristic.
  • Start with an empty decision tree.
  • Identify the "best" attribute for splitting.
  • Recurse on the resulting subsets.
• The selection of the "best" attribute is crucial. Common criteria: Gini Impurity, Information Gain (based on Entropy), Chi-Square.
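For concreteness (a minimal sketch, not from the slides), the entropy-based information gain that the greedy heuristic evaluates for one candidate binary split:

```python
# Entropy and information gain for a single candidate split.
import numpy as np

def entropy(labels):
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def information_gain(parent, left, right):
    n = len(parent)
    children = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - children  # reduction in entropy from the split

parent = np.array([0, 0, 0, 0, 1, 1, 1, 1])
left, right = parent[:3], parent[3:]          # split the examples after the third one
print(information_gain(parent, left, right))  # the greedy step picks the split with the highest gain
```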

6
Learning Decision Tree – YouTube Video

• Decision Tree: Important things to know – YouTube

7
What Makes A Good Tree?

The right balance is key: neither too small nor too big.

• Not too small: it needs to handle important but possibly subtle distinctions in the data.
• Not too big:
  • Computational efficiency (avoid redundant, spurious attributes).
  • Avoid over-fitting the training examples.
  • Typical controls: setting a maximum depth for the tree, minimum samples for a node split, minimum samples for a leaf node, ensemble methods (Random Forest), pruning, etc. (see the sketch below).
• Find the simplest hypothesis (smallest tree) that fits the observations.
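These controls map directly onto scikit-learn's pre-pruning parameters; the sketch below uses illustrative values (my assumptions, not recommendations from the slides):

```python
# Pre-pruning controls that keep a decision tree from growing too big.
from sklearn.tree import DecisionTreeClassifier

tree = DecisionTreeClassifier(
    max_depth=5,           # maximum depth of the tree
    min_samples_split=20,  # minimum samples required to split a node
    min_samples_leaf=10,   # minimum samples required at a leaf node
    ccp_alpha=0.001,       # cost-complexity pruning strength
    random_state=0,
)
```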
8
Tree Overfitting and Underfitting Issues (see Lab 1 this week)

When the tree becomes too deep, even slight changes in the data can cause the decision tree's feature space to change dramatically, as it starts capturing noise in the data.
9
How to overcome Tree Overfitting Issues?
• It is necessary to keep the tree compact.
• Options: setting a maximum depth for the tree, minimum samples for a node split, minimum samples for a leaf node.
• What are the issues with limiting tree depth?

10
How to overcome Tree Overfitting Issues?
• What are the issues with limiting tree depth?
  • A shallow tree fails to capture the complexity of the data distribution.

• A better approach: Random Forest

11
Random Forest

• Tree Construction: Trees are built from random subsets of the training data. At each node, a random subset of features is chosen to make the decision split.
• Reduction of Overfitting: By averaging multiple decision trees, random forests reduce the risk of overfitting common to individual trees.
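A minimal sketch (assumed synthetic data, not from the slides) comparing a single fully grown tree with a random forest by cross-validation:

```python
# Single decision tree vs. random forest on the same data.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, flip_y=0.1, random_state=0)

single_tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=200, random_state=0)

print(cross_val_score(single_tree, X, y).mean())  # typically lower
print(cross_val_score(forest, X, y).mean())       # typically higher - averaging reduces overfitting
```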
12
Randomness is Key to Random Forests
Analogy:
• Building a Random Forest is like constructing a diversified investment
portfolio.
• Why? Instead of relying on similar "stocks" (decisions), the goal is to
diversify by creating a mix of "assets" (trees).

Why Randomness?
• Randomness helps reduce overfitting by ensuring that each tree in the
forest is different.
• This diversity among trees improves the overall performance of the
model.

13
How Randomness is Introduced in Random Forests
• Randomness 1: Random Subsets of the Training Data
• Each tree is built using a random subset of the data.
• Why? This ensures that no single tree relies on the entire dataset, promoting
variety across the trees.
• Randomness 2: Random Subset of Features at Each Decision Node
• At every decision node, only a random subset of features is considered for
splitting the data.
• Why? This reduces correlation between trees, making them more independent
and improving the model’s robustness.
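For reference (a sketch of how these two sources of randomness surface in scikit-learn; the parameter values are illustrative assumptions):

```python
# The two sources of randomness as RandomForestClassifier parameters.
from sklearn.ensemble import RandomForestClassifier

forest = RandomForestClassifier(
    n_estimators=100,
    bootstrap=True,       # Randomness 1: each tree sees a bootstrap sample of the rows
    max_samples=0.8,      # ...optionally limited to a fraction of the training data
    max_features="sqrt",  # Randomness 2: each split considers a random subset of features
    random_state=0,
)
```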

14
Bias and Variance in Random Forest
Bias:
• Definition: Error due to overly simplistic models that don’t capture the true data
pattern.
• In Random Forest: Shallow trees (e.g., low depth) might have high bias. They can't
capture complex patterns in data, leading to underfitting.
Variance:
• Definition: Error due to model sensitivity to small fluctuations in the training data.
• In Random Forest: Deep trees (e.g., high depth) may have high variance. They
memorize the training data too well, leading to overfitting.
Random Forest Balances Bias and Variance: By using random subsets of data and
features and averaging, Random Forest reduces variance (overfitting).
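A minimal sketch of this trade-off (assumed synthetic data, not from the slides): a shallow tree underfits, an unconstrained tree memorizes the training set, and the forest usually generalizes best:

```python
# Bias (shallow tree) vs. variance (deep tree) vs. random forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, flip_y=0.2, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = [
    DecisionTreeClassifier(max_depth=2, random_state=0),     # high bias: underfits
    DecisionTreeClassifier(max_depth=None, random_state=0),  # high variance: overfits
    RandomForestClassifier(n_estimators=200, random_state=0),
]
for model in models:
    model.fit(X_tr, y_tr)
    print(type(model).__name__, model.score(X_tr, y_tr), model.score(X_te, y_te))
```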

15
Application 1: Random Decision Trees in Microsoft Kinect

Technology behind Kinect – Random Decision Trees by MSRC, 2011
16
Decision trees are in Xbox: Classifying body parts
1. Inference: Body part distribution per pixel is derived from the depth image.
2. Estimation: Local signal modes are calculated to generate proposals.
3. Result: Precise 3D joint location proposals for body parts.

Handling Complexity:
• Multiple Users: Capable of distinguishing between different individuals in the same image.
• Diverse Poses: Effective even with various body poses.

17
Lesson 1: Use Computer Graphics to Generate a Lot of Data

3D Model: Existing computer graphics with a pre-defined 3D model. Body parts are explicitly labeled corresponding to the 3D model.
Deformation: Manipulate standard 3D body models to introduce variations. Generate extensive synthetic data through controlled model alterations.
Benefits: Enhances the robustness of machine learning models through diverse training data.
18
Lesson 2: Depth Features
• Use simple depth features with random decision tree algorithms.

• For each pixel x in image I, compute the feature

  f_θ(I, x) = d_I(x + u / d_I(x)) − d_I(x + v / d_I(x))

  where f_θ is the depth feature, d_I(x) is the depth at pixel x, and the parameter θ = (u, v) describes the offsets u and v.

• The normalization of the offsets by the depth d_I(x) ensures the features are depth invariant: at any given point in the image, a fixed world-space offset is probed whether the pixel is close to or far from the camera.
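A minimal sketch of this feature as I read the formula above (my own toy implementation, not the authors' code; image bounds checking is omitted):

```python
# Depth-invariant pixel feature: offsets are scaled by the depth at x.
import numpy as np

def depth_feature(depth_image, x, u, v):
    """depth_image: 2D array of depths; x, u, v: (row, col) integer pairs."""
    d_x = depth_image[x]
    probe_u = (x[0] + int(u[0] / d_x), x[1] + int(u[1] / d_x))
    probe_v = (x[0] + int(v[0] / d_x), x[1] + int(v[1] / d_x))
    return depth_image[probe_u] - depth_image[probe_v]

depth = np.full((480, 640), 2.0)  # toy depth map: 2 metres everywhere
print(depth_feature(depth, (240, 320), u=(40, 0), v=(0, 40)))  # 0.0 on a flat scene
```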
19
Lesson 3: Random Forest

Random Forest is widely considered a top-tier machine learning algorithm, and a consistent favorite in Kaggle competitions.

20
Application 2: Decision Forests with Long-Range Spatial Context for Organ Localization in CT Volumes

Simple spatial features + random decision tree (Random Forest) algorithms.

MSRC builds upon Kinect's approach for medical image analysis. Check the Criminisi paper for details.
21
Decision Tree Pro and Cons
• Pros:
• Interpretability: Decision trees are highly interpretable. They mimic
human decision-making more closely than other classifiers and provide
a clear decision path.
• Handles Mixed Data: Can handle both numerical and categorical data.
• No Scaling Required: Unlike algorithms like SVM or neural networks,
decision trees do not require feature scaling.
• Cons:
• Overfitting: Prone to overfitting, especially with many features.
• Instability: Small changes in the data can lead to a very different tree.
• Random Forests are much stronger, but lose the interpretability advantage.

22
Summary
Motivation for Decision Trees: Hierarchical partitioning of the feature space using a divide-and-conquer approach.

Tree Construction:
• Utilizes a recursive algorithm.
• Employs information gain to determine optimal splits.
Parameter Tuning:
• Tree depth and size are pivotal for balance.
• Aims to mitigate overfitting and underfitting.

Applications: Kinect motion capture systems and many other applications.
23