
Introduction to Decision Trees

Dr. BENDIABDALLAH
How do Decision Trees work?
• A decision tree is a simple but powerful supervised learning method that uses a tree-like model of decisions and their possible consequences.
• Decision trees are used in both classification and regression problems.
• The creation of sub-nodes increases the homogeneity of the resultant sub-nodes.
• Decision trees classify examples by sorting them down the tree from the root to some leaf/terminal node, with the leaf/terminal node providing the classification of the example (as sketched below).
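As a minimal sketch (not from the slides), a tree can be represented as nested nodes, and classifying an example is simply a walk from the root down to a leaf. The node layout and feature names below are hypothetical.

    # Hypothetical nested-dict representation of a small decision tree.
    tree = {
        "feature": "Outlook",                      # root node splits on Outlook
        "children": {
            "Sunny": {"feature": "Humidity",
                      "children": {"High": "No", "Normal": "Yes"}},
            "Overcast": "Yes",                     # a leaf: a plain class label
            "Rain": {"feature": "Wind",
                     "children": {"Strong": "No", "Weak": "Yes"}},
        },
    }

    def classify(node, example):
        # Leaves are plain labels; internal nodes route the example by feature value.
        if not isinstance(node, dict):
            return node
        return classify(node["children"][example[node["feature"]]], example)

    print(classify(tree, {"Outlook": "Sunny", "Humidity": "Normal"}))  # -> Yes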
Tree Methods

Imagine that I play tennis every Saturday and I always invite a friend to come with me.
Sometimes my friend shows up, sometimes not.
For him, it depends on a variety of factors, such as weather, temperature, humidity, wind, etc.
I start keeping track of these features and whether or not he showed up to play with me.
Tree Methods

I want to use this data to predict whether or not he will show up to play.
An intuitive way to do this is through a Decision Tree.
Tree Methods

In this tree we have:
● Nodes
  ○ Split on the value of a certain attribute
● Edges
  ○ Outcome of a split, leading to the next node
[Diagram: the example tree with the Root and the Leaves labeled]

Tree Methods

In this tree we have:
● Root
  ○ The node that performs the first split
● Leaves (colored nodes)
  ○ Terminal nodes that predict the outcome
Algorithms used in Decision Trees:

• ID3 → (Iterative Dichotomiser 3)
• C4.5 → (successor of ID3)
• CART → (Classification And Regression Tree)
• CHAID → (Chi-square Automatic Interaction Detection; performs multi-level splits when computing classification trees)
• MARS → (Multivariate Adaptive Regression Splines)
The primary challenge in implementing a decision tree is identifying which attribute to place at the root node and at each subsequent level.
Creating a Decision Tree
• In the beginning, the whole training set is considered as the root.
• Feature values are preferably categorical; if the values are continuous, they are discretized prior to building the model.
• The order in which attributes are placed as the root or as internal nodes of the tree is determined using a statistical approach.
We have different attribute selection measures to identify the attribute that should be used as the root node at each level.
Attribute selection measures
• To solve this attribute selection problem, researchers devised some criteria, such as:
• Entropy (a reminder of the formula follows below),
• Gini index,
• ...
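As a reminder (a standard definition, not spelled out in the slides): for a node whose examples fall into classes with proportions p_1, ..., p_k, the entropy is

    Entropy = - sum_{i=1..k} p_i * log2(p_i)

A pure node has entropy 0, and a node split evenly between two classes has entropy 1. For example, a node with 9 "Yes" and 5 "No" examples (hypothetical counts) has an entropy of about 0.94.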
Gini Index
• Gini index: the Gini index is a number describing the quality of the split of a node on a variable (feature); the lower the index, the purer the resulting sub-nodes (a small sketch follows below).
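A minimal sketch (the formula is standard; the helper function and example counts below are not from the slides): for class proportions p_1, ..., p_k at a node, Gini = 1 - sum_i p_i^2, and a split is typically scored by the weighted average Gini of the child nodes it produces.

    # Minimal sketch: Gini impurity of one node, computed from its class labels.
    from collections import Counter

    def gini(labels):
        # 1 minus the sum of squared class proportions.
        n = len(labels)
        return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

    print(gini(["Yes"] * 9 + ["No"] * 5))   # ~0.459 (hypothetical counts)
    print(gini(["Yes"] * 9))                # 0.0, a pure node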
Random Forests

To improve performance, we can use many trees, with a random sample of features chosen as split candidates.
● A new random sample of m features (out of the p total features) is chosen for every single tree at every single split.
● For classification, m is typically chosen to be the square root of p (a library sketch follows below).
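As a minimal sketch of this idea (the slides do not prescribe a library; scikit-learn and the toy data below are assumptions), max_features="sqrt" implements the m = sqrt(p) rule:

    # Minimal sketch, assuming scikit-learn; the toy dataset is synthetic.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=200, n_features=16, random_state=0)
    # max_features="sqrt": roughly sqrt(p) features are candidates at each split.
    forest = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)
    forest.fit(X, y)
    print(forest.score(X, y))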
Random Forests

What's the point?

● Suppose there is one very strong feature in the data set. When using "bagged" trees, most of the trees will use that feature as the top split, resulting in an ensemble of similar trees that are highly correlated.
Random Forests

What's the point?

● Averaging highly correlated quantities does not significantly reduce variance (see the note below).
● By randomly leaving out candidate features from each split, Random Forests "decorrelate" the trees, so that the averaging process can reduce the variance of the resulting model.
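A standard way to see this (a well-known result, not stated in the slides): if we average B identically distributed tree predictions, each with variance sigma^2 and pairwise correlation rho, then

    Var(average) = rho * sigma^2 + ((1 - rho) / B) * sigma^2

The second term vanishes as B grows, but the first does not. Lowering the correlation rho between the trees is therefore what lets averaging actually shrink the variance.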
