Machine Learning II - Decision Trees
Decision Trees
Prepared By
Dr Augustine S. Nsang
Introduction
Decision tree learning is an extraordinarily important
algorithm for AI, not only because it is very powerful, but
also because it is a simple and efficient method for
extracting knowledge from data.
Compared to other learning algorithms, it has an important
advantage: the extracted knowledge can easily be
understood, interpreted, and checked by humans in the
form of a readable decision tree.
We shall show in a simple example how a decision tree
can be constructed from training data.
A Simple Example
A devoted skier who lives near the High Sierra, a beautiful
mountain range in California, wants a decision tree to help
him decide whether it is worthwhile to drive his car to a ski
resort in the mountains. We thus have a two-class problem
(ski: yes/no) based on the variables listed in Table 1.
Figure 1 (Slide 5) shows a decision tree for this problem. A
decision tree is a tree whose inner nodes represent features
(attributes). Each edge stands for an attribute value. At each
leaf node a class value is given.
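Figure 1 itself is not reproduced in these notes, but the structure it
describes translates directly into code: a decision tree is simply a nest
of attribute tests. The sketch below is purely illustrative (the attribute
names snow_dist, weekend, and sun, the threshold, and the tree shape are
assumptions, not the actual tree of Figure 1):

def classify(snow_dist, weekend, sun):
    # Inner nodes test attributes; each branch corresponds to one
    # attribute value; the leaves return the class (ski: yes/no).
    if snow_dist <= 100:          # root node: distance to snow in km (assumed)
        if weekend == "yes":      # second attribute test (assumed)
            return "yes" if sun == "yes" else "no"
        return "yes"              # assumed leaf: weekday trip is worthwhile
    return "no"                   # snow too far away: not worthwhile

For example, classify(40, "yes", "yes") follows the edges
snow_dist <= 100, weekend = yes, and sun = yes to the leaf "yes".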
The data used for the construction of the decision tree is
shown in Table 1 (next slide). Each row in the table contains
the data for one day and as such represents a sample.
A Simple Example – Cont’d
The class values of the eleven training samples form the sequence
D = (yes, yes, yes, yes, yes, yes, no, no, no, no, no)
To quantify the information in such a sequence, consider a probability
distribution p = (p1, ..., pn) over n possible events. For the extreme
case p = (1, 0, ..., 0), the first one of the n events will certainly
occur and all the others will not. The uncertainty about the outcome of
the events is thus minimal.
In contrast, for the uniform distribution p = (1/n, 1/n, ..., 1/n),
no event is preferred over the others, and the uncertainty about the
outcome is maximal.
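This uncertainty is measured by the entropy of the distribution. Assuming
the standard Shannon definition (which is what Fig 2 below plots for the
case n = 2):

H(p1, ..., pn) := -(p1 log2 p1 + ... + pn log2 pn)

and H(D) is obtained by estimating each pi as the relative frequency of
class i in the dataset D. For p = (1, 0, ..., 0) this yields H = 0, while
the uniform distribution attains the maximum (1 bit for two classes).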
Fig 2: The entropy function for the case of two classes. We see the maximum at p = 1/2 and the symmetry with respect to swapping p and 1 − p.
Information Content
The information content of a dataset is defined as:
I(D) := 1 − H(D)
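As a quick sanity check, the snippet below (a minimal sketch; the helper
name entropy is ours) computes H(D) and I(D) for the eleven class values
listed earlier:

import math

def entropy(labels):
    # Shannon entropy (base 2) of the class distribution in labels.
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

D = ["yes"] * 6 + ["no"] * 5   # the class values from the example
H = entropy(D)                 # about 0.994 bits: near-maximal uncertainty
I = 1 - H                      # about 0.006: almost no information content
print(f"H(D) = {H:.3f}, I(D) = {I:.3f}")

With six "yes" and five "no" labels the class distribution is almost
uniform, so the dataset carries very little information on its own.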
Random Forests
A random forest classifies by combining the votes of p decision trees,
each trained on a random subset of the data. It is constructed as follows:
Step 1:
For i := 1 to p:
Select k data samples (k < n) at random from the n data samples in
D, and construct a decision tree using these k data samples.
Step 2:
Given any unknown data sample x, classify x using each of the p
decision trees constructed in Step 1. The class returned by the random
forest is the one given by a majority vote of the p decision trees.
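A minimal sketch of these two steps in Python, assuming scikit-learn's
DecisionTreeClassifier as the tree learner (the function names and the
default choice k = n // 2 are ours, not from the procedure above):

import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_forest(X, y, p=10, k=None, seed=0):
    # Step 1: build p trees, each on k samples drawn at random from D.
    rng = np.random.default_rng(seed)
    n = len(X)
    k = k if k is not None else max(1, n // 2)   # assumed default, k < n
    trees = []
    for _ in range(p):
        idx = rng.choice(n, size=k, replace=False)
        trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return trees

def forest_classify(trees, x):
    # Step 2: classify x with every tree and take a majority vote.
    votes = [t.predict([x])[0] for t in trees]
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]

Because each tree sees a different random subset of the data, the
individual trees disagree on hard cases, and the majority vote averages
out their individual errors.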