
Machine Learning (IS ZC464)
BITS Pilani, Pilani Campus

SK Hafizul Islam, Ph.D
Phone: +91-1596-51-5846

Lecture No. 9
Date: 20/02/2016
Time: 2:00 PM - 4:00 PM

Decision Tree Learning

Today's agenda

What is a Decision Tree?
How is a Decision Tree used as a classifier?
How to construct a Decision Tree?
Where can we apply Decision Tree Learning?
What is Entropy?
What is Information Gain?
How does the ID3 Algorithm work?


Decision Tree
A decision tree is a structure that includes
a root node,
a set of internal nodes,
a set of branches, and
a set of leaf nodes.


Decision Tree
A decision tree is used for classification.
Each internal node denotes a test on an attribute,
each branch denotes the outcome of a test, and
each leaf node holds a class label.


Decision Tree
This decision tree is for the concept buy_computer, which indicates whether a customer at a company is likely to buy a computer or not.

Each internal node represents a test on an attribute.
Each leaf node represents a class.

How to use for classification?

Begin at the root node.
Follow the correct branch.
Reach an appropriate leaf node.
Read off the predicted class value.

X = (age = senior, Student = no, Credit_rating = excellent), buy_computer = ?
Y = (age = young, Student = yes, Credit_rating = fair), buy_computer = ?
Z = (age = middle-age, Student = no, Credit_rating = fair), buy_computer = ?
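To make the traversal concrete, here is a minimal Python sketch of these three classifications. The nested-tuple tree below is a hypothetical reconstruction of the buy_computer figure (the slide shows only the picture), and the helper name classify is illustrative:

```python
# Hypothetical reconstruction of the buy_computer tree from the figure:
# an internal node is (attribute, {value: subtree}); a leaf is a class label.
tree = ("age", {
    "young":      ("Student", {"yes": "yes", "no": "no"}),
    "middle-age": "yes",
    "senior":     ("Credit_rating", {"excellent": "no", "fair": "yes"}),
})

def classify(node, instance):
    """Walk from the root to a leaf, following the branch that matches
    the instance's value for each tested attribute."""
    while isinstance(node, tuple):            # internal node
        attribute, branches = node
        node = branches[instance[attribute]]
    return node                               # leaf: the predicted class label

X = {"age": "senior",     "Student": "no",  "Credit_rating": "excellent"}
Y = {"age": "young",      "Student": "yes", "Credit_rating": "fair"}
Z = {"age": "middle-age", "Student": "no",  "Credit_rating": "fair"}

for name, inst in [("X", X), ("Y", Y), ("Z", Z)]:
    print(name, "-> buy_computer =", classify(tree, inst))
```

With this reconstruction, X comes out as no, while Y and Z both come out as yes.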

Benefits
It does not require any domain knowledge.
It is easy to understand.
The learning and classification steps of a decision tree are simple and fast.


Decision Tree Representation

Training examples for the target concept PlayTennis:

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No


Decision Tree Representation

In general, decision trees represent a disjunction (∨) of conjunctions (∧) of constraints on the attribute values of instances.
Each path from the tree root to a leaf corresponds to a conjunction of attribute tests, and the tree itself to a disjunction of these conjunctions.


Decision Tree Representation

Outlook = Sunny ∧ Humidity = High



Decision Tree Representation

Outlook = Sunny ∧ Humidity = Normal



Decision Tree Representation

Outlook = Overcast

Decision Tree Representation

Outlook = Rain ∧ Wind = Strong



Decision Tree Representation

Outlook = Rain ∧ Wind = Weak



Decision Tree Representation

(Outlook = Sunny ∧ Humidity = High) ∨ (Outlook = Sunny ∧ Humidity = Normal)
∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Strong) ∨ (Outlook = Rain ∧ Wind = Weak)
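Reading off only the paths that end in a Yes leaf gives the concept PlayTennis = Yes itself. Below is a minimal Python sketch of that reading (the Yes paths of the tree above are Sunny ∧ Normal, Overcast, and Rain ∧ Weak; the function name is illustrative):

```python
def play_tennis(outlook, humidity, wind):
    """PlayTennis = Yes as a disjunction of the conjunctions
    along the tree's Yes paths."""
    return ((outlook == "Sunny" and humidity == "Normal")
            or outlook == "Overcast"
            or (outlook == "Rain" and wind == "Weak"))

print(play_tennis("Sunny", "High", "Weak"))     # False: Sunny ∧ High is a No path
print(play_tennis("Overcast", "High", "Weak"))  # True
```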

Where to apply Decision Tree?

Instances are represented by attribute-value pairs.
Instances are described by a fixed set of attributes (e.g., Temperature) and their values (e.g., Hot).
The easiest situation for decision tree learning is when each attribute takes on a small number of disjoint possible values (e.g., Hot, Mild, Cold).
However, extensions to the basic algorithm allow handling real-valued attributes as well (e.g., representing Temperature numerically).

(Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong)


Where to apply Decision Tree

The target function has discrete output values.
The decision tree assigns a Boolean classification to each example:
PlayTennis = Yes or PlayTennis = No.


Where to apply Decision Tree

Disjunctive descriptions may be required.
Decision trees naturally represent disjunctive expressions:

(Outlook = Sunny ∧ Humidity = High) ∨ (Outlook = Sunny ∧ Humidity = Normal)
∨ (Outlook = Overcast) ∨ (Outlook = Rain ∧ Wind = Strong) ∨ (Outlook = Rain ∧ Wind = Weak)

Where to apply Decision Tree

The training data may contain errors.
Decision tree learning methods are robust to errors, both errors in the classifications of the training examples and errors in the attribute values that describe these examples.



Where to apply Decision Tree

The training data may contain missing attribute values.
Decision tree methods can be used even when some training examples have unknown values, e.g., if the Humidity of the day is known for only some of the training examples.


ID3: How to construct a Decision Tree

ID3 was developed by J. R. Quinlan (the canonical reference is his 1986 paper "Induction of Decision Trees").
ID3 is a top-down, greedy algorithm.
ID3 begins with the question "which attribute should be tested at the root of the tree?"
To decide this, each instance attribute is evaluated using a statistical test to determine how well it alone classifies the training examples.
The best attribute is then selected and used as the test at the root node of the tree.


ID3: How to construct a Decision Tree

A descendant of the root node is then created for each possible value of this attribute, and the training examples are sorted to the appropriate descendant node.
The entire process is then repeated using the training examples associated with each descendant node to select the best attribute to test at that point in the tree.


ID3: How to construct a Decision Tree

A ← the best attribute for the next node.
Assign A as the decision attribute for the node.
For each value of A, create a new descendant.
Sort the training examples to the leaf nodes according to the attribute value of the branch.
If all training examples are perfectly classified, stop. Otherwise, iterate over the new leaf nodes.


ID3: How to construct a Decision Tree

Input
Attr: set of non-target attributes
Q: target attribute
S: training set
Output
Returns a decision tree


ID3: How to construct a Decision Tree

ID3(Attr, Q, S) {
  If S is empty, return a single node with value Failure.
  If S consists of examples all of the same class, return a single leaf node with that value.
  If Attr is empty, return a single node with the most frequent value of Q in S.
  A ← ChooseBestAttribute(S, Attr)
  Tree ← a new decision tree rooted at A
  For each value vj of A do
    Sj ← subset of S with A = vj
    Subtj ← ID3(Attr − {A}, Q, Sj)
    Add a branch to Tree with label vj and subtree Subtj
  Return Tree
}
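The pseudocode translates fairly directly into Python. Below is a minimal sketch, assuming the training set is a list of dicts shaped like the PlayTennis table and that ChooseBestAttribute picks the attribute with the highest information gain (defined later in the lecture); the function names are illustrative:

```python
import math
from collections import Counter

def entropy(examples, target):
    """Entropy of the target-attribute distribution in `examples`."""
    counts = Counter(ex[target] for ex in examples)
    total = len(examples)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def information_gain(examples, attr, target):
    """Expected reduction in entropy from partitioning `examples` on `attr`."""
    total = len(examples)
    remainder = 0.0
    for v in {ex[attr] for ex in examples}:
        subset = [ex for ex in examples if ex[attr] == v]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(examples, target) - remainder

def id3(attrs, target, examples):
    """Return a decision tree: a class label (leaf), or an
    (attribute, {value: subtree}) pair for an internal node."""
    if not examples:
        return "Failure"
    labels = [ex[target] for ex in examples]
    if len(set(labels)) == 1:          # all examples in the same class
        return labels[0]
    if not attrs:                      # no attributes left: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(examples, a, target))
    return (best, {v: id3(attrs - {best}, target,
                          [ex for ex in examples if ex[best] == v])
                   for v in {ex[best] for ex in examples}})
```

Called as id3({"Outlook", "Temperature", "Humidity", "Wind"}, "PlayTennis", examples) on the 14-example table above, this selects Outlook at the root and reproduces the tree built on the following slides.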


ID3: How to construct a Decision Tree

Training examples for the target concept PlayTennis:

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No


ID3: How to construct a Decision Tree

A new instance to classify:

Day  Outlook  Temperature  Humidity  Wind  PlayTennis
D15  Rain     Hot          High      Weak  ???



ID3: How to construct a Decision Tree

Outlook is chosen at the root (9 Yes/5 No); the training examples are sorted to its branches: Sunny (2 Yes/3 No), Overcast (4 Yes/0 No), Rain (3 Yes/2 No).

Outlook = Overcast:
Day  Temperature  Humidity  Wind    PlayTennis
D3   Hot          High      Weak    Yes
D7   Cool         Normal    Strong  Yes
D12  Mild         High      Strong  Yes
D13  Hot          Normal    Weak    Yes

Outlook = Sunny:
Day  Temperature  Humidity  Wind    PlayTennis
D1   Hot          High      Weak    No
D2   Hot          High      Strong  No
D8   Mild         High      Weak    No
D9   Cool         Normal    Weak    Yes
D11  Mild         Normal    Strong  Yes

Outlook = Rain:
Day  Temperature  Humidity  Wind    PlayTennis
D4   Mild         High      Weak    Yes
D5   Cool         Normal    Weak    Yes
D6   Cool         Normal    Strong  No
D10  Mild         Normal    Weak    Yes
D14  Mild         High      Strong  No


ID3: How to construct a Decision Tree

Outlook (9 Yes/5 No): the Overcast branch (4 Yes/0 No) becomes a Yes leaf; the Sunny (2 Yes/3 No) and Rain (3 Yes/2 No) branches remain to be expanded.

Outlook = Sunny:
Day  Temperature  Humidity  Wind    PlayTennis
D1   Hot          High      Weak    No
D2   Hot          High      Strong  No
D8   Mild         High      Weak    No
D9   Cool         Normal    Weak    Yes
D11  Mild         Normal    Strong  Yes

Outlook = Rain:
Day  Temperature  Humidity  Wind    PlayTennis
D4   Mild         High      Weak    Yes
D5   Cool         Normal    Weak    Yes
D6   Cool         Normal    Strong  No
D10  Mild         Normal    Weak    Yes
D14  Mild         High      Strong  No


ID3: How to construct a Decision Tree

Under Outlook = Sunny (2 Yes/3 No), Humidity is tested next: Humidity = High (0 Yes/3 No), Humidity = Normal (2 Yes/0 No). The Rain branch (3 Yes/2 No) is still unexpanded.

Outlook = Sunny, Humidity = High:
Day  Temperature  Wind    PlayTennis
D1   Hot          Weak    No
D2   Hot          Strong  No
D8   Mild         Weak    No

Outlook = Sunny, Humidity = Normal:
Day  Temperature  Wind    PlayTennis
D9   Cool         Weak    Yes
D11  Mild         Strong  Yes

Outlook = Rain:
Day  Temperature  Humidity  Wind    PlayTennis
D4   Mild         High      Weak    Yes
D5   Cool         Normal    Weak    Yes
D6   Cool         Normal    Strong  No
D10  Mild         Normal    Weak    Yes
D14  Mild         High      Strong  No

ID3: How to construct a Decision Tree

The Sunny subtree is complete: Humidity = High → No (0 Yes/3 No); Humidity = Normal → Yes (2 Yes/0 No). Overcast → Yes (4 Yes/0 No). The Rain branch (3 Yes/2 No) remains.

Outlook = Rain:
Day  Temperature  Humidity  Wind    PlayTennis
D4   Mild         High      Weak    Yes
D5   Cool         Normal    Weak    Yes
D6   Cool         Normal    Strong  No
D10  Mild         Normal    Weak    Yes
D14  Mild         High      Strong  No


ID3: How to construct a Decision Tree

Under Outlook = Rain (3 Yes/2 No), Wind is tested next: Wind = Weak (3 Yes/0 No), Wind = Strong (0 Yes/2 No).

Outlook = Rain, Wind = Weak:
Day  Temperature  Humidity  PlayTennis
D4   Mild         High      Yes
D5   Cool         Normal    Yes
D10  Mild         Normal    Yes

Outlook = Rain, Wind = Strong:
Day  Temperature  Humidity  PlayTennis
D6   Cool         Normal    No
D14  Mild         High      No


ID3: How to construct a Decision Tree

The finished tree:

Outlook (9 Yes/5 No)
  Sunny (2 Yes/3 No): Humidity
    High: No (0 Yes/3 No)
    Normal: Yes (2 Yes/0 No)
  Overcast: Yes (4 Yes/0 No)
  Rain (3 Yes/2 No): Wind
    Weak: Yes (3 Yes/0 No)
    Strong: No (0 Yes/2 No)


ID3: How to construct a Decision Tree

Using the finished tree above, classify the new instance:

Day  Outlook  Temperature  Humidity  Wind  PlayTennis
D15  Rain     Hot          High      Weak  ???


ID3: How to construct a Decision Tree

D15 has Outlook = Rain and Wind = Weak, so the tree predicts PlayTennis = Yes:

Day  Outlook  Temperature  Humidity  Wind  PlayTennis
D15  Rain     Hot          High      Weak  Yes
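The finished tree can also be queried programmatically. A minimal sketch, using the same nested-tuple representation as the earlier buy_computer example (names are illustrative):

```python
# The learned PlayTennis tree as nested (attribute, {value: subtree}) pairs.
play_tennis_tree = ("Outlook", {
    "Sunny":    ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain":     ("Wind", {"Weak": "Yes", "Strong": "No"}),
})

def classify(node, instance):
    """Follow the branch matching the instance's value at each internal node."""
    while isinstance(node, tuple):
        attribute, branches = node
        node = branches[instance[attribute]]
    return node

D15 = {"Outlook": "Rain", "Temperature": "Hot", "Humidity": "High", "Wind": "Weak"}
print(classify(play_tennis_tree, D15))  # Yes
```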

Decision Tree Learning

Which attribute?
You can choose:
the attribute completely at random,
the attribute with the smallest number of possible values,
the attribute with the largest number of possible values, or
the attribute with the largest expected information gain.
But how effective are these choices in decision tree learning?


Decision Tree Learning

Which attribute is best?
Information gain is a statistical property that measures how well a given attribute separates the training examples according to their target classification.
It is the expected reduction in entropy caused by partitioning the examples according to this attribute.
Choose the simplest hypothesis over a more complex hypothesis if they have the same performance over the training examples (Occam's razor).


Decision Tree Learning

Entropy(S)
The expected (average) number of bits required to encode the class (yes/no) of a randomly drawn member of S is Σ -p log2(p).
The optimal code length for a message having probability p is -log2(p).
Given a collection S, containing positive and negative examples of some target concept, the entropy of S relative to this boolean classification is

Entropy(S) = -p+ log2(p+) - p- log2(p-)

where p+ is the proportion of positive examples in S and p- is the proportion of negative examples.


Decision Tree Learning

Entropy
Suppose S is a collection of 14 examples of some boolean concept, including 9 positive and 5 negative examples ([9+, 5-]). Then the entropy of S is

Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940
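As a quick check, the same number falls out of a couple of lines of Python (a minimal sketch; the helper name is illustrative):

```python
import math

def entropy(p_pos, p_neg):
    """Entropy of a boolean classification with the given class proportions."""
    return -sum(p * math.log2(p) for p in (p_pos, p_neg) if p > 0)

print(round(entropy(9/14, 5/14), 3))  # 0.94
```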


Decision Tree Learning

Entropy
The entropy is 0 if all members of S belong to the same class, and it is 1 when S contains equal numbers of positive and negative examples.


Decision Tree Learning

Information gain
The information gain, Gain(S, A), of an attribute A relative to a collection of training examples S, is defined as

Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|Sv|/|S|) Entropy(Sv)

Values(A): the set of all possible values for attribute A.
Sv: the subset of S for which attribute A has value v (i.e., Sv = {s ∈ S | A(s) = v}).


Decision Tree Learning

Information gain
The first term is the entropy of the original collection S.
The second term is the expected value of the entropy after S is partitioned using attribute A.
The expected entropy described by this second term is simply the sum of the entropies of each subset Sv, weighted by the fraction |Sv|/|S| of examples that belong to Sv.
Gain(S, A) is the expected reduction in entropy caused by knowing the value of attribute A, i.e., Gain(S, A) is the information provided about the target function value, given the value of some other attribute A.

Decision Tree Learning

Training examples for the target concept PlayTennis:

Day  Outlook   Temperature  Humidity  Wind    PlayTennis
D1   Sunny     Hot          High      Weak    No
D2   Sunny     Hot          High      Strong  No
D3   Overcast  Hot          High      Weak    Yes
D4   Rain      Mild         High      Weak    Yes
D5   Rain      Cool         Normal    Weak    Yes
D6   Rain      Cool         Normal    Strong  No
D7   Overcast  Cool         Normal    Strong  Yes
D8   Sunny     Mild         High      Weak    No
D9   Sunny     Cool         Normal    Weak    Yes
D10  Rain      Mild         Normal    Weak    Yes
D11  Sunny     Mild         Normal    Strong  Yes
D12  Overcast  Mild         High      Strong  Yes
D13  Overcast  Hot          Normal    Weak    Yes
D14  Rain      Mild         High      Strong  No

Wind = Weak: [6 Yes, 2 No]
Wind = Strong: [3 Yes, 3 No]


Decision Tree Learning

Information gain
Wind = Weak: [6 Yes, 2 No]
Wind = Strong: [3 Yes, 3 No]

Gain(S, Wind) = Entropy(S) - (8/14) Entropy(Sweak) - (6/14) Entropy(Sstrong)


Decision Tree Learning

Information gain
Wind = Weak: [6 Yes, 2 No]
Wind = Strong: [3 Yes, 3 No]

Entropy(Sweak) = -(6/8) log2(6/8) - (2/8) log2(2/8) = 0.811



Decision Tree Learning

Information gain
Wind = Weak: [6 Yes, 2 No]
Wind = Strong: [3 Yes, 3 No]

Entropy(Sstrong) = -(3/6) log2(3/6) - (3/6) log2(3/6) = 1.00



Decision Tree Learning

Humidity = High: [3 Yes, 4 No]
Humidity = Normal: [6 Yes, 1 No]

Entropy(SHigh) = -(3/7) log2(3/7) - (4/7) log2(4/7) = 0.985



Decision Tree Learning

Humidity = High: [3 Yes, 4 No]
Humidity = Normal: [6 Yes, 1 No]

Entropy(SNormal) = -(6/7) log2(6/7) - (1/7) log2(1/7) = 0.592



Decision Tree Learning

Gain(S, Humidity) = 0.940 - (7/14)(0.985) - (7/14)(0.592) = 0.151
Gain(S, Wind) = 0.940 - (8/14)(0.811) - (6/14)(1.00) = 0.048
The information gained by partitioning on Humidity is 0.151, compared to a gain of only 0.048 for the attribute Wind.
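These two gains can be checked directly from the class counts (a minimal sketch; computing with unrounded entropies gives 0.152, while the slide's 0.151 comes from the rounded intermediate values):

```python
import math

def entropy(pos, neg):
    """Entropy of a sample with `pos` positive and `neg` negative examples."""
    total = pos + neg
    return -sum(c / total * math.log2(c / total) for c in (pos, neg) if c)

S = entropy(9, 5)                                                  # 0.940

gain_wind = S - 8/14 * entropy(6, 2) - 6/14 * entropy(3, 3)        # Weak/Strong
gain_humidity = S - 7/14 * entropy(3, 4) - 7/14 * entropy(6, 1)    # High/Normal

print(round(gain_wind, 3), round(gain_humidity, 3))                # 0.048 0.152
```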

Algorithm ID3: Example

Which attribute should be tested first in the tree? Outlook! Why?
The Outlook attribute has the largest information gain and so provides the best prediction of the target attribute.


Algorithm ID3: Example

Every example for which Outlook = Overcast is also a positive example of PlayTennis.
Therefore, this node of the tree becomes a leaf node with the classification PlayTennis = Yes.


Algorithm ID3: Example

The descendants corresponding to Outlook = Sunny and Outlook = Rain still have nonzero entropy, and the decision tree will be further elaborated below these nodes.


Algorithm ID3: Example

Outlook = Sunny:

Day  Temperature  Humidity  Wind    PlayTennis
D1   Hot          High      Weak    No
D2   Hot          High      Strong  No
D8   Mild         High      Weak    No
D9   Cool         Normal    Weak    Yes
D11  Mild         Normal    Strong  Yes


Algorithm ID3: Example

The final decision tree:

Outlook (9 Yes/5 No)
  Sunny (2 Yes/3 No): Humidity
    High: No (0 Yes/3 No)
    Normal: Yes (2 Yes/0 No)
  Overcast: Yes (4 Yes/0 No)
  Rain (3 Yes/2 No): Wind
    Weak: Yes (3 Yes/0 No)
    Strong: No (0 Yes/2 No)


ID3: Decision Tree
ID3 performs no backtracking in its search (it is a greedy algorithm).
Once an attribute has been chosen as the node for a particular level of the tree, ID3 does not reconsider this choice.


ID3: Decision Tree
As ID3 searches through the space of decision trees, it maintains only a single current hypothesis.
By learning only a single hypothesis, ID3 loses the benefits associated with explicitly representing all consistent hypotheses.
For instance, it cannot determine how many alternative decision trees are consistent with the data, or select the best hypothesis among these.



Thank You
