Decision Tree

The document discusses decision trees and the ID3 algorithm for building classification models. It explains that a decision tree consists of nodes that represent attributes, branches that represent decisions, and leaves that represent outcomes. It then outlines the steps of the ID3 algorithm, which chooses attributes to split the data on by calculating the information gain from splitting on each attribute and selecting the attribute with the highest gain. This recursively builds the tree until reaching leaf nodes that classify the data.

Decision Tree (ID3 Algorithm): A Numerical Example

Task: build a decision tree that predicts whether tennis will be played on a given day.
WHAT IS A DECISION TREE?

A Decision Tree is a tree in which each node represents a feature (attribute), each link (branch) represents a decision (rule), and each leaf represents an outcome.
Algorithms

CART: uses the Gini Index
ID3: uses the Entropy function and Information Gain
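As a rough illustration of the two impurity measures named above, the sketch below computes the Gini index and the entropy for a list of class counts. The function names are mine, not from the slides.

```python
import math

def gini_index(counts):
    """Gini impurity for a list of class counts, e.g. [9, 5]."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def entropy(counts):
    """Shannon entropy (in bits) for a list of class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(gini_index([9, 5]))  # ~0.459 for a 9 positive / 5 negative split
print(entropy([9, 5]))     # ~0.940, the Entropy(S) value used later
```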
Step 1: Create a root node

How do we choose the root node?

Find the attribute that best classifies the training data and use that attribute at the root of the tree.
How do we choose the best attribute?

This is where the ID3 algorithm begins.


Calculate Entropy (the amount of uncertainty in the dataset):

Entropy(S) = -(P/(P+N)) log2(P/(P+N)) - (N/(P+N)) log2(N/(P+N))

Calculate Average Information for an attribute A (the weighted entropy of the subsets produced by splitting on A):

I(A) = sum over values v of A of ((Pv + Nv)/(P + N)) * Entropy(Sv)

Calculate Information Gain (the difference in entropy before and after splitting the dataset on attribute A):

Gain(A) = Entropy(S) - I(A)
1. Compute the entropy of the whole data set, Entropy(S).
2. For every attribute/feature:
   a. Calculate the entropy for each of its values, Entropy(A = v).
   b. Take the average information entropy for the current attribute, I(A).
   c. Calculate the gain for the current attribute, Gain(A).
3. Pick the attribute with the highest gain.
4. Repeat on each branch until the desired tree is obtained.

A minimal code sketch of these steps follows.
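The sketch below is one possible Python rendering of the four steps; it is not from the slides. The dataset format (a list of dicts with a 'Play' label column) and all function names are assumptions made for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def average_information(rows, attribute, target="Play"):
    """Weighted entropy I(attribute): split rows by the attribute's values."""
    total = len(rows)
    info = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        info += (len(subset) / total) * entropy(subset)
    return info

def information_gain(rows, attribute, target="Play"):
    """Gain(attribute) = Entropy(S) - I(attribute)."""
    return entropy([r[target] for r in rows]) - average_information(rows, attribute, target)

def id3(rows, attributes, target="Play"):
    """Recursively build the tree, always splitting on the highest-gain attribute."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:            # pure node -> leaf
        return labels[0]
    if not attributes:                   # no attributes left -> majority leaf
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    tree = {best: {}}
    for value in {r[best] for r in rows}:
        subset = [r for r in rows if r[best] == value]
        remaining = [a for a in attributes if a != best]
        tree[best][value] = id3(subset, remaining, target)
    return tree
```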


Step 1: For the whole data set, P = 9, N = 5, Total = 14.

Calculate Entropy(S):

Entropy(S) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940
For each attribute (say, Outlook):

Calculate the entropy for each of its values, i.e. 'Sunny', 'Rainy' and 'Overcast': Entropy(Outlook = value).

Calculate the average information entropy: I(Outlook) = 0.693

Calculate the gain for Outlook: Gain(Outlook) = Entropy(S) - I(Outlook) = 0.940 - 0.693 = 0.247
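To make the Outlook numbers concrete: later slides give the Sunny subset as 2 positive / 3 negative and the Rainy subset as 3 positive / 2 negative, which leaves 4 positive / 0 negative for Overcast. A quick check of I(Outlook) and Gain(Outlook) with those counts (the helper name is mine):

```python
import math

def entropy(p, n):
    """Entropy (bits) of a subset with p positive and n negative examples."""
    result = 0.0
    for c in (p, n):
        if c:
            result -= (c / (p + n)) * math.log2(c / (p + n))
    return result

counts = {"Sunny": (2, 3), "Overcast": (4, 0), "Rainy": (3, 2)}  # (P, N) per value
total = 14

i_outlook = sum(((p + n) / total) * entropy(p, n) for p, n in counts.values())
print(round(i_outlook, 3))                   # 0.694 (the slides round this to 0.693)
print(round(entropy(9, 5) - i_outlook, 3))   # 0.247 = Gain(Outlook)
```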
Similarly, for Temperature: calculate the entropy for each value ('Hot', 'Mild', 'Cool'), the average information entropy I(Temperature), and Gain(Temperature).

For Humidity: calculate the entropy for each value ('High', 'Normal'), the average information entropy I(Humidity), and Gain(Humidity).

For Windy: calculate the entropy for each value ('Strong', 'Weak'), the average information entropy I(Windy), and Gain(Windy).
Pick the attribute with the highest gain.

Root node: OUTLOOK

Repeat the same procedure for each sub-tree until the tree is complete.

Outlook = "Sunny"

Outlook = "Rainy"
P= N=
2 3
Total=
5
Entropy:
Within the Sunny subset, for Humidity: calculate the entropy for each value ('High', 'Normal').

Average information entropy: I(Humidity) = 0

Gain(Humidity) = 0.971 - 0 = 0.971
For Windy: calculate the entropy for each value ('Strong', 'Weak').

Average information entropy: I(Windy) = 0.951

Gain(Windy) = 0.971 - 0.951 = 0.020
For Temperature: calculate the entropy for each value ('Cool', 'Hot', 'Mild').

Average information entropy: I(Temperature) = 0.4

Gain(Temperature) = 0.971 - 0.4 = 0.571
Pick the attribute with the highest gain.

Next node under Sunny: HUMIDITY
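Using only the average-information values reported on these slides for the Sunny subset (whose entropy is 0.971), the gains can be checked directly:

```python
entropy_sunny = 0.971
avg_info = {"Humidity": 0.0, "Windy": 0.951, "Temperature": 0.4}  # I(A) values from the slides

gains = {a: round(entropy_sunny - i, 3) for a, i in avg_info.items()}
print(gains)                       # {'Humidity': 0.971, 'Windy': 0.02, 'Temperature': 0.571}
print(max(gains, key=gains.get))   # Humidity -> next node under Sunny
```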
Outlook = "Rainy": P = 3, N = 2, Total = 5

Entropy(Rainy) = -(3/5) log2(3/5) - (2/5) log2(2/5) = 0.971
Within the Rainy subset, for Humidity: calculate the entropy for each value ('High', 'Normal').

Average information entropy: I(Humidity) = 0.951

Gain(Humidity) = 0.971 - 0.951 = 0.020
For Windy: calculate the entropy for each value ('Strong', 'Weak').

Average information entropy: I(Windy) = 0

Gain(Windy) = 0.971 - 0 = 0.971
For Temperature: calculate the entropy for each value ('Cool', 'Hot', 'Mild').

Average information entropy: I(Temperature) = 0.951

Gain(Temperature) = 0.971 - 0.951 = 0.020
Pick the attribute with the highest gain.

Next node under Rainy: WINDY, with branches 'Weak' and 'Strong'.
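Putting the pieces together, the finished tree has Outlook at the root, Humidity under Sunny, Windy under Rainy, and a pure leaf under Overcast. Below is a small sketch of that tree as a prediction function; the Yes/No leaf labels are taken from the classic play-tennis example and are an assumption here, since the slides only report the positive/negative counts.

```python
def predict_play(outlook, humidity, windy):
    """Final ID3 tree: Outlook at the root, Humidity under Sunny, Windy under Rainy.
    Leaf labels assume the classic play-tennis outcomes (not stated on the slides)."""
    if outlook == "Overcast":
        return "Yes"                                  # Overcast subset is pure (4 positive, 0 negative)
    if outlook == "Sunny":
        return "Yes" if humidity == "Normal" else "No"
    if outlook == "Rainy":
        return "Yes" if windy == "Weak" else "No"
    raise ValueError("unknown Outlook value")

print(predict_play("Sunny", "High", "Weak"))   # No
print(predict_play("Rainy", "High", "Weak"))   # Yes
```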
Thank You
