Chapter#03 Supervised Learning and Its Algorithms - III
COURSE INSTRUCTORS:
DR. MUHAMMAD NASEEM
ENGR. FARHEEN QAZI
MS. RUQQIYA AZIZ
At each node, we need to find the attribute that best divides the data
into Yes and No.
Take all unused attributes and calculate their entropies.
Choose the attribute for which the resulting entropy is minimum, or equivalently, for which the information gain is maximum.
The attribute with the highest information gain is selected at each node.
ENTROPY & INFORMATION GAIN
It is observed that splitting on any attribute yields training subsets whose weighted average entropy is less than or equal to the entropy of the previous training set.
The ID3 algorithm defines a measure of a split, called Information Gain, to determine the goodness of the split.
The attribute with the largest information gain is chosen as the splitting attribute, and the training set is partitioned into smaller subsets, one for each distinct value of that attribute.
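These two measures are straightforward to compute. Below is a minimal Python sketch (the function names entropy and information_gain are ours, not part of ID3's original pseudocode); it works on class counts rather than raw tuples:

```python
from math import log2

def entropy(counts):
    """Entropy of a class distribution, given a list of class counts.
    Empty classes contribute 0 (the 0 * log2(0) = 0 convention)."""
    total = sum(counts)
    return -sum((c / total) * log2(c / total) for c in counts if c)

def information_gain(parent_counts, child_counts_list):
    """Entropy before the split minus the weighted average entropy
    of the resulting subsets; this is non-negative for any split."""
    total = sum(parent_counts)
    weighted = sum(sum(child) / total * entropy(child)
                   for child in child_counts_list)
    return entropy(parent_counts) - weighted
```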
Example-1
Question#01: Using the Decision Tree algorithm and the given table, classify the given tuple (T); also draw the tree, displaying the parent node and child nodes.
Step-1: Determine the entropy of the class attribute (Profit), where P = 5 (Up) and N = 5 (Down).

Entropy(Class) = -(P/(P+N)) log2(P/(P+N)) - (N/(P+N)) log2(N/(P+N))  --- (1)

Entropy(Class) = -(5/10) log2(5/10) - (5/10) log2(5/10)

Entropy(Class) = 1
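As a quick check of equation (1), a few lines of Python reproduce this value (assuming Up is the positive class):

```python
from math import log2

P, N = 5, 5  # 5 Up and 5 Down tuples in the training set
entropy_class = -(P / (P + N)) * log2(P / (P + N)) \
                - (N / (P + N)) * log2(N / (P + N))
print(entropy_class)  # 1.0
```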
CONTD…..
Step-2: Determine the entropy I(P_i, N_i) for each value of the feature attributes Age, Competition, and Type.

I(P_i, N_i) = -(P_i/(P_i+N_i)) log2(P_i/(P_i+N_i)) - (N_i/(P_i+N_i)) log2(N_i/(P_i+N_i))  --- (2)
Table-1: (Age)

Value   P_i   N_i   I(P_i, N_i)
Old     0     3     0
Mid     2     2     1
New     3     0     0

Old:  I(0, 3) = -(0/3) log2(0/3) - (3/3) log2(3/3) = 0
Mid:  I(2, 2) = -(2/4) log2(2/4) - (2/4) log2(2/4) = 1
New:  I(3, 0) = -(3/3) log2(3/3) - (0/3) log2(0/3) = 0

(Here 0 · log2(0) is taken as 0.)
CONTD…..
Entropy(Age) = Σ_i [ (P_i + N_i)/(P + N) ] · I(P_i, N_i)  --- (3)

From Table-1,

Entropy(Age) = (3/10)(0) + (4/10)(1) + (3/10)(0) = 0.4

Gain(Age) = Entropy(Class) - Entropy(Age) = 1 - 0.4 = 0.6
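The same arithmetic in Python, using the (P_i, N_i) counts from Table-1 (the helper I is our name for the per-value entropy of equation (2)):

```python
from math import log2

def I(p, n):
    """Per-value entropy, equation (2); 0 * log2(0) taken as 0."""
    total = p + n
    return -sum((k / total) * log2(k / total) for k in (p, n) if k)

age_counts = [(0, 3), (2, 2), (3, 0)]  # Old, Mid, New from Table-1
P, N = 5, 5
entropy_age = sum((p + n) / (P + N) * I(p, n) for p, n in age_counts)
print(entropy_age)       # 0.4
print(1 - entropy_age)   # Gain(Age) = 0.6
```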
CONTD….
Table-2: (Competition)

Value   P_i   N_i   I(P_i, N_i)
Y       1     3     0.81127
N       4     2     0.918295

Y:  I(1, 3) = -(1/4) log2(1/4) - (3/4) log2(3/4) = 0.81127
N:  I(4, 2) = -(4/6) log2(4/6) - (2/6) log2(2/6) = 0.918295
CONTD….
From Table-2,

Entropy(Competition) = (4/10)(0.81127) + (6/10)(0.918295) = 0.875485
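These decimals are easy to verify numerically; a short sketch under the same 0 * log2(0) = 0 convention:

```python
from math import log2

def I(p, n):
    total = p + n
    return -sum((k / total) * log2(k / total) for k in (p, n) if k)

print(I(1, 3))  # 0.8112781244591328  (Competition = Y)
print(I(4, 2))  # 0.9182958340544896  (Competition = N)
print((4 / 10) * I(1, 3) + (6 / 10) * I(4, 2))  # Entropy(Competition) ~ 0.8755
```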
Table-3: (Type)

Value      P_i   N_i   I(P_i, N_i)
Software   3     3     1
Hardware   2     2     1

Software:  I(3, 3) = -(3/6) log2(3/6) - (3/6) log2(3/6) = 1
Hardware:  I(2, 2) = -(2/4) log2(2/4) - (2/4) log2(2/4) = 1
CONTD….
From Table-3,

Entropy(Type) = (6/10)(1) + (4/10)(1) = 1
Now determine the gain of each attribute:

Gain(Age) = 1 - 0.4 = 0.6
Gain(Competition) = 1 - 0.875485 = 0.124515
Gain(Type) = 1 - 1 = 0

The information gain of Age is greater than that of Competition and Type, so Age becomes the root (parent) node.
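This root selection is a single argmax over the gains, computed here from the (P_i, N_i) counts in Tables 1-3 (the dictionary layout is ours):

```python
from math import log2

def I(p, n):
    total = p + n
    return -sum((k / total) * log2(k / total) for k in (p, n) if k)

def gain(subsets, P=5, N=5):
    """Equation (1) minus equation (3) for one candidate split."""
    weighted = sum((p + n) / (P + N) * I(p, n) for p, n in subsets)
    return I(P, N) - weighted

splits = {
    "Age":         [(0, 3), (2, 2), (3, 0)],  # Table-1
    "Competition": [(1, 3), (4, 2)],          # Table-2
    "Type":        [(3, 3), (2, 2)],          # Table-3
}
gains = {attr: gain(counts) for attr, counts in splits.items()}
print(gains)                      # {'Age': 0.6, 'Competition': 0.1245..., 'Type': 0.0}
print(max(gains, key=gains.get))  # Age
```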
            Age
         /   |   \
      Old   Mid   New
       |     |     |
     Down    ?    Up
Note
In the table, every tuple with Age = Old has Profit = Down, so we place a Down leaf on the Old branch; likewise every tuple with Age = New has Profit = Up, so we place an Up leaf on the New branch.
CONTD….
Now we need to resolve the Mid branch of Age, so we build another table containing only the tuples with Age = Mid.
Table-5: (Age = Mid)

ID   AGE   COMPETITION   TYPE       PROFIT
1    Mid   Y             Software   Down
2    Mid   Y             Hardware   Down
3    Mid   N             Hardware   Up
4    Mid   N             Software   Up
Entropy(Class) = 1 (the reduced set has 2 Up and 2 Down tuples)
Table-6: (Competition, reduced)

Value   P_i   N_i   I(P_i, N_i)
Y       0     2     0
N       2     0     0
CONTD….
From Table-6,

Entropy(Competition, reduced) = (2/4)(0) + (2/4)(0) = 0

Gain(Competition, reduced) = 1 - 0 = 1
Table-7: (Type, reduced)

Value      P_i   N_i   I(P_i, N_i)
Software   1     1     1
Hardware   1     1     1
From Table-7,

Entropy(Type, reduced) = (2/4)(1) + (2/4)(1) = 1

Gain(Type, reduced) = 1 - 1 = 0

Competition has the higher gain on the reduced set, so it becomes the decision node on the Mid branch.
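The recursion step on the reduced set mirrors the root computation; a sketch using the counts from Tables 6 and 7:

```python
from math import log2

def I(p, n):
    total = p + n
    return -sum((k / total) * log2(k / total) for k in (p, n) if k)

# Reduced (Age = Mid) set: 2 Up and 2 Down tuples, so Entropy(Class) = 1
reduced_splits = {
    "Competition": [(0, 2), (2, 0)],  # Table-6: Y, N
    "Type":        [(1, 1), (1, 1)],  # Table-7: Software, Hardware
}
for attr, counts in reduced_splits.items():
    ent = sum((p + n) / 4 * I(p, n) for p, n in counts)
    print(attr, "gain =", 1 - ent)
# Competition gain = 1.0 (a perfect split); Type gain = 0.0
```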
Predict the Profit class using the decision tree:

            Age
         /   |   \
      Old   Mid   New
       |     |     |
    Down  Competition  Up
            /   \
           Y     N
           |     |
         Down   Up
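Once built, the tree can be stored as a nested dictionary and a tuple classified by walking it. A minimal sketch; since the tuple T is given separately in the question, the query tuple below is hypothetical:

```python
# Finished tree from the example above (branch labels as in Table-5)
tree = {"Age": {"Old": "Down",
                "New": "Up",
                "Mid": {"Competition": {"Y": "Down", "N": "Up"}}}}

def classify(node, tuple_):
    """Descend until a leaf label ('Up' or 'Down') is reached."""
    while isinstance(node, dict):
        attr = next(iter(node))          # attribute tested at this node
        node = node[attr][tuple_[attr]]  # follow the matching branch
    return node

# Hypothetical query tuple for illustration
print(classify(tree, {"Age": "Mid", "Competition": "N", "Type": "Software"}))  # Up
```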
EXAMPLE-II
Question#02: Using the Decision Tree algorithm and the given table, classify the given tuple (T); also draw the tree, displaying the parent node and child nodes.
ADVANTAGES OF ID3