Introduction to Data Mining, 2nd Edition (Eesha Tur Razia Babar, 2/1/2021)
● Task:
– Learn a model that maps each attribute set x into one of the predefined class labels y (a minimal sketch follows the list below)
● Base Classifiers
– Decision Tree based Methods
– Rule-based Methods
– Nearest-neighbor
– Naïve Bayes and Bayesian Belief Networks
– Support Vector Machines
– Neural Networks, Deep Neural Nets
● Ensemble Classifiers
– Boosting, Bagging, Random Forests
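As a concrete illustration of the task, here is a minimal sketch using scikit-learn's DecisionTreeClassifier; the toy records and their numeric encoding are invented, loosely echoing the loan-default example used throughout these slides.

# Minimal sketch of the classification task: learn a model mapping an
# attribute set x to a predefined class label y. Toy data invented here,
# loosely echoing the slides' loan-default example.
from sklearn.tree import DecisionTreeClassifier

# x = (home_owner, married, annual_income_in_thousands)
X = [[1, 0, 125], [0, 1, 100], [0, 0, 70], [1, 1, 120], [0, 0, 95],
     [0, 1, 60], [1, 0, 220], [0, 0, 85], [0, 1, 75], [0, 0, 90]]
y = ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"]

clf = DecisionTreeClassifier().fit(X, y)   # learn the mapping x -> y
print(clf.predict([[0, 1, 80]]))           # classify an unseen record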
Apply Model to Test Data
Start from the root of the tree and, at each internal node, follow the branch matching the test record's attribute value (the original slides repeat the tree once per step of the traversal):

Home Owner?
  Yes → Defaulted = No
  No  → MarSt?
          Married → Defaulted = No
          Single, Divorced → Income?
                               < 80K → Defaulted = No
                               ≥ 80K → Defaulted = Yes

For the test record with Home Owner = No and MarSt = Married, the traversal ends at the Married leaf: assign Defaulted to "No". (This tree is transcribed as code below.)
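The walked-through tree can be written directly as nested conditionals; this is just a transcription of the figure above, with hypothetical attribute names as dictionary keys.

# Direct transcription of the decision tree above as nested conditionals.
def classify(record):
    if record["HomeOwner"] == "Yes":
        return "No"                       # Home Owner = Yes leaf
    if record["MarSt"] == "Married":
        return "No"                       # Married leaf
    # Single or Divorced: fall through to the Income test (in thousands)
    return "No" if record["Income"] < 80 else "Yes"

# The walkthrough's test record ends at the Married leaf:
print(classify({"HomeOwner": "No", "MarSt": "Married", "Income": 80}))  # "No"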
[Training-data table: two categorical attributes, one continuous attribute, and the class label.]

More than one tree can fit the same data. An alternative decision tree with MarSt at the root:

MarSt?
  Married → Defaulted = No
  Single, Divorced → Home Owner?
                       Yes → Defaulted = No
                       No  → Income?
                               < 80K → Defaulted = No
                               ≥ 80K → Defaulted = Yes
● General Procedure (Hunt's Algorithm):
– Let Dt be the set of training records that reach node t.
– If Dt contains records that all belong to the same class yt, then t is a leaf node labeled as yt.
– If Dt contains records that belong to more than one class, use an attribute test to split the data into smaller subsets. Recursively apply the procedure to each subset (sketched below).
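A bare-bones sketch of this procedure under simplifying assumptions: attributes are categorical, and the attribute to test is chosen naively, where a real implementation would pick the test with the best impurity gain.

from collections import Counter

def hunt(records, labels, attributes):
    # Case 1: every record in Dt has the same class -> leaf labeled yt.
    if len(set(labels)) == 1:
        return labels[0]
    # Out of attributes: label the leaf with the majority class.
    if not attributes:
        return Counter(labels).most_common(1)[0][0]
    # Case 2: split Dt on an attribute test and recurse on each subset.
    attr = attributes[0]   # naive choice; real code picks the best test
    node = {"test": attr, "children": {}}
    for value in set(r[attr] for r in records):
        pairs = [(r, c) for r, c in zip(records, labels) if r[attr] == value]
        node["children"][value] = hunt([r for r, _ in pairs],
                                       [c for _, c in pairs],
                                       attributes[1:])
    return node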
[Figure, repeated over four slides: Hunt's algorithm growing the tree on the loan-default data. Class counts shown as (No, Yes): root (7,3); splitting on Home Owner gives Yes → (3,0) leaf, No → (4,3); splitting the (4,3) node on MarSt gives Married → (3,0) leaf, Single/Divorced → (1,3); splitting the (1,3) node on Income gives < 80K → (1,0), ≥ 80K → (0,3).]
Splitting based on nominal attributes:
● Multi-way split:
– Use as many partitions as there are distinct values.
● Binary split:
– Divides values into two subsets.

Splitting based on ordinal attributes:
● Multi-way split:
– Use as many partitions as there are distinct values.
● Binary split:
– Divides values into two subsets, but must preserve the order property among attribute values; e.g., for sizes Small < Medium < Large, grouping {Small, Large} vs. {Medium} violates the order property (a check is sketched below).
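A small sketch of the order-property check: a binary split of an ordinal attribute is valid only if both subsets are contiguous runs of the ordered values. The size scale is an assumed example.

# Check that a binary split of an ordinal attribute preserves the order
# property: each side must be a contiguous run of the ordered values.
def is_contiguous(subset, order):
    idx = sorted(order.index(v) for v in subset)
    return idx == list(range(idx[0], idx[-1] + 1))

def preserves_order(left, right, order):
    return is_contiguous(left, order) and is_contiguous(right, order)

sizes = ["Small", "Medium", "Large", "Extra Large"]
print(preserves_order({"Small", "Medium"}, {"Large", "Extra Large"}, sizes))  # True
print(preserves_order({"Small", "Large"}, {"Medium", "Extra Large"}, sizes))  # False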
● Greedy approach:
– Nodes with purer class distribution are preferred.
● Measures of node impurity, for a node with class proportions pi (implementations sketched below):
– Gini Index: 1 − Σ pi²
– Entropy: −Σ pi log2 pi
– Misclassification error: 1 − max pi
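A sketch of the three measures as functions of a node's class counts; the formulas are the standard definitions listed above.

from math import log2

def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def entropy(counts):
    n = sum(counts)
    return -sum((c / n) * log2(c / n) for c in counts if c > 0)

def classification_error(counts):
    return 1 - max(counts) / sum(counts)

print(gini([5, 5]), gini([10, 0]))   # 0.5 (max impurity) vs 0.0 (pure node)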
● Finding the best split:
– Compute the impurity measure P of the parent node before splitting.
– Compute the weighted impurity measure M of the child nodes after splitting.
– Choose the attribute test with the highest gain: Gain = P – M.
– Example: for candidate tests A? (children N1, N2; weighted impurity M1) and B? (children N3, N4; weighted impurity M2), compare Gain = P – M1 vs. P – M2.
Compute impurity measures for the split on B?. The parent node has class counts (7,5), so its Gini index is 1 – (7/12)² – (5/12)² = 0.486.
Gini(N1) (Yes branch, counts (5,1)) = 1 – (5/6)² – (1/6)² = 0.278
Gini(N2) (No branch, counts (2,4)) = 1 – (2/6)² – (4/6)² = 0.444
Weighted Gini of N1, N2 = 6/12 × 0.278 + 6/12 × 0.444 = 0.361
Gain = 0.486 – 0.361 = 0.125
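A quick check of these numbers (a sketch; gini() is restated so the snippet runs on its own):

def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

parent = [7, 5]              # class counts before the split
n1, n2 = [5, 1], [2, 4]      # class counts in the Yes / No children of B?

weighted = 6/12 * gini(n1) + 6/12 * gini(n2)
print(round(gini(n1), 3), round(gini(n2), 3))   # 0.278 0.444
print(round(weighted, 3))                        # 0.361
print(round(gini(parent) - weighted, 3))         # gain: 0.125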
Compute impurity measures for the split on A?. The parent node has class counts (7,3): Gini = 0.42, misclassification error = 3/10 = 0.3.
Gini(N1) (Yes branch, counts (3,0)) = 1 – (3/3)² – (0/3)² = 0
Gini(N2) (No branch, counts (4,3)) = 1 – (4/7)² – (3/7)² = 0.489
Gini(children) = 3/10 × 0 + 7/10 × 0.489 = 0.342
Gini improves, but the misclassification error remains the same: weighted error of the children = 3/10 × 0 + 7/10 × (3/7) = 0.3, equal to the parent's error!
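The same comparison in code, using the class counts from this example; gini() and error() are the standard definitions restated for self-containment.

def gini(counts):
    n = sum(counts)
    return 1 - sum((c / n) ** 2 for c in counts)

def error(counts):
    return 1 - max(counts) / sum(counts)

parent, n1, n2 = [7, 3], [3, 0], [4, 3]
w1, w2 = 3/10, 7/10                      # child weights

# Gini drops: 0.42 -> ~0.343 (0.342 on the slide after rounding)
print(round(gini(parent), 3), round(w1*gini(n1) + w2*gini(n2), 3))
# Misclassification error is unchanged: 0.3 -> 0.3
print(round(error(parent), 3), round(w1*error(n1) + w2*error(n2), 3))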