04 Classification
Figure: the classification workflow. A model is learned from a labelled Training Set (columns Tid, Attrib1, Attrib2, Attrib3, Class) and then applied to a Test Set whose Class values are unknown (?).
▪ Top-down tree construction
▪ At start, all training examples are at the root.
▪ Partition the examples recursively, choosing one attribute at a time.
▪ Bottom-up tree pruning
▪ Remove sub-trees or branches, in a bottom-up
manner, to improve the estimated accuracy on new
cases.
▪ At each node, the available attributes are evaluated on the basis of how well they separate the classes of the training examples; a goodness function is used for this purpose.
▪ Typical goodness functions (sketched in Python after this list):
▪ information gain (ID3/C4.5)
▪ information gain ratio
▪ gini index
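
The goodness functions above can be made concrete in a few lines of Python. The sketch below is a minimal illustration, not the ID3/C4.5 reference code: it assumes each training example is a dict keyed by attribute name, and the names entropy, gini, split_by, information_gain, gain_ratio and the target key "Play?" are chosen here for illustration.

from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def gini(labels):
    # Gini index of a list of class labels.
    counts = Counter(labels)
    total = len(labels)
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

def split_by(rows, attribute):
    # Group rows (dicts) by the value of one attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row)
    return groups

def information_gain(rows, attribute, target="Play?"):
    # Entropy of the target before the split, minus the weighted
    # entropy of the partitions induced by the attribute.
    before = entropy([r[target] for r in rows])
    total = len(rows)
    after = sum(
        len(part) / total * entropy([r[target] for r in part])
        for part in split_by(rows, attribute).values()
    )
    return before - after

def gain_ratio(rows, attribute, target="Play?"):
    # Information gain normalised by the split information
    # (entropy of the attribute's own value distribution), as in C4.5.
    split_info = entropy([r[attribute] for r in rows])
    if split_info == 0:
        return 0.0
    return information_gain(rows, attribute, target) / split_info

The attribute chosen at a node is simply the one that maximises the selected goodness function over the examples reaching that node.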
Outlook   Temperature  Humidity  Windy  Play?
sunny     hot          high      false  No
sunny     hot          high      true   No
overcast  hot          high      false  Yes
rain      mild         high      false  Yes
rain      cool         normal    false  Yes
rain      cool         normal    true   No
overcast  cool         normal    true   Yes
▪ https://www.youtube.com/watch?v=_L39rN6gz7Y&t=722s
▪ Create the decision tree for the following data (a solution sketch follows the table):

Outlook   Temperature  Humidity  Windy  Play?
sunny     hot          high      false  No
sunny     hot          high      true   No
overcast  hot          high      false  Yes
rain      mild         high      false  Yes
rain      cool         normal    false  Yes
rain      cool         normal    true   No
overcast  cool         normal    true   Yes
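
A possible solution sketch for this exercise, assuming the entropy, information_gain and split_by helpers from the earlier sketch are in scope; build_tree and DATA are names chosen here for illustration. It performs the top-down construction described earlier: stop at pure nodes, otherwise split on the attribute with the highest information gain (no pruning).

from collections import Counter

DATA = [
    {"Outlook": "sunny",    "Temperature": "hot",  "Humidity": "high",   "Windy": "false", "Play?": "No"},
    {"Outlook": "sunny",    "Temperature": "hot",  "Humidity": "high",   "Windy": "true",  "Play?": "No"},
    {"Outlook": "overcast", "Temperature": "hot",  "Humidity": "high",   "Windy": "false", "Play?": "Yes"},
    {"Outlook": "rain",     "Temperature": "mild", "Humidity": "high",   "Windy": "false", "Play?": "Yes"},
    {"Outlook": "rain",     "Temperature": "cool", "Humidity": "normal", "Windy": "false", "Play?": "Yes"},
    {"Outlook": "rain",     "Temperature": "cool", "Humidity": "normal", "Windy": "true",  "Play?": "No"},
    {"Outlook": "overcast", "Temperature": "cool", "Humidity": "normal", "Windy": "true",  "Play?": "Yes"},
]

def build_tree(rows, attributes, target="Play?"):
    # Top-down construction: return a class label at pure or exhausted
    # nodes, otherwise split on the attribute with the highest gain.
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1:          # pure node -> leaf
        return labels[0]
    if not attributes:                 # no attributes left -> majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    return {
        best: {
            value: build_tree(part, remaining, target)
            for value, part in split_by(rows, best).items()
        }
    }

tree = build_tree(DATA, ["Outlook", "Temperature", "Humidity", "Windy"])
print(tree)
# Expected shape for this data: Outlook at the root, with sunny -> No,
# overcast -> Yes, and rain split further on Windy (false -> Yes, true -> No).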