
Chapter 6. Decision Tree Classification
Definition
Given a collection of records (training set):
– Each record contains a set of attributes; one of the attributes
is the class.
Find a model for the class attribute as a function of the values of the other
attributes.
Goal: previously unseen records should be assigned a class as accurately
as possible.
– A test set is used to determine the accuracy of the model.
Usually, the given data set is divided into training and test
sets, with the training set used to build the model and the test set
used to validate it.

Classification—A Two-Step Process


- Model construction: describing a set of predetermined classes
– Each tuple/sample is assumed to belong to a predefined
class, as determined by the class label attribute
– The set of tuples used for model construction is the training set
– The model is represented as classification rules, decision
trees, or mathematical formulae
- Model usage: for classifying future or unknown objects
– Estimate the accuracy of the model: the known label of each test
sample is compared with the classified result from the model; the
accuracy rate is the percentage of test set samples that are
correctly classified by the model
– If the accuracy is acceptable, use the model to classify data
tuples whose class labels are not known (see the sketch below)
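The two steps map directly onto a train/test split. A minimal sketch,
assuming scikit-learn and its bundled iris data (the slides prescribe
neither):

    # Minimal sketch of the two-step process, assuming scikit-learn and
    # its bundled iris data (the slides prescribe neither).
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.metrics import accuracy_score

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=0)

    model = DecisionTreeClassifier().fit(X_train, y_train)   # step 1: construction
    y_pred = model.predict(X_test)                           # step 2: usage
    print("accuracy rate:", accuracy_score(y_test, y_pred))  # estimate accuracy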

Process (1): Model Construction

The training data are fed to a classification algorithm, which produces
the classifier (model):

NAME  RANK            YEARS  TENURED
Mike  Assistant Prof  3      no
Mary  Assistant Prof  7      yes
Bill  Professor       2      yes
Jim   Associate Prof  7      yes
Dave  Assistant Prof  6      no
Anne  Associate Prof  3      no

Learned model: IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
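For illustration, the learned rule can be written directly as a small
Python function; classify_tenured and the record for Jeff (who appears on
the next slide) are hypothetical, and the rank string is matched against
the table's spelling ("Professor"):

    # The learned rule as a hypothetical Python function (names are
    # illustrative, not from the slides).
    def classify_tenured(rank, years):
        # IF rank = 'professor' OR years > 6 THEN tenured = 'yes'
        return "yes" if rank == "Professor" or years > 6 else "no"

    print(classify_tenured("Professor", 4))   # Jeff, next slide -> 'yes'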

Process (2): Using the Model in Prediction

The classifier is validated on the testing data, then used to predict
unseen data, e.g. (Jeff, Professor, 4) → Tenured?

Testing Data:

NAME     RANK            YEARS  TENURED
Tom      Assistant Prof  2      no
Merlisa  Associate Prof  7      no
George   Professor       5      yes
Joseph   Assistant Prof  7      yes
Illustrating Classification Task

Training Set:

Tid  Attrib1  Attrib2  Attrib3  Class
1    Yes      Large    125K     No
2    No       Medium   100K     No
3    No       Small    70K      No
4    Yes      Medium   120K     No
5    No       Large    95K      Yes
6    No       Medium   60K      No
7    Yes      Large    220K     No
8    No       Small    85K      Yes
9    No       Medium   75K      No
10   No       Small    90K      Yes

Test Set:

Tid  Attrib1  Attrib2  Attrib3  Class
11   No       Small    55K      ?
12   Yes      Medium   80K      ?
13   Yes      Large    110K     ?
14   No       Small    95K      ?
15   No       Large    67K      ?

Induction: a learning algorithm learns a model from the training set.
Deduction: the model is applied to the test set.

Supervised vs. Unsupervised Learning


- Supervised learning (classification)
– Supervision: the training data (observations, measurements,
etc.) are accompanied by labels indicating the class of the
observations
– New data are classified based on the training set
- Unsupervised learning (clustering)
– The class labels of the training data are unknown
– Given a set of measurements, observations, etc., the aim is to
establish the existence of classes or clusters in the data

Issues: Data Preparation


- Data cleaning
– Preprocess data in order to reduce noise and handle missing
values
- Relevance analysis (feature selection)
– Remove irrelevant or redundant attributes
- Data transformation
– Generalize and/or normalize data (see the sketch below)
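A minimal sketch of these preparation steps, assuming scikit-learn (the
slides name no tool); the toy matrix is illustrative:

    # Data cleaning and transformation, assuming scikit-learn.
    import numpy as np
    from sklearn.impute import SimpleImputer
    from sklearn.preprocessing import StandardScaler

    X = np.array([[1.0, 200.0],
                  [2.0, np.nan],     # a missing value to handle
                  [3.0, 180.0]])

    X_clean = SimpleImputer(strategy="mean").fit_transform(X)  # cleaning
    X_scaled = StandardScaler().fit_transform(X_clean)         # normalization
    print(X_scaled)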

Examples of Classification Task


- Predicting tumor cells as benign or malignant
- Classifying credit card transactions as legitimate or fraudulent
- Classifying secondary structures of protein as alpha-helix, beta-sheet,
or random coil
- Categorizing news stories as finance, weather, entertainment, sports,
etc.

Classification Techniques
- Decision Tree based Methods
- Rule-based Methods
- Instance-based Methods
- Neural Networks
- Naïve Bayes and Bayesian Belief Networks
- Support Vector Machines

Decision Tree
Decision Tree Representation:
- Each internal node tests an attribute
- Each branch corresponds to an attribute value
- Each leaf node assigns a classification

Top-Down Induction of Decision Trees


Main loop (see the sketch below):
1. Choose A, the "best" decision attribute for the next node
2. Assign A as the decision attribute for the node
3. For each value of A, create a new descendant of the node
4. Sort the training examples to the leaf nodes
5. If the training examples are perfectly classified, then STOP; else
iterate over the new leaf nodes
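A compact sketch of this main loop (ID3-style). Information gain as the
"best attribute" criterion and the dict-of-attributes record encoding are
assumptions; names like build_tree are illustrative:

    # ID3-style top-down induction; criterion and names are assumptions.
    from collections import Counter
    import math

    def entropy(labels):
        n = len(labels)
        return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

    def best_attribute(rows, labels, attrs):
        def gain(a):
            rem = 0.0
            for v in set(r[a] for r in rows):
                sub = [l for r, l in zip(rows, labels) if r[a] == v]
                rem += len(sub) / len(labels) * entropy(sub)
            return entropy(labels) - rem
        return max(attrs, key=gain)                   # step 1: choose A

    def build_tree(rows, labels, attrs):
        if len(set(labels)) == 1:                     # perfectly classified: STOP
            return labels[0]
        if not attrs:                                 # no attributes left: majority class
            return Counter(labels).most_common(1)[0][0]
        a = best_attribute(rows, labels, attrs)       # step 2: assign A to the node
        node = {a: {}}
        for v in set(r[a] for r in rows):             # step 3: one descendant per value
            idx = [i for i, r in enumerate(rows) if r[a] == v]
            node[a][v] = build_tree(                  # steps 4-5: sort examples, iterate
                [rows[i] for i in idx],
                [labels[i] for i in idx],
                [x for x in attrs if x != a])
        return node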

Example of a Decision Tree

Training Data (Refund and Marital Status are categorical, Taxable Income
is continuous, Cheat is the class; Refund is the first splitting attribute):

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Model: Decision Tree

Refund
├── Yes: NO
└── No: MarSt
    ├── Single, Divorced: TaxInc
    │     ├── < 80K: NO
    │     └── > 80K: YES
    └── Married: NO


Another Example of Decision Tree

The same training data, split on MarSt first:

MarSt
├── Married: NO
└── Single, Divorced: Refund
    ├── Yes: NO
    └── No: TaxInc
        ├── < 80K: NO
        └── > 80K: YES

There could be more than one tree that fits the same data!

Another Decision Tree example:

Decision Tree Induction: Training Dataset

age    income  student  credit_rating  buys_computer
<=30   high    no       fair           no
<=30   high    no       excellent      no
31…40  high    no       fair           yes
>40    medium  no       fair           yes
>40    low     yes      fair           yes
>40    low     yes      excellent      no
31…40  low     yes      excellent      yes
<=30   medium  no       fair           no
<=30   low     yes      fair           yes
>40    medium  yes      fair           yes
<=30   medium  yes      excellent      yes
31…40  medium  no       excellent      yes
31…40  high    yes      fair           yes
>40    medium  no       excellent      no

This follows an example from Quinlan's ID3 (Playing Tennis).
Output: A Decision Tree for "buys_computer"

age?
├── <=30: student?
│   ├── no: no
│   └── yes: yes
├── 31..40: yes
└── >40: credit_rating?
    ├── excellent: no
    └── fair: yes
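The tree above can be encoded as nested dicts with a tiny recursive
classifier; this encoding is illustrative, not from the slides:

    # buys_computer tree as nested dicts (illustrative encoding).
    buys_computer_tree = {
        "age": {
            "<=30":   {"student": {"no": "no", "yes": "yes"}},
            "31..40": "yes",
            ">40":    {"credit_rating": {"excellent": "no", "fair": "yes"}},
        }
    }

    def classify(tree, record):
        if not isinstance(tree, dict):                 # leaf: the class label
            return tree
        attribute, branches = next(iter(tree.items()))
        return classify(branches[record[attribute]], record)

    print(classify(buys_computer_tree,
                   {"age": "<=30", "student": "yes"}))  # -> 'yes'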

Decision Tree Classification Task

The same framework as above, instantiated with trees: a tree induction
algorithm learns a decision tree from the training set (Tids 1–10), and
the tree is then applied to the test set (Tids 11–15) in the deduction
step.


Apply Model to Test Data

Test Data:

Refund  Marital Status  Taxable Income  Cheat
No      Married         80K             ?

Start from the root of the tree and, at each internal node, follow the
branch that matches the record:
1. Refund = No, so take the "No" branch to the MarSt node.
2. Marital Status = Married, so take the "Married" branch, which reaches
the leaf NO.
Assign Cheat to "No".

Refund
├── Yes: NO
└── No: MarSt
    ├── Single, Divorced: TaxInc (< 80K: NO, > 80K: YES)
    └── Married: NO
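This traversal can be hard-coded as a sketch; classify_cheat is a
hypothetical name, and sending exactly 80K down the "> 80K" branch is an
assumption (the tree labels only < 80K and > 80K):

    # Hard-coded traversal of the Cheat tree (names are illustrative).
    def classify_cheat(refund, marital_status, taxable_income):
        if refund == "Yes":
            return "No"
        if marital_status == "Married":
            return "No"
        return "No" if taxable_income < 80_000 else "Yes"

    print(classify_cheat("No", "Married", 80_000))  # the test record -> 'No'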


Exercise:
1. Build a decision tree based on the following data:

Rule-Based Classifier


Use IF-THEN rules for classification
- Represent the knowledge in the form of IF-THEN rules
- Rule: (Condition) → y
– Condition is a conjunction of attribute tests
– y is the class label
– LHS: rule antecedent or condition
– RHS: rule consequent
- Examples of classification rules (see the sketch below):
– (Blood Type=Warm) ∧ (Lay Eggs=Yes) → Birds
– (Taxable Income < 50K) ∧ (Refund=Yes) → Evade=No
– IF age = youth AND student = yes THEN buys_computer = yes
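Such rules can be sketched as (condition, consequent) pairs; the lambda
encoding and field names are illustrative:

    # IF-THEN rules as (condition, consequent) pairs (illustrative).
    rules = [
        (lambda r: r["blood_type"] == "warm" and r["lay_eggs"] == "yes", "Birds"),
        (lambda r: r["taxable_income"] < 50_000 and r["refund"] == "yes", "Evade=No"),
    ]

    record = {"blood_type": "warm", "lay_eggs": "yes",
              "taxable_income": 90_000, "refund": "no"}
    print([label for condition, label in rules if condition(record)])  # -> ['Birds']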

Example of Rule Based Classifier


Name Blood Type Give Birth Can Fly Live in Water Class
human warm yes no no mammals
python cold no no no reptiles
salmon cold no no yes fishes
whale warm yes no yes mammals
frog cold no no sometimes amphibians
komodo cold no no no reptiles
bat warm yes yes no mammals
pigeon warm no yes no birds
cat warm yes no no mammals
leopard shark cold yes no yes fishes
turtle cold no no sometimes reptiles
penguin warm no no sometimes birds
porcupine warm yes no no mammals
eel cold no no yes fishes
salamander cold no no sometimes amphibians
gila monster cold no no no reptiles
platypus warm no no no mammals
owl warm no yes no birds
dolphin warm yes no yes mammals
eagle warm no yes no birds

Rules:
R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

Application of Rule-Based Classifier


A rule r covers an instance x if the attributes of the instance satisfy the
condition of the rule

Rules:
R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

Name Blood Type Give Birth Can Fly Live in Water Class
hawk warm no yes no ?
grizzly bear warm yes no no ?

Rule R1 covers the hawk => Birds


Rule R3 covers the grizzly bear => Mammals

Rule Coverage and Accuracy


- Coverage of a rule:
– Fraction of records that satisfy the antecedent of the rule
- Accuracy of a rule:
– Fraction of records satisfying the antecedent that also satisfy
the consequent of the rule

Example (on the 10-record Refund/Marital Status/Cheat dataset above):
(Status=Single) → No
Coverage = 40% (4 of 10 records are Single), Accuracy = 50% (2 of those 4
have Cheat = No)
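A quick check of those numbers on the 10-record dataset above; the tuple
encoding (Refund, Status, Income, Cheat) is illustrative:

    # Coverage and accuracy of (Status=Single) -> No on the dataset above.
    records = [
        ("Yes", "Single", 125, "No"),   ("No", "Married", 100, "No"),
        ("No", "Single", 70, "No"),     ("Yes", "Married", 120, "No"),
        ("No", "Divorced", 95, "Yes"),  ("No", "Married", 60, "No"),
        ("Yes", "Divorced", 220, "No"), ("No", "Single", 85, "Yes"),
        ("No", "Married", 75, "No"),    ("No", "Single", 90, "Yes"),
    ]

    covered = [r for r in records if r[1] == "Single"]   # antecedent holds
    correct = [r for r in covered if r[3] == "No"]       # consequent also holds
    print(f"coverage = {len(covered) / len(records):.0%}")  # 40%
    print(f"accuracy = {len(correct) / len(covered):.0%}")  # 50%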
How Does a Rule-Based Classifier Work?
Rules:
R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds
R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians

Name Blood Type Give Birth Can Fly Live in Water Class
lemur warm yes no no ?
turtle cold no no sometimes ?
dogfish shark cold yes no yes ?

A lemur triggers rule R3, so it is classified as a mammal


A turtle triggers both R4 and R5
A dogfish shark triggers none of the rules

Characteristics of Rule-Based Classifier


- Mutually exclusive rules
– Classifier contains mutually exclusive rules if the rules are
independent of each other
– Every record is covered by at most one rule

- Exhaustive rules
– Classifier has exhaustive coverage if it accounts for every
possible combination of attribute values
– Each record is covered by at least one rule

From Decision Trees To Rules

Refund
├── Yes: NO
└── No: Marital Status
    ├── {Single, Divorced}: Taxable Income
    │     ├── < 80K: NO
    │     └── > 80K: YES
    └── {Married}: NO

Classification Rules:
(Refund=Yes) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income<80K) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income>80K) ==> Yes
(Refund=No, Marital Status={Married}) ==> No

The rules are mutually exclusive and exhaustive: one rule per
root-to-leaf path, so the rule set contains as much information as the
tree.


Rules Can Be Simplified

Refund
├── Yes: NO
└── No: Marital Status
    ├── {Single, Divorced}: Taxable Income
    │     ├── < 80K: NO
    │     └── > 80K: YES
    └── {Married}: NO

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Initial Rule: (Refund=No) ∧ (Status=Married) → No


Simplified Rule: (Status=Married) → No

Effect of Rule Simplification


- Rules are no longer mutually exclusive
– A record may trigger more than one rule
– Solution: use an ordered rule set, or an unordered rule set with a
voting scheme

- Rules are no longer exhaustive
– A record may not trigger any rule
– Solution: use a default class

Ordered Rule Set


- Rules are rank ordered according to their priority
– An ordered rule set is known as a decision list
- When a test record is presented to the classifier
– It is assigned the class label of the highest-ranked rule it
triggers
– If none of the rules fires, it is assigned the default class

R1: (Give Birth = no) ∧ (Can Fly = yes) → Birds


R2: (Give Birth = no) ∧ (Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧ (Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧ (Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians
Name Blood Type Give Birth Can Fly Live in Water Class
turtle cold no no sometimes ?

The turtle triggers both R4 and R5; with this ordering it is assigned to
Reptiles by R4, the higher-ranked rule (see the sketch below).
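A sketch of an ordered rule set (decision list) with a default class; the
lambda encoding and the classify helper are illustrative:

    # Decision list with a default class (illustrative encoding).
    decision_list = [
        (lambda r: r["give_birth"] == "no" and r["can_fly"] == "yes", "Birds"),
        (lambda r: r["give_birth"] == "no" and r["live_in_water"] == "yes", "Fishes"),
        (lambda r: r["give_birth"] == "yes" and r["blood_type"] == "warm", "Mammals"),
        (lambda r: r["give_birth"] == "no" and r["can_fly"] == "no", "Reptiles"),
        (lambda r: r["live_in_water"] == "sometimes", "Amphibians"),
    ]

    def classify(record, rules, default="Unknown"):
        for condition, label in rules:   # highest-ranked rule wins
            if condition(record):
                return label
        return default                   # no rule fires: default class

    turtle = {"blood_type": "cold", "give_birth": "no",
              "can_fly": "no", "live_in_water": "sometimes"}
    print(classify(turtle, decision_list))  # -> 'Reptiles' (R4 outranks R5)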

Building Classification Rules


- Direct Method:
– Extract rules directly from data
– e.g., RIPPER, CN2, Holte's 1R

- Indirect Method:
– Extract rules from other classification models (e.g.,
decision trees, neural networks, etc.)
– e.g., C4.5rules

Rule Extraction from a Decision Tree


- One rule is created for each path from the root to a leaf
- Each attribute-value pair along a path forms a conjunction; the
leaf holds the class prediction

Example 1:

age?
├── <=30: student?
│   ├── no: no
│   └── yes: yes
├── 31..40: yes
└── >40: credit_rating?
    ├── excellent: no
    └── fair: yes

- Example: rule extraction from the buys_computer decision tree


IF age <= 30 AND student = no THEN buys_computer = no
IF age <= 30 AND student = yes THEN buys_computer = yes
IF age = 31..40 THEN buys_computer = yes
IF age > 40 AND credit_rating = excellent THEN buys_computer = no
IF age > 40 AND credit_rating = fair THEN buys_computer = yes
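For the indirect method, one possible sketch is to fit a tree and print
one rule per root-to-leaf path; scikit-learn's export_text stands in for
C4.5rules here (the library choice is an assumption):

    # Indirect rule extraction: print root-to-leaf paths of a fitted tree.
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    data = load_iris()
    tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(
        data.data, data.target)
    print(export_text(tree, feature_names=data.feature_names))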

Example 2:

Advantages of Rule-Based Classifiers


- As highly expressive as decision trees
- Easy to interpret
- Easy to generate
- Can classify new instances rapidly
- Performance comparable to decision trees
- Rules are easier to understand than large trees

Exercise:
1. Determine the rules obtained from the following data:
