Rule Based Classifier
L19-03-09-09 1
Rule-Based Classifier
• Classify records by using a collection of
“if…then…” rules
• Rule: (Condition) → y
– where
• Condition is a conjunction of attribute tests
• y is the class label
– LHS: rule antecedent or condition
– RHS: rule consequent
– Examples of classification rules:
• (Blood Type=Warm) ∧(Lay Eggs=Yes) → Birds
• (Taxable Income < 50K) ∧(Refund=Yes) → Evade=No
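The rule form above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Rule` class, its field names, and the restriction to equality tests (so a test such as Taxable Income < 50K is not covered) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """A rule (Condition) -> y, with Condition as a conjunction of tests."""
    conditions: dict  # attribute name -> required value (equality tests only)
    label: str        # class label y (the rule consequent)

    def covers(self, record: dict) -> bool:
        # The antecedent holds only if every attribute test is satisfied.
        return all(record.get(attr) == value
                   for attr, value in self.conditions.items())


# (Blood Type=Warm) AND (Lay Eggs=Yes) -> Birds
bird_rule = Rule({"Blood Type": "warm", "Lay Eggs": "yes"}, "Birds")

warm_egg_layer = {"Blood Type": "warm", "Lay Eggs": "yes"}
cold_egg_layer = {"Blood Type": "cold", "Lay Eggs": "yes"}

print(bird_rule.covers(warm_egg_layer))  # True
print(bird_rule.covers(cold_egg_layer))  # False
```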
Rule-based Classifier (Example)
Name           Blood Type  Give Birth  Can Fly  Live in Water  Class
human          warm        yes         no       no             mammals
python         cold        no          no       no             reptiles
salmon         cold        no          no       yes            fishes
whale          warm        yes         no       yes            mammals
frog           cold        no          no       sometimes      amphibians
komodo         cold        no          no       no             reptiles
bat            warm        yes         yes      no             mammals
pigeon         warm        no          yes      no             birds
cat            warm        yes         no       no             mammals
leopard shark  cold        yes         no       yes            fishes
turtle         cold        no          no       sometimes      reptiles
penguin        warm        no          no       sometimes      birds
porcupine      warm        yes         no       no             mammals
eel            cold        no          no       yes            fishes
salamander     cold        no          no       sometimes      amphibians
gila monster   cold        no          no       no             reptiles
platypus       warm        no          no       no             mammals
owl            warm        no          yes      no             birds
dolphin        warm        yes         no       yes            mammals
eagle          warm        no          yes      no             birds

Name           Blood Type  Give Birth  Can Fly  Live in Water  Class
hawk           warm        no          yes      no             ?
grizzly bear   warm        yes         no       no             ?
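Rule R1, (Give Birth = no) ∧ (Can Fly = yes) → Birds, covers the hawk, so it is classified as a bird.
Rule R3, (Give Birth = yes) ∧ (Blood Type = warm) → Mammals, covers the grizzly bear, so it is classified as a mammal.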
Rule Coverage and Accuracy
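• Coverage of a rule:
– Fraction of records that satisfy the antecedent of the rule.
– For a dataset D and a rule r: A → y,
Coverage(r) = |A| / |D|
where |A| is the number of records that satisfy the antecedent A
and |D| is the total number of records.
• Accuracy of a rule:
– Fraction of the records covered by the rule (those that satisfy
the antecedent) that also satisfy the consequent.
Accuracy(r) = |A ∩ y| / |A|
where |A ∩ y| is the number of records that satisfy both the
antecedent and the consequent.
• Example: (Status=Single) → No
Coverage = 40%, Accuracy = 50%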
On the vertebrate data set above, the rule:
(gives birth = yes) ∧ (body temp = warm-blooded) → Mammals
Coverage = (6/20) × 100 = 30% and
Accuracy = (6/6) × 100 = 100%
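The coverage and accuracy figures for this rule can be checked with a short script. The sketch below re-types the 20 vertebrate records from the table; the helper name `coverage_and_accuracy` and the field names are illustrative choices, not part of the slides.

```python
# (name, blood type, give birth, can fly, live in water, class)
DATA = [
    ("human", "warm", "yes", "no", "no", "mammals"),
    ("python", "cold", "no", "no", "no", "reptiles"),
    ("salmon", "cold", "no", "no", "yes", "fishes"),
    ("whale", "warm", "yes", "no", "yes", "mammals"),
    ("frog", "cold", "no", "no", "sometimes", "amphibians"),
    ("komodo", "cold", "no", "no", "no", "reptiles"),
    ("bat", "warm", "yes", "yes", "no", "mammals"),
    ("pigeon", "warm", "no", "yes", "no", "birds"),
    ("cat", "warm", "yes", "no", "no", "mammals"),
    ("leopard shark", "cold", "yes", "no", "yes", "fishes"),
    ("turtle", "cold", "no", "no", "sometimes", "reptiles"),
    ("penguin", "warm", "no", "no", "sometimes", "birds"),
    ("porcupine", "warm", "yes", "no", "no", "mammals"),
    ("eel", "cold", "no", "no", "yes", "fishes"),
    ("salamander", "cold", "no", "no", "sometimes", "amphibians"),
    ("gila monster", "cold", "no", "no", "no", "reptiles"),
    ("platypus", "warm", "no", "no", "no", "mammals"),
    ("owl", "warm", "no", "yes", "no", "birds"),
    ("dolphin", "warm", "yes", "no", "yes", "mammals"),
    ("eagle", "warm", "no", "yes", "no", "birds"),
]
FIELDS = ("name", "blood_type", "give_birth", "can_fly", "live_in_water", "class")
records = [dict(zip(FIELDS, row)) for row in DATA]


def coverage_and_accuracy(conditions, label, records):
    """Return (coverage, accuracy) of the rule conditions -> label."""
    # Records that satisfy the antecedent A.
    covered = [r for r in records
               if all(r[attr] == value for attr, value in conditions.items())]
    # Covered records whose class also matches the consequent y.
    correct = [r for r in covered if r["class"] == label]
    coverage = len(covered) / len(records)
    accuracy = len(correct) / len(covered) if covered else 0.0
    return coverage, accuracy


cov, acc = coverage_and_accuracy(
    {"give_birth": "yes", "blood_type": "warm"}, "mammals", records)
print(f"coverage={cov:.0%}, accuracy={acc:.0%}")  # coverage=30%, accuracy=100%
```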
How does Rule-based Classifier Work?
R1: (Give Birth = no) ∧(Can Fly = yes) → Birds
R2: (Give Birth = no) ∧(Live in Water = yes) → Fishes
R3: (Give Birth = yes) ∧(Blood Type = warm) → Mammals
R4: (Give Birth = no) ∧(Can Fly = no) → Reptiles
R5: (Live in Water = sometimes) → Amphibians
Name           Blood Type  Give Birth  Can Fly  Live in Water  Class
lemur          warm        yes         no       no             ?
turtle         cold        no          no       sometimes      ?
dogfish shark  cold        yes         no       yes            ?
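To see how the three test records interact with R1 to R5, the sketch below lists which rules each record triggers. The tuple layout for rules and the snake_case attribute names are assumptions made for this example.

```python
# Rules R1-R5 as (name, conditions, class label); conditions are equality tests.
RULES = [
    ("R1", {"give_birth": "no", "can_fly": "yes"}, "Birds"),
    ("R2", {"give_birth": "no", "live_in_water": "yes"}, "Fishes"),
    ("R3", {"give_birth": "yes", "blood_type": "warm"}, "Mammals"),
    ("R4", {"give_birth": "no", "can_fly": "no"}, "Reptiles"),
    ("R5", {"live_in_water": "sometimes"}, "Amphibians"),
]

TEST_RECORDS = {
    "lemur": {"blood_type": "warm", "give_birth": "yes",
              "can_fly": "no", "live_in_water": "no"},
    "turtle": {"blood_type": "cold", "give_birth": "no",
               "can_fly": "no", "live_in_water": "sometimes"},
    "dogfish shark": {"blood_type": "cold", "give_birth": "yes",
                      "can_fly": "no", "live_in_water": "yes"},
}


def triggered(record):
    """Names of all rules whose antecedent the record satisfies."""
    return [name for name, conditions, _ in RULES
            if all(record[attr] == value for attr, value in conditions.items())]


for animal, record in TEST_RECORDS.items():
    print(animal, triggered(record))
# lemur ['R3']
# turtle ['R4', 'R5']
# dogfish shark []
```

The lemur triggers exactly one rule, the turtle triggers two rules with different class labels, and the dogfish shark triggers none, so this rule set is neither mutually exclusive nor exhaustive.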
Characteristics of Rule-Based Classifier
• Mutually exclusive rules
– Classifier contains mutually exclusive rules if the rules are
independent of each other
– The rules in a rule set R are mutually exclusive (ME) if no two
rules in R are triggered by the same record.
– This property ensures that every record is covered by at
most one rule.
• Exhaustive rules
– Classifier has exhaustive coverage if it accounts for every
possible combination of attribute values
– This property ensures that each record is covered by at least
one rule
Example of a mutually exclusive and exhaustive rule set:
R1: (body_temp = cold-blooded) → Non-mammals
R2: (body_temp = warm-blooded) ∧ (gives birth = yes) → Mammals
R3: (body_temp = warm-blooded) ∧ (gives birth = no) → Non-mammals
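Both properties can be checked by brute force when the attribute domains are small. The sketch below enumerates every combination of attribute values and counts how many of the three rules fire; the two-valued domains and the helper name `fired_counts` are assumptions made for this example.

```python
from itertools import product

# Assumed attribute domains for this example.
DOMAINS = {
    "body_temp": ["cold-blooded", "warm-blooded"],
    "gives_birth": ["yes", "no"],
}

RULES = [
    ({"body_temp": "cold-blooded"}, "Non-mammals"),                     # R1
    ({"body_temp": "warm-blooded", "gives_birth": "yes"}, "Mammals"),   # R2
    ({"body_temp": "warm-blooded", "gives_birth": "no"}, "Non-mammals"),  # R3
]


def fired_counts(rules, domains):
    """Number of rules triggered by each possible attribute combination."""
    attrs = list(domains)
    counts = []
    for values in product(*(domains[a] for a in attrs)):
        record = dict(zip(attrs, values))
        counts.append(sum(
            all(record[attr] == value for attr, value in conditions.items())
            for conditions, _ in rules))
    return counts


counts = fired_counts(RULES, DOMAINS)
mutually_exclusive = all(c <= 1 for c in counts)  # at most one rule per record
exhaustive = all(c >= 1 for c in counts)          # at least one rule per record
print(counts, mutually_exclusive, exhaustive)  # [1, 1, 1, 1] True True
```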
Rules
• Non mutually exclusive rules
– A record may trigger more than one rule
– Solution?
• Ordered rule set
Two ways to overcome the problem when
rules are not ME
• Ordered rule set
– Rules are ordered in decreasing order of their
priority
– Priority can be defined in many ways based on
accuracy, coverage or the order in which the
rules were generated.
• Unordered rule set
– Allows a test record to trigger multiple
classification rules and consider consequent of
each rule as a vote for a particular class.
– use voting schemes
Ordered Rule Set
• Rules are rank ordered according to their priority
– An ordered rule set is known as a decision list.
• When a test record is presented to the classifier
– It is assigned to the class label of the highest ranked rule it has
triggered
– If none of the rules fired, it is assigned to the default class
Name           Blood Type  Give Birth  Can Fly  Live in Water  Class
turtle         cold        no          no       sometimes      ?
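A decision list can be sketched as a loop that returns the label of the first rule that fires. The sketch assumes that the listing order R1 to R5 is the priority order and uses a placeholder default class `"Unknown"`; neither is stated on the slide.

```python
# Rules in assumed priority order (highest first).
RULES = [
    ("R1", {"give_birth": "no", "can_fly": "yes"}, "Birds"),
    ("R2", {"give_birth": "no", "live_in_water": "yes"}, "Fishes"),
    ("R3", {"give_birth": "yes", "blood_type": "warm"}, "Mammals"),
    ("R4", {"give_birth": "no", "can_fly": "no"}, "Reptiles"),
    ("R5", {"live_in_water": "sometimes"}, "Amphibians"),
]
DEFAULT_CLASS = "Unknown"  # placeholder; the slide does not name a default class


def classify(record, rules=RULES, default=DEFAULT_CLASS):
    """Return (class label, rule name) of the highest-ranked triggered rule."""
    for name, conditions, label in rules:
        if all(record[attr] == value for attr, value in conditions.items()):
            return label, name
    return default, None  # no rule fired


turtle = {"blood_type": "cold", "give_birth": "no",
          "can_fly": "no", "live_in_water": "sometimes"}
dogfish_shark = {"blood_type": "cold", "give_birth": "yes",
                 "can_fly": "no", "live_in_water": "yes"}

print(classify(turtle))         # ('Reptiles', 'R4')
print(classify(dogfish_shark))  # ('Unknown', None)
```

Under this ordering the turtle is labeled by R4 even though R5 also covers it, because R4 is ranked higher.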
Unordered rules
– Allows a test record to trigger multiple
classification rules and consider consequent of
each rule as a vote for a particular class.
– The votes are then compared to determine the
class label of the test record.
– The record is usually assigned to the class that
has the highest number of votes.
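The voting scheme can be sketched with a counter over the consequents of all triggered rules. This example reuses rules R1 to R5 and gives each rule one unweighted vote, which is an assumption; the helper names `vote` and `winners` are illustrative.

```python
from collections import Counter

RULES = [
    ({"give_birth": "no", "can_fly": "yes"}, "Birds"),          # R1
    ({"give_birth": "no", "live_in_water": "yes"}, "Fishes"),   # R2
    ({"give_birth": "yes", "blood_type": "warm"}, "Mammals"),   # R3
    ({"give_birth": "no", "can_fly": "no"}, "Reptiles"),        # R4
    ({"live_in_water": "sometimes"}, "Amphibians"),             # R5
]


def vote(record, rules=RULES):
    """Count one vote per triggered rule for that rule's class label."""
    return Counter(
        label for conditions, label in rules
        if all(record[attr] == value for attr, value in conditions.items()))


def winners(votes):
    """Classes with the highest vote count (more than one means a tie)."""
    if not votes:
        return []
    top = max(votes.values())
    return sorted(label for label, n in votes.items() if n == top)


turtle = {"blood_type": "cold", "give_birth": "no",
          "can_fly": "no", "live_in_water": "sometimes"}
votes = vote(turtle)
print(dict(votes))     # {'Reptiles': 1, 'Amphibians': 1}
print(winners(votes))  # ['Amphibians', 'Reptiles']
```

The turtle receives one vote for each of two classes, so plain vote counting leaves a tie that needs an additional tie-breaking policy.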
Rule Ordering Schemes
• Rule-based ordering
– Individual rules are ranked based on their
quality
– This scheme ensures that every test record is
classified by the “best” rule covering it.
– If the number of rules is large, interpreting the
lower-ranked rules becomes cumbersome.
• Class-based ordering
– Rules that belong to the same class appear together
in the rule set.
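The two ordering schemes can be contrasted on a toy rule set. In the sketch below the rule names, the quality scores, and the class order are all hypothetical values chosen for illustration; rule-based ordering sorts by quality alone, while class-based ordering groups rules by class first.

```python
# (rule name, class label, quality score) -- all values are hypothetical.
rules = [
    ("r1", "No", 0.95),
    ("r2", "Yes", 0.90),
    ("r3", "No", 0.80),
    ("r4", "Yes", 0.70),
]

# Rule-based ordering: rank individual rules by decreasing quality.
rule_based = sorted(rules, key=lambda r: -r[2])

# Class-based ordering: rules of the same class stay together; the order of
# the classes themselves is an assumed input, and ties inside a class are
# broken by quality.
class_order = {"No": 0, "Yes": 1}
class_based = sorted(rules, key=lambda r: (class_order[r[1]], -r[2]))

print([name for name, _, _ in rule_based])   # ['r1', 'r2', 'r3', 'r4']
print([name for name, _, _ in class_based])  # ['r1', 'r3', 'r2', 'r4']
```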
Rule-based Ordering
(Refund=Yes) ==> No
(Refund=No, Marital Status={Single,Divorced}, Taxable Income<80K) ==> No
Class-based Ordering
(Refund=Yes) ==> …
(Refund=No, Marital Status=…, Taxable Income<8…) ==> …
Rules derived from a decision tree are mutually exclusive and exhaustive.
The rule set contains as much information as the tree.
Rules Can Be Simplified
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No

Decision tree (figure): Refund = Yes → NO; Refund = No → Marital Status; Marital Status = {Married} → NO; Marital Status = {Single, Divorced} → Taxable Income