DM 05 04 Rule-Based Classification
Part 5. Prediction
Spring 2010
Rule-Based Classification
Outline
Using IF-THEN Rules for Classification
– A rule has the form IF condition THEN conclusion, where
Condition (or LHS) is the rule antecedent/precondition
Conclusion (or RHS) is the rule consequent
Assessment of a Rule
Assessment of a rule:
– Coverage of a rule:
The percentage of instances that satisfy the antecedent of a
rule (i.e., whose attribute values hold true for the rule’s
antecedent).
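– Accuracy of a rule:
Of the instances that satisfy the antecedent of a rule (i.e., the
instances the rule covers), the percentage that also satisfy its
consequent (i.e., that the rule classifies correctly).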
Rule Coverage and Accuracy
coverage(R) = n_covers / |D|
accuracy(R) = n_correct / n_covers
where
– D: class-labeled data set
– |D|: number of instances in D
– n_covers: number of instances covered by R
– n_correct: number of instances correctly classified by R
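Both measures can be computed directly from the definitions above. The following is a minimal sketch, not part of the original slides: the function name, the attribute names (age, student, buys_computer), and the four toy instances are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, str]


def coverage_and_accuracy(
    antecedent: Callable[[Instance], bool],
    predicted_class: str,
    data: List[Instance],
    class_attr: str,
) -> Tuple[float, float]:
    """Return (coverage, accuracy) of the rule IF antecedent THEN predicted_class."""
    covered = [x for x in data if antecedent(x)]
    n_covers = len(covered)
    n_correct = sum(1 for x in covered if x[class_attr] == predicted_class)
    coverage = n_covers / len(data) if data else 0.0
    # Accuracy is measured only over the covered instances.
    accuracy = n_correct / n_covers if n_covers else 0.0
    return coverage, accuracy


# Hypothetical toy data set D (not taken from the slides).
toy = [
    {"age": "youth", "student": "yes", "buys_computer": "yes"},
    {"age": "youth", "student": "yes", "buys_computer": "no"},
    {"age": "youth", "student": "no", "buys_computer": "no"},
    {"age": "senior", "student": "yes", "buys_computer": "yes"},
]

# Rule R: IF age = youth AND student = yes THEN buys_computer = yes
cov, acc = coverage_and_accuracy(
    lambda x: x["age"] == "youth" and x["student"] == "yes",
    "yes",
    toy,
    "buys_computer",
)
print(cov, acc)  # 2 of 4 instances covered, 1 of those 2 classified correctly
```

A rule with high accuracy can still have very low coverage, which is why the two numbers are reported together.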
Example: AllElectronics
Coverage and Accuracy
Executing a rule set
How We Can Use Rule-based Classification
Conflict Resolution
Class-based ordering:
– Classes are sorted in decreasing order of frequency
That is, all of the rules for the most frequent class come first,
the rules for the next most frequent class come next, and so on.
– Alternatively, classes are sorted in decreasing order of
misclassification cost per class
– Most popular strategy
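Class-based ordering by frequency amounts to a stable sort of the rule list. A minimal sketch, assuming each rule is a hypothetical (name, predicted class) pair and that class frequencies are counted on the training labels:

```python
from collections import Counter
from typing import List, Tuple

Rule = Tuple[str, str]  # (rule name, predicted class); a hypothetical representation


def class_based_order(rules: List[Rule], training_labels: List[str]) -> List[Rule]:
    """Put rules for the most frequent class first, then the next most frequent, and so on."""
    freq = Counter(training_labels)
    # sorted() is stable, so rules that predict the same class keep their relative order.
    return sorted(rules, key=lambda r: -freq[r[1]])


rules = [("r1", "no"), ("r2", "yes"), ("r3", "no"), ("r4", "yes")]
labels = ["yes", "yes", "yes", "no", "no"]
ordered = class_based_order(rules, labels)
print(ordered)
```

Ordering by misclassification cost would use the same sort with a per-class cost table in place of the frequency counts.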
Default Rule
If no rule is satisfied by X :
– A default rule can be set up to specify a default class,
based on a training set.
– This may be the overall majority class, or the majority
class of the instances that were not covered by any rule.
– The default rule is evaluated at the end, if and only if
no other rule covers X.
– The condition in the default rule is empty.
– In this way, the rule fires when no other rule is
satisfied.
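The default rule can be sketched as the fall-through case of an ordered rule list. This is an illustrative sketch only: the function names, the (condition, class) rule representation, and the toy data are assumptions, not taken from the slides.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, str]
Rule = Tuple[Callable[[Instance], bool], str]  # (condition, predicted class)


def default_class(data: List[Instance], rules: List[Rule], class_attr: str) -> str:
    """Majority class of the training instances that no rule covers.

    Falls back to the overall majority class if every instance is covered.
    """
    uncovered = [x for x in data if not any(cond(x) for cond, _ in rules)]
    pool = uncovered or data
    return Counter(x[class_attr] for x in pool).most_common(1)[0][0]


def classify(x: Instance, rules: List[Rule], default: str) -> str:
    """Return the class of the first rule that fires; otherwise the default class."""
    for cond, label in rules:
        if cond(x):
            return label
    # Default rule: empty condition, reached only when no other rule covers x.
    return default


# Hypothetical training data and a single rule.
train = [
    {"student": "yes", "buys_computer": "yes"},
    {"student": "no", "buys_computer": "no"},
    {"student": "no", "buys_computer": "no"},
    {"student": "no", "buys_computer": "yes"},
]
rules = [(lambda x: x["student"] == "yes", "yes")]
default = default_class(train, rules, "buys_computer")
print(default, classify({"student": "no"}, rules, default))
```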
Rule Extraction from a Decision Tree
Building Classification Rules
Example: AllElectronics
Pruning the Rule Set
1R Algorithm
Pseudocode for the 1R Algorithm
Example: The weather problem
Evaluating the weather attributes
The attribute with the smallest number of errors
Dealing with numeric attributes
Weather data with some numeric attributes
Example: temperature from weather data
The problem of overfitting
Minimum is set at 3 for temperature attribute
Resulting rule set with overfitting avoidance
Sequential Covering Algorithms
[Figure: the set of training instances, with the subsets covered by Rule 1, Rule 2, and Rule 3 marked]
Basic Sequential Covering Algorithm
Steps:
– Rules are learned one at a time
– Each time a rule is learned, the instances covered by
that rule are removed
– The process repeats on the remaining instances
until a termination condition is met
e.g., when there are no more training examples or when the
quality of the rule returned is below a user-specified threshold
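The steps above can be sketched as a short loop. This is an illustrative sketch, not the algorithm as given on the slides: rules are assumed to be conjunctions of attribute = value tests, the greedy learn_one_rule helper (which picks the test with the highest accuracy) is a stand-in for whatever rule learner is used, and the loan data and attribute names are hypothetical.

```python
from typing import Dict, List

Instance = Dict[str, str]
Rule = Dict[str, str]  # conjunction of attribute == value tests


def covers(rule: Rule, x: Instance) -> bool:
    return all(x[a] == v for a, v in rule.items())


def rule_accuracy(rule: Rule, data: List[Instance], class_attr: str, target: str) -> float:
    covered = [x for x in data if covers(rule, x)]
    if not covered:
        return 0.0
    return sum(x[class_attr] == target for x in covered) / len(covered)


def learn_one_rule(data: List[Instance], class_attr: str, target: str) -> Rule:
    """Greedily add the test that gives the most accurate rule (ties: more positives)."""
    rule: Rule = {}
    while rule_accuracy(rule, data, class_attr, target) < 1.0:
        covered = [x for x in data if covers(rule, x)]
        candidates = {(a, x[a]) for x in covered for a in x
                      if a != class_attr and a not in rule}
        if not candidates:
            break

        def score(test):
            extended = {**rule, test[0]: test[1]}
            n_pos = sum(covers(extended, x) and x[class_attr] == target for x in data)
            return (rule_accuracy(extended, data, class_attr, target), n_pos)

        attr, value = max(sorted(candidates), key=score)  # sorted() makes ties deterministic
        rule[attr] = value
    return rule


def sequential_covering(data: List[Instance], class_attr: str, target: str,
                        min_accuracy: float = 0.8) -> List[Rule]:
    rules: List[Rule] = []
    remaining = list(data)
    # Terminate when no instances of the target class remain ...
    while any(x[class_attr] == target for x in remaining):
        rule = learn_one_rule(remaining, class_attr, target)
        # ... or when the rule's quality falls below the user-specified threshold.
        if rule_accuracy(rule, remaining, class_attr, target) < min_accuracy:
            break
        rules.append(rule)
        # Remove the instances covered by the rule just learned.
        remaining = [x for x in remaining if not covers(rule, x)]
    return rules


# Hypothetical loan data (not from the slides).
loans = [
    {"income": "high", "credit": "good", "loan_decision": "accept"},
    {"income": "high", "credit": "fair", "loan_decision": "accept"},
    {"income": "low", "credit": "good", "loan_decision": "accept"},
    {"income": "low", "credit": "fair", "loan_decision": "reject"},
    {"income": "low", "credit": "fair", "loan_decision": "reject"},
]
learned = sequential_covering(loans, "loan_decision", "accept")
print(learned)
```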
Generating A Rule
Example: Generating A Rule
Example:
– Suppose our training set, D, consists of loan application
data.
– Attributes regarding each applicant include their:
age
income
education level
residence
credit rating
the term of the loan.
– The classifying attribute is loan_decision, which indicates
whether a loan is accepted (considered safe) or rejected
(considered risky).
Decision tree for the same problem
Rules vs. trees
PRISM Algorithm
Selecting a test
Example: contact lens data
Possible tests:
Create the rule
Further refinement
Current state:
Possible tests:
Modified rule and resulting data
Further refinement
Current state:
Possible tests:
The result
Final rule:
Pseudo-code for PRISM
Rules vs. decision lists
Separate and conquer
FOIL Algorithm
(First Order Inductive Learner Algorithm)
Coverage or Accuracy?
Consider Both Coverage and Accuracy
FOIL Information Gain
To generate a rule:
    while (true)
        find the best predicate p
        if FOIL_GAIN(p) > threshold then add p to the current rule
        else break
[Figure: a rule grown one predicate at a time (A3=1, then A3=1 && A1=2, then A3=1 && A1=2 && A8=5), shown against the positive and negative examples it covers]
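The gain formula itself did not survive in the slide text. The sketch below uses the formulation commonly given in the literature, FOIL_Gain = pos' × (log2(pos' / (pos' + neg')) − log2(pos / (pos + neg))), where pos and neg are the positive and negative examples covered before adding the predicate and pos' and neg' are those covered after. The function name, parameter names, and the example counts are hypothetical.

```python
import math


def foil_gain(pos: int, neg: int, pos_new: int, neg_new: int) -> float:
    """FOIL information gain of extending a rule with one more predicate.

    pos, neg:         positive / negative examples covered by the current rule
    pos_new, neg_new: positive / negative examples covered by the extended rule
    """
    if pos == 0 or pos_new == 0:
        # Assumption: a rule that covers no positive examples gains nothing.
        return 0.0
    before = math.log2(pos / (pos + neg))
    after = math.log2(pos_new / (pos_new + neg_new))
    # Weighting by pos_new favours rules that stay accurate AND keep many positives.
    return pos_new * (after - before)


# Hypothetical counts: the rule covers 10 positives and 10 negatives;
# after adding a predicate it covers 8 positives and 2 negatives.
gain = foil_gain(10, 10, 8, 2)
print(round(gain, 3))
```

A predicate that leaves the proportion of positives unchanged has zero gain, so the loop above would stop adding predicates once no candidate exceeds the threshold.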
Rule Pruning: FOIL method