IME672 - Lecture 48
• If the rule antecedent holds true for a given tuple, we say that the rule is satisfied and that the rule covers the tuple; the rule is then said to be fired or triggered
IF-THEN Rules for Classification
• Example: the vertebrate classification problem, with rules such as IF body_temperature = warm AND gives_birth = yes THEN class = mammal
• Assessment of a rule R: coverage and accuracy
– ncovers = # of tuples covered by R (i.e., that satisfy its antecedent)
– ncorrect = # of tuples covered by R and correctly classified by it
– Coverage: fraction of all records that satisfy the antecedent of the rule, coverage(R) = ncovers / |D|, where |D| is the total number of tuples
– Accuracy: fraction of the records covered by the rule that satisfy both the antecedent and the consequent, accuracy(R) = ncorrect / ncovers (see the sketch below)
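As a concrete illustration, here is a minimal Python sketch of the two measures; the dict-based rule and tuple representations and the loan-style data are assumptions made for this example, not part of the lecture:

def covers(antecedent, t):
    # A tuple is covered when it satisfies every attribute test in the antecedent
    return all(t.get(attr) == val for attr, val in antecedent.items())

def coverage_and_accuracy(rule, data):
    antecedent, consequent = rule
    covered = [t for t in data if covers(antecedent, t)]
    n_covers = len(covered)
    n_correct = sum(1 for t in covered if t["class"] == consequent)
    coverage = n_covers / len(data)
    accuracy = n_correct / n_covers if n_covers else 0.0
    return coverage, accuracy

# Hypothetical loan data; rule: IF income = high THEN loan_decision = accept
data = [
    {"income": "high", "credit": "good", "class": "accept"},
    {"income": "high", "credit": "fair", "class": "reject"},
    {"income": "low",  "credit": "good", "class": "reject"},
]
print(coverage_and_accuracy(({"income": "high"}, "accept"), data))
# coverage ≈ 0.67 (2 of 3 tuples covered), accuracy = 0.5 (1 of 2 correct)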
• Since one rule is extracted per leaf, the resulting set of rules is not much simpler than the corresponding decision tree
• Tuples of the class for which rules are learned are called positive tuples, while
the remaining tuples are negative
Basic Sequential Covering Algorithm
• Rules are learned one at a time; each time a rule is learned, the tuples it covers are removed, and the process repeats on the remaining tuples
How are Rules Learned?
• Start with the most general rule possible:
– IF { } THEN loan_decision = accept
• Add new attribute tests by adopting a greedy depth-first strategy
– Pick the test that improves the rule quality most
– E.g., maximize the rule’s accuracy
• Similar to the situation in decision trees: the problem of selecting an attribute to split on
• The resulting rule should cover relatively more of the “accept” tuples (a sketch of the full procedure follows)
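Below is a compact Python sketch of sequential covering with greedy general-to-specific rule growing. It reuses covers() from the earlier sketch; the accuracy-based quality measure and the stopping conditions are simplifying assumptions, not the exact procedure of any particular system:

def quality(antecedent, target, data):
    # Rule quality = accuracy over the tuples the rule covers (one simple choice)
    covered = [t for t in data if covers(antecedent, t)]
    if not covered:
        return 0.0
    return sum(t["class"] == target for t in covered) / len(covered)

def learn_one_rule(target, data, attributes):
    # Grow one rule general-to-specific: start from IF { } THEN class = target,
    # then greedily add the single attribute test that improves quality most
    antecedent = {}
    while True:
        best, best_q = None, quality(antecedent, target, data)
        for attr in attributes:
            if attr in antecedent:
                continue
            for val in {t[attr] for t in data}:
                cand = {**antecedent, attr: val}
                q = quality(cand, target, data)
                if q > best_q:
                    best, best_q = cand, q
        if best is None:
            return antecedent  # no single test improves the rule further
        antecedent = best

def sequential_covering(target, data, attributes):
    # Learn rules for the target class one at a time, removing covered tuples
    rules, remaining = [], list(data)
    while any(t["class"] == target for t in remaining):
        antecedent = learn_one_rule(target, remaining, attributes)
        if not antecedent:
            break  # could not grow a discriminating rule; stop
        rules.append((antecedent, target))
        remaining = [t for t in remaining if not covers(antecedent, t)]
    return rules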
Rule Learning
• A rule is grown by adding one conjunct at a time:
– IF { } THEN class = a
– IF x > 1.2 THEN class = a
– IF x > 1.2 AND y > 2.6 THEN class = a
Rule Pruning
• Pruning = removing a conjunct (attribute test) from the rule
• Prune a rule, R, if the pruned version of R has greater quality, as assessed on an independent set of tuples (the pruning set)
• FOIL_Prune(R) = (pos - neg) / (pos + neg), where pos and neg are the numbers of positive and negative tuples covered by R; if FOIL_Prune is higher for the pruned version of R, the conjunct is pruned (a sketch follows)
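A minimal sketch of conjunct pruning driven by the FOIL_Prune value; it reuses covers() from the earlier sketch, and the greedy drop-one-conjunct loop is an assumption made for illustration:

def foil_prune(antecedent, target, prune_set):
    # FOIL_Prune(R) = (pos - neg) / (pos + neg) on an independent pruning set
    covered = [t for t in prune_set if covers(antecedent, t)]
    pos = sum(t["class"] == target for t in covered)
    neg = len(covered) - pos
    return (pos - neg) / (pos + neg) if covered else float("-inf")

def prune_rule(antecedent, target, prune_set):
    # Repeatedly drop the conjunct whose removal most improves FOIL_Prune
    while antecedent:
        base = foil_prune(antecedent, target, prune_set)
        trials = {a: {k: v for k, v in antecedent.items() if k != a}
                  for a in antecedent}
        best = max(trials, key=lambda a: foil_prune(trials[a], target, prune_set))
        if foil_prune(trials[best], target, prune_set) <= base:
            return antecedent  # no removal improves the rule; keep it as is
        antecedent = trials[best]
    return antecedent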
Rule-Quality Measures
• Likelihood ratio statistic: LR = 2 Σ_i f_i log2(f_i / e_i), where f_i is the observed and e_i the expected number of tuples of class i covered by the rule; the larger LR, the less likely the rule’s performance is due to chance
• Example: 1000 training tuples, 500 positive and 500 negative
– Rule R1 covers 350 positive and 150 negative tuples:
• Expected number of +ve examples = 500/1000*(350+150) = 250
• Expected number of -ve examples = 500/1000*(350+150) = 250
• LR(R1) = 2 × [350 log2(350/250) + 150 log2(150/250)] ≈ 118.7
– Rule R2 covers 300 positive and 50 negative tuples:
• Expected number of +ve examples = 500/1000*(300+50) = 175
• Expected number of -ve examples = 500/1000*(300+50) = 175
• LR(R2) = 2 × [300 log2(300/175) + 50 log2(50/175)] ≈ 285.8
– R2 has the larger likelihood ratio, so its performance is less likely to be due to chance (the computation is repeated in the sketch below)
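The same arithmetic in a few lines of Python (log base 2, matching the values above):

import math

def likelihood_ratio(observed, expected):
    # LR = 2 * sum(f_i * log2(f_i / e_i)) over the classes the rule covers
    return 2 * sum(f * math.log2(f / e) for f, e in zip(observed, expected))

print(likelihood_ratio([350, 150], [250, 250]))  # R1: ~118.7
print(likelihood_ratio([300, 50], [175, 175]))   # R2: ~285.8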
Rule-Based Classifiers
• Advantages:
– As highly expressive as decision trees
– Easy to interpret
– Easy to generate
– Can classify new instances rapidly
– Performance comparable to decision trees
– Can easily handle missing values and numeric attributes