ML Unit V
Explanation-based learning (EBL) has the ability to learn from a single training instance. Rather than requiring many examples, explanation-based learning emphasizes learning from a single, specific example. For example, consider the game of Ludo. In Ludo there are generally four colours of tokens, with four tokens of each colour; suppose the colours are red, green, blue and yellow, so at most four players are possible. Two players form one side (say green and red) and the other two form the opposing side (say blue and yellow), so each player has an opponent playing against him. A small die, marked with the numbers one to six, is passed among the four players; one is the lowest and six the highest of the numbers on which all moves are made. A player on the first side will always try to attack a player on the second side and vice versa; at any instant of play, the players on one side can attack the players on the other side. In this way tokens may be attacked and removed one by one until one side finally wins the game. Since at any moment the players on one side can attack those on the other, a single player's move can affect the whole game, so one specific example carries a great deal of information. Hence we can say that explanation-based learning always concentrates on four inputs, like a simple learning program: a training example, an idea of the goal state (the goal concept), an idea of the usable (operational) concepts, and a set of rules, the domain theory, that describes relationships between the objects and the actions.
Explanation-based generalization (EBG) is an algorithm for explanation-based learning, described in Mitchell et al. (1986). It has two steps: first, the explain step, and second, the generalize step. During the first step, the domain theory is used to prune away all the unimportant aspects of the training example with respect to the goal concept. The second step is to generalize the explanation as far as possible while still describing the goal concept. Consider the problem of learning the concept bucket. We want to generalize from a single example of a bucket. First, collect the following information.
Goal: Bucket
B is a bucket if B is liftable, stable and open-vessel.
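The two EBG steps can be sketched propositionally. Only the top clause (Bucket ← Liftable, Stable, OpenVessel) comes from the example above; the sub-clauses that define Liftable, Stable and OpenVessel in terms of primitive attributes are invented for illustration. A minimal sketch:

```python
# Propositional sketch of EBG: explain the goal from the example via the
# domain theory, then keep only the operational leaves of the proof tree.
CLAUSES = {                     # head -> antecedents (the domain theory)
    "Bucket": ["Liftable", "Stable", "OpenVessel"],
    "Liftable": ["Light", "HasHandle"],         # invented sub-clauses
    "Stable": ["BottomIsFlat"],
    "OpenVessel": ["HasConcavity", "ConcavityPointsUp"],
}

def explain(goal, example):
    """Prove `goal` from the example's operational features; return the
    leaves of the proof tree, i.e. the operational sufficient conditions."""
    if goal not in CLAUSES:          # operational literal: must hold directly
        return {goal} if goal in example else None
    leaves = set()
    for ant in CLAUSES[goal]:
        sub = explain(ant, example)
        if sub is None:
            return None              # proof fails; no explanation exists
        leaves |= sub
    return leaves
```

The returned leaf set is the generalized rule: any instance with those operational features is proved to be a bucket, regardless of the other details of the single training example.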
KBANN Algorithm
KBANN(domainTheory, trainingExamples)
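KBANN's first phase translates the domain theory into an equivalent initial network; its second phase refines that network's weights with backpropagation on the training examples. The sketch below shows only the translation phase, using the usual scheme of weight W for each antecedent and a threshold of (n - 0.5)W for a clause with n antecedents. The cup clauses mirror Mitchell's example; the backpropagation refinement step is omitted.

```python
import math

W = 5.0  # large weight so each unit initially behaves like its Horn clause

# clause head -> antecedents (mirrors Mitchell's cup domain theory)
CLAUSES = {
    "Stable": ["BottomIsFlat"],
    "Graspable": ["HasHandle"],
    "Liftable": ["Graspable", "Light"],
    "OpenVessel": ["HasConcavity", "ConcavityPointsUp"],
    "Cup": ["Stable", "Liftable", "OpenVessel"],
}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(instance):
    """Run the translated network on a set of true input attributes."""
    act = {a: 1.0 for a in instance}            # input units
    for head, ants in CLAUSES.items():          # listed in topological order
        z = sum(W * act.get(a, 0.0) for a in ants)
        act[head] = sigmoid(z - (len(ants) - 0.5) * W)  # threshold (n - 0.5)W
    return act["Cup"]
```

Before refinement the network classifies exactly as the Horn clauses do: an instance with all required attributes activates the Cup unit above 0.5, and dropping any one of them pushes it below.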
KBANN Example
Domain theory
Training Examples
[Table: positive (Cup) and negative (Non-Cup) training examples, each described by the attributes BottomIsFlat, ConcavityPointsUp, Expensive, Fragile, HandleOnTop, HandleOnSide, HasConcavity, HasHandle, Light, MadeOfCeramic, MadeOfPaper and MadeOfStyrofoam.]
KBANN Example Network
After Training
KBANN Results
In classifying promoter regions in DNA, backpropagation made 8 errors out of 106 examples, while KBANN made 4 out of 106.
KBANN typically generalizes more accurately than backpropagation.
TangentProp
The TangentProp algorithm incorporates prior knowledge into the error
criterion minimized by gradient descent.
Specifically, the prior knowledge takes the form of known derivatives of the
target function.
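A minimal numeric sketch of that error criterion, using an invented one-dimensional linear model: the loss combines the usual squared error with a penalty mu*(f'(x) - d)^2 on the mismatch between the model's derivative and a known derivative value d.

```python
def train(x0=1.0, y0=3.0, d0=2.0, mu=1.0, lr=0.1, steps=500):
    """Fit f(x) = w1*x + w0 to one data point (x0, y0) plus a known
    derivative d0, minimizing  (f(x0) - y0)^2 + mu*(f'(x0) - d0)^2."""
    w1, w0 = 0.0, 0.0
    for _ in range(steps):
        err = w1 * x0 + w0 - y0                   # ordinary error term
        g1 = 2 * err * x0 + 2 * mu * (w1 - d0)    # note f'(x) = w1 exactly
        g0 = 2 * err
        w1 -= lr * g1
        w0 -= lr * g0
    return w1, w0
```

With the settings above the minimum lies at w1 = 2, w0 = 1: a single data point alone cannot fix the slope, but the derivative term pins it, which is precisely the extra leverage TangentProp gets from prior knowledge.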
TangentProp Example
TangentProp Search
EBNN
The Explanation-Based Neural Network (EBNN) algorithm extends TangentProp.
Rather than being given the derivatives of the target function, it computes them itself from the domain theory.
The value of $\mu$ is chosen independently for each training example, based on how accurately the domain theory predicts that example.
EBNN represents the domain theory as a collection of neural networks.
It then learns the target function as another network.
EBNN Example
There is one network for each of the Horn clauses in the domain theory.
EBNN uses the top network to calculate the partial derivative of its
prediction with respect to each feature of the instance (i.e., how much the
output changes as BottomIsFlat is tweaked).
These derivatives are passed to the bottom network, which is trained with a
variation of TangentProp.
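That flow can be sketched as follows. The domain-theory network here is a stand-in function; its slope at each training point, extracted by finite differences, becomes the derivative target for TangentProp-style training of the target network. All names and data are invented for illustration.

```python
def theory_net(x):
    return 2.0 * x + 0.5          # stand-in for a learned domain-theory network

def theory_slope(x, h=1e-4):
    """EBNN extracts d(prediction)/d(feature); here via finite differences."""
    return (theory_net(x + h) - theory_net(x - h)) / (2 * h)

def train(data, mu=1.0, lr=0.05, steps=2000):
    w1, w0 = 0.0, 0.0             # target model f(x) = w1*x + w0
    for _ in range(steps):
        for x, y in data:
            err = w1 * x + w0 - y
            d = theory_slope(x)   # derivative hint taken from the domain theory
            # TangentProp-style update: data error plus derivative penalty
            w1 -= lr * (2 * err * x + 2 * mu * (w1 - d))
            w0 -= lr * (2 * err)
    return w1, w0
```

Because the derivative targets come from the (imperfect) domain theory rather than from the user, EBNN can trade off the two terms per example, leaning on the theory most where it predicts the training values well.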
EBNN Summary
FOCL
The goal of FOCL, like FOIL, is to create a rule, in terms of the extensionally
defined predicates, that covers all of the positive examples and none of the
negative examples. Unlike FOIL, FOCL integrates background knowledge
and EBL methods, which leads to a much more efficient search of the
hypothesis space that fits the training data. (As shown in Fig 3)
FOCL: Intuition
Like FOIL, FOCL performs an iterative process: it learns the best rule
covering the training examples and then removes all the training examples
covered by that rule (a sequential covering algorithm).
However, what makes the FOCL algorithm more powerful is the approach
it adopts while searching for that best rule.
A literal is called operational if it can describe the training example directly,
leading to the output hypothesis. In contrast, literals that occur only as
intermediate features in the domain theory, but not as primitive attributes of
the instances, are considered non-operational. In FOCL, non-operational
predicates are evaluated in the same manner as operational predicates.
Algorithm Involved:
Step 1 – As in FOIL, add a single feature for each operational literal that is
not yet part of the hypothesis, so as to create candidate specializations of
the best rule.
(solid arrows in Fig.4 denote specializations of bottle)
Step 2 – Create an operational literal that is logically sufficient to explain the
goal concept according to the domain theory.
(dashed arrows in Fig.4 denote domain-theory-based specializations of
bottle)
Step 3 – Add this set of literals to the current preconditions of the hypothesis.
Step 4 – Remove all those preconditions of the hypothesis that are
unnecessary according to the training data.
Let us consider the example shown in Fig 4.
First, FOCL creates all the candidate literals that could become part of
the best rule (all denoted by solid arrows), just as in the FOIL
algorithm. In addition, it creates several logically relevant candidate
literals of its own, derived from the domain theory.
Then, it selects a clause from the domain theory whose head matches
the goal concept. If several such clauses are present, it selects the
one that provides the most information about the goal concept.
For example,
If the bottle (the goal concept) is made of steel (while satisfying
the other domain-theory preconditions),
then the algorithm will select that clause, as it carries the most
relevant information about the goal concept,
i.e. the bottle.
Now, each of the resulting literals is removed unless doing so reduces the
classification accuracy over the training examples. This is done so that the
domain theory does not over-specialize the result by adding irrelevant
literals. The remaining set of literals is then added to the preconditions of
the current hypothesis.
Finally, the candidate literal that provides the maximum information
gain is selected from the two specialization methods (FOIL and the domain
theory).
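The search just described can be sketched in a toy propositional form (real FOCL is first-order; the examples, attributes and operationalized domain theory below are all invented). Candidate specializations come both from single FOIL-style literals and from the domain theory's expansion of the goal; the one with the highest FOIL information gain is chosen, and literals that can be dropped without covering any negative example are pruned.

```python
import math

# Each example is the set of attributes true of it.
POS = [{"BottomIsFlat", "HasConcavity", "ConcavityPointsUp", "Light"},
       {"BottomIsFlat", "HasConcavity", "ConcavityPointsUp", "Light", "HasHandle"}]
NEG = [{"BottomIsFlat", "Light", "Expensive"},
       {"HasConcavity", "ConcavityPointsUp", "Light"}]
ATTRS = sorted(set().union(*POS, *NEG))       # FOIL-style candidate literals
# Domain theory, already operationalized: the goal unfolds to this conjunction.
THEORY = {"BottomIsFlat", "HasConcavity", "ConcavityPointsUp", "Light"}

def n_covered(rule, exs):
    return sum(rule <= e for e in exs)        # rule = set of required attributes

def gain(rule, add):
    """FOIL information gain of specializing `rule` with the literal set `add`."""
    p0, n0 = n_covered(rule, POS), n_covered(rule, NEG)
    new = rule | add
    p1, n1 = n_covered(new, POS), n_covered(new, NEG)
    if p1 == 0:
        return -1.0
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

def learn_rule():
    rule = set()
    while n_covered(rule, NEG) > 0:
        # candidates: single FOIL literals plus the domain-theory expansion
        cands = [{a} for a in ATTRS if a not in rule] + [THEORY - rule]
        rule |= max(cands, key=lambda c: gain(rule, c))
    # step 4: prune literals whose removal lets in no negative example
    for lit in sorted(rule):
        if n_covered(rule - {lit}, NEG) == 0:
            rule -= {lit}
    return rule
```

On this data the domain-theory expansion wins the gain comparison (no single literal separates both negatives), and pruning then strips the theory literals the training data does not actually need, which is exactly the over-specialization guard described above.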
FOCL is a powerful machine learning algorithm that uses EBL and domain-theory
techniques to search the hypothesis space quickly and efficiently. It
has shown more accurate results than the inductive FOIL algorithm. A study
on "Legal Chessboard Positions" showed that on 60 training examples
describing 30 legal and 30 illegal endgame board positions, FOIL's accuracy
was about 86% while that of FOCL was about 94%.