ML Unit 5

The document discusses analytical learning in machine learning, focusing on inductive and deductive learning methods, particularly emphasizing explanation-based learning (EBL) and the PROLOG-EBG algorithm. It outlines the importance of domain theory in improving learning performance and details various algorithms like KBANN, TANGENTPROP, and EBNN that integrate prior knowledge with training data. The document also compares inductive and analytical approaches, highlighting their goals, justifications, and the challenges faced in combining both methods for effective learning.

MACHINE LEARNING



UNIT - V



ANALYTICAL LEARNING
• Inductive Learning:
• Based on patterns in the data (any missing data is filled in by assumption during classification).
• Ex: decision tree learning, neural networks, ...
• Only training examples are given for learning.
• Deductive Learning: not based on patterns.


ANALYTICAL LEARNING
• Uses prior knowledge and deductive reasoning.
• Reasoning is based on facts drawn from past data / past experience.
• Also called explanation-based learning (along with the training examples, an explanation is also given).
• More effective than inductive learning when data is missing, because the explanation compensates for the missing examples.
• Ex: Chess game
• Target Concept: "chessboard positions in which black will lose its queen within two moves."

• Prior Knowledge: Legal Moves of chess


• In analytical learning, the input to the learner
includes the same hypothesis space H and
training examples D as for inductive learning.
• In addition, the learner is given a domain theory B, consisting of background knowledge that can be used to explain the observed training examples.
• The desired output of the learner is a hypothesis
h from H that is consistent with both the training
examples D and the domain theory B.
Learning with Perfect Domain Theory
PROLOG-EBG
• Analytical learning is explanation-based learning; it relies on a domain theory (DT), i.e., knowledge in a specific field.
• Ex: mathematics is a domain; science is a domain.
• The domain theory is assumed to be correct and complete:
• -- correct: each assertion made by the DT is true.
• -- complete: it covers each and every positive example.
• Need for a domain theory:
• 1. Improved performance.
• 2. A perfect domain theory is difficult to achieve in practice.
Ex: PROLOG-EBG [Prolog: programming in logic; EBG: explanation-based generalization]
Mainly based on sequential covering and Horn clauses.
3 steps:
- Explain: explain each positive training example using the domain theory.
- Analyze: determine whether the explanation given is correct and suitable to our condition.
- Refine: add Horn clauses / generalizations in order to obtain a pure hypothesis.
Illustrative example: SafeToStack
SafeToStack
LEARNING WITH PERFECT DOMAIN
THEORIES: PROLOG-EBG
• A domain theory is said to be correct if each of
its assertions is a truthful statement about the
world.
• A domain theory is said to be complete with
respect to a given target concept and instance
space, if the domain theory covers every
positive example in the instance space.
PROLOG-EBG
• PROLOG-EBG algorithm is a sequential covering algorithm
that considers the training data incrementally.
• For each new positive training example that is not yet
covered by a learned Horn clause, it forms a new Horn
clause by:
(1) Explaining the training example.
(2) Analyzing this explanation to determine an appropriate
generalization.
(3) Refining the current hypothesis by adding a new Horn
clause rule to cover this positive example, as well as other
similar instances.
• PROLOG-EBG computes the most general rule by
computing the weakest preimage of the explanation.
• The weakest preimage of a conclusion C with respect to a
proof P is the most general set of initial assertions A, such
that A entails C according to P.
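
A minimal Python sketch of this outer loop follows. The explain and analyze helpers are hypothetical placeholders for the explanation and weakest-preimage steps (a regression sketch appears after the discussion below), and instances are assumed to be sets of ground literals; this is an illustration, not the full algorithm.

def prolog_ebg(target, examples, domain_theory, explain, analyze):
    # Learned hypothesis: a list of Horn clauses, each (head, body).
    learned = []

    def covered(x):
        # An instance is covered if some learned clause body holds in it.
        return any(all(lit in x for lit in body) for _, body in learned)

    for x, label in examples:                      # data considered incrementally
        if label and not covered(x):               # uncovered positive example
            proof = explain(x, domain_theory)      # (1) explain the example
            preimage = analyze(proof)              # (2) weakest preimage
            learned.append((target, preimage))     # (3) refine the hypothesis
    return learned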

• PROLOG-EBG computes the weakest preimage of the target concept with respect to the explanation, using a general procedure called regression.
• The regression procedure operates on a domain theory
represented by an arbitrary set of Horn clauses.
• Regression works iteratively backward through the explanation,
first computing the weakest preimage of the target concept
with respect to the final proof step in the explanation, then
computing the weakest preimage of the resulting expressions
with respect to the preceding step, and so on.
• The procedure terminates when it has iterated over all steps in
the explanation, yielding the weakest precondition of the target
concept with respect to the literals at the leaf nodes of the
explanation.
• The heart of the regression procedure is the algorithm that at
each step regresses the current frontier of expressions through
a single Horn clause from the domain theory.
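
The following is a minimal Python sketch of a single regression step through one Horn clause, under an assumed encoding: a literal is a tuple such as ('SafeToStack', 'x', 'y'), lowercase terms are variables, and the unification is deliberately simplified (no occurs check, no chained substitutions). It is an illustration, not Mitchell's full REGRESS procedure.

def is_var(t):
    # Convention for this sketch: lowercase strings are variables.
    return isinstance(t, str) and t[0].islower()

def unify(a, b, subst):
    # Unify two literals, extending the substitution; None on failure.
    if a[0] != b[0] or len(a) != len(b):
        return None
    subst = dict(subst)
    for ta, tb in zip(a[1:], b[1:]):
        ta, tb = subst.get(ta, ta), subst.get(tb, tb)
        if ta == tb:
            continue
        if is_var(ta):
            subst[ta] = tb
        elif is_var(tb):
            subst[tb] = ta
        else:
            return None                  # two distinct constants: clash
    return subst

def substitute(lit, subst):
    return (lit[0],) + tuple(subst.get(t, t) for t in lit[1:])

def regress(frontier, head, body):
    # Replace the frontier literal that unifies with the clause head by the
    # substituted clause body: the weakest preimage through this proof step.
    for lit in frontier:
        subst = unify(lit, head, {})
        if subst is not None:
            rest = [substitute(l, subst) for l in frontier if l is not lit]
            return rest + [substitute(l, subst) for l in body]
    return frontier                      # clause does not apply here

# Regress SafeToStack(u, v) through SafeToStack(x, y) :- Lighter(x, y):
print(regress([('SafeToStack', 'u', 'v')],
              ('SafeToStack', 'x', 'y'), [('Lighter', 'x', 'y')]))
# -> [('Lighter', 'x', 'y')], a variable-renamed form of Lighter(u, v)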
PROLOG-EBG
Remarks on EB Learning
• EBL as theory-guided generalization of examples: explanations are used to distinguish relevant from irrelevant features.
• EBL as example-guided reformulation of theories: examples are used to focus on which reformulations to make in order to produce operational concepts.
• EBL as knowledge compilation: explanations that are particularly useful for explaining the training examples are compiled out to improve efficiency.
EBL of Search Control Knowledge
• The definitions of the legal search operators provide a correct and complete domain theory for learning search control knowledge (i.e., how to choose moves toward the goal state).
• A suitable target concept must be chosen; it depends on the internal structure of the problem solver.
• PRODIGY (a domain-independent planning system) accepts the definition of a problem domain in terms of a state space S and operators O.
• Example of a learned control rule (sketched in code below):
If one subgoal to be solved is On(x, y), and
one subgoal to be solved is On(y, z),
Then solve the subgoal On(y, z) before On(x, y).
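
A minimal Python sketch of this control rule, under a hypothetical tuple encoding of On subgoals; it is illustrative only, not PRODIGY's actual rule language.

def order_subgoals(subgoals):
    # subgoals: list of ('On', top, bottom) tuples. Returns them ordered so
    # that On(y, z) is solved before On(x, y): place the lower block first.
    def height_above(goal):
        # Count how many On-subgoals stack (transitively) on this goal's top.
        _, top, _ = goal
        above = [g for g in subgoals if g[2] == top]
        return 1 + max((height_above(g) for g in above), default=0)
    # Goals with taller stacks above them sit lower; solve those first.
    return sorted(subgoals, key=height_above, reverse=True)

print(order_subgoals([('On', 'x', 'y'), ('On', 'y', 'z')]))
# -> [('On', 'y', 'z'), ('On', 'x', 'y')]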
COMBINING INDUCTIVE AND
ANALYTICAL LEARNING
• Inductive methods, such as decision tree
induction and neural network
BACKPROPAGATION, seek general hypotheses
that fit the observed training data.
• Analytical methods, such as PROLOG-EBG, seek
general hypotheses that fit prior knowledge
while covering the observed data.
• Inductive methods give a statistical justification.
• Analytical methods give a logical justification.
                Inductive Learning               Analytical Learning
Goal            Hypothesis fits data             Hypothesis fits domain theory
Justification   Statistical inference            Logical inference
Advantages      Requires little prior knowledge  Learns from scarce data
Pitfalls        Scarce data, incorrect bias      Imperfect domain theory
INDUCTIVE-ANALYTICAL APPROACHES TO
LEARNING
• The learning problem
Given:
A set of training examples D, possibly containing
errors
A domain theory B, possibly containing errors
A space of candidate hypotheses H
Determine:
A hypothesis that best fits the training examples
and domain theory
• There are two approaches to finding the best-fit hypothesis.
• 1. Define and minimize errorD(h) and errorB(h):
• errorD(h) is defined to be the proportion of examples from D that are misclassified by h.
• errorB(h) is the error of h with respect to the domain theory B: the probability that h will disagree with B on the classification of a randomly drawn instance.
• We could then require the hypothesis that minimizes some combined measure of these errors, e.g., argmin_h [ kD errorD(h) + kB errorB(h) ].
• It is not clear what values to assign to the weights kB and kD to specify the relative importance of fitting the data versus fitting the theory.
• If we have a poor theory and a great deal of reliable data, it is best to weight errorD(h) more heavily.
• Given a strong theory and a small sample of very noisy data, the best results are obtained by weighting errorB(h) more heavily.
• Note: since the learner does not know in advance the quality of the domain theory or the training data, it is unclear how it should weight these two error components (a sketch of the combined measure follows).
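
A minimal Python sketch of this combined measure; the weights k_D, k_B and the sampling-based estimate of errorB are illustrative assumptions, not a prescribed method.

def error_D(h, examples):
    # Proportion of labeled examples (x, y) that h misclassifies.
    return sum(h(x) != y for x, y in examples) / len(examples)

def error_B(h, B, draw_instance, n=1000):
    # Estimated probability that h disagrees with the domain theory B
    # on a randomly drawn instance.
    return sum(h(x) != B(x) for x in (draw_instance() for _ in range(n))) / n

def best_hypothesis(H, examples, B, draw_instance, k_D=1.0, k_B=0.3):
    # argmin over h of  k_D * error_D(h) + k_B * error_B(h)
    return min(H, key=lambda h: k_D * error_D(h, examples)
                              + k_B * error_B(h, B, draw_instance))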
Bayes theorem perspective
• The second approach to finding the best-fit hypothesis is Bayesian.
• Bayes theorem computes the posterior probability P(h|D) from the observed data D together with prior knowledge in the form of P(h), P(D), and P(D|h).
• The Bayesian view is that one should simply choose the hypothesis whose posterior probability is greatest, and that Bayes theorem provides the proper method for weighting the contribution of this prior knowledge and the observed data (see the sketch below).
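
A minimal sketch of the Bayesian (MAP) choice just described: pick the h maximizing P(D|h) P(h), computed in log space. The prior and likelihood functions are assumed to be supplied by the user; they are where the domain theory and the data enter.

import math

def map_hypothesis(hypotheses, prior, log_likelihood, data):
    # prior(h) = P(h) encodes the background knowledge;
    # log_likelihood(h, data) = log P(D|h) measures fit to the data.
    # P(D) is constant across h, so it can be ignored in the argmax.
    return max(hypotheses,
               key=lambda h: math.log(prior(h)) + log_likelihood(h, data))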
• When these quantities are only imperfectly known, Bayes theorem alone does not prescribe how to combine them with the observed data.
• In that case the learning problem again becomes minimizing some combined measure of the error of the hypothesis over the data and the domain theory.


• Hypothesis Space Search
i) Use prior knowledge to derive an initial hypothesis from which to begin the search:
In this approach the domain theory B is used to construct an initial hypothesis h0 that is consistent with B.
Ex: KBANN
KBANN uses prior knowledge to design the interconnections and weights of an initial network, so that this initial network is perfectly consistent with the given domain theory. The initial network hypothesis is then refined inductively, using the BACKPROPAGATION algorithm and the available data. Starting from a hypothesis consistent with the domain theory makes the final output hypothesis more likely to fit this theory (the initialization step is sketched below).
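
A minimal Python sketch of KBANN's initialization step, under a hypothetical clause encoding: ('Cup', ['Stable', 'Liftable', 'OpenVessel']) stands for Cup :- Stable, Liftable, OpenVessel, and a '~' prefix marks a negated antecedent. The low-weight links KBANN also adds from all other features (so backpropagation can later revise the theory) are omitted.

W = 4.0   # a "large" weight, so sigmoid units initially act like logic gates

def kbann_init(clauses):
    # Returns {unit_name: (input_weights, bias)} for an initial network whose
    # behavior matches the propositional domain theory.
    units = {}
    for head, body in clauses:
        weights, n_positive = {}, 0
        for lit in body:
            if lit.startswith('~'):
                weights[lit[1:]] = -W      # negated antecedent
            else:
                weights[lit] = W           # positive antecedent
                n_positive += 1
        # Unit fires only when all positive antecedents are true and no
        # negated one is: threshold set just below n_positive * W.
        units[head] = (weights, -(n_positive - 0.5) * W)
    return units

theory = [('Cup', ['Stable', 'Liftable', 'OpenVessel']),
          ('Liftable', ['Graspable', '~Heavy'])]
print(kbann_init(theory))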
• Other approaches to hypothesis space search:
ii) Use prior knowledge to alter the objective of the hypothesis space search (TANGENTPROP, EBNN).
iii) Use prior knowledge to alter the available search steps (FOCL).


KBANN Algorithm
• Learning Task: Identify Cup
Limitations of KBANN
• It accommodates only propositional domain theories (collections of variable-free Horn clauses).
• It can be misled by highly inaccurate domain theories (generalization accuracy can deteriorate below the level of plain BACKPROPAGATION).
The TangentProp Algorithm
• TANGENTPROP accommodates domain knowledge
expressed as derivatives of the target function with
respect to transformations of its inputs.
• The prior knowledge is incorporated into the error criterion minimized by gradient descent, so that the network must fit a combined function of the training data and the domain theory.
Ex: handwritten character recognition
• The TANGENTPROP algorithm trains a neural network to fit both training values and training derivatives.
• Each training example consists of a pair (x_i, f(x_i)) [instance, training value].
• TANGENTPROP assumes that various training derivatives of the target function are also provided.
• If each instance x_i is described by a single real value, then each training example may be of the form

\langle x_i, f(x_i), \frac{\partial f(x)}{\partial x}\Big|_{x = x_i} \rangle
[Figure: (1) an instance x and its training data; (2) the function learned by BACKPROPAGATION, which smoothly interpolates the training points; (3) the function learned by TANGENTPROP from the 3-tuple training data. Fitting slopes as well as values gives more accurate results than BACKPROPAGATION alone.]



• The BACKPROPAGATION algorithm performs gradient descent to attempt to minimize the sum of squared errors

E = \sum_i \big( f(x_i) - \hat{f}(x_i) \big)^2

where f(x_i) is the true target function value and \hat{f}(x_i) is the value predicted by the learned neural network.
• The modified error function in TANGENTPROP is

E = \sum_i \Big[ \big( f(x_i) - \hat{f}(x_i) \big)^2 + \mu \sum_j \Big( \frac{\partial f(s_j(\alpha, x_i))}{\partial \alpha} - \frac{\partial \hat{f}(s_j(\alpha, x_i))}{\partial \alpha} \Big)^2_{\alpha = 0} \Big]

where s_j(\alpha, x) denotes the j-th transformation of input x by the continuous parameter \alpha; the first derivative in the inner sum is the training derivative supplied as prior knowledge, and the second is the corresponding derivative of the learned network.
• Providing training derivatives for both rotation and translation transformations asserts both rotational invariance and translational invariance of the character identity.
• TANGENTPROP uses prior knowledge in the form of
desired derivatives of the target function with respect to
transformations of its inputs.
• Combines this prior knowledge with observed training
data, by minimizing an objective function that measures
both the network's error with respect to the training
example values (fitting the data) and its error with respect
to the desired derivatives (fitting the prior knowledge).
• The value of μ determines the degree to which the
network will fit one or the other of these two components
in the total error.
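
A minimal NumPy sketch of the TANGENTPROP objective above. The network f_hat, the transformations s_j, and the training derivatives are placeholder assumptions, and the network's derivative with respect to alpha is approximated by a central finite difference rather than the analytic propagation the real algorithm uses.

import numpy as np

def d_dalpha(f_hat, s, x, eps=1e-4):
    # Finite-difference estimate of d f_hat(s(alpha, x)) / d alpha at alpha=0.
    return (f_hat(s(+eps, x)) - f_hat(s(-eps, x))) / (2 * eps)

def tangentprop_loss(f_hat, examples, transforms, mu=0.1):
    # examples: list of (x, f_x, {j: training derivative for transform j});
    # transforms: {j: s_j(alpha, x)}. Returns the combined error E.
    E = 0.0
    for x, f_x, train_derivs in examples:
        E += (f_x - f_hat(x)) ** 2                          # fit the values
        for j, s_j in transforms.items():                   # fit the slopes
            E += mu * (train_derivs[j] - d_dalpha(f_hat, s_j, x)) ** 2
    return E

# Toy usage: a 1-D "image", translation as the transformation, and a training
# derivative of 0 asserting translational invariance of the target.
f_hat = lambda x: float(np.tanh(x).sum())
shift = lambda alpha, x: x + alpha
print(tangentprop_loss(f_hat, [(np.array([0.2, -0.1]), 1.0, {0: 0.0})],
                       {0: shift}))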
EBNN (Explanation-Based Neural Network
learning) algorithm
• It builds on the TANGENTPROP algorithm in two significant
ways.
• First, instead of relying on the user to provide training
derivatives, EBNN computes training derivatives itself for each
observed training example.
• Second, EBNN addresses the issue of how to weight the
relative importance of the inductive and analytical
components of learning.
• The value of μ is chosen independently for each training
example, based on a heuristic that considers how accurately
the domain theory predicts the training value for this
particular example.
• The top portion of this figure depicts an EBNN domain theory for the
target function Cup, with each rectangular block representing a
distinct neural network in the domain theory.
• Some networks take the outputs of other networks as their inputs
(e.g., the rightmost network labeled Cup takes its inputs from the
outputs of the Stable, Liftable and OpenVessel networks).
• Thus, the networks that make up the domain theory can be chained
together to infer the target function value for the input instance, just
as Horn clauses might be chained together for this purpose.
• In general, these domain theory networks may be provided to the
learner by some external source, or they may be the result of
previous learning by the same system.
• EBNN makes use of these domain theory networks to learn the new target function. It does not alter the domain theory networks during this process.
• EBNN calculates the partial derivative of the domain theory's prediction with respect to each instance feature, yielding the set of derivatives

\Big( \frac{\partial A(x)}{\partial x^1}, \ldots, \frac{\partial A(x)}{\partial x^n} \Big)_{x = x_i}

• This set of derivatives is the gradient of the domain theory prediction function A(x) with respect to the input instance.
• EBNN uses a minor variant of the TANGENTPROP algorithm to train the target network to fit the following error function:

E = \sum_i \Big[ \big( f(x_i) - \hat{f}(x_i) \big)^2 + \mu_i \sum_j \Big( \frac{\partial A(x)}{\partial x^j} - \frac{\partial \hat{f}(x)}{\partial x^j} \Big)^2_{x = x_i} \Big]

• Here A(x_i) is the domain theory's prediction for input x_i, and x^j is the j-th component of the vector x.
• The weight \mu_i = 1 - |A(x_i) - f(x_i)| / c (with c a normalizing constant) is high exactly when the domain theory predicts the training value of x_i accurately (sketched below).
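
A minimal NumPy sketch of the two EBNN-specific quantities above: the training derivatives extracted from the domain theory network A, and the per-example weight mu_i. A is assumed to be a scalar-valued differentiable function; its gradient is approximated by finite differences here purely for brevity.

import numpy as np

def domain_theory_gradient(A, x, eps=1e-4):
    # Training derivatives: gradient of the domain theory prediction A at x.
    grad = np.zeros_like(x)
    for j in range(len(x)):
        e = np.zeros_like(x)
        e[j] = eps
        grad[j] = (A(x + e) - A(x - e)) / (2 * eps)
    return grad

def mu(A, x, f_x, c=1.0):
    # High when the domain theory predicts this example's training value
    # accurately, low otherwise (clamped to stay non-negative).
    return max(0.0, 1.0 - abs(A(x) - f_x) / c)

# Toy domain theory "network" and one training example.
A = lambda x: float(1 / (1 + np.exp(-x.sum())))
x_i, f_xi = np.array([0.5, -1.0]), 1.0
print(mu(A, x_i, f_xi), domain_theory_gradient(A, x_i))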
FOCL Algorithm
FOCL is an extension of the purely inductive FOIL
algorithm.
FOIL and FOCL learn a set of first-order Horn
clauses to cover the observed training examples.
Both employ a sequential covering algorithm
that learns a single Horn clause, removes the
positive examples covered by this new Horn
clause, and then iterates this procedure over the
remaining training examples.



• In FOIL and FOCL, a new Horn clause is created by performing a general-to-specific search, beginning with the most general possible Horn clause (one whose preconditions are empty, so that its body is trivially true).
• Several candidate specializations of the current clause are then generated, and the specialization with the greatest information gain relative to the training examples is chosen (the gain computation is sketched below).



• FOIL generates each candidate specialization
by adding a single new literal to the clause
preconditions.
• FOCL uses this same method for producing
candidate specializations, but also generates
additional specializations based on the
domain theory.

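A minimal Python sketch of FOIL's information-gain criterion used to rank candidate specializations. The counts in the usage line are hypothetical, and t is approximated here by p1 (the positives still covered), a simplification of FOIL's binding-based count.

import math

def foil_gain(p0, n0, p1, n1):
    # FOIL_Gain = t * ( log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)) ),
    # where p0, n0 are positives/negatives covered before adding the literal
    # and p1, n1 after. Here t is approximated by p1.
    if p1 == 0:
        return float('-inf')             # candidate covers no positives
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

# A clause covering 10+ / 10- examples, specialized by a literal (such as
# HasHandle in the Cup example) that leaves 2+ / 3- covered:
print(foil_gain(10, 10, 2, 3))           # negative gain: a poor candidate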


• FOIL considers only the training data; FOCL considers both the training data (TD) and the domain theory (DT).
• The search proceeds from top to bottom, starting with the most general clause for Cup.
• A purely inductive candidate literal such as HasHandle might cover 2 positive and 3 negative examples (written 2+, 3-).
• Using the DT, FOCL also adds the theory's nonoperational definition (Cup :- Stable, Liftable, OpenVessel) as a candidate specialization, and then replaces the nonoperational literals with operational ones (e.g., BottomIsFlat).
