
Artificial Intelligence

Week 8
Knowledge in Learning
LEARNING OUTCOMES

At the end of this session, students will be able to:


LO2 Explain how to use knowledge representation for reasoning purposes
LO3 Apply various techniques to an agent when acting under certainty
OUTLINE

1. A Logical Formulation of Learning


2. Knowledge in Learning
3. Explanation Based Learning
4. Inductive Logic Programming
5. Passive and Active Reinforcement Learning
6. Generalization in Reinforcement Learning
7. Application of Reinforcement Learning
8. Summary
A LOGICAL FORMULATION OF LEARNING
o Study learning methods that can take advantage of prior knowledge
about the world. In most cases, the prior knowledge is represented
as general first-order logical theories; thus, for the first time we
bring together the work on knowledge representation and learning.
o The logical formulation of learning may seem like a lot of extra work
at first, but it turns out to clarify many of the issues in learning.
A LOGICAL FORMULATION OF LEARNING
• The hypothesis is represented by a set of logical sentences
• Example descriptions and classifications will also be logical
sentences.
• A new example can be classified by inferring a classification sentence
from the hypothesis and the example description
A LOGICAL FORMULATION OF LEARNING
o Goal and Hypotheses:
Goal predicate Q: WillWait
o Learning: find an equivalent logical expression with which we can classify examples
o Each hypothesis proposes such an expression, a candidate definition of Q:
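One plausible candidate definition in the restaurant (WillWait) domain, given purely as an illustration rather than as the slide's own formula:

∀r WillWait(r) ⇔ Patrons(r, Some) ∨ (Patrons(r, Full) ∧ Hungry(r) ∧ Type(r, Thai))

that is, a single logical expression whose extension is meant to agree with all of the observed examples.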
A LOGICAL FORMULATION OF LEARNING
o An example: an object of some logical description to which the goal concept may or may not apply
o The classification of the examples
o Each hypothesis hj has the form

∀x Q(x) ⇔ Cj(x)

where Cj(x) is a candidate definition
A LOGICAL FORMULATION OF LEARNING
o The relations between f and h are: ++, −−, +− (false negative), −+ (false positive)
o An example is a false negative for the hypothesis if the hypothesis says it should be negative but in fact it is positive. For instance, a restaurant that hr classifies as negative but that in fact turns out positive would be a false negative for the hypothesis hr.
A LOGICAL FORMULATION OF LEARNING
Current-best-hypothesis search (adjusting the extension of the current hypothesis hr):
keep the initial hypothesis; on a false negative, generalize it; on a false positive, specialize it.

Generalization, e.g. via dropping conditions:
Alternate(x) ∧ Patrons(x, Some)  →  Patrons(x, Some)
Specialization, e.g. via adding conditions or via removing disjuncts:
Patrons(x, Some)  →  Alternate(x) ∧ Patrons(x, Some)
A LOGICAL FORMULATION OF LEARNING
Current-best-hypothesis search

Drawbacks:
1. Checking all previous instances over again is expensive.
2. It is difficult to find good heuristics, and backtracking is slow in the hypothesis space (which is doubly exponential).
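A minimal, runnable sketch of the generalize/specialize loop described above, under an assumed toy representation (a hypothesis is a list of attribute-value conjunctions, interpreted disjunctively); the full CURRENT-BEST-LEARNING algorithm also considers dropping or adding individual conditions and backtracks over its choices:

```python
def predicts(hypothesis, example):
    """A hypothesis (list of conjunctions) classifies an example positive if any conjunction matches."""
    return any(all(example.get(a) == v for a, v in conj.items()) for conj in hypothesis)

def consistent(hypothesis, seen):
    return all(predicts(hypothesis, ex) == label for ex, label in seen)

def current_best_learning(examples):
    hypothesis, seen = [], []          # start with the always-negative hypothesis
    for example, label in examples:
        seen.append((example, label))
        if predicts(hypothesis, example) == label:
            continue
        if label:   # false negative -> generalize: add a disjunct covering the example
            candidate = hypothesis + [dict(example)]
        else:       # false positive -> specialize: drop the disjuncts that cover it
            candidate = [c for c in hypothesis
                         if not all(example.get(a) == v for a, v in c.items())]
        if consistent(candidate, seen):
            hypothesis = candidate
        else:       # a full implementation would backtrack here
            raise RuntimeError("no consistent refinement found")
    return hypothesis

# Toy restaurant-style data (attribute names are illustrative).
data = [({"patrons": "some", "hungry": True},  True),
        ({"patrons": "none", "hungry": True},  False),
        ({"patrons": "full", "hungry": False}, False)]
print(current_best_learning(data))     # [{'patrons': 'some', 'hungry': True}]
```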
A LOGICAL FORMULATION OF LEARNING
Current-best-hypothesis search
Least commitment:
Instead of keeping around one hypothesis and using backtracking, keep
all consistent hypotheses (and only those).

Incremental: old instances do not have to be rechecked


KNOWLEDGE IN LEARNING

o The preceding section described the simplest setting for inductive


learning. To understand the role of prior knowledge, we need to
talk about the logical relationships among hypotheses, example
descriptions, and classifications.
o Let Descriptions denote the conjunction of all the example
descriptions in the training set, and let Classifications denote the
conjunction of all the example Classifications. Then a Hypothesis
that "explains the observations" must satisfy the following property
(recall that |= means "logically entails"):

Hypothesis ∧ Descriptions |= Classifications
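As a small illustration (using the WillWait vocabulary of the earlier slides, not an example taken from this slide): if Hypothesis is ∀x WillWait(x) ⇔ Patrons(x, Some) and Descriptions contains Patrons(X1, Some), then the classification WillWait(X1) is logically entailed, so this hypothesis explains that observation.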


EXPLANATION BASED LEARNING
o Explanation-based learning is a method for extracting general rules from individual observations.
o The technique of memoization has long been used in computer science to speed up programs by saving the results of computation. The basic idea of memo functions is to accumulate a database of input-output pairs; when the function is called, it first checks the database to see whether it can avoid solving the problem from scratch (a minimal illustration follows below).
o Explanation-based learning takes this a good deal further, by creating general rules that cover an entire class of cases.
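A minimal illustration of the memo-function idea itself (not of EBL); the recursive function below is just an illustrative stand-in:

```python
from functools import lru_cache

@lru_cache(maxsize=None)          # cache input-output pairs; repeated calls become lookups
def fib(n: int) -> int:
    return n if n < 2 else fib(n - 1) + fib(n - 2)

print(fib(200))   # fast, because each subproblem is computed once and then reused
```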
EXPLANATION BASED LEARNING

The basic EBL process works as follows (a small worked illustration follows the steps):


o Given an example, construct a proof that the goal predicate applies to
the example using the available background knowledge.
o In parallel, construct a generalized proof tree for the variabilized goal
using the same inference steps as in the original proof.
o Construct a new rule whose left-hand side consists of the leaves of the
proof tree and whose right-hand side is the variabilized goal (after
applying the necessary bindings from the generalized proof).
o Drop any conditions from the left-hand side that are true regardless of
the values of the variables in the goal.
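As a small illustration (borrowing the family-tree vocabulary of the later ILP slides rather than the slide's own example): suppose the background knowledge contains Mother(x, y) ⇒ Parent(x, y) and Parent(x, z) ∧ Parent(z, y) ⇒ Grandparent(x, y), and we observe the example Grandparent(Mum, Charles) together with the facts Mother(Mum, Elizabeth) and Mother(Elizabeth, Charles). The proof of the example uses the two Mother facts as leaves; repeating the same proof with variables in place of the constants, and taking its leaves as the left-hand side, yields the general rule

Mother(x, z) ∧ Mother(z, y) ⇒ Grandparent(x, y)

which then handles every such case in one step, without re-deriving the intermediate Parent conclusions.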
LEARNING AND USING RELEVANCE
INFORMATION
o The learning algorithm we now present is based on a straightforward attempt to find the simplest determination consistent with the observations.
o A determination is consistent with a set of examples if every pair of examples that matches on the predicates on the left-hand side also matches on the goal predicate (a small consistency-check sketch follows below).
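A minimal sketch of that consistency test, assuming a propositional encoding in which each example is a dict of attribute values plus a goal value (the attribute names below are illustrative, not from the slides):

```python
from itertools import combinations

def determination_consistent(lhs_attrs, examples, goal="goal"):
    """True if every pair of examples agreeing on all lhs_attrs also agrees on the goal."""
    for e1, e2 in combinations(examples, 2):
        if all(e1[a] == e2[a] for a in lhs_attrs) and e1[goal] != e2[goal]:
            return False
    return True

# Toy data: nationality determines language, but first name does not.
examples = [
    {"name": "Ana",   "nationality": "BR", "goal": "Portuguese"},
    {"name": "Bruno", "nationality": "BR", "goal": "Portuguese"},
    {"name": "Ana",   "nationality": "DE", "goal": "German"},
]
print(determination_consistent(["nationality"], examples))  # True
print(determination_consistent(["name"], examples))         # False (the two Anas differ on the goal)
```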
INDUCTIVE LOGIC PROGRAMMING
o Inductive logic programming (ILP) combines inductive methods with
the power of first-order representations, concentrating in particular on
the representation of hypotheses as logic programs.

o It has gained popularity for three reasons:


1. ILP offers a rigorous approach to the general knowledge-based
inductive learning problem.
2. ILP offers complete algorithms for inducing general, first-order
theories from examples, which can therefore learn successfully in
domains where attribute-based algorithms are hard to apply.
3. Inductive logic programming produces hypotheses that are
(relatively) easy for humans to read.
INDUCTIVE LOGIC PROGRAMMING
o The general knowledge-based induction problem is to "solve" the entailment constraint
Background ∧ Hypothesis ∧ Descriptions |= Classifications
for the unknown Hypothesis, given the Background knowledge and examples described by Descriptions and Classifications.
o The descriptions will consist of an extended family tree, described in terms of Mother, Father, and Married relations and Male and Female properties.
INDUCTIVE LOGIC PROGRAMMING
o The sentences in Classifications depend on the target concept being
learned.
o For example: Grandparent, BrotherInLaw, or Ancestor
o The complete set of Grandparent classifications contains 20 × 20 = 400 conjuncts of the form
Grandparent(Mum, Charles), Grandparent(Elizabeth, Beatrice), . . . , ¬Grandparent(Mum, Harry), ¬Grandparent(Spencer, Peter), . . .
INDUCTIVE LOGIC PROGRAMMING
Hypothesis: without background knowledge, the definition of Grandparent must be expressed directly in terms of Mother and Father:
Grandparent(x,y) ⇔ [∃z Mother(x,z) ∧ Mother(z,y)] ∨ [∃z Mother(x,z) ∧ Father(z,y)] ∨ [∃z Father(x,z) ∧ Mother(z,y)] ∨ [∃z Father(x,z) ∧ Father(z,y)]
INDUCTIVE LOGIC PROGRAMMING

Decision-Tree-Learning
o To apply an attribute-based learner such as Decision-Tree-Learning, each example pair must be recast as an object with attributes, e.g.
o Grandparent(⟨Mum, Charles⟩) . . .
o FirstElementIsMotherOfElizabeth(⟨Mum, Charles⟩)

The reader will certainly have noticed that a little bit of background
knowledge would help in the representation of the Grandparent
definition. For example, if Background included the sentence
Parent(x,y) ⇔ [Mother(x,y)∨Father(x,y)],
then the definition of Grandparent would be reduced to
Grandparent(x,y) ⇔ [∃z Parent(x,z)∧Parent(z,y)]
INDUCTIVE LOGIC PROGRAMMING

Two principal approaches to ILP:


o Top-down inductive learning method: using a generalization of decision
tree methods
o Inductive learning with inverse deduction: using techniques based on
inverting a resolution proof
INDUCTIVE LOGIC PROGRAMMING

Top-down inductive learning method


Suppose we are trying to learn a definition of the Grandfather(x, y) predicate, starting from the clause with an empty body, ⇒ Grandfather(x, y).
Here are three potential additions (a simplified sketch of this top-down loop follows):
1. Father(x, y) ⇒ Grandfather(x, y)
2. Parent(x, z) ⇒ Grandfather(x, y)
3. Father(x, z) ⇒ Grandfather(x, y)
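A simplified, propositional sketch of that top-down loop (sequential covering with greedy literal selection); real FOIL works with first-order literals, introduces new variables such as z above, and chooses literals with an information-gain measure, all of which are elided here:

```python
def covers(clause, example):
    """A clause (conjunction of literal tests) covers an example if every test holds."""
    return all(test(example) for test in clause)

def learn_one_clause(pos, neg, literals):
    """Greedily add the literal that excludes the most still-covered negatives."""
    clause, neg = [], list(neg)
    while neg:
        best = max(literals, key=lambda lit: sum(not lit(e) for e in neg))
        if all(best(e) for e in neg):      # no literal makes progress; give up on this clause
            break
        clause.append(best)
        pos = [e for e in pos if best(e)]
        neg = [e for e in neg if best(e)]
    return clause, pos                      # the clause and the positives it covers

def top_down_learn(pos, neg, literals):
    """Outer loop: keep learning clauses until every positive example is covered."""
    clauses, uncovered = [], list(pos)
    while uncovered:
        clause, covered = learn_one_clause(uncovered, neg, literals)
        if not covered:                     # avoid looping forever on unlearnable data
            break
        clauses.append(clause)
        uncovered = [e for e in uncovered if e not in covered]
    return clauses

# Illustrative toy data (attribute names are assumptions, not from the slides).
positives = [{"age": 70, "has_grandchild": True}, {"age": 65, "has_grandchild": True}]
negatives = [{"age": 70, "has_grandchild": False}, {"age": 30, "has_grandchild": True}]
literals  = [lambda e: e["age"] >= 60, lambda e: e["has_grandchild"]]
print(len(top_down_learn(positives, negatives, literals)))   # 1 learned clause
```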
INDUCTIVE LOGIC PROGRAMMING

Inductive learning with inverse deduction


Inverse resolution is based on the observation that if the example Classifications follow from Background ∧ Hypothesis ∧ Descriptions, then one must be able to prove this fact by resolution (because resolution is complete). Inverse resolution runs such a proof backward, generating candidate hypotheses; the book illustrates this with the family tree example.
PASSIVE REINFORCEMENT LEARNING

o An autonomous agent should learn to choose optimal actions in each state to achieve its goals
o The agent learns how to achieve that goal by trial-and-error interactions with its environment
o In passive learning, the agent simply watches the world going by and tries to learn the utilities of being in various states
o In active learning, the agent does not simply watch, but also acts
PASSIVE REINFORCEMENT LEARNING

The agent’s policy π is fixed: in state s, it always executes the action π(s).
Its goal is simply to learn how good the policy is—to learn the utility
function Uπ(s).
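For reference, the utility of a state under the fixed policy π is the expected discounted sum of rewards obtained by executing π from that state (γ is the discount factor):

Uπ(s) = E[ Σ_{t=0..∞} γ^t R(S_t) ],  where S_0 = s and the expectation is over the state sequences generated by executing π.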
PASSIVE REINFORCEMENT LEARNING

o The transition model P(s′ | s, a) specifies the probability of reaching state s′ from state s after doing action a;
o R(s) is the reward function.
o The agent executes a set of trials in the environment using its policy π. In each trial, the agent starts in state (1,1) and experiences a sequence of state transitions until it reaches one of the terminal states, (4,2) or (4,3). Its percepts supply both the current state and the reward received in that state. Typical trials might look like this:
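A typical trial is a sequence of (state, reward) pairs; below is a minimal sketch of passive temporal-difference (TD) learning over such trials. The particular grid coordinates, the -0.04 step reward, and the fixed learning rate are assumptions in the spirit of the 4×3 world, and TD is only one of the passive methods covered:

```python
GAMMA = 1.0    # no discounting, as in the 4x3 example
ALPHA = 0.1    # fixed learning rate (a real agent would decay it over time)

def td_passive(trials):
    """Estimate U(s) for the fixed policy from observed trials of (state, reward) pairs."""
    U = {}
    for trial in trials:
        for (s, r), (s_next, _) in zip(trial, trial[1:]):
            U.setdefault(s, 0.0)
            U.setdefault(s_next, 0.0)
            # nudge U(s) toward the observed one-step return
            U[s] += ALPHA * (r + GAMMA * U[s_next] - U[s])
        terminal, reward = trial[-1]
        U[terminal] = reward          # a terminal state's utility is just its reward
    return U

# One trial of the kind described on the slide: start at (1,1), reach terminal (4,3).
trial = [((1, 1), -0.04), ((1, 2), -0.04), ((1, 3), -0.04),
         ((2, 3), -0.04), ((3, 3), -0.04), ((4, 3), +1.0)]
print(td_passive([trial]))
```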
ACTIVE REINFORCEMENT LEARNING

An active agent must consider:
o What actions to take?
o What their outcomes may be?

Update utility equation:
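Two standard forms of the update: a model-based (ADP) active agent updates utilities with the Bellman equation, while a model-free active agent uses the Q-learning update instead:

U(s) ← R(s) + γ max_a Σ_{s′} P(s′ | s, a) U(s′)

Q(s, a) ← Q(s, a) + α ( R(s) + γ max_{a′} Q(s′, a′) − Q(s, a) )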


APPLICATION OF REINFORCEMENT
LEARNING
Game Playing

1. Checkers program written by Arthur Samuel (1959, 1967)


Samuel first used a weighted linear function for the evaluation of
positions, using up to 16 terms at any one time
2. Backgammon program TD-GAMMON (1992)
The TD-GAMMON project was an attempt to learn from self-play
alone. The only reward signal was given at the end of each game.
TD-GAMMON learned to play considerably better than
NEUROGAMMON, even though the input representation contained
just the raw board position with no computed features. This took
about 200,000 training games and two weeks of computer time.
APPLICATION OF REINFORCEMENT
LEARNING
Robot Control

1. BOXES algorithm (Michie and Chambers 1968)


BOXES was implemented with a real cart and pole. The algorithm first discretized the four-dimensional state space into boxes. Negative reinforcement was associated with the final action in the final box and then propagated back through the sequence.
2. PEGASUS algorithm (Bagnell and Schneider, 2001)
Application of reinforcement learning to helicopter flight
SUMMARY
o The use of prior knowledge in learning leads to a picture of
cumulative learning, in which learning agents improve their
learning ability as they acquire more knowledge.
o Explanation-based learning (EBL) extracts general rules from single examples by explaining the examples and generalizing the explanation. It provides a deductive method for turning first-principles knowledge into useful, efficient, special-purpose expertise.
o Relevance-based learning (RBL) uses prior knowledge in the form of
determinations to identify the relevant attributes, thereby
generating a reduced hypothesis space and speeding up learning.
RBL also allows deductive generalizations from single examples.
SUMMARY
o Inductive logic programming (ILP) techniques perform knowledge-based inductive learning on knowledge that is expressed in first-order logic. ILP methods can learn relational knowledge that is not expressible in attribute-based systems.
o The overall agent design dictates the kind of information that must
be learned. The three main designs we covered were the model-
based design, using a model P and a utility function U ; the model-
free design, using an action-utility function Q; and the reflex
design, using a policy π.
o When the learning agent is responsible for selecting actions while it
learns, it must trade off the estimated value of those actions
against the potential for learning useful new information. An exact
solution of the exploration problem is infeasible, but some simple heuristics do a reasonable job.
REFERENCES

Stuart Russell and Peter Norvig. 2010. Artificial Intelligence: A Modern Approach. Pearson Education, New Jersey. ISBN 9780132071482. Chapter 19.
Knowledge in Learning and Human Learning: http://l3d.cs.colorado.edu/courses/AI-96/learning-2.pdf
Scaling Learning Algorithms towards AI: http://yann.lecun.com/exdb/publis/pdf/bengio-lecun-07.pdf
https://slideplayer.com/slide/15478257/
https://www.slideshare.net/ersaranya/reinforcement-learning-7313
Thank You...
