JU Ch9

The document provides an introduction to learning agents and inductive learning. It discusses how learning agents are composed of a performance element and a learning element. The learning element receives feedback from a critic to improve the performance element. Inductive learning involves finding a hypothesis function that approximates a target function based on examples from a training set. Decision tree learning is one successful inductive learning method that represents hypotheses as tree structures built from attribute tests. Performance is evaluated on a held-out test set.


Introduction to Artificial Intelligence (Comp551)

JIMMA UNIVERSITY
JIMMA INSTITUTE OF TECHNOLOGY
DEPARTMENT OF COMPUTING

CHAPTER NINE
LEARNING FROM OBSERVATIONS
Topics we will cover

 Learning agents
 Inductive learning
 Decision tree learning
Learning

Learning is essential for unknown environments,
 i.e., when the designer lacks omniscience
 (omniscience: all-knowing, with infinite knowledge).
 An agent is autonomous if its behavior is determined by its own experience.

Learning is useful as a system construction method,
 i.e., expose the agent to reality rather than trying to write everything down.

Learning modifies the agent's decision mechanisms to improve performance.
Learning agents

A learning agent can be divided into four conceptual components:
 The Performance Element can be replaced with any of the four agent types.
 The Learning Element is responsible for suggesting improvements to any part of the performance element.
 The input to the learning element comes from the Critic.
 The Problem Generator is responsible for suggesting actions that will result in new knowledge about the world being acquired.

Figure – The basic structure of a learning agent
Learning element

 Design of a learning element is affected by:
 Which components of the performance element should be modified?
 What feedback is available to guide the learning process?
 How is the performance element represented?
 Type of feedback:
 Depending on the answer to this question, we can define three types of learning process:
 Supervised learning: correct answers for each example.
• The agent learns a function from examples of its inputs and
outputs.
 Unsupervised learning: correct answers not given.
• The agent learns patterns in the inputs, without actually
knowing what the correct output should be.
 Reinforcement learning: occasional rewards.
• The outputs of the function are not explicitly specified, but the
agent may get an occasional reward for doing good things.
Inductive learning

 The simplest type of learning: inductive learning.
 Simplest form: learn a function from examples.
 f is the target function.
 An example is a pair (x, f(x)).

 Problem: find a hypothesis h
 such that h ≈ f (h is an approximation of f),
 given a training set of examples.

 This is a highly simplified model of real learning:
 Ignores prior knowledge.
 Assumes examples are given.
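The examples-and-hypothesis setup above can be sketched in a few lines of Python. The target function and the threshold hypothesis space here are invented for illustration, not taken from the slides:

```python
# Inductive learning sketch: given examples (x, f(x)), search a small
# hypothesis space for an h that is consistent with every example.

def consistent(h, examples):
    """h is consistent if it agrees with f on all examples."""
    return all(h(x) == y for x, y in examples)

# Training set: pairs (x, f(x)) for an unknown target f (here, x >= 3).
examples = [(0, False), (1, False), (2, False), (3, True), (4, True)]

# Hypothesis space: threshold functions h_t(x) = (x >= t) for t = 0..5.
hypotheses = [lambda x, t=t: x >= t for t in range(6)]

# Pick the first hypothesis consistent with the whole training set.
h = next(h for h in hypotheses if consistent(h, examples))
print(h(2), h(5))  # the learned h predicts False for 2, True for 5
```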
Inductive learning method (1)

 Construct/adjust h to agree with f on the training set.
 (h is consistent if it agrees with f on all examples.)
 E.g., curve fitting:
Inductive learning methods (2)–(5)

 Construct/adjust h to agree with f on the training set.
 (h is consistent if it agrees with f on all examples.)
 E.g., curve fitting: the figures on these slides (not reproduced here) show a sequence of increasingly complex candidate curves fitted to the same data points.
Inductive learning method (6)

 Construct/adjust h to agree with f on the training set.
 (h is consistent if it agrees with f on all examples.)
 E.g., curve fitting:
 Ockham's razor: prefer the simplest hypothesis consistent with the data.
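The curve-fitting trade-off can be made concrete with a small sketch. It uses pure-Python Lagrange interpolation, and the data points are invented for illustration: a high-degree polynomial is exactly consistent with the training set, while the straight line is the simpler hypothesis Ockham's razor prefers when it fits the data almost as well:

```python
# Curve fitting as hypothesis choice (illustrative sketch).

def lagrange(points):
    """Interpolating polynomial through every point: exactly consistent."""
    def h(x):
        total = 0.0
        for i, (xi, yi) in enumerate(points):
            term = yi
            for j, (xj, _) in enumerate(points):
                if i != j:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return h

# Training set sampled from f(x) = 2x, with one noisy point at x = 2.
data = [(0, 0.0), (1, 2.0), (2, 4.5), (3, 6.0), (4, 8.0)]

h_complex = lagrange(data)      # degree-4 polynomial, fits all 5 points
h_simple = lambda x: 2.0 * x    # simple linear hypothesis

print(round(h_complex(2), 2))   # 4.5 -- memorizes the noisy point
# Off the training inputs the complex hypothesis diverges badly:
print(round(h_complex(6), 2), h_simple(6))   # 34.5 vs 12.0
```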
Learning decision trees

 Decision tree learning is one of the simplest and most successful learning algorithms.
 Problem: Suppose we arrive at a restaurant intending to have dinner, but there is no free table. We must decide whether to wait for a table, based on the following attributes:
1. Alternate: is there an alternative restaurant nearby?
2. Bar: is there a comfortable bar area to wait in?
3. Fri/Sat: is today Friday or Saturday?
4. Hungry: are we hungry?
5. Patrons: number of people in the restaurant (None, Some, Full)
6. Price: price range ($, $$, $$$)
7. Raining: is it raining outside?
8. Reservation: have we made a reservation?
9. Type: kind of restaurant (French, Italian, Thai, Burger)
10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)
Attribute-based representations

 Examples are described by attribute values.
 E.g., situations where I will/won't wait for a table
 (the decision to wait or not to wait for a table at a restaurant).

Table – Training set for the restaurant learning problem

 The classification of each example is either positive (T) or negative (F).
Decision trees

 One possible representation for hypotheses.
 E.g., here is the “true” tree for deciding whether to wait
 (the decision to wait or not to wait for a table at a restaurant):

Figure – A decision tree for the restaurant problem
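As a sketch, a tree-structured hypothesis can be represented with nested dictionaries. The attribute names follow the slides, but this particular tree fragment and example are illustrative, since the figure itself is not reproduced here:

```python
# A decision tree as nested dicts: each internal node tests one attribute,
# each leaf is a final decision (True = wait, False = don't wait).
# This small tree is illustrative, not the full tree from the figure.

tree = {"Patrons": {
    "None": False,
    "Some": True,
    "Full": {"WaitEstimate": {
        ">60": False, "30-60": False, "10-30": True, "0-10": True,
    }},
}}

def classify(node, example):
    if isinstance(node, bool):                  # leaf: return the decision
        return node
    attribute, branches = next(iter(node.items()))
    return classify(branches[example[attribute]], example)

example = {"Patrons": "Full", "WaitEstimate": "30-60"}
print(classify(tree, example))  # False: full restaurant, 30-60 minute wait
```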


Decision Trees: Expressiveness

 Decision trees can express any function of the input attributes.
 E.g., for Boolean functions, each truth table row corresponds to a path from root to leaf.
 Trivially, there is a consistent decision tree for any training set, with one path to a leaf for each example, but it probably won't generalise to new examples.
 Prefer to find more compact decision trees.
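For instance, XOR of two Boolean attributes has one truth-table row per root-to-leaf path, and can be written directly as a tree of nested tests (a minimal sketch):

```python
# XOR(A, B) as a decision tree: the root tests A, each branch tests B,
# and the four leaves are the four truth-table outputs.

def xor_tree(a, b):
    if a:
        return False if b else True   # rows (T,T) -> F and (T,F) -> T
    else:
        return True if b else False   # rows (F,T) -> T and (F,F) -> F

for a in (False, True):
    for b in (False, True):
        print(a, b, xor_tree(a, b))  # agrees with A XOR B on all four rows
```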


Hypothesis Space of Decision Trees

 How many distinct decision trees are there with n Boolean attributes?
 = number of Boolean functions of n attributes
 = number of distinct truth tables with 2^n rows
 = 2^(2^n)
 E.g., with 6 Boolean attributes there are 2^64 = 18,446,744,073,709,551,616 trees.
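The count can be sanity-checked directly: each of the 2^n truth-table rows is labeled True or False independently, giving 2^(2^n) Boolean functions:

```python
# Number of Boolean functions of n attributes: 2^(2^n).
for n in range(1, 7):
    print(n, 2 ** (2 ** n))
# n = 6 gives 18446744073709551616, the figure quoted on the slide.
```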
Performance measurement

 How do we know that h ≈ f (h is an approximation of f)?
 1. Use theorems of computational/statistical learning theory.
 2. Try h on a new test set of examples.
 Learning curve = % correct on the test set as a function of training set size.

Figure – A sample learning curve
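A learning curve can be sketched as follows: a toy threshold learner (the data and learner are illustrative, not from the slides) is trained on growing prefixes of a training set and scored on a held-out test set:

```python
# Learning-curve sketch: % correct on the test set vs. training set size.
# Target function: f(x) = (x >= 5); examples are (x, f(x)) pairs.

train = [(7, True), (2, False), (0, False), (9, True), (4, False),
         (5, True), (1, False), (8, True), (3, False), (6, True)]
test = [(x, x >= 5) for x in range(10)]

def learn_threshold(examples):
    """Pick the threshold t whose hypothesis h(x) = (x >= t) fits best."""
    best = max(range(11), key=lambda t: sum((x >= t) == y for x, y in examples))
    return lambda x: x >= best

for m in (2, 4, 6, 10):
    h = learn_threshold(train[:m])
    accuracy = sum(h(x) == y for x, y in test) / len(test)
    print(m, f"{accuracy:.0%}")  # accuracy rises as the training set grows
```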


Summary

 Learning is needed for unknown environments (and for lazy designers).
 Learning agent = performance element + learning element.
 For supervised learning, the aim is to find a simple hypothesis approximately consistent with the training examples.
 Decision tree learning uses information gain to select attributes.
 Learning performance = prediction accuracy measured on a test set.
Exercise:

 Below is a training set for the problem of determining whether England will win the football world cup. There are two input attributes (one Boolean and one with three discrete values) and a Boolean outcome. Using the information theory approach to selecting attributes, construct a decision tree to represent this function.

 RooneyPlays?   Temperature?   EnglandWin?
 Yes            20-30°C        Yes
 Yes            <20°C          Yes
 No             <20°C          No
 No             20-30°C        No
 Yes            >30°C          No
 No             >30°C          No
Exercise Solution

 Computing the information gain of each attribute, the maximum gain comes from choosing the RooneyPlays attribute, so we make it the root of the tree.

 The negative (RooneyPlays = No) branch requires no further classification, as all three of its outcomes are No. The positive branch does require further classification, so we split it on the Temperature attribute, resulting in the final decision tree.
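The gain computation behind this choice can be checked with a short script (entropy measured in bits; the data are the six rows of the exercise table):

```python
# Information gain for the England world-cup exercise.
from math import log2
from collections import Counter

data = [  # (RooneyPlays, Temperature, EnglandWin)
    ("Yes", "20-30", "Yes"), ("Yes", "<20", "Yes"), ("No", "<20", "No"),
    ("No", "20-30", "No"), ("Yes", ">30", "No"), ("No", ">30", "No"),
]

def entropy(rows):
    """Entropy of the outcome (last field) over the given rows, in bits."""
    counts = Counter(r[-1] for r in rows)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def gain(rows, attr_index):
    """Information gain of splitting the rows on the given attribute."""
    remainder = 0.0
    for value in {r[attr_index] for r in rows}:
        subset = [r for r in rows if r[attr_index] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(rows) - remainder

print(round(gain(data, 0), 3))  # RooneyPlays: 0.459
print(round(gain(data, 1), 3))  # Temperature: 0.252
# RooneyPlays has the higher gain, so it becomes the root of the tree.
```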
