
Concept Learning

Tian-Li Yu

Taiwan Evolutionary Intelligence Laboratory (TEIL)


Department of Electrical Engineering
National Taiwan University
[email protected]

Readings: ML Chapter 2 (AIMA 19.1 & 19.2 cover a little)

This work is licensed by the entity(ies) for the use of NTU MOOCs ONLY.
The copyright belongs to Yu, Tian-Li.



Outline

1 Learning From Examples

2 Hypothesis

3 Find-S

4 Version space

5 Candidate Elimination

6 Inductive bias




Learning From Examples

Training Examples for EnjoySport.

Sky     Temp   Humid    Wind     Water   Forecast   EnjoySport

Sunny   Warm   Normal   Strong   Warm    Same       Yes
Sunny   Warm   High     Strong   Warm    Same       Yes
Rainy   Cold   High     Strong   Warm    Change     No
Sunny   Warm   High     Strong   Cool    Change     Yes

What is the general concept?




Prototypical Concept Learning Task

Given:
Instances X: possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast.
Target function c: EnjoySport : X → {0, 1}.
Hypotheses H: conjunctions of literals, e.g., ⟨?, Cold, High, ?, ?, ?⟩.
Training examples D: positive and negative examples of the target function: ⟨x1, c(x1)⟩, ..., ⟨xm, c(xm)⟩.
Determine:
A hypothesis h in H such that h(x) = c(x) for all x in D?
A hypothesis h in H such that h(x) = c(x) for all x in X?




Hypothesis

Many possible representations.

Here, h is a conjunction of constraints on attributes.
Each constraint can be
a specific value (Water = Warm),
don't care (Water = ?), or
the empty constraint (Water = φ), which no value satisfies.
For example, ⟨Sky, AirTemp, Humid, Wind, Water, Forecast⟩ =
⟨Sunny, ?, ?, Strong, ?, Same⟩.
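The snippet below (an illustration added here, not part of the original slides) encodes this representation in Python: a hypothesis is a tuple with one constraint per attribute, where a string is a required value, "?" means don't care, and None stands for the empty constraint φ. The encoding and the helper name satisfies are assumptions for illustration.

```python
# Sketch of the conjunctive-hypothesis representation (assumed encoding):
# a required value is a string, "?" means don't care, None stands for φ.

def satisfies(x, h):
    """h(x): True iff instance x satisfies every constraint of h.

    A None (φ) constraint equals no value, so any hypothesis
    containing φ classifies every instance as negative.
    """
    return all(c == "?" or c == v for c, v in zip(h, x))

h = ("Sunny", "?", "?", "Strong", "?", "Same")
x = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
print(satisfies(x, h))  # True
```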

The Inductive Learning Hypothesis


Any hypothesis found to approximate the target function well over a
sufficiently large set of training examples will also approximate the target
function well over other unobserved examples.




Instances and Hypotheses

x1 = ⟨Sunny, Warm, High, Strong, Cool, Same⟩
x2 = ⟨Sunny, Warm, High, Light, Warm, Same⟩

[Figure: instances X (left) and hypotheses H (right), arranged from specific to general; h2 is more general than both h1 and h3.]

h1 = ⟨Sunny, ?, ?, Strong, ?, ?⟩
h2 = ⟨Sunny, ?, ?, ?, ?, ?⟩
h3 = ⟨Sunny, ?, ?, ?, Cool, ?⟩

Models and “More General Than”

Definition: Model m(h)

The model m(h) of a hypothesis h is the set of instances x for which h(x) is true:
m(h) = {x | h(x) = true}

Definitions
h1 <g h2 iff m(h1) ⊂ m(h2).
h1 ≤g h2 iff m(h1) ⊆ m(h2).
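Because the EnjoySport instance space has only 96 days, the "more general than" relation can be checked by brute force, exactly as the subset definition reads. The following sketch (an added illustration) assumes the tuple encoding from the earlier snippet; the attribute domains follow the EnjoySport task.

```python
from itertools import product

# Brute-force check of h1 <=_g h2 via m(h1) ⊆ m(h2).
VALUES = [
    ("Sunny", "Cloudy", "Rainy"),   # Sky
    ("Warm", "Cold"),               # AirTemp
    ("Normal", "High"),             # Humidity
    ("Strong", "Light"),            # Wind
    ("Warm", "Cool"),               # Water
    ("Same", "Change"),             # Forecast
]
INSTANCES = list(product(*VALUES))  # |X| = 3*2*2*2*2*2 = 96

def satisfies(x, h):
    return all(c == "?" or c == v for c, v in zip(h, x))

def model(h):
    """m(h): all instances that h classifies as positive."""
    return {x for x in INSTANCES if satisfies(x, h)}

def more_general_or_equal(h2, h1):
    """True iff h1 <=_g h2, i.e., m(h1) ⊆ m(h2)."""
    return model(h1) <= model(h2)

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
print(more_general_or_equal(h2, h1))  # True: h2 is more general than h1
```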




Find-S Algorithm

Find-S
  Initialize h to the most specific hypothesis in H.
  For each positive training instance x:
      For each attribute constraint ai in h:
          If ai is not satisfied by x, replace ai in h by the next
          more general constraint that is satisfied by x.
  Output hypothesis h.
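A minimal runnable sketch of Find-S follows, assuming the tuple encoding used above (None for φ, "?" for don't care); the data are the four EnjoySport training examples, and the printed result matches h4 in the trace on the next slide.

```python
# Find-S for conjunctive hypotheses: start at <φ,...,φ> and minimally
# generalize on each positive example; negative examples are ignored.

def find_s(examples):
    h = [None] * len(examples[0][0])          # most specific hypothesis
    for x, positive in examples:
        if not positive:
            continue                          # Find-S skips negatives
        for i, (c, v) in enumerate(zip(h, x)):
            if c is None:                     # φ generalizes to the value
                h[i] = v
            elif c != "?" and c != v:         # conflict generalizes to ?
                h[i] = "?"
    return tuple(h)

D = [  # the EnjoySport training data
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
print(find_s(D))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```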




Hypothesis Space Search by Find-S

x1 = ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩: +
x2 = ⟨Sunny, Warm, High, Strong, Warm, Same⟩: +
x3 = ⟨Rainy, Cold, High, Strong, Warm, Change⟩: −
x4 = ⟨Sunny, Warm, High, Strong, Cool, Change⟩: +

[Figure: Find-S search through the hypothesis space, from the most specific h0 toward the more general h4; x1, x2, x4 are positive, x3 is negative.]

h0 = ⟨φ, φ, φ, φ, φ, φ⟩
h1 = ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩
h2 = h3 = ⟨Sunny, Warm, ?, Strong, Warm, Same⟩
h4 = ⟨Sunny, Warm, ?, Strong, ?, ?⟩




Properties of Find-S

For hypothesis spaces described by conjunctions of attribute constraints, Find-S is guaranteed to output the most specific hypothesis within the hypothesis space that is consistent with the positive training examples.
The output is also consistent with the negative training examples, provided the correct concept is in the hypothesis space.

Proof
c ∈ H ⇒ h ≤g c ⇒ m(h) ⊆ m(c)
c(x) = − ⇒ x ∉ m(c) ⇒ x ∉ m(h) ⇒ h(x) = −

Open questions:
Has the learner converged to the correct target concept?
Why prefer the most specific hypothesis?
Are the training examples consistent?
What if there are several maximally specific consistent hypotheses?

Version Space (VS)

Definition: Consistency
A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each ⟨x, c(x)⟩ in D.

Consistent(h, D) ≡ (∀⟨x, c(x)⟩ ∈ D) h(x) = c(x)

Definition: Version Space

The version space VS_{H,D}, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D.

VS_{H,D} ≡ {h ∈ H | Consistent(h, D)}


List-Then-Eliminate Algorithm

List-Then-Eliminate
  VS ← a set containing every hypothesis in H.
  For each training example ⟨x, c(x)⟩ ∈ D:
      Remove from VS any hypothesis h for which h(x) ≠ c(x).
  Output the list of hypotheses in VS.

List-Then-Eliminate outputs all hypotheses that are consistent with the examples.
In principle, it works for any finite hypothesis space.
However, its memory requirement is impractical: it must enumerate and store all of H.
We need a more compact representation of VS.
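For the tiny EnjoySport space the enumeration is actually feasible, which makes a useful sanity check. The sketch below (an added illustration, not from the slides) builds all 973 conjunctive hypotheses and filters them, recovering the six-member version space shown on the following slides.

```python
from itertools import product

# List-Then-Eliminate by brute force: enumerate H, then filter.
VALUES = [
    ("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
    ("Strong", "Light"), ("Warm", "Cool"), ("Same", "Change"),
]
# 4*3*3*3*3*3 = 972 conjunctions, plus the single all-φ hypothesis = 973.
H = list(product(*[vals + ("?",) for vals in VALUES])) + [(None,) * 6]

def satisfies(x, h):
    return all(c == "?" or c == v for c, v in zip(h, x))

def list_then_eliminate(H, examples):
    vs = list(H)
    for x, label in examples:
        vs = [h for h in vs if satisfies(x, h) == label]
    return vs

D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
vs = list_then_eliminate(H, D)
print(len(H), len(vs))  # 973 6
```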


Representing Version Spaces

Definition: General Boundary

The general boundary G of version space VS_{H,D} is the set of its maximally general members.

G ≡ {g ∈ H | Consistent(g, D) ∧ ¬(∃g′ ∈ H)[(g <g g′) ∧ Consistent(g′, D)]}

Definition: Specific Boundary

The specific boundary S of version space VS_{H,D} is the set of its maximally specific members.

S ≡ {s ∈ H | Consistent(s, D) ∧ ¬(∃s′ ∈ H)[(s′ <g s) ∧ Consistent(s′, D)]}


Representing Version Spaces

S: { ⟨Sunny, Warm, ?, Strong, ?, ?⟩ }

⟨Sunny, ?, ?, Strong, ?, ?⟩   ⟨?, Warm, ?, Strong, ?, ?⟩   ⟨Sunny, Warm, ?, ?, ?, ?⟩

G: { ⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩ }

Version Space Representation Theorem

Every member of the version space lies between these two boundaries:
VS_{H,D} = {h ∈ H | (∃s ∈ S)(∃g ∈ G) s ≤g h ≤g g}.


Candidate-Elimination Algorithm
  G ← the set of maximally general hypotheses in H.
  S ← the set of maximally specific hypotheses in H.
  For each training example d:
      If d is positive:
          Remove from G any hypothesis inconsistent with d.
          For each s ∈ S inconsistent with d:
              Remove s from S.
              Add to S all minimal generalizations h of s such that h is
              consistent with d and some member of G is more general than h.
          Remove from S any hypothesis that is more general than another
          hypothesis in S.
      If d is negative:
          Remove from S any hypothesis inconsistent with d.
          For each g ∈ G inconsistent with d:
              Remove g from G.
              Add to G all minimal specializations h of g such that h is
              consistent with d and some member of S is more specific than h.
          Remove from G any hypothesis that is more specific than another
          hypothesis in G.
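Below is a compact, hedged sketch of the algorithm for the conjunctive-hypothesis space; the generalization and specialization operators are specific to conjunctions, and the names and encoding are assumptions carried over from the earlier snippets. Run on the four EnjoySport examples, it reproduces the S and G boundaries traced on the next slides.

```python
def covers(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def mge(h2, h1):
    """h1 <=_g h2 for conjunctions (a hypothesis with φ covers nothing)."""
    if any(c is None for c in h1):
        return True
    return all(c2 == "?" or c2 == c1 for c2, c1 in zip(h2, h1))

def min_generalization(s, x):
    """The unique minimal conjunctive generalization of s that covers x."""
    return tuple(v if c is None else c if c == v else "?"
                 for c, v in zip(s, x))

def min_specializations(g, x, domains):
    """All minimal specializations of g that exclude the negative x."""
    for i, c in enumerate(g):
        if c == "?":
            for v in domains[i]:
                if v != x[i]:
                    yield g[:i] + (v,) + g[i + 1:]

def candidate_elimination(examples, domains):
    n = len(domains)
    S, G = {(None,) * n}, {("?",) * n}
    for x, positive in examples:
        if positive:
            G = {g for g in G if covers(g, x)}
            for s in [s for s in S if not covers(s, x)]:
                S.remove(s)
                h = min_generalization(s, x)
                if any(mge(g, h) for g in G):   # some g ∈ G more general
                    S.add(h)
            S = {s for s in S if not any(t != s and mge(s, t) for t in S)}
        else:
            S = {s for s in S if not covers(s, x)}
            for g in [g for g in G if covers(g, x)]:
                G.remove(g)
                for h in min_specializations(g, x, domains):
                    if any(mge(h, s) for s in S):  # some s ∈ S more specific
                        G.add(h)
            G = {g for g in G if not any(t != g and mge(t, g) for t in G)}
    return S, G

DOMAINS = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"),
           ("Normal", "High"), ("Strong", "Light"),
           ("Warm", "Cool"), ("Same", "Change")]
D = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
S, G = candidate_elimination(D, DOMAINS)
print(S)  # S4: {('Sunny', 'Warm', '?', 'Strong', '?', '?')}
print(G)  # G4: ('Sunny',?,?,?,?,?) and (?,'Warm',?,?,?,?), in some order
```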

Candidate Elimination Example

Training examples
⟨Sunny, Warm, Normal, Strong, Warm, Same⟩: +
⟨Sunny, Warm, High, Strong, Warm, Same⟩: +
⟨Rainy, Cold, High, Strong, Warm, Change⟩: −
⟨Sunny, Warm, High, Strong, Cool, Change⟩: +

Initially,
S0 = {⟨φ, φ, φ, φ, φ, φ⟩}
G0 = {⟨?, ?, ?, ?, ?, ?⟩}


Candidate Elimination Example

S0: { ⟨φ, φ, φ, φ, φ, φ⟩ }

S1: { ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩ }

S2: { ⟨Sunny, Warm, ?, Strong, Warm, Same⟩ }

G0, G1, G2: { ⟨?, ?, ?, ?, ?, ?⟩ }

Training examples:
1. ⟨Sunny, Warm, Normal, Strong, Warm, Same⟩, EnjoySport = Yes
2. ⟨Sunny, Warm, High, Strong, Warm, Same⟩, EnjoySport = Yes

Candidate Elimination Example

S2, S3: { ⟨Sunny, Warm, ?, Strong, Warm, Same⟩ }

G3: { ⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩, ⟨?, ?, ?, ?, ?, Same⟩ }

G2: { ⟨?, ?, ?, ?, ?, ?⟩ }

Training example:
3. ⟨Rainy, Cold, High, Strong, Warm, Change⟩, EnjoySport = No


Candidate Elimination Example

S3: { ⟨Sunny, Warm, ?, Strong, Warm, Same⟩ }

S4: { ⟨Sunny, Warm, ?, Strong, ?, ?⟩ }

G4: { ⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩ }

G3: { ⟨Sunny, ?, ?, ?, ?, ?⟩, ⟨?, Warm, ?, ?, ?, ?⟩, ⟨?, ?, ?, ?, ?, Same⟩ }

Training example:
4. ⟨Sunny, Warm, High, Strong, Cool, Change⟩, EnjoySport = Yes


Properties of Candidate Elimination

Does it converge to the correct concept?

Yes, if (1) there are no errors in the training examples and (2) some hypothesis in H correctly describes the target concept.
When either condition fails, given enough data, S and G eventually cross, yielding an empty VS.

What training example should the learner request next?
A good query should be classified as positive by some hypotheses in VS and negative by others.
An optimal query splits VS in half; then only ⌈lg |VS|⌉ such queries are needed to learn the exact concept, if one exists (see the worked example below).

How can partially learned concepts be used?
An instance is positive for the target concept if it satisfies every member of S.
An instance is negative for the target concept if it satisfies no member of G.
We may compute a confidence for other instances, given some prior.
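Worked example (derived from the version space on the Representing Version Spaces slide): there |VS| = 6, so ⌈lg 6⌉ = 3 optimal queries suffice. An instance such as ⟨Sunny, Warm, Normal, Strong, Cool, Change⟩ satisfies the sole member of S, hence every member of VS, and is classified positive; ⟨Rainy, Cold, Normal, Light, Warm, Same⟩ satisfies neither member of G, hence no member of VS, and is classified negative.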


Inductive Bias

We discussed the difficulty that arises when the target concept is not in the hypothesis space.
Why not use a hypothesis space that includes every possible hypothesis?
How does |H| affect the generalization ability of the learner?
How does |H| affect the number of training examples required?


Biased and Unbiased Hypothesis Space

Inductive bias of Candidate-Elimination:

The target concept is contained in the hypothesis space H.

Our previous conjunctive hypothesis space:

Contains only 4 × 3 × 3 × 3 × 3 × 3 + 1 = 973 concepts, so it is very biased.
⟨Sunny, Warm, Normal, Strong, Cool, Change⟩: +
⟨Cloudy, Warm, Normal, Strong, Cool, Change⟩: +
⟨Rainy, Warm, Normal, Strong, Cool, Change⟩: −
On these examples the algorithm finds zero consistent hypotheses, since the space cannot express disjunctions such as Sky = Sunny ∨ Sky = Cloudy.

Consider an unbiased hypothesis space.

3 × 2 × 2 × 2 × 2 × 2 = 96 instances in total ⇒ 2⁹⁶ ≈ 10²⁸ distinct target concepts.
We never have to worry about whether the target concept is in H.


Futility of Bias-Free Learning

There is no generalization in such a hypothesis space!

Positive examples: x1, x2, x3; negative: x4, x5.
S = {x1 ∨ x2 ∨ x3}.
G = {¬x4 ∧ ¬x5}.
Every unseen instance is classified positive by exactly half of the version-space members and negative by the other half, so majority voting conveys no information.

A learner that makes no prior assumptions regarding the identity of the target concept has no rational basis for classifying any unseen instances.
No-free-lunch theorem.
No bias, no learning.


Summary

Concept learning can be cast as a problem of searching through a large, predefined space of potential hypotheses.
The search exploits the general-to-specific partial ordering.
Find-S searches from specific to general and outputs the most specific consistent hypothesis.
Candidate-Elimination keeps the most general (G) and most specific (S) boundary sets. It shrinks VS during the search by relaxing S with positive examples and restricting G with negative ones.
Inductive learning algorithms are able to classify unseen data because of their implicit inductive bias.



References

Page   Image                                            Source/Author
6      Instances X and hypotheses H (ordering diagram)  Shao-Heng Ko
9      Hypothesis space search by Find-S (diagram)      Shao-Heng Ko
14     Version space with S and G boundaries (diagram)  Shao-Heng Ko
17     Candidate Elimination trace, examples 1 and 2    Shao-Heng Ko
18     Candidate Elimination trace, example 3           Shao-Heng Ko
19     Candidate Elimination trace, example 4           Shao-Heng Ko

All images above are licensed by the entity(ies) for the use of "NTU MOOCs" only; the copyright belongs to Shao-Heng Ko.