Lecture 3: Concept Learning
Training examples for the target concept EnjoySport:

Example  Sky    AirTemp  Humidity  Wind    Water  Forecast  EnjoySport
1        Sunny  Warm     Normal    Strong  Warm   Same      Yes
2        Sunny  Warm     High      Strong  Warm   Same      Yes
3        Rainy  Cold     High      Strong  Warm   Change    No
4        Sunny  Warm     High      Strong  Cool   Change    Yes
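For the sketches that follow, this table can be held in a plain Python structure; the names TRAIN and ATTRIBUTES and the tuple layout are illustrative choices, not part of the lecture:

# EnjoySport training data from the table above.
# Each instance is (Sky, AirTemp, Humidity, Wind, Water, Forecast);
# the label is True for EnjoySport = Yes and False for No.
ATTRIBUTES = ("Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast")

TRAIN = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]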
Representing Hypotheses

A hypothesis h is a conjunction of constraints on the attributes.

Example hypothesis h:
      Sky    Temp  Humid  Wind    Water  Forecast
h = <Sunny,   ?,    ?,    Strong,   ?,   Same>
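As a minimal sketch (using the string "?" for "any value" and None for the empty constraint is an assumption made here, not the lecture's notation), a hypothesis can be stored as a tuple and matched against an instance:

# A hypothesis is a tuple of constraints, one per attribute:
#   "?"       -> any value is acceptable
#   None      -> no value is acceptable (the empty constraint)
#   otherwise -> exactly this value is required
h = ("Sunny", "?", "?", "Strong", "?", "Same")

def covers(h, x):
    """Return True if hypothesis h classifies instance x as positive."""
    return all(c == "?" or c == v for c, v in zip(h, x))

print(covers(h, ("Sunny", "Warm", "High", "Strong", "Warm", "Same")))   # True
print(covers(h, ("Rainy", "Warm", "High", "Strong", "Warm", "Same")))   # False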
Number of Instances, Concepts, Hypotheses

- Sky: Sunny, Cloudy, Rainy
- AirTemp: Warm, Cold
- Humidity: Normal, High
- Wind: Strong, Weak
- Water: Warm, Cold
- Forecast: Same, Change

#distinct instances: 3*2*2*2*2*2 = 96
#distinct concepts: 2^96
#syntactically distinct hypotheses: 5*4*4*4*4*4 = 5120
#semantically distinct hypotheses: 1 + 4*3*3*3*3*3 = 973
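These counts can be verified directly; a small sketch (the per-attribute value counts are taken from the list above):

from math import prod

values = [3, 2, 2, 2, 2, 2]                   # Sky has 3 values, the others 2

instances = prod(values)                      # 3*2*2*2*2*2 = 96
concepts  = 2 ** instances                    # every subset of the instance space: 2^96
syntactic = prod(v + 2 for v in values)       # value, "?" or empty per attribute: 5*4*4*4*4*4
semantic  = 1 + prod(v + 1 for v in values)   # one all-empty hypothesis + value-or-"?" combinations

print(instances, syntactic, semantic)         # 96 5120 973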
Hypotheses

Figure: instances x1, x2 and hypotheses h1, h2, h3 arranged along the specific-to-general ordering; h2 sits above both h1 and h3.

x1 = <Sunny, Warm, High, Strong, Cool, Same>
x2 = <Sunny, Warm, High, Light, Warm, Same>

h1 = <Sunny, ?, ?, Strong, ?, ?>   (h1 is a minimal specialization of h2)
h2 = <Sunny, ?, ?, ?, ?, ?>        (h2 is a minimal generalization of h1)
h3 = <Sunny, ?, ?, ?, Cool, ?>
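The ordering can be sketched as a pairwise test on the tuple encoding assumed earlier (the function name more_general_or_equal is mine):

def more_general_or_equal(h_g, h_s):
    """True if hypothesis h_g is more general than or equal to h_s."""
    return all(cg == "?" or cg == cs or cs is None for cg, cs in zip(h_g, h_s))

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
h3 = ("Sunny", "?", "?", "?", "Cool", "?")

print(more_general_or_equal(h2, h1))   # True: h2 is more general than h1
print(more_general_or_equal(h2, h3))   # True: h2 is more general than h3
print(more_general_or_equal(h1, h3))   # False: h1 and h3 are incomparable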
Find-S Algorithm

1. Initialize h to the most specific hypothesis in H.
2. For each positive training instance x: for each attribute constraint a_i in h, if a_i is satisfied by x, do nothing; otherwise replace a_i by the next more general constraint that is satisfied by x.
3. Output hypothesis h.
Constraint Generalization

Attribute: Sky
  ∅ (no value)  →  Sunny, Cloudy, Rainy  →  ? (any value)
Illustration of Find-S

Figure: trace of Find-S over the instance space and hypothesis space. Starting from the most specific hypothesis h0, h is generalized to h1 by x1, to h2 = h3 by x2 (the negative example x3 leaves h unchanged), and to h4 by x4, moving from the specific toward the general end of the ordering.
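A minimal sketch of Find-S under the tuple encoding assumed above (the function name find_s is mine); negative examples are skipped, and each violated constraint is generalized just enough to cover the new positive example:

def find_s(train):
    """Return the most specific hypothesis consistent with the positive examples."""
    n = len(train[0][0])
    h = (None,) * n                         # step 1: the most specific hypothesis
    for x, positive in train:               # step 2: one training example at a time
        if not positive:
            continue                        # Find-S ignores negative examples
        h = tuple(xi if ci is None else (ci if ci == xi else "?")
                  for ci, xi in zip(h, x))  # minimally generalize violated constraints
    return h                                # step 3: output h

TRAIN = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
print(find_s(TRAIN))   # ('Sunny', 'Warm', '?', 'Strong', '?', '?')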
Properties of Find-S

- Find-S outputs the most specific hypothesis s in H that covers all of the positive training examples.
- If a hypothesis h is consistent with D, then h ≥ s, i.e., h is more general than or equal to the output of Find-S.
Version Spaces

The version space VS_{H,D} is the subset of hypotheses in H that are consistent with all training examples in D.

S: {<Sunny, Warm, ?, Strong, ?, ?>}

   <Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

Training examples:
x1 = <Sunny, Warm, Normal, Strong, Warm, Same>    +
x2 = <Sunny, Warm, High, Strong, Warm, Same>      +
x3 = <Rainy, Cold, High, Strong, Warm, Change>    -
x4 = <Sunny, Warm, High, Strong, Cool, Change>    +
VS_{H,D} = { h ∈ H | (∃ s ∈ S)(∃ g ∈ G): g ≥ h ≥ s }
where x ≥ y means that x is more general than or equal to y.

Figure: a hypothesis h lying between the boundaries (g ≥ h ≥ s for some g ∈ G, s ∈ S) is consistent with D; hypotheses below S or above G in the ordering are not (Consistent(s, D) = FALSE, Consistent(g, D) = FALSE).
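Membership in the version space follows directly from this definition; a sketch using the boundary sets shown above (more_general_or_equal is restated so the snippet stands alone):

def more_general_or_equal(h_g, h_s):
    return all(cg == "?" or cg == cs or cs is None for cg, cs in zip(h_g, h_s))

def in_version_space(h, S, G):
    """h is in VS_{H,D} iff g >= h for some g in G and h >= s for some s in S."""
    return (any(more_general_or_equal(g, h) for g in G)
            and any(more_general_or_equal(h, s) for s in S))

S = [("Sunny", "Warm", "?", "Strong", "?", "?")]
G = [("Sunny", "?", "?", "?", "?", "?"), ("?", "Warm", "?", "?", "?", "?")]

print(in_version_space(("Sunny", "?", "?", "Strong", "?", "?"), S, G))   # True
print(in_version_space(("?", "?", "?", "Strong", "?", "?"), S, G))       # False: not below any g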
Candidate Elimination Algorithm

G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H
For each training example d = <x, c(x)>:
  modify G and S so that G and S remain consistent with d
How G and S are modified, case by case (the figures show a single s ∈ S and g ∈ G over the instance space, with + and - marking positive and negative training points):

Positive example d:
- g(d) = s(d) = 0: both misclassify d; remove g from G and remove s from S.
- g(d) = 1 and s(d) = 0: generalize s.
- g(d) = s(d) = 1: both already classify d correctly; no change.

Negative example d:
- g(d) = s(d) = 1: both misclassify d; remove s from S and remove g from G.
- g(d) = 1 and s(d) = 0: specialize g.
- g(d) = s(d) = 0: both already classify d correctly; no change.
Candidate Elimination Algorithm

G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H
For each training example d = <x, c(x)>:

If d is a positive example:
- Remove from G any hypothesis that is inconsistent with d.
- For each hypothesis s in S that is not consistent with d:
  - remove s from S
  - add to S all minimal generalizations h of s such that
    - h is consistent with d, and
    - some member of G is more general than h
If d is a negative example:
- Remove from S any hypothesis that is inconsistent with d.
- For each hypothesis g in G that is not consistent with d:
  - remove g from G
  - add to G all minimal specializations h of g such that
    - h is consistent with d, and
    - some member of S is more specific than h
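A compact sketch of the algorithm for the conjunctive hypothesis space used in this lecture. The attribute domains, helper names, and data layout are assumptions made for the example, and a full implementation would also prune boundary members that are not maximal (in S) or minimal (in G):

DOMAINS = [("Sunny", "Cloudy", "Rainy"), ("Warm", "Cold"), ("Normal", "High"),
           ("Strong", "Weak"), ("Warm", "Cold"), ("Same", "Change")]

def covers(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_or_equal(hg, hs):
    return all(cg == "?" or cg == cs or cs is None for cg, cs in zip(hg, hs))

def min_generalizations(s, x):
    """Minimal generalizations of s that cover the positive example x."""
    return [tuple(xi if ci is None else (ci if ci == xi else "?")
                  for ci, xi in zip(s, x))]

def min_specializations(g, x):
    """Minimal specializations of g that exclude the negative example x."""
    out = []
    for i, c in enumerate(g):
        if c == "?":
            out += [g[:i] + (v,) + g[i + 1:] for v in DOMAINS[i] if v != x[i]]
    return out

def candidate_elimination(train):
    S = [(None,) * len(DOMAINS)]          # maximally specific boundary
    G = [("?",) * len(DOMAINS)]           # maximally general boundary
    for x, positive in train:
        if positive:
            G = [g for g in G if covers(g, x)]
            new_S = []
            for s in S:
                if covers(s, x):
                    new_S.append(s)
                else:                     # replace s by its minimal generalizations
                    new_S += [h for h in min_generalizations(s, x)
                              if any(more_general_or_equal(g, h) for g in G)]
            S = new_S
        else:
            S = [s for s in S if not covers(s, x)]
            new_G = []
            for g in G:
                if not covers(g, x):
                    new_G.append(g)
                else:                     # replace g by its minimal specializations
                    new_G += [h for h in min_specializations(g, x)
                              if any(more_general_or_equal(h, s) for s in S)]
            G = new_G
    return S, G

TRAIN = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
S, G = candidate_elimination(TRAIN)
print(S)   # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(G)   # [('Sunny', '?', '?', '?', '?', '?'), ('?', 'Warm', '?', '?', '?', '?')]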
Figure: trace of the boundary sets on the EnjoySport examples. S starts as S0 = {<∅, ∅, ∅, ∅, ∅, ∅>} and G as G0 = {<?, ?, ?, ?, ?, ?>}; G stays {<?, ?, ?, ?, ?, ?>} through the positive examples and is specialized to {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>, ...} once the negative example is processed.
Using the final version space to classify new instances:

S: {<Sunny, Warm, ?, Strong, ?, ?>}

   <Sunny, ?, ?, Strong, ?, ?>   <Sunny, Warm, ?, ?, ?, ?>   <?, Warm, ?, Strong, ?, ?>

G: {<Sunny, ?, ?, ?, ?, ?>, <?, Warm, ?, ?, ?, ?>}

New instances x5 = <Sunny, ...>, x6 = <Rainy, ...>, x7 = <Sunny, ...>, x8 = <Sunny, ...> are classified against this version space, as sketched below.
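A sketch of how the version space can be used here: a new instance gets a definite label only when every hypothesis in the version space agrees, which can be decided from the boundary sets alone (the classify name and the example instances below are illustrative):

def covers(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def classify(x, S, G):
    """Classify x with the version space; answer only when all members agree."""
    if all(covers(s, x) for s in S):       # covered by every s -> every h in VS says positive
        return "positive"
    if not any(covers(g, x) for g in G):   # rejected by every g -> every h in VS says negative
        return "negative"
    return "don't know"                    # the version space is split on x

S = [("Sunny", "Warm", "?", "Strong", "?", "?")]
G = [("Sunny", "?", "?", "?", "?", "?"), ("?", "Warm", "?", "?", "?", "?")]

print(classify(("Sunny", "Warm", "Normal", "Strong", "Cool", "Change"), S, G))   # positive
print(classify(("Rainy", "Cold", "Normal", "Weak", "Warm", "Same"), S, G))       # negative
print(classify(("Sunny", "Cold", "Normal", "Strong", "Warm", "Same"), S, G))     # don't know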
Inductive Leap

+ <Sunny, Warm, Normal, Strong, Cool, Change>
+ <Sunny, Warm, Normal, Light, Warm, Same>

After these two positive examples S is generalized to <Sunny, Warm, Normal, ?, ?, ?>, so any unseen instance matching this hypothesis is classified positive: an inductive leap beyond the observed data.

Problem of expressibility:

x1 = <Sunny, Warm, Normal, Strong, Cool, Change>   +
x2 = <Cloudy, Warm, Normal, Strong, Cool, Change>  +
S: { <?, Warm, Normal, Strong, Cool, Change> }
x3 = <Rainy, Warm, Normal, Light, Warm, Same>   →   S: { }

The conjunctive hypothesis space H cannot represent a disjunctive target concept such as "Sky = Sunny or Cloudy", so no hypothesis in H remains consistent with all of the examples.
Unbiased Learner

- Idea: remove the bias by choosing a hypothesis space H that can represent every teachable concept, i.e., every subset of the instance space X (for example, by allowing disjunctions, conjunctions, and negations of the attribute constraints).

What are S and G in this case?
Assume positive examples (x1, x2, x3) and negative examples (x4, x5):
  G: { ¬(x4 ∨ x5) }
  S: { (x1 ∨ x2 ∨ x3) }

The only examples that are classified unambiguously are the training examples themselves. In other words, in order to learn the target concept one would have to present every single instance in X as a training example. Each unobserved instance will be classified positive by precisely half the hypotheses in the version space and negative by the other half.

The problem of generalizability: No Free Lunch!
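The half-and-half vote can be checked on a toy instance space; this sketch (the four-element X and the labels are invented for illustration) enumerates the full power-set hypothesis space and counts the votes on the unobserved instances:

from itertools import combinations

# Tiny instance space with an unbiased hypothesis space: every subset of X is a concept.
X = ["a", "b", "c", "d"]
all_concepts = [set(c) for r in range(len(X) + 1) for c in combinations(X, r)]

# Training data: "a" is positive, "b" is negative; "c" and "d" are unobserved.
train = [("a", True), ("b", False)]
version_space = [h for h in all_concepts
                 if all((x in h) == label for x, label in train)]

for x in ["c", "d"]:
    pos = sum(1 for h in version_space if x in h)
    print(x, pos, len(version_space) - pos)   # each unobserved instance: 2 votes positive, 2 negative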
Inductive Bias

Consider:
- a concept learning algorithm L
- instances X, target concept c
- training examples Dc = {<x, c(x)>}
- L(xi, Dc), the classification assigned to instance xi by L after training on Dc

Definition:
The inductive bias of L is any minimal set of assertions B such that for any target concept c and corresponding training data Dc,

  (∀ xi ∈ X) [ (B ∧ Dc ∧ xi) ⊢ L(xi, Dc) ]

where A ⊢ B means that A logically entails B.
Figure: the candidate elimination algorithm takes the training examples and a new instance as input and, using the assertion "H contains the target concept" as its inductive bias, outputs a classification of the new instance or "don't know".
Summary

- Concept learning can be cast as a search through a space of hypotheses.
- Hypotheses are partially ordered from specific to general.
- Find-S outputs the most specific hypothesis consistent with the positive examples.
- The version space contains all hypotheses consistent with the training data and is represented by its S and G boundaries.
- The candidate elimination algorithm maintains S and G incrementally, one training example at a time.
- Without an inductive bias a learner cannot classify any unseen instance; the bias of candidate elimination is the assumption that H contains the target concept.