Concept
• Each attribute can take the value '?' or 'ᶲ', or can hold a single
specific value.
• '?' denotes that the attribute can take any value [e.g., Color = ?].
• 'ᶲ' denotes that the attribute cannot take any value, i.e., it represents a null value [e.g., Horns = ᶲ].
• A single value denotes one specific value from the acceptable values of the attribute, e.g., the attribute
'Tail' can take the value 'Short' [e.g., Tail = Short].
• The different hypotheses that can be predicted for the target concept include the following:
The most general hypothesis allows any value for each attribute.
It is represented as: <?, ?, ?, ?, ?, ?, ?, ?>.
This hypothesis indicates that any animal can be an elephant.
The most specific hypothesis does not allow any value for any attribute.
It is represented as: <ᶲ, ᶲ, ᶲ, ᶲ, ᶲ, ᶲ, ᶲ, ᶲ>.
This hypothesis indicates that no animal can be an elephant.
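The matching rule described above can be sketched in a few lines of Python. This is only an illustrative sketch: the names ANY, NULL and matches are hypothetical, and the sample animal is assumed to list its eight attribute values in the same order as the hypothesis.

```python
ANY = "?"    # the attribute may take any value
NULL = "ᶲ"   # the attribute may take no value

def matches(hypothesis, instance):
    """Return True if the hypothesis classifies the instance as positive."""
    for h_val, x_val in zip(hypothesis, instance):
        if h_val == NULL:
            return False      # a null constraint rejects every instance
        if h_val != ANY and h_val != x_val:
            return False      # a specific value must match exactly
    return True

most_general = (ANY,) * 8
most_specific = (NULL,) * 8

# Sample animal with eight attribute values (attribute order assumed).
animal = ("No", "Short", "Yes", "No", "No", "Black", "No", "Big")

print(matches(most_general, animal))   # True: any animal is accepted
print(matches(most_specific, animal))  # False: no animal is accepted
```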
Hypothesis space
• Hypothesis space is the set of all possible hypotheses that
approximate the target function f.
• Version space is the subset of the hypothesis space that contains only
the hypotheses consistent with the training instances; these are the
hypotheses used for classification.
• For example, each of the attributes given in Table 3.1 has the following
set of possible values.
• Considering these values for each attribute, there are (2 x 2 x 2 x 2
x 2 x 3 x 2 x 2) = 384 distinct instances, which include all 5 instances in the
training dataset.
So, we can generate (4 x 4 x 4 x 4 x 4 x 5 x 4 x 4) = 81,920 distinct hypotheses when two more
values, '?' and 'ᶲ', are included for each attribute.
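The two counts can be checked with a short calculation. The sketch below assumes only the per-attribute value counts used in the products above; the variable names are hypothetical.

```python
from math import prod

# Number of possible values for each of the eight attributes.
values_per_attribute = [2, 2, 2, 2, 2, 3, 2, 2]

# Every combination of attribute values is one distinct instance.
n_instances = prod(values_per_attribute)

# Each attribute gains two extra choices ('?' and 'ᶲ') in a hypothesis.
n_hypotheses = prod(n + 2 for n in values_per_attribute)

print(n_instances)   # 384
print(n_hypotheses)  # 81920
```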
Heuristic search space
• Heuristic search is a search strategy that finds an optimized
hypothesis/solution to a problem
Example: Consider the training instances shown in Table 3.1 and illustrate
Specific to General Learning.
Solution: We start from the all-false hypothesis, i.e., the most specific
hypothesis, which is the most restrictive one. Consider only the positive
instances and generalize the most specific hypothesis just enough to cover
them. Ignore the negative instances.
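A minimal sketch of this specific-to-general step is given below, assuming the two positive instances I1 and I5 listed in this example and the attribute order in which they are written. The function name generalize is hypothetical and the code is an illustration, not a definitive implementation.

```python
ANY, NULL = "?", "ᶲ"

def generalize(h, x):
    """Minimally generalize hypothesis h so that it covers positive instance x."""
    new_h = []
    for h_val, x_val in zip(h, x):
        if h_val == NULL:
            new_h.append(x_val)   # first positive instance: copy its value
        elif h_val == ANY or h_val == x_val:
            new_h.append(h_val)   # already covers this value
        else:
            new_h.append(ANY)     # conflicting values: allow any value
    return tuple(new_h)

# Positive instances from the example (target value omitted).
I1 = ("No", "Short", "Yes", "No", "No", "Black", "No", "Big")
I5 = ("No", "Short", "Yes", "Yes", "Yes", "Black", "No", "Big")

h = (NULL,) * 8          # most specific hypothesis
for x in (I1, I5):       # negative instances are simply skipped
    h = generalize(h, x)

print(h)  # ('No', 'Short', 'Yes', '?', '?', 'Black', 'No', 'Big')
```

After I1 the hypothesis equals I1 itself; I5 differs only in the 4th and 5th attributes, so only those two positions become '?'.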
• The most specific hypothesis is taken first; it does not classify any instance
as true.
• h = <ᶲ ᶲ ᶲ ᶲ ᶲ ᶲ ᶲ ᶲ>
• Read the first instance I1 and generalize the hypothesis h so that this positive
instance is classified as true by the resulting hypothesis h1.
• I1: No Short Yes No No Black No Big Yes (Positive instance)
I5: No Short Yes Yes Yes Black No Big Yes (Positive instance)
• h = <? ? ? ? ? ? ? ?>
h1 = <? ? ? ? ? ? ? ?>
I2: Yes Short No No No Brown Yes Medium No (Negative instance)
h1 = <? ? ? ? ? ? ? ?>
h2 = <No ? ? ? ? ? ? ?>
<? ? ? ? ? ? No ?>
<? ? ? ? ? ? ? Big>
• h3 = h2
h2 = <No ? ? ? ? ? ? ?>
<? ? ? ? ? ? No ?>
<? ? ? ? ? ? ? Big>
• I4: No Long No Yes Yes White No Medium No (Negative instance)
h4 = <? ? Yes ? ? ? ? ?>
<? ? ? ? ? ? ? Big>
h5 = h4
h5 = <? ? Yes ? ? ? ? ?>
<? ? ? ? ? ? ? Big>
Thus, h5 is the set of hypotheses generated, which classifies the positive instances
as true and the negative instances as false.
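This claim can be checked mechanically. The sketch below tests each hypothesis in h5 against the four instances written out in this example (I1, I2, I4 and I5). It assumes that the last value on each instance line is the target label and that the eight attributes appear in the order shown; the helper name matches is hypothetical.

```python
ANY = "?"

def matches(h, x):
    """True if every attribute of x satisfies the constraint in h."""
    return all(hv == ANY or hv == xv for hv, xv in zip(h, x))

# The two hypotheses that make up h5.
h5 = [
    ("?", "?", "Yes", "?", "?", "?", "?", "?"),
    ("?", "?", "?", "?", "?", "?", "?", "Big"),
]

# (attribute values, is_positive) as read from the example.
instances = {
    "I1": (("No", "Short", "Yes", "No", "No", "Black", "No", "Big"), True),
    "I2": (("Yes", "Short", "No", "No", "No", "Brown", "Yes", "Medium"), False),
    "I4": (("No", "Long", "No", "Yes", "Yes", "White", "No", "Medium"), False),
    "I5": (("No", "Short", "Yes", "Yes", "Yes", "Black", "No", "Big"), True),
}

# A hypothesis is consistent if it accepts exactly the positive instances.
consistent = all(
    matches(h, x) == label
    for h in h5
    for x, label in instances.values()
)
print(consistent)  # True
```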
Example 2:
Consider the sample training instances shown in Table 1, which describes the symptoms
of persons and their Covid-19 test results. Apply general-to-specific learning to
search for an approximate hypothesis in the hypothesis space.
Hypothesis Space Search by Find-S Algorithm
• Ignore It
h3 = < >=9 Yes ? Good Fast Yes >
• Now scan I4. Since it is a positive instance, check for mismatches between the
hypothesis 'h' and I4.
• The 5th and 6th attribute values mismatch, so replace those
attributes in 'h' with '?'.
h3 = < >=9 Yes ? Good Fast Yes >
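The mismatch check for a positive instance can be sketched as follows. The values of I4 are not listed here, so the instance below is a hypothetical placeholder chosen only so that its 5th and 6th values differ from h3; the function name update_on_positive is also an assumption.

```python
ANY = "?"

def update_on_positive(h, x):
    """Replace every attribute of h that disagrees with x by '?'."""
    return tuple(
        hv if hv == ANY or hv == xv else ANY
        for hv, xv in zip(h, x)
    )

h3 = (">=9", "Yes", "?", "Good", "Fast", "Yes")

# Hypothetical positive instance: only the 5th and 6th values
# ("Slow", "No") are assumed to differ from h3.
I4 = (">=9", "Yes", "Good", "Good", "Slow", "No")

h4 = update_on_positive(h3, I4)
print(h4)  # ('>=9', 'Yes', '?', 'Good', '?', '?')
```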
• Step 2:
• h1 = <Sunny, Warm, Normal, Strong, Warm, Same>
• I2 = <Sunny, Warm, High, Strong, Warm, Same> Yes (+ve)
• h3 = h2
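The steps above can be sketched as a small Find-S loop. This is a minimal sketch under stated assumptions: the function name find_s is hypothetical, and the third instance is an invented negative example included only to show that negative instances leave the hypothesis unchanged (h3 = h2).

```python
ANY = "?"

def find_s(h, examples):
    """Generalize h over the positive examples; negative examples are ignored."""
    for x, is_positive in examples:
        if not is_positive:
            continue  # Find-S ignores negative instances
        h = tuple(hv if hv == ANY or hv == xv else ANY for hv, xv in zip(h, x))
    return h

h1 = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
I2 = ("Sunny", "Warm", "High", "Strong", "Warm", "Same")      # positive
I3 = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")    # hypothetical negative

h2 = find_s(h1, [(I2, True)])
h3 = find_s(h2, [(I3, False)])

print(h2)        # ('Sunny', 'Warm', '?', 'Strong', 'Warm', 'Same')
print(h3 == h2)  # True
```

Only the third attribute differs between h1 and I2 (Normal versus High), so it is the only one replaced by '?'.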