Module 1 (3)
Course Materials
by
Mr. Sunil Kumar S
HoD & Senior Assistant Professor,
Dept of AI and ML
MITE, Moodabidri
Contents
• Syllabus
• Course Outcomes
• Vision & Mission of the Department
• Program Outcomes
• Program Specific Outcomes & Program Education Objectives
• CO-PO Mapping
• Lesson Plan
• Assignment Questions
• IA Marks Distribution & Portion for IA
• Laboratory Mark Rubrics (If IPCC Course)
• Minimum Attendance Requirements
• Module 1 Materials
• Module 1 Question Bank
Syllabus
Course Outcomes
Vision & Mission of the Department
Vision
• To create well-groomed, technically competent and skilled AIML
professionals who can become part of industry and undertake
quality research at a global level to meet societal needs.
Mission
• Provide state-of-the-art infrastructure, tools and facilities to make
students competent and achieve excellence in education and
research.
• Provide a strong theoretical and practical knowledge across the
AIML discipline with an emphasis on AI based research and
software development.
• Inculcate strong ethical values, professional behaviour and
leadership abilities through various curricular and co-curricular
training and development activities.
Program Outcomes
PO1: Engineering Knowledge
PO2: Problem Analysis
PO3: Design/Development of Solutions
PO4: Conduct Investigations of Complex Problems
PO5: Engineering Tool Usage
PO6: The Engineer and The World
PO7: Ethics
PO8: Individual and Collaborative Team work
PO9: Communication
PO10: Project Management and Finance
PO11: Life-Long Learning
Assignments
IA Marks Distribution
IA Portion
1st IA – 1st & 2nd Modules
2nd IA – 3rd & 4th Modules
Attendance Requirements
Module 1 – Course Contents
The Machine Learning Landscape
Learning Problem
• A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E.
• For example, a computer program that learns to play checkers might improve its
performance, as measured by its ability to win, at the class of tasks involving
playing checkers games, through experience obtained by playing games.
• Now consider the sets of instances that are classified positive by h1 and
by h2.
• Because h2 imposes fewer constraints on the instance, it classifies more
instances as positive.
• In fact, any instance classified positive by h1 will also be classified
positive by h2. Therefore, we say that h2 is more general than h1.
Instances, hypotheses, and the more-general-than relation.
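As an illustrative sketch (not from the text), the more-general-than-or-equal relation over conjunctive hypotheses can be coded directly. Hypotheses are written here as tuples of attribute values, with "?" meaning "any value" and "0" meaning "no value" (the most specific constraint):

```python
def more_general_or_equal(h1, h2):
    """h1 >=_g h2: every instance that h2 classifies positive, h1 also does.
    "?" accepts any value; "0" accepts none (most specific constraint)."""
    return all(a1 == "?" or a2 == "0" or a1 == a2
               for a1, a2 in zip(h1, h2))

h1 = ("Sunny", "?", "?", "?", "?", "?")
h2 = ("Sunny", "?", "?", "Strong", "?", "?")
print(more_general_or_equal(h1, h2))  # True: h1 is more general than h2
print(more_general_or_equal(h2, h1))  # False
```

Note the relation is only a partial ordering: two hypotheses such as (Sunny, ?, ?, ?, ?, ?) and (?, Warm, ?, ?, ?, ?) are not comparable in either direction.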
FIND-S Algorithm
FIND-S
• To illustrate this algorithm, assume the learner
is given the sequence of training examples
from Table 2.1 for the EnjoySport task.
• The first step of FIND-S is to initialize h to the
most specific hypothesis in H
This h is still very specific; it asserts that all instances are negative except for
the single positive training example we have observed.
FIND-S
• Next, the second training example (also
positive in this case) forces the algorithm to
further generalize h,
• this time substituting a "?" in place of any
attribute value in h that is not satisfied by the
new example.
• The refined hypothesis in this case is
FIND-S
• Upon encountering the third training example (in
this case a negative example), the algorithm makes
no change to h.
• In fact, the FIND-S algorithm simply ignores every
negative example!
• While this may at first seem strange, notice that
in the current case our hypothesis h is already
consistent with the new negative example
• (i.e., h correctly classifies this example as
negative), and hence no revision is needed
FIND-S
• In the general case, as long as we assume that the hypothesis space
H contains a hypothesis that describes the true target concept c
and that the training data contains no errors, then the current
hypothesis h can never require a revision in response to a negative
example.
• To see why, recall that the current hypothesis h is the most specific
hypothesis in H consistent with the observed positive examples.
• Because the target concept c is also assumed to be in H and to be
consistent with the positive training examples, c must be
more-general-than-or-equal-to h.
• But the target concept c will never cover a negative example, thus
neither will h (by the definition of more-general-than).
• Therefore, no revision to h will be required in response to any
negative example.
FIND-S
• To complete our trace of FIND-S, the fourth
(positive) example leads to a further
generalization of h
FIND-S
• The FIND-S algorithm illustrates one way in which the
more-general than partial ordering can be used to organize
the search for an acceptable hypothesis.
• The search moves from hypothesis to hypothesis, searching
from the most specific to progressively more general
hypotheses along one chain of the partial ordering.
• Figure 2.2 illustrates this search in terms of the instance
and hypothesis spaces.
• At each step, the hypothesis is generalized only as far as
necessary to cover the new positive example.
• Therefore, at each stage the hypothesis is the most specific
hypothesis consistent with the training examples observed
up to this point (hence the name FIND-S)
FIND-S
• The key property of the FIND-S algorithm is that
for hypothesis spaces described by conjunctions
of attribute constraints (such as H for the
EnjoySport task),
• FIND-S is guaranteed to output the most specific
hypothesis within H that is consistent with the
positive training examples.
• Its final hypothesis will also be consistent with
the negative examples provided the correct
target concept is contained in H, and provided
the training examples are correct.
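A minimal sketch of FIND-S for this conjunctive-hypothesis setting (an illustration, not the text's pseudocode), run on the four EnjoySport examples from Table 2.1:

```python
def find_s(examples):
    """FIND-S: start from the most specific hypothesis ("0" everywhere)
    and minimally generalize it for each positive example."""
    n = len(examples[0][0])
    h = ["0"] * n
    for x, label in examples:
        if label != "Yes":          # FIND-S simply ignores negative examples
            continue
        for i, v in enumerate(x):
            if h[i] == "0":         # first positive example: adopt its values
                h[i] = v
            elif h[i] != v:         # conflicting value: generalize to "?"
                h[i] = "?"
    return tuple(h)

enjoy_sport = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]
print(find_s(enjoy_sport))  # ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

The output is the most specific hypothesis consistent with the three positive examples, matching the trace discussed above.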
Problems in FIND-S
• However, there are several questions still left unanswered
by this learning algorithm, such as:
• Has the learner converged to the correct target concept?
• Although FIND-S will find a hypothesis consistent with the
training data, it has no way to determine whether it has found
the only hypothesis in H consistent with the data (i.e., the
correct target concept), or whether there are many other
consistent hypotheses as well.
• We would prefer a learning algorithm that could determine
whether it had converged and, if not, at least characterize
its uncertainty regarding the true identity of the target
concept.
Problems in FIND-S
• Why prefer the most specific hypothesis?
• In case there are multiple hypotheses
consistent with the training examples, FIND-S
will find the most specific.
• It is unclear whether we should prefer this
hypothesis over, say, the most general, or
some other hypothesis of intermediate
generality.
Problems in FIND-S
• Are the training examples consistent?
• In most practical learning problems there is some
chance that the training examples will contain at
least some errors or noise.
• Such inconsistent sets of training examples can
severely mislead FIND-S, given the fact that it
ignores negative examples.
• We would prefer an algorithm that could at least
detect when the training data is inconsistent and,
preferably, accommodate such errors.
Problems in FIND-S
• What if there are several maximally specific consistent hypotheses?
• In the hypothesis language H for the EnjoySport task, there is
always a unique, most specific hypothesis consistent with any set
of positive examples.
• However, for other hypothesis spaces (discussed later) there can be
several maximally specific hypotheses consistent with the data.
• In this case, FIND-S must be extended to allow it to backtrack on its
choices of how to generalize the hypothesis,
– to accommodate the possibility that the target concept lies along a
different branch of the partial ordering than the branch it has
selected.
• Furthermore,
• we can define hypothesis spaces for which there is no maximally
specific consistent hypothesis, although this is more of a theoretical
issue than a practical one (see Exercise 2.7).
VERSION SPACES AND THE CANDIDATE-ELIMINATION
ALGORITHM
• This section describes a second approach to concept learning, the
CANDIDATE-ELIMINATION algorithm,
• that addresses several of the limitations of FIND-S.
• Notice that although FIND-S outputs a hypothesis from H, that is consistent with
the training examples,
– this is just one of many hypotheses from H that might fit the training data equally well.
• The key idea in the CANDIDATE-ELIMINATION Algorithm is to output a description
of the set of all hypotheses consistent with the training examples.
• Surprisingly, the CANDIDATE-ELIMINATION algorithm computes the description of this
set without explicitly enumerating all of its members.
• This is accomplished by again using the more-general-than partial ordering,
• this time to maintain a compact representation of the set of consistent hypotheses
and to incrementally refine this representation as each new training example is
encountered.
Representation
• The CANDIDATE-ELIMINATION algorithm finds all describable
hypotheses that are consistent with the observed training
examples.
• In order to define this algorithm precisely, we begin with a
few basic definitions.
• First, let us say that a hypothesis is consistent with the
training examples if it correctly classifies these examples.
Satisfy vs Consistent
• An example x is said to satisfy hypothesis h
when h(x) = 1, regardless of whether x is a
positive or negative example of the target
concept.
• However, whether such an example is
consistent with h depends on the target
concept, and in particular, whether h(x) = c(x)
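The distinction can be made concrete with a small sketch (illustrative code, not from the text): a negative example may satisfy h while still being inconsistent with h.

```python
def satisfies(x, h):
    """x satisfies h when h classifies x as positive, i.e. h(x) = 1."""
    return all(a == "?" or a == v for a, v in zip(h, x))

def consistent(h, examples):
    """h is consistent with the data when h(x) = c(x) for every example."""
    return all(satisfies(x, h) == (label == "Yes") for x, label in examples)

h = ("Sunny", "?", "?", "?", "?", "?")
x_neg = ("Sunny", "Cold", "High", "Weak", "Cool", "Change")  # labelled "No"
print(satisfies(x_neg, h))             # True: x satisfies h ...
print(consistent(h, [(x_neg, "No")]))  # False: ... yet h(x) != c(x)
```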
Version Space
• The CANDIDATE-ELIMINATION algorithm represents the set of all
hypotheses consistent with the observed training examples.
• This subset of all hypotheses is called the version space with
respect to the hypothesis space H and the training examples
D, because it contains all plausible versions of the target
concept.
The LIST-THEN-ELIMINATE Algorithm
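A brute-force sketch of LIST-THEN-ELIMINATE (illustrative only; it assumes the conjunctive hypothesis space and ignores hypotheses containing "0", which classify every instance negative):

```python
from itertools import product

def list_then_eliminate(examples, domains):
    """Enumerate every hypothesis in H, then keep only those consistent
    with all training examples. Feasible only for tiny hypothesis spaces."""
    def positive(h, x):
        return all(a == "?" or a == v for a, v in zip(h, x))
    H = product(*[vals + ["?"] for vals in domains])
    return [h for h in H
            if all(positive(h, x) == (label == "Yes") for x, label in examples)]

domains = [["Sunny", "Cloudy", "Rainy"], ["Warm", "Cold"], ["Normal", "High"],
           ["Strong", "Weak"], ["Warm", "Cool"], ["Same", "Change"]]
enjoy_sport = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]
vs = list_then_eliminate(enjoy_sport, domains)
print(len(vs))  # 6 consistent hypotheses survive
```

Even for this toy task the full space already has 972 conjunctive hypotheses, which is why a more compact representation of the version space is needed.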
A More Compact Representation for
Version Spaces
• The CANDIDATE-ELIMINATION algorithm works on
the same principle as the above
LIST-THEN-ELIMINATE algorithm.
• However, it employs a much more compact
representation of the version space.
• In particular, the version space is represented by
its most general and least general members.
• These members form general and specific
boundary sets that delimit the version space
within the partially ordered hypothesis space.
A More Compact Representation for
Version Spaces
• To illustrate this representation for version spaces,
• consider again the EnjoySport concept learning problem described in Table 2.2.
• Recall that given the four training examples from Table 2.1, FIND-S outputs the
hypothesis
• In fact, this is just one of six different hypotheses from H that are consistent with
these training examples. All six hypotheses are shown in Figure 2.3.
• They constitute the version space relative to this set of data and this hypothesis
representation.
A More Compact Representation for
Version Spaces
• The CANDIDATE-ELIMINATION algorithm represents
the version space by storing only its most general
members (labeled G in Figure 2.3) and its most
specific members (labeled S in the figure).
• Given only these two sets S and G, it is possible to
enumerate all members of the version space as
needed by generating the hypotheses that lie
between these two sets in the general-to-specific
partial ordering over hypotheses.
A More Compact Representation for
Version Spaces
As long as the sets G and S are well defined (see Exercise 2.7), they
completely specify the version space.
In particular, we can show that the version space is precisely the set of
hypotheses contained in G, plus those contained in S, plus
those that lie between G and S in the partially ordered hypothesis space
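In symbols (a restatement of the version space representation theorem, with $\ge_g$ denoting the more-general-than-or-equal relation):

```latex
VS_{H,D} \;=\; \{\, h \in H \;\mid\; (\exists s \in S)(\exists g \in G)\;\; g \ge_g h \ge_g s \,\}
```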
CANDIDATE-ELIMINATION Algorithm
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• CANDIDATE-ELIMINATION Trace 1. S0
and G0 are the initial boundary sets
corresponding to the most
specific and most general hypotheses.
• Training examples 1 and 2 force the S
boundary to become
more general, as in the FIND-S
algorithm. They have no effect on the
G boundary.
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• As illustrated by these first two steps, positive training examples
may force the S boundary of the version space to become
increasingly general.
• Negative training examples play the complementary role of forcing
the G boundary to become increasingly specific.
• Consider the third training example, shown in Figure 2.5.
• This negative example reveals that the G boundary of the version
space is overly general; that is, the hypothesis in G incorrectly
predicts that this new example is a positive example.
• The hypothesis in the G boundary must therefore be specialized
until it correctly classifies this new negative example.
• As shown in Figure 2.5, there are several alternative minimally more
specific hypotheses.
• All of these become members of the new G3 boundary set.
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• Given that there are six attributes that could be
specified to specialize G2,
• why are there only three new hypotheses in G3?
• For example, the hypothesis
• h = (?, ?, Normal, ?, ?, ?) is a minimal
specialization of G2 that correctly labels the new
example as a negative example, but it is not
included in G3.
• The reason this hypothesis is excluded is that it is
inconsistent with the previously encountered
positive examples.
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• The fourth training example, as shown in
Figure 2.6, further generalizes the
• S boundary of the version space.
• It also results in removing one member of the
G boundary, because this member fails to
cover the new positive example.
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• After processing these four examples, the
boundary sets S4 and G4 delimit the version
space of all hypotheses consistent with the set
of incrementally observed training examples.
The entire version space, including those
hypotheses bounded by S4 and G4, is shown
in Figure 2.7.
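The full boundary-set update can be sketched as follows (an illustrative implementation for conjunctive hypotheses, not the text's pseudocode; running it on the four EnjoySport examples reproduces S4 and G4 from Figure 2.7):

```python
def covers(h, x):
    """h classifies instance x as positive."""
    return all(a == "?" or a == v for a, v in zip(h, x))

def more_general_or_equal(h1, h2):
    return all(a1 == "?" or a2 == "0" or a1 == a2 for a1, a2 in zip(h1, h2))

def min_generalization(s, x):
    """Minimally generalize s so that it covers the positive example x."""
    return tuple(v if a == "0" else (a if a == v else "?")
                 for a, v in zip(s, x))

def min_specializations(g, domains, x):
    """All minimal specializations of g that exclude the negative example x."""
    out = []
    for i, a in enumerate(g):
        if a == "?":
            out += [g[:i] + (v,) + g[i + 1:] for v in domains[i] if v != x[i]]
    return out

def candidate_elimination(examples, domains):
    n = len(domains)
    S, G = {("0",) * n}, {("?",) * n}
    for x, label in examples:
        if label == "Yes":
            G = {g for g in G if covers(g, x)}          # prune overly specific G
            S = {s if covers(s, x) else min_generalization(s, x) for s in S}
            S = {s for s in S if any(more_general_or_equal(g, s) for g in G)}
        else:
            S = {s for s in S if not covers(s, x)}      # prune overly general S
            G_new = set()
            for g in G:
                if not covers(g, x):
                    G_new.add(g)
                else:
                    G_new |= {h for h in min_specializations(g, domains, x)
                              if any(more_general_or_equal(h, s) for s in S)}
            # keep only the maximally general members of G
            G = {g for g in G_new
                 if not any(g != g2 and more_general_or_equal(g2, g)
                            for g2 in G_new)}
    return S, G

domains = [["Sunny", "Cloudy", "Rainy"], ["Warm", "Cold"], ["Normal", "High"],
           ["Strong", "Weak"], ["Warm", "Cool"], ["Same", "Change"]]
enjoy_sport = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"),   "Yes"),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"),   "Yes"),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), "No"),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), "Yes"),
]
S, G = candidate_elimination(enjoy_sport, domains)
print(sorted(S))  # [('Sunny', 'Warm', '?', 'Strong', '?', '?')]
print(sorted(G))  # [('?', 'Warm', '?', '?', '?', '?'), ('Sunny', '?', '?', '?', '?', '?')]
```

Intermediate boundaries match the trace above: after the negative third example, G contains the three minimal specializations (Sunny, ?, ...), (?, Warm, ...) and (?, ..., Same); the fourth (positive) example then removes the Forecast=Same member.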
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
A Biased Hypothesis Space
Permutation Problem – Syntactically
Distinct Hypotheses
The Futility of Bias-Free Learning
• The above discussion illustrates a fundamental property of inductive
inference:
• a learner that makes no a priori assumptions regarding the identity of the
target concept has no rational basis for classifying any unseen instances.
• In fact,
• the only reason that the CANDIDATE-ELIMINATION algorithm was able to
generalize beyond the observed training examples in our original
formulation of the EnjoySport task is that it was biased by the
implicit assumption that the target concept could be represented by a
conjunction of attribute values.
• In cases where this assumption is correct (and the training examples are
error-free), its classification of new instances will also be correct.
• If this assumption is incorrect, however, it is certain that the
CANDIDATE-ELIMINATION algorithm will misclassify at least some instances from X.
Machine Learning
Machine learning algorithms are classified into four categories as defined below:
• Supervised Learning Algorithms
– Labeled training data
• Unsupervised Learning Algorithms
– No labeled training data
– Clustering based on characteristics
• Reinforcement Learning Algorithms
– Here the prediction is not one single value, but a set of values.
– Another definition is: Reinforcement learning algorithms are
algorithms that have to take sequential actions (decisions) to
maximize a cumulative reward
• Evolutionary Learning Algorithms
– Evolutionary algorithms are algorithms that imitate natural evolution to
solve a problem. Techniques such as genetic algorithms and ant colony
optimization fall under the category of evolutionary learning
algorithms.
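A toy sketch (entirely illustrative, with made-up data) of the first two categories: learning a mapping from labeled data versus grouping unlabeled data by similarity.

```python
# Supervised: labeled pairs (x, y); learn a mapping with 1-nearest-neighbour.
labeled = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((8.0, 8.0), "B")]

def predict_1nn(x):
    dist = lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    return min(labeled, key=dist)[1]

print(predict_1nn((0.9, 1.1)))  # "A": closest labeled point is class A

# Unsupervised: unlabeled points; group by proximity (naive 1-D clustering).
points = sorted([0.9, 8.2, 1.1, 8.0])
clusters = []
for p in points:
    if clusters and p - clusters[-1][-1] < 2.0:
        clusters[-1].append(p)   # close to previous point: same cluster
    else:
        clusters.append([p])     # far away: start a new cluster
print(clusters)  # [[0.9, 1.1], [8.0, 8.2]]
```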
Why Machine Learning
• Machine learning algorithms can be used for identifying the factors that
influence the key performance indicators, which can be further used for
decision making and value creation.
• Organizations such as Amazon, Apple, Capital One, General Electric,
Google, IBM, Facebook, Procter and Gamble and so on use ML algorithms
to create new products and solutions.
FRAMEWORK FOR DEVELOPING
MACHINE LEARNING MODELS
• The framework for ML algorithm development
can be divided into five integrated stages:
• problem and opportunity identification,
• collection of relevant data,
• data pre-processing,
• ML model building, and
• model deployment.
• The various activities carried out during these
different stages are described in Figure 1.2
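The five stages can be sketched end-to-end with toy data (purely hypothetical names and numbers, for illustration only):

```python
# Stage 1: problem identification: predict a binary label from one feature.

def collect_data():
    # Stage 2: gather raw (feature, label) records; one has a missing value
    return [(1.0, 0), (2.0, 0), (None, 0), (8.0, 1), (9.0, 1)]

def preprocess(rows):
    # Stage 3: drop records with missing feature values
    return [(x, y) for x, y in rows if x is not None]

def build_model(rows):
    # Stage 4: "train" a threshold classifier at the midpoint of class means
    mean = lambda c: (sum(x for x, y in rows if y == c)
                      / sum(1 for _, y in rows if y == c))
    threshold = (mean(0) + mean(1)) / 2
    return lambda x: int(x > threshold)

# Stage 5: deploy, i.e. apply the trained model to new inputs
model = build_model(preprocess(collect_data()))
print(model(3.0), model(7.0))  # 0 1
```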
The success of ML projects will depend
on the following activities:
• Feature Extraction
• Feature Engineering
• Model Building and Feature Selection
• Model Deployment
Issues in Machine Learning
Module 1 Questions
1. Discuss the design choice of choosing training experience for checkers learning
problem
2. Discuss the process of choosing the target function & its representation for
checkers learning problem
3. Discuss the process of choosing a function approximation algorithm for checkers
learning problem
4. With a neat diagram explain the stages of checkers learning program design
5. Explain the following
a) Concept Learning
b) Inductive Learning Hypothesis
c) Consistent Hypothesis
d) Version Space
6. By considering the dataset given in figure 1, Illustrate the working of FIND-S
Algorithm to find the maximally specific hypothesis
7. By considering the dataset given in figure 1, illustrate the working of Candidate
Elimination Algorithm
Figure 1: Dataset
Module 1 Questions
8. Explain inductive bias in the candidate elimination algorithm
9. List & explain the issues in Machine Learning
10. Explain the categories of Machine Learning Algorithms
11. Why use Machine Learning, and explain the typical steps used by
Machine Learning Algorithms
12. Explain the five integrated stages of the ML Algorithm Development Framework
13. Explain the following activities of Machine Learning & discuss the significance
of these activities in the success of a machine learning application
a) Feature Extraction
b) Feature Engineering
c) Model Building and Feature Selection
d) Model Deployment