
MANGALORE INSTITUTE OF TECHNOLOGY & ENGINEERING

(A Unit of Rajalaxmi Education Trust®, Mangalore)


Autonomous Institute affiliated to VTU, Belagavi, Approved by AICTE, New Delhi
Accredited by NAAC with A+ Grade & ISO 9001:2015 Certified Institution

DEPARTMENT OF ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING

Course Materials
for

Subject Title : Introduction to Machine Learning


Subject Code : 23AIPC208
Semester : IV
Scheme : 2023 Autonomous

by
Mr. Sunil Kumar S
HoD & Senior Assistant Professor,
Dept of AI and ML
MITE, Moodabidri
Contents
• Syllabus
• Course Outcomes
• Vision & Mission of the Department
• Program Outcomes
• Program Specific Outcomes & Program Education Objectives
• CO-PO Mapping
• Lesson Plan
• Assignment Questions
• IA Marks Distribution & Portion for IA
• Laboratory Mark Rubrics (If IPCC Course)
• Minimum Attendance Requirements
• Module 1 Materials
• Module 1 Question Bank
Syllabus
Course Outcomes
Vision & Mission of the Department
Vision
• To create well-groomed, technically competent and skilled AIML
professionals who can become part of industry and undertake
quality research at a global level to meet societal needs.
Mission
• Provide state-of-the-art infrastructure, tools and facilities to make
students competent and achieve excellence in education and
research.
• Provide a strong theoretical and practical knowledge across the
AIML discipline with an emphasis on AI based research and
software development.
• Inculcate strong ethical values, professional behaviour and
leadership abilities through various curricular, co-curricular training
and development activities.
Program Outcomes
 PO1: Engineering Knowledge
 PO2: Problem Analysis
 PO3: Design/Development of Solutions
 PO4: Conduct Investigations of Complex Problems
 PO5: Engineering Tool Usage
 PO6: The Engineer and The World
 PO7: Ethics
 PO8: Individual and Collaborative Team work
 PO9: Communication
 PO10: Project Management and Finance
 PO11: Life-Long Learning
Assignments
IA Marks Distribution

IA Portion
1st IA – 1st & 2nd Modules
2nd IA – 3rd & 4th Modules
Attendance Requirements
Module 1 – Course Contents
The Machine Learning Landscape
Learning Problem
• A computer program is said to learn if it improves its
performance at some class of tasks, as measured by a
performance measure, through experience
For example, a computer program that learns to play checkers might improve its
performance, as measured by its ability to win, at the class of tasks involving
playing checkers games.

Other examples: learning to recognize spoken words; learning to drive an autonomous vehicle.


Designing a Learning System –
Checkers Program Design
• First Design Choice
• Choosing the Training Experience (E) / Examples
– Direct
• individual board states together with the correct move for each, e.g., self-playing while learning the moves under a teacher's instructions
– Indirect
• only move sequences and final game outcomes, e.g., a dataset
containing the sequences of moves played by various individual checkers players
– No teacher involvement
• Alternatively, it might have available only indirect information consisting of the move
sequences and final outcomes of various games played.
• In this latter case, information about the correctness of specific moves early in the game must
be inferred indirectly from the fact that the game was eventually won or lost.
• Here the learner faces an additional problem of credit assignment, or determining the
degree to which each move in the sequence deserves credit or blame for the final outcome.
• Credit assignment can be a particularly difficult problem because the game can be lost even
when early moves are optimal, if these are followed later by poor moves.
• Hence, learning from direct training feedback is typically easier than learning from indirect
feedback.
Choosing the Training Experience
• A second important attribute of the training experience is
the degree to which the learner controls the sequence of
training examples
• For example, the learner might rely on the teacher to select
informative board states and to provide the correct move
for each.
• Alternatively, the learner might itself propose board states
that it finds particularly confusing and ask the teacher for
the correct move.
• Or the learner may have complete control over both the
board states and (indirect) training classifications, as it does
when it learns by playing against itself with no teacher
Choosing the Training Experience
• Distribution of training experience/examples
– Should contain all sorts of gaming strategies used by
various players
– Otherwise the learner overfits to one style of play and
fails when tested against other players' moves
A checkers learning problem
• To proceed with our design, let us decide that our system will train
by playing games against itself.
• This has the advantage that no external trainer need be present,
and it therefore allows the system to generate as much training
data as time permits. We now have a fully specified learning task.
• Task T: playing checkers
• Performance measure P: percent of games won in the world
tournament
• Training experience E: games played against itself
• In order to complete the design of the learning system, we must
now choose
1. the exact type of knowledge to be learned
2. a representation for this target knowledge
3. a learning mechanism
A checkers learning problem
• Second Design Choice
• Choosing the Target Function
– Is to determine exactly what type of knowledge will be learned
and
– how this will be used by the performance program
• Let us begin with a checkers-playing program that can
generate the legal moves from any board state.
• The program needs only to learn how to choose the best
move from among these legal moves.
• This learning task is representative of a large class of tasks
for which the legal moves that define some large search
space are known a priori, but for which the best search
strategy is not known
A checkers learning problem -
Choosing the Target Function
• Let us call this function ChooseMove and use the notation
ChooseMove : B → M
• to indicate that this function accepts as input any board
from the set of legal board states B
• and produces as output some move from the set of legal
moves M.
• Throughout our discussion of machine learning we will find
it useful to reduce the problem of improving performance P
at task T to the problem of learning some particular
target function, such as ChooseMove.
• The choice of the target function will therefore be a key
design choice.
Choosing the Target Function –
Evaluation Function
• Although ChooseMove is an obvious choice for the target function in our
example,
– this function will turn out to be very difficult to learn given the kind of indirect
training experience available to our system.
• An alternative target function, one that will turn out to be easier to learn in
this setting, is an evaluation function that assigns a numerical score to any given
board state.
• Let us call this target function V and again use the notation V : B → R, to denote
that V maps any legal board state from the set B to some real value (we use R to
denote the set of real numbers).
• We intend for this target function V to assign higher scores to better board states.
• If the system can successfully learn such a target function V, then it can easily use
it to select the best move from any current board position.
• This can be accomplished by generating the successor board state produced by
every legal move, then using V to choose the best successor state and therefore
the best legal move.
Choosing the Target Function –
Evaluation Function
• What exactly should be the value of the target function V for any given board
state?
– Of course any evaluation function that assigns higher scores to better board states will do.
• Nevertheless, we will find it useful to define one particular target function V
among the many that produce optimal play.
• As we shall see, this will make it easier to design a training algorithm. Let us
therefore define the
• target value V(b) for an arbitrary board state b in B, as follows:
1. if b is a final board state that is won, then V(b) = 100
2. if b is a final board state that is lost, then V(b) = -100
3. if b is a final board state that is drawn, then V(b) = 0
4. if b is not a final state in the game, then V(b) = V(b'), where b' is the best
final board state that can be achieved starting from b and playing optimally
until the end of the game (assuming the opponent plays optimally, as well).
• The ideal target function V may not be computable in practice
• The system instead learns an approximation to it, known as ^V (read "V hat")
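As an aside, the four-case definition above can be read as a minimax recursion. Below is a minimal sketch; the game helpers (is_final, outcome, successors) are hypothetical stand-ins, not part of the original notes, and the recursion is intractable in practice, which is exactly why the system learns ^V instead of V.

```python
# Hedged sketch of the ideal target function V; the helpers passed in
# (is_final, outcome, successors) are hypothetical game-engine functions.

def V(b, is_final, outcome, successors, our_turn=True):
    if is_final(b):
        # cases 1-3 of the definition: terminal board states
        return {"won": 100, "lost": -100, "drawn": 0}[outcome(b)]
    # case 4: value of the best final state reachable from b, assuming
    # optimal play by both sides (a minimax recursion)
    values = [V(b2, is_final, outcome, successors, not our_turn)
              for b2 in successors(b)]
    return max(values) if our_turn else min(values)
```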
Choosing the Representation for
Target Function
• Now it is time to choose a representation that the
learning program will use to describe the function
^V that it will learn.
• The representation of ^V can be as follows.
1. A table specifying values for each possible board
state?
2. collection of rules?
3. neural network?
4. a polynomial function of board features?
5. …
Choosing the Representation for
Target Function
• To keep the discussion simple, let us choose a simple representation for
any given board state, the function ^V will be calculated as a linear
combination of the following board features:
• x1(b) — number of black pieces on board b
• x2(b) — number of red pieces on b
• x3(b) — number of black kings on b
• x4(b) — number of red kings on b
• x5(b) — number of red pieces threatened by black (i.e., which can be
taken on black’s next turn)
• x6(b) — number of black pieces threatened by red

^V(b) = w0 + w1·x1(b) + w2·x2(b) + w3·x3(b) + w4·x4(b) + w5·x5(b) + w6·x6(b)

where w0 through w6 are numerical coefficients, or weights, to be
obtained by a learning algorithm. Weights w1 to w6 determine the
relative importance of the different board features, and w0 provides an additive constant.
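As a quick illustration, the linear form of ^V is a one-line computation. The weight values in the example call below are made-up placeholders, not learned values.

```python
# Minimal sketch of the linear evaluation function ^V(b).
# 'features' holds (x1(b), ..., x6(b)); 'weights' holds (w0, w1, ..., w6).

def v_hat(features, weights):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))

# Illustrative (not learned) weights, applied to the features of a board
# where black has 3 pieces and 1 king and red has nothing left:
print(v_hat((3, 0, 1, 0, 0, 0), (0.5, 1.0, -1.0, 3.0, -3.0, 1.0, -1.0)))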
To summarize our design
• we have elaborated the original formulation of the learning problem by
choosing a type of training experience, a target function to be learned,
and a representation for this target function.
• Our elaborated learning task is now:
Task T: playing checkers
Performance measure P: percent of games won in the world tournament
Training experience E: games played against itself
Target function: V : Board → R
Target function representation: ^V(b) = w0 + w1·x1(b) + w2·x2(b) + w3·x3(b) + w4·x4(b) + w5·x5(b) + w6·x6(b)
Choosing a Function Approximation
Algorithm
• Generating training data —
• To train our learning program, we need a set of training data, each
describing a specific board state b and the training value V_train (b) for b.
• Each training example is an ordered pair <b,V_train(b)>
• For example, a training example may be <(x1 = 3, x2 = 0, x3 = 1, x4 = 0, x5 = 0, x6 = 0), +100>.
• This is an example where black has won the game since x2 = 0 or red has
no remaining pieces.
• However, such clean values of V_train(b) can be obtained only for board
states b that are a clear win, loss or draw.
• In the above case, assigning a training value V_train(b) for the specific boards
b that are a clear win, loss or draw is direct, as these come from direct training
experience.
• But in the case of indirect training experience, assigning a training value
V_train(b) for the intermediate boards is difficult.
ESTIMATING TRAINING VALUES
• In such case, the training values are updated using temporal
difference learning. Temporal difference (TD) learning is a concept
central to reinforcement learning, in which learning happens
through the iterative correction of your estimated returns towards
a more accurate target return.
• Let Successor(b) denotes the next board state following b for which
it is again the program’s turn to move. ^V is the learner’s current
approximation to V.
• Using this information, assign the training value V_train(b) for
any intermediate board state b as follows:
V_train(b) ← ^V(Successor(b))
• For example, V_train(b1) ← ^V(b3), where b3 is the successor of b1.
Once the game is played, the training data is generated, and for each
training example the value V_train(b) is computed.
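A minimal sketch of this estimation step follows. It assumes, as an illustrative convention, that a game trace is given as the list of board-feature vectors at the positions where it was the program's turn to move, and that the final outcome value (+100, -100, or 0) is supplied separately.

```python
# Hedged sketch: assign V_train(b) <- ^V(Successor(b)) for intermediate
# states, and the true outcome value for the final state of the game.

def estimate_training_values(trace, v_hat, final_value):
    examples = []
    for i, b in enumerate(trace):
        if i + 1 < len(trace):
            # V_train(b) <- ^V(Successor(b))
            examples.append((b, v_hat(trace[i + 1])))
        else:
            # clear win/loss/draw: +100, -100, or 0
            examples.append((b, final_value))
    return examples
```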
Adjusting the weights
• Now it is time to define the learning algorithm
for choosing the weights that best fit the set of
training examples.
• One common approach is to define the best
hypothesis as the one that minimizes the
squared error E between the training values
and the values predicted by the hypothesis ^V:
E ≡ Σ (V_train(b) − ^V(b))², summed over all training examples <b, V_train(b)>.
Adjusting the weights
• The learning algorithm should incrementally refine weights as more
training examples become available and it needs to be robust to errors in
training data
• Least Mean Square (LMS) training rule is the one training algorithm that
will adjust weights a small amount in the direction that reduces the error.
• The LMS algorithm is defined as follows: for each training example
<b, V_train(b)>, use the current weights to calculate ^V(b), then update
each weight as
wi ← wi + η (V_train(b) − ^V(b)) xi(b)
where η is a small constant (e.g., 0.1) that moderates the size of the weight update.
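A minimal sketch of one pass of this update rule, in the same feature/weight layout used earlier:

```python
# Hedged sketch of one pass of the LMS weight-update rule; eta is the
# small moderating constant (e.g., 0.1) from the rule above.

def lms_pass(train_examples, weights, eta=0.1):
    for features, v_train in train_examples:
        # compute ^V(b) with the current weights
        v_hat = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
        error = v_train - v_hat
        weights[0] += eta * error                 # w0: its feature x0 is always 1
        for i, x in enumerate(features):
            weights[i + 1] += eta * error * x     # wi <- wi + eta*(V_train - ^V)*xi
    return weights
```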
Final Design for Checkers Learning System
The Performance System — Takes a
new board as input and outputs a
trace of the game it played against
itself.
The Critic — Takes the trace of a
game as an input and outputs a set of
training examples of the target
function.
The Generalizer — Takes training
examples as input and outputs a
hypothesis that estimates the target
function. Good generalization to new
cases is crucial.
The Experiment Generator — Takes
the current hypothesis (currently
learned function) as input and
outputs a new problem (an initial
board state) for the performance
system to explore.
Summary of choices in designing the
checkers learning program
Concept Learning
• Much of learning involves acquiring general concepts
from specific training examples.
• People, for example, continually learn general concepts
or categories such as "bird," "car," "situations in which I
should study more in order to pass the exam," etc.
• Each such concept can be viewed as describing some
subset of objects or events defined over a larger set
• (e.g., the subset of animals that constitute birds).
• Alternatively, each concept can be thought of as a
boolean-valued function defined over this larger set
• (e.g., a function defined over all animals, whose value
is true for birds and false for other animals).
Concept Learning
• In this chapter we consider the problem of
automatically inferring the general definition of
some concept, given examples labeled as
members or nonmembers of the concept.
• This task is commonly referred to as concept
learning, or approximating a boolean-valued
function from examples.
• Concept learning.
– Inferring a boolean-valued function from training
examples of its input and output.
A CONCEPT LEARNING TASK
• To ground our discussion of concept learning, consider the example task of
learning the target concept "days on which my friend Aldo enjoys his
favorite water sport." Table 2.1 describes a set of example days, each
represented by a set of attributes.
• The attribute EnjoySport indicates whether or not Aldo enjoys his favorite
water sport on this day.
• The task is to learn to predict the value of EnjoySport for an arbitrary day,
based on the values of its other attributes.
• What hypothesis representation shall we provide to the learner in this
case?
• Let us begin by considering a simple representation in which each
hypothesis consists of a conjunction of constraints on the instance
attributes.
• In particular, let each hypothesis be a vector of six constraints, specifying
the values of the six attributes Sky, AirTemp, Humidity, Wind, Water, and
Forecast. For each attribute, the hypothesis will either
– indicate by "?" that any value is acceptable for this attribute,
– specify a single required value (e.g., Warm) for the attribute, or
– indicate by "∅" that no value is acceptable.
A CONCEPT LEARNING TASK
• If some instance x satisfies all the constraints
of hypothesis h, then h classifies x as a
positive example (h(x) = 1).
• To illustrate, the hypothesis that Aldo enjoys
his favorite sport only on cold days with high
humidity (independent of the values of the
other attributes) is represented by the
expression
(?, Cold, High, ?, ?, ?)
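A minimal sketch of how such a hypothesis classifies an instance, using tuples of attribute strings with "?" for "any value" and "0" standing in for the empty constraint ∅ (the string encoding is an assumption for illustration):

```python
# h(x) = 1 iff instance x satisfies every constraint of hypothesis h.

def h_of_x(h, x):
    if "0" in h:                     # the empty constraint rejects everything
        return 0
    return int(all(c == "?" or c == v for c, v in zip(h, x)))

h = ("?", "Cold", "High", "?", "?", "?")                  # the hypothesis above
x = ("Sunny", "Cold", "High", "Strong", "Warm", "Same")   # an example day
print(h_of_x(h, x))                                       # 1: x satisfies h
```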
A CONCEPT LEARNING TASK
Table 2.1: Positive and negative training examples for the target concept EnjoySport.

Example | Sky | AirTemp | Humidity | Wind | Water | Forecast | EnjoySport
1 | Sunny | Warm | Normal | Strong | Warm | Same | Yes
2 | Sunny | Warm | High | Strong | Warm | Same | Yes
3 | Rainy | Cold | High | Strong | Warm | Change | No
4 | Sunny | Warm | High | Strong | Cool | Change | Yes
• When learning the target concept, the learner is presented a set of training
examples, each consisting of an instance x from X, along with its target concept
value c(x) (e.g., the training examples in Table 2.1).
• Instances for which c(x) = 1 are called positive examples, or members of the target concept.
• Instances for which c(x) = 0 are called negative examples, or nonmembers of the target
concept.
• We will often write the ordered pair (x, c(x)) to describe the training example consisting of
the instance x and its target concept value c(x).
• We use the symbol D to denote the set of available training examples.
• Given a set of training examples of the target concept c, the problem faced by the learner is
to hypothesize, or estimate, c.
• We use the symbol H to denote the set of all possible hypotheses that the learner may
consider regarding the identity of the target concept.
• Usually H is determined by the human designer's choice of hypothesis representation.
• In general, each hypothesis h in H represents a boolean-valued function defined over X;
• that is, h : X → {0, 1}.
• The goal of the learner is to find a hypothesis h such that h(x) = c(x) for all x in X.
The Inductive Learning Hypothesis
• Any hypothesis found to approximate the
target function well over a sufficiently large
set of training examples will also approximate
the target function well over other
unobserved examples.
CONCEPT LEARNING AS SEARCH
• General-to-Specific Ordering of Hypotheses
• For example, consider the two hypotheses
h1 = (Sunny, ?, ?, Strong, ?, ?)
h2 = (Sunny, ?, ?, ?, ?, ?)
• Now consider the sets of instances that are classified positive by h1 and
by h2.
• Because h2 imposes fewer constraints on the instance, it classifies more
instances as positive.
• In fact, any instance classified positive by h1 will also be classified
positive by h2. Therefore, we say that h2 is more general than h1.
Figure 2.1: Instances, hypotheses, and the more-general-than relation.
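For conjunctive hypotheses in this representation, the more-general-than-or-equal-to relation reduces to an attribute-wise check, as in this minimal sketch (tuple encoding as before):

```python
# h_j >=g h_k iff every instance satisfying h_k also satisfies h_j;
# attribute-wise, each constraint of h_j must be "?" or match h_k's.

def more_general_or_equal(hj, hk):
    return all(a == "?" or a == b for a, b in zip(hj, hk))

h1 = ("Sunny", "?", "?", "Strong", "?", "?")
h2 = ("Sunny", "?", "?", "?", "?", "?")
print(more_general_or_equal(h2, h1))   # True: h2 is more general than h1
print(more_general_or_equal(h1, h2))   # False
```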
FIND-S Algorithm
FIND-S
• To illustrate this algorithm, assume the learner
is given the sequence of training examples
from Table 2.1 for the EnjoySport task.
• The first step of FIND-S is to initialize h to the
most specific hypothesis in H:
h ← (∅, ∅, ∅, ∅, ∅, ∅)
• Upon observing the first positive training example, h is generalized to
h ← (Sunny, Warm, Normal, Strong, Warm, Same)
This h is still very specific; it asserts that all instances are negative except for
the single positive training example we have observed.
FIND-S
• Next, the second training example (also
positive in this case) forces the algorithm to
further generalize h,
• this time substituting a "?" in place of any
attribute value in h that is not satisfied by the
new example.
• The refined hypothesis in this case is
h ← (Sunny, Warm, ?, Strong, Warm, Same)
FIND-S
• Upon encountering the third training example, in
this case a negative example, the algorithm makes
no change to h.
• In fact, the FIND-S algorithm simply ignores every
negative example!
• While this may at first seem strange, notice that
in the current case our hypothesis h is already
consistent with the new negative example
• (i.e., h correctly classifies this example as
negative), and hence no revision is needed
FIND-S
• In the general case, as long as we assume that the hypothesis space
H contains a hypothesis that describes the true target concept c
and that the training data contains no errors, then the current
hypothesis h can never require a revision in response to a negative
example.
• To see why, recall that the current hypothesis h is the most specific
hypothesis in H consistent with the observed positive examples.
• Because the target concept c is also assumed to be in H and to be
consistent with the positive training examples, c must be
more-general-than-or-equal-to h.
• But the target concept c will never cover a negative example, thus
neither will h (by the definition of more-general-than).
• Therefore, no revision to h will be required in response to any
negative example.
FIND-S
• To complete our trace of FIND-S, the fourth
(positive) example leads to a further
generalization of h:
h ← (Sunny, Warm, ?, Strong, ?, ?)
FIND-S
• The FIND-S algorithm illustrates one way in which the
more-general than partial ordering can be used to organize
the search for an acceptable hypothesis.
• The search moves from hypothesis to hypothesis, searching
from the most specific to progressively more general
hypotheses along one chain of the partial ordering.
• Figure 2.2 illustrates this search in terms of the instance
and hypothesis spaces.
• At each step, the hypothesis is generalized only as far as
necessary to cover the new positive example.
• Therefore, at each stage the hypothesis is the most specific
hypothesis consistent with the training examples observed
up to this point (hence the name FIND-S)
FIND-S
• The key property of the FIND-S algorithm is that
for hypothesis spaces described by conjunctions
of attribute constraints (such as H for the
EnjoySport task),
• FIND-S is guaranteed to output the most specific
hypothesis within H that is consistent with the
positive training examples.
• Its final hypothesis will also be consistent with
the negative examples provided the correct
target concept is contained in H, and provided
the training examples are correct.
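The entire trace above fits in a few lines of Python. The following is a minimal sketch (not part of the original notes), using "0" for the empty constraint ∅ and "?" for "any value":

```python
# Minimal sketch of FIND-S for conjunctive hypotheses over tuples of
# attribute values; negative examples are simply skipped.

def find_s(examples):
    n = len(examples[0][0])
    h = ["0"] * n                          # most specific hypothesis in H
    for x, positive in examples:
        if not positive:                   # FIND-S ignores negative examples
            continue
        for i, value in enumerate(x):
            if h[i] == "0":                # first positive example: copy values
                h[i] = value
            elif h[i] != value:            # constraint not satisfied: generalize
                h[i] = "?"
    return h

# EnjoySport data from Table 2.1 (Sky, AirTemp, Humidity, Wind, Water, Forecast)
data = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High", "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High", "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High", "Strong", "Cool", "Change"), True),
]
print(find_s(data))   # ['Sunny', 'Warm', '?', 'Strong', '?', '?']
```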
Problems in FIND-S
• However, there are several questions still left unanswered
by this learning algorithm, such as:
• Has the learner converged to the correct target concept?
• Although FIND-S will find a hypothesis consistent with the
training data, it has no way to determine whether it has
found the only hypothesis in H consistent with the data
(i.e., the correct target concept), or
• whether there are many other consistent hypotheses as
well.
• We would prefer a learning algorithm that could determine
whether it had converged and, if not, at least characterize
its uncertainty regarding the true identity of the target
concept.
Problems in FIND-S
• Why prefer the most specific hypothesis?
• In case there are multiple hypotheses
consistent with the training examples, FIND-S
will find the most specific.
• It is unclear whether we should prefer this
hypothesis over, say, the most general, or
some other hypothesis of intermediate
generality.
Problems in FIND-S
• Are the training examples consistent?
• In most practical learning problems there is some
chance that the training examples will contain at
least some errors or noise.
• Such inconsistent sets of training examples can
severely mislead FIND-S, given the fact that it
ignores negative examples.
• We would prefer an algorithm that could at least
detect when the training data is inconsistent and,
preferably, accommodate such errors.
Problems in FIND-S
• What if there are several maximally specific consistent hypotheses?
• In the hypothesis language H for the EnjoySport task, there is
always a unique, most specific hypothesis consistent with any set
of positive examples.
• However, for other hypothesis spaces (discussed later) there can be
several maximally specific hypotheses consistent with the data.
• In this case, FIND-S must be extended to allow it to backtrack on its
choices of how to generalize the hypothesis,
– to accommodate the possibility that the target concept lies along a
different branch of the partial ordering than the branch it has
selected.
• Furthermore,
• we can define hypothesis spaces for which there is no maximally
specific consistent hypothesis, although this is more of a theoretical
issue than a practical one (see Exercise 2.7).
VERSION SPACES AND THE CANDIDATE-ELIMINATION
ALGORITHM
• This section describes a second approach to concept learning, the
CANDIDATE-ELIMINATION algorithm,
• which addresses several of the limitations of FIND-S.
• Notice that although FIND-S outputs a hypothesis from H that is consistent with
the training examples,
– this is just one of many hypotheses from H that might fit the training data equally well.
• The key idea in the CANDIDATE-ELIMINATION Algorithm is to output a description
of the set of all hypotheses consistent with the training examples.
• Surprisingly, the CANDIDATE-ELIMINATION Alg computes the description of this
set without explicitly enumerating all of its members.
• This is accomplished by again using the more-general-than partial ordering,
• this time to maintain a compact representation of the set of consistent hypotheses
and to incrementally refine this representation as each new training example is
encountered.
Representation
• The CANDIDATE-ELIMINATION Alg finds all describable
hypotheses that are consistent with the observed training
examples.
• In order to define this algorithm precisely, we begin with a
few basic definitions.
• First, let us say that a hypothesis is consistent with the training
examples if it correctly classifies these examples:
Consistent(h, D) ≡ (∀<x, c(x)> ∈ D) h(x) = c(x)
Satisfy vs Consistent
• An example x is said to satisfy hypothesis h
when h(x) = 1, regardless of whether x is a
positive or negative example of the target
concept.
• However, whether such an example is
consistent with h depends on the target
concept, and in particular, whether h(x) = c(x)
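A minimal sketch of this distinction, using the tuple encoding from earlier: "x satisfies h" ignores the label, while "h is consistent with D" requires h(x) = c(x) on every example.

```python
def satisfies(x, h):
    # label-free check: does x meet every constraint of h?
    return all(c == "?" or c == v for c, v in zip(h, x))

def consistent(h, examples):
    # h must agree with the label c(x) on every training example
    return all(satisfies(x, h) == label for x, label in examples)

h = ("Sunny", "?", "?", "?", "?", "?")
x = ("Sunny", "Cold", "High", "Strong", "Warm", "Change")
print(satisfies(x, h))              # True: x satisfies h...
print(consistent(h, [(x, False)]))  # False: ...but h is inconsistent with (x, No)
```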
Version Space (VS)
• The CANDIDATE-ELIMINATION Alg represents the set of all
hypotheses consistent with the observed training examples.
• This subset of all hypotheses is called the version space with
respect to the hypothesis space H and the training examples
D, because it contains all plausible versions of the target
concept:
VS_{H,D} ≡ {h ∈ H | Consistent(h, D)}
The LIST-THEN-ELIMINATE Algorithm
1. VersionSpace ← a list containing every hypothesis in H
2. For each training example <x, c(x)>: remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
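A hedged sketch of this brute-force enumeration over the conjunctive hypothesis space follows; it is feasible only for tiny H, which is the algorithm's known weakness.

```python
from itertools import product

def list_then_eliminate(examples, attr_values):
    def h_matches(h, x):
        return all(c == "?" or c == v for c, v in zip(h, x))
    # Enumerate H: each constraint is "?" or one specific value. (Hypotheses
    # containing the empty constraint classify everything negative and are
    # removed by the first positive example, so they are omitted here.)
    H = product(*[["?"] + list(vals) for vals in attr_values])
    return [h for h in H
            if all(h_matches(h, x) == label for x, label in examples)]
```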
A More Compact Representation for
Version Spaces
• The CANDIDATE-ELIMINATION algorithm works on
the same principle as the LIST-THEN-ELIMINATE
algorithm above.
• However, it employs a much more compact
representation of the version space.
• In particular, the version space is represented by
its most general and least general members.
• These members form general and specific
boundary sets that delimit the version space
within the partially ordered hypothesis space.
A More Compact Representation for
Version Spaces
• Definition: the general boundary G, with respect to hypothesis space H
and training data D, is the set of maximally general members of H
consistent with D.
• Definition: the specific boundary S, with respect to hypothesis space H
and training data D, is the set of minimally general (i.e., maximally
specific) members of H consistent with D.
• To illustrate this representation for version spaces,
• consider again the Enjoysport concept learning problem described in Table 2.2.
• Recall that given the four training examples from Table 2.1, FIND-S outputs the
hypothesis
h = (Sunny, Warm, ?, Strong, ?, ?)
• In fact, this is just one of six different hypotheses from H that are consistent with
these training examples. All six hypotheses are shown in Figure 2.3.
• They constitute the version space relative to this set of data and this hypothesis
representation.
A More Compact Representation for
Version Spaces
• The CANDIDATE-ELIMINATION Alg represents
• the version space by storing only its most general
members (labeled G in Figure 2.3) and its most
specific (labeled S in the figure).
• Given only these two sets S and G, it is possible to
enumerate all members of the version space as
needed by generating the hypotheses that lie
between these two sets in the general-to-specific
partial ordering over hypotheses.
A More Compact Representation for
Version Spaces

As long as the sets G and S are well defined (see Exercise 2.7), they
completely specify the version space.
In particular, we can show that the version space is precisely the set of
hypotheses contained in G, plus those contained in S, plus
those that lie between G and S in the partially ordered hypothesis space
CANDIDATE-ELIMINATION Algorithm
• Initialize G to the set of maximally general hypotheses in H, and S to the
set of maximally specific hypotheses in H.
• For each positive example d: remove from G any hypothesis inconsistent
with d; replace each inconsistent member of S by its minimal
generalizations that are consistent with d and covered by some member of G.
• For each negative example d: remove from S any hypothesis inconsistent
with d; replace each inconsistent member of G by its minimal
specializations that are consistent with d and cover some member of S.
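A compact Python sketch of these boundary updates for the conjunctive hypothesis space is given below; the data layout ("?" for any value, "0" for the empty constraint, attr_values listing each attribute's legal values) is an illustrative assumption. On the four EnjoySport examples it reproduces S4 = {(Sunny, Warm, ?, Strong, ?, ?)} and G4 = {(Sunny, ?, ?, ?, ?, ?), (?, Warm, ?, ?, ?, ?)}.

```python
def matches(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def geq(h1, h2):
    # h1 is more-general-than-or-equal-to h2 (attribute-wise check)
    return all(a == "?" or a == b for a, b in zip(h1, h2))

def candidate_elimination(examples, attr_values):
    n = len(attr_values)
    S = [tuple(["0"] * n)]                  # most specific boundary
    G = [tuple(["?"] * n)]                  # most general boundary
    for x, positive in examples:
        if positive:
            G = [g for g in G if matches(g, x)]
            new_S = []
            for s in S:
                if matches(s, x):
                    new_S.append(s)
                    continue
                # minimal generalization of s that covers x
                h = tuple(xv if sv == "0" else (sv if sv == xv else "?")
                          for sv, xv in zip(s, x))
                if any(geq(g, h) for g in G):   # must stay below some g in G
                    new_S.append(h)
            S = new_S                            # singleton in this space
        else:
            S = [s for s in S if not matches(s, x)]
            new_G = set()
            for g in G:
                if not matches(g, x):
                    new_G.add(g)
                    continue
                # minimal specializations of g that exclude x
                for i, vals in enumerate(attr_values):
                    if g[i] != "?":
                        continue
                    for v in vals:
                        if v != x[i]:
                            h = g[:i] + (v,) + g[i + 1:]
                            if any(geq(h, s) for s in S):  # must cover some s
                                new_G.add(h)
            # keep only the maximally general members
            G = [g for g in new_G
                 if not any(g2 != g and geq(g2, g) for g2 in new_G)]
    return S, G
```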
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• CANDIDATE-ELIMINATION Trace 1: S0
and G0 are the initial boundary sets
corresponding to the most
specific and most general hypotheses.
• Training examples 1 and 2 force the S
boundary to become
more general, as in the FIND-S
algorithm. They have no effect on the
G boundary.
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• As illustrated by these first two steps, positive training examples
may force the S boundary of the version space to become
increasingly general.
• Negative training examples play the complementary role of forcing
the G boundary to become increasingly specific.
• Consider the third training example, shown in Figure 2.5.
• This negative example reveals that the G boundary of the version
space is overly general; that is, the hypothesis in G incorrectly
predicts that this new example is a positive example.
• The hypothesis in the G boundary must therefore be specialized
until it correctly classifies this new negative example.
• As shown in Figure 2.5, there are several alternative minimally more
specific hypotheses.
• All of these become members of the new G3 boundary set.
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• Given that there are six attributes that could be
specified to specialize G2,
• why are there only three new hypotheses in G3?
• For example, the hypothesis
• h = (?, ?, Normal, ?, ?, ?) is a minimal
specialization of G2 that correctly labels the new
example as a negative example, but it is not
included in G3.
• The reason this hypothesis is excluded is that it is
inconsistent with the previously encountered
positive examples.
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• The fourth training example, as shown in
Figure 2.6, further generalizes the
• S boundary of the version space.
• It also results in removing one member of the
G boundary, because this member fails to
cover the new positive example.
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
• After processing these four examples, the
boundary sets S4 and G4 delimit the version
space of all hypotheses consistent with the set
of incrementally observed training examples.
The entire version space, including those
hypotheses bounded by S4 and G4, is shown
in Figure 2.7.
CANDIDATE-ELIMINATION Algorithm -
An Illustrative Example
A Biased Hypothesis Space
Permutation Problem – Syntactically
Distinct Hypotheses
The Futility of Bias-Free Learning
• The above discussion illustrates a fundamental property of inductive
inference:
• a learner that makes no a priori assumptions regarding the identity of the
target concept has no rational basis for classifying any unseen instances.
• In fact,
• the only reason that the CANDIDATE-ELIMINATION Alg was able to
generalize beyond the observed training examples in our original
formulation of the EnjoySport task is that it was biased by the
implicit assumption that the target concept could be represented by a
conjunction of attribute values.
• In cases where this assumption is correct (and the training examples are
error-free), its classification of new instances will also be correct.
• If this assumption is incorrect, however, it is certain that the CANDIDATE-
ELIMINATION algorithm will misclassify at least some instances from X.
Machine Learning
Machine learning algorithms are classified into four categories as defined below:
• Supervised Learning Algorithms
– Labeled training data
• Unsupervised Learning Algorithms
– No labeled training data
– Clustering based on characteristics
• Reinforcement Learning Algorithms
– Here the prediction is not one single value, but a set of values.
– Another definition is: Reinforcement learning algorithms are
algorithms that have to take sequential actions (decisions) to
maximize a cumulative reward
• Evolutionary Learning Algorithms
– Evolutionary algorithms are algorithms that imitate natural evolution to
solve a problem. Techniques such as genetic algorithm and ant colony
optimization fall under the category of evolutionary learning
algorithms.
Why Machine Learning
• Machine learning algorithms can be used for identifying the factors that
influence the key performance indicators, which can be further used for
decision making and value creation.
• Organizations such as Amazon, Apple, Capital One, General Electric,
Google, IBM, Facebook, Procter and Gamble and so on use ML algorithms
to create new products and solutions.
FRAMEWORK FOR DEVELOPING
MACHINE LEARNING MODELS
• The framework for ML algorithm development
can be divided into five integrated stages:
• problem and opportunity identification,
• collection of relevant data,
• data pre-processing,
• ML model building, and
• Model deployment.
• The various activities carried out during these
different stages are described in Figure 1.2
The success of ML projects will depend
on the following activities:
• Feature Extraction
• Feature Engineering
• Model Building and Feature Selection
• Model Deployment
Issues in Machine Learning
Module 1 Questions
1. Discuss the design choice of choosing training experience for checkers learning
problem
2. Discuss the process of choosing the target function & its representation for
checkers learning problem
3. Discuss the process of choosing a function approximation algorithm for checkers
learning problem
4. With a neat diagram explain the stages of checkers learning program design
5. Explain the following
a)Concept Learning
b)Inductive learning hypothesis
c) Consistent Hypothesis
d)Version Space
6. By considering the dataset given in figure 1, illustrate the working of the FIND-S
Algorithm to find the maximally specific hypothesis
7. By considering the dataset given in figure 1, illustrate the working of Candidate
Elimination Algorithm

Figure 1 : Dataset
Module 1 Questions
8. Explain inductive bias in the candidate elimination algorithm
9. List & explain the issues in Machine Learning
10. Explain the categories of Machine Learning Algorithms
11. Why use Machine Learning, and explain the typical steps used by
Machine Learning Algorithms
12. Explain the five integrated stages of the ML Algorithm Development Framework
13. Explain the following activities of Machine Learning & discuss the significance
of these activities in the success of a machine learning application
a)Feature Extraction
b)Feature Engineering
c)Model Building and Feature Selection
d)Model Deployment
