Unit 1
1. Supervised learning
2. Unsupervised learning
3. Reinforcement learning
Supervised Learning
Machine learning has now advanced greatly, with applications such as self-driving cars, Amazon
Alexa, chatbots, recommender systems, weather prediction, disease prediction, stock market
analysis, etc.
• Concept: <x1,x2,x3,x4>
Tablet: <Large, Black, Flat, Square>
Smart Phone: <Small, Blue, Folded, Rectangle>
Number of possible instances = 2^d, where d = number of features
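For instance, the concept above has d = 4 features, each taking one of two values (Large/Small, Black/Blue, Flat/Folded, Square/Rectangle), so there are 2^4 = 16 possible instances.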
(Diagram: all hypotheses versus the version space — successive training examples narrow the candidate hypotheses, e.g. from {h1, h2, h3, h4, h5} to the version space {h3, h4, h5} and finally to {h5}.)
Candidate Elimination Algorithm
• The candidate elimination algorithm incrementally builds the version
space given a hypothesis space H and a set E of examples.
• The examples are added one by one; each example possibly shrinks
the version space by removing the hypotheses that are inconsistent
with the example.
• The candidate elimination algorithm does this by updating the general
and specific boundaries for each new example.
• This can be considered an extended form of the Find-S algorithm.
• It considers both positive and negative examples.
• Positive examples are used here as in the Find-S algorithm: they
generalize the specific boundary (see the Find-S sketch below).
• Negative examples, in contrast, are used to specialize the general
boundary.
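Since the algorithm extends Find-S, a minimal Python sketch of Find-S may make the comparison concrete; the find_s helper and the tablet/phone-style training tuples below are illustrative assumptions, not part of the original notes.

# Minimal Find-S sketch: start from the most specific hypothesis and
# generalize it just enough to cover each positive training example.
def find_s(examples):
    # examples: list of (attribute_tuple, label) pairs, label is "yes" or "no"
    hypothesis = None                      # None stands for the most specific hypothesis (all ɸ)
    for attributes, label in examples:
        if label != "yes":                 # Find-S ignores negative examples
            continue
        if hypothesis is None:             # the first positive example is copied as-is
            hypothesis = list(attributes)
        else:                              # any disagreeing attribute is generalized to '?'
            hypothesis = [h if h == a else "?" for h, a in zip(hypothesis, attributes)]
    return hypothesis

# Hypothetical training data in the spirit of the Tablet / Smart Phone concept above.
data = [
    (("Large", "Black", "Flat", "Square"), "yes"),
    (("Large", "Blue", "Flat", "Square"), "yes"),
    (("Small", "Blue", "Folded", "Rectangle"), "no"),
]
print(find_s(data))   # -> ['Large', '?', 'Flat', 'Square']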
• Terms Used:
• Concept learning: the basic learning task of the machine, i.e., inferring
a concept from the training data.
• General hypothesis: does not constrain any feature;
G = {‘?’, ‘?’, ‘?’, ‘?’, …}, with one ‘?’ per attribute.
• Specific hypothesis: constrains the features to specific values;
S = {ɸ, ɸ, ɸ, …, ɸ}, with one ɸ per attribute.
• Version space: the intermediate set between the general and the
specific hypothesis; it is not a single hypothesis but the set of all
hypotheses consistent with the training data set.
• Example:
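A minimal Python sketch of the candidate elimination algorithm, in a simplified classroom form for conjunctive hypotheses; the candidate_elimination helper and the training data are illustrative assumptions, and ɸ is written as "0" in the code.

# Simplified candidate elimination sketch: positive examples generalize S,
# negative examples specialize members of G (illustrative data below).
def consistent(h, x):
    # a hypothesis covers an instance if every constrained attribute matches
    return all(hv == "?" or hv == xv for hv, xv in zip(h, x))

def candidate_elimination(examples, n_attrs):
    S = ["0"] * n_attrs            # most specific boundary (all ɸ, written "0" here)
    G = [["?"] * n_attrs]          # most general boundary
    for x, label in examples:
        if label == "yes":
            # drop general hypotheses inconsistent with the positive example
            G = [g for g in G if consistent(g, x)]
            # generalize S just enough to cover x (as in Find-S)
            for i, v in enumerate(x):
                if S[i] == "0":
                    S[i] = v
                elif S[i] != v:
                    S[i] = "?"
        else:
            # specialize members of G so they no longer cover the negative example
            new_G = []
            for g in G:
                if not consistent(g, x):
                    new_G.append(g)
                    continue
                for i in range(n_attrs):
                    # only specialize on attributes where S disagrees with the negative x
                    if g[i] == "?" and S[i] not in ("?", "0") and S[i] != x[i]:
                        spec = list(g)
                        spec[i] = S[i]
                        new_G.append(spec)
            G = new_G
    return S, G

# Hypothetical data in the spirit of the Tablet / Smart Phone concept above.
data = [
    (("Large", "Black", "Flat", "Square"), "yes"),
    (("Small", "Blue", "Folded", "Rectangle"), "no"),
    (("Large", "Blue", "Flat", "Square"), "yes"),
]
S, G = candidate_elimination(data, 4)
print("S =", S)   # -> S = ['Large', '?', 'Flat', 'Square']
print("G =", G)   # remaining general hypotheses consistent with all examples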
Inductive Bias
• The phrase “inductive bias” refers to a collection of (explicit or implicit)
assumptions made by a learning algorithm in order to perform induction, i.e., to
generalize a limited set of observations (training data) into a general model of the
domain.
• From the candidate elimination algorithm we obtain two boundary hypotheses, one
specific and one general, as the final solution.
• Now we need to check whether the hypotheses obtained from the algorithm are
actually correct, and also make decisions such as which training examples the
machine should learn next.
The fundamental questions for inductive inference:
• What happens if the target concept isn’t in the hypothesis space?
• Is it possible to avoid this problem by adopting a hypothesis space that
contains all potential hypotheses?
• What effect does the size of the hypothesis space have on the
algorithm’s capacity to generalize to unseen instances?
• What effect does the size of the hypothesis space have on the number
of training instances required?
• Inductive Learning:
• This basically means learning from examples.
• In inductive learning we are given input samples (x) and output samples
(f(x)), and the objective is to estimate the function f, i.e., rules are
derived from the examples.
• The goal is to generalize from the samples and learn the mapping so that the
output can be estimated for fresh samples in the future.
• Examples:
Assessment of credit risk:
• x represents the customer’s properties.
• f(x) represents whether or not credit has been accepted.
• For example,
• x > y means that y is inductively deduced from x.
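As a rough illustration of estimating f from (x, f(x)) pairs, the sketch below fits a decision tree to hypothetical credit data; the customer attributes, the toy labels, and the use of scikit-learn are assumptions made here for illustration only.

# Inductive learning sketch: estimate f from (x, f(x)) pairs (hypothetical data).
from sklearn.tree import DecisionTreeClassifier

# x: customer properties (income in thousands, years employed, number of existing loans)
X = [[45, 1, 2], [90, 6, 0], [30, 0, 3], [75, 4, 1], [120, 10, 0], [25, 2, 4]]
# f(x): whether credit was accepted (1 = yes, 0 = no)
y = [0, 1, 0, 1, 1, 0]

model = DecisionTreeClassifier(max_depth=2)   # the learned approximation of f
model.fit(X, y)

# Generalize: estimate the output for a fresh customer not seen during training.
print(model.predict([[60, 3, 1]]))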
• Types of Inductive Bias:
• Maximum conditional independence: It aims to maximize conditional
independence if the hypothesis can be put in a Bayesian framework. The Naive
Bayes classifier employs this bias.
• Minimum cross-validation error: Select the hypothesis with the lowest cross-
validation error when deciding between hypotheses.
• Maximum margin: When drawing a boundary between two classes, try to make the
margin as wide as possible. This is the bias used in support vector machines. The
assumption is that distinct classes tend to be separated by wide margins.
• Minimum hypothesis description length: When constructing a hypothesis, try to
keep the description as short as possible.
• Minimum features: features should be removed unless there is strong
evidence that they are helpful. Feature selection methods are based on
this premise.
• Nearest neighbors: Assume that the majority of the examples in a
local neighborhood of feature space belong to the same class.
• If the class of a case is unknown, assume that it belongs to the same
class as the majority of the cases in its immediate neighborhood.
• The k-nearest neighbors algorithm employs this bias: cases that are
close to each other are assumed to belong to the same class (see the sketch below).
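To make the nearest-neighbor bias concrete, here is a minimal plain-Python k-NN sketch; the 2-D toy points and the choice k = 3 are hypothetical.

# k-nearest neighbors sketch: an unlabeled case gets the majority class
# of its k closest neighbors in feature space.
from collections import Counter
import math

def knn_predict(train, query, k=3):
    # train: list of (point, label); query: point; returns the majority label
    neighbors = sorted(train, key=lambda pl: math.dist(pl[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

# Hypothetical 2-D toy data: two loose clusters, one per class.
train = [((1.0, 1.1), "A"), ((1.2, 0.9), "A"), ((0.8, 1.0), "A"),
         ((5.0, 5.2), "B"), ((5.1, 4.9), "B"), ((4.8, 5.0), "B")]

print(knn_predict(train, (1.1, 1.0)))   # -> A (its near vicinity is class A)
print(knn_predict(train, (5.0, 5.0)))   # -> B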