
UNIT IV

INSTANCE BASED LEARNING


1. WRITE SHORT NOTES ON INSTANCE BASED LEARNING. (PART – B)
Introduction
 Instance-based learning methods such as nearest neighbor and locally weighted regression are conceptually straightforward approaches to approximating real-valued or discrete-valued target functions.
 Learning in these algorithms consists of simply storing the presented training data.
 Instance-based approaches can construct a different approximation to the target function for each
distinct query instance that must be classified
Advantages:
1. Training is very fast
2. Can learn complex target functions
3. Don't lose information
Disadvantages:
 The cost of classifying new instances can be high.
 A weakness of many instance-based approaches, especially nearest-neighbor approaches, is that they typically consider all attributes of the instances when retrieving similar training examples from memory; if only a few attributes are actually relevant, truly similar instances may appear far apart.
K-NEAREST NEIGHBOR LEARNING
2. WRITE DOWN THE ALGORITHM FOR K-NEAREST NEIGHBOR LEARNING. (PART – C)
3. WRITE SHORT NOTES ON DISTANCE WEIGHTED NEAREST NEIGHBOR LEARNING.
(PART – B)
 The most basic instance-based method is the k-Nearest Neighbor algorithm. This algorithm assumes all instances correspond to points in the n-dimensional space R^n.
 The nearest neighbors of an instance are defined in terms of the standard Euclidean distance.
 Let an arbitrary instance x be described by the feature vector

$x = \langle a_1(x), a_2(x), \ldots, a_n(x) \rangle$

where a_r(x) denotes the value of the rth attribute of instance x.


 Then the distance between two instances x_i and x_j is defined to be d(x_i, x_j), where

$d(x_i, x_j) \equiv \sqrt{\sum_{r=1}^{n} \left( a_r(x_i) - a_r(x_j) \right)^2}$
 In nearest-neighbor learning the target function may be either discrete-valued or real- valued.
 Let us first consider learning discrete-valued target functions of the form

$f : \mathbb{R}^n \rightarrow V$

where V is the finite set {v_1, . . . , v_s}.


 The k-Nearest Neighbor algorithm for approximating a discrete-valued target function is given below:

Training algorithm:
 For each training example ⟨x, f(x)⟩, add the example to the list training_examples.

Classification algorithm:
 Given a query instance x_q to be classified, let x_1 … x_k denote the k instances from training_examples that are nearest to x_q, and return

$\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} \delta(v, f(x_i))$

where δ(a, b) = 1 if a = b and δ(a, b) = 0 otherwise.

 The value f̂(x_q) returned by this algorithm as its estimate of f(x_q) is just the most common value of f among the k training examples nearest to x_q.
 If k = 1, then the 1-Nearest Neighbor algorithm assigns to f̂(x_q) the value f(x_i), where x_i is the training instance nearest to x_q.
 For larger values of k, the algorithm assigns the most common value among the k-nearest training
examples.
 The figure below illustrates the operation of the k-Nearest Neighbor algorithm for the case where the instances are points in a two-dimensional space and where the target function is Boolean-valued.

 The positive and negative training examples are shown by "+" and "-" respectively. A query point x_q is shown as well.
 The 1-Nearest Neighbor algorithm classifies x_q as a positive example in this figure, whereas the 5-Nearest Neighbor algorithm classifies it as a negative example.
 The figure below shows the shape of the decision surface induced by 1-Nearest Neighbor over the entire instance space.
 The decision surface is a combination of convex polyhedra surrounding each of the training
examples.

 For every training example, the polyhedron indicates the set of query points whose classification will
be completely determined by that training example.
 Query points outside the polyhedron are closer to some other training example. This kind of diagram is often called the Voronoi diagram of the set of training examples.
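
To make the procedure concrete, here is a minimal Python sketch (not part of the original notes) of the discrete-valued k-Nearest Neighbor algorithm: every training example is simply stored, and a query is classified by the most common label among its k nearest neighbors under Euclidean distance. The data points are invented for illustration and echo the figure above, where 1-NN and 5-NN disagree.

import math
from collections import Counter

def euclidean(xi, xj):
    # d(xi, xj): standard Euclidean distance between feature vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def knn_classify(training_examples, xq, k=5):
    # training_examples: list of (feature_vector, label) pairs
    # keep the k stored examples nearest to the query point xq
    neighbors = sorted(training_examples, key=lambda ex: euclidean(ex[0], xq))[:k]
    # return the most common label among those k neighbors
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

data = [((2, 2), "+"), ((5, 5), "-"), ((5, 4), "-"), ((4, 5), "-"), ((6, 5), "-")]
print(knn_classify(data, (3, 3), k=1))  # "+": the single nearest neighbor decides
print(knn_classify(data, (3, 3), k=5))  # "-": the majority of five neighbors decides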
 The k-Nearest Neighbor algorithm for approximating a real-valued target function calculates the mean value of the k nearest training examples:

$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} f(x_i)}{k}$

Distance-Weighted Nearest Neighbor Algorithm


 One refinement to the k-Nearest Neighbor algorithm is to weight the contribution of each of the k neighbors according to its distance to the query point x_q, giving greater weight to closer neighbors.
 For example, in the k-Nearest Neighbor algorithm, which approximates discrete-valued target functions, we might weight the vote of each neighbor according to the inverse square of its distance from x_q.
 The Distance-Weighted Nearest Neighbor algorithm for approximating a discrete-valued target function replaces the simple vote with a weighted vote:

$\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i)) \quad \text{where } w_i \equiv \frac{1}{d(x_q, x_i)^2}$

 The Distance-Weighted Nearest Neighbor algorithm for approximating a real-valued target function replaces the mean with a weighted average:

$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$

Terminology
 Regression means approximating a real-valued target function.
 Residual is the error f̂(x) − f(x) in approximating the target function.
 Kernel function is the function of distance that is used to determine the weight of each training
example.
 In other words, the kernel function is the function K such that w_i = K(d(x_i, x_q)).
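
Tying these terms together, the following is a minimal sketch (not from the source text) of distance-weighted k-NN for a real-valued target function. The kernel is the inverse-square weighting described above; the small eps guard for a zero distance and the toy data are added assumptions.

import math

def inverse_square_kernel(d, eps=1e-12):
    # w_i = K(d(x_i, x_q)) = 1 / d^2; eps avoids division by zero
    # when the query coincides with a stored training point
    return 1.0 / (d ** 2 + eps)

def weighted_knn_regress(training_examples, xq, k=5, kernel=inverse_square_kernel):
    # training_examples: list of (feature_vector, real_value) pairs
    dist = lambda x: math.sqrt(sum((a - b) ** 2 for a, b in zip(x, xq)))
    neighbors = sorted(training_examples, key=lambda ex: dist(ex[0]))[:k]
    weights = [kernel(dist(x)) for x, _ in neighbors]
    # distance-weighted average of the k nearest target values
    return sum(w * fx for w, (_, fx) in zip(weights, neighbors)) / sum(weights)

data = [((0.0,), 0.0), ((1.0,), 1.2), ((2.0,), 3.9), ((3.0,), 9.1)]
print(weighted_knn_regress(data, (1.5,), k=3))  # pulled toward the two closest values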

LOCALLY WEIGHTED REGRESSION


4. WRITE DOWN THE STEPS INVOLVED IN LOCALLY WEIGHTED REGRESSION.
(PART – B)
 The phrase "locally weighted regression" is called local because the function is approximated based
only on data near the query point, weighted because the contribution of each training example is
weighted by its distance from the query point, and regression because this is the term used widely in
the statistical learning community for the problem of approximating real-valued functions.
 Given a new query instance xq, the general approach in locally weighted regression is to construct
an approximation 𝑓̂ that fits the training examples in the neighborhood surrounding xq.
 This approximation is then used to calculate the value 𝑓̂(xq), which is output as the estimated target
value for the query instance.
Locally Weighted Linear Regression
 Consider locally weighted regression in which the target function f is approximated near x_q using a linear function of the form

$\hat{f}(x) = w_0 + w_1 a_1(x) + \cdots + w_n a_n(x) \quad (1)$

where a_i(x) denotes the value of the ith attribute of the instance x.
 Gradient descent can be used to choose weights that minimize the squared error summed over the set D of training examples,

$E \equiv \frac{1}{2} \sum_{x \in D} \left( f(x) - \hat{f}(x) \right)^2 \quad (2)$

which leads to the gradient descent training rule

$\Delta w_j = \eta \sum_{x \in D} \left( f(x) - \hat{f}(x) \right) a_j(x) \quad (3)$

where η is a constant learning rate.

 We need to modify this procedure to derive a local approximation rather than a global one.
 A simple way is to redefine the error criterion E to emphasize fitting the local training examples. Three possible criteria are given below.

1. Minimize the squared error over just the k nearest neighbors:

$E_1(x_q) \equiv \frac{1}{2} \sum_{x \in \text{k nearest nbrs of } x_q} \left( f(x) - \hat{f}(x) \right)^2$

2. Minimize the squared error over the entire set D of training examples, while weighting the error of each training example by some decreasing function K of its distance from x_q:

$E_2(x_q) \equiv \frac{1}{2} \sum_{x \in D} \left( f(x) - \hat{f}(x) \right)^2 K(d(x_q, x))$

3. Combine 1 and 2:

$E_3(x_q) \equiv \frac{1}{2} \sum_{x \in \text{k nearest nbrs of } x_q} \left( f(x) - \hat{f}(x) \right)^2 K(d(x_q, x))$
 If we choose criterion three and re-derive the gradient descent rule, we obtain the following training rule:

$\Delta w_j = \eta \sum_{x \in \text{k nearest nbrs of } x_q} K(d(x_q, x)) \left( f(x) - \hat{f}(x) \right) a_j(x)$

 The differences between this new rule and the rule given by Equation (3) are that the contribution of instance x to the weight update is now multiplied by the distance penalty K(d(x_q, x)), and that the error is summed over only the k nearest training examples.
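
As an illustration (not from the source text), the sketch below implements criterion three: gradient descent on the weights of a local linear model, with the error summed over only the k nearest neighbors and each contribution multiplied by a distance penalty. The Gaussian kernel K(d) = e^(-d^2), the learning rate, the epoch count, and the data are all assumed choices.

import math

def lwr_predict(training_examples, xq, k=8, eta=0.01, epochs=500):
    dist = lambda x: math.sqrt(sum((a - b) ** 2 for a, b in zip(x, xq)))
    # restrict attention to the k nearest neighbors of the query point
    nbrs = sorted(training_examples, key=lambda ex: dist(ex[0]))[:k]
    K = lambda d: math.exp(-d ** 2)       # assumed distance penalty K(d(xq, x))
    n = len(xq)
    w = [0.0] * (n + 1)                   # w[0] is the intercept w0
    fhat = lambda x: w[0] + sum(wj * aj for wj, aj in zip(w[1:], x))
    for _ in range(epochs):
        for x, fx in nbrs:
            # delta w_j = eta * K(d(xq, x)) * (f(x) - fhat(x)) * a_j(x)
            step = eta * K(dist(x)) * (fx - fhat(x))
            w[0] += step                  # a_0(x) = 1 for the intercept
            for j in range(n):
                w[j + 1] += step * x[j]
    return fhat(xq)

data = [((i / 10,), 2 * (i / 10) + 1) for i in range(20)]  # f(x) = 2x + 1
print(lwr_predict(data, (0.95,)))  # converges near 2 * 0.95 + 1 = 2.9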

RADIAL BASIS FUNCTIONS


5. GIVE AN EXAMPLE OF RADIAL BASIS FUNCTION. (PART – B)
 One approach to function approximation that is closely related to distance-weighted regression and also to
artificial neural networks is learning with radial basis functions
 In this approach, the learned hypothesis is a function of the form

$\hat{f}(x) = w_0 + \sum_{u=1}^{k} w_u K_u(d(x_u, x)) \quad (4)$

where each x_u is an instance from X and the kernel function K_u(d(x_u, x)) is defined so that it decreases as the distance d(x_u, x) increases.
 Here k is a user-provided constant that specifies the number of kernel functions to be included.
 Even though f̂ is a global approximation to f(x), the contribution from each of the K_u(d(x_u, x)) terms is localized to a region nearby the point x_u.


 It is common to choose each kernel function K_u(d(x_u, x)) to be a Gaussian function centered at the point x_u with some variance σ_u²:

$K_u(d(x_u, x)) = e^{-\frac{1}{2\sigma_u^2} d^2(x_u, x)}$

 The functional form of Equation (4) can approximate any function with arbitrarily small error, provided a sufficiently large number k of such Gaussian kernels and provided the width σ² of each kernel can be separately specified.
 The function given by Equation (4) can be viewed as describing a two-layer network where the first layer of units computes the values of the various K_u(d(x_u, x)) and where the second layer computes a linear combination of these first-layer unit values.
Example: Radial basis function (RBF) network:
 Given a set of training examples of the target function, RBF networks are typically trained in a two-stage
process.
1. First, the number k of hidden units is determined, and each hidden unit u is defined by choosing the values of x_u and σ_u² that define its kernel function K_u(d(x_u, x)).
2. Second, the weights w_u are trained to maximize the fit of the network to the training data, using the global error criterion

$E = \frac{1}{2} \sum_{x \in D} \left( f(x) - \hat{f}(x) \right)^2$

 Because the kernel functions are held fixed during this second stage, the linear weight values w_u can be trained very efficiently.
 Several alternative methods have been proposed for choosing an appropriate number of hidden units or,
equivalently, kernel functions.
 One approach is to allocate a Gaussian kernel function for each training example ⟨x_i, f(x_i)⟩, centering this Gaussian at the point x_i. Each of these kernels may be assigned the same width σ².
 Given this approach, the RBF network learns a global approximation to the target function in which each training example ⟨x_i, f(x_i)⟩ can influence the value of f̂ only in the neighbourhood of x_i.
 A second approach is to choose a set of kernel functions that is smaller than the number of training examples.
 This approach can be much more efficient than the first approach, especially when the number of training
examples is large.
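
A minimal sketch of this two-stage process (not from the source text). Choosing the k centers as a random subset of the training points, using one shared width sigma, and solving the second stage by ordinary least squares are assumed instantiations of the second approach described above.

import numpy as np

def train_rbf(X, y, k=5, sigma=1.0, rng=np.random.default_rng(0)):
    # Stage 1: fix k Gaussian kernels by choosing centers (here a random
    # subset of training points) and a shared width sigma
    centers = X[rng.choice(len(X), size=k, replace=False)]

    def design(Xq):
        # Phi[i, u] = K_u(d(x_u, x_i)) = exp(-d^2 / (2 sigma^2));
        # the leading column of ones carries the bias weight w_0
        d2 = ((Xq[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.hstack([np.ones((len(Xq), 1)), np.exp(-d2 / (2 * sigma ** 2))])

    # Stage 2: the kernels are held fixed, so fitting the weights is an
    # ordinary linear least-squares problem, solved very efficiently
    w, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return lambda Xq: design(Xq) @ w

X = np.linspace(0, 6, 40).reshape(-1, 1)
y = np.sin(X).ravel()
f_hat = train_rbf(X, y, k=8, sigma=0.8)
print(f_hat(np.array([[1.5707]])))  # close to sin(pi/2) = 1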

CASE-BASED LEARNING
6. WRITE ABOUT CASE BASED LEARNING. (PART – C)
 In case-based learning, which is a concept often applied in machine learning, the system learns from
individual instances or cases rather than relying solely on general rules or models.
 It’s a type of lazy learning approach where the system stores and uses specific instances to make predictions
or decisions.
 In the context of machine learning, case-based learning involves:
1. Case Representation: Each instance or case is represented by a set of attributes or features. These
attributes describe the characteristics of the case.
2. Case Retrieval: When a new query or instance is presented to the system, it searches through its stored
cases to find the most similar cases to the query.
3. Case Adaptation: The system adapts the information from retrieved cases to generate a prediction or
solution for the new query.
4. Case Base Maintenance: The system might update its case base over time by adding new cases or
removing outdated ones to improve its performance.
 Case-based learning is particularly useful when there is a lack of clear rules or patterns that can be learned
from data directly.
 It’s often employed in areas where domain knowledge and context play a significant role in making
decisions.
 The success of case-based learning heavily depends on the quality of the case representation, the similarity measure, and the adaptability of retrieved cases to new situations. A minimal sketch of the retrieval step appears below.
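
The retrieval step of this cycle can be sketched as follows (illustrative only; the case base, the attribute-overlap similarity measure, and all names are hypothetical). Real CBR systems add adaptation and case-base maintenance on top of this step.

def retrieve(case_base, query, similarity, k=1):
    # rank stored cases by similarity to the query and return the k best
    return sorted(case_base, key=lambda case: similarity(case["features"], query),
                  reverse=True)[:k]

# toy case base: attribute-value case representations with stored solutions
cases = [
    {"features": {"flow": "high", "temp": "hot"},  "solution": "design-A"},
    {"features": {"flow": "low",  "temp": "hot"},  "solution": "design-B"},
    {"features": {"flow": "high", "temp": "cold"}, "solution": "design-C"},
]

def overlap(features, query):
    # similarity = fraction of attribute values shared with the query
    return sum(features[a] == v for a, v in query.items()) / len(query)

best = retrieve(cases, {"flow": "high", "temp": "hot"}, overlap, k=1)
print(best[0]["solution"])  # design-A: the exactly matching case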
A prototypical example of case-based reasoning:
 The CADET system employs case-based reasoning to assist in the conceptual design of simple mechanical
devices such as water faucets.
 It uses a library containing approximately 75 previous designs and design fragments to suggest conceptual
designs to meet the specifications of new design problems.
 Each instance stored in memory (e.g., a water pipe) is represented by describing both its structure and its qualitative function.
 New design problems are then presented by specifying the desired function and requesting the corresponding
structure.
 The problem setting is illustrated in the figure below:
 The function is represented in terms of the qualitative relationships among the water- flow levels and
temperatures at its inputs and outputs.
 In the functional description, an arrow with a "+" label indicates that the variable at the arrow head increases
with the variable at its tail.
 A "-" label indicates that the variable at the head decreases with the variable at the tail.
 Here Qc refers to the flow of cold water into the faucet, Qh to the input flow of hot water, and Qm to the single mixed flow out of the faucet.
 Tc, Th, and Tm refer to the temperatures of the cold water, hot water, and mixed water respectively.
 The variable Ct denotes the control signal for temperature that is input to the faucet, and Cf denotes the control signal for water flow.
 The controls Ct and Cf are to influence the water flows Qc and Qh, thereby indirectly influencing the faucet output flow Qm and temperature Tm.

 CADET searches its library for stored cases whose functional descriptions match the design problem.
 If an exact match is found, indicating that some stored case implements exactly the desired function, then this
case can be returned as a suggested solution to the design problem.
 If no exact match occurs, CADET may find cases that match various subgraphs of the desired functional
specification.

Reference:
1. Tom M. Mitchell, Machine Learning, McGraw-Hill Education (India) Private Limited, 2013.

PART – A (1 MARK)
1. ------------------- is an instance-based learner.
a) Eager Learner b) Lazy Learner c) Both (A) and (B) d) None of the Above

2. Machine learning has various function representations; which of the following is not a numerical function?
a) Case-based b) Neural Network c) Linear Regression d) Support Vector Machines

3. Which of the following is true about the value of k in k-Nearest Neighbor in terms of bias?
a) When you decrease k, the bias increases
b) When you increase k, the bias increases
c) Both (A) and (B)
d) None of the Above
4. Which of the following statements is false about k-Nearest Neighbor algorithm?
a) It stores all available cases and classifies new cases based on a similarity measure
b) It has been used in statistical estimation and pattern recognition
c) It cannot be used for regression
d) The input consists of the k closest training examples in the feature space
5. What are the advantages of Nearest neighbour algorithm?
a) Training is very fast b) Can learn complex target functions
c) Don’t lose information d) All of these
6. What happens if the target function is real-valued in the kNN algorithm?
a) Calculate the mean of the k nearest neighbors
b) Calculate the standard deviation of the k nearest neighbors
c) None of these
d) All of the above
7. What is/are advantage(s) of Locally Weighted Regression?
a) Pointwise approximation of complex target function
b) Earlier data has no influence on the new ones
c) Both A & B d) None of these
8. Which network is more accurate when the size of the training set is small to medium?
a) PNN/GRNN b) RBF c)K-means clustering d) None of these
9. What is/are true about RBF network?
a) A kind of supervised learning
b) Design of NN as curve fitting problem
c) Use of multidimensional surface to interpolate the test data
d) All of these
10. In the k-NN algorithm, given a set of training examples and a value k < the size of the training set (n), the algorithm predicts the class of a test example to be the:
a) Least frequent class among the classes of the k closest training examples.
b) Most frequent class among the classes of the k closest training examples.
c) Class of the closest point.
d) Most frequent class among the classes of the k farthest training examples.

PART – B (5 MARKS)
1. Write short notes on instance based learning. (Refer P. No.1, Q. No.1)
2. Write short notes on distance weighted nearest neighbor learning. (Refer P. No.1, Q. No.3)
3. Write down the steps involved in locally weighted regression. (Refer P. No.4, Q. No.4)
4. Give an example of radial basis function. (Refer P. No.5, Q. No.5)

PART – C (10 MARKS)


1. Write down the algorithm for k-nearest neighbor learning. (Refer P. No.1, Q. No.2)
2. Write about case based learning. (Refer P. No.6, Q. No.6)
