
Day & Time: Monday (10am-11am & 3pm-4pm)

Tuesday (10am-11am)
Wednesday (10am-11am & 3pm-4pm)
Friday (9am-10am, 11am-12pm, 2pm-3pm)
Dr. Srinivasa L. Chakravarthy
&
Smt. Jyotsna Rani Thota
Department of CSE
GITAM Institute of Technology (GIT)
Visakhapatnam – 530045
Email: [email protected] & [email protected]
EID 403 Machine Learning
Course objectives

● Explore the various disciplines connected with ML.

● Explore the efficiency of learning with inductive bias.

● Explore the identification of ML algorithms such as decision tree learning.

● Explore algorithms such as artificial neural networks, genetic programming, Bayesian learning, the nearest neighbor algorithm, and hidden Markov models.



Learning Outcomes

● Identify the various applications connected with ML.

● Classify the efficiency of ML algorithms with the inductive bias technique.

● Discriminate the purposes of the different ML algorithms.

● Analyze an application and correlate it with the available ML algorithms.

● Choose an ML algorithm to develop a project.



Syllabus




Reference book 1: Machine Learning
Author: Tom M. Mitchell



Reference book 2: Introduction to Machine Learning
Author: Ethem Alpaydin



Module 4 (Chapter 8)

It includes:

k-Nearest Neighbor Learning

Locally Weighted Regression

Radial Basis Functions

Case-Based Reasoning
Introduction
Instance-based learning methods differ from learning methods that construct an explicit description of the target function as training examples are provided.

Instead, instance-based learning methods simply store the training data; when a query instance is encountered, a set of similar related instances is retrieved from memory and used to classify the new query instance.

Instances are typically represented as points in a Euclidean space.
Introduction
Instance-based learning methods include:

nearest neighbor,

locally weighted regression,

case-based reasoning methods.

Instance-based methods are sometimes referred to as lazy learning methods because they delay processing the training examples until a new instance must be classified.
Introduction

Instance-based learning methods are approaches to approximating real-valued or discrete-valued target functions.

● In fact, many of these techniques construct only a local approximation to the target function that applies in the neighborhood of the new query instance.

● They never construct an approximation designed to perform well over the entire instance space.
Introduction

● Instance-based learning methods can construct a different approximation to the target function for each distinct query instance that must be classified.

● This is useful when the target function is very complex but can still be described by a collection of less complex local approximations.

● Instance-based methods can also use more complex, symbolic representations for instances.
Introduction

Disadvantages of instance-based approaches:

1. The cost of classifying new instances can be high, because nearly all computation takes place at classification time.

2. They typically consider all attributes of the instances when attempting to retrieve similar training examples from memory. If the target concept depends on only a few of the many available attributes, then instances that are truly similar may nevertheless lie far apart.

(This is a disadvantage especially for the nearest neighbor approach.)


K-Nearest Neighbor Learning
It assumes all instances correspond to points in the n-dimensional space Rⁿ.

● The nearest neighbors of an instance are defined in terms of the standard Euclidean distance.
K-Nearest Neighbor Learning (cont.)
An arbitrary instance x is described by the feature vector <a1(x), a2(x), ..., an(x)>,

where ar(x) denotes the value of the rth attribute of instance x. The distance between two instances xi and xj is then defined as d(xi, xj), where

d(xi, xj) = √( Σ_{r=1..n} ( ar(xi) − ar(xj) )² )
K-Nearest Neighbor Learning (cont.)

The k-nearest neighbor algorithm for approximating a discrete-valued target function f : Rⁿ → V and a real-valued target function f : Rⁿ → R:

Training algorithm: For each training example <x, f(x)>, add the example to the list training_examples.

Classification algorithm: Given a query instance xq to be classified,

● let x1 ... xk denote the k instances from training_examples that are nearest to xq;

● for a discrete-valued target function, return

f̂(xq) ← argmax over v ∈ V of Σ_{i=1..k} 𝛅(v, f(xi))

● for a real-valued target function, return

f̂(xq) ← ( Σ_{i=1..k} f(xi) ) / k
K-Nearest Neighbor Learning (cont.)

Classification algorithm (cont.):

where 𝛅(a, b) = 1 if a = b, and 𝛅(a, b) = 0 otherwise.
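
As a concrete illustration, here is a minimal Python sketch of the algorithm above. It is not from the slides: the function names, the list-of-(features, target)-pairs format, and the default k = 5 are illustrative assumptions.

import math
from collections import Counter

def euclidean_distance(xi, xj):
    # d(xi, xj): square root of the summed squared attribute differences
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def knn_classify(training_examples, xq, k=5):
    # Discrete-valued target: majority vote among the k nearest neighbors.
    neighbors = sorted(training_examples,
                       key=lambda ex: euclidean_distance(ex[0], xq))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

def knn_regress(training_examples, xq, k=5):
    # Real-valued target: mean of f(xi) over the k nearest neighbors.
    neighbors = sorted(training_examples,
                       key=lambda ex: euclidean_distance(ex[0], xq))[:k]
    return sum(f_xi for _, f_xi in neighbors) / k

For example, knn_classify([((1.0, 2.0), "+"), ((5.0, 1.0), "-"), ((1.2, 1.9), "+")], xq=(1.1, 2.1), k=3) returns "+".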


K-Nearest Neighbor Learning (cont.)
Voronoi diagram:

Left picture: a set of positive and negative training examples, together with a query instance xq to be classified. Here, 1-nearest neighbor classifies xq as a positive example, while 5-nearest neighbor classifies it as a negative example.

Right picture: the decision surface induced by the 1-nearest neighbor algorithm for a typical set of training examples; the surface is a combination of convex polyhedra surrounding each of the training examples.
K-Nearest Neighbor Learning (cont.)

Voronoi diagram (cont.)

Note:

1. If k = 1, the 1-nearest neighbor algorithm assigns to f̂(xq) the value f(xi), where xi is the training instance nearest to xq.

2. For every training example, its polyhedron indicates the set of query points whose classification is determined by that training example.

3. Query points outside the polyhedron are closer to some other training example.
Distance-Weighted kNN
One refinement to k-NN is to weight the contribution of each of the k neighbors according to its distance from the query point xq, giving greater weight to closer neighbors.

For example, we might weight each neighbor according to the inverse square of its distance from xq.

Updated classification algorithm with weights, for approximating:

a discrete-valued target function:

f̂(xq) ← argmax over v ∈ V of Σ_{i=1..k} wi 𝛅(v, f(xi))

a real-valued target function:

f̂(xq) ← ( Σ_{i=1..k} wi f(xi) ) / ( Σ_{i=1..k} wi )

where

wi = 1 / d(xq, xi)²
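
A hedged Python sketch of this distance-weighted variant, reusing the euclidean_distance helper from the earlier sketch. The names are again illustrative, and the shortcut for d = 0 (where the weight would be infinite) follows the usual convention of returning f(xi) directly when xq exactly matches a training instance xi.

from collections import defaultdict

def weighted_knn_classify(training_examples, xq, k=5):
    # Discrete-valued target: votes weighted by wi = 1 / d(xq, xi)^2.
    neighbors = sorted(training_examples,
                       key=lambda ex: euclidean_distance(ex[0], xq))[:k]
    votes = defaultdict(float)
    for xi, f_xi in neighbors:
        d = euclidean_distance(xi, xq)
        if d == 0.0:
            return f_xi              # query coincides with a training point
        votes[f_xi] += 1.0 / d ** 2
    return max(votes, key=votes.get)

def weighted_knn_regress(training_examples, xq, k=5):
    # Real-valued target: weighted mean of f(xi), normalized by sum of wi.
    num = den = 0.0
    for xi, f_xi in sorted(training_examples,
                           key=lambda ex: euclidean_distance(ex[0], xq))[:k]:
        d = euclidean_distance(xi, xq)
        if d == 0.0:
            return f_xi
        w = 1.0 / d ** 2
        num += w * f_xi
        den += w
    return num / den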
K-Nearest Neighbor Learning (cont.)
A note on terminology:

Regression means approximating a real-valued target function.

Residual is the error f̂(x) − f(x) in approximating the target function.

Kernel function is the function of distance that is used to determine the weight of each training example. In other words, it is the function K such that wi = K(d(xi, xq)).
Locally Weighted Regression

k-NN approximates the target function f(x) at the single query point x = xq.

Locally weighted regression, in contrast, constructs an explicit approximation to f over a local region surrounding xq.

It uses nearby or distance-weighted training examples to form this local approximation to f.
Locally Weighted Regression (cont.)

The phrase “locally weighted regression” is explained as follows:

● Local, because the function is approximated based only on data near the query point;

● Weighted, because the contribution of each training example is weighted by its distance from the query point;

● Regression, because this is the term used widely in the statistical learning community for the problem of approximating real-valued functions.
Locally Weighted Regression (cont.)
● Consider the case of locally weighted regression in which the target function f is approximated near xq using the linear function

f̂(x) = w0 + w1 a1(x) + ... + wn an(x)

● We derive methods to choose weights that minimize the squared error summed over the set D of training examples, weighted by distance from xq. One standard criterion, written as a function of the query point xq, is

E(xq) ≡ ½ Σ_{x ∈ D} ( f(x) − f̂(x) )² K(d(xq, x))
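A minimal Python/numpy sketch of one way to realize this, assuming a Gaussian kernel K and a closed-form weighted least-squares solve; the kernel-width parameter tau and the function name lwr_predict are illustrative, not from the slides.

import numpy as np

def lwr_predict(X, y, xq, tau=1.0):
    # Fit w to minimize sum over x of K(d(xq, x)) * (f(x) - f_hat(x))^2,
    # with Gaussian kernel K(d) = exp(-d^2 / (2 tau^2)), then predict at xq.
    m = X.shape[0]
    Xb = np.hstack([np.ones((m, 1)), X])     # prepend a bias column for w0
    xqb = np.concatenate([[1.0], xq])
    d2 = np.sum((X - xq) ** 2, axis=1)       # squared distances d^2(xq, x)
    kern = np.exp(-d2 / (2.0 * tau ** 2))    # kernel weight per training row
    W = np.diag(kern)
    # Weighted least squares: w = (Xb^T W Xb)^(-1) Xb^T W y
    w = np.linalg.solve(Xb.T @ W @ Xb, Xb.T @ W @ y)
    return float(xqb @ w)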

Locally Weighted Regression (cont.)
Locally weighted regression admits:

A broad range of alternative methods for distance-weighting the training examples.

A range of methods for locally approximating the target function. In most cases the local approximation is a constant, linear, or quadratic function; more complex functional forms are rarely used because:

1. The cost of fitting a more complex function for each query instance is high.

2. These simpler approximations model the target function quite well over a sufficiently small subregion of the instance space.
Radial Basis Functions
Learning with radial basis functions is an approach to function approximation that is closely related to distance-weighted regression, and also to artificial neural network learning, but it is “eager” instead of “lazy.”

The learned hypothesis is a function of the form

f̂(x) = w0 + Σ_{u=1..k} wu Ku(d(xu, x))

It is used, for example, for image classification.


Radial Basis Functions (cont.)
where

xu is an instance from X,

k is a constant that specifies the number of kernel functions to be included,

Ku(d(xu, x)) is a Gaussian function centered at the point xu with some variance 𝝈u², that is, Ku(d(xu, x)) = exp( −d²(xu, x) / (2𝝈u²) ),

f̂(x) is a global approximation to f(x).

Figure: a radial basis function network, where the ai(x) are the attributes of instance x.

● Each hidden unit produces an activation determined by a Gaussian function centered at some instance xu.

● Therefore, its activation will be close to zero unless the input x is near xu.

● The output unit produces a linear combination of the hidden unit activations.
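
A small Python/numpy sketch of a hypothesis of this form. The slides do not fix a training procedure, so this sketch makes two common but assumed choices: the k centers xu are picked as a random subset of the training points, and the weights w0, w1, ..., wk are fit by linear least squares with a shared variance sigma².

import numpy as np

def rbf_features(X, centers, sigma):
    # One Gaussian-kernel column per center xu, plus a bias column for w0.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.hstack([np.ones((len(X), 1)), np.exp(-d2 / (2.0 * sigma ** 2))])

def rbf_fit(X, y, k=10, sigma=1.0, seed=0):
    # f_hat(x) = w0 + sum_u wu * exp(-d^2(xu, x) / (2 sigma^2))
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    w, *_ = np.linalg.lstsq(rbf_features(X, centers, sigma), y, rcond=None)
    return centers, w

def rbf_predict(X, centers, w, sigma=1.0):
    # Global approximation f_hat evaluated at each row of X.
    return rbf_features(X, centers, sigma) @ w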
CASE-BASED REASONING (CBR)

Instance-based methods such as k-nearest neighbor and locally weighted regression share three key properties:

1. They are lazy learning methods in that they defer the decision of how to generalize beyond the training data until a new query instance is observed.

2. They classify new query instances by analyzing similar instances while ignoring instances that are very different from the query.

3. They represent instances as real-valued points in an n-dimensional Euclidean space.

CBR is based on the first two principles only.


CASE-BASED REASONING (cont.)

Unlike those methods, CBR applies to instances with symbolic logic descriptions.

CBR has been applied to problems such as:

Conceptual design of mechanical devices, based on a stored library of previous designs.

Reasoning about new legal cases, based on previous rulings.

Solving planning and scheduling problems, by reusing and combining portions of previous solutions to similar problems.
END OF CHAPTER 8 (MODULE 4)
