Module 3: KNN and SVM

K-NEAREST NEIGHBOR

Reference link: https://www.youtube.com/watch?v=HVXime0nQeI
Simple Analogy

• Tell me about your friends (who your neighbors are) and I will tell you who you are.

Instance-based Learning

• It's very similar to a desktop!!
KNN – Different names

• K-Nearest Neighbors
• Memory-Based Reasoning
• Example-Based Reasoning
• Instance-Based Learning
• Lazy Learning
What is KNN?

• A powerful classification algorithm used in pattern recognition.
• K-nearest neighbors stores all available cases and classifies new cases based on a similarity measure (e.g., a distance function).
• One of the top data mining algorithms used today.
• A non-parametric, lazy learning algorithm (an instance-based learning method).
KNN: Classification Approach

• An object (a new instance) is classified by a majority vote of its neighbors' classes.
• The object is assigned to the most common class amongst its K nearest neighbors (measured by a distance function).
Distance Measure

[Figure: compute the distance between the test record and all training records, then choose the k “nearest” records.]
Distance Between Neighbors

• Calculate the distance between the new example (E) and all examples in the training set.
• Euclidean distance between two examples:
  – X = [x1, x2, x3, ..., xn]
  – Y = [y1, y2, y3, ..., yn]
  – The Euclidean distance between X and Y is defined as

    D(X, Y) = sqrt( (x1 − y1)² + (x2 − y2)² + … + (xn − yn)² )
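As an illustration, a minimal sketch of this distance computation in Python (the function name and the sample vectors are hypothetical):

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Example: distance between two 3-dimensional points
print(euclidean_distance([1, 2, 3], [4, 6, 3]))  # 5.0
```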
K-Nearest Neighbor Algorithm

• All the instances correspond to points in an n-dimensional feature space.
• Each instance is represented by a set of numerical attributes.
• Each training example consists of a feature vector and a class label associated with that vector.
• Classification is done by comparing the feature vector of the new example with those of the K nearest points.
• Select the K nearest examples to E in the training set.
• Assign E to the most common class among its K nearest neighbors, as sketched in the code below.
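A compact sketch of this procedure, assuming a small labeled training set (the data layout and function names are hypothetical):

```python
from collections import Counter
import math

def euclidean_distance(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def knn_classify(train, new_example, k):
    """train: list of (feature_vector, label) pairs; returns the majority label of the k nearest."""
    # Sort training examples by distance to the new example
    neighbors = sorted(train, key=lambda ex: euclidean_distance(ex[0], new_example))
    # Majority vote among the k nearest labels
    labels = [label for _, label in neighbors[:k]]
    return Counter(labels).most_common(1)[0][0]
```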
3-KNN: Example (1)

Example 2

Compute Euclidean Distance

Use k = 5
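To illustrate these steps, here is a hypothetical k = 5 classification using the knn_classify sketch above (the data points are made up for illustration):

```python
# Hypothetical 2-D training data: (features, label)
train = [([1, 1], "A"), ([1, 2], "A"), ([2, 1], "A"),
         ([6, 6], "B"), ([7, 6], "B"), ([6, 7], "B"), ([7, 7], "B")]

# Compute the Euclidean distance to every training point, then vote with k = 5
print(knn_classify(train, [2, 2], k=5))  # -> "A" (3 of the 5 nearest are "A")
```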
How to Select K?

• If K is too small, the classifier is sensitive to noise points.
• A larger K works well, but a K that is too large may include majority points from other classes.
• A rule of thumb is K < sqrt(n), where n is the number of training examples. A simple selection sketch follows below.
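A minimal sketch of applying this rule of thumb together with a simple validation split (the split and scoring here are illustrative assumptions, not part of the original slides):

```python
import math

def choose_k(train, validation):
    """Try odd k up to sqrt(n); keep the k with the best validation accuracy."""
    best_k, best_acc = 1, -1.0
    for k in range(1, int(math.sqrt(len(train))) + 1, 2):  # odd k avoids ties
        acc = sum(knn_classify(train, x, k) == y for x, y in validation) / len(validation)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return best_k
```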
[Figure: (a) 1-nearest neighbor, (b) 2-nearest neighbor, (c) 3-nearest neighbor]

The k-nearest neighbors of a record x are the data points that have the k smallest distances to x.
Strengths of KNN

• Very simple and intuitive.
• Can be applied to data from any distribution.
• Gives good classification if the number of samples is large enough.

Weaknesses of KNN

• Takes more time to classify a new example: it must calculate and compare the distance from the new example to all other examples.
• Choosing k may be tricky.
• Needs a large number of samples for accuracy.
Support Vector Machine

Reference link: https://www.youtube.com/watch?v=efR1C6CvhmE
Support Vector Machine

• How can you separate the given data points in the search space?
• How can you get a decision boundary to separate these points?
• If you add a new data point, will that decision line work equally well?
Support Vector Machine

• SVM is a supervised learning algorithm.
• The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes, so that we can easily put a new data point in the correct category in the future.
Support Vector Machine

• The best decision boundary is called a hyperplane.
• SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors, and hence the algorithm is termed a Support Vector Machine.
Support Vector Machine

Hyperplane (decision surface):
• The hyperplane is a function used to differentiate between features. In 2-D, the function used to separate the features is a line; in 3-D it is called a plane; similarly, the function that separates points in higher dimensions is called a hyperplane.
• Let's say there are "m" dimensions. The equation of the hyperplane in m dimensions can be given as

    y = W1·X1 + W2·X2 + … + Wm·Xm + b = WT·X + b

where
Wi = weight vector (W0, W1, W2, W3, …, Wm)
b = bias term (W0)
X = variables.
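As an illustration, a minimal sketch of checking which side of a hyperplane a point falls on (the weights and points below are hypothetical):

```python
def hyperplane_side(w, b, x):
    """Sign of w^T x + b: +1 on the positive side, -1 on the negative side, 0 on the plane."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return (score > 0) - (score < 0)

# Hypothetical 2-D hyperplane: 2*x1 + 1*x2 - 4 = 0
print(hyperplane_side([2, 1], -4, [3, 1]))  # +1 (positive side)
print(hyperplane_side([2, 1], -4, [1, 1]))  # -1 (negative side)
```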
Support Vector Machine

Hard margin SVM:
• Assume three hyperplanes, namely (π, π+, π−), such that π+ is parallel to π and passes through the support vectors on the positive side, and π− is parallel to π and passes through the support vectors on the negative side.
• The equations of the three hyperplanes can be taken as:

    π:  WT·X + b = 0
    π+: WT·X + b = +1
    π−: WT·X + b = −1
Support Vector Machine

• For the point X1: Y1(WT·X1 + b) = 1.
  Explanation: the point X1 lies on the hyperplane π+, and the product of the actual output and the hyperplane equation is 1, which means the point is correctly classified in the positive domain.

• For the point X3: Y3(WT·X3 + b) > 1.
  Explanation: the point X3 lies away from the hyperplane, and the product of the actual output and the hyperplane equation is greater than 1, which means the point is correctly classified in the positive domain.
Support Vector Machine

• For the point X4: Y4(WT·X4 + b) = 1.
  Explanation: the point X4 lies on the hyperplane π− in the negative region, and the product of the actual output and the hyperplane equation is equal to 1, which means the point is correctly classified in the negative domain.

• For the point X6: Y6(WT·X6 + b) > 1.
  Explanation: the point X6 lies away from the hyperplane in the negative region, and the product of the actual output and the hyperplane equation is greater than 1, which means the point is correctly classified in the negative domain.
Support Vector Machine

• The constraint is violated for the point X7: Y7(WT·X7 + b) < 1.
  Explanation: the point X7 is classified incorrectly because for this point Yi(WT·Xi + b) is smaller than 1, which violates the constraint. So the misclassification is due to a constraint violation. Similarly, the same can be said for the point X8.
Support Vector Machine

We can conclude that for any point Xi:
if Yi(WT·Xi + b) ≥ 1:
    Xi is correctly classified
else:
    Xi is incorrectly classified.

So we can see that the hyperplane is able to distinguish the points only if they are linearly separable; if any outlier is introduced, it is no longer able to separate them. This type of SVM is called a hard margin SVM (since we have very strict constraints to correctly classify each and every data point).
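A minimal sketch of this hard-margin check in Python (the function name and inputs are hypothetical; labels are assumed to be +1 or −1):

```python
def is_correctly_classified(w, b, x, y):
    """Hard-margin constraint: y * (w^T x + b) >= 1."""
    return y * (sum(wi * xi for wi, xi in zip(w, x)) + b) >= 1

# Hypothetical hyperplane w = [2, 1], b = -4
print(is_correctly_classified([2, 1], -4, [3, 2], +1))  # True:  +1 * 4 >= 1
print(is_correctly_classified([2, 1], -4, [1, 1], +1))  # False: the point violates the constraint
```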
Support Vector Machine

Soft margin SVM:
• So far we have assumed that the data is linearly separable, which might not be the case in a real-life scenario. We need an update so that our function may skip a few outliers and still classify almost linearly separable points. For this reason, we introduce a new slack variable (ξ, pronounced "xi").
• If we introduce ξ into our previous constraint, we can rewrite it as

    Yi(WT·Xi + b) ≥ 1 − ξi
Support Vector Machine

if ξi = 0:
    the point can be considered correctly classified.
else (ξi > 0):
    the point is incorrectly classified (or lies inside the margin).

So if ξi > 0, it means that Xi lies in the incorrect region, and thus we can think of ξi as an error term associated with Xi. The average error can be given as

    (1/n) Σ ξi

Thus our objective can be described mathematically as

    minimize over W, b:  (1/2)·||W||² + C · Σ ξi
    subject to:  Yi(WT·Xi + b) ≥ 1 − ξi  and  ξi ≥ 0

This formulation is called the soft margin technique.
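As an illustration, a small sketch computing the slack values and the soft-margin objective for given weights (the data and the choice C = 1.0 are hypothetical):

```python
def soft_margin_objective(w, b, data, C=1.0):
    """(1/2)||w||^2 + C * sum of slacks, where slack_i = max(0, 1 - y_i(w^T x_i + b))."""
    slacks = [max(0.0, 1 - y * (sum(wi * xi for wi, xi in zip(w, x)) + b))
              for x, y in data]
    return 0.5 * sum(wi ** 2 for wi in w) + C * sum(slacks)

# Hypothetical data: (features, label); the second point incurs a slack of 2
data = [([3, 2], +1), ([1, 1], +1), ([0, 0], -1)]
print(soft_margin_objective([2, 1], -4, data))  # 2.5 + 1.0 * 2 = 4.5
```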
Support Vector Machine

What is the kernel trick?
• A kernel is a way of computing the dot product of two vectors x and y in some (very high-dimensional) feature space, which is why kernel functions are sometimes called a "generalized dot product".
• Applying the kernel trick means simply replacing the dot product of two vectors by the kernel function.
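For example, a minimal sketch of an RBF (Gaussian) kernel standing in for the dot product (gamma is a hypothetical bandwidth parameter):

```python
import math

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    """K(x, y) = exp(-gamma * ||x - y||^2): a dot product in an implicit feature space."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)

print(dot([1, 2], [2, 1]))         # ordinary dot product: 4
print(rbf_kernel([1, 2], [2, 1]))  # kernelized similarity in (0, 1]
```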
Support Vector Machine

Types of kernels:
1. Linear kernel
2. Polynomial kernel
3. Radial basis function (RBF) kernel / Gaussian kernel
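As a usage sketch, scikit-learn's SVC exposes these kernels through its kernel parameter (the toy data below is hypothetical, and scikit-learn is assumed to be installed):

```python
from sklearn.svm import SVC

# Hypothetical toy data: two separable classes in 2-D
X = [[1, 1], [1, 2], [2, 1], [6, 6], [7, 6], [6, 7]]
y = [0, 0, 0, 1, 1, 1]

for kernel in ("linear", "poly", "rbf"):
    clf = SVC(kernel=kernel).fit(X, y)
    print(kernel, clf.predict([[2, 2], [6.5, 6.5]]))
```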
Support Vector Machine

Pros:
1. It is really effective in higher dimensions.
2. Effective when the number of features is greater than the number of training examples.
3. One of the best algorithms when the classes are separable.
4. The hyperplane is affected only by the support vectors, so outliers have less impact.
5. SVM is suited for extreme-case binary classification.
Support Vector Machine

Cons:
1. For larger datasets, it requires a large amount of time to process.
2. Does not perform well in the case of overlapped classes.
3. Selecting hyperparameters of the SVM that allow for sufficient generalization performance can be difficult.
4. Selecting the appropriate kernel function can be tricky.
Thank You