
MACHINE LEARNING TECHNIQUE

k-NEAREST NEIGHBORS (k-NN)


LECTURE 28

DR. GAURAV DIXIT


DEPARTMENT OF MANAGEMENT STUDIES

k-NEAREST NEIGHBORS (k-NN)

• k-NN
  – No assumptions about the form of the relationship between the outcome variable and the set of predictors
  – Non-parametric method
    • No parameters from an assumed functional form to estimate
  – Useful information for modeling is extracted using the similarities between records, based on the predictors' values
    • Typically, distance-based similarity measures are used

k-NEAREST NEIGHBORS (k-NN)

• k-NN: distance metrics
  – Most popular metric is Euclidean distance
    • For two records with predictor values denoted by (x₁, x₂, …, xₚ) and (w₁, w₂, …, wₚ):

      D_Eu = √[(x₁ − w₁)² + (x₂ − w₂)² + … + (xₚ − wₚ)²]

    • Low computation cost
  – Other distance metrics: statistical distance (also known as Mahalanobis distance) and Manhattan distance
  – Euclidean distance is preferred in k-NN because classifying each new observation requires many distance computations, and Euclidean distance is cheap to compute (a short R sketch follows below)
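
A minimal R sketch of the Euclidean distance calculation above; the two records and their predictor values are made up for illustration:

# Two records described by the same predictors (illustrative values)
x <- c(age = 35, income = 62, experience = 10)
w <- c(age = 42, income = 71, experience = 15)

# Euclidean distance: square root of the sum of squared differences
d_eu <- sqrt(sum((x - w)^2))
d_eu

# Equivalent built-in computation:
dist(rbind(x, w), method = "euclidean")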

k-NEAREST NEIGHBORS (k-NN)

• k-NN
  – Scaling of predictors: use standardized values of the predictors so that no single predictor dominates the distance computation (a standardization sketch follows this list)
• k-NN for Classification task
  – Main idea is to find the k records in the training partition that are nearest to the new observation to be classified
  – These k neighbors are used to classify the new observation into a class
    • Predominant class among the neighbors
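
A minimal R sketch of the standardization step (the data frame and column names are hypothetical); the training means and standard deviations are reused for new observations so that distances are computed on the same scale:

# Hypothetical training predictors on very different scales
train_x <- data.frame(age = c(25, 40, 51, 33),
                      income = c(48000, 95000, 120000, 60000))

# Standardize each predictor to mean 0 and standard deviation 1
train_scaled <- scale(train_x)

# Apply the same centering and scaling to a new observation before computing distances
new_obs <- data.frame(age = 37, income = 72000)
new_scaled <- scale(new_obs,
                    center = attr(train_scaled, "scaled:center"),
                    scale  = attr(train_scaled, "scaled:scale"))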

k-NEAREST NEIGHBORS (k-NN)

• k-NN: Finding neighbors and classification
  – Compute the distance between the new observation and each record in the training partition
  – Determine the k nearest (closest) records to the new observation
  – Find the most prevalent class among the k neighbors; this is the predicted class of the new observation

• Open RStudio (a small worked example follows below)
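
A minimal R sketch of these steps using the knn() function from the class package; the two-predictor training data and the choice of k = 3 are illustrative, not taken from the lecture:

library(class)  # provides knn()

# Illustrative training data: two standardized predictors and a class label
train_x <- scale(matrix(c(2, 4, 5, 7, 3, 8, 6, 9), ncol = 2))
train_y <- factor(c("A", "A", "B", "B"))

# Standardize the new observation with the training means and standard deviations
new_x <- scale(matrix(c(5, 6), ncol = 2),
               center = attr(train_x, "scaled:center"),
               scale  = attr(train_x, "scaled:scale"))

# Predict the class of the new observation as the predominant class
# among its k = 3 nearest neighbors (Euclidean distance)
pred <- knn(train = train_x, test = new_x, cl = train_y, k = 3)
pred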

Key References

• Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data by EMC Education Services (2015)
• Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner by Shmueli, G., Patel, N. R., & Bruce, P. C. (2010)

Thanks…
