
KNN With Example

The K-Nearest Neighbors (KNN) algorithm is a simple supervised machine learning method that classifies new data points based on their proximity to existing data points in a reference dataset. KNN is considered a 'lazy learner' because it does not train on the data but instead memorizes the entire dataset, leading to time-consuming predictions when new data points are introduced. The choice of K is crucial for accuracy, with common practices suggesting odd values and experimentation to minimize classification error.


LECTURE – 03

K-NEAREST-NEIGHBOR
ALGORITHM
K Nearest Neighbor
Classification
K-Nearest Neighbors (KNN) is one of the simplest machine
learning algorithms:
◦ easy to implement
◦ easy to understand

It is a supervised machine learning algorithm, which means we
need a labeled reference dataset to predict the category/group
of a future data point.
K Nearest Neighbor
Classification
Once the KNN model has been given the reference dataset, it
classifies new data points based on the points (neighbors) that
are most similar to them. It makes an “educated guess” about
which category a new data point should belong to.

The KNN algorithm can be summed up as: “I am as good as my
K nearest neighbors.”
Why Is The KNN Called A “Lazy
Learner” Or A “Lazy Algorithm”?
KNN is called a lazy learner because when we supply training
data to this algorithm, it does not train itself at all.

KNN does not learn any discriminative function from the
training data; instead, it memorizes the entire training dataset.

There is no training time in KNN.


K Nearest Neighbor
Classification (Lazy Learner)
Each time a new data point comes in and we want to make a
prediction, the KNN algorithm searches for the nearest
neighbors in the entire training set.

Hence the prediction step is time-consuming and
computationally expensive.


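To make this “lazy” behavior concrete, here is a minimal sketch in Python (not part of the original slides): fit() only memorizes the data, and all the distance computation happens at prediction time.

    import numpy as np

    class LazyKNN:
        """Minimal sketch of a lazy learner (illustrative only)."""

        def __init__(self, k=5):
            self.k = k

        def fit(self, X, y):
            # "Training" is just memorizing the dataset; no model is built.
            self.X = np.asarray(X, dtype=float)
            self.y = np.asarray(y)
            return self

        def predict(self, x):
            # All the real work happens here, at prediction time:
            # scan the entire training set for the K nearest neighbors.
            dists = np.sqrt(((self.X - np.asarray(x, dtype=float)) ** 2).sum(axis=1))
            nearest_labels = self.y[np.argsort(dists)[: self.k]]
            labels, counts = np.unique(nearest_labels, return_counts=True)
            return labels[np.argmax(counts)]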
KNN CLASSIFICATION
APPROACH
[Figure slides: diagrams illustrating the KNN classification approach]
KNN CLASSIFICATION ALGORITHM
The table above represents our data set. We have
two columns: Brightness and Saturation.
Each row in the table has a class of
either Red or Blue.
Before we introduce a new data entry, let's assume
the value of K is 5.
How to Calculate Euclidean Distance in the K-Nearest
Neighbors Algorithm
Here's the new data entry:

To know its class, we must calculate the distance from the
new entry to the other entries in the data set using the
Euclidean distance formula.
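For two points (x₁, y₁) and (x₂, y₂), the Euclidean distance is

    d = √((x₂ − x₁)² + (y₂ − y₁)²)

As a minimal sketch in Python (not part of the original slides),
this can be written for points of any dimension:

    import math

    def euclidean_distance(p, q):
        # Straight-line (L2) distance between two points given as tuples.
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

    # Illustrative values only (the slide's table is not reproduced here):
    # euclidean_distance((40, 20), (20, 35))  ->  25.0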
Here's what the table will look like after
all the distances have been calculated:
Since we chose 5 as the value of K, we'll
only consider the closest five records.
That is:
As you can see above, the majority class
within the 5 nearest neighbors to the new
entry is Red. Therefore, we'll classify the
new entry as Red.
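Putting the steps together, here is a small sketch (in the same
illustrative Python as above) of the whole prediction: compute
all distances with euclidean_distance, keep the K closest
records, and take a majority vote among their labels.

    from collections import Counter

    def knn_predict(X_train, y_train, new_point, k=5):
        # Rank every training point by its distance to the new point.
        ranked = sorted(
            (euclidean_distance(x, new_point), label)
            for x, label in zip(X_train, y_train)
        )
        # Majority vote among the labels of the K nearest neighbors.
        k_labels = [label for _, label in ranked[:k]]
        return Counter(k_labels).most_common(1)[0][0]

With the Brightness/Saturation data, knn_predict(points, classes,
new_entry, k=5) would return "Red" for this example.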
KNN - EXAMPLE
HOW TO SELECT THE VALUE
OF K
There is no particular way of choosing the value of K, but here
are some common conventions to keep in mind:

◦ Choosing a very low value will most likely lead to inaccurate
predictions.

◦ The most commonly used value of K is 5.

◦ Use an odd number as the value of K, so that a majority vote
between two classes cannot end in a tie.


HOW TO SELECT THE VALUE OF
K
There is no particular way to determine the best value for K,
so we need to try several values and pick the best of them.
For each candidate value of K:
◦ For each of the N training examples:
  Find its K nearest neighbours.
  Make a classification based on these K neighbours.
  Calculate the classification error.
◦ Output the average error over all N examples.
Use the K that gives the lowest average error over the N
training examples (sketched in code below).
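A minimal sketch of this procedure, reusing the illustrative
knn_predict from earlier (it assumes X and y are plain Python
lists):

    def choose_k(X, y, candidate_ks=(1, 3, 5, 7, 9)):
        # Leave-one-out evaluation: classify each training example
        # using all the other examples, and average the errors.
        best_k, best_error = None, float("inf")
        for k in candidate_ks:
            errors = sum(
                knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], k) != y[i]
                for i in range(len(X))
            )
            error = errors / len(X)
            if error < best_error:
                best_k, best_error = k, error
        return best_k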
PROS AND CONS OF KNN
Pros
◦ Simple to understand and easy to implement.
◦ No training step: the model just stores the dataset.

Cons
◦ Prediction is time-consuming and computationally expensive,
since the entire training set must be searched for every new point.
◦ Accuracy depends heavily on a good choice of K.
Class Task KNN Solved Example Diabetic Patient

Apply the K nearest neighbor
classifier to predict whether a
patient is diabetic, given the
features BMI and Age. The
training examples are shown
in the table.
Assume K = 3.

Manhattan Distance:
d = |BMI₁ − BMI₂| + |Age₁ − Age₂|

Euclidean Distance (L2-Norm):
d = √((BMI₁ − BMI₂)² + (Age₁ − Age₂)²)

Test Example: BMI = 43.6,
Age = 40, Sugar = ?
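As a quick illustration of the two metrics (the slide's training
table is not reproduced in this text, so the rows below are
hypothetical stand-ins):

    test = (43.6, 40)  # (BMI, Age) of the test example

    # Hypothetical (BMI, Age, Sugar) rows standing in for the slide's table.
    train = [(33.6, 50, 1), (26.6, 30, 0), (28.1, 45, 1)]

    def manhattan(p, q):
        return sum(abs(a - b) for a, b in zip(p, q))

    def euclidean(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    for bmi, age, sugar in train:
        print(f"BMI={bmi}, Age={age}, Sugar={sugar}: "
              f"L1={manhattan(test, (bmi, age)):.2f}, "
              f"L2={euclidean(test, (bmi, age)):.2f}")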
Now, we apply the majority
voting technique to decide the
label of the new example. Here
the 1st and 2nd nearest
neighbors have target label 1,
and the 3rd nearest neighbor
has target label 0. Target label
1 has the majority, so the new
example is classified as 1; that
is, the patient is diabetic.

Test Example: BMI = 43.6,
Age = 40, Sugar = 1
THANK YOU!
