
INSTANCE BASED LEARNING

K-Nearest Neighbor (KNN) Algorithm

• K-Nearest Neighbors (KNN) is a simple way to classify things by looking at
what’s nearby. Imagine a streaming service wants to predict whether a new
user is likely to cancel their subscription (churn) based on their age. It
checks the ages of its existing users and whether they churned or stayed. If
most of the K users closest in age to the new user canceled their
subscription, KNN will predict that the new user might churn too. The key
idea is that users with similar ages tend to have similar behaviors, and KNN
uses this closeness to make decisions, as the sketch below illustrates.
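As a rough sketch of this churn example, the snippet below uses scikit-learn’s KNeighborsClassifier; the ages, churn labels, choice of k = 3, and the 50-year-old test user are all made-up values for illustration, not data from the document.

from sklearn.neighbors import KNeighborsClassifier

# Hypothetical ages of existing users and whether they churned (1) or stayed (0)
ages = [[22], [25], [47], [52], [46], [56], [58], [60], [62], [61]]
churned = [1, 1, 0, 0, 0, 1, 1, 0, 0, 0]

# Look at the K = 3 users closest in age to the new user
model = KNeighborsClassifier(n_neighbors=3)
model.fit(ages, churned)

# Predict whether a hypothetical 50-year-old is likely to churn
print(model.predict([[50]]))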

Getting Started with K-Nearest Neighbors


• K-Nearest Neighbors is also called a lazy learner algorithm because it
does not learn from the training set immediately; instead, it stores the
dataset and performs computation on it only at classification time.
As an example, consider the following table of data points containing two
features:

The image shows how KNN predicts the category of a new data point based on its closest
neighbours.
The red diamonds represent Category 1 and the blue squares represent Category 2.
The new data point checks its closest neighbours (the circled points).
Since the majority of its closest neighbours are blue squares (Category 2), KNN predicts that
the new data point belongs to Category 2.
What is ‘K’ in K-Nearest Neighbour?

• In the k-Nearest Neighbours (k-NN) algorithm, k is just a number that tells
the algorithm how many nearby points (neighbours) to look at when it makes
a decision.
• Example:
• Imagine you’re deciding which fruit a new item is based on its shape and
size. You compare it to fruits you already know.
• If k = 3, the algorithm looks at the 3 closest fruits to the new one.
• If 2 of those 3 fruits are apples and 1 is a banana, the algorithm says the
new fruit is an apple because most of its neighbours are apples, as the
short sketch after this list shows.
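A minimal sketch of this majority vote using Python’s collections.Counter; the fruit labels are stand-ins for the example above.

from collections import Counter

# Labels of the k = 3 nearest fruits in the example
neighbour_labels = ['apple', 'apple', 'banana']

# most_common(1) returns the most frequent label: 'apple'
print(Counter(neighbour_labels).most_common(1)[0][0])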

Distance Metrics Used in KNN Algorithm


• KNN uses distance metrics to identify the nearest neighbours; these
neighbours are used for classification and regression tasks. To identify
the nearest neighbours we use the distance metrics below:
1. Euclidean Distance
• Euclidean distance is defined as the straight-line distance between two
points in a plane or space.

2. Manhattan Distance
• This is the total distance you would travel if you could only move along
horizontal and vertical lines (like a grid or city streets). It’s also called
“taxicab distance” because a taxi can only drive along the grid-like streets
of a city.

3. Minkowski Distance
• Minkowski distance is a family of distances that includes both Euclidean
and Manhattan distances as special cases, as the snippet below illustrates.
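As a quick illustration of the three metrics, the NumPy snippet below computes each distance between two example points; the points and the Minkowski order p = 3 are arbitrary choices for demonstration.

import numpy as np

p1 = np.array([1, 2])
p2 = np.array([4, 6])

# Euclidean: straight-line distance (square root of summed squared differences)
euclidean = np.sqrt(np.sum((p1 - p2) ** 2))           # 5.0

# Manhattan: sum of absolute differences (taxicab / grid distance)
manhattan = np.sum(np.abs(p1 - p2))                   # 7

# Minkowski of order p: p = 2 gives Euclidean, p = 1 gives Manhattan
p = 3
minkowski = np.sum(np.abs(p1 - p2) ** p) ** (1 / p)   # about 4.50

print(euclidean, manhattan, minkowski)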
Working of KNN Algorithm

• At prediction time, KNN computes the distance from the test point to every
training point, sorts those distances, keeps the k nearest neighbours, and
returns the majority label among them, as the implementation below shows.

Python Implementation of KNN Algorithm:


1. Importing Libraries:
import numpy as np
from collections import Counter
2. Defining the Euclidean Distance Function:
def euclidean_distance(point1, point2):
    # Straight-line distance between two points
    return np.sqrt(np.sum((np.array(point1) - np.array(point2)) ** 2))
3. KNN Prediction Function:
def knn_predict(training_data, training_labels, test_point, k):
    # Compute the distance from the test point to every training point
    distances = []
    for i in range(len(training_data)):
        dist = euclidean_distance(test_point, training_data[i])
        distances.append((dist, training_labels[i]))
    # Sort by distance and keep the labels of the k nearest points
    distances.sort(key=lambda x: x[0])
    k_nearest_labels = [label for _, label in distances[:k]]
    # Return the majority label among the k nearest neighbours
    return Counter(k_nearest_labels).most_common(1)[0][0]
4. Training Data, Labels and Test Point:
training_data = [[1, 2], [2, 3], [3, 4], [6, 7], [7, 8]]
training_labels = ['A', 'A', 'A', 'B', 'B']
test_point = [4, 5]
k = 3
5. Prediction and Output:
prediction = knn_predict(training_data, training_labels, test_point, k)
print(prediction)

Output:
A
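As a sanity check on the hand-written version, the same prediction can be reproduced with scikit-learn’s KNeighborsClassifier; this assumes scikit-learn is installed and is not part of the original walkthrough.

from sklearn.neighbors import KNeighborsClassifier

# Same data, labels, test point, and k as in the steps above
training_data = [[1, 2], [2, 3], [3, 4], [6, 7], [7, 8]]
training_labels = ['A', 'A', 'A', 'B', 'B']

model = KNeighborsClassifier(n_neighbors=3)
model.fit(training_data, training_labels)

print(model.predict([[4, 5]])[0])  # prints A, matching the output above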
