K-Nearest Neighbors (KNN)
• K-Nearest Neighbors (KNN) is one of the simplest and most widely used machine learning algorithms for classification and regression tasks.
• It is a non-parametric, lazy learning algorithm: it makes no assumptions about the underlying data distribution and does not build a model during training; all computation is deferred to prediction time.
How KNN Works
• KNN finds the K nearest data points in the training set for a given query point and predicts the majority class among them (for classification) or their average value (for regression).
Steps of the KNN Algorithm (sketched in code below):
• Choose the value of K (the number of neighbors).
• Calculate the distance between the query point and all data points in the training set.
• Sort the distances and select the K nearest neighbors.
• For classification, assign the most common class among the K neighbors.
• For regression, take the average (or weighted average) of the K nearest neighbors' values.
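A minimal from-scratch sketch of these steps; the function names (knn_predict, euclidean) and the tiny dataset at the end are illustrative, not from the slides:

```python
from collections import Counter
import math

def euclidean(a, b):
    # Euclidean distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(X_train, y_train, query, k=3, task="classification"):
    # Step 2: distance from the query point to every training point.
    distances = [(euclidean(x, query), label)
                 for x, label in zip(X_train, y_train)]
    # Step 3: sort by distance and keep the K nearest neighbors.
    labels = [label for _, label in sorted(distances, key=lambda d: d[0])[:k]]
    if task == "classification":
        # Step 4: majority vote among the K neighbors.
        return Counter(labels).most_common(1)[0][0]
    # Step 5: plain average of the K neighbors' values (regression).
    return sum(labels) / k

# Tiny illustrative call: two clusters, query near the "A" cluster.
X_train = [[1.0, 2.0], [2.0, 3.0], [8.0, 8.0], [9.0, 7.0]]
y_train = ["A", "A", "B", "B"]
print(knn_predict(X_train, y_train, query=[1.5, 2.5], k=3))  # -> "A"
```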
Distance Metrics in KNN
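The metrics most commonly used with KNN are Euclidean, Manhattan, and the more general Minkowski distance. A minimal sketch (the function name is illustrative):

```python
def minkowski(a, b, p=2):
    # Minkowski distance; p = 1 gives Manhattan, p = 2 gives Euclidean.
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

print(minkowski([0, 0], [3, 4], p=2))  # Euclidean distance: 5.0
print(minkowski([0, 0], [3, 4], p=1))  # Manhattan distance: 7.0
```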
Problem Statement:
We have a dataset of students' heights and weights. We want to classify a new student into one of two categories: "Underweight" or "Healthy."
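A sketch of this classification using scikit-learn's KNeighborsClassifier; the slide's actual dataset is not shown, so the height/weight values below are hypothetical:

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical training data: [height in cm, weight in kg] per student.
X = [[167, 51], [182, 62], [176, 69], [173, 64],
     [172, 65], [174, 56], [169, 58]]
y = ["Underweight", "Healthy", "Healthy", "Healthy",
     "Healthy", "Underweight", "Underweight"]

clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)
print(clf.predict([[170, 57]]))  # -> ['Underweight'] on this toy data
```

Because height and weight sit on different numeric scales, it is usually advisable to standardize the features (e.g. with sklearn.preprocessing.StandardScaler) before fitting, since KNN is distance-based.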
Choosing the optimal value of K in K-Nearest Neighbors (KNN)
• 1. Rule of Thumb for Choosing K
• A widely used rule of thumb is K ≈ √n, where n is the number of training samples.
• This gives a good starting point but is not always the best choice.
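A minimal sketch of this starting point, assuming the K ≈ √n heuristic above (the sample count is hypothetical):

```python
import math

n_samples = 400                  # hypothetical training-set size
k = round(math.sqrt(n_samples))  # K ≈ sqrt(400) = 20
if k % 2 == 0:
    k += 1                       # nudge to an odd K (see the next point on ties)
print(k)                         # -> 21
```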
• 2. Odd vs. Even K
• Use odd values of K to avoid ties in binary classification problems.
• Even values of K can produce ties between classes, which then require a tie-breaking strategy.
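Beyond these heuristics, K is usually tuned empirically, for example by cross-validating over a grid of candidate values. A sketch using scikit-learn (the dataset and grid are illustrative):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Search odd values of K from 1 to 29 with 5-fold cross-validation.
grid = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": list(range(1, 31, 2))},
    cv=5,
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```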