
K-Nearest Neighbors (KNN) Algorithm in Machine Learning
What is KNN?
K-Nearest Neighbors (KNN) is a supervised machine learning algorithm used for
classification and regression tasks. It assumes that similar data points lie close to
each other. It finds the 'k' closest training examples to a given test point and
predicts the output by majority voting (for classification) or by averaging (for
regression).
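As a concrete illustration, here is a minimal classification sketch using scikit-learn's `KNeighborsClassifier` (assuming scikit-learn is installed; the iris dataset and k = 5 are arbitrary choices for the example):

```python
# Minimal KNN classification sketch with scikit-learn
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

knn = KNeighborsClassifier(n_neighbors=5)  # k = 5
knn.fit(X_train, y_train)                  # "fit" just stores the data (lazy learner)
print(knn.score(X_test, y_test))           # accuracy on held-out data
```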

Basic Idea (Simple Explanation)


Imagine a new person moves into a neighborhood and you want to guess what they do for
a living. You ask their k nearest neighbors about their professions and take the most
common answer. That's exactly what KNN does with data.

Steps Involved in KNN


1. Choose the number of neighbors (k).
2. Calculate the distance between the test point and all training points.
3. Select the k nearest data points (neighbors).
4. For classification: count the majority class among the neighbors.
For regression: calculate the average of the neighbors’ values.
5. Assign the predicted label or value to the test point.
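The steps above can be sketched as a from-scratch classifier in plain Python (an illustrative version only; the toy points and k = 3 are invented for the example):

```python
# From-scratch KNN classification following the five steps above
import math
from collections import Counter

def knn_predict(train_X, train_y, test_point, k=3):  # Step 1: choose k
    # Step 2: distance from the test point to every training point (Euclidean)
    distances = [(math.dist(x, test_point), label)
                 for x, label in zip(train_X, train_y)]
    # Step 3: select the k nearest neighbors
    neighbors = sorted(distances, key=lambda d: d[0])[:k]
    # Step 4: majority vote among the neighbors (classification)
    votes = Counter(label for _, label in neighbors)
    # Step 5: assign the predicted label
    return votes.most_common(1)[0][0]

train_X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
train_y = ["A", "A", "A", "B", "B", "B"]
print(knn_predict(train_X, train_y, (2, 2), k=3))  # → A
```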

Advantages of KNN
1. Simple and easy to understand.
2. No training phase — it's a lazy learner.
3. Naturally handles multi-class problems.
4. Works well with small datasets.

Disadvantages of KNN
1. Slow with large datasets — needs to compute distance for all points.
2. Sensitive to irrelevant or redundant features.
3. Affected by the scale of the data (feature scaling is important).
4. Choice of 'k' is crucial — too small or too large can lead to poor results.
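Disadvantage 3 can be addressed by standardizing features before computing distances; this sketch assumes scikit-learn's `StandardScaler` (the two-feature toy data is invented for illustration):

```python
# Why scaling matters: a large-scale feature (income) would otherwise
# dominate the distance over a small-scale one (age).
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[25.0, 50000.0],   # age in years, income in dollars
              [30.0, 51000.0],
              [27.0, 90000.0]])
X_scaled = StandardScaler().fit_transform(X)  # each column: mean 0, std 1
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```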

Distance Metrics Used


1. Euclidean Distance — Most common, straight-line distance.
2. Manhattan Distance — Sum of absolute differences.
3. Minkowski Distance — Generalized form of both Euclidean and Manhattan.
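Since Minkowski distance generalizes the other two (p = 2 gives Euclidean, p = 1 gives Manhattan), all three can be computed from one small helper (the sample points are arbitrary):

```python
# Minkowski distance; p = 2 is Euclidean, p = 1 is Manhattan
def minkowski(a, b, p):
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

a, b = (1, 2), (4, 6)
print(minkowski(a, b, 2))  # Euclidean: sqrt(3^2 + 4^2) = 5.0
print(minkowski(a, b, 1))  # Manhattan: 3 + 4 = 7.0
```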

Important Terms in KNN


| Term | Definition |
|------|------------|
| K | The number of neighbors to consider when making a prediction. |
| Distance Metric | A method to measure similarity (or dissimilarity) between points. |
| Training Data | The dataset used to compare and classify the test point. |
| Lazy Learner | KNN does not learn during training; all computations are done during prediction. |
| Majority Voting | In classification, the label most common among the k neighbors is chosen. |
Difference Between KNN and K-Means Algorithm
KNN (K-Nearest Neighbors) and K-Means are two very different algorithms in machine
learning:

🔹 KNN is a **supervised** algorithm used for **classification and regression**.


🔹 K-Means is an **unsupervised** algorithm used for **clustering**.

🔹 Key Differences:

| Aspect | KNN | K-Means |
|--------|-----|---------|
| Type | Supervised learning | Unsupervised learning |
| Purpose | Classification or regression | Clustering |
| Input | Labeled data required | No labels needed |
| Output | Predicts label for new data | Groups data into clusters |
| Working | Finds nearest neighbors to classify | Groups similar data points |
| Training | Lazy learner (no training) | Involves training (cluster formation) |
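A minimal sketch of the contrast, assuming scikit-learn (the toy points and labels are invented for illustration):

```python
# KNN needs labels (supervised); K-Means does not (unsupervised)
from sklearn.neighbors import KNeighborsClassifier
from sklearn.cluster import KMeans

X = [[1, 1], [1, 2], [8, 8], [9, 8]]
y = ["small", "small", "large", "large"]   # labels: required by KNN only

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[2, 2]]))   # supervised: predicts a known label

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)  # no y passed
print(km.labels_)              # unsupervised: cluster ids it invented itself
```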
