K-Nearest Neighbors
Marcel Van Velzen, Junior Marte Garcia
Benefits:
- Simple and effective
- Makes no assumptions about the underlying data distribution
- Fast training phase

Drawbacks:
- Does not produce a model
- Slow classification phase
- Requires a large amount of memory
- Nominal features and missing data require additional processing
Calculating distance
To determine which data points are closest to a given query point, the distance between the query point and each of the other data points must be calculated.
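With Euclidean distance (the measure used in the example later in these notes), the distance between points p and q with n features is d(p, q) = sqrt((p1 - q1)^2 + ... + (pn - qn)^2). A minimal sketch in Python; the function name and sample points are illustrative:

    import math

    def euclidean_distance(p, q):
        # Sum the squared differences across all features, then take the square root.
        return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

    # Distance between a two-feature query point and one data point.
    print(euclidean_distance([1.0, 2.0], [4.0, 6.0]))  # 5.0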
Choosing an appropriate k
Deciding how many neighbors to use in kNN determines how well the model generalizes to future data. Balancing overfitting against underfitting the training data is a problem known as the bias-variance tradeoff. Smaller values of k have low bias and high variance, while larger values of k have high bias and low variance.
A common practice is to start with k equal to the square root of the number of training examples; for example, with 100 training examples, start with k = 10.
To determine the optimal value of k in k-nearest neighbors (kNN) for both regression and classification problems, you can use the following techniques:

- Cross-validation: Evaluate the performance of kNN using techniques like k-fold cross-validation, stratified k-fold cross-validation, or leave-one-out cross-validation, and select the value of k that yields the best performance metric.
- Grid search: Define a range of possible values for k, evaluate the performance of kNN for each value using an appropriate performance metric, and select the value of k that scores highest (see the sketch after this list).
- Domain knowledge and experimentation: Leverage insights or prior knowledge about the data, experiment with different values of k, and observe the model's performance to determine the optimal value of k for the specific problem and dataset.
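As a concrete illustration of combining grid search with k-fold cross-validation, here is a minimal sketch; the use of scikit-learn, the built-in iris dataset, and the candidate range for k are assumptions made for the example, not part of the original notes:

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)

    # Candidate values of k; odd values help avoid tied votes.
    param_grid = {"n_neighbors": list(range(1, 21, 2))}

    # 5-fold cross-validation scores each candidate k and keeps the best one.
    search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
    search.fit(X, y)
    print(search.best_params_)  # e.g. {'n_neighbors': 7}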
Preparing data
Feature scaling: When features are measured on different scales, rescale them so that no single feature dominates the distance calculation; this mitigates bias and improves overall performance. The traditional method of rescaling features for kNN is min-max normalization. Another common transformation is z-score normalization.
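In min-max normalization each value is rescaled as x' = (x - min) / (max - min), landing in the [0, 1] range; in z-score normalization it is rescaled as z = (x - mean) / standard deviation. A minimal sketch with NumPy; the sample values are illustrative:

    import numpy as np

    x = np.array([10.0, 20.0, 30.0, 40.0])

    # Min-max normalization: rescale into the [0, 1] range.
    min_max = (x - x.min()) / (x.max() - x.min())

    # Z-score normalization: center on the mean, scale by the standard deviation.
    z_score = (x - x.mean()) / x.std()

    print(min_max)   # [0.     0.3333 0.6667 1.    ]
    print(z_score)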
Feature engineering: Transforming features to reveal patterns, enhance class separability, or strengthen the
relationship with the target variable using techniques like feature extraction, dimensionality reduction, or creating
new derived features based on domain knowledge.
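For instance, dimensionality reduction can be chained with kNN in a single pipeline. A minimal sketch, assuming scikit-learn's PCA on the built-in iris dataset; the component count is an arbitrary choice for illustration:

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline

    X, y = load_iris(return_X_y=True)

    # Project onto 2 principal components before running kNN.
    model = make_pipeline(PCA(n_components=2), KNeighborsClassifier(n_neighbors=5))
    model.fit(X, y)
    print(model.score(X, y))  # accuracy on the reduced features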
Handling categorical features: Categorical features should be encoded into numerical representations before running the algorithm. Common approaches for kNN include label encoding for ordinal variables, one-hot encoding for non-ordinal variables, and feature hashing for high-cardinality variables.
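A minimal sketch of both encodings with pandas; the column names, category values, and ordering are hypothetical:

    import pandas as pd

    df = pd.DataFrame({"color": ["red", "blue", "red"],
                       "size": ["small", "medium", "large"]})

    # One-hot encoding for a non-ordinal variable: one binary column per category.
    one_hot = pd.get_dummies(df["color"], prefix="color")

    # Label encoding for an ordinal variable: map categories to ordered integers.
    size_order = {"small": 0, "medium": 1, "large": 2}
    label_encoded = df["size"].map(size_order)

    print(one_hot)
    print(label_encoded)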
Example
The table represents our data set and has two columns, Brightness and Saturation. Each row has a class of either Red or Blue, and Euclidean distance is used as the distance measure. We have a new entry that does not yet have a class. Suppose k = 3.
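The table itself did not survive conversion, so the points below are placeholder values standing in for the lost data, not the original entries. A sketch of the k = 3 classification step with scikit-learn:

    from sklearn.neighbors import KNeighborsClassifier

    # Hypothetical (brightness, saturation) rows in place of the lost table.
    X = [[40, 20], [50, 50], [60, 90], [10, 25], [70, 70], [60, 10], [25, 80]]
    y = ["Red", "Blue", "Blue", "Red", "Blue", "Red", "Blue"]

    # Euclidean distance is the default metric; k = 3 as in the example.
    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(X, y)

    # Classify the new, unlabeled entry by majority vote of its 3 nearest rows.
    print(knn.predict([[20, 35]]))  # ['Red'] for these placeholder values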
Applications
- Recommender Systems: kNN can power personalized recommendations by finding similarities between users and suggesting items that similar users have shown interest in, improving customer engagement and sales.
- Fraud Detection: kNN can identify suspicious transactions or activities by comparing them to known fraudulent patterns or similar historical instances, helping businesses prevent financial losses due to fraudulent behavior.
- Credit Scoring: kNN can assess creditworthiness by comparing the financial profiles of loan applicants to those of existing customers, assisting in the decision-making process for loan approvals and managing credit risk.
- Image Recognition: kNN can classify images into categories or identify similar images by comparing their pixel values or extracted features, enabling applications in e-commerce, healthcare, security, and more.
- Anomaly Detection: kNN can identify anomalies in data, such as detecting faulty equipment or network intrusions, aiding in proactive maintenance and ensuring system security.
- Predictive Maintenance: kNN can predict equipment failures or maintenance needs by analyzing historical patterns and similarities with current operating conditions, reducing downtime and optimizing maintenance schedules.