WEEK 07
K NEAREST NEIGHBOUR
A K Nearest Neighbor classifier is a machine learning model that makes predictions based
on the majority class of the K nearest data points in the feature space.
The KNN algorithm assumes that similar things exist in close proximity, making it intuitive
and easy to understand.
Working example: classifying Job Employment from CGPA and Age.

CGPA   Age   Job Employment
3.5    22    1
3.2    23    0
3.8    21    1
3.0    24    0
3.7    22    1
3.3    25    1
2.9    23    0
3.6    21    1
3.1    24    0
3.4    22    1

To classify a new point:
1. Calculate the distance from each point in the space (e.g. Euclidean distance)
2. Sort all distances
3. Majority count
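Below is a minimal Python sketch of the three steps above, run on the table data; the new applicant's values (CGPA 3.4, Age 23) and the choice of K = 3 are illustrative assumptions, not part of the slides.

import numpy as np
from collections import Counter

# Feature columns: CGPA, Age; label column: Job Employment
X = np.array([[3.5, 22], [3.2, 23], [3.8, 21], [3.0, 24], [3.7, 22],
              [3.3, 25], [2.9, 23], [3.6, 21], [3.1, 24], [3.4, 22]])
y = np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 1])

def knn_predict(query, X, y, k=3):
    # 1. Calculate the Euclidean distance from the query to every point
    distances = np.linalg.norm(X - query, axis=1)
    # 2. Sort the distances and keep the indices of the k nearest points
    nearest = np.argsort(distances)[:k]
    # 3. Majority count of the nearest neighbours' labels
    return Counter(y[nearest]).most_common(1)[0][0]

# Hypothetical new applicant with CGPA 3.4 and Age 23
print(knn_predict(np.array([3.4, 23]), X, y, k=3))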
HOW IS THE K-DISTANCE CALCULATED?
Euclidean distance
The Euclidean distance between two points is the length of the straight line segment connecting them. This most common distance metric is applied to real-valued vectors.
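A rough sketch of the Euclidean distance between two feature vectors (the sample values, taken from the table above, are only illustrative):

import numpy as np

def euclidean_distance(a, b):
    # Length of the straight line segment between two real-valued vectors
    return np.sqrt(np.sum((np.asarray(a) - np.asarray(b)) ** 2))

print(euclidean_distance([3.5, 22], [3.2, 23]))  # approximately 1.04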
HOW IS THE K-DISTANCE CALCULATED?
Manhattan distance
The Manhattan distance between two points is the sum of the absolute differences between the x and y coordinates of each point.
Used to measure the minimum distance by summing the length of all the intervals needed to get from one location to another in a city, it is also known as the taxicab distance.
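A minimal sketch of the Manhattan distance, using the same illustrative vectors as above:

import numpy as np

def manhattan_distance(a, b):
    # Sum of the absolute coordinate differences ("taxicab" distance)
    return np.sum(np.abs(np.asarray(a) - np.asarray(b)))

print(manhattan_distance([3.5, 22], [3.2, 23]))  # approximately 1.3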
HOW IS THE K-DISTANCE CALCULATED?
Minkowski distance
Minkowski distance generalizes the Euclidean and Manhattan distances. It adds a parameter called "order" that allows different distance measures to be calculated.
Minkowski distance indicates a distance between two points in a normed vector space.
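A sketch of the Minkowski distance with a configurable order p; setting p = 1 or p = 2 reproduces the Manhattan and Euclidean distances (the sample vectors are assumptions):

import numpy as np

def minkowski_distance(a, b, p=2):
    # The "order" parameter p selects the metric: p=1 is Manhattan, p=2 is Euclidean
    diff = np.abs(np.asarray(a) - np.asarray(b))
    return np.sum(diff ** p) ** (1 / p)

print(minkowski_distance([3.5, 22], [3.2, 23], p=1))  # Manhattan, approximately 1.3
print(minkowski_distance([3.5, 22], [3.2, 23], p=2))  # Euclidean, approximately 1.04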
HOW IS THE K-DISTANCE CALCULATED?
Hamming distance
Hamming distance is used to compare two binary vectors (also called data strings or bitstrings).
To calculate it, data first has to be translated into a binary system.
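A minimal sketch of the Hamming distance between two equal-length bitstrings (the example strings are assumptions):

def hamming_distance(a, b):
    # Number of positions at which the two bitstrings differ
    assert len(a) == len(b), "bitstrings must have equal length"
    return sum(x != y for x, y in zip(a, b))

print(hamming_distance("101100", "100110"))  # 2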
REFER TO CODING EXAMPLE
DATASET DISTRIBUTION
HOW TO DETERMINE THE K VALUE IN THE K-NEIGHBORS CLASSIFIER?
Choosing the optimal k value helps the model achieve its maximum accuracy, but this process is always challenging.
The simplest solution is to try out k values and find the one that brings the best results on the
testing set. For this, we follow these steps:
1. Select a k value to start with. In practice, k is usually chosen between 3 and 10, but there are no strict rules.
a) A small value of k results in unstable decision boundaries.
b) A large value of k often leads to smoother decision boundaries, but not always to better metrics.
c) So it's always about trial and error.
2. Try out different k values and note their accuracy on the testing set.
3. Choose k with the lowest error rate and implement the model.
Cross validation can be used to estimate the accuracy of each candidate k more reliably than a single train/test split (see the sketch below).
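A hedged sketch of this search using scikit-learn's KNeighborsClassifier and 5-fold cross validation; the Iris dataset and the k range of 3 to 10 are illustrative assumptions, not the lecture's dataset:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Illustrative dataset; replace with the dataset used in the coding example
X, y = load_iris(return_X_y=True)

# Try k = 3..10 and record the mean cross-validated accuracy for each
scores = {}
for k in range(3, 11):
    model = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(model, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print("best k:", best_k, "accuracy:", scores[best_k])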
Dataset Used
https://towardsdatascience.com/k-nearest-neighbor-classifier-explained-a-visual-guide-with-code-examples-for-beginners-a3d85cad00e1
PROS & CONS
Pros:
Simplicity: Easy to understand and implement.
No Assumptions: Doesn’t assume anything about the data distribution.
Versatility: Can be used for both classification and regression tasks.
No Training Phase: Can quickly incorporate new data without retraining.
Cons:
Computationally Expensive: Needs to compute distances to all training samples for each
prediction.
Memory Intensive: Requires storing all training data.
Sensitive to Irrelevant Features: Can be thrown off by features that aren’t important to the
classification.
Curse of Dimensionality: Performance degrades in high-dimensional spaces.
FINAL REMARKS
Introduction to KNN
Simple and Intuitive: A straightforward algorithm for classification.
Proximity-Based: Makes predictions based on the similarity of data points.
No Explicit Training: Leverages the entire dataset for predictions.
Advantages of KNN
Easy to Understand: Simple concept, easy to implement.
Versatile: Applicable to various classification problems.
No Model Training: Quick to deploy.
Disadvantages of KNN
Computational Cost: Can be slow for large datasets.
Sensitive to Noise: Noisy data can impact predictions.
Curse of Dimensionality: Performance degrades in high-dimensional spaces.