Lecture 17 - KNN


Supervised Learning – Classification


K-Nearest Neighbor Algorithm
Definition of Nearest Neighbor

[Figure: (a) 1-nearest neighbor, (b) 2-nearest neighbor, (c) 3-nearest neighbor of a record x]

The k-nearest neighbors of a record x are the data points that
have the k smallest distances to x
Basic Idea

 The k-NN classification rule assigns to a test sample the majority
category label of its k nearest training samples (a minimal sketch is
given below)
 In practice, k is usually chosen to be odd, so as to avoid ties
 The k = 1 rule is generally called the nearest-neighbor
classification rule
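
To make the voting rule concrete, below is a minimal brute-force sketch in Python; the function name knn_predict and the toy data are illustrative, not part of the lecture.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_query, k=3):
    """Classify x_query by majority vote over its k nearest training samples."""
    # Euclidean distance from the query to every training point
    dists = np.linalg.norm(X_train - x_query, axis=1)
    # Indices of the k smallest distances
    nearest = np.argsort(dists)[:k]
    # Majority label among those k neighbors
    return Counter(y_train[nearest]).most_common(1)[0][0]

# Toy example
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.9]])
y_train = np.array(["A", "A", "B", "B"])
print(knn_predict(X_train, y_train, np.array([1.1, 1.0]), k=3))  # -> "A"
```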
Nearest-Neighbor Classifiers: Issues

– The value of k, the number of nearest neighbors to retrieve
– Choice of distance metric to compute the distance between records
– Computational complexity
– Size of the training set
– Dimension of the data
Value of K
 Choosing the value of k:
 If k is too small, the classifier is sensitive to noise points
 If k is too large, the neighborhood may include points from
other classes

Rule of thumb (a worked example follows below):
k = sqrt(N), where N is the number of training points
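
As a worked example of this rule of thumb (combined with the earlier advice to keep k odd), here is a small hypothetical helper; the name rule_of_thumb_k is made up for illustration.

```python
import math

def rule_of_thumb_k(n_train):
    """Pick k near sqrt(N), nudged to the nearest odd value to avoid ties."""
    k = max(1, round(math.sqrt(n_train)))
    return k if k % 2 == 1 else k + 1

print(rule_of_thumb_k(100))  # sqrt(100) = 10 -> 11
```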
Distance Metrics
Distance Measure: Scale Effects

 Different features may have different measurement scales
 E.g., patient weight in kg (range [50, 200]) vs. blood protein
values in ng/dL (range [-3, 3])
 Consequences
 Patient weight will have a much greater influence on the
distance between samples
 May bias the performance of the classifier
Standardization

 Transform raw feature values into z-scores


z_ij = (x_ij - m_j) / s_j

 x_ij is the value of the jth feature for the ith sample
 m_j is the average of x_ij over all samples, for feature j
 s_j is the standard deviation of x_ij over all samples, for feature j
 Range and scale of z-scores should be similar (provided the
distributions of the raw feature values are alike)
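
A short sketch of this standardization in Python/NumPy; the function name standardize and the sample data are illustrative.

```python
import numpy as np

def standardize(X):
    """Column-wise z-scores: subtract each feature's mean, divide by its std."""
    m = X.mean(axis=0)          # m_j: per-feature mean
    s = X.std(axis=0)           # s_j: per-feature standard deviation
    s[s == 0] = 1.0             # guard against constant features
    return (X - m) / s

# Weight in kg and a protein value on a much smaller scale
X = np.array([[60.0, 0.5], [90.0, -1.2], [150.0, 2.0]])
Z = standardize(X)
print(Z.std(axis=0))  # each column now has unit standard deviation
```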
Additional Material
Voronoi Diagram

Properties:
1) All possible points within a sample's Voronoi cell are the nearest
neighboring points for that sample
2) For any sample, the nearest sample is determined by the closest
Voronoi cell edge
Distance-weighted k-NN

Replace the unweighted vote

f̂(q) = argmax_{v ∈ V} Σ_{i=1..k} δ(v, f(x_i))

with a vote weighted by the inverse squared distance to the query x_q:

f̂(q) = argmax_{v ∈ V} Σ_{i=1..k} [1 / d(x_i, x_q)²] · δ(v, f(x_i))

where δ(a, b) = 1 if a = b and 0 otherwise (a sketch follows below).

General kernel functions, such as Parzen windows, may be used
instead of the inverse distance.
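
A minimal sketch of this inverse-squared-distance vote in Python; the names are illustrative, and an exact match is returned directly to avoid dividing by zero.

```python
import numpy as np

def weighted_knn_predict(X_train, y_train, x_query, k=3):
    """Distance-weighted k-NN: each neighbor votes with weight 1 / d^2."""
    dists = np.linalg.norm(X_train - x_query, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = {}
    for i in nearest:
        if dists[i] == 0.0:              # exact match: return its label directly
            return y_train[i]
        w = 1.0 / (dists[i] ** 2)        # inverse squared distance weight
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + w
    return max(votes, key=votes.get)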
Distance for Heterogeneous Data

Wilson, D. R. and Martinez, T. R., "Improved Heterogeneous Distance Functions,"
Journal of Artificial Intelligence Research, vol. 6, no. 1, pp. 1-34, 1997.
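
As a rough illustration of the kind of heterogeneous distance discussed in that paper, here is a simplified HEOM-style sketch: nominal attributes contribute an overlap term (0 if equal, 1 otherwise), numeric attributes a range-normalized difference. The function name, arguments, and data are assumptions for illustration, not the paper's exact definition.

```python
import math

def heom_distance(x, y, is_nominal, ranges):
    """Simplified HEOM-style distance over mixed nominal/numeric attributes.

    is_nominal[j] marks nominal attributes; ranges[j] is (max - min) of
    numeric attribute j over the training data.
    """
    total = 0.0
    for j, (a, b) in enumerate(zip(x, y)):
        if is_nominal[j]:
            d = 0.0 if a == b else 1.0                        # overlap metric for nominal values
        else:
            d = abs(a - b) / ranges[j] if ranges[j] else 0.0  # range-normalized numeric difference
        total += d * d
    return math.sqrt(total)

# Example: (weight in kg, blood type)
print(heom_distance((70.0, "A"), (90.0, "B"), is_nominal=(False, True), ranges=(150.0, None)))
```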
Nearest Neighbor: Computational Complexity
 Expensive
 To determine the nearest neighbor of a query point q, we must
compute the distance to all N training examples
+ Pre-sort training examples into fast data structures such as kd-trees
(see the sketch after this list)
+ Compute only an approximate distance (LSH)
+ Remove redundant data (condensing)
 Storage Requirements
 Must store all training data
+ Remove redundant data (condensing)
- Pre-sorting often increases the storage requirements
 High-Dimensional Data
 "Curse of Dimensionality"
 Required amount of training data increases exponentially with dimension
 Computational cost also increases dramatically
 Partitioning techniques degrade to linear search in high dimensions
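
As a sketch of the kd-tree mitigation, the example below uses SciPy's cKDTree to pre-sort the training data and answer queries without scanning all N points; it assumes SciPy is installed, and the data are random and purely illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X_train = rng.random((10000, 3))      # 10,000 training points in 3-D
tree = cKDTree(X_train)               # pre-sort once into a kd-tree

query = np.array([0.5, 0.5, 0.5])
dists, idx = tree.query(query, k=3)   # k nearest neighbors via tree search
print(idx, dists)
```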
KNN: Alternate Terminologies

 Instance-Based Learning
 Lazy Learning
 Case-Based Reasoning
 Exemplar-Based Learning
Discussions
 kNN can deal with complex and arbitrary decision boundaries.
 Despite its simplicity, researchers have shown that the
classification accuracy of kNN can be quite strong and, in many
cases, as accurate as that of more elaborate methods.
 kNN is slow at classification time.
 kNN does not produce an understandable model.
Summary
 Applications of supervised learning arise in almost any field
or domain.
 We studied 4 classification techniques.
 There are still many other methods, e.g.,
 Bayesian networks
 Neural networks
 Genetic algorithms
 Fuzzy classification
 This large number of methods also shows the importance of
classification and its wide applicability.
 It remains an active research area.
