K-Nearest Neighbors Algorithm
In statistics, the k-nearest neighbors algorithm (k-NN) is a non-parametric supervised learning method
first developed by Evelyn Fix and Joseph Hodges in 1951,[1] and later expanded by Thomas Cover.[2] It is
used for classification and regression. In both cases, the input consists of the k closest training examples in a
data set. The output depends on whether k-NN is used for classification or regression:
In k-NN classification, the output is a class membership. An object is classified by a plurality vote of
its neighbors, being assigned to the class most common among its k nearest neighbors (k is a positive
integer, typically small). If k = 1, the object is simply assigned to the class of that single nearest neighbor.
In k-NN regression, the output is the property value for the object, namely the average of the values of
its k nearest neighbors. If k = 1, the output is simply the value of that single nearest neighbor.
k-NN is a type of instance-based learning, or lazy learning, where the function is only approximated locally and all computation is
deferred until function evaluation. Since this algorithm relies on distance for classification, if the features
represent different physical units or come in vastly different scales then normalizing the training data can
improve its accuracy dramatically.[3]
Both for classification and regression, a useful technique can be to assign weights to the contributions of the
neighbors, so that the nearer neighbors contribute more to the average than the more distant ones. For
example, a common weighting scheme consists in giving each neighbor a weight of 1/d, where d is the
distance to the neighbor.[4]
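As a concrete illustration of the two points above, the following Python sketch standardizes the features (so differing scales do not dominate the metric) and then classifies a query by a 1/d-weighted vote of its k nearest neighbours. The function name and the toy data are purely illustrative, not from any cited source.

```python
import numpy as np

def knn_predict(X_train, y_train, query, k=5, eps=1e-12):
    """Classify `query` by an inverse-distance (1/d) weighted vote
    of its k nearest neighbours under Euclidean distance."""
    # Standardize features so that units/scales do not dominate the metric.
    mu, sigma = X_train.mean(axis=0), X_train.std(axis=0) + eps
    Xs = (X_train - mu) / sigma
    q = (query - mu) / sigma

    dists = np.linalg.norm(Xs - q, axis=1)      # distance to every training point
    nearest = np.argsort(dists)[:k]             # indices of the k closest points
    weights = 1.0 / (dists[nearest] + eps)      # nearer neighbours contribute more

    votes = {}
    for idx, w in zip(nearest, weights):
        votes[y_train[idx]] = votes.get(y_train[idx], 0.0) + w
    return max(votes, key=votes.get)            # label with the largest weighted vote

# Tiny illustrative data set (two classes in 2-D).
X = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0], [1.2, 2.2]])
y = np.array([0, 0, 1, 1, 0])
print(knn_predict(X, y, np.array([1.4, 2.1]), k=3))   # expected: 0
```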
The neighbors are taken from a set of objects for which the class (for k-NN classification) or the object
property value (for k-NN regression) is known. This can be thought of as the training set for the algorithm,
though no explicit training step is required.
A peculiarity of the k-NN algorithm is that it is sensitive to the local structure of the data.
Statistical setting
Suppose we have pairs $(X_1, Y_1), (X_2, Y_2), \dots, (X_n, Y_n)$ taking values in $\mathbb{R}^d \times \{1, 2\}$, where $Y$ is the
class label of $X$, so that $X \mid Y = r \sim P_r$ for $r = 1, 2$ (and probability distributions $P_r$). Given some norm
$\|\cdot\|$ on $\mathbb{R}^d$ and a point $x \in \mathbb{R}^d$, let $(X_{(1)}, Y_{(1)}), \dots, (X_{(n)}, Y_{(n)})$ be a reordering of the training data
such that $\|X_{(1)} - x\| \le \dots \le \|X_{(n)} - x\|$.
Algorithm
The training examples are vectors in a multidimensional feature space, each with a class label. The training
phase of the algorithm consists only of storing the feature vectors and class labels of the training samples.
In the classification phase, k is a user-defined constant, and an unlabeled vector (a query or test point) is
classified by assigning the label which is most frequent among the k training samples nearest to that query
point.
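This stored-examples/majority-vote procedure is available off the shelf. A brief sketch using scikit-learn (the library and the toy data are assumptions made here for illustration, not something prescribed by the sources above):

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# The "training phase" only stores (scaled) feature vectors and class labels.
X_train = [[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]]
y_train = [0, 0, 1, 1]

model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=3))
model.fit(X_train, y_train)

# Classification phase: each query gets the most frequent label among its 3 neighbours.
print(model.predict([[1.2, 2.1], [5.5, 8.5]]))   # e.g. [0 1]
```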
A commonly used distance metric for continuous variables is
Euclidean distance. For discrete variables, such as for text
classification, another metric can be used, such as the overlap
metric (or Hamming distance). In the context of gene expression
microarray data, for example, k-NN has been employed with
correlation coefficients, such as Pearson and Spearman, as a
metric.[5] Often, the classification accuracy of k-NN can be
improved significantly if the distance metric is learned with
specialized algorithms such as Large Margin Nearest Neighbor or
Neighbourhood components analysis.
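The metric is a free parameter of the method. A small sketch of three of the options mentioned above (Euclidean distance, the overlap/Hamming metric, and a Pearson-correlation-based distance); the helper names are invented for illustration:

```python
import numpy as np

def euclidean(a, b):
    return np.linalg.norm(np.asarray(a) - np.asarray(b))

def hamming(a, b):
    # Overlap metric: number of positions in which the two sequences differ.
    return sum(x != y for x, y in zip(a, b))

def pearson_distance(a, b):
    # 1 - Pearson correlation; small when the two expression profiles co-vary.
    return 1.0 - np.corrcoef(a, b)[0, 1]

print(euclidean([1, 2, 3], [2, 2, 5]))               # 2.236...
print(hamming("karolin", "kathrin"))                 # 3
print(pearson_distance([1, 2, 3, 4], [2, 4, 6, 8]))  # 0.0 (perfectly correlated)
```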
A drawback of the basic "majority voting" classification occurs when the class distribution is skewed. That is, examples of a more frequent class tend to dominate the prediction of the new example, because they tend to be common among the k nearest neighbors due to their large number.[6] One way to overcome this problem is to weight the classification, taking into account the distance from the test point to each of its k nearest neighbors. The class (or value, in regression problems) of each of the k nearest points is multiplied by a weight proportional to the inverse of the distance from that point to the test point. Another way to overcome skew is by abstraction in data representation. For example, in a self-organizing map (SOM), each node is a representative (a center) of a cluster of similar points, regardless of their density in the original training data. K-NN can then be applied to the SOM.
[Figure: Example of k-NN classification. The test sample (green dot) should be classified either to blue squares or to red triangles. If k = 3 (solid line circle) it is assigned to the red triangles because there are 2 triangles and only 1 square inside the inner circle. If k = 5 (dashed line circle) it is assigned to the blue squares (3 squares vs. 2 triangles inside the outer circle).]
Parameter selection
The best choice of k depends upon the data; generally, larger values of k reduce the effect of noise on the
classification,[7] but make boundaries between classes less distinct. A good k can be selected by various
heuristic techniques (see hyperparameter optimization). The special case where the class is predicted to be
the class of the closest training sample (i.e. when k = 1) is called the nearest neighbor algorithm.
The accuracy of the k-NN algorithm can be severely degraded by the presence of noisy or irrelevant
features, or if the feature scales are not consistent with their importance. Much research effort has been put
into selecting or scaling features to improve classification. A particularly popular approach is the use of
evolutionary algorithms to optimize feature scaling.[8] Another popular approach is to scale features by the
mutual information of the training data with the training classes.
In binary (two class) classification problems, it is helpful to choose k to be an odd number as this avoids
tied votes. One popular way of choosing the empirically optimal k in this setting is via the bootstrap method.[9]
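A hedged sketch of selecting k empirically (here by ordinary cross-validation rather than the bootstrap of [9]; the dataset and candidate range are illustrative, and scikit-learn is assumed to be available):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Evaluate odd values of k so that two-class votes cannot tie.
candidates = range(1, 22, 2)
scores = [cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
          for k in candidates]
best_k = list(candidates)[int(np.argmax(scores))]
print(best_k, max(scores))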
Let $C_n^{wnn}$ denote the weighted nearest neighbour classifier with weights $\{w_{ni}\}_{i=1}^{n}$. Subject to regularity conditions
on the class distributions, the excess risk has the following asymptotic
expansion[11]
$$\mathcal{R}(C_{n}^{wnn}) - \mathcal{R}(C^{Bayes}) = \left(B_1 s_n^2 + B_2 t_n^2\right)\{1 + o(1)\},$$
for constants $B_1$ and $B_2$, where $s_n^2 = \sum_{i=1}^{n} w_{ni}^2$ and $t_n = n^{-2/d} \sum_{i=1}^{n} w_{ni}\{i^{1+2/d} - (i-1)^{1+2/d}\}$.
The optimal weighting scheme $\{w_{ni}^{*}\}_{i=1}^{n}$, which balances the two terms in the display above, is given as
follows: set $k^{*} = \lfloor B n^{4/(d+4)} \rfloor$,
$$w_{ni}^{*} = \frac{1}{k^{*}}\left[1 + \frac{d}{2} - \frac{d}{2 (k^{*})^{2/d}}\left\{i^{1+2/d} - (i-1)^{1+2/d}\right\}\right] \quad \text{for } i = 1, 2, \dots, k^{*}, \text{ and}$$
$$w_{ni}^{*} = 0 \quad \text{for } i = k^{*}+1, \dots, n.$$
With optimal weights the dominant term in the asymptotic expansion of the excess risk is $\mathcal{O}(n^{-4/(d+4)})$.
Similar results are true when using a bagged nearest neighbour classifier.
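A small numerical sketch of the weighting scheme displayed above, treating the constant B (and hence k*) as given; the function name is invented for illustration:

```python
import numpy as np

def optimal_weights(n, d, k_star):
    """Weights w*_{ni}: the first k_star values follow the displayed formula,
    the remaining n - k_star weights are zero."""
    i = np.arange(1, k_star + 1)
    bracket = i ** (1 + 2 / d) - (i - 1) ** (1 + 2 / d)
    w = (1 / k_star) * (1 + d / 2 - (d / (2 * k_star ** (2 / d))) * bracket)
    return np.concatenate([w, np.zeros(n - k_star)])

w = optimal_weights(n=1000, d=5, k_star=50)
print(w[:3], w.sum())   # the weights decrease in i and sum to one
```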
Properties
k-NN is a special case of a variable-bandwidth, kernel density "balloon" estimator with a uniform
kernel.[12][13]
The naive version of the algorithm is easy to implement by computing the distances from the test example
to all stored examples, but it is computationally intensive for large training sets. Using an approximate
nearest neighbor search algorithm makes k-NN computationally tractable even for large data sets. Many
nearest neighbor search algorithms have been proposed over the years; these generally seek to reduce the
number of distance evaluations actually performed.
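As a sketch of avoiding the naive all-pairs scan, the training points can be indexed in a space-partitioning structure such as a k-d tree (here via SciPy, which is assumed to be available; the random data is illustrative):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X_train = rng.random((100_000, 3))     # large stored training set
tree = cKDTree(X_train)                # build the index once

query = rng.random((1, 3))
dists, idx = tree.query(query, k=5)    # k nearest neighbours without scanning every point
print(idx, dists)
```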
k-NN has some strong consistency results. As the amount of data approaches infinity, the two-class k-NN
algorithm is guaranteed to yield an error rate no worse than twice the Bayes error rate (the minimum
achievable error rate given the distribution of the data).[2] Various improvements to the k-NN speed are
possible by using proximity graphs.[14]
For multi-class k-NN classification, Cover and Hart (1967) prove an upper bound error rate of
$$R^{*} \le R_{kNN} \le R^{*}\left(2 - \frac{M R^{*}}{M-1}\right),$$
where $R^{*}$ is the Bayes error rate (which is the minimal error rate possible), $R_{kNN}$ is the k-NN error rate,
and M is the number of classes in the problem. For $M = 2$ and as the Bayesian error rate $R^{*}$ approaches
zero, this limit reduces to "not more than twice the Bayesian error rate".
Error rates
There are many results on the error rate of the k nearest neighbour classifiers.[15] The k-nearest neighbour
classifier is strongly consistent (that is, for any joint distribution of $(X, Y)$) provided $k := k_n$ diverges and
$k_n / n$ converges to zero as $n \to \infty$.
Let $C_n^{knn}$ denote the k nearest neighbour classifier based on a training set of size $n$. Under certain
regularity conditions, the excess risk yields the following asymptotic expansion[11]
$$\mathcal{R}(C_{n}^{knn}) - \mathcal{R}(C^{Bayes}) = \left\{B_1 \frac{1}{k} + B_2 \left(\frac{k}{n}\right)^{4/d}\right\}\{1 + o(1)\},$$
for constants $B_1$ and $B_2$.
The choice $k^{*} = \lfloor B n^{4/(d+4)} \rfloor$ offers a trade off between the two terms in the above display, for which the
$k^{*}$-nearest neighbour error converges to the Bayes error at the optimal (minimax) rate $\mathcal{O}(n^{-4/(d+4)})$.
Metric learning
The K-nearest neighbor classification performance can often be significantly improved through
(supervised) metric learning. Popular algorithms are neighbourhood components analysis and large margin
nearest neighbor. Supervised metric learning algorithms use the label information to learn a new metric or
pseudo-metric.
Feature extraction
When the input data to an algorithm is too large to be processed and it is suspected to be redundant (e.g. the
same measurement in both feet and meters) then the input data will be transformed into a reduced
representation set of features (also named features vector). Transforming the input data into the set of
features is called feature extraction. If the features extracted are carefully chosen it is expected that the
features set will extract the relevant information from the input data in order to perform the desired task
using this reduced representation instead of the full size input. Feature extraction is performed on raw data
prior to applying k-NN algorithm on the transformed data in feature space.
An example of a typical computer vision pipeline for face recognition using k-NN, including
feature extraction and dimension reduction pre-processing steps (usually implemented with OpenCV), is:
1. Haar face detection
2. Mean-shift tracking analysis
3. PCA or Fisher LDA projection into feature space, followed by k-NN classification
Dimension reduction
For high-dimensional data (e.g., with number of dimensions more than 10) dimension reduction is usually
performed prior to applying the k-NN algorithm in order to avoid the effects of the curse of
dimensionality.[16]
The curse of dimensionality in the k-NN context basically means that Euclidean distance is unhelpful in
high dimensions because all vectors are almost equidistant to the search query vector (imagine multiple
points lying more or less on a circle with the query point at the center; the distance from the query to all
data points in the search space is almost the same).
Feature extraction and dimension reduction can be combined in one step using principal component
analysis (PCA), linear discriminant analysis (LDA), or canonical correlation analysis (CCA) techniques as
a pre-processing step, followed by clustering by k-NN on feature vectors in reduced-dimension space. This
process is also called low-dimensional embedding.[17]
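A hedged sketch of this combination, with PCA as the dimension-reduction step followed by a k-NN classifier (scikit-learn and its digits dataset are assumptions made here for illustration):

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

X, y = load_digits(return_X_y=True)            # 64-dimensional inputs

# Project onto a low-dimensional subspace, then run k-NN in that space.
model = make_pipeline(PCA(n_components=15), KNeighborsClassifier(n_neighbors=5))
print(cross_val_score(model, X, y, cv=5).mean())
```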
For very-high-dimensional datasets (e.g. when performing a similarity search on live video streams, DNA
data or high-dimensional time series) running a fast approximate k-NN search using locality sensitive
hashing, "random projections",[18] "sketches" [19] or other high-dimensional similarity search techniques
from the VLDB toolbox might be the only feasible option.
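As one minimal sketch of the random-projection idea (not locality sensitive hashing itself): project the very high-dimensional data into a much lower-dimensional space and search there, so that neighbours found are approximate neighbours of the original vectors. The libraries and data are assumptions for illustration.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.random((5000, 1000))                     # very high-dimensional data

# Random projection into 50 dimensions approximately preserves distances.
proj = GaussianRandomProjection(n_components=50, random_state=0)
X_low = proj.fit_transform(X)

nn = NearestNeighbors(n_neighbors=10).fit(X_low)
dists, idx = nn.kneighbors(proj.transform(X[:1]))   # approximate neighbours of the first point
print(idx)
```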
Decision boundary
Nearest neighbor rules in effect implicitly compute the decision boundary. It is also possible to compute the
decision boundary explicitly, and to do so efficiently, so that the computational complexity is a function of
the boundary complexity.[20]
Data reduction
Data reduction is one of the most important problems for work with huge data sets. Usually, only some of
the data points are needed for accurate classification. Those data are called the prototypes and can be found
as follows:
1. Select the class-outliers, that is, training data that are classified incorrectly by k-NN (for a
given k)
2. Separate the rest of the data into two sets: (i) the prototypes that are used for the
classification decisions and (ii) the absorbed points that can be correctly classified by k-NN
using prototypes. The absorbed points can then be removed from the training set.
Selection of class-outliers
A training example surrounded by examples of other classes is called a class outlier. Causes of class outliers
include:
random error
insufficient training examples of this class (an isolated example appears instead of a cluster)
missing important features (the classes are separated in other dimensions which we don't
know)
too many training examples of other classes (unbalanced classes) that create a "hostile"
background for the given small class
Class outliers with k-NN produce noise. They can be detected and separated for future analysis. Given two
natural numbers, k>r>0, a training example is called a (k,r)NN class-outlier if its k nearest neighbors
include more than r examples of other classes.
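A brute-force sketch of this (k,r)NN rule for small data sets; the function name and toy data are invented for illustration:

```python
import numpy as np

def kr_class_outliers(X, y, k=3, r=2):
    """Return indices whose k nearest neighbours include more than r points of other classes."""
    outliers = []
    for i in range(len(X)):
        d = np.linalg.norm(X - X[i], axis=1)
        d[i] = np.inf                          # exclude the point itself
        neighbours = np.argsort(d)[:k]
        if np.sum(y[neighbours] != y[i]) > r:
            outliers.append(i)
    return outliers

X = np.array([[0, 0], [0, 1], [1, 0], [5, 5]])
y = np.array([0, 0, 0, 1])
print(kr_class_outliers(X, y, k=3, r=2))   # [3]: the lone class-1 point is surrounded by class 0
```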
Condensed nearest neighbor (CNN, the Hart algorithm) is an algorithm designed to reduce the data set for
k-NN classification.[21] It selects the set of prototypes U from the training data, such that 1NN with U can
classify the examples almost as accurately as 1NN does with the whole data set.
The examples are ordered by the border ratio
a(x) = ‖x'−y‖ / ‖x−y‖,
where ‖x−y‖ is the distance to the closest example y having a different color than x, and ‖x'−y‖ is the distance from y to its closest example x' with the same label as x.
[Figure: Three types of points: prototypes, class-outliers, and absorbed points.]
The border ratio is in the interval [0,1] because ‖x'−y‖ never exceeds ‖x−y‖. This ordering gives preference to the borders of the classes for inclusion in the set of prototypes U. A point of a different label than x is called external to x. The calculation of the border ratio is illustrated by the figure on the right. The data points are labeled by colors: the initial point is x and its label is red. External points are blue and green. The closest external point to x is y. The closest red point to y is x'. The border ratio a(x) = ‖x'−y‖ / ‖x−y‖ is the attribute of the initial point x.
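A hedged sketch of the basic condensing loop (without the border-ratio ordering; the function name is invented): prototypes are accumulated until 1NN with the prototype set U classifies every remaining training point correctly.

```python
import numpy as np

def condensed_nn(X, y):
    """Hart-style condensing: grow a prototype set U until 1NN with U
    classifies all remaining training points correctly."""
    U = [0]                                    # start from an arbitrary point
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in U:
                continue
            d = np.linalg.norm(X[U] - X[i], axis=1)
            nearest = U[int(np.argmin(d))]
            if y[nearest] != y[i]:             # misclassified by 1NN on U: absorb it into U
                U.append(i)
                changed = True
    return np.array(U)
```

Points never added to U are the absorbed points; the scan can follow the border-ratio ordering described above so that points near class borders are considered first.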
Below is an illustration of CNN in a series of figures. There are three classes (red, green and blue). Fig. 1:
initially there are 60 points in each class. Fig. 2 shows the 1NN classification map: each pixel is classified
by 1NN using all the data. Fig. 3 shows the 5NN classification map. White areas correspond to the
unclassified regions, where 5NN voting is tied (for example, if there are two green, two red and one blue
points among 5 nearest neighbors). Fig. 4 shows the reduced data set. The crosses are the class-outliers
selected by the (3,2)NN rule (all the three nearest neighbors of these instances belong to other classes); the
squares are the prototypes, and the empty circles are the absorbed points. The left bottom corner shows the
numbers of the class-outliers, prototypes and absorbed points for all three classes. The number of
prototypes varies from 15% to 20% for different classes in this example. Fig. 5 shows that the 1NN
classification map with the prototypes is very similar to that with the initial data set. The figures were
produced using the Mirkes applet.[22]
k-NN regression
In k-NN regression, the k-NN algorithm is used for estimating continuous variables. One such algorithm
uses a weighted average of the k nearest neighbors, weighted by the inverse of their distance. This
algorithm works as follows (a code sketch is given after the list):
1. Compute the Euclidean or Mahalanobis distance from the query example to the labeled
examples.
2. Order the labeled examples by increasing distance.
3. Find a heuristically optimal number k of nearest neighbors, based on RMSE. This is done
using cross validation.
4. Calculate an inverse distance weighted average with the k-nearest multivariate neighbors.
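A sketch of these steps as a single function (Euclidean distance, inverse-distance weights, k chosen externally, e.g. by the cross-validation of step 3; names and data are illustrative):

```python
import numpy as np

def knn_regress(X_train, y_train, query, k=5, eps=1e-12):
    """Inverse-distance-weighted average of the k nearest neighbours' target values."""
    d = np.linalg.norm(X_train - query, axis=1)    # step 1: distances to all labeled examples
    nearest = np.argsort(d)[:k]                    # step 2: order and keep the k closest
    w = 1.0 / (d[nearest] + eps)                   # step 4: inverse-distance weights
    return np.sum(w * y_train[nearest]) / np.sum(w)

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 4.0, 9.0])
print(knn_regress(X, y, np.array([1.5]), k=2))     # weighted average of the values at x=1 and x=2
```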
k-NN outlier
The distance to the kth nearest neighbor can also be seen as a local density estimate and thus is also a
popular outlier score in anomaly detection. The larger the distance to the k-NN, the lower the local density,
the more likely the query point is an outlier.[23] Although quite simple, this outlier model, along with
another classic data mining method, local outlier factor, works quite well also in comparison to more recent
and more complex approaches, according to a large scale experimental analysis.[24]
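A sketch of the kth-nearest-neighbour distance as an outlier score (scikit-learn is assumed to be available; the data is synthetic for illustration):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, size=(200, 2)),    # dense cluster
               [[8.0, 8.0]]])                      # one isolated point

k = 5
nn = NearestNeighbors(n_neighbors=k + 1).fit(X)    # +1 because each point is its own nearest neighbour
dists, _ = nn.kneighbors(X)
scores = dists[:, k]                               # distance to the k-th neighbour (excluding self)
print(np.argmax(scores))                           # index 200: the isolated point has the largest score
```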
Validation of results
A confusion matrix or "matching matrix" is often used as a tool to validate the accuracy of k-NN
classification. More robust statistical methods such as likelihood-ratio test can also be applied.
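A minimal sketch of this validation step (scikit-learn and its iris dataset are assumptions for illustration):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import confusion_matrix

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

y_pred = KNeighborsClassifier(n_neighbors=5).fit(X_tr, y_tr).predict(X_te)
print(confusion_matrix(y_te, y_pred))              # rows: true classes, columns: predicted classes
```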
External links
K-Nearest Neighbor (KNN) Algorithm in Machine Learning (https://medium.com/@rizwanayasmeen06/k-nearest-neighbor-knn-algorithm-in-machine-learning-d38d9638d7e0)
Further reading
Dasarathy, Belur V., ed. (1991). Nearest Neighbor (NN) Norms: NN Pattern Classification
Techniques. ISBN 978-0818689307.
Shakhnarovich, Gregory; Darrell, Trevor; Indyk, Piotr, eds. (2005). Nearest-Neighbor
Methods in Learning and Vision. MIT Press. ISBN 978-0262195478.