classification but can also be used for regression tasks. It works by finding the "k" closest data points (neighbors) to a given input and making a prediction based on the majority class (for classification) or the average value (for regression). Because KNN makes no assumptions about the underlying data distribution, it is a non-parametric, instance-based learning method.
K-Nearest Neighbors is also called a lazy learner algorithm because it does not learn from the training set immediately; instead, it stores the dataset and only performs computation at the time of classification.
For example, consider a set of data points plotted on two features, where the red diamonds represent Category 1 and the blue squares represent Category 2. To classify a new data point, KNN checks its closest neighbors (the circled points). Since the majority of those closest neighbors are blue squares (Category 2), KNN predicts that the new data point belongs to Category 2.
If 2 of those 3 fruits are apples and 1 is a banana, the algorithm says the new fruit is an apple
because most of its neighbors are apples.
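As a rough sketch of how this looks in code, the snippet below uses scikit-learn's KNeighborsClassifier on made-up two-feature points standing in for the two categories above (the data values and k = 3 are only illustrative):

# Minimal KNN classification sketch (assumes scikit-learn is installed)
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Two features per point; labels 1 and 2 stand in for Category 1 and Category 2
X = np.array([[1.0, 2.0], [1.5, 1.8], [2.0, 2.2],   # Category 1
              [6.0, 6.5], [6.5, 7.0], [7.0, 6.8]])  # Category 2
y = np.array([1, 1, 1, 2, 2, 2])

knn = KNeighborsClassifier(n_neighbors=3)  # k = 3 closest neighbors
knn.fit(X, y)                              # "lazy": simply stores the data

new_point = np.array([[6.2, 6.6]])
print(knn.predict(new_point))              # majority vote among the 3 neighbors -> [2]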
If the data has lots of noise or outliers, using a larger k can make the predictions more stable.
But if k is too large, the model may become too simple and miss important patterns; this is called underfitting.
Elbow Method: In the Elbow Method we draw a graph showing the error rate or accuracy for different k values. As k increases, the error usually drops at first, but after a certain point it stops decreasing quickly. The point where the curve changes direction and looks like an "elbow" is usually the best choice for k.
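A minimal sketch of the Elbow Method, assuming scikit-learn and matplotlib are available and using the Iris dataset purely as a stand-in for any labeled dataset:

# Elbow method sketch: plot validation error rate for a range of k values
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
import matplotlib.pyplot as plt

X, y = load_iris(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

ks = range(1, 21)
errors = []
for k in ks:
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    errors.append(1 - model.score(X_val, y_val))  # error rate = 1 - accuracy

plt.plot(ks, errors, marker="o")
plt.xlabel("k")
plt.ylabel("validation error rate")
plt.show()  # pick k near the "elbow" where the curve flattens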
Odd Values for k: It’s a good idea to use an odd number for k especially in classification problems.
This helps avoid ties when deciding which class is the most common among the neighbors.
1. Euclidean Distance
Euclidean distance is defined as the straight-line distance between two points in a plane or space.
You can think of it like the shortest path you would walk if you were to go directly from one point to
another.
$\text{distance}(x, X_i) = \sqrt{\sum_{j=1}^{d} (x_j - X_{ij})^2}$
2. Manhattan Distance
This is the total distance you would travel if you could only move along horizontal and vertical lines
like a grid or city streets. It’s also called "taxicab distance" because a taxi can only drive along the
grid-like streets of a city.
$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$
3. Minkowski Distance
Minkowski distance is like a family of distances, which includes both Euclidean and Manhattan
distances as special cases.
$d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$
From the formula above, when p=2, it becomes the same as the Euclidean distance formula and
when p=1, it turns into the Manhattan distance formula. Minkowski distance is essentially a flexible
formula that can represent either Euclidean or Manhattan distance depending on the value of p.
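The relationship between the three distances can be checked with a small NumPy sketch (the vectors here are arbitrary illustrative values):

import numpy as np

def minkowski(x, y, p):
    """Minkowski distance; p=1 gives Manhattan, p=2 gives Euclidean."""
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 0.0, 3.0])

euclidean = np.sqrt(np.sum((x - y) ** 2))
manhattan = np.sum(np.abs(x - y))

print(euclidean, minkowski(x, y, p=2))  # same value
print(manhattan, minkowski(x, y, p=1))  # same value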
In regression, the algorithm still looks for the k closest points. But instead of voting for a class as in classification, it takes the average of the values of those k neighbors, and that average becomes the predicted value for the new point.
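A minimal sketch of KNN regression using scikit-learn's KNeighborsRegressor, with made-up one-dimensional data:

# KNN regression sketch: the prediction is the mean target of the k nearest points
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])  # single feature
y = np.array([1.2, 1.9, 3.1, 4.0, 5.2])            # continuous targets

reg = KNeighborsRegressor(n_neighbors=3)
reg.fit(X, y)

print(reg.predict([[3.5]]))  # average of the targets of the 3 nearest neighbors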
This shows how a test point is classified based on its nearest neighbors: as the test point moves, the algorithm identifies the closest k data points (5 in this case) and assigns the test point the majority class label among them (the grey class here).
Perceptron
A Perceptron is a type of neural network that performs binary classification, mapping input features to an output decision and classifying data into one of two categories, such as 0 or 1.
A Perceptron consists of a single layer of input nodes that are fully connected to a layer of output nodes. It is particularly good at learning linearly separable patterns. It uses a variation of artificial neurons called Threshold Logic Units (TLUs), first introduced by Warren McCulloch and Walter Pitts in the 1940s. This foundational model has played a crucial role in the development of more advanced neural networks and machine learning algorithms.
Types of Perceptron
Single-Layer Perceptron is a type of perceptron that is limited to learning linearly separable patterns. It is effective for tasks where the data can be divided into distinct categories by a straight line. While powerful in its simplicity, it struggles with more complex problems where the relationship between inputs and outputs is non-linear.
Multi-Layer Perceptrons possess enhanced processing capabilities as they consist of two or more layers, making them adept at handling more complex patterns and relationships within the data.
Input Features: The perceptron takes multiple input features, each representing a characteristic of
the input data.
Weights: Each input feature is assigned a weight that determines its influence on the output. These
weights are adjusted during training to find the optimal values.
Summation Function: The perceptron calculates the weighted sum of its inputs, combining them
with their respective weights.
Activation Function: The weighted sum is passed through the Heaviside step function, comparing it
to a threshold to produce a binary output (0 or 1).
Output: The final output is determined by the activation function, often used for binary
classification tasks.
Bias: The bias term helps the perceptron make adjustments independent of the input, improving its
flexibility in learning.
Learning Algorithm: The perceptron adjusts its weights and bias using a learning algorithm, such as
the Perceptron Learning Rule, to minimize prediction errors.
These components enable the perceptron to learn from data and make predictions. While a single
perceptron can handle simple binary classification, complex tasks require multiple perceptrons
organized into layers, forming a neural network.
$z = w_1 x_1 + w_2 x_2 + \ldots + w_n x_n = X^T W$
The step function compares this weighted sum to a threshold. If the input is larger than the threshold value, the output is 1; otherwise, it is 0. The most common activation function used in Perceptrons is the Heaviside step function:
$h(z) = 1$ if $z \geq \text{threshold}$, otherwise $h(z) = 0$
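A small NumPy sketch of the weighted sum and the Heaviside step follows; the weights, bias, and threshold here are arbitrary placeholders rather than learned values:

import numpy as np

def heaviside(z, threshold=0.0):
    """Heaviside step: 1 if z is at or above the threshold, else 0."""
    return np.where(z >= threshold, 1, 0)

x = np.array([0.5, -1.0, 2.0])   # input features (illustrative values)
w = np.array([0.4, 0.3, 0.9])    # weights (would normally be learned)
b = -0.5                         # bias

z = np.dot(w, x) + b             # weighted sum z = w1*x1 + ... + wn*xn + b
print(heaviside(z))              # binary output: 0 or 1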
In a fully connected layer, also known as a dense layer, all neurons in one layer are connected to
every neuron in the previous layer.
$f_{W,b}(X) = h(XW + b)$
where $X$ is the input, $W$ is the weight for each input neuron, $b$ is the bias, and $h$ is the step function.
During training, the Perceptron's weights are adjusted to minimize the difference between the
predicted output and the actual output. This is achieved using supervised learning algorithms like
the delta rule or the Perceptron learning rule.
$w_{i,j} = w_{i,j} + \eta (y_j - \hat{y}_j) x_i$
Where:
$w_{i,j}$ is the weight between the $i$th input and $j$th output neuron,
$x_i$ is the $i$th input value,
$y_j$ is the actual value and $\hat{y}_j$ is the predicted value,
$\eta$ is the learning rate, controlling how much the weights are adjusted.
This process enables the perceptron to learn from data and improve its prediction accuracy over
time.
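As an illustration, the sketch below applies the Perceptron learning rule to the AND problem, a simple linearly separable toy example; the learning rate and epoch count are arbitrary choices:

import numpy as np

# Perceptron learning rule on the AND problem (linearly separable toy data)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])

w = np.zeros(2)   # weights
b = 0.0           # bias
eta = 0.1         # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        y_hat = 1 if np.dot(w, xi) + b >= 0 else 0   # step activation
        error = target - y_hat
        w += eta * error * xi                         # w_i = w_i + eta*(y - y_hat)*x_i
        b += eta * error

print(w, b)  # learned weights and bias that separate the two classes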
Each neuron in a layer is connected to every neuron in the next layer (fully connected).
Steps:
1. Forward Pass:
o Inputs are passed through the network.
o Each neuron applies a weighted sum followed by an activation function (like ReLU,
sigmoid, tanh).
2. Loss Calculation:
o The output is compared with the actual target using a loss function (e.g., Mean
Squared Error, Cross-Entropy).
3. Backward Pass (Backpropagation):
o Calculates the gradient of the loss function with respect to each weight using the
chain rule.
o Updates the weights using Gradient Descent or its variants (SGD, Adam, etc.).
4. Repeat:
o This process is repeated for several epochs until the loss is minimized (see the sketch after this list).
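A minimal sketch of this training loop, assuming PyTorch is available and using arbitrary toy data and layer sizes, could look like this:

# Forward pass / loss / backpropagation / update loop (assumes PyTorch is installed)
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(2, 8),   # fully connected layer: 2 inputs -> 8 hidden neurons
    nn.ReLU(),         # activation function
    nn.Linear(8, 1),   # output layer
)
loss_fn = nn.MSELoss()                                     # loss function
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # gradient descent variant

X = torch.randn(16, 2)        # toy inputs
y = torch.randn(16, 1)        # toy targets

for epoch in range(100):      # 4. repeat for several epochs
    y_pred = model(X)         # 1. forward pass
    loss = loss_fn(y_pred, y) # 2. loss calculation
    optimizer.zero_grad()
    loss.backward()           # 3. backward pass (gradients via the chain rule)
    optimizer.step()          #    weight update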
3. Activation Functions: