Seminar Report File On KNN Models: University Institute of Engineering and Technology, Kurukshetra University
o The K-NN algorithm stores all the available data and classifies a new data point based on similarity. This means that when new data appears, it can easily be classified into a well-suited category using the K-NN algorithm.
o At the training phase, the K-NN algorithm simply stores the dataset; when it receives new data, it classifies that data into the category most similar to it.
o Example:
Imagine we have a dataset containing images of cats and dogs,
each labeled as either "cat" or "dog." Now, we get a new, unlabeled
image of an animal that has mixed characteristics resembling both.
To identify whether this new image is a cat or a dog, we use the
K-Nearest Neighbors (KNN) algorithm.
● In K-NN, we first choose a value for 'k', representing the number of nearest neighbors to consider. Then, we calculate the "distance" between the new data point x1 and each data point in both Category A and Category B. The distance metric, typically Euclidean distance, allows us to measure the closeness of x1 to each neighboring data point. Once we have found the 'k' nearest neighbors to x1, we observe the categories these neighbors belong to.
● If most of the closest neighbors are from Category A, K-NN will classify x1 as belonging to Category A. Conversely, if most neighbors are from Category B, x1 will be classified into Category B. This approach is simple yet powerful because it bases predictions on real, measurable patterns in the data. Plotting the dataset illustrates how x1's position relative to the Category A and Category B points decides its category, making K-NN well suited to problems where spatial closeness correlates with categorical similarity.
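As a concrete illustration of the distance step, the following minimal sketch computes the Euclidean distance between a new point x1 and a few labelled points using NumPy; the points and values are made up for illustration and are not taken from the report's dataset:

import numpy as nm

# Hypothetical 2-D feature vectors (illustrative values only)
x1 = nm.array([2.0, 3.0])                  # the new, unlabelled point
neighbors = nm.array([[1.0, 2.0],          # a point from Category A
                      [2.5, 3.5],          # a point from Category A
                      [8.0, 9.0]])         # a point from Category B

# Euclidean distance: square root of the sum of squared feature differences
distances = nm.sqrt(((neighbors - x1) ** 2).sum(axis=1))
print(distances)   # smaller values mean the point lies closer to x1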
How does K-NN work?
The working of the K-Nearest Neighbors (K-NN) algorithm can be explained with a
step-by-step approach as follows:
o Step 1: Choose the number K of neighbors to consider.
o Step 2: Calculate the Euclidean distance between the new data point and each of the data points in the dataset. The Euclidean distance formula is used to measure how close or far apart each point is from the new data point. This distance metric allows us to identify the points that are closest to the new data point in terms of feature values.
o Step 3: Identify the K data points with the smallest Euclidean distances to
the new data point. These are considered the "K nearest neighbors" of the
new point. By focusing on these closest neighbors, K-NN assumes that
points close to each other are likely to share the same category.
o Step 4: Among these K nearest neighbors, count the number of data points
belonging to each category (e.g., Category A and Category B). This is the
process of "voting," where each neighbor essentially "votes" for its
category, influencing the classification of the new data point.
o Step 5: Assign the new data point to the category with the majority vote
among its K neighbors. For example, if more neighbors belong to
Category A than Category B, the new point will be classified under
Category A. This decision rule assumes that the category with the most
neighbors around a point is the most appropriate classification.
o Step 6: The model is now ready for use. With the classification assigned
to the new data point, our K-NN model can now classify additional new
data points using the same process. The simplicity and flexibility of K-NN
make it suitable for both classification and regression tasks, particularly
where patterns in data can be recognized through proximity or similarity.
● Firstly, we choose the number of neighbors; suppose we choose k = 5.
● If, say, 3 of the 5 nearest neighbors belong to Category A and 2 belong to Category B, then the new data point is assigned to Category A.
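The step-by-step procedure above can be condensed into a short from-scratch sketch. This is only an illustration of the technique, not the report's implementation; the dataset, the labels, and the helper function knn_classify are hypothetical:

import numpy as nm
from collections import Counter

def knn_classify(new_point, points, labels, k=5):
    # Step 2: Euclidean distance from the new point to every stored point
    distances = nm.sqrt(((points - new_point) ** 2).sum(axis=1))
    # Step 3: indices of the K smallest distances (the K nearest neighbors)
    nearest = nm.argsort(distances)[:k]
    # Steps 4-5: each neighbor "votes" for its label and the majority wins
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D dataset: Category A near the origin, Category B further away
points = nm.array([[1, 1], [1, 2], [2, 1], [6, 6], [7, 7], [6, 7]])
labels = ['A', 'A', 'A', 'B', 'B', 'B']
print(knn_classify(nm.array([2, 2]), points, labels, k=5))   # expected output: 'A'

With k = 5, three of the five nearest neighbors of the point (2, 2) belong to Category A, so the majority vote assigns it to Category A, mirroring the example above.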
How to select the value of K in the K-NN
Algorithm?
When selecting the value of 'K' in the K-Nearest Neighbors (K-NN) algorithm, some key points to keep in mind are:
o There is no fixed rule for choosing 'K'; a common starting point is K = 5. A very small 'K' makes the model sensitive to noise and outliers, while a very large 'K' can over-generalize and miss finer patterns in the data, so several candidate values are usually evaluated (see the sketch below).
o As the dataset size increases, K-NN's accuracy can improve because it has a wider pool of reference points, making it more effective at recognizing complex patterns.
o K-NN can be used not only for classification tasks but also for regression, by averaging the values of the nearest neighbors, making it a flexible tool for different types of problems.
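One common way to choose 'K' in practice is to try several candidate values and keep the one that scores best under cross-validation. The sketch below is only an illustration and is not part of the report's experiments; X and y are assumed to be an already loaded and scaled feature matrix and label vector:

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# X, y are assumed to exist: scaled features and their class labels
best_k, best_score = None, 0.0
for k in range(1, 21):                                   # candidate values of K
    knn = KNeighborsClassifier(n_neighbors=k)
    score = cross_val_score(knn, X, y, cv=5).mean()      # mean 5-fold accuracy
    if score > best_score:
        best_k, best_score = k, score
print(best_k, best_score)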
Disadvantages of KNN Algorithm
Six disadvantages of the K-Nearest Neighbors (K-NN) algorithm:
1. Choosing the Value of K
Determining the optimal value for 'K' can be complex and time-consuming. If
the value of K is too small, the model may become sensitive to noise, while a
very large 'K' can overly generalize and miss finer patterns in the data, making
it tricky to achieve optimal performance.
2. High Computation Cost
K-NN requires calculating the distance between the new data point and all the
training samples in the dataset. This can be computationally expensive,
especially when dealing with large datasets, as the model performs this
distance calculation for every prediction, leading to slow performance.
3. Memory Intensive
As a lazy learner, K-NN needs to store the entire training dataset in memory
to classify new data points. This can be quite memory-intensive, especially
with large datasets, making it less scalable for very large training sets.
4. Sensitive to Irrelevant Features
K-NN can perform poorly when there are irrelevant or redundant features in
the dataset. These features can distort the distance calculations, leading to
inaccurate predictions since the model treats all features equally, even if they
don’t contribute to the classification task.
5. Difficulty Handling High-Dimensional Data (Curse of Dimensionality)
As the number of features increases, the distances between data points become
less distinct, making it harder for K-NN to differentiate between them. This
issue, known as the "curse of dimensionality," can reduce the algorithm’s
effectiveness in high-dimensional spaces, such as in text classification or
image recognition.
6. Poor Performance with Imbalanced Data
K-NN struggles with imbalanced datasets, where one class significantly
outnumbers the other. In such cases, the algorithm may be biased toward the
majority class, leading to inaccurate predictions for the minority class, as the
majority class will dominate the "voting" process of the K nearest neighbors.
Python implementation of the KNN
algorithm
For the Python implementation of the K-NN algorithm, we will use the same problem and dataset (user_data.csv) that we used for Logistic Regression, but here we will improve the performance of the model.
The Data Pre-processing step remains exactly the same as for Logistic Regression. Below is the code for it:
# importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# importing the dataset
data_set= pd.read_csv('user_data.csv')

# extracting independent and dependent variables
x= data_set.iloc[:, [2,3]].values
y= data_set.iloc[:, 4].values

# splitting the dataset into training and test set
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test= train_test_split(x, y, test_size= 0.25, random_state=0)

# feature scaling
from sklearn.preprocessing import StandardScaler
st_x= StandardScaler()
x_train= st_x.fit_transform(x_train)
x_test= st_x.transform(x_test)
By executing the above code, our dataset is imported into the program and pre-processed, and after feature scaling the training and test feature values are successfully standardized.
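To check this, one can, for example, print the first few rows of the scaled test set (a small inspection step that is not part of the original code):

print(x_test[:5])   # each row holds the two standardized feature values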
Once we have chosen the optimal value for 'K' and preprocessed the data, the next
step is to fit the classifier to the training data. This process involves training the
K-NN model by feeding it the feature values and corresponding labels from the
training dataset. Because K-NN is a lazy learner, fitting essentially stores these examples so that the classifier can later make predictions on new, unseen data. The code provided below
demonstrates how to implement this step in Python, using a K-NN algorithm from a
machine learning library, such as scikit-learn, to fit the model. This step is
crucial for enabling the model to classify new data based on the training it has
received.
# Fitting the K-NN classifier to the training set
from sklearn.neighbors import KNeighborsClassifier
classifier= KNeighborsClassifier(n_neighbors=5, metric='minkowski', p=2)   # p=2 with the Minkowski metric is the Euclidean distance
classifier.fit(x_train, y_train)
Output: By executing the above code, we will get the output as:
Out[10]:
KNeighborsClassifier(algorithm='auto', leaf_size=30, metric='minkowski',
metric_params=None, n_jobs=None, n_neighbors=5, p=2,
weights='uniform')
o Predicting the Test Result: To predict the test set result, we will create a y_pred vector as we did in Logistic Regression. Below is the code for it:

# Predicting the test set result
y_pred= classifier.predict(x_test)

o Creating the Confusion Matrix: To evaluate these predictions, we create a confusion matrix from the true test labels and the predicted labels, as we did in Logistic Regression:

# Creating the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm= confusion_matrix(y_test, y_pred)

Output: By executing the above code, we get the confusion matrix below:
From this confusion matrix, we can see there are 64 + 29 = 93 correct predictions and 3 + 4 = 7 incorrect predictions, whereas in Logistic Regression there were 11 incorrect predictions. So we can say that the performance of the model is improved by using the K-NN algorithm.
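The same result can also be summarised as an accuracy score; the lines below are a small addition for illustration and are not part of the original code:

from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, y_pred))   # 93 correct out of 100 test samples, i.e. 0.93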
o Visualizing the Test set result: Plotting the classifier's decision regions for the test set shows that the predictions are quite good: most of the red points fall in the red region and most of the green points fall in the green region. However, there are a few green points in the red region and a few red points in the green region. These are the incorrect observations that we observed in the confusion matrix (7 incorrect predictions).
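For reference, a minimal sketch of how such a decision-region plot can be produced with matplotlib is given below; it assumes the classifier and the scaled x_test, y_test from the steps above, and the colour choices are illustrative:

from matplotlib.colors import ListedColormap

x_set, y_set = x_test, y_test
# Build a fine grid covering the feature space of the test set
x1, x2 = nm.meshgrid(nm.arange(x_set[:, 0].min() - 1, x_set[:, 0].max() + 1, 0.01),
                     nm.arange(x_set[:, 1].min() - 1, x_set[:, 1].max() + 1, 0.01))
# Colour each grid cell by the class the classifier predicts for it
mtp.contourf(x1, x2,
             classifier.predict(nm.array([x1.ravel(), x2.ravel()]).T).reshape(x1.shape),
             alpha=0.75, cmap=ListedColormap(('red', 'green')))
# Overlay the actual test points, coloured by their true class
for i, j in enumerate(nm.unique(y_set)):
    mtp.scatter(x_set[y_set == j, 0], x_set[y_set == j, 1],
                c=[ListedColormap(('red', 'green'))(i)], label=j)
mtp.title('K-NN algorithm (Test set)')
mtp.legend()
mtp.show()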