ML TRW
To make this easier to conceptualize, we can think about comparing two photographs and wanting to figure out whether the object in the photos is the same one. Instead of checking every pixel, similarity learning algorithms find key characteristics and features of the object in each photo (for example, its shape) and compare them.
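As a minimal sketch of this idea, assuming the features of each photo have already been extracted into numeric vectors, the snippet below compares two such vectors with cosine similarity. The vectors and the choice of cosine similarity as the metric are illustrative assumptions, not a prescribed method.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two feature vectors (1.0 = identical direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical feature vectors extracted from two photos
# (e.g., shape or texture descriptors from some feature extractor).
photo_a = np.array([0.9, 0.1, 0.4, 0.7])
photo_b = np.array([0.8, 0.2, 0.5, 0.6])

score = cosine_similarity(photo_a, photo_b)
print(f"similarity score: {score:.3f}")  # a score close to 1.0 suggests the same object
```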
Similarity learning algorithms can be used for various tasks where the main goal is to
find similarities and relationships between items:
● Recommendation systems: to keep users spending time on the social media platform, similarity learning is used to find content similar to the items a user has already liked and to recommend it.
● Classification: to classify an item into a given class, we check whether the item is similar to the items already in that class.
● Face verification: similarity learning is also used to compare the facial features in an image against a database of faces, verifying and recognizing identities with high accuracy.
● Anomaly detection: by defining what "normal" data looks like, any data that deviates from it can be detected and reported to prevent possible issues (a rough sketch follows this list).
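For example, the anomaly-detection idea above can be acted on by measuring how far a new point lies from the centroid of the "normal" data and flagging it when that distance exceeds a threshold. The data points, the threshold, and the use of Euclidean distance here are illustrative assumptions in this rough sketch.

```python
import numpy as np

# Hypothetical "normal" observations (one data point per row).
normal_data = np.array([[1.0, 2.0], [1.2, 1.9], [0.9, 2.1], [1.1, 2.0]])
centroid = normal_data.mean(axis=0)

def is_anomaly(point, threshold: float = 1.0) -> bool:
    """Flag a point whose distance from the 'normal' centroid exceeds the threshold."""
    return float(np.linalg.norm(np.asarray(point) - centroid)) > threshold

print(is_anomaly([1.0, 2.05]))  # False: close to the normal data
print(is_anomaly([5.0, 8.00]))  # True: far from the normal data
```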
Instance-based learners are called "lazy learners" because they do not perform any significant computation or model building during the training phase. Instead, they simply store the training data and only compute predictions when a new data point needs to be classified, essentially "lazily" postponing the heavy lifting until prediction time (a minimal sketch of this pattern follows the list of key points below).
Key points about lazy learners:
● No upfront generalization:
Unlike other learning algorithms that build a general model during training,
instance-based learners directly compare new data points to the stored
training instances without creating a generalized representation.
● Query-based Learning:
When making a prediction, lazy learners simply look at the stored data and
make a decision based on the closest or most relevant instances. For
example, in k-nearest neighbors (k-NN), predictions are based on the k
closest data points to the query point.
● Delayed Computation:
Lazy learners postpone most of the computation until they receive a query.
The prediction process is computationally expensive because it requires
comparing the query with all stored instances, which can be inefficient for
large datasets.
● Example algorithm: K-Nearest Neighbors (KNN):
One of the most well-known lazy learners, KNN stores all the training data
and classifies a new data point based on the class labels of its closest
neighbors.
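As a rough sketch of this store-then-query pattern, assuming a made-up class name and toy data, training a lazy learner amounts to nothing more than keeping the examples around:

```python
import numpy as np

class LazyLearner:
    """Instance-based ("lazy") learner: the training step only memorizes the data."""

    def fit(self, X, y):
        # No model is built here; the raw training set is stored as-is,
        # and all real work is deferred until a query arrives.
        self.X_train = np.asarray(X, dtype=float)
        self.y_train = np.asarray(y)
        return self

# Hypothetical data: "training" completes instantly because nothing is computed yet.
learner = LazyLearner().fit([[1.0, 2.0], [2.0, 3.0], [8.0, 9.0]], ["A", "A", "B"])
print(learner.X_train.shape)  # (3, 2) -- the data is simply held in memory
```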
1. Storage of Training Data: k-NN stores all the training examples and does not
abstract or build a generalized model from them. Instead, the raw data itself is used
to make predictions, which means the algorithm must keep all the training data in
memory.
2. Prediction by Instance Comparison: When a new query (or test instance) comes
in, the k-NN algorithm compares it to every instance in the stored training set to find
the closest k instances. This means that the algorithm's performance depends on
how quickly it can access and compute distances between the query and all stored
instances.
3. No Generalization: Since k-NN doesn’t build a generalized model, it "memorizes"
the training data and retrieves the relevant pieces of it when needed for predictions.
This makes the prediction process computationally expensive, as it must involve all
the data stored in memory during the query phase.
Therefore, the method is termed "memory-based" because it relies heavily on memory to store the entire dataset and uses it directly during the prediction stage, unlike model-based learners, which abstract the data into a model that does not need to reference the training set in full.
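To illustrate this trade-off, the sketch below contrasts the essentially free "training" step (just storing the data) with the per-query cost of computing a distance to every stored instance. The dataset size, dimensionality, and number of neighbors are arbitrary assumptions.

```python
import time
import numpy as np

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50_000, 16))        # hypothetical stored training set
y_train = rng.integers(0, 2, size=50_000)

t0 = time.perf_counter()
stored = (X_train, y_train)                    # "training" = storing the data
t1 = time.perf_counter()

query = rng.normal(size=16)
dists = np.linalg.norm(X_train - query, axis=1)   # compare the query to every stored instance
k_nearest_labels = y_train[np.argsort(dists)[:5]]
t2 = time.perf_counter()

print(f"training time: {t1 - t0:.6f} s (storage only)")
print(f"query time:    {t2 - t1:.6f} s (distances to all {len(X_train)} instances)")
```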
Now, the factory produces a new paper tissue that has acid durability = 3 and strength = 7.
Classify this new tissue as GOOD or BAD.
Step 1
First, determine the parameter K, the number of nearest neighbors.
Therefore, from the given data, K = 3.
Step 2
Calculate the distance between the query instance and all the training samples. Here the query instance is (3, 7), and the distance to each sample is calculated using the Euclidean distance formula.
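In two dimensions, the distance between the query q = (3, 7) and a training sample x = (x1, x2) is d(x, q) = sqrt((x1 - 3)^2 + (x2 - 7)^2). As a small sketch of how one row of the distance table would be produced (the sample point used here is a hypothetical placeholder, since the training table itself is given separately):

```python
import math

def euclidean_distance(x, q):
    """Euclidean distance between a training sample x and the query instance q."""
    return math.sqrt(sum((xi - qi) ** 2 for xi, qi in zip(x, q)))

query = (3, 7)                               # acid durability = 3, strength = 7
print(euclidean_distance((7, 7), query))     # hypothetical sample -> 4.0
```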
The table below shows the Euclidean distance of every paper from the query instance (3, 7):
Step 3
Sort the distances and determine the nearest neighbors based on the K-th minimum distance.
The table below shows the sorted distances and the resulting rank, from which the nearest neighbors are decided for each paper:
Step 4
Collect the Quality of the nearest neighbors. Hence, in the table below, the Quality of Paper_2 is not included because its rank is greater than 3.
The table shows the Quality of each paper among the nearest neighbors:
Step 5
Use the simple majority of the categories of the nearest neighbors as the prediction for the query instance.
Here, the nearest neighbors contribute 2 Good votes and 1 Bad vote for quality.
Hence, since 2 Good > 1 Bad, the conclusion is that the new sample, Paper_5, which passes the laboratory test with Acid durability = 3 and Strength = 7, belongs to the Good quality category.
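Putting the five steps together, a minimal end-to-end sketch of this classification could look like the following. The training values are illustrative placeholders standing in for the example's table (which is not reproduced in this section); only the query (3, 7) and K = 3 come directly from the example.

```python
import math
from collections import Counter

# Step 1: choose K, the number of nearest neighbors.
K = 3

# Hypothetical training samples: (acid durability, strength) -> quality.
training = {
    "Paper_1": ((7, 7), "Bad"),
    "Paper_2": ((7, 4), "Bad"),
    "Paper_3": ((3, 4), "Good"),
    "Paper_4": ((1, 4), "Good"),
}

query = (3, 7)  # the new paper tissue: acid durability = 3, strength = 7

# Step 2: Euclidean distance from the query to every training sample.
distances = {name: math.dist(point, query) for name, (point, _) in training.items()}

# Step 3: sort by distance and keep the K nearest neighbors.
nearest = sorted(distances, key=distances.get)[:K]

# Step 4: collect the quality labels of those neighbors.
votes = [training[name][1] for name in nearest]

# Step 5: the simple majority of the votes is the predicted quality.
prediction = Counter(votes).most_common(1)[0][0]
print(nearest, votes, prediction)  # e.g. ['Paper_3', 'Paper_4', 'Paper_1'] -> Good
```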