Class11-PatternClassification KNN
Pattern Classification
Classification
• Problem of identifying to which of a set of categories a
new observation belongs
• Predicts categorical labels
• Examples:
• Predicting whether a person is an adult or a child (2-class)
• Predicting whether a person will get a salary raise based on
years of experience and current salary (2-class)
• Identifying an email as spam or not (2-class)
• Predicting the presence or absence of a disease (2-class)
– Pima Indians Diabetes Database: predict whether a patient
has diabetes or not based on diagnostic measurements
• Categorizing a disease according to symptoms (multi-class)
• Categorizing the Iris flowers (multi-class)
Classification
• Classification is a two-step process
– Step1: Building a classifier (data modeling)
• Learning from data (training phase)
• Supervised learning: each example is a pair consisting of an
input example and a desired output value (class label)
• The training phase, or learning phase, is viewed as learning
a mapping or function that can predict the class label
associated with a given example
[Figure: A classifier maps a feature vector x = [x1 x2]^T
(x1 = height, x2 = weight) to class C1 (Adult) or C2 (Child);
a scatter plot of weight (x2) vs. height (x1) shows the two classes.]
Illustration of Training Set: Adult-Child
• Number of training examples (N) = 20
• Dimension of a training example = 2 (height, weight)
• The class label attribute is the 3rd dimension
• Class:
– Child (0)
– Adult (1)
[Figure: Scatter plot of the training set, weight in kg vs. height in cm.]
Step1: Building a Classification Model (Training Phase)
[Figure: Training phase. Feature extraction converts each raw sample
into a (height, weight) feature vector; the labelled feature vectors
below are used to train the classifier.]
Training Examples:
Height (cm)  Weight (kg)  Class
90           21.5         Child
100          32.45        Child
98           28.43        Child
183          90           Adult
163          67.45        Adult
Step2: Classification (Testing Phase)
[Figure: Testing phase. The classifier is trained on the same labelled
training examples as above; a test example then passes through feature
extraction and the classifier outputs its class label.]
Test example: height 150 cm, weight 50.6 kg → class label: Adult
Data Preparation for the Classification
Data Preparation for the Classification:
Approach 1
• Suppose that we are doing a 70-30 split
• Suppose the data set has 3000 samples
• Each sample belongs to one of 3 classes
• Suppose each class has 1000 samples
– Step1: From class1, 70% (700 samples) are taken as
training samples and the remaining 30% (300 samples) are
taken as test samples
– Step2: From class2, 70% (700 samples) are taken as
training samples and the remaining 30% (300 samples) are
taken as test samples
– Step3: From class3, 70% (700 samples) are taken as
training samples and the remaining 30% (300 samples) are
taken as test samples
– Step4: Combine the training examples from each class
• The training set now contains 700+700+700=2100 samples
– Step5: Combine the test examples from each class
• The test set now contains 300+300+300=900 samples
(a sketch of this split follows below)
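As a concrete illustration, the following is a minimal Python/NumPy sketch of this per-class 70-30 split; the function name, array layout, and use of NumPy are assumptions for illustration, not part of the original slides.

```python
import numpy as np

def split_per_class(X, y, train_frac=0.7, seed=0):
    # Approach 1 sketch (assumed names): take train_frac of every class
    # for training and the rest for testing, then combine across classes.
    rng = np.random.default_rng(seed)
    train_idx, test_idx = [], []
    for c in np.unique(y):
        idx = np.flatnonzero(y == c)          # samples of class c
        rng.shuffle(idx)
        n_train = int(train_frac * len(idx))  # e.g. 700 of 1000
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]
```

With 3 balanced classes of 1000 samples each, this yields the 2100 training and 900 test samples computed above.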
Data Preparation for the Classification
• Divide the data into a training set and a test set
• Approach 1: When the numbers of samples from the classes
are almost equal (balanced data)
– Example:
• The training data contain 70% of the samples from each class
• The test data contain the remaining 30% of the samples from each class
• Approach 2: When the numbers of samples from the classes
are not equal (imbalanced data)
– One class may have a large number of samples and another a
small number of samples
– A plain 70%-30% division may cause the learned model to be
biased toward the class with the larger number of training samples
– Solution:
• Consider 70% or 80% of the samples from the class with the least
number of samples as the training data from that class
• Consider the same number of samples from each other class as training
examples
• Each class will then have the same number of training examples
Data Preparation for the Classification:
Approach 2
• Suppose the data set has 3000 samples
• Each sample belongs to one of 3 classes
• Suppose class1 has 700 samples, class2 has 300 samples
and class3 has 2000 samples
– Step1: From class2 (the smallest class), 70% (210 samples)
are taken as training samples and the remaining 30%
(90 samples) are taken as test samples
– Step2: From class1, 210 samples are taken as training
samples and the remaining 490 samples are taken as test
samples
– Step3: From class3, 210 samples are taken as training
samples and the remaining 1790 samples are taken as test
samples
– Step4: Combine the training examples from each class
• The training set now contains 210+210+210=630 samples
– Step5: Combine the test examples from each class
• The test set now contains 490+90+1790=2370 samples
(a sketch of this split follows below)
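Approach 2 can be sketched the same way; only the per-class cap changes. As before, the names are illustrative assumptions.

```python
import numpy as np

def split_imbalanced(X, y, train_frac=0.7, seed=0):
    # Approach 2 sketch (assumed names): cap the per-class training count
    # at train_frac of the smallest class, so every class contributes the
    # same number of training examples.
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    smallest = min(np.sum(y == c) for c in classes)
    n_train = int(train_frac * smallest)  # e.g. 210 = 70% of 300
    train_idx, test_idx = [], []
    for c in classes:
        idx = np.flatnonzero(y == c)
        rng.shuffle(idx)
        train_idx.extend(idx[:n_train])
        test_idx.extend(idx[n_train:])
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]
```

With class sizes 700, 300 and 2000, this yields the 630 training and 2370 test samples computed above.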
Nearest-Neighbour Method
• Training data with N samples:
[Figure: The N training samples plotted in the (x1, x2) feature space.]
Nearest-Neighbour Method
• Training data:
[Figure: Adult-Child training data plotted as weight in kg vs. height in cm.]
Illustration of Nearest Neighbour Method: Adult(1)-Child(0) Classification
Test Example:
[Figure: The test example plotted among the training samples, weight in
kg vs. height in cm; its nearest training example is an Adult sample.]
• Step 3: Assign the class of the
training example with the
minimum distance to the test
example
– Class: Adult (1)
Nearest-Neighbour Method
• Training data:
[Figure: Training samples and the test example, weight in kg vs. height in cm.]
• Step 2: Sort the examples in the training set in ascending
order of their distance to the test example
Illustration of Nearest Neighbour Method: Adult(1)-Child(0) Classification
Test Example:
[Figure: The test example and its nearest training sample, weight in kg
vs. height in cm.]
• Step 3: Assign the class of the training example with the
minimum distance to the test example
– Class: Adult (1)? Relying on a single nearest neighbour can be
unreliable near the class boundary, which motivates considering
K nearest neighbours instead
K-Nearest Neighbours (K-NN) Method
• Consider the class labels of the K training examples
nearest to the test example
• Step 1: Compute the Euclidean distance between a test
example x and every training example x1, x2, …, xn,
…, xN (see the formula below)
[Figure: Training samples and the test example x in the (x1, x2) feature space.]
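For D-dimensional feature vectors, the Euclidean distance in Step 1 is the standard one below; the slides state this step in words only, so the notation here is added for clarity.

$$d(\mathbf{x}, \mathbf{x}_n) = \sqrt{\sum_{i=1}^{D} (x_i - x_{ni})^2}$$

where D is the number of attributes (D = 2 for the height-weight example).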
K-Nearest Neighbours (K-NN) Method
• Consider the class labels of the K training examples
nearest to the test example
• Step 1: Compute the Euclidean distance between a test
example x and every training example x1, x2, …, xn, …, xN
• Step 2: Sort the examples in the training set in
ascending order of their distance to x
• Step 3: Choose the first K examples in the sorted list
– K is the number of neighbours for the test example
• Step 4: The test example is assigned the most common
class among its K neighbours (a sketch of all four steps
follows below)
[Figure: Training samples and the test example x in the (x1, x2) feature space.]
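Putting the four steps together, a minimal NumPy sketch of the K-NN classifier might look as follows; the function and variable names are illustrative, not from the slides.

```python
import numpy as np

def knn_classify(X_train, y_train, x_test, K=5):
    # Step 1: Euclidean distance from x_test to every training example
    dists = np.sqrt(np.sum((X_train - x_test) ** 2, axis=1))
    # Steps 2-3: indices of the K nearest training examples
    nearest = np.argsort(dists)[:K]
    # Step 4: majority vote among the K neighbours' labels
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]
```

Applied to the training table from the earlier slides (a tiny 5-sample set, so K=3 is used here rather than the K=5 of the 20-sample illustration):

```python
# Adult(1)-Child(0) data: height (cm), weight (kg)
X_train = np.array([[90, 21.5], [100, 32.45], [98, 28.43],
                    [183, 90.0], [163, 67.45]])
y_train = np.array([0, 0, 0, 1, 1])  # 0 = Child, 1 = Adult
print(knn_classify(X_train, y_train, np.array([150, 50.6]), K=3))  # -> 1 (Adult)
```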
Illustration of K-Nearest Neighbours Method: Adult(1)-Child(0) Classification
Test Example:
[Figure: The test example and its K=5 nearest training samples, weight
in kg vs. height in cm.]
• Consider K=5
• Step 3: Choose the first K=5
examples in the sorted list
Illustration of K-Nearest Neighbours Method: Adult(1)-Child(0) Classification
Test Example:
[Figure: The test example and its K=5 nearest training samples, weight
in kg vs. height in cm.]
• Consider K=5
• Step 4: Test example is assigned
the most common class among its
K neighbours
– Class: Adult
Determining K, Number of Neighbours
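A common way to determine K is to try several candidate values and keep the one with the best accuracy on a held-out validation set; the hedged sketch below illustrates this, reusing the knn_classify function defined earlier (the candidate list and the validation split are assumptions, not from the slides).

```python
import numpy as np

def choose_K(X_train, y_train, X_val, y_val, candidates=(1, 3, 5, 7, 9)):
    # Pick the K with the highest accuracy on a validation set.
    # Odd candidates avoid ties in a 2-class majority vote.
    best_K, best_acc = candidates[0], -1.0
    for K in candidates:
        preds = [knn_classify(X_train, y_train, x, K) for x in X_val]
        acc = np.mean(np.array(preds) == y_val)
        if acc > best_acc:
            best_K, best_acc = K, acc
    return best_K
```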
Data Normalization
• Since a distance measure is used, the K-NN classifier
requires normalising the values of each attribute
• Normalising the training data:
– Compute the minimum and maximum values of each of
the attributes in the training data
– Store the minimum and maximum values of each of the
attributes
– Perform min-max normalization on the training data set
• Normalising the test data:
– Use the stored minimum and maximum values of each
of the attributes from the training set to normalise the test
examples
• NOTE: A test value may fall outside the stored [min, max]
range of the training data; ensure such out-of-bound values
do not cause errors (a sketch follows below)
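A minimal sketch of this fit-on-train, apply-to-test min-max normalization; clipping is one possible way to handle out-of-bound test values and is an assumption here, since the slide only says to guard against them.

```python
import numpy as np

def fit_min_max(X_train):
    # Compute and store per-attribute min and max from the training data only.
    return X_train.min(axis=0), X_train.max(axis=0)

def min_max_normalize(X, mins, maxs):
    # Scale each attribute to [0, 1] using the stored training min/max.
    # Test values outside the training range are clipped so that no
    # out-of-bound values reach the classifier (one possible policy).
    Xn = (X - mins) / (maxs - mins)
    return np.clip(Xn, 0.0, 1.0)
```

The same mins and maxs computed from the training set are reused for every test example.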
Lazy Learning: Learning from Neighbours
• The K-nearest-neighbour classifier is an example of a
lazy learner
• Lazy learning waits until the last minute, doing no
model construction until a test example must be classified
• When the training examples are given, a lazy learner
simply stores them and waits until it is given a test
example
• When it sees the test example, it classifies it based
on its similarity to the stored training examples
• Since lazy learners store the training examples, or
instances, they are also called instance-based learners
• Disadvantages:
– Making a classification or prediction is computationally
intensive
– Efficient storage and indexing techniques are required
when the number of training samples is huge