
Data Mining

Chapter 4_3: Classification Methods


(Examples)

2020 Dr Hisham Haider


Course Outline
2

 Introduction
 Data Preparation and Preprocessing
 Data Representation
 Classification Methods
 Evaluation
 Clustering Methods
 Mid Exam
 Association Rules
 Knowledge Representation
 Special Case study: Document clustering
 Discussion of Case studies by students
Outline
3

 Machine learning techniques


 k-Nearest Neighbors
 Naïve Bayesian Classifiers
k-Nearest Neighbors
4

 Also called instance-based learning.

 k-Nearest Neighbors is a supervised learning algorithm in which a new query instance is classified by a majority vote among the categories of its k nearest neighbors.

 The purpose of the algorithm is to classify a new object based on its attributes and the training samples.
k-Nearest Neighbors
5

 The classifier does not fit an explicit model; it relies entirely on memory (the stored training data) and distance calculations.

 Given a query point:

 Find the k training points closest to the query point.

 Classify by majority vote among the classes of those k objects (see the sketch below).
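
The slides contain no code; the following is a minimal Python sketch of this procedure, assuming numeric feature vectors, Euclidean distance, and illustrative names (knn_classify, X_train, y_train) chosen here rather than taken from the course material.

# Minimal k-NN sketch (illustrative, not from the slides):
# Euclidean distance to every training point, then a majority vote.
import numpy as np
from collections import Counter

def knn_classify(query, X_train, y_train, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Distance from the query point to every stored training sample
    distances = np.linalg.norm(X_train - query, axis=1)
    # Indices of the k closest training points
    nearest = np.argsort(distances)[:k]
    # Majority vote over the class labels of those k neighbors
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Example: two classes, four training samples
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [6.0, 6.2], [5.8, 6.1]])
y_train = ["A", "A", "B", "B"]
print(knn_classify(np.array([1.1, 0.9]), X_train, y_train, k=3))  # -> "A"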
k-Nearest Neighbors - Algorithm
6

 Algorithm (nearest neighbor, k = 1)
 Given a new instance x,
 find its nearest training example <x’, y’>,
 return y’ as the class of x (a worked call is shown below).
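
With k = 1 this rule is a special case of the sketch above, reusing the same hypothetical knn_classify function and training data:

# The single nearest neighbor decides the class on its own
print(knn_classify(np.array([5.9, 6.0]), X_train, y_train, k=1))  # -> "B"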


Advantages and Disadvantages
7

 Advantages
 Robust to noisy training data
 Effective when the training data is large
 Disadvantages
 The value of the parameter k must be chosen
 It is not obvious which distance measure and which attributes to use to produce the best results
 Computation cost is high, because the distance from each query instance to all training samples must be computed
Next …
8

 Naïve Bayesian Classifiers


 Artificial Neural Networks
Thanks
9
