
Classification

Part 1

CME4416 – Introduction to Data Mining

Asst. Prof. Dr. Göksu Tüysüzoğlu


Outline

◘ What Is Classification?
◘ Classification Examples
◘ Classification Methods
– Decision Trees
– Bayesian Classification
– K-Nearest Neighbor
– Neural Network
– Support Vector Machines (SVM)
– Fuzzy Set Approaches
What Is Classification?

◘ Classification
– Construction of a model to classify data
– When constructing the model, use the training set and the class labels
(e.g., yes/no) in the target column

Training Set → Model

Terminology

◘ Classifier: An algorithm that maps the input data to a specific category.


◘ Classification model: A classification model tries to draw conclusions
from the input values given for training and uses them to predict the class
labels/categories of new data.
◘ Binary Classification: Classification task with two possible outcomes.
Eg: Gender classification (Male / Female)
◘ Multi-class classification: Classification with more than two classes. In
multi-class classification, each sample is assigned to one and only one
target label.
Eg: An animal can be a cat or dog but not both at the same time.
◘ Multi-label classification: Classification task where each sample is
mapped to a set of target labels (more than one class).
Eg: A news article can be about sports, a person, and a location at the
same time.
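A tiny sketch of how the target labels differ across these three task types; the label values are hypothetical, not from the slides:

```python
# Hypothetical target labels illustrating the three task types above.
binary      = ["male", "female", "male"]                # binary: two possible outcomes
multi_class = ["cat", "dog", "cat"]                     # multi-class: exactly one of several classes per sample
multi_label = [["sports", "person"], ["location"], []]  # multi-label: a set of labels per sample
```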
Classification Steps

1. Model construction
– Each tuple is assumed to belong to a predefined class
– The set of tuples used for model construction is training set
– The model is represented as classification rules, trees, or mathematical formulas

2. Test Model
– Using test set, estimate accuracy rate of the model
• Accuracy rate is the percentage of test set samples that are correctly classified
by the model

3. Model Usage (Classifying future or unknown objects)


– If the accuracy is acceptable, use the model to classify data tuples whose class
labels are not known
Classification Steps

Training Set (step 1, Construct Model: Training Set → Learn → Classifier Model)
Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

Test Set (step 2, Test Model)
Refund  Marital Status  Taxable Income  Cheat
No      Single          75K             No
Yes     Married         50K             Yes
No      Married         150K            Yes
Yes     Divorced        90K             No

New Data (step 3, Use Model)
Refund  Marital Status  Taxable Income  Cheat
Yes     Divorced        50K             ?
No      Married         50K             ?
Yes     Single          150K            ?
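A minimal sketch of the three steps on a toy version of the "Cheat" data above; pandas and scikit-learn are assumed to be available, and the dummy encoding and choice of a decision tree are illustrative, not prescribed by the slides.

```python
# A sketch of the three classification steps (construct, test, use).
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

train = pd.DataFrame({
    "Refund":  ["Yes", "No", "No", "Yes", "No", "No", "Yes", "No", "No", "No"],
    "Marital": ["Single", "Married", "Single", "Married", "Divorced",
                "Married", "Divorced", "Single", "Married", "Single"],
    "Income":  [125, 100, 70, 120, 95, 60, 220, 85, 75, 90],
    "Cheat":   ["No", "No", "No", "No", "Yes", "No", "No", "Yes", "No", "Yes"],
})
test = pd.DataFrame({
    "Refund":  ["No", "Yes", "No", "Yes"],
    "Marital": ["Single", "Married", "Married", "Divorced"],
    "Income":  [75, 50, 150, 90],
    "Cheat":   ["No", "Yes", "Yes", "No"],
})

# 1. Model construction: learn from the training set and its class labels.
X_train = pd.get_dummies(train.drop(columns="Cheat"), dtype=int)
model = DecisionTreeClassifier(random_state=0).fit(X_train, train["Cheat"])

# 2. Test the model: estimate its accuracy rate on the held-out test set.
X_test = pd.get_dummies(test.drop(columns="Cheat"), dtype=int).reindex(
    columns=X_train.columns, fill_value=0)
print("accuracy:", accuracy_score(test["Cheat"], model.predict(X_test)))

# 3. Model usage: classify a new tuple whose class label is unknown.
new = pd.DataFrame({"Refund": ["Yes"], "Marital": ["Divorced"], "Income": [50]})
X_new = pd.get_dummies(new, dtype=int).reindex(columns=X_train.columns, fill_value=0)
print("predicted class:", model.predict(X_new)[0])
```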
Applications of Classification Algorithms

◘ Email spam classification


◘ Predicting whether bank customers are willing to repay a loan
◘ Identifying cancerous tumor cells
◘ Sentiment analysis
◘ Drug classification
◘ Facial key point detection
◘ Pedestrian detection in autonomous driving

(Figure: data from previous customers, with attributes Age, Salary, Profession, Location, and Customer type (Good/Bad), is fed to a classifier, which learns rules such as "Salary > 5 L" and "Prof. = Exec"; the rules are then applied to a new applicant's data.)


Classification Techniques

1. Decision Trees
2. Bayesian Classification
3. K-Nearest Neighbor
4. Neural Network
5. Support Vector Machines (SVM)
6. Fuzzy Set Approaches

(Bayesian classification picks the class c that maximizes (p(c_j) / p(d)) · Π_{i=1..n} p(a_i | c_j).)

Classification Techniques

Decision Trees

Bayesian Classification

K-Nearest Neighbor

Classification Neural Network

Support Vector Machines (SVM)

Fuzzy Set Approaches


Decision Trees

◘ Decision Tree is a tree where
– internal nodes are simple decision rules on one or more attributes
– leaf nodes are predicted class labels
◘ Decision trees are used for deciding between several courses of action

Training data (buys_computer):
age     income  student  credit_rating  buys_computer
<=30    high    no       fair           no
<=30    high    no       excellent      no
31…40   high    no       fair           yes
>40     medium  no       fair           yes
>40     low     yes      fair           yes
>40     low     yes      excellent      no
31…40   low     yes      excellent      yes
<=30    medium  no       fair           no
<=30    low     yes      fair           yes
>40     medium  yes      fair           yes
<=30    medium  yes      excellent      yes
31…40   medium  no       excellent      yes
31…40   high    yes      fair           yes
>40     medium  no       excellent      no

Resulting decision tree (each internal node tests an attribute, branches carry attribute values, leaves give the classification):
age?
  <=30   → student?        (no → no, yes → yes)
  31…40  → yes
  >40    → credit rating?  (excellent → no, fair → yes)
Decision Regions
Rules Indicated by Decision Trees

◘ Write a rule for each path in the decision tree from the root to a leaf.
Entropy
Information Gain

◘ Information gain is a measure of the reduction in the overall


entropy of a set of instances that is achieved by testing on a
descriptive feature
◘ Computing information gain is a three-step process:
1. Compute the entropy of the original dataset with respect to the
target feature. This gives us a measure of how much information is
required in order to organize the dataset into pure sets.
2. For each descriptive feature, create the sets that result by
partitioning the instances in the dataset using their feature values, and
then sum the entropy scores of each of these sets. This gives a
measure of the information that remains required to organize the
instances into pure sets after we have split them using the descriptive
feature.
3. Subtract the remaining entropy value (computed in step 2) from the
original entropy value (computed in step 1) to give the information
gain.
Information Gain

◘ Which attribute is the best classifier?


– Calculate the information gain G(S,A) for each attribute A.
– Select the attribute with the highest information gain.

Entropy(S) = − Σ_{i=1..m} p_i · log2(p_i)        (for two classes: Entropy(S) = −p1 log2(p1) − p2 log2(p2))

Gain(S, A) = Entropy(S) − Σ_{i ∈ Values(A)} (|S_i| / |S|) · Entropy(S_i)
Information Gain

Classification tree training [1]


Decision Tree Algorithms

◘ ID3
– Quinlan (1981)
– Tries to reduce the expected number of comparisons
◘ C4.5
– Quinlan (1993)
– An extension of ID3
– Just starting to be used in data mining applications
– Also used for rule induction
◘ CART
– Breiman, Friedman, Olshen, and Stone (1984)
– Classification and Regression Trees
◘ CHAID
– Kass (1980)
– Oldest decision tree algorithm
– Well established in database marketing industry
◘ QUEST
– Loh and Shih (1997)
ID3 Algorithm

Input: A decision table with discrete-valued attributes.


Output: A decision tree.

1. For each attribute A, compute its information gain.


2. Select the attribute A* with the maximum information gain.
3. Partition the data set into k subsets according to the values of A*,
where k is the number of distinct values of A*.
4. For each subset, if the class labels of the instances are all the same,
create a leaf node with that label; otherwise, repeat the above process on the subset.
5. Output a decision tree.
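A compact Python sketch of the ID3 loop above; the row format (a list of dicts with a "label" key) and the function names are illustrative, not from the slides.

```python
# A minimal ID3 sketch for discrete-valued attributes.
import math
from collections import Counter

def entropy(rows, target="label"):
    counts = Counter(r[target] for r in rows)
    total = len(rows)
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def info_gain(rows, attr, target="label"):
    total = len(rows)
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r for r in rows if r[attr] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder

def id3(rows, attrs, target="label"):
    labels = {r[target] for r in rows}
    if len(labels) == 1:                      # pure subset -> leaf node with that label
        return labels.pop()
    if not attrs:                             # no attributes left -> majority label
        return Counter(r[target] for r in rows).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a, target))   # step 2
    tree = {best: {}}
    for value in {r[best] for r in rows}:     # step 3: one branch per value of the best attribute
        subset = [r for r in rows if r[best] == value]
        tree[best][value] = id3(subset, [a for a in attrs if a != best], target)
    return tree
```

Calling id3 on the PlayTennis table with attributes Outlook, Temperature, Humidity, and Wind would reproduce the tree built step by step on the following slides.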
Decision Tree Construction

Which attribute first?


Decision Tree Construction

Entropy(S) = −(9/14) log2(9/14) − (5/14) log2(5/14) = 0.940


Decision Tree Construction
Values(Wind) = {weak, strong}
S_weak   = [6+, 2-]
S_strong = [3+, 3-]

Day  Wind    Tennis?
D1   weak    no
D2   strong  no
D3   weak    yes
D4   weak    yes
D5   weak    yes
D6   strong  no
D7   strong  yes
D8   weak    no
D9   weak    yes
D10  weak    yes
D11  strong  yes
D12  strong  yes
D13  weak    yes
D14  strong  no
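A quick numeric check of these counts (plain Python, only the math module): it reproduces Entropy(S) = 0.940 and, looking ahead, Gain(S, Wind) = 0.048.

```python
# S has 9 "yes" / 5 "no"; S_weak = [6+, 2-], S_strong = [3+, 3-].
import math

def entropy(pos, neg):
    total = pos + neg
    return -sum(p / total * math.log2(p / total) for p in (pos, neg) if p)

e_s      = entropy(9, 5)            # ≈ 0.940
e_weak   = entropy(6, 2)            # ≈ 0.811
e_strong = entropy(3, 3)            # = 1.000
gain_wind = e_s - (8 / 14) * e_weak - (6 / 14) * e_strong
print(round(e_s, 3), round(gain_wind, 3))   # 0.94 0.048
```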
Decision Tree Construction
Entropy(S) = −(9/14) log2(9/14) − (5/14) log2(5/14) = 0.940

Outlook:
  Sunny    [2+, 3-]   E = 0.971
  Overcast [4+, 0-]   E = 0.0
  Rain     [3+, 2-]   E = 0.971

Humidity:
  High     [3+, 4-]   E = 0.985
  Normal   [6+, 1-]   E = 0.592
Decision Tree Construction

Gain(S, Outlook) = 0.246
Gain(S, Temperature) = 0.029
Gain(S, Humidity) = 0.151
Gain(S, Wind) = 0.048
Decision Tree Construction

Outlook
  Sunny    → ?    [2+, 3-]
  Overcast → yes  [4+, 0-]
  Rain     → ?    [3+, 2-]
Decision Tree Construction

◘ Which attribute is next?

Outlook
  Sunny    → ?    [2+, 3-]
  Overcast → yes  [4+, 0-]
  Rain     → ?    [3+, 2-]

Gain(S_sunny, Wind) = 0.970 − (2/5)·1.0 − (3/5)·0.918 = 0.019
Gain(S_sunny, Humidity) = 0.970 − (3/5)·0.0 − (2/5)·0.0 = 0.970
Gain(S_sunny, Temperature) = 0.970 − (2/5)·0 − (2/5)·1 − (1/5)·0 = 0.570

Decision Tree Construction

Outlook
  Sunny    → Humidity
               High   → no   [D1, D2, D8]
               Normal → yes  [D9, D11]
  Overcast → yes  [D3, D7, D12, D13]
  Rain     → Wind
               Weak   → yes  [D4, D5, D10]
               Strong → no   [D6, D14]
Converting the Tree to Rules

Outlook
  Sunny    → Humidity   (High → No, Normal → Yes)
  Overcast → Yes
  Rain     → Wind       (Strong → No, Weak → Yes)

R1: If (Outlook = Sunny) ∧ (Humidity = High) Then PlayTennis = No
R2: If (Outlook = Sunny) ∧ (Humidity = Normal) Then PlayTennis = Yes
R3: If (Outlook = Overcast) Then PlayTennis = Yes
R4: If (Outlook = Rain) ∧ (Wind = Strong) Then PlayTennis = No
R5: If (Outlook = Rain) ∧ (Wind = Weak) Then PlayTennis = Yes
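The five rules, written out as a small hypothetical Python function for illustration:

```python
# Rules R1-R5 above as a predict function; attribute values are the strings used in the slides.
def play_tennis(outlook, humidity, wind):
    if outlook == "Sunny":
        return "No" if humidity == "High" else "Yes"      # R1, R2
    if outlook == "Overcast":
        return "Yes"                                       # R3
    if outlook == "Rain":
        return "No" if wind == "Strong" else "Yes"         # R4, R5

print(play_tennis("Sunny", "Normal", "Weak"))   # Yes (rule R2)
```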
Gain Ratio for Attribute Selection (C4.5)
◘ The information gain measure is biased towards attributes with a large
number of values
◘ C4.5 (a successor of ID3) uses gain ratio to overcome this problem
(a normalization of information gain)

SplitInfo_A(D) = − Σ_{j=1..v} (|D_j| / |D|) · log2(|D_j| / |D|)

◘ GainRatio(A) = Gain(A) / SplitInfo_A(D)
◘ The attribute with the maximum gain ratio is selected as the splitting
attribute
Computation of Gain Ratio
◘ Suppose the attribute "Wind" partitions D into 8 tuples in D1: {Weak} and 6 in
D2: {Strong}

SplitInfo_Wind(D) = −(8/14) log2(8/14) − (6/14) log2(6/14) = 0.9852

Gain(S, Wind) = 0.048

GainRatio(Wind) = Gain(S, Wind) / SplitInfo_Wind(D) = 0.048 / 0.9852 = 0.0487
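A quick check of the split-information and gain-ratio values above in plain Python:

```python
# Wind splits D into 8 Weak / 6 Strong tuples.
import math

split_info = -sum(n / 14 * math.log2(n / 14) for n in (8, 6))
gain_wind = 0.048                        # from the earlier information-gain slide
print(round(split_info, 4))              # 0.9852
print(round(gain_wind / split_info, 4))  # ≈ 0.0487
```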
Gini Index (CART, IBM IntelligentMiner)
◘ If a data set D contains examples from n classes, the gini index gini(D) is
defined as

      gini(D) = 1 − Σ_{j=1..n} p_j²

  where p_j is the relative frequency of class j in D
◘ If a data set D is split on A into two subsets D1 and D2, the gini index
gini_A(D) is defined as

      gini_A(D) = (|D1| / |D|) · gini(D1) + (|D2| / |D|) · gini(D2)

◘ Reduction in impurity:  Δgini(A) = gini(D) − gini_A(D)
◘ The attribute that provides the largest reduction in impurity is chosen to
split the node
◘ A gain measure analogous to information gain can be computed by replacing
the entropy measure with the Gini index.
Computation of Gini Index
◘ Ex.: D has 9 tuples with "PlayTennis" = "yes" and 5 with "no"

      gini(D) = 1 − (9/14)² − (5/14)² = 0.459

◘ Suppose the attribute "Wind" partitions D into 8 tuples in D1: {Weak} and 6 in
D2: {Strong}

      gini_Wind(D) = (8/14) · gini(D_Weak) + (6/14) · gini(D_Strong)

      gini(D_Weak)   = 1 − (6/8)² − (2/8)² = 0.375
      gini(D_Strong) = 1 − (3/6)² − (3/6)² = 0.5

      gini_Wind(D) = (8/14) · 0.375 + (6/14) · 0.5 = 0.429

      Δgini(Wind) = gini(D) − gini_Wind(D) = 0.459 − 0.429 = 0.03
Overfitting and Tree Pruning
◘ Overfitting: An induced tree may overfit the training data
– Too many branches, some may reflect anomalies due to noise or
outliers
– Poor accuracy for unseen samples
◘ Two approaches to avoid overfitting
– Prepruning: Halt tree construction early; do not split a node if this
would result in the goodness measure falling below a threshold
• Difficult to choose an appropriate threshold
– Postpruning: Remove branches from a “fully grown” tree—get a
sequence of progressively pruned trees
• Use a set of data different from the training data to decide
which is the “best pruned tree”
Random Forest
◘ A random forest F = (G1, …, Gm) is an ensemble of random decision trees Gi.
◘ Initially proposed by Breiman in 2001 [1].
◘ It uses bootstrap sampling and aggregation (bagging) [2] when building each
individual tree.
◘ The prediction of an uncorrelated forest of trees is more accurate than that
of any individual tree.
Disadvantage:
High memory consumption, O(2^d): memory consumption grows exponentially
with the depth d of the trees.
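A minimal random-forest sketch, assuming scikit-learn is available; the synthetic dataset and the hyperparameters are illustrative, not from the slides.

```python
# Train and evaluate a random forest of bootstrap-sampled trees.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees; limiting max_depth also limits the O(2^d) memory growth noted above.
forest = RandomForestClassifier(n_estimators=100, max_depth=6, random_state=0)
forest.fit(X_train, y_train)
print("test accuracy:", forest.score(X_test, y_test))
```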
Classification Techniques

Decision Trees

Bayesian Classification

K-Nearest Neighbor

Classification Neural Network

Support Vector Machines (SVM)

Fuzzy Set Approaches


Classification Techniques
2- Bayesian Classification
◘ A statistical classifier: performs probabilistic prediction, i.e., predicts class
membership probabilities.
◘ Foundation: Based on Bayes' Theorem.
Given training data X, the posterior probability of a hypothesis H, P(H|X),
follows Bayes' theorem:

      P(H|X) = P(X|H) · P(H) / P(X)

Example: According to a study, 1 out of 43 children develops a certain disease in
adulthood. The test is not completely reliable: an infected child tests positive 80% of the
time, and a healthy child tests positive 10% of the time. Given this information, what is the
probability that a child with a positive test result is actually ill?

P(A): the probability that the child is ill = 1/43
P(B): the probability that the test is positive = (1/43) · 0.80 + (42/43) · 0.10 = 5/43
P(A|B): the probability that the child is ill given a positive test (the unknown)
P(B|A): the probability that an ill child tests positive = 0.80

P(A|B) = P(B|A) · P(A) / P(B) = (0.80 · 1/43) / (5/43) = 0.16 = 16%
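The same example, checked in a few lines of Python:

```python
# Bayes' theorem applied to the disease-test example above.
p_ill = 1 / 43
p_pos_given_ill = 0.80
p_pos = p_ill * 0.80 + (42 / 43) * 0.10           # total probability of a positive test
p_ill_given_pos = p_pos_given_ill * p_ill / p_pos
print(round(p_ill_given_pos, 2))                  # 0.16
```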


Classification Techniques
2- Bayesian Classification
◘ X = (age <= 30, income = medium, student = yes, credit_rating = fair)
(using the buys_computer training data shown in the Decision Trees section)

◘ P(C1): P(buys_computer = "yes") = 9/14 = 0.643
  P(C2): P(buys_computer = "no") = 5/14 = 0.357

◘ Compute P(X|Ci) for each class
  P(age = "<=30" | buys_computer = "yes") = 2/9 = 0.222
  P(age = "<=30" | buys_computer = "no") = 3/5 = 0.6
  P(income = "medium" | buys_computer = "yes") = 4/9 = 0.444
  P(income = "medium" | buys_computer = "no") = 2/5 = 0.4
  P(student = "yes" | buys_computer = "yes") = 6/9 = 0.667
  P(student = "yes" | buys_computer = "no") = 1/5 = 0.2
  P(credit_rating = "fair" | buys_computer = "yes") = 6/9 = 0.667
  P(credit_rating = "fair" | buys_computer = "no") = 2/5 = 0.4

◘ P(X|C1): P(X | buys_computer = "yes") = 0.222 × 0.444 × 0.667 × 0.667 = 0.044
  P(X|C2): P(X | buys_computer = "no") = 0.6 × 0.4 × 0.2 × 0.4 = 0.019

P(X|Ci) · P(Ci):
  P(X | buys_computer = "yes") · P(buys_computer = "yes") = 0.028
  P(X | buys_computer = "no") · P(buys_computer = "no") = 0.007

Therefore, X belongs to the class ("buys_computer = yes")
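A short Python check of this Naive Bayes computation, using the probabilities read off the table above:

```python
# Priors and per-class conditional probabilities for X = (age<=30, income=medium, student=yes, credit=fair).
priors = {"yes": 9 / 14, "no": 5 / 14}
cond = {
    "yes": [2 / 9, 4 / 9, 6 / 9, 6 / 9],
    "no":  [3 / 5, 2 / 5, 1 / 5, 2 / 5],
}

scores = {}
for c in priors:
    p = priors[c]
    for q in cond[c]:
        p *= q                              # naive (conditional independence) assumption
    scores[c] = p

print({c: round(p, 3) for c, p in scores.items()})       # {'yes': 0.028, 'no': 0.007}
print("predicted class:", max(scores, key=scores.get))   # yes
```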


Classification Techniques
2- Bayesian Classification
Classification Techniques

Decision Trees

Bayesian Classification

K-Nearest Neighbor

Classification Neural Network

Support Vector Machines (SVM)

Fuzzy Set Approaches


K-Nearest Neighbor (k-NN)

◘ An object is classified by a majority vote of its neighbors (the k closest
members).

◘ If k = 1, then the object is simply assigned to the class of its nearest


neighbor.

◘ The Euclidean distance measure is used to calculate how close the neighbors are.

K-Nearest Neighbor (k-NN)
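A minimal k-NN sketch in plain Python (majority vote under Euclidean distance); the toy data points are illustrative.

```python
# Classify a query point by a majority vote of its k nearest neighbors.
import math
from collections import Counter

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, query, k=3):
    # sort training points by distance to the query and take the k closest
    neighbors = sorted(zip(train_X, train_y), key=lambda p: euclidean(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

X = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (4.5, 3.9), (0.9, 1.1)]
y = ["A", "A", "B", "B", "A"]
print(knn_predict(X, y, (1.1, 1.0), k=3))   # A
```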
Confusion Matrix with Classification Metrics

                           Predicted: Positive                    Predicted: Negative
Actual: Positive (True)    True Positive (TP)                     False Negative (FN) (Type II Error)
Actual: Negative (False)   False Positive (FP) (Type I Error)     True Negative (TN)
Confusion Matrix of Email Classification
                           Predicted: Positive    Predicted: Negative
Actual: Positive (True)    TP = 45                FN = 20
Actual: Negative (False)   FP = 5                 TN = 30

◘ 69.23% of the spam emails are correctly classified as spam (recall = TP / (TP + FN) = 45/65).
◘ 85.71% of the non-spam emails are correctly classified as non-spam (specificity = TN / (TN + FP) = 30/35).
◘ 90% of the examples classified as spam are actually spam (precision = TP / (TP + FP) = 45/50).
◘ 75% of all examples are correctly classified by the classifier (accuracy = (TP + TN) / total = 75/100).
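The four percentages above, computed from the confusion-matrix counts:

```python
# Metrics derived from the email confusion matrix (TP=45, FN=20, FP=5, TN=30).
TP, FN, FP, TN = 45, 20, 5, 30

recall      = TP / (TP + FN)                   # ≈ 0.6923  (share of actual spam caught)
specificity = TN / (TN + FP)                   # ≈ 0.8571  (share of non-spam kept)
precision   = TP / (TP + FP)                   # 0.90      (share of predicted spam that is spam)
accuracy    = (TP + TN) / (TP + FN + FP + TN)  # 0.75

print(round(recall, 4), round(specificity, 4), precision, accuracy)
```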
Validation Techniques
◘ Simple Validation
– The data is split once into a training set and a test set.

◘ Cross Validation
– The data is split into two parts; each part serves once as the training set and once as the test set.

◘ n-Fold Cross Validation
– The data is split into n folds; each fold serves once as the test set while the remaining n−1 folds form the training set.
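A minimal n-fold cross-validation sketch (n = 5 here), assuming scikit-learn is available; the Iris dataset and the k-NN classifier are illustrative choices.

```python
# Estimate accuracy with 5-fold cross-validation.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
scores = cross_val_score(KNeighborsClassifier(n_neighbors=3), X, y, cv=5)
print("fold accuracies:", scores.round(3))
print("mean accuracy:  ", scores.mean().round(3))
```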
References
◘ [1] Criminisi, A., Shotton, J., & Konukoglu, E. (2012). Decision forests: A unified framework for
classification, regression, density estimation, manifold learning and semi-supervised
learning. Foundations and Trends® in Computer Graphics and Vision, 7(2–3), 81-227.
