
Classification

Part 1

CSE4416 – Introduction to Data Mining

Assoc. Prof. Dr. Derya BİRANT


Outline

◘ What Is Classification?
◘ Classification Examples
◘ Classification Methods
– Decision Trees
– Bayesian Classification
– K-Nearest Neighbor
– Neural Network
– Support Vector Machines (SVM)
– Fuzzy Set Approaches
What Is Classification?

◘ Classification
– Construction of a model to classify data
– When constructing the model, use the training set and the class labels
(e.g., yes/no) in the target column

(Diagram: Training Set → Model)


Classification Steps

1. Model construction
– Each tuple is assumed to belong to a predefined class
– The set of tuples used for model construction is the training set
– The model is represented as classification rules, trees, or mathematical formulae

2. Test Model
– Using the test set, estimate the accuracy rate of the model
• The accuracy rate is the percentage of test-set samples that are correctly classified
by the model

3. Model Usage (Classifying future or unknown objects)


– If the accuracy is acceptable, use the model to classify data tuples whose class
labels are not known
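
As an illustration of these three steps in code, here is a minimal sketch using scikit-learn; the library choice and the toy dataset are my own, not from the slides.

# A minimal sketch of the three classification steps with scikit-learn.
# The toy data below is hypothetical; any labeled table works the same way.
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X = [[25, 30000], [40, 80000], [35, 60000], [50, 120000],
     [23, 20000], [45, 95000], [31, 40000], [38, 70000]]   # feature tuples
y = ["no", "yes", "yes", "yes", "no", "yes", "no", "yes"]  # class labels

# 1. Model construction: learn a classifier from the training set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = DecisionTreeClassifier().fit(X_train, y_train)

# 2. Test the model: estimate the accuracy rate on the held-out test set
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 3. Model usage: classify a new tuple whose class label is not known
print("prediction:", model.predict([[29, 55000]]))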
Classification Steps
1. Construct Model: Training Set
(Training Set → Learn → Classifier/Model)

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

2. Test Model: Test Set

Refund  Marital Status  Taxable Income  Cheat
No      Single          75K             No
Yes     Married         50K             Yes
No      Married         150K            Yes
Yes     Divorced        90K             No

3. Use Model: New Data (class unknown)

Refund  Marital Status  Taxable Income  Cheat
Yes     Divorced        50K             ?
No      Married         50K             ?
Yes     Single          150K            ?
Classification Example

◘ Given old data about customers and payments, predict a new applicant's loan eligibility:
– Good Customers
– Bad Customers

(Diagram: previous customers' records, with attributes Age, Salary, Profession, Location,
and Customer type, are fed to a classifier, which learns rules involving tests such as
"Salary > 5 L" and "Prof. = Exec" that label a customer Good or Bad; the rules are then
applied to the new applicant's data.)


Classification Techniques
1. Decision Trees
2. Bayesian Classification
3. K-Nearest Neighbor
4. Neural Network
5. Support Vector Machines (SVM)
6. Fuzzy Set Approaches

(The formula shown on the slide is the naive Bayes decision rule:
$c_{\max} = \arg\max_{c_j} \frac{P(c_j)}{P(d)} \prod_{i=1}^{n} P(a_i \mid c_j)$.)


Classification Techniques

Decision Trees

Bayesian Classification

K-Nearest Neighbor

Neural Network

Support Vector Machines (SVM)

Fuzzy Set Approaches


Decision Trees

◘ Decision Tree is a tree where


– internal nodes are simple decision rules on one or more attributes
– leaf nodes are predicted class labels
◘ Decision trees are used for deciding between several courses of
action
age     income  student  credit_rating  buys_computer
<=30    high    no       fair           no
<=30    high    no       excellent      no
31…40   high    no       fair           yes
>40     medium  no       fair           yes
>40     low     yes      fair           yes
>40     low     yes      excellent      no
31…40   low     yes      excellent      yes
<=30    medium  no       fair           no
<=30    low     yes      fair           yes
>40     medium  yes      fair           yes
<=30    medium  yes      excellent      yes
31…40   medium  no       excellent      yes
31…40   high    yes      fair           yes
>40     medium  no       excellent      no

The tree learned from this table (internal nodes test an attribute, branches are
attribute values, leaves are classifications):

age?
  <=30    → student?
              no  → no
              yes → yes
  31…40   → yes
  >40     → credit rating?
              excellent → no
              fair      → yes
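
A sketch of learning a tree from this table with scikit-learn (my own illustration; note that scikit-learn grows binary CART-style trees over one-hot encoded attributes, so the printed tree will not match the slide's multiway splits exactly):

# Learn a decision tree from the buys_computer table.
# Categorical attributes are one-hot encoded before fitting.
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_text

rows = [  # (age, income, student, credit_rating, buys_computer)
    ("<=30", "high", "no", "fair", "no"), ("<=30", "high", "no", "excellent", "no"),
    ("31..40", "high", "no", "fair", "yes"), (">40", "medium", "no", "fair", "yes"),
    (">40", "low", "yes", "fair", "yes"), (">40", "low", "yes", "excellent", "no"),
    ("31..40", "low", "yes", "excellent", "yes"), ("<=30", "medium", "no", "fair", "no"),
    ("<=30", "low", "yes", "fair", "yes"), (">40", "medium", "yes", "fair", "yes"),
    ("<=30", "medium", "yes", "excellent", "yes"), ("31..40", "medium", "no", "excellent", "yes"),
    ("31..40", "high", "yes", "fair", "yes"), (">40", "medium", "no", "excellent", "no"),
]
df = pd.DataFrame(rows, columns=["age", "income", "student", "credit_rating", "buys_computer"])
X = pd.get_dummies(df.drop(columns="buys_computer"))   # one-hot encode the attributes
y = df["buys_computer"]

# criterion="entropy" uses the same information-gain idea as ID3
tree = DecisionTreeClassifier(criterion="entropy", random_state=0).fit(X, y)
print(export_text(tree, feature_names=list(X.columns)))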
Decision Regions

(Figure: the decision regions that the tree induces in attribute space.)
Decision Tree Applications

◘ Decision trees are used extensively in data mining.


◘ They have been applied to:
– classify medical patients by their disease,
– classify customers by past behavior (their interests, loyalty, etc.),
– classify documents,
– ...

(Figure: example trees for "House" and "Hiring" decisions, with splits such as
Salary < 1 M, Job = teacher, and Age < 30 leading to Good/Bad leaves.)
Decision Tree Advantages and Disadvantages

Positives (+)
+ Reasonable training time
+ Fast application
+ Easy to interpret (can be re-represented as if-then-else rules)
+ Easy to implement
+ Can handle a large number of features
+ Does not require any prior knowledge of the data distribution

Negatives (-)
- Cannot handle complicated relationships between features
- Simple decision boundaries
- Problems with lots of missing data
- Output attribute must be categorical
- Limited to one output attribute
Rules Indicated by Decision Trees

◘ Write a rule for each path in the decision tree from the root to a leaf.
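
For the buys_computer tree shown earlier, that gives five rules; as a small illustration (my own, not from the slides), they can be written directly as an if-then classifier:

# One if-then rule per root-to-leaf path of the buys_computer tree.
def buys_computer(age, student, credit_rating):
    if age == "<=30" and student == "no":        # RULE 1: age<=30 AND student=no  -> no
        return "no"
    if age == "<=30" and student == "yes":       # RULE 2: age<=30 AND student=yes -> yes
        return "yes"
    if age == "31..40":                          # RULE 3: age in 31..40          -> yes
        return "yes"
    if credit_rating == "excellent":             # RULE 4: age>40 AND excellent   -> no
        return "no"
    return "yes"                                 # RULE 5: age>40 AND fair        -> yes

print(buys_computer("<=30", "yes", "fair"))      # -> yes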
Decision Tree Algorithms

◘ ID3
– Quinlan (1981)
– Tries to reduce the expected number of comparisons
◘ C4.5
– Quinlan (1993)
– It is an extension of ID3
– Just starting to be used in data mining applications
– Also used for rule induction
◘ CART
– Breiman, Friedman, Olshen, and Stone (1984)
– Classification and Regression Trees
◘ CHAID
– Kass (1980)
– Oldest decision tree algorithm
– Well established in database marketing industry
◘ QUEST
– Loh and Shih (1997)
Decision Tree Construction

◘ Which attribute is the best classifier?


– Calculate the information gain G(S,A) for each attribute A.
– Select the attribute with the highest information gain.

$\mathrm{Entropy}(S) = -\sum_{i=1}^{m} p_i \log_2 p_i$ , which for two classes is
$\mathrm{Entropy}(S) = -p_1 \log_2 p_1 - p_2 \log_2 p_2$

$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{i \in \mathrm{Values}(A)} \frac{|S_i|}{|S|}\,\mathrm{Entropy}(S_i)$
(Figure: the entropy curve.)
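
A short Python sketch of both formulas (the helper names entropy and info_gain are my own):

from math import log2
from collections import Counter

def entropy(labels):
    # Entropy(S) = -sum_i p_i * log2(p_i) over the class proportions in S
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, labels):
    # Gain(S, A) = Entropy(S) - sum_v (|S_v| / |S|) * Entropy(S_v),
    # where S_v is the subset of rows with value v for attribute index attr
    n = len(labels)
    gain = entropy(labels)
    for v in {row[attr] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

# 9 positive and 5 negative examples, as in the computation below
print(round(entropy(["yes"] * 9 + ["no"] * 5), 3))   # 0.94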
Decision Tree Construction

Which attribute first?


Decision Tree Construction

$\mathrm{Entropy}(S) = -(9/14)\log_2(9/14) - (5/14)\log_2(5/14) = 0.940$


Decision Tree Construction

Gain(S, Outlook) = 0.25
Gain(S, Temperature) = 0.03
Gain(S, Humidity) = 0.16
Gain(S, Windy) = 0.05

Outlook has the highest information gain, so it becomes the root node:

Outlook?
  sunny    → ?
  overcast → yes
  rain     → ?
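
These Gain values can be checked in code. The 14-row play-tennis table behind these slides appears only as an image in the deck, so the rows below are my reconstruction of that standard dataset (D1-D14); the helpers repeat the earlier sketch so the snippet runs on its own:

from math import log2
from collections import Counter

def entropy(labels):   # same helper as in the earlier sketch
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, attr, labels):   # same helper as in the earlier sketch
    n = len(labels)
    gain = entropy(labels)
    for v in {row[attr] for row in rows}:
        subset = [lab for row, lab in zip(rows, labels) if row[attr] == v]
        gain -= len(subset) / n * entropy(subset)
    return gain

# (outlook, temperature, humidity, windy, play), rows D1..D14
data = [
    ("sunny", "hot", "high", "weak", "no"),       ("sunny", "hot", "high", "strong", "no"),
    ("overcast", "hot", "high", "weak", "yes"),   ("rain", "mild", "high", "weak", "yes"),
    ("rain", "cool", "normal", "weak", "yes"),    ("rain", "cool", "normal", "strong", "no"),
    ("overcast", "cool", "normal", "strong", "yes"), ("sunny", "mild", "high", "weak", "no"),
    ("sunny", "cool", "normal", "weak", "yes"),   ("rain", "mild", "normal", "weak", "yes"),
    ("sunny", "mild", "normal", "strong", "yes"), ("overcast", "mild", "high", "strong", "yes"),
    ("overcast", "hot", "normal", "weak", "yes"), ("rain", "mild", "high", "strong", "no"),
]
labels = [row[-1] for row in data]
for i, name in enumerate(["Outlook", "Temperature", "Humidity", "Windy"]):
    # prints the gain of each attribute; compare with the values quoted above
    print(name, round(info_gain(data, i, labels), 3))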
Decision Tree Construction

◘ Which attribute is next?

Outlook?
  sunny    → ?
  overcast → yes
  rain     → ?

For the sunny branch ($S_{sunny}$: 5 examples, 2 yes / 3 no, entropy 0.970):

$\mathrm{Gain}(S_{sunny}, \mathrm{Wind}) = 0.970 - (2/5)(1.0) - (3/5)(0.918) = 0.019$

$\mathrm{Gain}(S_{sunny}, \mathrm{Humidity}) = 0.970 - (3/5)(0.0) - (2/5)(0.0) = 0.970$

$\mathrm{Gain}(S_{sunny}, \mathrm{Temperature}) = 0.970 - (2/5)(0) - (2/5)(1) - (1/5)(0) = 0.570$

Humidity has the highest gain, so it is tested next under sunny.


Decision Tree Construction

Outlook?
  sunny    → Humidity?
               high   → no   [D1, D2, D8]
               normal → yes  [D9, D11]
  overcast → yes  [D3, D7, D12, D13]
  rain     → Wind?
               weak   → yes  [D4, D5, D10]
               strong → no   [D6, D14]
Another Example
At the weekend you can:
- go shopping,
- watch a movie,
- play tennis, or
- just stay in.

What you do depends on three things:


- the weather (windy, rainy or sunny);
- how much money you have (rich or poor); and
- whether your parents are visiting.
Classification Techniques

Decision Trees

Bayesian Classification

K-Nearest Neighbor

Neural Network

Support Vector Machines (SVM)

Fuzzy Set Approaches


Classification Techniques
2- Bayesian Classification
◘ A statistical classifier: performs probabilistic prediction, i.e., predicts class
membership probabilities.

◘ Foundation: Based on Bayes’ Theorem.

◘ Given training data X, the posterior probability of a hypothesis H, P(H|X), follows
Bayes' theorem:

$P(H \mid X) = \frac{P(X \mid H)\,P(H)}{P(X)}$
Classification Techniques
2- Bayesian Classification
◘ Predict the class of X = (age <= 30, income = medium, student = yes, credit_rating = fair),
using the buys_computer training table shown earlier.

◘ P(C1): P(buys_computer = “yes”) = 9/14 = 0.643
P(C2): P(buys_computer = “no”) = 5/14 = 0.357

◘ Compute P(X|Ci) for each class:
P(age = “<=30” | buys_computer = “yes”) = 2/9 = 0.222
P(age = “<=30” | buys_computer = “no”) = 3/5 = 0.6
P(income = “medium” | buys_computer = “yes”) = 4/9 = 0.444
P(income = “medium” | buys_computer = “no”) = 2/5 = 0.4
P(student = “yes” | buys_computer = “yes”) = 6/9 = 0.667
P(student = “yes” | buys_computer = “no”) = 1/5 = 0.2
P(credit_rating = “fair” | buys_computer = “yes”) = 6/9 = 0.667
P(credit_rating = “fair” | buys_computer = “no”) = 2/5 = 0.4

◘ P(X|C1) : P(X|buys_computer = “yes”) = 0.222 x 0.444 x 0.667 x 0.667 = 0.044


P(X|C2) : P(X|buys_computer = “no”) = 0.6 x 0.4 x 0.2 x 0.4 = 0.019

P(X|Ci)*P(Ci) : P(X|buys_computer = “yes”) * P(buys_computer = “yes”) = 0.028


P(X|buys_computer = “no”) * P(buys_computer = “no”) = 0.007

Therefore, X belongs to class (“buys_computer = yes”)
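
The same arithmetic in a few lines of Python (a sketch that just reproduces the slide's numbers):

# Naive Bayes for X = (age <= 30, income = medium, student = yes, credit_rating = fair)
p_yes, p_no = 9 / 14, 5 / 14                   # class priors P(Ci)

px_yes = (2/9) * (4/9) * (6/9) * (6/9)         # P(X | yes), product of the four likelihoods ~ 0.044
px_no  = (3/5) * (2/5) * (1/5) * (2/5)         # P(X | no)                                  ~ 0.019

score_yes = px_yes * p_yes                     # P(X | yes) * P(yes) ~ 0.028
score_no  = px_no * p_no                       # P(X | no)  * P(no)  ~ 0.007
print("buys_computer =", "yes" if score_yes > score_no else "no")   # -> yes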


Classification Techniques

Decision Trees

Bayesian Classification

K-Nearest Neighbor

Neural Network

Support Vector Machines (SVM)

Fuzzy Set Approaches


K-Nearest Neighbor (k-NN)

◘ An object is classified by a majority vote of its neighbors (its k closest members).

◘ If k = 1, then the object is simply assigned to the class of its nearest neighbor.

◘ The Euclidean distance measure is typically used to calculate how close neighbors are.
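
A minimal k-NN sketch (my own illustration, with a made-up toy dataset): classify a query point by majority vote among its k nearest training points under Euclidean distance.

from math import dist           # Euclidean distance (Python 3.8+)
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    # Sort training points by Euclidean distance to the query point
    neighbors = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], query))
    # Majority vote among the labels of the k nearest neighbors
    votes = Counter(label for _, label in neighbors[:k])
    return votes.most_common(1)[0][0]

train_X = [(1, 1), (2, 1), (8, 9), (9, 8), (1, 2)]
train_y = ["A", "A", "B", "B", "A"]
print(knn_predict(train_X, train_y, (2, 2), k=3))   # -> "A"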

