
Unit-5

Classification

Outline
 Introduction to classification
 Classification & prediction issues
 Classification methods
Introduction to classification
 Classification is a supervised learning method.
 It is a data mining function that assigns items in a collection to
target categories or classes.
 The goal of classification is to accurately predict the target class
for each case in the data.
 For example, a classification model could be used to identify loan
applicants as low, medium, or high credit risks.
 In supervised learning, the learner (a computer program) is provided
with two sets of data: a training data set and a test data set.
 The idea is for the learner to “learn” from a set of labeled
examples in the training set so that it can classify unlabeled
examples in the test set with the highest possible accuracy.
Introduction to classification (Cont..)
 Suppose a database D is given as D = {t1, t2, …, tn} and a set of desired
classes is C = {C1, …, Cm}.
 The classification problem is to define a mapping m : D → C that
determines which tuple of database D belongs to which class of C.
 In effect, classification divides D into equivalence classes, one per class in C.
 Prediction is similar, but it can be viewed as having an infinite number of
classes (a continuous target value).
Classification Example
 Teachers classify students' grades as A, B, C, D, or E.
 Identify individuals' credit risk (high, low, medium, or unknown).
 Classify cricket players (batsman, bowler, all-rounder).
 Categorize websites (educational, sports, music).
Classification Example (Cont..)
 How do teachers give grades to students based on their obtained
marks?
• If x >= 90, then grade A.
• If 80 <= x < 90, then grade B.
• If 70 <= x < 80, then grade C.
• If 60 <= x < 70, then grade D.
• If x < 60, then grade E.
[Figure: a decision tree that tests the mark x against the thresholds 90, 80, 70, and 60 in turn; each branch ends in a leaf labelled A, B, C, D, or E.]
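These threshold rules translate directly into code. A minimal Python sketch (the function name grade is illustrative, not from the slides):

```python
def grade(x):
    """Assign a letter grade to a mark x (0-100) using the rules above."""
    if x >= 90:
        return 'A'
    elif x >= 80:
        return 'B'
    elif x >= 70:
        return 'C'
    elif x >= 60:
        return 'D'
    else:
        return 'E'

print(grade(85))  # B
```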
Classification: a two-step process
1) Model Construction
[Figure: the training data is fed to a classification algorithm, which produces the classifier (model).]

Training Data:
Name   Rank         Years   Tenured
Mike   Asst. Prof.  3       No
Mary   Asst. Prof.  7       Yes
Bill   Prof.        2       Yes
Jim    Asso. Prof.  7       Yes
Dave   Asst. Prof.  6       No
Anne   Asso. Prof.  3       No

Learned classifier (model): IF Rank = 'Professor' OR Years > 6 THEN Tenured = 'Yes'
Classification: a two-step process (Cont..)
2) Model Usage
[Figure: the classifier is applied first to the testing data, then to unseen data.]

Testing Data:
Name     Rank         Years   Tenured
Tom      Asst. Prof.  2       No
Merlisa  Asso. Prof.  7       No
George   Prof.        5       Yes
Joseph   Asst. Prof.  7       Yes

Unseen data: (Jeff, Professor, 4) → Tenured? Yes
Classification: a two-step process (Cont..)
1) Model Construction
• Describing a set of predetermined classes:
o Each tuple/sample is assumed to belong to a predefined class, as determined by
the class label attribute.
o The set of tuples used for model construction is called the training set.
o The model is represented as classification rules, decision trees, or
mathematical formulae.

2) Model Usage
• For classifying future or unknown objects:
o Estimate the accuracy of the model.
o The known label of each test sample is compared with the classified result from the
model.
o The accuracy rate is the percentage of test set samples that are correctly classified
by the model.
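A minimal sketch of the two-step process on the tenure example, assuming scikit-learn is available (the slides do not prescribe a library, and the integer encoding of Rank is illustrative):

```python
from sklearn.tree import DecisionTreeClassifier

# Step 1: model construction from the training set.
# Rank encoded as 0 = Asst. Prof., 1 = Asso. Prof., 2 = Prof.;
# each sample is [rank, years].
X_train = [[0, 3], [0, 7], [2, 2], [1, 7], [0, 6], [1, 3]]
y_train = ['No', 'Yes', 'Yes', 'Yes', 'No', 'No']
model = DecisionTreeClassifier().fit(X_train, y_train)

# Step 2: model usage -- estimate the accuracy rate on the test set,
# then classify the unseen tuple (Jeff, Professor, 4 years).
X_test = [[0, 2], [1, 7], [2, 5], [0, 7]]
y_test = ['No', 'No', 'Yes', 'Yes']
print(model.score(X_test, y_test))  # fraction of test samples classified correctly
print(model.predict([[2, 4]]))      # expected to print ['Yes']
```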
Classification & prediction issues
Data Preparation
 Data cleaning
• Pre-process data in order to reduce noise and handle missing values.
 Relevance analysis (Feature selection)
• Remove the irrelevant or redundant attributes.
 Data transformation
• Generalize the data to higher-level concepts using concept hierarchies, and/or
normalize the data, which involves scaling the values.
Classification & prediction issues (Cont..)
Evaluating Classification Methods
 Predictive accuracy
• This refers to the ability of the model to correctly predict the class label of new or
previously unseen data.
 Speed and scalability
• Time to construct model
• Time to use the model
 Robustness
• Handling noise and missing values
 Interpretability
• Understanding and insight provided by model
 Goodness of rules
• Decision tree size
• Strongest rule or not
Classification methods
 Decision Tree
 Bayesian Classification
 Rule Based Classification
 Neural Network
Decision tree
 One of the most common tasks is to build models for the
prediction of the class of an object on the basis of its attributes.
 The object can be a customer, patient, transaction, e-mail
message, or even a single character.
 Attributes of a patient object can be heart rate, blood pressure,
weight, gender, etc.
 The class of the patient object would most commonly be
positive/negative for a certain disease.
Decision tree (Cont..)
 In a decision tree, instances are represented by a fixed set of attributes (e.g.
gender) and their values (e.g. male, female), described as
attribute-value pairs.
 If each attribute has a small number of disjoint possible values (e.g.
high, medium, low) or there are only two possible classes (e.g.
true, false), then decision tree learning is easy.
 Extensions to the decision tree algorithm also handle real-valued
attributes (e.g. salary).
 The tree assigns a class label to each instance of the dataset.
Decision tree (Cont..)
 Decision tree is a classifier in the form of a tree structure
• Decision node: Specifies a test on a single attribute
• Leaf node: Indicates the value of the target attribute
• Arc/edge: Split of one attribute
• Path: A conjunction of tests that leads to the final decision (the tree as a whole represents a disjunction of such paths)
Decision tree representation - example
[Figure: a decision tree with a root node at the top, branches leading down to leaf nodes, and each leaf node giving one of the set of possible answers.]
Key requirements for classification
 Sufficient data:
• Enough training cases should be provided to learn the model.
 Attribute-value description:
• The object or case must be expressible in terms of a fixed collection of
properties or attributes (e.g., hot, mild, cold).
 Predefined classes (target values):
• The target function has discrete output values (boolean or multiclass).
Important terms for decision tree
 Entropy
 Information Gain
 Gini Index
Entropy (E)
 It measures the uncertainty of a decision:
• 0 if the outcome is completely certain,
• 1 if it is completely uncertain (for two equally likely classes),
• In general, entropy lies between 0 and 1; it is a probability-based
measure used to calculate the amount of uncertainty.
Entropy (E) (Cont..)
 It measures how much information we don't know (how
uncertain we are about the data).
 It can also be used to measure how much information we gain
from an attribute when the target attribute is revealed to us.
 Which attribute is best?
✔ The attribute with the largest expected reduction in entropy is the 'best'
attribute to use next.
✔ A large expected reduction means that splitting on that attribute leaves
subsets that are much more certain (homogeneous).
Entropy (E) (Cont..)
 A decision tree is built top-down from a root node and involves
partitioning the data into subsets that contain instances with
similar values (homogenous).
 ID3 algorithm uses entropy to calculate the homogeneity of a
sample.
 If the sample is completely homogeneous, the entropy is zero; if
the sample is equally divided, it has an entropy of one.
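A minimal Python sketch of this entropy calculation (the helper name entropy is illustrative):

```python
import math

def entropy(labels):
    """Entropy of a list of class labels: 0 if homogeneous,
    1 if a two-class sample is equally divided."""
    n = len(labels)
    probs = (labels.count(c) / n for c in set(labels))
    return -sum(p * math.log2(p) for p in probs)

print(entropy(['yes', 'yes', 'no', 'no']))  # 1.0 (equally divided)
print(entropy(['yes', 'yes', 'yes']))       # 0.0 (homogeneous)
print(entropy(['yes'] * 9 + ['no'] * 5))    # ~0.940
```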
Information Gain
 Information gain can also be used for continuous-valued (numeric)
attributes.
 The attribute which has the highest information gain is selected
for the split.
 Assume that there are two classes, P (positive) and N (negative).
 Suppose we have S samples; of these, p samples belong to
class P and n samples belong to class N.
 The amount of information needed to decide whether a sample in S belongs to
P or N is defined as

I(p, n) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n))
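A sketch of I(p, n) and of the information gain of a split in Python (helper names are illustrative; the Age subsets are taken from the buys_computer example shown later):

```python
import math

def info(p, n):
    """I(p, n): information needed to classify a sample with
    p positive and n negative examples."""
    def term(x):
        f = x / (p + n)
        return -f * math.log2(f) if f > 0 else 0.0
    return term(p) + term(n)

# Gain(A) = I(p, n) - E(A), where E(A) is the weighted information
# of the subsets produced by splitting on attribute A.
subsets = [(2, 3), (4, 0), (3, 2)]  # Age: <=30, 31..40, >40 as (p_i, n_i)
e_age = sum((p + n) / 14 * info(p, n) for p, n in subsets)
print(info(9, 5))          # ~0.940
print(info(9, 5) - e_age)  # Gain(Age) ~0.246
```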
Gini Index
 Assume there exist several possible split values for each attribute.
 We may need other tools, such as clustering, to get the possible
split values.
 It can be modified for categorical attributes.
 An alternative method to information gain is called the Gini Index.
 Gini is used in CART (Classification and Regression Trees).
 If a data set T contains examples from n classes, the Gini index, gini(T),
is defined as

Gini(T) = 1 - Σj (pj)²

• n : the number of classes
• pj : the relative frequency (probability) of class j in T
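A minimal Python sketch of the Gini index (the helper name gini is illustrative):

```python
def gini(class_counts):
    """Gini(T) = 1 - sum of p_j^2 over the classes of data set T."""
    n = sum(class_counts)
    return 1 - sum((c / n) ** 2 for c in class_counts)

print(gini([9, 5]))   # ~0.459 (two classes with 9 and 5 tuples)
print(gini([7, 7]))   # 0.5 (maximum impurity for two classes)
print(gini([14, 0]))  # 0.0 (pure node)
```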
Bayesian Classification
 It is named after Thomas Bayes, who proposed the Bayes Theorem.
 It is a statistical, supervised learning method for
classification.
 It can solve problems involving both categorical and continuous-valued
attributes.
 Bayesian classification uses conditional probabilities to find the most likely class.
The Bayes Theorem
 The Bayes Theorem:
• P(H|X) = P(X|H) · P(H) / P(X)
 P(H|X) : Probability that the customer will buy a computer given that we know
his age, credit rating, and income. (Posterior probability of H)
 P(H) : Probability that the customer will buy a computer, regardless of age,
credit rating, and income. (Prior probability of H)
 P(X|H) : Probability that the customer is 35 years old, has a fair credit rating, and
earns $40,000, given that he has bought a computer. (Posterior probability of X)
 P(X) : Probability that a person from our set of customers is 35 years old, has a
fair credit rating, and earns $40,000. (Prior probability of X)
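For illustration, with hypothetical values P(H) = 0.5, P(X|H) = 0.2, and P(X) = 0.25, the theorem gives P(H|X) = (0.2 × 0.5) / 0.25 = 0.4.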
Naïve Bayes classifier - Example
Age Income Student Credit_Rating Class : buys_computer
<=30 High No Fair No
<=30 High No Excellent No
31..40 High No Fair Yes
>40 Medium No Fair Yes
>40 Low Yes Fair Yes
>40 Low Yes Excellent No
31..40 Low Yes Excellent Yes
<=30 Medium No Fair No
<=30 Low Yes Fair Yes
>40 Medium Yes Fair Yes
<=30 Medium Yes Excellent Yes
31..40 Medium No Excellent Yes
31..40 High Yes Fair Yes
>40 Medium No Excellent No
Naïve Bayes classifier - Solution
Age
P (<=30 | Yes) = 2/9 P (<=30 | No) = 3/5
P (31..40 | Yes) = 4/9 P (31..40 | No) = 0/5 P (Yes) = 9/14
P (> 40 | Yes) = 3/9 P (> 40 | No) = 2/5 P (No) = 5/14
Income
P (High | Yes) = 2/9 P (High | No) = 2/5
P (Medium | Yes) = 4/9 P (Medium | No) = 2/5
P (Low | Yes) = 3/9 P (Low | No) = 1/5
Student
P (No | Yes) = 3/9 P (No | No) = 4/5
P (Yes | Yes) = 6/9 P (Yes | No) = 1/5
Credit_rating
P (Fair | Yes) = 6/9 P (Fair | No) = 2/5
P (Excellent | Yes) = 3/9 P (Excellent | No) = 3/5
Naïve Bayes classifier - Solution
 An unseen sample Y = (<=30, Low, Yes, Excellent)
 P(Y|Yes) · P(Yes) = P(<=30|Yes) · P(Low|Yes) · P(Yes|Yes) · P(Excellent|Yes) · P(Yes)
= 2/9 * 3/9 * 6/9 * 3/9 * 9/14
= 0.010582
 P(Y|No) · P(No) = P(<=30|No) · P(Low|No) · P(Yes|No) · P(Excellent|No) · P(No)
= 3/5 * 1/5 * 1/5 * 3/5 * 5/14
= 0.005142
 We choose the class that maximizes this probability, which means the
new instance is classified as Yes (buys_computer).
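A minimal pure-Python sketch of this Naïve Bayes computation from frequency counts (the rows are the buys_computer training data above; helper names are illustrative):

```python
rows = [
    ('<=30', 'High', 'No', 'Fair', 'No'), ('<=30', 'High', 'No', 'Excellent', 'No'),
    ('31..40', 'High', 'No', 'Fair', 'Yes'), ('>40', 'Medium', 'No', 'Fair', 'Yes'),
    ('>40', 'Low', 'Yes', 'Fair', 'Yes'), ('>40', 'Low', 'Yes', 'Excellent', 'No'),
    ('31..40', 'Low', 'Yes', 'Excellent', 'Yes'), ('<=30', 'Medium', 'No', 'Fair', 'No'),
    ('<=30', 'Low', 'Yes', 'Fair', 'Yes'), ('>40', 'Medium', 'Yes', 'Fair', 'Yes'),
    ('<=30', 'Medium', 'Yes', 'Excellent', 'Yes'), ('31..40', 'Medium', 'No', 'Excellent', 'Yes'),
    ('31..40', 'High', 'Yes', 'Fair', 'Yes'), ('>40', 'Medium', 'No', 'Excellent', 'No'),
]

def score(sample, label):
    """P(Y|label) * P(label), estimated by frequency counts."""
    in_class = [r for r in rows if r[-1] == label]
    p = len(in_class) / len(rows)       # prior P(label)
    for i, value in enumerate(sample):  # conditional P(x_i | label)
        p *= sum(1 for r in in_class if r[i] == value) / len(in_class)
    return p

y = ('<=30', 'Low', 'Yes', 'Excellent')
print(score(y, 'Yes'))  # ~0.010582
print(score(y, 'No'))   # ~0.005142 -> classified as Yes
```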
Try yourself (Bayesian Classification)

Car No   Color    Type     Origin     Stolen?
1        Red      Sports   Domestic   Yes
2        Red      Sports   Domestic   No
3        Red      Sports   Domestic   Yes
4        Yellow   Sports   Domestic   No
5        Yellow   Sports   Imported   Yes
6        Yellow   SUV      Imported   No
7        Yellow   SUV      Imported   Yes
8        Yellow   SUV      Domestic   No
9        Red      SUV      Imported   No
10       Red      Sports   Imported   Yes

Unseen data: Y = <Red, Domestic, SUV>   (answers: 0.024, 0.072)
Actual data: Y = <Red, Sports, Domestic>   (answers: 0.192, 0.096)
Rule Based Classification
 A rule-based classifier is characterized by rules built from an object's attributes.
 It makes use of a set of IF-THEN rules for
classification.
 We can express a rule in the following form:
• IF condition THEN conclusion
 Let us consider a rule R1:
R1: IF age = youth AND student = yes THEN buys_computer = yes
• The IF part of the rule is called the rule antecedent or precondition.
• The THEN part of the rule is called the rule consequent (conclusion).
• The antecedent consists of one or more attribute tests,
which are logically ANDed.
• The consequent consists of a class prediction.
Rule Based Classification (Cont..)
 We can also write rule R1 as follows:
R1: ((age = youth) ^ (student = yes)) => (buys_computer = yes)
• If the condition (that is, all of the attribute tests) in a rule antecedent holds
true for a given tuple, we say that the rule antecedent is satisfied and that
the rule covers the tuple.
 A rule R can be assessed by its coverage and accuracy.
 Given a class-labeled data set D, let n_covers be the number of tuples
covered by R, n_correct be the number of tuples correctly classified by
R, and |D| be the number of tuples in D.
 We can define the coverage and accuracy of R as

Coverage(R) = n_covers / |D|        Accuracy(R) = n_correct / n_covers
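A sketch of these two measures for rule R1 on a small hypothetical data set (the tuples are illustrative):

```python
# Each tuple is (age, student, buys_computer).
D = [
    ('youth', 'yes', 'yes'), ('youth', 'yes', 'no'),
    ('youth', 'no', 'no'), ('senior', 'yes', 'yes'),
]

covered = [t for t in D if t[0] == 'youth' and t[1] == 'yes']  # antecedent holds
correct = [t for t in covered if t[2] == 'yes']                # consequent also holds

print(len(covered) / len(D))        # Coverage(R1) = 2/4 = 0.5
print(len(correct) / len(covered))  # Accuracy(R1) = 1/2 = 0.5
```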
Neural Network
 A Neural Network is a set of connected INPUT/OUTPUT UNITS, where each
connection has a WEIGHT associated with it.
 Neural Network learning is also called CONNECTIONIST learning due to the
connections between units.
 A Neural Network learns by adjusting its weights so that it can correctly
classify the training data and, after the testing phase, classify unknown data
(see the sketch after this list).
 Strengths of Neural Networks:
• They can handle complex data (i.e., problems with many parameters).
• They can handle noise in the training data.
• Prediction accuracy is generally high.
• Neural Networks are robust, and work well even when training examples contain
errors.
• Neural Networks can handle missing data well.
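A minimal perceptron sketch of weight adjustment, assuming a single output unit learning the AND function (an illustrative example, not a network from the slides):

```python
# Training data for AND: ((x1, x2), target class).
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b, lr = [0.0, 0.0], 0.0, 0.1    # weights, bias, learning rate

for _ in range(20):                # training epochs
    for (x1, x2), target in data:
        out = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
        err = target - out         # adjust the weights by the error
        w[0] += lr * err * x1
        w[1] += lr * err * x2
        b += lr * err

print(w, b)  # the learned weights classify all four examples correctly
```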