TE - DWM Module No 3

MODULE NO : 03

CLASSIFICATION

CLASSIFICATION

• Classification is a form of data analysis that extracts models describing important data classes.
• Such models, called classifiers, predict categorical class labels.
• E.g., we can build a classification model to categorize bank loan applications as either safe or risky.
• A bank loan officer needs to analyze the data to learn which loan applicants are “safe” and which are “risky” before the bank sanctions a loan.
GENERAL APPROACH
• The general approach to classification consists of two steps.
• In the first step, we build a classification model based on previous data.
• In the second step, we determine whether the model’s accuracy is acceptable and, if so, use the model to classify new data.

FIRST STEP

[Figure: building the classifier from training data — omitted]

SECOND STEP

[Figure: using the classifier to classify new data — omitted]
General Approach
The data classification process:
(a) Learning:
• Training data are analyzed by a classification algorithm.
• Here, the class label attribute is loan decision, and the learned model or classifier is represented in the form of classification rules.
(b) Classification:
• Test data are used to estimate the accuracy of the classification
rules.
• If the accuracy is considered acceptable, the rules can be
applied to the classification of new data tuples.

Decision Tree Induction
• Decision tree induction is the learning of decision trees from
class-labeled training tuples.
• A decision tree is a flowchart-like tree structure, where each
internal node (nonleaf node) denotes a test on an attribute
• Each branch represents an outcome of the test, and each leaf
node (or terminal node) holds a class label.
• The topmost node in a tree is the root node.
• A typical decision tree is shown in Figure

[Figure: a typical decision tree — omitted]
DECISION TREE INDUCTION
“How are decision trees used for classification?”
• Given a tuple, X, for which the associated class label is unknown,
the attribute values of the tuple are tested against the decision tree.
• A path is traced from the root to a leaf node, which holds the class
prediction for that tuple.
• Decision trees can easily be converted to classification rules.
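The tracing described above can be sketched in a few lines. The tree below, with attributes age, student, and credit_rating predicting buys_computer, is an illustrative assumption in the style of the typical textbook example, not taken from the slides.

```python
# Minimal sketch of classifying a tuple by tracing a decision tree
# from root to leaf. Internal nodes test an attribute; leaves hold labels.

tree = {
    "attribute": "age",
    "branches": {
        "youth": {"attribute": "student",
                  "branches": {"yes": "buys_computer=yes",
                               "no": "buys_computer=no"}},
        "middle_aged": "buys_computer=yes",
        "senior": {"attribute": "credit_rating",
                   "branches": {"fair": "buys_computer=yes",
                                "excellent": "buys_computer=no"}},
    },
}

def classify(node, tuple_x):
    """Follow the branch matching the tuple's attribute value until a leaf."""
    while isinstance(node, dict):          # internal node: test an attribute
        value = tuple_x[node["attribute"]]
        node = node["branches"][value]
    return node                            # leaf node holds the class label

print(classify(tree, {"age": "youth", "student": "yes"}))  # → buys_computer=yes
```

Each call traces exactly one root-to-leaf path, so classification cost is proportional to the tree's depth, not its size.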

RULE EXTRACTION FROM A DECISION TREE
• Decision tree classifiers are a popular method of classification.
• To extract rules from a decision tree, one rule is created for each
path from the root to a leaf node.
• Each splitting criterion along a given path is logically ANDed to form
the rule antecedent (“IF” part)
• The leaf node holds the class prediction, forming the rule
consequent (“THEN” part)
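The path-to-rule conversion can be sketched as a recursive walk: conditions collected along each path are ANDed into the antecedent, and the leaf supplies the consequent. The small tree and attribute names here are illustrative assumptions.

```python
# One rule per root-to-leaf path: splitting criteria along the path form
# the ANDed antecedent ("IF" part); the leaf label is the consequent ("THEN" part).

tree = {
    "attribute": "student",
    "branches": {
        "yes": "buys_computer=yes",
        "no": {"attribute": "credit_rating",
               "branches": {"fair": "buys_computer=no",
                            "excellent": "buys_computer=yes"}},
    },
}

def extract_rules(node, conditions=()):
    """Recursively walk every root-to-leaf path, collecting split conditions."""
    if not isinstance(node, dict):                     # leaf: emit one rule
        return [f"IF {' AND '.join(conditions)} THEN {node}"]
    rules = []
    for value, child in node["branches"].items():
        cond = f"{node['attribute']} = {value}"
        rules.extend(extract_rules(child, conditions + (cond,)))
    return rules

for rule in extract_rules(tree):
    print(rule)
```

For this tree the walk yields three rules, one per leaf, e.g. `IF student = no AND credit_rating = fair THEN buys_computer=no`.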

RULE EXTRACTION FROM A DECISION TREE

[Worked example of rule extraction — figures omitted]
BAYES CLASSIFICATION METHOD: NAIVE BAYES CLASSIFICATION
• Bayesian classifiers are statistical classifiers.
• They can predict class membership probabilities such as the
probability that a given tuple belongs to a particular class.
• A simple Bayesian classifier, known as the naive Bayesian classifier, has been found to be comparable in performance with decision tree classifiers.
• Naive Bayesian classifiers assume that the effect of an attribute
value on a given class is independent of the values of the other
attributes. This assumption is called class conditional independence.
• It is made to simplify the computations involved and, in this sense,
is considered “naive.”
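Class conditional independence is what makes the computation simple: P(X|C) is approximated as the product of per-attribute probabilities P(x_i|C). A minimal sketch on hypothetical weather/play data (the data and attribute names are assumptions for illustration):

```python
from collections import Counter, defaultdict

# Minimal categorical naive Bayes sketch. Each training tuple is a
# (attribute-dict, class-label) pair; the toy data are hypothetical.
train = [
    ({"weather": "sunny", "wind": "strong"}, "no"),
    ({"weather": "sunny", "wind": "weak"}, "no"),
    ({"weather": "rain", "wind": "weak"}, "yes"),
    ({"weather": "overcast", "wind": "weak"}, "yes"),
]

priors = Counter(label for _, label in train)      # class counts
cond = defaultdict(Counter)                        # (class, attr) -> value counts
for x, label in train:
    for attr, value in x.items():
        cond[(label, attr)][value] += 1

def predict(x):
    """Pick the class maximizing P(C) * prod_i P(x_i | C)."""
    best, best_score = None, -1.0
    for label, count in priors.items():
        score = count / len(train)                 # prior P(C)
        for attr, value in x.items():
            score *= cond[(label, attr)][value] / count   # P(x_i | C)
        if score > best_score:
            best, best_score = label, score
    return best

print(predict({"weather": "sunny", "wind": "weak"}))   # → no
```

A production version would add Laplace smoothing so that a single unseen attribute value does not zero out the whole product.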

PREDICTING A CLASS LABEL USING NAIVE BAYESIAN CLASSIFICATION

[Worked example — figures omitted]
MODEL EVALUATION AND SELECTION
• Now that we know what classification is and how classifiers work, we can build a classification model.
• For example, suppose you used previous sales data to build a classifier to predict customer purchasing behaviour.
• In this example, we would like to analyse how well our model can predict the purchasing behaviour of future customers (data on which the classifier has not been trained).
• We may build different classifiers and compare their accuracy/performance by applying various evaluation metrics.
• Before we discuss the various evaluation metrics, we need to understand some basic terminology.

MODEL EVALUATION AND SELECTION
• MODEL: a model is created by applying an algorithm (or statistical calculations) to data to generate predictions/classifications for new data.
• The given data set is partitioned into subsets:
• Training data set
• Testing data set
• Training data set: the training data set is used to derive, or train, the model.
• Testing data set: the model’s accuracy is estimated by using the testing data set.

MODEL EVALUATION AND SELECTION
• Positive tuples: tuples of the class of interest (in our last example, the positive tuples are those with buys_computer = yes).
• Negative tuples: tuples of the other class (in our last example, the negative tuples are those with buys_computer = no).
• Suppose we use our classifier on a test set of labeled tuples.
• P is the number of positive tuples and N is the number of negative
tuples.
• For each tuple, we compare the classifier’s class attribute prediction
with the tuple’s known class attribute value.

MODEL EVALUATION AND SELECTION
There are four additional terms we need to know:
• True positives (TP): These refer to the positive tuples that were correctly
labeled by the classifier. Let TP be the number of true positives.
• True negatives (TN): These are the negative tuples that were correctly
labeled by the classifier. Let TN be the number of true negatives.
• False positives (FP) Type I Error: These are the negative tuples that were
incorrectly labeled as positive (e.g., tuples of class buys_computer=no
for which the classifier predicted buys_computer=yes). Let FP be the
number of false positives.
• False negatives (FN) Type II Error: These are the positive tuples that were
mislabeled as negative (e.g., tuples of class buys_computer=yes for which
the classifier predicted buys_computer=no). Let FN be the number of
false negatives.

MODEL EVALUATION AND SELECTION
The four terms are summarized in the confusion matrix layout:

                 Predicted yes   Predicted no
Actual yes           TP              FN
Actual no            FP              TN
CONFUSION MATRIX
• The confusion matrix is a useful tool for analyzing how well your classifier can recognize tuples of different classes.
• TP and TN tell us when the classifier is getting things right, while FP and FN tell us when the classifier is getting things wrong.

CONFUSION MATRIX
• E.g. suppose in a data set of customers who buy a computer there are 10,000 tuples in total, of which 7000 are positive and 3000 are negative, and our model has correctly predicted 6954 of the positives and 2588 of the negatives; the confusion matrix will be

                 Predicted yes   Predicted no   Total
Actual yes           6954             46        7000
Actual no             412           2588        3000
Total                7366           2634       10000
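The four cells follow directly from the counts in the example, assuming (as in the standard textbook version of this exercise) that the 6954 and 2588 figures are the correctly labeled positives and negatives:

```python
# Deriving the full confusion matrix from the example's counts.
P, N = 7000, 3000          # actual positives and negatives
TP, TN = 6954, 2588        # correctly labeled tuples (assumed interpretation)
FN = P - TP                # positives mislabeled as negative → 46
FP = N - TN                # negatives mislabeled as positive → 412

matrix = [["", "Predicted yes", "Predicted no", "Total"],
          ["Actual yes", TP, FN, P],
          ["Actual no", FP, TN, N],
          ["Total", TP + FP, FN + TN, P + N]]

for row in matrix:
    print("{:<12}{:>14}{:>14}{:>8}".format(*row))
```

Row totals recover P and N, and column totals give how many tuples the classifier labeled positive (TP + FP) and negative (FN + TN).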

CLASSIFIERS PERFORMANCE EVALUATION MEASURES

[Table of evaluation measure formulas — figure omitted]

CLASSIFIERS PERFORMANCE EVALUATION MEASURES
• Find all evaluation measures for the following confusion matrix.

[Confusion matrix figure omitted]
CONFUSION MATRIX
• E.g. suppose in a cancer data set there are 10,000 tuples in total, of which 300 are positive and 9700 are negative, and our model has correctly predicted 90 of the positives and 9560 of the negatives; prepare the confusion matrix and find all evaluation measures for it.
Evaluation measures for the confusion matrix

1. Accuracy: proportion of tuples correctly classified, (TP + TN) / (P + N)
2. Error rate: proportion of tuples misclassified, (FP + FN) / (P + N)
3. Sensitivity: ability to correctly label the positives as positive, TP / P
4. Specificity: ability to correctly label the negatives as negative, TN / N
5. Precision: percentage of tuples labelled positive that are actually positive, TP / (TP + FP)
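Applying these measures to the cancer example (again assuming the 90 and 9560 figures are the correctly predicted positives and negatives) shows why accuracy alone can mislead:

```python
# Evaluation measures for the cancer example.
P, N = 300, 9700
TP, TN = 90, 9560                 # assumed to be the *correct* predictions
FN, FP = P - TP, N - TN           # 210 and 140

accuracy    = (TP + TN) / (P + N)
error_rate  = (FP + FN) / (P + N)
sensitivity = TP / P              # how many actual cancer cases were found
specificity = TN / N
precision   = TP / (TP + FP)

print(f"accuracy={accuracy:.4f} sensitivity={sensitivity:.2f} "
      f"specificity={specificity:.4f} precision={precision:.4f}")
```

Accuracy is 96.5%, yet sensitivity is only 30%: with heavily imbalanced classes, a classifier can look accurate while missing most positive tuples, which is exactly why the other measures matter.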
MODEL EVALUATION AND SELECTION METHODS
1. Holdout
2. Random sampling
3. Cross validation
4. Bootstrap
5. ROC Curves (Receiver operating characteristic curves)

HOLDOUT
• In this method, the given data are randomly partitioned into two independent sets, a training set and a test set.
• Typically, two-thirds of the data are allocated to the training set, and
the remaining one-third is allocated to the test set.
• The training set is used to derive the model. The model’s accuracy is
then estimated with the test set.
• The estimate is pessimistic (it tends to underestimate accuracy) because only a portion of the initial data is used to derive the model.
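The two-thirds/one-third partition can be sketched as a shuffle followed by a slice; the data set here is a hypothetical list of (tuple, label) pairs.

```python
import random

# Holdout sketch: random two-thirds training / one-third test partition.
data = [({"id": i}, "yes" if i % 2 else "no") for i in range(30)]

random.seed(42)                      # for a reproducible partition
shuffled = data[:]                   # copy, so the original order is kept
random.shuffle(shuffled)

split = (2 * len(shuffled)) // 3     # two-thirds boundary
train_set = shuffled[:split]         # used to derive the model
test_set = shuffled[split:]          # used to estimate the model's accuracy

print(len(train_set), len(test_set))  # → 20 10
```

Because the two sets are disjoint, the accuracy measured on `test_set` reflects performance on data the model has not seen during training.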

RANDOM SUBSAMPLING
• Random subsampling is a variation of the holdout method in which the holdout method is repeated k times.
• The overall accuracy estimate is taken as the average of the
accuracies obtained from each iteration.

CROSS-VALIDATION
• In k-fold cross-validation, the initial data are randomly partitioned into k mutually exclusive subsets or “folds,” D1, D2, ..., Dk, each of approximately equal size.
• Training and testing are performed k times. In iteration i, partition Di is reserved as the test set, and the remaining partitions are collectively used to train the model.
• That is, in the first iteration, subsets D2, ..., Dk collectively serve as the training set to obtain the first model, which is tested on D1.
• The second iteration is trained on subsets D1, D3, ..., Dk and tested on D2, and so on.
• Each fold is used the same number of times for training and exactly once for testing.
• The accuracy estimate is the overall number of correct classifications from the k iterations, divided by the total number of tuples in the initial data.
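The fold rotation can be sketched as a generator that yields one (train, test) pair per iteration; the integer "tuples" stand in for real records.

```python
# k-fold cross-validation sketch: fold Di is the test set in iteration i,
# and the remaining folds together form the training set.

def k_fold_splits(data, k):
    """Yield (train, test) pairs, one per fold."""
    folds = [data[i::k] for i in range(k)]          # k disjoint subsets
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

data = list(range(10))
for train, test in k_fold_splits(data, k=5):
    print(len(train), len(test))                    # → 8 2 in each iteration
```

Every tuple appears in exactly one test fold, so summing the correct classifications over all k iterations and dividing by the total number of tuples gives the overall accuracy estimate described above. (For randomly partitioned folds, shuffle `data` first.)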

BOOTSTRAP
• The bootstrap randomly selects a tuple from the original data set.
• That tuple is added to the training data set and then returned to the original data set, so sampling is done with replacement.
• This process is repeated N times, where N is the total number of tuples in the original data set.
• The bootstrap is therefore allowed to select the same tuple more than once; tuples that are never selected form the test data set.
• We use the training data set to train the model and the test data set to obtain an accuracy estimate of the model.
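The sampling-with-replacement step can be sketched directly; the integer "tuples" are hypothetical stand-ins for real records.

```python
import random

# Bootstrap sketch: draw N tuples *with replacement* from a data set of
# size N to form the training set; tuples never drawn form the test set.
random.seed(0)
data = list(range(100))                               # hypothetical tuple ids
N = len(data)

train_set = [random.choice(data) for _ in range(N)]   # may repeat tuples
chosen = set(train_set)
test_set = [x for x in data if x not in chosen]       # never-selected tuples

print(len(chosen), len(test_set))
```

On average about 63.2% of the original tuples end up in the training sample (the basis of the .632 bootstrap), with the remaining roughly 36.8% available for testing.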
