Classification Model, Features and Decision Region

Classification is a machine learning technique used to predict group membership for new data based on a training set. It involves extracting features from data, training a classification model using a machine learning algorithm like decision trees or neural networks, and then using the trained model to classify new unlabeled data. Model performance is evaluated using metrics like accuracy, precision, recall from a confusion matrix or ROC curves, which plot the true positive rate against the false positive rate.

Uploaded by

rashi

WHAT IS CLASSIFICATION

• Classification is the process of constructing a model from a training set and then using that model to classify new data instances.
• For example, classify:
  • Countries based on climate
  • Cars based on gas mileage
  • Customers for credit approval
  • Card fraud detection, etc.
EXAMPLE-STEGANALYSIS PROCESS
CLASSIFICATION IN MATLAB

Popular methods for classification:

• Decision trees
• Rule learners
• Naive Bayes
• Decision tables
• Support vector machines (SVMs)
• Artificial neural networks (ANN)
TRAINING SET
FEATURE EXTRACTION
ANN CLASSIFIER
TEST SET – CLASSIFYING NEW INSTANCES
Training
(1) Explore your data,
(2) select features,
(3) train models, and
(4) assess results.
Testing
(1) Prepare the dataset,
(2) select features,
(3) give the features to the trained classifier (the classifier trained in the Training phase), and
(4) assess results.
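The two phases above can be sketched in a few lines of Python. This is a minimal illustration only: it swaps in a nearest-centroid classifier (not the ANN or any MATLAB tool mentioned in these slides), and the data, the function names train_nearest_centroid and classify, and the two-feature layout are all hypothetical.

```python
from collections import defaultdict

def train_nearest_centroid(features, labels):
    """Training phase: compute one mean feature vector (centroid) per class."""
    sums = {}
    counts = defaultdict(int)
    for x, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}

def classify(model, x):
    """Testing phase: assign x to the class whose centroid is closest."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda y: sq_dist(model[y]))

# Hypothetical toy data: two already-extracted features per instance.
train_X = [[1.0, 1.0], [2.0, 1.5], [6.0, 6.5], [7.0, 6.0]]
train_y = ["yes", "yes", "no", "no"]
test_X = [[1.5, 1.2], [6.5, 6.2]]
test_y = ["yes", "no"]

# Training: fit the model on the labeled training set.
model = train_nearest_centroid(train_X, train_y)

# Testing: give the test features to the trained classifier and assess results.
predictions = [classify(model, x) for x in test_X]
correct = sum(p == t for p, t in zip(predictions, test_y))
print(predictions, correct / len(test_y))
```

The same split applies to any of the methods listed earlier: only the training function and the classification rule change, while the test set stays unseen until the assessment step.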
HOW TO MEASURE PERFORMANCE OF CLASSIFIER

(1) CONFUSION MATRIX

                        Predicted class
                        A (yes)     B (no)
Actual class  A (yes)   74 (TP)     64 (FN)
              B (no)    30 (FP)     132 (TN)

• Correctly classified instances: 206
• Incorrectly classified instances: 94
• Accuracy = (TP+TN)/(TP+TN+FN+FP) = 206/300 = 68.6667%
• Error rate = (FN+FP)/(TP+TN+FN+FP) = 94/300 = 31.3333%
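As a quick check of the arithmetic above, the accuracy and error rate can be recomputed directly from the four cells of the confusion matrix. The variable names below are illustrative, not part of the original slides.

```python
# Cells of the confusion matrix from the slide (class "yes" is positive).
tp, fn, fp, tn = 74, 64, 30, 132

total = tp + fn + fp + tn            # all classified instances
accuracy = (tp + tn) / total         # correctly classified / total
error_rate = (fn + fp) / total       # incorrectly classified / total

print(f"total      = {total}")           # 300
print(f"accuracy   = {accuracy:.4%}")    # 68.6667%
print(f"error rate = {error_rate:.4%}")  # 31.3333%
```

Accuracy and error rate always sum to 1, so only one of them needs to be reported.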
PERFORMANCE EVALUATION CLASS LABEL-YES
• Precision: proportion of the cases predicted as positive that are actually positive.
P = TP/(TP+FP) = 74/104 = 0.71
• Recall or true positive rate (TPR): proportion of positive cases that are correctly identified.
TPR = TP/(TP+FN) = 74/138 = 0.536
• False positive rate (FPR): proportion of negative cases that were incorrectly classified as positive.
FPR = FP/(FP+TN) = 30/162 = 0.185
• F-measure: a combined measure of precision and recall (their harmonic mean).
F = 2*Precision*Recall/(Precision+Recall)
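PERFORMANCE EVALUATION CLASS LABEL-NO
Here class B (no) is treated as the positive class, so the cells of the confusion matrix swap roles: TN counts the correctly identified "no" cases, and FN counts the "yes" cases wrongly predicted as "no".
• Precision: proportion of the cases predicted as "no" that are actually "no".
P = TN/(TN+FN) = 132/(132+64) = 0.67
• Recall or true positive rate (TPR): proportion of actual "no" cases that are correctly identified.
TPR = TN/(TN+FP) = 132/162 = 0.81
• False positive rate (FPR): proportion of actual "yes" cases that were incorrectly classified as "no".
FPR = FN/(TP+FN) = 64/138 = 0.46
• F-measure: a combined measure of precision and recall (their harmonic mean).
F = 2*Precision*Recall/(Precision+Recall)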
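The per-class figures can be reproduced with one small helper, sketched below. The function name class_metrics is hypothetical. For class "no" the same formulas are used with the roles of the cells swapped, which is what the TN-based formulas on the slide express.

```python
def class_metrics(tp, fn, fp, tn):
    """Precision, recall (TPR), FPR and F-measure for the class counted as positive."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    fpr = fp / (fp + tn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, fpr, f_measure

# Class "yes" as the positive class: cells taken directly from the matrix.
yes = class_metrics(tp=74, fn=64, fp=30, tn=132)

# Class "no" as the positive class: TP<->TN and FP<->FN swap roles.
no = class_metrics(tp=132, fn=30, fp=64, tn=74)

for name, values in (("yes", yes), ("no", no)):
    print(name, [round(v, 3) for v in values])
```

With these counts the F-measure, which the slides leave unevaluated, comes out at about 0.61 for class "yes" and about 0.74 for class "no".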
(2) ROC - RECEIVER OPERATING CHARACTERISTIC
ROC graphs are a way to examine the performance of classifiers. A ROC graph is a plot with the false positive rate on the X axis and the true positive rate on the Y axis.
[Figure: an example of an ROC graph with two ROC curves labeled C1 and C2, and two ROC points labeled P1 and P2.]
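One common way to obtain the points of an ROC curve is to sweep a decision threshold over the classifier's scores and record the (FPR, TPR) pair at each threshold. The sketch below assumes the classifier outputs a numeric score per instance. The scores, labels and the function name roc_points are made up for illustration and are not taken from the figure.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs from sweeping a threshold over the scores.

    labels use 1 for the positive class and 0 for the negative class;
    an instance is predicted positive when its score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

# Hypothetical classifier scores and true labels.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

A classifier that outputs only a class label, with no score, yields a single (FPR, TPR) point rather than a curve, which is one way to read the difference between curves and isolated points on an ROC graph.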
CLASSIFICATION PROBLEM WITH OVERLAP
[Figure: scatter plot with FEATURE 1 on the X axis (0 to 8) and FEATURE 2 on the Y axis.]
DECISION BOUNDARIES
[Figure: plot with FEATURE 1 on the X axis (0 to 8) and FEATURE 2 on the Y axis (0 to 8), showing a decision boundary that separates Decision Region 1 from Decision Region 2.]
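A decision boundary splits the feature space into regions, and a new instance is assigned the class of the region it falls in. The sketch below assumes a straight-line boundary w1*x1 + w2*x2 + b = 0 with hypothetical coefficients. The actual boundary in the figure is not recoverable from the slide, so the numbers here are illustrative only.

```python
def decision_region(feature1, feature2, w1=1.0, w2=1.0, b=-8.0):
    """Return 1 or 2 depending on the side of the line w1*x1 + w2*x2 + b = 0.

    The default coefficients are hypothetical: they describe the line
    feature1 + feature2 = 8, not the boundary drawn in the original figure.
    """
    score = w1 * feature1 + w2 * feature2 + b
    return 1 if score > 0 else 2

# Two hypothetical instances on opposite sides of the boundary.
print(decision_region(6.0, 6.0))  # above the line: region 1
print(decision_region(1.0, 1.0))  # below the line: region 2
```

When the classes overlap, as in the previous figure, no boundary separates them perfectly, and the instances that fall in the wrong region are the FP and FN counts of the confusion matrix.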
