
PRESIDENCY UNIVERSITY

Bengaluru

Module 2
Supervised machine learning algorithms Part 2
– Classification models
Agenda
• Classification models
• Decision Tree algorithms using Entropy and Gini Index as measures of node impurity
• Model evaluation metrics for classification algorithms
• Cohen's Kappa Statistic
• Multi-class classification
• The class imbalance problem
• Naïve Bayes classifiers
• Naïve Bayes model for sentiment classification – an introduction

Artificial Intelligence
Classification Models

What is a classification algorithm?
• A classification algorithm is a Supervised Learning technique in which a program learns from a given dataset of observations and then assigns new observations to one of a number of classes or groups, such as Yes or No, 0 or 1, Spam or Not Spam, cat or dog, etc.
• Classes can also be called targets, labels, or categories.
• Unlike regression, the output variable of classification is a category, not a numeric value, such as "Green or Blue" or "fruit or animal".
• Since a classification algorithm is a supervised learning technique, it takes labeled input data, i.e., each input comes with its corresponding output.
• A classification algorithm learns a discrete-valued function that maps the input variable (x) to a categorical output (y), i.e.,
y = f(x), where y = categorical output
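The mapping y = f(x) can be made concrete with a toy sketch: a classifier is just a function that returns a label rather than a number. The rule, word list, and threshold below are invented purely for illustration, not a real spam filter:

```python
# A classifier is a discrete-valued function y = f(x): it maps an input
# to a category label rather than a continuous value. This rule is a toy
# illustration (the word list and threshold are made up).
def classify_email(word_counts):
    suspicious = {"free", "winner", "prize"}
    score = sum(n for word, n in word_counts.items() if word in suspicious)
    return "Spam" if score >= 2 else "Not Spam"

print(classify_email({"free": 1, "prize": 2}))      # Spam
print(classify_email({"meeting": 3, "agenda": 1}))  # Not Spam
```

Real classifiers learn such a function from labeled data instead of hard-coding it, but the input/output shape is the same.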

Types of classification

• The algorithm that implements classification on a dataset is known as a classifier.
• There are two types of classification:
 Binary classifier: If the classification problem has only two possible outcomes, it is called a binary classifier. Examples: YES or NO, MALE or FEMALE, SPAM or NOT SPAM, CAT or DOG, etc.
 Multi-class classifier: If a classification problem has more than two outcomes, it is called a multi-class classifier. Examples: classification of types of crops, classification of types of music.

Learners in classification problems
1. Lazy Learners
• Store the training dataset and wait until they receive the test dataset.
• Classification is then done on the basis of the most closely related data stored in the training dataset.
• They take less time in training but more time for predictions.
• Examples: K-NN algorithm, case-based reasoning.

2. Eager Learners
• Eager learners develop a classification model from the training dataset before receiving a test dataset.
• Unlike lazy learners, eager learners take more time in learning and less time in prediction.
• Examples: Decision Trees, Naïve Bayes, ANN.
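The lazy-learner behavior of K-NN can be sketched with a minimal 1-nearest-neighbour classifier: "training" only stores the data, and all the distance computation happens at prediction time. This is a simplified illustration, not a full K-NN implementation:

```python
import math

class OneNearestNeighbour:
    """Lazy learner: fit() just stores the training data;
    the work happens at prediction time."""
    def fit(self, X, y):
        self.X, self.y = X, y          # no model is built here
    def predict(self, x):
        # compare the query point against every stored training point
        dists = [math.dist(x, xi) for xi in self.X]
        return self.y[dists.index(min(dists))]

knn = OneNearestNeighbour()
knn.fit([(0, 0), (0, 1), (5, 5), (6, 5)], ["A", "A", "B", "B"])
print(knn.predict((0.2, 0.4)))  # A
print(knn.predict((5.5, 5.1)))  # B
```

An eager learner such as a decision tree would instead do its work inside fit(), building a model once and making each later prediction cheap.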

Types of classification algorithms

• Linear Models
• Logistic Regression
• Support Vector Machines

• Non-linear Models
• K-Nearest Neighbours
• Kernel SVM
• Naïve Bayes
• Decision Tree Classification
• Random Forest Classification

Methods for Evaluating a Classification Model

Log Loss or Cross-Entropy Loss

• It is used to evaluate the performance of a classifier whose output is a probability value between 0 and 1.
• For a good binary classification model, the log loss should be near 0.
• The log loss increases as the predicted probability deviates from the actual label.
• A lower log loss represents a more accurate model.
• For binary classification, the cross-entropy for a single example can be calculated as
  -(y log(p) + (1 - y) log(1 - p))
• where y = actual output (0 or 1) and p = predicted probability of the positive class.
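The per-example formula above, averaged over a dataset, can be computed directly. A minimal sketch; the clipping constant eps is a common practical safeguard against log(0), not part of the slide's formula:

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Average binary cross-entropy: -(y*log(p) + (1-y)*log(1-p))."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip so log(0) never occurs
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

# confident, correct predictions give a loss near 0
print(round(log_loss([1, 0, 1], [0.9, 0.1, 0.8]), 4))  # 0.1446
# a confident wrong prediction drives the loss up sharply
print(round(log_loss([1], [0.01]), 4))                 # 4.6052
```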

Confusion Matrix
• The confusion matrix provides a matrix/table as output and describes the performance of the model.
• It is also known as the error matrix.
• The matrix summarizes the prediction results, giving the total numbers of correct and incorrect predictions, as in the table below:

                     Actual Positive    Actual Negative
Predicted Positive   True Positive      False Positive
Predicted Negative   False Negative     True Negative
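The four cells of the table can be counted directly from paired lists of actual and predicted labels. A minimal sketch, assuming labels are coded as 1 = positive and 0 = negative:

```python
def confusion_matrix(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels 1 (positive) / 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(confusion_matrix(y_true, y_pred))  # (2, 1, 1, 2)
```

Metrics such as accuracy, precision, and recall are all simple ratios of these four counts.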

AUC-ROC Curve

• ROC stands for Receiver Operating Characteristic curve, and AUC stands for Area Under the Curve.
• It is a graph that shows the performance of a classification model at different thresholds.
• The AUC-ROC curve is used to visualize the performance of a binary classification model (and, via one-vs-rest averaging, a multi-class model).
• The ROC curve plots TPR (True Positive Rate) on the Y-axis against FPR (False Positive Rate) on the X-axis.
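Each point on the ROC curve corresponds to one decision threshold. The sketch below shows how TPR and FPR are computed at a given threshold; the scores and labels are made-up illustration data:

```python
def tpr_fpr(y_true, scores, threshold):
    """TPR and FPR when predicting positive for score >= threshold."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)

y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
# lowering the threshold raises both TPR and FPR; the ROC curve
# traces this trade-off, and AUC is the area beneath it
for t in (0.2, 0.5, 0.8):
    print(t, tpr_fpr(y_true, scores, t))
```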

Use cases of classification algorithms

• Email spam detection
• Speech recognition
• Identification of cancer tumor cells
• Drug classification
• Biometric identification, etc.

Decision Tree algorithms using
Entropy and Gini Index

What is a decision tree?
• A supervised learning technique that can be used for both classification and regression problems, but is mostly preferred for solving classification problems.
• Contains two kinds of nodes: decision nodes and leaf nodes.
• Decision nodes are used to make decisions and have multiple branches, whereas leaf nodes are the outputs of those decisions and do not contain any further branches.
• It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents the outcome.
• The decisions or tests are performed on the basis of the features of the given dataset.
• It is a graphical representation for obtaining all the possible solutions to a problem/decision based on given conditions.

CONTD…
• To build a tree, we can use the CART algorithm, which stands for Classification and Regression Tree.
• A decision tree simply asks a question and, based on the answer (Yes/No), further splits the tree into subtrees.

Significance of decision tree

• Decision trees mimic human thinking while making a decision, so they are easy to understand.
• The logic behind a decision tree is easy to follow because it shows a tree-like structure.

Decision Tree Terminologies
 Root Node: The node from which the decision tree starts. It represents the entire dataset, which further gets divided into two or more homogeneous sets.

 Leaf Node: A final output node; the tree cannot be segregated further after a leaf node.

 Splitting: The process of dividing a decision node/root node into sub-nodes according to the given conditions.

 Branch/Sub-tree: A tree formed by splitting the tree.

 Pruning: The process of removing unwanted branches from the tree.

 Parent & Child Node: The root node of the tree is called the parent node, and the other nodes are called child nodes.

How the decision tree algorithm works

• Step 1: Begin the tree with the root node, say S, which contains the complete dataset.
• Step 2: Find the best attribute in the dataset using an Attribute Selection Measure (ASM).
• Step 3: Divide S into subsets that contain the possible values of the best attribute.
• Step 4: Generate the decision tree node that contains the best attribute.
• Step 5: Recursively make new decision trees using the subsets of the dataset created in Step 3. Continue this process until a stage is reached where you cannot classify the nodes further; each such final node is a leaf node.
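The steps above can be sketched as a recursive procedure. The snippet below is a simplified ID3-style builder for categorical attributes, using entropy-based information gain as the ASM; the dataset and attribute names are invented for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Step 2: ASM -- pick the attribute with the highest information gain."""
    def gain(attr):
        g = entropy(labels)
        for value in {row[attr] for row in rows}:
            subset = [l for row, l in zip(rows, labels) if row[attr] == value]
            g -= len(subset) / len(labels) * entropy(subset)
        return g
    return max(attributes, key=gain)

def build_tree(rows, labels, attributes):
    if len(set(labels)) == 1:       # Step 5 stop: a pure node becomes a leaf
        return labels[0]
    if not attributes:              # no attributes left: majority-vote leaf
        return Counter(labels).most_common(1)[0][0]
    attr = best_attribute(rows, labels, attributes)    # Step 2
    node = {}
    for value in {row[attr] for row in rows}:          # Step 3: split into subsets
        pairs = [(r, l) for r, l in zip(rows, labels) if r[attr] == value]
        sub_rows = [r for r, _ in pairs]
        sub_labels = [l for _, l in pairs]
        rest = [a for a in attributes if a != attr]
        node[(attr, value)] = build_tree(sub_rows, sub_labels, rest)  # Step 5
    return node                                        # Step 4: the decision node

rows = [{"outlook": "sunny", "humidity": "high"},
        {"outlook": "sunny", "humidity": "normal"},
        {"outlook": "rain",  "humidity": "high"},
        {"outlook": "rain",  "humidity": "normal"}]
labels = ["no", "yes", "yes", "yes"]
tree = build_tree(rows, labels, ["outlook", "humidity"])
print(tree)
```

Production CART implementations differ in detail (binary splits, Gini index, pruning), but the recursive select-split-recurse shape is the same.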

Illustrative Example

Attribute Selection Measures: Entropy
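The formula on this slide is the standard entropy measure used in decision-tree learning: Entropy(S) = -Σi pi log2(pi), where pi is the proportion of examples in S belonging to class i. It is 0 for a pure node and 1 for a 50/50 binary split. A direct computation:

```python
import math
from collections import Counter

def entropy(labels):
    """Entropy(S) = -sum(p_i * log2(p_i)) over the classes in S."""
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

print(entropy(["yes", "yes", "yes", "yes"]))   # 0.0 -- a pure node
print(entropy(["yes", "yes", "no", "no"]))     # 1.0 -- maximum impurity
```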

Attribute Selection Measures: Information Gain

• Information gain measures the change in entropy after a dataset is segmented on an attribute.
• It calculates how much information a feature provides about a class.
• According to the value of information gain, we split the node and build the decision tree.
• A decision tree algorithm always tries to maximize the information gain, and the node/attribute with the highest information gain is split first. It can be calculated using the formula below:

Information Gain = Entropy(S) - [(Weighted Avg) * Entropy(each feature)]
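The formula can be applied directly: compute the parent entropy, then subtract the size-weighted entropies of the subsets produced by splitting on the feature. The tiny weather-style dataset below is invented for illustration:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy(S) minus the weighted average entropy of each subset S_v."""
    gain = entropy(labels)
    for v in set(feature_values):
        subset = [l for f, l in zip(feature_values, labels) if f == v]
        gain -= len(subset) / len(labels) * entropy(subset)
    return gain

labels  = ["no", "no", "yes", "yes"]
outlook = ["sunny", "sunny", "rain", "rain"]   # separates the classes perfectly
windy   = ["yes", "no", "yes", "no"]           # tells us nothing about the class
print(round(information_gain(outlook, labels), 3))  # 1.0
print(round(information_gain(windy, labels), 3))    # 0.0
```

A perfectly informative feature recovers all of the parent's entropy (gain 1.0 here), while an uninformative one recovers none, which is why the tree splits on the highest-gain attribute first.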

Attribute Selection Measures: Gini Index

• The Gini index is a measure of impurity or purity used while creating a decision tree in the CART (Classification and Regression Tree) algorithm.
• An attribute with a low Gini index should be preferred over one with a high Gini index.
• CART creates only binary splits, and it uses the Gini index to create them.
• The Gini index can be calculated using the formula below:

Gini Index = 1 - Σj (Pj)²
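The Gini formula is cheap to evaluate because, unlike entropy, it needs no logarithm. A minimal computation:

```python
from collections import Counter

def gini_index(labels):
    """Gini = 1 - sum(p_j^2) over the classes j in the node."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

print(gini_index(["yes"] * 4))                  # 0.0 -- pure node
print(gini_index(["yes", "yes", "no", "no"]))   # 0.5 -- worst case for 2 classes
```

Like entropy, it is 0 for a pure node; its maximum for a binary node is 0.5 rather than 1.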

Numerical example –
Decision Tree (Entropy, Gini Impurity & Information
Gain)

Computation time is reduced with the Gini impurity because it does not use a logarithmic function.

Advantages & Disadvantages of the Decision Tree

Advantages of the Decision Tree

• It is simple to understand, as it follows the same process a human follows while making a decision in real life.
• It can be very useful for solving decision-related problems and generating possible outcomes for a problem.
• It requires less data cleaning compared to other algorithms.

Disadvantages of the Decision Tree

• A decision tree can contain many layers, which makes it complex.
• It has an overfitting problem, which can be resolved using the Random Forest algorithm.
• With more class labels, the computational complexity of the decision tree may increase.

