AAM Unit 2


Unit 2: Supervised Learning: Naive Bayes, Decision Tree

Naive Bayes Classification


 Supervised Learning Algorithm

 Bayes: based on applying Bayes’ Theorem

 Naïve: assumes that all the variables (features) used in the algorithm are independent of one another

Bayes’ Theorem

P(A | B) = P(B | A) · P(A) / P(B)

P(A | B): the conditional probability of event A occurring, given that B is true.
P(B | A): the conditional probability of event B occurring, given that A is true.
P(A) and P(B): the probabilities of A and B occurring independently of one another.
Example 1 of Bayes’ Theorem
Three bags contain 6 red, 4 black; 4 red, 6 black; and 5 red, 5 black balls
respectively. One of the bags is selected at random and a ball is drawn from it. If the
ball drawn is red, find the probability that it was drawn from the first bag.

Solution:
Let E1, E2, E3, and A be the events defined as follows:
E1 = bag first is chosen, E2 = bag second is chosen, E3 = bag third is chosen,
A = ball drawn is red

Each bag is equally likely to be chosen:
P(E1) = P(E2) = P(E3) = 1/3

Step 1: Probability of drawing a red ball from each bag:
P(A | E1) = 6/10, P(A | E2) = 4/10, P(A | E3) = 5/10

Step 2: Find P(A) (total probability of drawing red):
P(A) = P(E1)·P(A | E1) + P(E2)·P(A | E2) + P(E3)·P(A | E3)
     = (1/3)(6/10) + (1/3)(4/10) + (1/3)(5/10) = 15/30 = 1/2

Step 3: Apply Bayes’ Theorem:
P(E1 | A) = P(E1)·P(A | E1) / P(A) = (1/3)(6/10) / (1/2) = 2/5

Thus, the probability that the red ball was drawn from the first bag is 2/5, i.e. 40%.
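The arithmetic above can be checked with a short Python snippet (a minimal sketch added here for illustration; plain Python, no libraries):

# Minimal check of Example 1: three bags, equally likely, with the given red-ball counts.
p_bag = [1/3, 1/3, 1/3]                 # P(E1), P(E2), P(E3): each bag equally likely
p_red_given_bag = [6/10, 4/10, 5/10]    # P(A | Ei): red balls / total balls in each bag

# Total probability of drawing a red ball
p_red = sum(pb * pr for pb, pr in zip(p_bag, p_red_given_bag))   # = 0.5

# Bayes' Theorem: P(E1 | A) = P(A | E1) * P(E1) / P(A)
p_bag1_given_red = p_red_given_bag[0] * p_bag[0] / p_red
print(p_bag1_given_red)                 # 0.4, i.e. 2/5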
Example 2 of Bayes’ Theorem
Amy has two bags. Bag I has 7 red and 4 blue balls, and Bag II has 5 red and 9 blue
balls. Amy draws a ball at random and it turns out to be red. Determine the
probability that the ball was drawn from Bag I.

Solution:

Define the events:

X = Ball is from Bag I, Y = Ball is from Bag II, A = Ball drawn is red

Each bag is equally likely to be chosen, so the probability of picking either bag is 1/2:
P(X) = P(Y) = 1/2

Step 1: Find the probability of drawing a red ball from each bag:
P(A | X) = 7/11, P(A | Y) = 5/14

Step 2: Find P(A) (total probability of drawing a red ball):
P(A) = P(X)·P(A | X) + P(Y)·P(A | Y) = (1/2)(7/11) + (1/2)(5/14) = 7/22 + 5/28

Converting to a common denominator (308):
P(A) = 98/308 + 55/308 = 153/308

Step 3: Apply Bayes’ Theorem:
P(X | A) = P(X)·P(A | X) / P(A) = (7/22) / (153/308) = 98/153 ≈ 0.64

Thus, the probability that the red ball came from Bag I is about 64%.
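As a small check, the same computation can be done exactly with Python’s fractions module (a minimal sketch added here for illustration):

from fractions import Fraction

p_bag1 = p_bag2 = Fraction(1, 2)      # each bag equally likely
p_red_bag1 = Fraction(7, 11)          # 7 red out of 11 balls in Bag I
p_red_bag2 = Fraction(5, 14)          # 5 red out of 14 balls in Bag II

p_red = p_bag1 * p_red_bag1 + p_bag2 * p_red_bag2     # total probability = 153/308
p_bag1_given_red = p_red_bag1 * p_bag1 / p_red        # Bayes' Theorem = 98/153
print(p_bag1_given_red, float(p_bag1_given_red))      # 98/153 ≈ 0.64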

Naïve Bayes Classification example


Consider a training data set of weather conditions and the corresponding target variable
‘Play’ (indicating whether the players play). We need to classify
whether players will play or not based on the weather condition.
Let’s follow the steps below to perform it:
1. Convert the data set into a frequency table.
2. Create a likelihood table by finding the probabilities.
3. Use the Naive Bayes equation to calculate the posterior probability
for each class. The class with the highest posterior probability is
the outcome of the prediction.
Problem: Players will play if the weather is sunny. Is this statement
correct?
We can solve it using the above-discussed method of posterior probability.

Given data: from the frequency and likelihood tables we read off P(Sunny | Yes), P(Sunny | No), P(Yes), P(No) and P(Sunny).

Step 1: Apply Bayes’ Theorem
P(Yes | Sunny) = P(Sunny | Yes) · P(Yes) / P(Sunny)
P(No | Sunny) = P(Sunny | No) · P(No) / P(Sunny)

Step 2: Interpretation

Since P(Yes | Sunny) is the higher of the two posterior probabilities, players
are more likely to play when the weather is sunny.

Conclusion: The statement is likely correct, but not always certain.
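A minimal scikit-learn sketch of this kind of prediction is shown below. The weather and play values are illustrative only (the training table from the notes is not reproduced here), and CategoricalNB is just one of several Naive Bayes variants that could be used:

from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import OrdinalEncoder

# Illustrative (made-up) training data: one categorical feature (weather) and the target (play).
weather = [["Sunny"], ["Sunny"], ["Overcast"], ["Rainy"], ["Sunny"],
           ["Overcast"], ["Rainy"], ["Rainy"], ["Sunny"], ["Overcast"]]
play = ["No", "Yes", "Yes", "Yes", "Yes", "Yes", "No", "No", "Yes", "Yes"]

encoder = OrdinalEncoder()            # CategoricalNB expects integer-coded categories
X = encoder.fit_transform(weather)

model = CategoricalNB()
model.fit(X, play)

x_new = encoder.transform([["Sunny"]])
print(model.predict(x_new))           # predicted class for a sunny day
print(model.predict_proba(x_new))     # posterior probabilities P(class | Sunny)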


Applications that use Naive Bayes

 Text classification: The Naive Bayes Algorithm is used as a probabilistic


learning technique for text classification.

 Sentiment analysis: The Naive Bayes Algorithm is used to analyze sentiments


or feelings, whether positive, neutral, or negative.

 Recommendation system: The Naive Bayes Algorithm, combined with
collaborative filtering, is used to build hybrid recommendation systems that
predict whether a user will like a given resource.

 Spam filtering: It is also similar to the text classification process. It is popular


for helping you determine if the mail you receive is spam.

 Medical diagnosis: This algorithm is used in medical diagnosis and helps you
to predict the patient’s risk level for certain diseases.

 Weather prediction: You can use this algorithm to predict whether the weather
will be good.

 Face recognition: This helps you identify faces.


Advantages of a Naive Bayes Classifier

 It doesn’t require large amounts of training data.


 It is straightforward to implement.
 It is highly scalable with the number of data points and predictors.
 It can handle both continuous and categorical data.
 It is used in real-time predictions.

Disadvantages of a Naive Bayes Classifier

 The Naive Bayes Algorithm has trouble with the ‘zero-frequency
problem’. It occurs when a categorical variable takes a value in the test data
that never appears in the training dataset, so the model assigns it zero
probability (a common fix, Laplace smoothing, is sketched after this list).

 It assumes that all the attributes are independent, which rarely
happens in real life. This limits the applicability of the algorithm in
real-world situations.

 Its probability estimates can be poorly calibrated, so its
probability outputs should not be taken too literally.
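Below is a minimal sketch of the usual fix for the zero-frequency problem: Laplace smoothing via the alpha parameter, shown here with scikit-learn’s CategoricalNB on a tiny made-up dataset:

import numpy as np
from sklearn.naive_bayes import CategoricalNB

X = np.array([[0], [0], [1], [1]])        # one categorical feature, integer-coded
y = np.array(["Yes", "Yes", "No", "No"])  # category 1 never occurs with class "Yes"

model = CategoricalNB(alpha=1.0)          # alpha > 0 adds a pseudo-count to every category
model.fit(X, y)

# Thanks to smoothing, P(feature = 1 | Yes) is small but not zero,
# so the posterior for "Yes" is not forced to zero.
print(model.predict_proba([[1]]))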
Decision Tree Classification Algorithm

 Decision Tree is a Supervised learning technique that can be used for


both classification and Regression problems.

 It is a tree-structured classifier, where internal nodes represent the


features of a dataset, branches represent the decision rules and each leaf
node represents the outcome.

 In a decision tree, there are two types of nodes: the Decision
Node and the Leaf Node.

 Decision nodes are used to make any decision and have multiple
branches, whereas Leaf nodes are the output of those decisions and do
not contain any further branches.
Decision Tree Terminologies

• Root Node: Root node is from where the decision tree starts. It
represents the entire dataset, which further gets divided into two or more
homogeneous sets.

• Leaf Node: Leaf nodes are the final output node, and the tree cannot be
segregated further after getting a leaf node.

• Splitting: Splitting is the process of dividing the decision node/root


node into sub-nodes according to the given conditions.

• Branch/Sub-Tree: A subtree formed by splitting a node of the tree.

• Pruning: Pruning is the process of removing the unwanted branches


from the tree.

• Parent/Child node: The root node of the tree is called the parent node,
and other nodes are called the child nodes.
Decision Tree working

 Step-1: Begin the tree with the root node, say S, which contains
the complete dataset.

 Step-2: Find the best attribute in the dataset using Attribute


Selection Measure (ASM).

 Step-3: Divide S into subsets that contain the possible values of
the best attribute.

 Step-4: Generate the decision tree node, which contains the best
attribute.

 Step-5: Recursively make new decision tree nodes using the subsets of
the dataset created in Step 3. Continue this process until a stage is
reached where the nodes cannot be classified further; such a final
node is called a leaf node.
Attribute Selection Measure (ASM)
ASM is a technique for selecting the best attribute to discriminate
among tuples. It ranks each attribute, and the best-ranked attribute is
selected as the splitting criterion.

There are two popular techniques for ASM, which are:


1. Information Gain: It calculates how much information a feature
provides us about a class. According to the value of information
gain, we split the node and build the decision tree.
2. Gini Index: The Gini index aims to decrease the impurity from the
root node (at the top of the decision tree) to the leaf nodes of a
decision tree model. (Both measures are illustrated in the sketch below.)
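Both measures can be computed directly in Python. The split counts below are made up for illustration; this is a minimal sketch, not a full ASM implementation:

from collections import Counter
from math import log2

def entropy(labels):
    # Shannon entropy of a list of class labels
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    # Gini impurity of a list of class labels
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

parent = ["Yes"] * 9 + ["No"] * 5    # labels before the split
left = ["Yes"] * 6 + ["No"] * 1      # labels in the left branch
right = ["Yes"] * 3 + ["No"] * 4     # labels in the right branch

# Information gain = entropy(parent) - weighted entropy of the children
weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(parent)
print("Information gain:", round(entropy(parent) - weighted, 3))
print("Gini index of parent:", round(gini(parent), 3))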

Example:
Suppose a candidate has a job offer and wants to decide
whether he should accept the offer or not. To solve this problem, the
decision tree starts with the root node (the Salary attribute, chosen by ASM); a code sketch of this example follows.
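A minimal scikit-learn sketch of this example is given below. The salary and commute figures and the accept/decline labels are invented for illustration, and DecisionTreeClassifier picks its own splits using the chosen criterion:

from sklearn.tree import DecisionTreeClassifier, export_text

# Illustrative (made-up) offers: [salary in thousands, commute distance in km]
X = [[30, 5], [45, 20], [60, 10], [80, 40], [90, 8], [55, 35]]
y = ["Decline", "Decline", "Accept", "Decline", "Accept", "Decline"]

tree = DecisionTreeClassifier(criterion="gini", max_depth=2, random_state=0)
tree.fit(X, y)

print(export_text(tree, feature_names=["salary", "commute_km"]))  # the learned rules
print(tree.predict([[70, 15]]))                                   # decision for a new offer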
Advantages of the Decision Tree
 It is simple to understand as it follows the same process that a
human follows while making a decision in real life.

 It can be very useful for solving decision-related problems.

 It helps to think about all the possible outcomes for a problem.

 It requires less data cleaning compared to other algorithms.

 The cost of using the tree for inference is logarithmic in the number of
data points used to train it, so prediction is fast and calculation speed
is not a concern.

Disadvantages of the Decision Tree


 The decision tree contains lots of layers, which makes it complex.

 It may have an overfitting issue, which can be resolved using


the Random Forest algorithm.

 For more class labels, the computational complexity of the


decision tree may increase.

 It is prone to errors on imbalanced datasets.


Random Forest
 Random forest is a supervised learning technique.

 It can be used for both Classification and Regression problems in


Machine Learning.

 It is based on the concept of ensemble learning, which is a process


of combining multiple classifiers to solve a complex problem and
to improve the performance of the model.

 As the name suggests, "Random Forest is a classifier that


contains a number of decision trees on various subsets of the
given dataset and takes the average (regression) or majority votes
(classification) to improve the predictive accuracy of that
dataset."
Why use Random Forest?
 It takes less training time as compared to other algorithms.

 It predicts output with high accuracy, and it runs efficiently even
on large datasets.

 It can also maintain accuracy when a large proportion of data is


missing.

Random Forest Working

Step-1: Select K random data points from the training set.


Step-2: Build the decision trees associated with the selected data points
(Subsets).
Step-3: Choose the number N for decision trees that you want to build.
Step-4: Repeat Step 1 & 2.
Step-5: For new data points, find the prediction of each decision tree,
and assign the new data point to the category that wins the majority
vote (see the sketch below).
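A minimal scikit-learn sketch of these steps on a synthetic dataset follows; the dataset and parameter values are illustrative only:

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(
    n_estimators=100,   # Step 3: the number of trees N
    bootstrap=True,     # Steps 1-2 and 4: each tree is trained on a random subset
    random_state=0,
)
forest.fit(X_train, y_train)

# Step 5: each tree predicts, and the majority vote is returned.
print("Test accuracy:", forest.score(X_test, y_test))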
Applications of Random Forest
 Banking: Banking sector mostly uses this algorithm for the
identification of loan risk.

 Medicine: With the help of this algorithm, disease trends and risks
of the disease can be identified.

 Land Use: We can identify the areas of similar land use by this
algorithm.

 Marketing: Marketing trends can be identified using this


algorithm.

Advantages of Random Forest

 Random Forest is capable of performing both Classification and


Regression tasks.

 It is capable of handling large datasets with high dimensionality.

 It enhances the accuracy of the model and prevents the overfitting


issue.

Disadvantages of Random Forest

 Although random forest can be used for both classification and
regression tasks, it is less suitable for regression tasks.
