Unit-IV: PROBABILISTIC LEARNING & ENSEMBLE

• Bayesian Learning, Bayes Optimal Classifier, Naive Bayes Classifier, Bayesian Belief
Networks.
• Ensemble methods: Bagging & Boosting, C5.0 boosting, Random Forest, Gradient Boosting
Machines, and XGBoost
Conditional Probability:

• Conditional probability is a concept in probability theory that deals with the probability of an event occurring
given that another event has already occurred.
• It's denoted as P(A | B), which reads as "the probability of event A occurring given that event B has occurred."
• In other words, it quantifies the likelihood of event A happening under the condition that we already know
event B has taken place.
• The formula for conditional probability is:

P(A | B) = P(A and B) / P(B)

Where:
• P(A | B) is the conditional probability of event A given event B.
• P(A and B) is the joint probability of both events A and B occurring together.
• P(B) is the probability of event B happening.
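
As a quick illustration of this formula, the following Python snippet computes P(A | B) from raw event counts; the counts themselves are invented purely for the example.

```python
# A minimal sketch of the conditional probability formula, using invented event counts.
n_total = 100      # total number of observations
n_A_and_B = 12     # observations where both A and B occur
n_B = 30           # observations where B occurs

p_A_and_B = n_A_and_B / n_total   # P(A and B)
p_B = n_B / n_total               # P(B)

p_A_given_B = p_A_and_B / p_B     # P(A | B) = P(A and B) / P(B)
print(p_A_given_B)                # 0.4
```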
Conditional Probability applications:
Conditional probability is useful in various real-world applications, such as:

1. Medical Diagnosis: The probability of a patient having a particular disease given the results of a medical test.

2. Weather Forecasting: The probability of rain tomorrow given today's weather conditions.

3. Finance: Assessing the probability of a stock price increasing given certain economic indicators.

4. Quality Control: The probability of a product being defective given a specific manufacturing process.

5. Information Retrieval: Recommender systems in e-commerce websites calculate the probability of a user liking a
product based on their past preferences.

Conditional probability is a fundamental concept in probability theory and is essential for making informed decisions in
situations where events depend on each other.
Bayes Theorem:
We should already have prerequisite knowledge of conditional probability, since Bayes' theorem builds directly on it.

• Bayes' theorem is also known by other names, such as Bayes' rule or Bayes' law.

• Bayes' theorem helps determine the probability of an event based on prior knowledge of conditions related to it.

• It is used to calculate the probability of one event occurring given that another event has already occurred.

• It relates conditional probability and marginal probability.

• Bayes' theorem is extensively applied in finance, health and medicine, research and survey work, the aeronautical
sector, etc.

• A simplified application of Bayes' theorem (the Naïve Bayes classifier) is also used to reduce computation time and
cost in such projects.
Bayes Theorem:
Bayes' Theorem, named after the 18th-century statistician and philosopher Thomas Bayes, is a fundamental principle in probability theory

and statistics. It provides a way to update the probability for a hypothesis (or event) based on new evidence. The theorem can be stated as

follows:

P(A | B) = [P(B | A) * P(A)] / P(B)

Where: P(A | B) is called the posterior, which is what we need to calculate. It is the updated probability of the hypothesis after considering the evidence.

P(B | A) is called the likelihood. It is the probability of the evidence given that the hypothesis is true.

P(A) is the prior probability of event A: the probability of the hypothesis before considering the evidence.

P(B) is called the marginal probability. It is the overall probability of the evidence under all possible hypotheses.

Hence, Bayes Theorem can be written as:

posterior = likelihood * prior / evidence


Bayes Theorem:
Bayes' Theorem allows you to update your prior beliefs (prior probability) by incorporating new evidence (likelihood) to calculate the
revised probability (posterior probability) of the hypothesis of interest. The marginal probability represents the overall probability of
observing the evidence, taking into account all possible hypotheses.

P(A) is prior probability : Probability of hypothesis before observing the evidence.

P(B | A) is likelihood probability : Probability of the evidence B given that the probability of a hypothesis A is true.

P(A | B) is posterior probability: Probability of hypothesis A on the observed event B.

P(B) is marginal probability: The marginal probability is the overall probability of observing evidence B.

P(B) = Σ [P(Ai) * P(B | Ai)], where the sum runs over all mutually exclusive and exhaustive hypotheses Ai.
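
To make the pieces concrete, the snippet below works through a hypothetical medical-test example; the prior, likelihood, and false-positive rate are invented numbers, and the marginal P(B) is computed with the total-probability sum above.

```python
# A hedged worked example of Bayes' Theorem; all numbers are hypothetical.
p_disease = 0.01                # P(A): prior probability of having the disease
p_pos_given_disease = 0.95      # P(B | A): likelihood, probability of a positive test if diseased
p_pos_given_healthy = 0.05      # P(B | not A): probability of a positive test if healthy

# Marginal probability of the evidence: P(B) = sum over hypotheses Ai of P(Ai) * P(B | Ai)
p_pos = p_pos_given_disease * p_disease + p_pos_given_healthy * (1 - p_disease)

# Posterior: P(A | B) = P(B | A) * P(A) / P(B)
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
print(round(p_disease_given_pos, 3))    # 0.161: even a positive test leaves a fairly low posterior
```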


Naïve Bayes Classifier:
• Naïve Bayes algorithm is a supervised learning algorithm, which is based on Bayes theorem and used for solving

classification problems.

• It is mainly used in text classification, which typically involves a high-dimensional training dataset.

• The Naïve Bayes classifier is one of the simplest and most effective classification algorithms, and it helps in building fast
machine learning models that can make quick predictions.

• It is a probabilistic classifier, which means it predicts on the basis of the probability that an object belongs to each class.

• Some popular applications of the Naïve Bayes algorithm are spam filtering, sentiment analysis, and classifying articles.
Why is it called Naïve Bayes?
• The Naïve Bayes algorithm combines two words, Naïve and Bayes, which can be described as:

• Naïve: It is called naïve because it assumes that the occurrence of a certain feature is independent of the occurrence of
other features. For example, if a fruit is identified on the basis of color, shape, and taste, then a red, spherical, and sweet
fruit is recognized as an apple. Hence each feature individually contributes to identifying it as an apple without depending
on the others.

• Bayes: It is called Bayes because it depends on the principle of Bayes' Theorem.


How Do Naive Bayes Algorithms Work?
• Consider a training data set of weather conditions and the corresponding target variable 'Play' (indicating whether a match is played).

• We need to classify whether players will play or not based on the weather condition.

• Let's follow the steps below.

1. Convert the data set into a frequency table: count how often each weather condition occurs together with each value of 'Play'.

2. Create a likelihood table by finding the probabilities: for example, P(Overcast) = 0.29 and P(Play = Yes) = 0.64.


How Do Naive Bayes Algorithms Work?
• Now, use the Naive Bayes equation to calculate the posterior probability for each class. The class with the
highest posterior probability is the outcome of the prediction.

• Problem: Players will play if the weather is sunny. Is this statement correct?
• We can solve it using the above-discussed method of posterior probability.
P(Yes | Sunny) = P( Sunny | Yes) * P(Yes) / P (Sunny)

• Here we have P (Sunny |Yes) = 3/9 = 0.33, P(Sunny) = 5/14 = 0.36, P( Yes)= 9/14 = 0.64

• Now, P(Yes | Sunny) = 0.33 * 0.64 / 0.36 ≈ 0.60, which is higher than P(No | Sunny) = 0.40, so the prediction is that players will play when it is sunny.
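
The same calculation can be written out in a few lines of Python, using the counts quoted above (3 of the 9 "Yes" days are Sunny, 5 of the 14 days are Sunny, 9 of the 14 days are "Yes"):

```python
# A minimal sketch of the P(Yes | Sunny) calculation from the weather example.
p_sunny_given_yes = 3 / 9    # likelihood  P(Sunny | Yes)
p_yes = 9 / 14               # prior       P(Yes)
p_sunny = 5 / 14             # evidence    P(Sunny)

p_yes_given_sunny = p_sunny_given_yes * p_yes / p_sunny
p_no_given_sunny = 1 - p_yes_given_sunny     # only two classes, so the posteriors sum to 1

print(round(p_yes_given_sunny, 2))   # 0.6  -> predict "Play = Yes" on a sunny day
print(round(p_no_given_sunny, 2))    # 0.4
```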
Pros and Cons of Naive Bayes:
Pros:
• It is easy and fast to predict the class of a test data set, and it also performs well in multi-class prediction.
• When the assumption of independence holds, the classifier performs better than other machine learning
models such as logistic regression or decision trees, and requires less training data.
• It performs well with categorical input variables compared to numerical variables. For numerical
variables, a normal (bell-curve) distribution is assumed, which is a strong assumption.
Cons:
• If a categorical variable has a category in the test data set that was not observed in the training data set, the model
will assign it a zero probability and will be unable to make a prediction. This is often known as the "Zero
Frequency" problem. To solve it, we can use a smoothing technique; one of the simplest smoothing techniques is
Laplace estimation (a sketch follows this list).
• Naive Bayes is also known to be a poor probability estimator, so the probability outputs from
predict_proba should not be taken too seriously.
• Another limitation of this algorithm is the assumption of independent predictors. In real life, it is almost
impossible to get a set of predictors that are completely independent.
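
The zero-frequency point above can be illustrated with a small sketch of Laplace (add-one) smoothing; the counts are invented, and scikit-learn's discrete Naive Bayes estimators (e.g., MultinomialNB) expose the same idea through their alpha parameter.

```python
# A minimal sketch of Laplace (add-one) smoothing for the "Zero Frequency" problem.
# Invented counts of how often each weather value co-occurred with one class in training.
counts = {"sunny": 3, "overcast": 4, "rainy": 2, "snowy": 0}   # "snowy" never seen with this class
total = sum(counts.values())   # 9 observations for this class
k = len(counts)                # 4 possible feature values

# Without smoothing, P(snowy | class) = 0/9 = 0, which zeroes out the whole Naive Bayes product.
# Add-one smoothing adds 1 to every count (and k to the denominator):
smoothed = {value: (count + 1) / (total + k) for value, count in counts.items()}
print(smoothed["snowy"])   # 1/13 ≈ 0.077 instead of 0
```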
Applications of Naive Bayes Algorithms:
• Real-time Prediction: Naive Bayesian classifier is an eager learning classifier and it is super fast. Thus, it
could be used for making predictions in real time.

• Multi-class Prediction: This algorithm is also well known for multi class prediction feature. Here we can
predict the probability of multiple classes of target variable.

• Text classification / Spam Filtering / Sentiment Analysis: Naive Bayesian classifiers are widely used in text
classification (they give good results on multi-class problems under the independence assumption) and often achieve a
higher success rate than other algorithms. As a result, they are widely used in spam filtering (identifying spam e-mail) and
sentiment analysis (in social media analysis, to identify positive and negative customer sentiment); see the sketch after this list.

• Recommendation System: A Naive Bayes classifier combined with collaborative filtering can build a
recommendation system that uses machine learning and data mining techniques to filter unseen information
and predict whether a user would like a given resource or not.
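
As a rough sketch of the spam-filtering application, the snippet below trains a multinomial Naive Bayes model on a tiny invented corpus with scikit-learn; the texts, labels, and pipeline choices are illustrative assumptions, not a prescribed setup.

```python
# A hedged sketch of Naive Bayes text classification (spam filtering) with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

texts = ["win a free prize now", "lowest price guaranteed today",
         "meeting at 10am tomorrow", "please review the attached report"]
labels = ["spam", "spam", "ham", "ham"]

# Bag-of-words features feed a multinomial Naive Bayes model (alpha=1.0 is Laplace smoothing).
model = make_pipeline(CountVectorizer(), MultinomialNB(alpha=1.0))
model.fit(texts, labels)

print(model.predict(["free prize waiting for you"]))   # most likely ['spam'] on this toy corpus
```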
Ensemble Learning:
• Ensemble learning is a machine learning technique that enhances accuracy and resilience in forecasting by merging
predictions from multiple models.

• It aims to mitigate errors or biases that may exist in individual models by leveraging the collective intelligence of the
ensemble.

• The underlying concept behind ensemble learning is to combine the outputs of diverse models to create a more precise
prediction. By considering multiple perspectives and utilizing the strengths of different models, ensemble learning
improves the overall performance of the learning system.

• This approach not only enhances accuracy but also provides resilience against uncertainties in the data.

• By effectively merging predictions from multiple models, ensemble learning has proven to be a powerful tool in
various domains, offering more robust and reliable forecasts.
Ensemble Learning Prediction Techniques:
• Max Voting

• Averaging

• Weighted Averaging
Ensemble Learning Prediction Techniques:
Max Voting: A Voting Classifier is an ensemble machine learning technique that combines the predictions from multiple
individual classifiers (also known as base classifiers or estimators) to make a final prediction.

• It’s a type of model averaging approach where each base classifier contributes its prediction, and the final prediction
is determined by a majority vote (for classification) or an average (for regression).

• The Voting Classifier can be used for both binary and multiclass classification tasks.
Types of Voting Classifiers:
• Hard Voting: In hard voting, each base classifier’s prediction is treated as a vote, and the final prediction is the
majority vote among the predictions of the individual classifiers. This is commonly used for classification tasks.

• Soft Voting: In soft voting, each base classifier’s predicted probabilities for each class are averaged, and the class with
the highest average probability is chosen as the final prediction. Soft voting often produces better results than hard
voting because it takes into account the confidence levels of the classifiers.
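
A minimal scikit-learn sketch of both voting modes is shown below; the choice of base estimators and the synthetic data are arbitrary, purely to demonstrate the voting="hard" and voting="soft" options.

```python
# A hedged sketch of hard vs. soft voting with scikit-learn's VotingClassifier.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=8, random_state=0)  # toy data

estimators = [("lr", LogisticRegression(max_iter=1000)),
              ("nb", GaussianNB()),
              ("dt", DecisionTreeClassifier(max_depth=3))]

hard = VotingClassifier(estimators=estimators, voting="hard").fit(X, y)  # majority vote of labels
soft = VotingClassifier(estimators=estimators, voting="soft").fit(X, y)  # average of predicted probabilities

print(hard.predict(X[:5]))
print(soft.predict(X[:5]))
```

Soft voting requires every base estimator to implement predict_proba, which all three classifiers above do.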
Ensemble Learning Prediction Techniques:
Max Voting:

• For example, suppose you ask 5 of your colleagues to rate your movie (out of 5): three of them rate it a 4
while two of them give it a 5. Since the majority gave a rating of 4, the final rating is taken as 4. You can
think of this as taking the mode of all the predictions.

• The result of max voting would be something like this:

         Colleague 1   Colleague 2   Colleague 3   Colleague 4   Colleague 5   Final rating
Rating   5             4             5             4             4             4
Averaging:
Similar to the max voting technique, multiple predictions are made for each data point in averaging.

In this method, we take an average of predictions from all the models and use it to make the final prediction.

Averaging can be used for making predictions in regression problems or while calculating probabilities for classification
problems.

For example, in the below case, the averaging method would take the average of all the values.

i.e. (5+4+5+4+4)/5 = 4.4

         Colleague 1   Colleague 2   Colleague 3   Colleague 4   Colleague 5   Final rating
Rating   5             4             5             4             4             4.4
Weighted Average:
This is an extension of the averaging method. All models are assigned different weights defining the importance of each
model for prediction.

For instance, if two of your colleagues are critics while the others have no prior experience in this field, then the answers
from these two colleagues are given more importance than the answers from the others.

The result is calculated as [(5*0.23) + (4*0.23) + (5*0.18) + (4*0.18) + (4*0.18)] = 4.41.

         Colleague 1   Colleague 2   Colleague 3   Colleague 4   Colleague 5   Final rating
Weight   0.23          0.23          0.18          0.18          0.18
Rating   5             4             5             4             4             4.41
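
The three combination rules can be reproduced directly from the ratings and weights in the examples above; the short NumPy sketch below is only for illustration.

```python
# A minimal sketch of max voting, averaging, and weighted averaging on the movie ratings above.
import numpy as np

ratings = np.array([5, 4, 5, 4, 4])
weights = np.array([0.23, 0.23, 0.18, 0.18, 0.18])

max_vote = int(np.bincount(ratings).argmax())     # 4: the most frequent rating (the mode)
average = float(ratings.mean())                   # 4.4: plain average
weighted = float(np.sum(ratings * weights))       # 4.41: weighted average

print(max_vote, average, round(weighted, 2))
```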
Ensemble learning technique classification:
Ensemble learning is a machine learning technique where multiple models are combined to improve the overall
performance of the system. The basic idea is that by combining multiple models, each model capturing different aspects
of the data, the ensemble model can often achieve better predictive performance than any individual model.

There are two main types of ensemble learning techniques: bagging and boosting.
Bagging (Bootstrap Aggregating):
• Bagging involves training multiple instances of the same base learning algorithm on different subsets of the training
data.

• Each subset is sampled with replacement from the original dataset, which means that some instances may be repeated
while others may be left out.

• The final prediction is typically made by averaging the predictions of all the models (for regression) or by taking a
majority vote (for classification).

• Random Forest is a popular algorithm based on bagging, where decision trees are the base learners.
Bagging (Bootstrap Aggregating):
• Step 1: Bootstrap samples of the same size as the training set are repeatedly drawn (with replacement) from the training
data, so that each record has an equal probability of being selected.

• Step 2: A classification or estimation model is trained on each bootstrap sample drawn in Step 1, and a prediction is
recorded for each sample.

• Step 3: The bagging ensemble prediction is then defined to be the class with the most votes in Step 2 (for
classification models) or the average of the predictions made in Step 2 (for estimation models)
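
A minimal sketch of these three steps using scikit-learn is given below; the synthetic data and the choice of 50 estimators are illustrative assumptions. BaggingClassifier uses decision trees as its default base learner, and Random Forest can be seen as bagging of decision trees with extra feature randomness at each split.

```python
# A hedged sketch of bagging with scikit-learn; data and hyperparameters are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)  # toy data

# Steps 1-2: draw 50 bootstrap samples (with replacement) and fit one base model
# (a decision tree by default) on each sample.
bagging = BaggingClassifier(n_estimators=50, bootstrap=True, random_state=0).fit(X, y)

# Step 3: predict() aggregates the 50 base models by majority vote (or averaging for regression).
print(bagging.predict(X[:5]))

# Random Forest: bagged decision trees plus random feature selection at each split.
forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(forest.predict(X[:5]))
```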
Boosting:
• Boosting involves sequentially training multiple weak learners (models that are only slightly better than
random guessing) to correct the errors made by the previous models in the sequence.
• Each subsequent model focuses more on the instances that were misclassified by the previous models, thereby
reducing the overall error.
• Unlike bagging, the base learners are trained sequentially, and each subsequent model tries to correct the
mistakes of the previous ones.
• Some popular boosting algorithms include AdaBoost (Adaptive Boosting), Gradient Boosting Machines
(GBM), and XGBoost.
Boosting:
Step 1: All observations have equal weight in the original training data set D1. An initial “base” classifier h1 is determined.

Step 2: The observations that were incorrectly classified by the previous base classifier have their weights increased, while
the observations that were correctly classified have their weights decreased.
This gives us data distribution Dm, m=2, … , M.
A new base classifier hm, m = 2, … , M is determined, based on the new weights. This step is repeated until the desired
number of iterations M is achieved.

Step 3: The final boosted classifier is the weighted sum of the M base classifiers.
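
The re-weighting procedure in Steps 1-3 is essentially AdaBoost; a hedged scikit-learn sketch is given below (synthetic data, illustrative hyperparameters). Gradient Boosting Machines follow the same sequential idea but fit each new learner to the errors of the current ensemble, and XGBoost offers a comparable interface in its own xgboost package (not shown here).

```python
# A hedged sketch of boosting with scikit-learn; data and settings are illustrative only.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)  # toy data

# AdaBoost: M = 100 weak learners (decision stumps by default) trained sequentially on
# re-weighted data; the final classifier is a weighted sum of the base classifiers.
ada = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)

# Gradient boosting: each new tree is fitted to the errors of the current ensemble.
gbm = GradientBoostingClassifier(n_estimators=100, random_state=0).fit(X, y)

print(ada.predict(X[:5]))
print(gbm.predict(X[:5]))
```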

Generalization & Test Error:


Generalization error refers to the performance on examples that were not seen during training; it is the probability that
the classifier misclassifies a new example.

Test error is the fraction of mistakes on a newly sampled test set.


Bagging vs Boosting:
Bagging does not improve results when the main challenge is bias, i.e., when the individual models underperform. Boosting,
by contrast, concentrates on strengthening individual models and correcting the weaknesses of each single model in
sequence, so a boosted model typically produces output with lower error in that situation.

When the challenge is overfitting in a single model, bagging outperforms boosting, since the boosting model itself is prone
to overfitting.