
Ho Chi Minh University of Banking

Department of Economic Mathematics

Machine Learning
Naïve Bayes Classifier (NBC)

Vuong Trong Nhan ([email protected])


Outline
Naive Bayes: Introduction
Bayes' Theorem
Example using Naive Bayes
Types of Naïve Bayes classifiers
Evaluation
Advantages and Disadvantages
Some applications
Exercises

Naïve Bayes classifiers

The Naïve Bayes classifier is a supervised machine learning algorithm used for classification tasks.
It is based on applying Bayes' theorem with the "naive" assumption of conditional independence between every pair of features given the value of the class variable.
Applications of the Naïve Bayes classifier
Spam filtering:
Spam classification is one of the most popular applications of Naïve Bayes cited in the literature (O'Reilly).
Document classification:
Document and text classification go hand in hand. Another popular use case of Naïve Bayes is content classification. Imagine the content categories of a news media website: every article on the site can be classified under a topic taxonomy. Frederick Mosteller and David Wallace are credited with the first application of Bayesian document classification in their 1963 paper.
Sentiment analysis:
While this is another form of text classification, sentiment analysis is commonly leveraged within marketing to better understand and quantify opinions and attitudes around specific products and brands.
Mental state prediction:
Using fMRI data, naïve Bayes has been leveraged to predict different cognitive states among humans. The goal of this research was to assist in better understanding hidden cognitive states, particularly among brain-injury patients.
https://www.ibm.com/topics/naive-bayes
Example
Dataset that describes the weather conditions for playing tennis.

Day   Outlook   Temperature  Humidity  Wind    Play Tennis
D1    Sunny     Hot          High      Weak    No
D2    Sunny     Hot          High      Strong  No
D3    Overcast  Hot          High      Weak    Yes
D4    Rainy     Mild         High      Weak    Yes
D5    Rainy     Cool         Normal    Weak    Yes
D6    Rainy     Cool         Normal    Strong  No
D7    Overcast  Cool         Normal    Strong  Yes
D8    Sunny     Mild         High      Weak    No
D9    Sunny     Cool         Normal    Weak    Yes
D10   Rainy     Mild         Normal    Weak    Yes
D11   Sunny     Mild         Normal    Strong  Yes
D12   Overcast  Mild         High      Strong  Yes
D13   Overcast  Hot          Normal    Weak    Yes
D14   Rainy     Mild         High      Strong  No
D15   Sunny     Hot          Normal    Weak    ???

Features are 'Outlook', 'Temperature', 'Humidity' and 'Wind'.
Class: Play Tennis.
Predict: Today = D15, Play Tennis = ?


Bayes’ Theorem
Bayes' theorem finds the probability of an event occurring given the probability of another event that has already occurred:

P(Y \mid X) = \frac{P(X \mid Y) \, P(Y)}{P(X)}

where Y and X are events and P(X) ≠ 0.

We are trying to find the probability of event Y, given that event X is true. Event X is also termed the evidence.
P(Y) is the prior probability of Y, i.e. the probability of the event before the evidence is seen. (The evidence is an attribute value of an unknown instance; here, it is event X.)
P(X|Y) is the likelihood: the probability of the evidence given that event Y occurs.
P(Y|X) is the posterior probability of Y, i.e. the probability of the event after the evidence is seen.
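As a quick illustration of the theorem, the posterior can be computed directly; the spam-filter numbers below are hypothetical and chosen only for this sketch:

# Hypothetical spam-filter numbers, used only to illustrate Bayes' theorem
p_spam = 0.3              # P(Y): prior probability that a message is spam
p_word_given_spam = 0.6   # P(X|Y): probability the word "free" appears in a spam message
p_word = 0.25             # P(X): overall probability the word "free" appears in any message

# Bayes' theorem: P(Y|X) = P(X|Y) * P(Y) / P(X)
p_spam_given_word = p_word_given_spam * p_spam / p_word
print(p_spam_given_word)  # 0.72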
“Naïve” Bayes Assumption
Assumption: each feature contributes independently and equally to the outcome.

Independence: no pair of features is dependent.
o E.g., the temperature being 'Hot' has nothing to do with the humidity, and the outlook being 'Rainy' has no effect on the wind.
o Hence, the features are assumed to be independent.
Equality: each feature is given the same weight (or importance).
o E.g., knowing only the temperature and humidity alone can't predict the outcome accurately.
o None of the attributes is irrelevant; each is assumed to contribute equally to the outcome.
Note: In fact, the independence assumption is never exactly correct, but it often works well in practice.
Data presentation

Dataset D: (X, y)
X is an independent feature vector (of size n)
y is the class variable
Apply Bayes' theorem:

P(y \mid X) = \frac{P(X \mid y) \, P(y)}{P(X)}      (1)

E.g.: X = (Rainy, Hot, High, Weak), y = No
P(y|X) means the probability of "not playing tennis" given that the weather conditions are "rainy outlook", "hot temperature", "high humidity" and "weak wind".
Naïve Bayes

Since A and B are independent (naive assumption): P(A, B) = P(A) P(B)

With X = (x_1, x_2, ..., x_n), equation (1) becomes:

P(y \mid x_1, ..., x_n) = \frac{P(x_1 \mid y) \, P(x_2 \mid y) \cdots P(x_n \mid y) \, P(y)}{P(x_1) \, P(x_2) \cdots P(x_n)}      (2)

(2) can be expressed as:

P(y \mid x_1, ..., x_n) = \frac{P(y) \prod_{i=1}^{n} P(x_i \mid y)}{P(x_1) \, P(x_2) \cdots P(x_n)}      (3)

As the denominator remains constant for a given input, we can remove that term (proportionality):

P(y \mid x_1, ..., x_n) \propto P(y) \prod_{i=1}^{n} P(x_i \mid y)      (4)

Decision rule (maximum a posteriori): choose the class y that maximizes (4):

\hat{y} = \arg\max_{y} \; P(y) \prod_{i=1}^{n} P(x_i \mid y)
Note: The different naive Bayes classifiers differ mainly by the assumptions they make regarding the distribution of P(xi | y).
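A minimal sketch (not part of the lecture code) of decision rule (4), assuming the prior and conditional probability tables have already been estimated from the training data:

# Sketch of decision rule (4): y_hat = argmax_y P(y) * prod_i P(x_i | y).
# priors:     {class: P(y)}
# cond_probs: {class: {feature: {value: P(x_i | y)}}}  -- assumed estimated beforehand
# x:          {feature: value} for the instance to classify
def predict(priors, cond_probs, x):
    best_class, best_score = None, -1.0
    for y, p_y in priors.items():
        score = p_y
        for feature, value in x.items():
            # unseen feature value -> probability 0 (see Laplace smoothing later)
            score *= cond_probs[y][feature].get(value, 0.0)
        if score > best_score:
            best_class, best_score = y, score
    return best_class, best_score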
Naïve Bayes

Step 1: Calculate the class priors P(y)

Play Tennis
  Yes   No   P(Yes)   P(No)
  9     5    9/14     5/14

P(play = Yes) = 9/14
P(play = No) = 5/14
Naïve Bayes

Predict: Today = D15 = (Sunny, Hot, Normal, Weak), Play Tennis = ???

Step 2: Calculate P(xi | y)

Outlook
  xi        Yes  No   P(xi|Yes)  P(xi|No)
  Sunny     2    3    2/9        3/5
  Overcast  4    0    4/9        0/5
  Rainy     3    2    3/9        2/5

Temperature
  xi    Yes  No   P(xi|Yes)  P(xi|No)
  Hot   2    2    2/9        2/5
  Mild  4    2    4/9        2/5
  Cool  3    1    3/9        1/5

Humidity
  xi      Yes  No   P(xi|Yes)  P(xi|No)
  High    3    4    3/9        4/5
  Normal  6    1    6/9        1/5

Wind
  xi      Yes  No   P(xi|Yes)  P(xi|No)
  Weak    6    2    6/9        2/5
  Strong  3    3    3/9        3/5

E.g.:
P(Outlook = Sunny | play = Yes) = 2/9      P(Outlook = Sunny | play = No) = 3/5
P(Temp. = Hot | play = Yes) = 2/9          P(Temp. = Hot | play = No) = 2/5
P(Humidity = Normal | play = Yes) = 6/9    P(Humidity = Normal | play = No) = 1/5
P(Wind = Weak | play = Yes) = 6/9          P(Wind = Weak | play = No) = 2/5
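A short sketch (not from the slides) of how these tables could be tallied with pandas, assuming the 14 training rows D1-D14 are stored in a DataFrame:

import pandas as pd

# Training rows D1-D14 from the dataset slide
df = pd.DataFrame({
    "Outlook":     ["Sunny","Sunny","Overcast","Rainy","Rainy","Rainy","Overcast",
                    "Sunny","Sunny","Rainy","Sunny","Overcast","Overcast","Rainy"],
    "Temperature": ["Hot","Hot","Hot","Mild","Cool","Cool","Cool","Mild","Cool",
                    "Mild","Mild","Mild","Hot","Mild"],
    "Humidity":    ["High","High","High","High","Normal","Normal","Normal","High",
                    "Normal","Normal","Normal","High","Normal","High"],
    "Wind":        ["Weak","Strong","Weak","Weak","Weak","Strong","Strong","Weak",
                    "Weak","Weak","Strong","Strong","Weak","Strong"],
    "PlayTennis":  ["No","No","Yes","Yes","Yes","No","Yes","No","Yes","Yes","Yes","Yes","Yes","No"],
})

# Step 1: class priors P(y)
print(df["PlayTennis"].value_counts(normalize=True))   # Yes 9/14 ≈ 0.64, No 5/14 ≈ 0.36

# Step 2: one conditional table P(x_i | y) per feature
for feature in ["Outlook", "Temperature", "Humidity", "Wind"]:
    counts = pd.crosstab(df[feature], df["PlayTennis"])   # raw counts per (value, class)
    probs = counts / counts.sum(axis=0)                   # normalize each class column
    print(probs)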
Naïve Bayes

Predict: Today = D15 = (Sunny, Hot, Normal, Weak), Play Tennis = ?
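Using decision rule (4) with the tables from Steps 1 and 2 (values computed here from those tables):

P(Yes) · P(Sunny|Yes) · P(Hot|Yes) · P(Normal|Yes) · P(Weak|Yes) = 9/14 · 2/9 · 2/9 · 6/9 · 6/9 ≈ 0.0141
P(No) · P(Sunny|No) · P(Hot|No) · P(Normal|No) · P(Weak|No) = 5/14 · 3/5 · 2/5 · 1/5 · 2/5 ≈ 0.0069

Since 0.0141 > 0.0069, the classifier predicts Play Tennis (D15) = Yes.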
Evaluate a Naïve Bayes classifier

Accuracy, Precision, Recall, Confusion matrix

https://www.ibm.com/topics/naive-bayes
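A brief sketch (not from the slides) of computing these metrics with scikit-learn; the toy labels below are illustrative, and in practice y_test / y_pred would come from a fitted classifier such as the GaussianNB iris example at the end of this deck:

from sklearn.metrics import accuracy_score, precision_score, recall_score, confusion_matrix

# Toy labels used only to illustrate the metric calls
y_test = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

print("Accuracy :", accuracy_score(y_test, y_pred))   # 6/8 = 0.75
print("Precision:", precision_score(y_test, y_pred))  # 3/4 = 0.75
print("Recall   :", recall_score(y_test, y_pred))     # 3/4 = 0.75
print("Confusion matrix:")
print(confusion_matrix(y_test, y_pred))               # rows: true class, columns: predicted class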
Types of Naïve Bayes classifiers
Based on the distributions of the feature values:
Gaussian Naïve Bayes (GaussianNB):
o Feature: continuous variables
o e.g. Age ∈ [18, 60]
o Gaussian distribution
Multinomial Naïve Bayes (MultinomialNB):
o Feature: discrete values (e.g. frequency counts)
o e.g. outlook = {sunny, overcast, rainy}
o Multinomial distribution
Bernoulli Naïve Bayes (BernoulliNB):
o Features: Boolean variables
o {True, False} or {1, 0}
o Bernoulli distribution
Hybrid NB
o by combining existing Naive Bayes models
https://www.ibm.com/topics/naive-bayes
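A small sketch showing how each variant is instantiated in scikit-learn (standard estimators; the parameter values here are illustrative):

from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

gaussian_clf = GaussianNB()                 # continuous features (e.g. age, temperature)
multinomial_clf = MultinomialNB(alpha=1.0)  # count features (e.g. word frequencies); alpha = Laplace smoothing
bernoulli_clf = BernoulliNB(alpha=1.0)      # binary features (e.g. word present / absent)

# All three expose the same fit(X, y) / predict(X) interface.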
Advantages and disadvantages

Advantages
Less complex:
o Naïve Bayes is considered a simpler classifier since the
parameters are easier to estimate.
Scales well:
o Compared to logistic regression, Naïve Bayes is
considered a fast and efficient classifier that is fairly
accurate when the conditional independence assumption
holds. It also has low storage requirements.
Can handle high-dimensional data:
o Use cases, such as document classification, can have a high number of dimensions, which can be difficult for other classifiers to manage.
https://www.ibm.com/topics/naive-bayes
Advantages and disadvantages

Disadvantages:
Subject to Zero frequency:
o Zero frequency occurs when a categorical variable does not
exist within the training set.
o For example, imagine that we're trying to find the maximum likelihood estimate for the word "sir" given the class "spam", but the word "sir" doesn't exist in the training data. The probability in this case would be zero, and since this classifier multiplies all the conditional probabilities together, the posterior probability will also be zero. (To avoid this issue, Laplace smoothing can be leveraged.)
Unrealistic core assumption:
o While the conditional independence assumption overall performs well, the assumption does not always hold, leading to incorrect classifications.
https://www.ibm.com/topics/naive-bayes
(Optional)

Dealing with the zero-frequency problem
Laplace smoothing/correction
Dealing with continuous features
Discretization
Probability density function
Zero-frequency problem

Play Tennis
  Yes   No   P(Yes)   P(No)
  9     5    9/14     5/14

Outlook
  xi        Yes  No   P(xi|Yes)  P(xi|No)
  Sunny     2    3    2/9        3/5
  Overcast  4    0    4/9        0/5
  Rainy     3    2    3/9        2/5

Temperature
  xi    Yes  No   P(xi|Yes)  P(xi|No)
  Hot   2    2    2/9        2/5
  Mild  4    2    4/9        2/5
  Cool  3    1    3/9        1/5

Humidity
  xi      Yes  No   P(xi|Yes)  P(xi|No)
  High    3    4    3/9        4/5
  Normal  6    1    6/9        1/5

Wind
  xi      Yes  No   P(xi|Yes)  P(xi|No)
  Weak    6    2    6/9        2/5
  Strong  3    3    3/9        3/5

Predict: D16 = (Overcast, Cool, High, Strong), Play Tennis = ?

Note: P(Outlook = Overcast | No) = 0/5 = 0, so the product for class "No" collapses to zero regardless of the other features.
Laplace Smoothing/Correction

In Naive Bayes classification, Laplace smoothing, also known as add-one smoothing, is a technique used to handle the problem of zero probabilities:

P_{\mathrm{Lap},\alpha}(x_i \mid y) = \frac{\mathrm{count}(x_i, y) + \alpha}{\mathrm{count}(y) + \alpha \, |X|}

Where:
• P(xi | y) is the probability of feature value xi given class y.
• α is the smoothing parameter (α > 0, usually α = 1).
• count(xi, y) is the count of occurrences of feature value xi with class y in the training data.
• count(y) is the total count of instances of class y in the training data.
• |X| is the number of unique feature values (or the size of the vocabulary).
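A minimal sketch of the smoothed estimate (not from the slides); feature_counts and class_count are assumed to have been tallied from the training data:

# Laplace-smoothed estimate of P(x_i | y).
# feature_counts: {value: count of (value, y) pairs in the training data} for one feature
# class_count:    total number of training instances with class y
def laplace_smoothed_prob(value, feature_counts, class_count, alpha=1.0):
    n_values = len(feature_counts)   # |X|: number of distinct values of this feature
    return (feature_counts.get(value, 0) + alpha) / (class_count + alpha * n_values)

# Example from the next slide: Outlook given class "No" (Sunny=3, Overcast=0, Rainy=2)
outlook_given_no = {"Sunny": 3, "Overcast": 0, "Rainy": 2}
print(laplace_smoothed_prob("Overcast", outlook_given_no, class_count=5))  # (0+1)/(5+3) = 1/8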
Laplace smoothing/correction

Predict: D16 = (Overcast, Cool, High, Strong), Play Tennis = ?

P(xi|y) without Laplace smoothing:

Outlook
  xi        Yes  No   P(xi|Yes)  P(xi|No)
  Sunny     2    3    2/9        3/5
  Overcast  4    0    4/9        0/5
  Rainy     3    2    3/9        2/5

P(xi|y) using Laplace smoothing:

Outlook (using Laplace smoothing)
  xi        Yes  No   P(xi|Yes)  P(xi|No)
  Sunny     2    3    3/12       4/8
  Overcast  4    0    5/12       1/8
  Rainy     3    2    4/12       3/8

• Choose α = 1
• Outlook = {Sunny, Overcast, Rainy} => |Outlook| = 3
• count(Overcast, Yes) = 4, count(Overcast, No) = 0
• count(Yes) = 9, count(No) = 5

P(Outlook = Overcast | Yes) = (4 + 1) / (9 + 1·3) = 5/12
P(Outlook = Overcast | No) = (0 + 1) / (5 + 1·3) = 1/8
NBC using Laplace smoothing

Play Tennis
  Yes   No   P(Yes)   P(No)
  9     5    9/14     5/14

Outlook (Laplace smoothing)
  xi        Yes  No   P(xi|Yes)  P(xi|No)
  Sunny     2    3    3/12       4/8
  Overcast  4    0    5/12       1/8
  Rainy     3    2    4/12       3/8

Temperature (Laplace smoothing)
  xi    Yes  No   P(xi|Yes)  P(xi|No)
  Hot   2    2    3/12       3/8
  Mild  4    2    5/12       3/8
  Cool  3    1    4/12       2/8

Humidity (Laplace smoothing)
  xi      Yes  No   P(xi|Yes)  P(xi|No)
  High    3    4    4/11       5/7
  Normal  6    1    7/11       2/7

Wind (Laplace smoothing)
  xi      Yes  No   P(xi|Yes)  P(xi|No)
  Weak    6    2    7/11       3/7
  Strong  3    3    4/11       4/7

Predict: D16 = (Overcast, Cool, High, Strong), Play Tennis = ?
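Using decision rule (4) with the smoothed tables above (values computed here from those tables; the class priors are left unsmoothed, as on this slide):

P(Yes) · P(Overcast|Yes) · P(Cool|Yes) · P(High|Yes) · P(Strong|Yes) = 9/14 · 5/12 · 4/12 · 4/11 · 4/11 ≈ 0.0118
P(No) · P(Overcast|No) · P(Cool|No) · P(High|No) · P(Strong|No) = 5/14 · 1/8 · 2/8 · 5/7 · 4/7 ≈ 0.0046

Since 0.0118 > 0.0046, the classifier predicts Play Tennis (D16) = Yes.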
NBC with continuous features

Dealing with continuous values:

Change to discrete values (data binning)
o E.g.
• Temperature = 80 => high
• Temperature = 70 => mild
• Temperature = 60 => cool
Using a probability density function (f):

P(X = (x_1, x_2, \ldots, x_n) \mid Y = y) = \prod_{i} f(X_i = x_i \mid Y = y)

Probability density function for the normal distribution (Gaussian distribution):

f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}
NBC with continuous features

Using the probability density function (f). Predict: D17 = { Outlook = Overcast, Temperature = 60, Humidity = 62, Wind = Weak }

Day   Outlook   Temperature  Humidity  Wind    Play Tennis
D1    Sunny     85           85        Weak    No
D2    Sunny     80           90        Strong  No
D3    Overcast  83           86        Weak    Yes
D4    Rainy     70           96        Weak    Yes
D5    Rainy     68           80        Weak    Yes
D6    Rainy     65           70        Strong  No
D7    Overcast  64           65        Strong  Yes
D8    Sunny     72           95        Weak    No
D9    Sunny     69           70        Weak    Yes
D10   Rainy     75           80        Weak    Yes
D11   Sunny     75           70        Strong  Yes
D12   Overcast  72           90        Strong  Yes
D13   Overcast  81           75        Weak    Yes
D14   Rainy     71           91        Strong  No

μ(Temp | yes) = (83 + 70 + ... + 81) / 9 = 73
σ(Temp | yes) = \sqrt{ \frac{(83 - 73)^2 + (70 - 73)^2 + \cdots + (81 - 73)^2}{9 - 1} } = 6.2

μ(Temp | no) = (85 + 80 + ... + 71) / 5 = 74.6
σ(Temp | no) = \sqrt{ \frac{(85 - 74.6)^2 + (80 - 74.6)^2 + \cdots + (71 - 74.6)^2}{5 - 1} } = 8

Probability density function for the normal distribution:

f(temp = 60 | yes) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(60 - 73)^2}{2 \cdot 6.2^2}} ≈ 0.0071
f(temp = 60 | no) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{(60 - 74.6)^2}{2 \cdot 8^2}} ≈ 0.0094
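A small sketch (not part of the slides) of the Gaussian density computation, which can be used to check the two values above:

import math

# Gaussian probability density function used on this slide
def gaussian_pdf(x, mu, sigma):
    return (1.0 / (sigma * math.sqrt(2 * math.pi))) * math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

# Class-conditional densities for Temperature = 60 (instance D17), using the mu/sigma above
print(gaussian_pdf(60, mu=73.0, sigma=6.2))   # ~0.0071 for Play = Yes
print(gaussian_pdf(60, mu=74.6, sigma=8.0))   # ~0.0094 for Play = No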
Summary

Naïve Bayes Classifier


Naïve assumption
Bayes Theory
Types:
Gaussian NB
Multinomial NB
Bernoulli NB
Gaussian naïve bayes example
# load the iris dataset
from sklearn.datasets import load_iris
iris = load_iris()

# store the feature matrix (X) and response vector (y)
X = iris.data
y = iris.target

# splitting X and y into training and testing sets
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=1)

# training the model on the training set
from sklearn.naive_bayes import GaussianNB
gnb = GaussianNB()
gnb.fit(X_train, y_train)

# making predictions on the testing set
y_pred = gnb.predict(X_test)

# comparing actual response values (y_test) with predicted response values (y_pred)
from sklearn import metrics
print("Gaussian Naive Bayes model accuracy:", metrics.accuracy_score(y_test, y_pred))

### Gaussian Naive Bayes model accuracy: 0.95
Exercise

Day   Outlook   Temperature  Humidity  Wind    Play Tennis
D1    Sunny     Hot          High      Weak    No
D2    Sunny     Hot          High      Strong  No
D3    Overcast  Hot          High      Weak    Yes
D4    Rain      Mild         High      Weak    Yes
D5    Rain      Cool         Normal    Weak    Yes
D6    Rain      Cool         Normal    Strong  No
D7    Overcast  Cool         Normal    Strong  Yes
D8    Sunny     Mild         High      Weak    No
D9    Sunny     Cool         Normal    Weak    Yes
D10   Rain      Mild         Normal    Weak    Yes
D11   Sunny     Mild         Normal    Strong  Yes
D12   Overcast  Mild         High      Strong  Yes
D13   Overcast  Hot          Normal    Weak    Yes
D14   Rain      Mild         High      Strong  No

Use the Naïve Bayes algorithm to predict:
• D15 = (Sunny, Hot, High, Weak); Play tennis (D15) = ?
• D16 = (Rain, Mild, Normal, Weak); Play tennis (D16) = ?
