
Module 5: Bayes Classification Methods
Reference: Data Mining: Concepts and Techniques (3rd Edn.), Jiawei Han, Micheline Kamber,
Morgan Kaufmann, 2015
Bayes' Rule

• Bayes' rule (also Bayes' law or Bayes' theorem):

        P(A | B) = P(B | A) P(A) / P(B)

• This simple equation underlies prediction models.


Example
• A doctor knows that the disease meningitis causes the patient to have a stiff neck,
say, 50% of the time. The doctor also knows some unconditional facts: the prior
probability of a patient having meningitis is 1/50,000, and the prior probability of
any patient having a stiff neck is 1/20. Let S be the hypothesis that the patient
has a stiff neck and M be the hypothesis that the patient has meningitis.
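• Applying Bayes' rule with these figures (completing the example):
  P(M | S) = P(S | M) P(M) / P(S) = 0.5 × (1/50,000) / (1/20) = 0.0002
  i.e., only about 1 in 5,000 patients presenting with a stiff neck is expected to have meningitis.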
Bayes' Theorem: Basics

• Bayes' Theorem:

        P(H | X) = P(X | H) P(H) / P(X)

• Let X be a data sample (“evidence”): class label is unknown


• Let H be a hypothesis that X belongs to class C
• Classification is to determine P(H|X) (i.e., the posterior probability): the probability that the hypothesis
holds given the observed data sample X
• P(H) (prior probability): the initial probability
• E.g., X will buy computer, regardless of age, income, …
• the prior probability of hypothesis H, i.e. the initial probability before we observe any data, reflecting background
knowledge
• P(X): probability that sample data is observed
• P(X|H) (likelihood): the probability of observing the sample X, given that the hypothesis holds
• E.g., Given that X will buy computer, the prob. that X’s age is 31..40 with medium income
Prediction Based on Bayes’ Theorem
• Given training data X, the posterior probability of a hypothesis H,
P(H|X), follows Bayes' theorem:

        P(H | X) = P(X | H) P(H) / P(X)

• Informally, this can be viewed as

        posterior = likelihood × prior / evidence
• Predicts X belongs to Ci iff the probability P(Ci|X) is the highest
among all the P(Ck|X) for all the k classes

Bayes Classifier
• A statistical classifier: performs probabilistic prediction, i.e., predicts class
membership probabilities using

        P(A | B) = P(B | A) P(A) / P(B)
• Foundation: Based on Bayes’ Theorem.
• Probabilistic learning: Calculate explicit probabilities for hypothesis, among
the most practical approaches to certain types of learning problems
• Probabilistic prediction: Predict multiple hypotheses, weighted by their
probabilities
• Performance: A simple Bayesian classifier, the naïve Bayesian classifier, has
performance comparable with decision tree and selected neural network classifiers
Classification Is to Derive the Maximum A Posteriori
• Let D be a training set of tuples and their associated class labels,
and each tuple is represented by an n-D attribute vector X = (x1,
x2, …, xn)
• Suppose there are m classes C1, C2, …, Cm.
• Classification is to derive the maximum a posteriori probability, i.e., the
maximal P(Ci|X)
• This can be derived from Bayes' theorem:

        P(Ci | X) = P(X | Ci) P(Ci) / P(X)

• Since P(X) is constant for all classes, only

        P(Ci | X) ∝ P(X | Ci) P(Ci)

  needs to be maximized
Naïve Bayes Classifier
• A simplified assumption: attributes are conditionally
independent (i.e., no dependence relation between attributes):

        P(X | Ci) = ∏ (k = 1..n) P(xk | Ci) = P(x1 | Ci) × P(x2 | Ci) × … × P(xn | Ci)
• This greatly reduces the computation cost: Only counts the class
distribution
• Once the probability P(X|Ci) is known, assign X to the class with
maximum P(X|Ci)*P(Ci)
• If Ak is categorical, P(xk|Ci) is the # of tuples in Ci having value xk
for Ak divided by |Ci, D| (# of tuples of Ci in D)
• If Ak is continuous-valued, P(xk|Ci) is usually computed based on a
Gaussian distribution with mean μ and standard deviation σ:

        g(x, μ, σ) = (1 / (√(2π) σ)) · e^(−(x − μ)² / (2σ²))

  and P(xk|Ci) = g(xk, μCi, σCi)
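• A minimal Python sketch of this continuous case (assuming the class-conditional mean and
  standard deviation have already been estimated from the training tuples of Ci; the values
  in the call are hypothetical, for illustration only):

import math

def gaussian_likelihood(x, mu, sigma):
    """Estimate P(xk|Ci) for a continuous attribute using a Gaussian with
    class-conditional mean mu and standard deviation sigma."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)

# Hypothetical values: attribute value 66, class mean 73, class std. dev. 6.2
print(gaussian_likelihood(66, 73, 6.2))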
Naïve Bayes Classifier - Example
• Classes:
  C1: buys_computer = 'yes'
  C2: buys_computer = 'no'
• Instance to be classified:
  X = (age = '<=30', income = 'medium', student = 'yes', credit_rating = 'fair')
• Dataset:
  age     income   student  credit_rating  buys_computer
  <=30    high     no       fair           no
  <=30    high     no       excellent      no
  31…40   high     no       fair           yes
  >40     medium   no       fair           yes
  >40     low      yes      fair           yes
  >40     low      yes      excellent      no
  31…40   low      yes      excellent      yes
  <=30    medium   no       fair           no
  <=30    low      yes      fair           yes
  >40     medium   yes      fair           yes
  <=30    medium   yes      excellent      yes
  31…40   medium   no       excellent      yes
  31…40   high     yes      fair           yes
  >40     medium   no       excellent      no
Naïve Bayes Classifier - Example
• P(Ci):
  P(buys_computer = "yes") = 9/14 = 0.643
  P(buys_computer = "no") = 5/14 = 0.357
• Compute P(X|Ci) for each class:
  P(age = "<=30" | buys_computer = "yes") = 2/9 = 0.222
  P(age = "<=30" | buys_computer = "no") = 3/5 = 0.6
  P(income = "medium" | buys_computer = "yes") = 4/9 = 0.444
  P(income = "medium" | buys_computer = "no") = 2/5 = 0.4
  P(student = "yes" | buys_computer = "yes") = 6/9 = 0.667
  P(student = "yes" | buys_computer = "no") = 1/5 = 0.2
  P(credit_rating = "fair" | buys_computer = "yes") = 6/9 = 0.667
  P(credit_rating = "fair" | buys_computer = "no") = 2/5 = 0.4
• X = (age = "<=30", income = "medium", student = "yes", credit_rating = "fair")
  P(X|Ci):
  P(X | buys_computer = "yes") = 0.222 × 0.444 × 0.667 × 0.667 = 0.044
  P(X | buys_computer = "no") = 0.6 × 0.4 × 0.2 × 0.4 = 0.019
  P(X|Ci) × P(Ci):
  P(X | buys_computer = "yes") × P(buys_computer = "yes") = 0.028
  P(X | buys_computer = "no") × P(buys_computer = "no") = 0.007
• Therefore, X belongs to the class buys_computer = "yes"
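• A minimal Python sketch of this calculation, plugging in the class counts read off the table
  above (exact fractions; the decimals on the slide are rounded):

from fractions import Fraction as F

# Priors and class-conditional probabilities taken from the table above
priors = {"yes": F(9, 14), "no": F(5, 14)}
likelihoods = {
    "yes": [F(2, 9), F(4, 9), F(6, 9), F(6, 9)],  # age<=30, income=medium, student=yes, credit=fair
    "no":  [F(3, 5), F(2, 5), F(1, 5), F(2, 5)],
}

scores = {}
for c in priors:
    score = priors[c]
    for p in likelihoods[c]:
        score *= p
    scores[c] = score

for c, s in scores.items():
    print(c, float(s))                              # yes ~ 0.0282, no ~ 0.0069
print("prediction:", max(scores, key=scores.get))   # -> yes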
Avoiding the Zero-Probability Problem
• Naïve Bayesian prediction requires each conditional prob. be
non-zero. Otherwise, the predicted prob. will be zero
        P(X | Ci) = ∏ (k = 1..n) P(xk | Ci)
• Ex. Suppose a dataset with 1000 tuples, income=low (0),
income= medium (990), and income = high (10)
• Use Laplacian correction (or Laplacian estimator)
• Adding 1 to each case
Prob(income = low) = 1/1003
Prob(income = medium) = 991/1003
Prob(income = high) = 11/1003
• The “corrected” prob. estimates are close to their
“uncorrected” counterparts
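• A short Python sketch of the correction on the counts above (one pseudo-count added to each
  of the three income values):

# Observed counts for income in a 1000-tuple partition (from the slide)
counts = {"low": 0, "medium": 990, "high": 10}
total = sum(counts.values())      # 1000
num_values = len(counts)          # 3 distinct income values

# Add-one (Laplacian) correction: every value receives one extra pseudo-count
corrected = {v: (c + 1) / (total + num_values) for v, c in counts.items()}
print(corrected)                  # low: 1/1003, medium: 991/1003, high: 11/1003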
Naïve Bayes Classifier: Comments
• Advantages
• Easy to implement
• Good results obtained in most of the cases
• Disadvantages
• Assumption: class conditional independence, therefore loss of
accuracy
• Practically, dependencies exist among variables
• E.g., hospital patients: profile (age, family history, etc.),
symptoms (fever, cough, etc.), diseases (lung cancer,
diabetes, etc.)
• Dependencies among these cannot be modeled by Naïve Bayes
Classifier
• How to deal with these dependencies? Bayesian Belief Networks
Example
• Example: Play Tennis - Given a new instance x’, predict its label
x’=(Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)
EXAMPLE (SPAM/NONSPAM)
• Infer whether the email document with the text content "machine learning for free"
is SPAM or NONSPAM using Bayes' rule; the document set is given below.
• "free money for free gambling fun" -> SPAM
• "money, money, money" -> SPAM
• "gambling for fun" -> SPAM
• "machine learning for fun, fun, fun" -> NONSPAM
• "free machine learning" -> NONSPAM

• Hint: P(Word/Category) = (Number of occurrence of the word


in all the documents from a category+1) divided by (All the
words in every document from a category + Total number of
unique words in all the documents)
EXAMPLE (SPAM/NONSPAM)
• Document set:
  "free money for free gambling fun" -> SPAM
  "money, money, money" -> SPAM
  "gambling for fun" -> SPAM
  "machine learning for fun, fun, fun" -> NONSPAM
  "free machine learning" -> NONSPAM
• New_DOC: "machine learning for free"
• To find:
  P(SPAM | New_DOC) ∝ P(SPAM) × P(New_DOC | SPAM)
  P(NONSPAM | New_DOC) ∝ P(NONSPAM) × P(New_DOC | NONSPAM)
• P(SPAM) = 3/5 = 0.6
• P(NONSPAM) = 2/5 = 0.4
• P(New_DOC | SPAM) = P("machine learning for free" | SPAM)
  = P("machine" | SPAM) × P("learning" | SPAM) × P("for" | SPAM) × P("free" | SPAM)
  Without correction: 0/12 × 0/12 × 2/12 × 2/12 = 0 (the zero-probability problem)
  With the Laplacian correction (12 words in the SPAM documents, 7 unique words overall):
  = (0+1)/(12+7) × (0+1)/(12+7) × (2+1)/(12+7) × (2+1)/(12+7)
  = 1/19 × 1/19 × 3/19 × 3/19 = 9/19⁴
• P(New_DOC | NONSPAM) = P("machine learning for free" | NONSPAM)
  = P("machine" | NONSPAM) × P("learning" | NONSPAM) × P("for" | NONSPAM) × P("free" | NONSPAM)
  Without correction: 2/9 × 2/9 × 1/9 × 1/9
  With the Laplacian correction (9 words in the NONSPAM documents, 7 unique words overall):
  = (2+1)/(9+7) × (2+1)/(9+7) × (1+1)/(9+7) × (1+1)/(9+7)
  = 3/16 × 3/16 × 2/16 × 2/16 = 36/16⁴
• P(Ci | X) ∝ P(X | Ci) × P(Ci)
  P(SPAM | New_DOC) ∝ P(SPAM) × P(New_DOC | SPAM) = 0.6 × 9/19⁴ ≈ 4.143 × 10⁻⁵
  P(NONSPAM | New_DOC) ∝ P(NONSPAM) × P(New_DOC | NONSPAM) = 0.4 × 36/16⁴ ≈ 21.97 × 10⁻⁵
• Therefore, applying the MAP rule, New_DOC belongs to the class NONSPAM
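• A minimal Python sketch of this multinomial naïve Bayes computation with add-one smoothing,
  reproducing the figures above (punctuation is stripped from the training documents before counting):

from collections import Counter

# Training documents from the slide (commas removed) and the document to classify
docs = [
    ("free money for free gambling fun", "SPAM"),
    ("money money money", "SPAM"),
    ("gambling for fun", "SPAM"),
    ("machine learning for fun fun fun", "NONSPAM"),
    ("free machine learning", "NONSPAM"),
]
new_doc = "machine learning for free"

vocab = {w for text, _ in docs for w in text.split()}      # 7 unique words
doc_counts = Counter(label for _, label in docs)           # SPAM: 3, NONSPAM: 2
word_counts = {c: Counter() for c in doc_counts}
for text, label in docs:
    word_counts[label].update(text.split())

scores = {}
for c in doc_counts:
    total_words = sum(word_counts[c].values())             # SPAM: 12, NONSPAM: 9
    score = doc_counts[c] / len(docs)                      # prior P(c)
    for w in new_doc.split():
        # Add-one (Laplacian) smoothed likelihood P(w | c)
        score *= (word_counts[c][w] + 1) / (total_words + len(vocab))
    scores[c] = score

print(scores)                                     # SPAM ~ 4.14e-05, NONSPAM ~ 2.20e-04
print("prediction:", max(scores, key=scores.get)) # -> NONSPAM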
Example
• Infer the class for the sentence "What is the price of this book", using Bayes' rule and the
  dataset given below.
Example
• Infer the Tag for the text "A very close game" using Bayes' rule and the dataset given below.
EXAMPLE
• Infer if one can play golf given the weather conditions "(Sunny, Hot, Normal, False)"
using Bayes' rule and the dataset given below.
ID Outlook Temperature Humidity Windy Play Golf
0 Rainy Hot High False No
1 Rainy Hot High True No
2 Overcast Hot High False Yes
3 Sunny Mild High False Yes
4 Sunny Cool Normal False Yes
5 Sunny Cool Normal True No
6 Overcast Cool Normal True Yes
7 Rainy Mild High False No
8 Rainy Cool Normal False Yes
9 Sunny Mild Normal False Yes
10 Rainy Mild Normal True Yes
11 Overcast Mild High True Yes
12 Overcast Hot Normal False Yes
13 Sunny Mild High True No
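• A compact Python sketch of this exercise, estimating the class-conditional frequencies
  directly from the 14 rows above and applying the naïve Bayes decision rule (no smoothing,
  following the basic formulation earlier in the module):

from collections import Counter, defaultdict

# (Outlook, Temperature, Humidity, Windy) -> Play Golf, copied from the table above
rows = [
    ("Rainy", "Hot", "High", False, "No"),       ("Rainy", "Hot", "High", True, "No"),
    ("Overcast", "Hot", "High", False, "Yes"),   ("Sunny", "Mild", "High", False, "Yes"),
    ("Sunny", "Cool", "Normal", False, "Yes"),   ("Sunny", "Cool", "Normal", True, "No"),
    ("Overcast", "Cool", "Normal", True, "Yes"), ("Rainy", "Mild", "High", False, "No"),
    ("Rainy", "Cool", "Normal", False, "Yes"),   ("Sunny", "Mild", "Normal", False, "Yes"),
    ("Rainy", "Mild", "Normal", True, "Yes"),    ("Overcast", "Mild", "High", True, "Yes"),
    ("Overcast", "Hot", "Normal", False, "Yes"), ("Sunny", "Mild", "High", True, "No"),
]

class_counts = Counter(r[-1] for r in rows)
attr_counts = defaultdict(Counter)            # attr_counts[class][(attribute index, value)]
for *features, label in rows:
    for i, v in enumerate(features):
        attr_counts[label][(i, v)] += 1

def predict(x):
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / len(rows)                          # prior P(c)
        for i, v in enumerate(x):
            score *= attr_counts[c][(i, v)] / n_c        # P(x_i | c)
        scores[c] = score
    return scores

print(predict(("Sunny", "Hot", "Normal", False)))        # Yes ~ 0.021, No ~ 0.0046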
EXAMPLE
• Infer if one can play golf given the weather conditions "<Outlook=sunny, Temperature=66,
Humidity=90, Windy=True>" using Bayes' rule and the dataset given below.
EXAMPLE
• Infer if one can play golf given the weather conditions "<Outlook=overcast, Temperature=66,
Humidity=90, Windy=True>" using Bayes' rule and the dataset given below.
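• Since the referenced table is not reproduced here, the sketch below only illustrates the
  mechanics for these two exercises: categorical attributes (Outlook, Windy) contribute
  frequency-based likelihoods, while the continuous ones (Temperature, Humidity) use the
  Gaussian density g(x, μ, σ) from earlier. Every number below is a placeholder that would
  have to be replaced by statistics estimated from the actual dataset:

import math

def gaussian(x, mu, sigma):
    # Gaussian density g(x, mu, sigma) from the earlier slide
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

# Placeholder class statistics (NOT taken from the dataset): categorical likelihoods as
# frequencies, continuous attributes as (mean, std. dev.) per class.
stats = {
    "Yes": {"prior": 0.64, "Outlook=sunny": 0.22, "Windy=True": 0.33,
            "Temperature": (73.0, 6.2), "Humidity": (79.0, 10.2)},
    "No":  {"prior": 0.36, "Outlook=sunny": 0.60, "Windy=True": 0.60,
            "Temperature": (74.6, 8.0), "Humidity": (86.2, 9.7)},
}

scores = {}
for c, s in stats.items():
    mu_t, sd_t = s["Temperature"]
    mu_h, sd_h = s["Humidity"]
    scores[c] = (s["prior"] * s["Outlook=sunny"] * s["Windy=True"]
                 * gaussian(66, mu_t, sd_t) * gaussian(90, mu_h, sd_h))

print(scores, "->", max(scores, key=scores.get))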
