CPE412 Pattern Recognition (Week 5) - Updated


Week 5

Bayesian Decision Theory

Dr. Nehad Ramaha,


Computer Engineering Department
Karabük University
The class notes are compiled and edited from many sources. The instructor does not claim intellectual property or ownership of the lecture notes.
The Classification Problem
(informal definition)

Given a collection of annotated data (in this case, five instances of Katydids and five of Grasshoppers), decide what type of insect the unlabeled example is.

Katydid or Grasshopper?
For any domain of interest, we can measure features:
Color {Green, Brown, Gray, Other}, Has Wings?, Abdomen Length, Thorax Length, Antennae Length, Mandible Size, Spiracle Diameter, Leg Length.
[Scatter plot: Antenna Length (1-10) versus Abdomen Length (1-10) for the labeled Grasshoppers and Katydids]

Let’s get lots more data…


With a lot of data, we can build a histogram. Let
us just build one for “Antenna Length” for now…
[Histograms of Antenna Length (1-10) for Katydids and Grasshoppers]
We can leave the histograms as they are, or we can summarize them with two normal distributions.

Let us use two normal distributions for ease of visualization in the following slides…
• We want to classify an insect we have found. Its antennae are 3 units long.
How can we classify it?

• We can just ask ourselves: given the distributions of antennae lengths we have seen, is it more probable that our insect is a Grasshopper or a Katydid?
• There is a formal way to discuss the most probable classification…

p(cj | d) = probability of class cj, given that we have observed d

Antennae length is 3:
P(Grasshopper | 3) = 10 / (10 + 2) = 0.833
P(Katydid | 3) = 2 / (10 + 2) = 0.167

Antennae length is 7:
P(Grasshopper | 7) = 3 / (3 + 9) = 0.250
P(Katydid | 7) = 9 / (3 + 9) = 0.750

Antennae length is 5:
P(Grasshopper | 5) = 6 / (6 + 6) = 0.500
P(Katydid | 5) = 6 / (6 + 6) = 0.500
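The same idea in a minimal Python sketch (the counts are the ones read off the histograms above; everything else is illustrative):

# Class counts read off the two histograms at a few antenna lengths
# (values taken from the slides above).
histogram = {
    3: {"Grasshopper": 10, "Katydid": 2},
    7: {"Grasshopper": 3,  "Katydid": 9},
    5: {"Grasshopper": 6,  "Katydid": 6},
}

def classify(antenna_length):
    counts = histogram[antenna_length]
    total = sum(counts.values())
    # p(class | observation): the fraction of insects at this antenna
    # length that belong to each class; pick the largest.
    posteriors = {c: n / total for c, n in counts.items()}
    return max(posteriors, key=posteriors.get), posteriors

print(classify(3))  # Grasshopper is the more probable class (0.833 vs 0.167)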
That was a visual intuition for a simple case of the Bayes
classifier, also called:

• Idiot Bayes
• Naïve Bayes
• Simple Bayes

We are about to see some of the mathematical formalisms, and more examples, but keep in mind the basic idea: find out the probability of the previously unseen instance belonging to each class, then simply pick the most probable class.
Assume that we have two classes: c1 = male and c2 = female. (Note: "Drew" can be a male or a female name; think of Drew Carey and Drew Barrymore.)

We have a person whose sex we do not know, say "drew" or d. Classifying drew as male or female is equivalent to asking which is more probable: p(male | drew) or p(female | drew)?

By Bayes' rule:

p(male | drew) = p(drew | male) p(male) / p(drew)

where
p(drew | male) is the probability of being called "drew" given that you are a male,
p(male) is the probability of being a male, and
p(drew) is the probability of being named "drew" (actually irrelevant, since it is the same for all classes).
This is Officer Drew (who arrested me in 1997). Is Officer Drew a Male or a Female?

Luckily, we have a small database with names and sex. We can use it to apply Bayes' rule…

p(cj | d) = p(d | cj) p(cj) / p(d)

Name      Sex
Drew      Male
Claudia   Female
Drew      Female
Drew      Female
Alberto   Male
Karin     Female
Nina      Female
Sergio    Male
Using the table above:

p(male | drew) = p(drew | male) p(male) / p(drew)

Since p(drew) = 3/8 is the same for both classes, we only need to compare the numerators:

p(drew | male) p(male) = 1/3 * 3/8 = 0.125
p(drew | female) p(female) = 2/5 * 5/8 = 0.250

So Officer Drew is more likely to be a Female.

Officer Drew IS a female!
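The same numbers can be reproduced with a rough Python sketch over the table above (the data layout and names are illustrative):

data = [("Drew", "Male"), ("Claudia", "Female"), ("Drew", "Female"),
        ("Drew", "Female"), ("Alberto", "Male"), ("Karin", "Female"),
        ("Nina", "Female"), ("Sergio", "Male")]

def score(name, sex):
    in_class = [n for n, s in data if s == sex]
    # p(name | sex) * p(sex); p(name) is the same for every class,
    # so it can be dropped when we only compare classes.
    return (in_class.count(name) / len(in_class)) * (len(in_class) / len(data))

print(score("Drew", "Male"))    # 1/3 * 3/8 ≈ 0.125
print(score("Drew", "Female"))  # 2/5 * 5/8 = 0.250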
So far, we have only considered Bayes classification when we have one attribute (the "antennae length", or the "name"). But we may have many features. How do we use all the features?

p(cj | d) = p(d | cj) p(cj) / p(d)

Name      Over 170cm   Eye     Hair length   Sex
Drew      No           Blue    Short         Male
Claudia   Yes          Brown   Long          Female
Drew      No           Blue    Long          Female
Drew      No           Blue    Long          Female
Alberto   Yes          Brown   Short         Male
Karin     No           Blue    Long          Female
Nina      Yes          Brown   Short         Female
Sergio    Yes          Blue    Long          Male
 To simplify the task, naïve Bayesian classifiers assume attributes have independent distributions, and thereby estimate

p(d|cj) = p(d1|cj) * p(d2|cj) * … * p(dn|cj)

The probability of class cj generating instance d equals the probability of class cj generating the observed value for feature 1, multiplied by the probability of class cj generating the observed value for feature 2, multiplied by… and so on for every feature.
Applying this to Officer Drew, who is blue-eyed, over 170cm tall, and has long hair:

p(officer drew | cj) = p(over_170cm = yes | cj) * p(eye = blue | cj) * …

p(officer drew | Female) = 2/5 * 3/5 * …
p(officer drew | Male) = 2/3 * 2/3 * …
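A sketch of the full multi-feature version for Officer Drew (Python; the row layout and index choices are assumptions for illustration, the data is copied from the table above):

rows = [
    # (Name, Over 170cm, Eye, Hair length, Sex)
    ("Drew",    "No",  "Blue",  "Short", "Male"),
    ("Claudia", "Yes", "Brown", "Long",  "Female"),
    ("Drew",    "No",  "Blue",  "Long",  "Female"),
    ("Drew",    "No",  "Blue",  "Long",  "Female"),
    ("Alberto", "Yes", "Brown", "Short", "Male"),
    ("Karin",   "No",  "Blue",  "Long",  "Female"),
    ("Nina",    "Yes", "Brown", "Short", "Female"),
    ("Sergio",  "Yes", "Blue",  "Long",  "Male"),
]

def naive_bayes_score(observed, sex):
    in_class = [r for r in rows if r[4] == sex]
    prior = len(in_class) / len(rows)
    likelihood = 1.0
    # Independence assumption: multiply one per-feature probability per attribute.
    for index, value in observed.items():
        likelihood *= sum(r[index] == value for r in in_class) / len(in_class)
    return prior * likelihood

officer_drew = {1: "Yes", 2: "Blue", 3: "Long"}   # over 170cm, blue eyes, long hair
print(naive_bayes_score(officer_drew, "Female"))  # 5/8 * 2/5 * 3/5 * 4/5 = 0.12
print(naive_bayes_score(officer_drew, "Male"))    # 3/8 * 2/3 * 2/3 * 1/3 ≈ 0.056

Either way, the Female score is larger, matching the conclusion above.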
The Naive Bayes classifier is often represented as this type of graph…

[Graph: a class node cj with arrows pointing to the feature nodes p(d1|cj), p(d2|cj), …, p(dn|cj)]

Note the direction of the arrows, which state that each class causes certain features, with a certain probability.
Naïve Bayes is fast and space efficient.

We can look up all the probabilities with a single scan of the database and store them in a (small) table…

Sex      Over 190cm = Yes   Over 190cm = No
Male     0.15               0.85
Female   0.01               0.99

Sex      Long Hair = Yes    Long Hair = No
Male     0.05               0.95
Female   0.70               0.30
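A minimal sketch of that single scan (Python; the record format and feature names are assumptions for illustration):

from collections import defaultdict

def build_tables(records):
    # One pass over the data: count classes and, per class, every
    # (feature, value) pair we see.
    class_counts = defaultdict(int)
    pair_counts = defaultdict(int)
    for features, label in records:
        class_counts[label] += 1
        for feature, value in features.items():
            pair_counts[(label, feature, value)] += 1
    # Turn counts into the conditional probabilities p(feature = value | class).
    tables = {key: count / class_counts[key[0]] for key, count in pair_counts.items()}
    return dict(class_counts), tables

records = [({"Over 190cm": "Yes", "Long Hair": "No"}, "Male"),
           ({"Over 190cm": "No",  "Long Hair": "Yes"}, "Female")]
print(build_tables(records))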
An obvious point: I have used a simple two-class problem, and two possible values for each feature, in my previous examples. However, we can have an arbitrary number of classes, or feature values.

Animal   Mass >10kg = Yes   Mass >10kg = No
Cat      0.15               0.85
Dog      0.91               0.09
Pig      0.99               0.01

Animal   Color = Black   Color = White   Color = Brown
Cat      0.33            0.23            0.44
Dog      0.97            0.03            0.90
Pig      0.04            0.01            0.95
Advantages/Disadvantages of Naïve Bayes
• Advantages:
– Fast to train (single scan). Fast to classify
– Not sensitive to irrelevant features
– Handles real and discrete data
– Handles streaming data well
• Disadvantages:
– Assumes independence of features:
Relationships between variables cannot be modeled
because operations are performed assuming the
features are independent of each other.
Typical applications of Naive Bayes include:
 Real-Time Systems
 Multiple Classification Problems (News / E-Commerce Categories)
 Text Classification (Spam Filtering / Sentiment Analysis)
 Disease Diagnosis
 Recommendation Systems

The main types of Naive Bayes classifiers are:
 Multinomial Naive Bayes:
◦ This is mostly used for document classification problems, i.e. whether a document belongs to the category of sports, politics, technology, etc. The features/predictors used by the classifier are the frequencies of the words present in the document.
 Bernoulli Naive Bayes:
◦ This is similar to multinomial naive Bayes, but the predictors are Boolean variables. The parameters we use to predict the class variable take only the values yes or no, for example whether a word occurs in the text or not.
 Gaussian Naive Bayes:
◦ When the predictors take continuous values and are not discrete, we assume that these values are sampled from a Gaussian distribution.
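As a quick, non-authoritative illustration of the three variants using scikit-learn (assuming it is installed; the toy arrays below are made up):

import numpy as np
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB

y = np.array([0, 0, 1, 1])

# Gaussian NB: continuous features, assumed normally distributed within each class.
X_cont = np.array([[1.2, 3.4], [1.0, 3.1], [5.6, 0.2], [6.1, 0.4]])
print(GaussianNB().fit(X_cont, y).predict([[1.1, 3.3]]))

# Multinomial NB: features are word counts (bag-of-words frequencies).
X_counts = np.array([[3, 0, 1], [2, 0, 0], [0, 4, 1], [0, 3, 2]])
print(MultinomialNB().fit(X_counts, y).predict([[1, 0, 0]]))

# Bernoulli NB: features are binary (word present or absent).
X_binary = (X_counts > 0).astype(int)
print(BernoulliNB().fit(X_binary, y).predict([[1, 0, 0]]))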

Some practical points to keep in mind:
 If your continuous features are not normally distributed, you should convert them to a normal distribution using appropriate methods or transformations.
 If there is a zero-frequency situation in your data set (a feature value that never occurs with some class), apply a smoothing technique such as Laplace (add-one) smoothing so the estimated probability does not collapse to zero, as sketched below.
 If you have two features that are very similar to each other and highly correlated, it is recommended to remove one of them. Otherwise that information is effectively counted (voted) twice and will seem overly important.
 There are not many parameters in the Naive Bayes algorithm that you can tune to improve the model. If you are going to use Naive Bayes, you need to do the data pre-processing, especially feature selection, very well.
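For instance, a small sketch of the zero-frequency problem and the usual add-one (Laplace) fix (the counts are illustrative):

def conditional_prob(value_count, class_count, n_distinct_values, alpha=1):
    # Without smoothing (alpha = 0), a value never seen in this class gets
    # probability 0 and wipes out the whole product of likelihoods.
    return (value_count + alpha) / (class_count + alpha * n_distinct_values)

print(conditional_prob(0, 10, 3, alpha=0))  # 0.0  -> the zero-frequency problem
print(conditional_prob(0, 10, 3, alpha=1))  # ~0.077 -> smoothed, non-zero estimate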

 Suppose we have a dataset of weather conditions and a corresponding target variable "Play". Using this dataset, we need to decide whether we should play or not on a particular day according to the weather conditions.
 To solve this problem, we need to follow the steps below:
◦ Convert the given dataset into frequency tables.
◦ Generate a likelihood table by finding the probabilities of the given features.
◦ Now, use Bayes' theorem to calculate the posterior probability.

 Problem: If the weather is sunny, should the Player play or not?
 Solution: To solve this, first consider the dataset below:

     Outlook     Play
0    Rainy       Yes
1    Sunny       Yes
2    Overcast    Yes
3    Overcast    Yes
4    Sunny       No
5    Rainy       Yes
6    Sunny       Yes
7    Overcast    Yes
8    Rainy       No
9    Sunny       No
10   Sunny       Yes
11   Rainy       No
12   Overcast    Yes
13   Overcast    Yes
 Frequency table for the weather conditions:

Weather     Yes   No
Overcast    5     0
Rainy       2     2
Sunny       3     2
Total       10    4
 Likelihood table for the weather conditions:

Weather     No            Yes            P(Weather)
Overcast    0             5              5/14 = 0.35
Rainy       2             2              4/14 = 0.29
Sunny       2             3              5/14 = 0.35
All         4/14 = 0.29   10/14 = 0.71
 Applying Bayes’ theorem:
 P(Yes|Sunny)= P(Sunny|Yes)*P(Yes)/P(Sunny)
 P(Sunny|Yes)= 3/10= 0.3
 P(Sunny)= 0.35
 P(Yes)=0.71
 So P(Yes|Sunny) = 0.3*0.71/0.35= 0.60

 Applying Bayes’ theorem:
 P(No|Sunny)= P(Sunny|No)*P(No)/P(Sunny)
 P(Sunny|No) = 2/4 = 0.5
 P(No)= 0.29
 P(Sunny)= 0.35
 So P(No|Sunny)= 0.5*0.29/0.35 = 0.41

 Applying Bayes’ theorem:
 P(Yes|Sunny) = 0.60
 P(No|Sunny)= 0.41
 As we can see from the above calculations, P(Yes|Sunny) > P(No|Sunny).
 Hence, on a sunny day, the Player can play the game.
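The same result in a short Python sketch (the lists are copied from the dataset above):

outlook = ["Rainy", "Sunny", "Overcast", "Overcast", "Sunny", "Rainy", "Sunny",
           "Overcast", "Rainy", "Sunny", "Sunny", "Rainy", "Overcast", "Overcast"]
play = ["Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes",
        "Yes", "No", "No", "Yes", "No", "Yes", "Yes"]

def posterior(weather, label):
    p_label = play.count(label) / len(play)            # e.g. P(Yes) = 10/14
    p_weather = outlook.count(weather) / len(outlook)  # e.g. P(Sunny) = 5/14
    p_weather_given_label = sum(
        o == weather and c == label for o, c in zip(outlook, play)
    ) / play.count(label)                              # e.g. P(Sunny|Yes) = 3/10
    return p_weather_given_label * p_label / p_weather

print(posterior("Sunny", "Yes"))  # 0.6
print(posterior("Sunny", "No"))   # 0.4 (the slide's 0.41 comes from rounding)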

A worked example for text classification: deciding whether a document/topic belongs to a particular category. The features/predictors used by the classifier are the frequencies of the words found in the document.

 P(C) = 3/4 = 0.75 (the ratio of rows in category C to all rows in the training data)
 P(J) = 1/4 = 0.25 (the ratio of rows in the Japan category to all rows in the training data)
 P(X | Y) = (number of occurrences of the word "X" in the rows of category Y + 1) / (number of all words in the rows of category Y + number of distinct words in the training data)

If we did not add 1, the estimate would be zero for any word that never appears in a category, and the whole product would collapse to zero (this is Laplace / add-one smoothing).

P(C | Test) = P(C) * P(Chinese | C) * P(Chinese | C) * P(Chinese | C) * P(Tokyo | C) * P(Japan | C)

P(C | Test) = 0.75 * 0.428 * 0.428 * 0.428 * 0.071 * 0.071 ≈ 0.0003

P(J | Test) = P(J) * P(Chinese | J) * P(Chinese | J) * P(Chinese | J) * P(Tokyo | J) * P(Japan | J)

P(J | Test) = 0.25 * 0.222 * 0.222 * 0.222 * 0.222 * 0.222 ≈ 0.0001

Since 0.0003 > 0.0001, the test document is assigned to category C.
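A sketch of this calculation in Python. The four training documents below are an assumption: they are not present in the extracted text, but they match the counts the slide uses (8 words in category C, 3 in category J, 6 distinct words), so treat them as illustrative only:

training = [
    ("Chinese Beijing Chinese", "C"),    # assumed training data, not from the text
    ("Chinese Chinese Shanghai", "C"),
    ("Chinese Macao", "C"),
    ("Tokyo Japan Chinese", "J"),
]
test = "Chinese Chinese Chinese Tokyo Japan"

vocab = {w for doc, _ in training for w in doc.split()}  # 6 distinct words

def score(label):
    docs = [doc for doc, lab in training if lab == label]
    words = " ".join(docs).split()
    s = len(docs) / len(training)  # the prior: P(C) = 3/4, P(J) = 1/4
    for w in test.split():
        # Add-one (Laplace) smoothing, i.e. the P(X|Y) formula above.
        s *= (words.count(w) + 1) / (len(words) + len(vocab))
    return s

print(score("C"))  # ≈ 0.0003
print(score("J"))  # ≈ 0.0001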
