Pattern Recognition - Lec02
The Bayes classifier is based on the assumption that information about the classes, in the
form of prior probabilities and class-conditional distributions of the patterns, is known.
The Bayes classifier employs the posterior probabilities to assign a class label to a
test pattern.
A pattern is assigned the class label that has the maximum posterior probability.
The classifier uses Bayes theorem to convert the prior probabilities into posterior
probabilities for the pattern to be classified, using the likelihood values.
$$P(A_k \mid A) = \frac{P(A_k)\,P(A \mid A_k)}{\sum_{j=1}^{n} P(A_j)\,P(A \mid A_j)}$$
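As a quick sketch, Bayes theorem can be written directly in Python; the two-class priors and likelihoods in the final check are hypothetical numbers used only for illustration.

```python
def posteriors(priors, likelihoods):
    """Bayes theorem: P(A_k | A) = P(A_k) P(A | A_k) / sum_j P(A_j) P(A | A_j)."""
    evidence = sum(p * l for p, l in zip(priors, likelihoods))   # P(A)
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

# Hypothetical two-class example: priors 0.3 / 0.7, likelihoods 0.9 / 0.2
print(posteriors([0.3, 0.7], [0.9, 0.2]))   # [0.658..., 0.341...]
```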
Example:
$$P(\text{road is wet} \mid \text{it has rained}) = \frac{P(X \mid H)\,P(H)}{P(X)} = \frac{0.9 \times 0.3}{0.3} = 0.9$$
The probability of error is 0.1, which is the probability that the road is not wet
given that it has rained.
As an example, consider classifying objects (pencil, pen, paper) into one of three classes (green, blue, red). For a pencil:
$$P(\text{green} \mid \text{pencil}) = \frac{P(\text{pencil} \mid \text{green})\,P(\text{green})}{P(\text{pencil} \mid \text{green})\,P(\text{green}) + P(\text{pencil} \mid \text{blue})\,P(\text{blue}) + P(\text{pencil} \mid \text{red})\,P(\text{red})}$$
Bayes Theorem
$$P(\text{green} \mid \text{pencil}) = \frac{\frac{1}{3} \times \frac{1}{2}}{\frac{1}{3} \times \frac{1}{2} + \frac{1}{2} \times \frac{1}{4} + \frac{1}{6} \times \frac{1}{4}} = \frac{1}{2}$$

$$P(\text{blue} \mid \text{pencil}) = \frac{\frac{1}{2} \times \frac{1}{4}}{\frac{1}{3} \times \frac{1}{2} + \frac{1}{2} \times \frac{1}{4} + \frac{1}{6} \times \frac{1}{4}} = \frac{3}{8}$$

$$P(\text{red} \mid \text{pencil}) = \frac{\frac{1}{6} \times \frac{1}{4}}{\frac{1}{3} \times \frac{1}{2} + \frac{1}{2} \times \frac{1}{4} + \frac{1}{6} \times \frac{1}{4}} = \frac{1}{8}$$
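The same computation can be scripted. The sketch below reads the factors in the order shown in the formula above, i.e. it assumes the likelihoods are P(pencil|green) = 1/3, P(pencil|blue) = 1/2, P(pencil|red) = 1/6 and the priors are P(green) = 1/2, P(blue) = 1/4, P(red) = 1/4.

```python
from fractions import Fraction as F

likelihood = {"green": F(1, 3), "blue": F(1, 2), "red": F(1, 6)}   # P(pencil | class), as read from the slide
prior      = {"green": F(1, 2), "blue": F(1, 4), "red": F(1, 4)}   # P(class)

evidence = sum(likelihood[c] * prior[c] for c in prior)            # P(pencil) = 1/3
posterior = {c: likelihood[c] * prior[c] / evidence for c in prior}

print(posterior)                         # green: 1/2, blue: 3/8, red: 1/8
best = max(posterior, key=posterior.get)
print(best, 1 - posterior[best])         # green, error probability 1/2
```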
We decide that the pencil is a member of class green because its posterior
probability is 0.5, which is greater than the posterior probabilities of the
other classes (blue and red).
$$P(\text{error} \mid \text{pencil}) = \frac{3}{8} + \frac{1}{8} = \frac{1}{2}$$
Similarly, we decide that the pen belongs to class green (the remaining posterior,
P(green | pen), is 2/3), and the corresponding probability of error is:
$$P(\text{error} \mid \text{pen}) = \frac{2}{9} + \frac{1}{9} = \frac{1}{3}$$
Finally, for paper, the posterior probabilities are:
$$P(\text{green} \mid \text{paper}) = \frac{2}{7}, \qquad P(\text{blue} \mid \text{paper}) = \frac{2}{7}, \qquad P(\text{red} \mid \text{paper}) = \frac{3}{7}$$
So paper is assigned to class red, and $P(\text{error} \mid \text{paper}) = \frac{2}{7} + \frac{2}{7} = \frac{4}{7}$. Averaging over the three objects, each assumed equally likely, the overall probability of error is:
$$P(\text{error}) = P(\text{error} \mid \text{pencil}) \times \frac{1}{3} + P(\text{error} \mid \text{pen}) \times \frac{1}{3} + P(\text{error} \mid \text{paper}) \times \frac{1}{3}$$
$$= \frac{1}{2} \times \frac{1}{3} + \frac{1}{3} \times \frac{1}{3} + \frac{4}{7} \times \frac{1}{3} = \frac{59}{126}$$
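A quick numeric check of this average, assuming (as above) that the three object types are equally likely:

```python
from fractions import Fraction as F

# Conditional error probabilities from the worked example above
errors = {"pencil": F(1, 2), "pen": F(1, 3), "paper": F(4, 7)}

# Each object type is weighted by its assumed prior of 1/3
total_error = sum(e * F(1, 3) for e in errors.values())
print(total_error)   # 59/126
```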
Under the conditional independence (naive Bayes) assumption, the class-conditional probability factorizes:
$$P(X_1, X_2, \ldots, X_n \mid C) = P(X_1 \mid X_2, \ldots, X_n; C)\,P(X_2, \ldots, X_n \mid C)$$
$$= P(X_1 \mid C)\,P(X_2, \ldots, X_n \mid C) = P(X_1 \mid C)\,P(X_2 \mid C) \cdots P(X_n \mid C)$$
A test pattern $(a_1, \ldots, a_n)$ is assigned the class $c^*$ for which
$$\left[\hat{P}(a_1 \mid c^*) \cdots \hat{P}(a_n \mid c^*)\right] \hat{P}(c^*) \ge \left[\hat{P}(a_1 \mid c) \cdots \hat{P}(a_n \mid c)\right] \hat{P}(c), \quad c \ne c^*, \; c \in \{c_1, \ldots, c_L\}$$
– Look-up tables of probabilities estimated from the training data:
P(Outlook=Sunny|Play=Yes) = 2/9        P(Outlook=Sunny|Play=No) = 3/5
P(Temperature=Cool|Play=Yes) = 3/9     P(Temperature=Cool|Play=No) = 1/5
P(Humidity=High|Play=Yes) = 3/9        P(Humidity=High|Play=No) = 4/5
P(Wind=Strong|Play=Yes) = 3/9          P(Wind=Strong|Play=No) = 3/5
P(Play=Yes) = 9/14                     P(Play=No) = 5/14
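Using these looked-up values, here is a minimal naive Bayes sketch. It assumes the unseen instance is x = (Outlook = Sunny, Temperature = Cool, Humidity = High, Wind = Strong), i.e. the instance to which the table entries above correspond.

```python
from fractions import Fraction as F

# Class-conditional probabilities looked up for the assumed instance
# x = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong)
cond = {
    "Yes": [F(2, 9), F(3, 9), F(3, 9), F(3, 9)],
    "No":  [F(3, 5), F(1, 5), F(4, 5), F(3, 5)],
}
prior = {"Yes": F(9, 14), "No": F(5, 14)}

score = {}
for c in prior:
    s = prior[c]
    for p in cond[c]:          # naive Bayes: multiply the per-attribute factors
        s *= p
    score[c] = s

print(score)                       # Yes: 1/189 ~ 0.0053, No: 18/875 ~ 0.0206
print(max(score, key=score.get))   # -> 'No'
```

Under these estimates the instance would therefore be classified as Play = No.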
Consider another training set with 20 tuples. Given this knowledge of the data and the classes,
we are to find the most likely classification for an unseen instance, for example:
Prior probabilities of the four classes: 14/20 = 0.70, 2/20 = 0.10, 3/20 = 0.15, 1/20 = 0.05
Case 3: Class = Very Late: 0.15 × 1.0 × 0.67 × 0.33 × 0.67 = 0.0222
In real-life situations, the attributes are not necessarily all categorical; in fact,
there is often a mix of categorical and continuous attributes.
$$P(x; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Here, the parameter $\mu_{ij}$ is estimated as the sample mean of attribute $A_j$ over the
training records that belong to class $C_i$ (and $\sigma_{ij}^2$ as the corresponding sample variance).
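A minimal sketch of this estimate for a single continuous attribute; the function name and the sample values are hypothetical and only illustrate the idea.

```python
import math

def gaussian_likelihood(x, values):
    """Estimate P(x | class) for one continuous attribute under a normal model.

    values -- observed values of attribute A_j in the training records of class C_i;
              mu_ij is their sample mean, sigma_ij^2 their sample variance.
    """
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / (len(values) - 1)
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# Hypothetical incomes (in K) of the training records of one class
print(gaussian_likelihood(55, [70, 60, 75, 85, 90]))   # estimated density at x = 55
```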
If an attribute value $a_{jk}$ never occurs together with class $c_i$ in the training data, its
estimated conditional probability $\hat{P}(a_{jk} \mid c_i)$ is zero, and hence the product
$P(x_1 \mid c_i) \cdots P(a_{jk} \mid c_i) \cdots P(x_n \mid c_i) = 0$ during testing.
If the class-conditional probability for even one attribute is zero, the overall
class-conditional probability (and hence the posterior) for that class vanishes.
In other words, if the training data do not cover many of the attribute values, then
we may not be able to classify some of the test records.
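A tiny illustration of the problem, with hypothetical probability estimates:

```python
# Hypothetical per-attribute estimates P(a_jk | c_i); the third attribute value
# never occurred with this class in training, so its estimate is 0.
factors = [0.4, 0.3, 0.0, 0.5]

product = 1.0
for p in factors:
    product *= p

print(product)   # 0.0 -- the class can never be selected, whatever the other factors are
```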
Figure: the general classification workflow. A model is learned from a labeled training set of
records (Tid, Attrib1, Attrib2, Attrib3, Class) and then applied to the unlabeled records of a
test set to predict their class labels.
Classification
▪ A number of classification techniques are known; they can be broadly
grouped into the following categories:
1. Statistical-Based Methods
➢ Regression
➢ Bayesian Classifier
2. Distance-Based Classification
➢ K-Nearest Neighbours
3. Decision Tree-Based Classification
➢ ID3, C4.5, CART
4. Classification using Support Vector Machines (SVM)
5. Classification using Artificial Neural Networks (ANN)
Simple Probability
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
$$P(A \cup B) = P(A) + P(B) \qquad \text{if } A \text{ and } B \text{ are mutually exclusive}$$
Suppose A and B are two events associated with a random experiment. The probability
of A under the condition that B has already occurred, with $P(B) \neq 0$, is given by
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$$
For $n$ events $A_1, A_2, \ldots, A_n$ that are all mutually independent,
$$P(A_1 \cap A_2 \cap \cdots \cap A_n) = P(A_1)\,P(A_2) \cdots P(A_n)$$
Note:
$P(A \mid B) = 0$ if $A$ and $B$ are mutually exclusive.
$P(A \mid B) = P(A)$ if $A$ and $B$ are independent.
Otherwise, $P(A \mid B) \cdot P(B) = P(B \mid A) \cdot P(A)$, since $P(A \cap B) = P(B \cap A)$.
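These rules can be verified numerically on a toy sample space; the two-dice events below are a hypothetical example, not from the lecture.

```python
from fractions import Fraction as F
from itertools import product

# Sample space: two fair dice. A = "first die shows 6", B = "the sum is 7".
omega = list(product(range(1, 7), repeat=2))
A = {w for w in omega if w[0] == 6}
B = {w for w in omega if w[0] + w[1] == 7}

P = lambda E: F(len(E), len(omega))

print(P(A | B), P(A) + P(B) - P(A & B))   # union rule: both equal 11/36
print(P(A & B) / P(B), P(A))              # P(A|B) = P(A) = 1/6, so A and B are independent
```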
Total Probability
If $E_1, E_2, \ldots, E_n$ are mutually exclusive and exhaustive events, then for any event $A$:
$$P(A) = P(E_1)\,P(A \mid E_1) + P(E_2)\,P(A \mid E_2) + \cdots + P(E_n)\,P(A \mid E_n)$$
$$P(E_i \mid A) = \frac{P(E_i)\,P(A \mid E_i)}{\sum_{j=1}^{n} P(E_j)\,P(A \mid E_j)}$$
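A small numeric sketch of both formulas, using a hypothetical three-event partition:

```python
from fractions import Fraction as F

# Hypothetical partition E_1, E_2, E_3 with priors P(E_i) and likelihoods P(A | E_i)
prior = [F(1, 2), F(1, 3), F(1, 6)]
cond  = [F(1, 10), F(1, 5), F(3, 10)]

P_A = sum(p * c for p, c in zip(prior, cond))             # total probability: 1/6
posterior = [p * c / P_A for p, c in zip(prior, cond)]    # Bayes: P(E_i | A)
print(P_A, posterior)                                     # 1/6, [3/10, 2/5, 3/10]
```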