Lecture 19 - Bayes
$$P(A \mid B) = \frac{P(B \mid A) \times P(A)}{P(B)}$$
– As we will see, we will be Bayesian about other things as well, e.g., the parameters of the model
Basics of Bayesian Learning
• 𝑃(ℎ) - the prior probability of a hypothesis ℎ
Reflects background knowledge before any data is observed; if there is no prior information, use a uniform distribution.
• 𝑃(𝐷) - the probability that this sample of the data is observed, without any knowledge of the hypothesis
• 𝑃(𝐷|ℎ): The probability of observing the sample 𝐷, given that
hypothesis ℎ is the target
• 𝑃(ℎ|𝐷): The posterior probability of ℎ. The probability that ℎ is
the target, given that 𝐷 has been observed.
Bayes Theorem
$$P(h \mid D) = \frac{P(D \mid h)\, P(h)}{P(D)}$$
• 𝑃(ℎ|𝐷) increases with 𝑃(ℎ) and with 𝑃(𝐷|ℎ)
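To make the update rule concrete, here is a minimal Python sketch of Bayes theorem over a small finite hypothesis space; the priors and likelihoods are made-up numbers, purely for illustration.

priors = {"h1": 0.5, "h2": 0.3, "h3": 0.2}        # P(h): before seeing data
likelihoods = {"h1": 0.1, "h2": 0.4, "h3": 0.5}   # P(D|h): how well h explains D

evidence = sum(priors[h] * likelihoods[h] for h in priors)  # P(D)
posteriors = {h: priors[h] * likelihoods[h] / evidence for h in priors}
print(posteriors)  # probability mass shifts toward hypotheses that explain D well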
Air-Traffic Data
Days Season Fog Rain Class
Weekday Spring None None On Time
Weekday Winter None Slight On Time
Weekday Winter None None On Time
Holiday Winter High Slight Late
Saturday Summer Normal None On Time
Weekday Autumn Normal None Very Late
Holiday Summer High Slight On Time
Sunday Summer Normal None On Time
Weekday Winter High Heavy Very Late
Weekday Summer None Slight On Time
Air-Traffic Data
Continued from the previous slide…
Days Season Fog Rain Class
Saturday Spring High Heavy Cancelled
Weekday Summer High Slight On Time
Weekday Winter Normal None Late
Weekday Summer High None On Time
Weekday Winter Normal Heavy Very Late
Saturday Autumn High Slight On Time
Weekday Autumn None Heavy On Time
Holiday Spring Normal Slight On Time
Weekday Spring Normal None On Time
Weekday Spring Normal Heavy On Time
Air-Traffic Data
In this database there are four attributes,
A = [Day, Season, Fog, Rain],
with 20 tuples.
The class categories are:
C = [On Time, Late, Very Late, Cancelled]
Given this knowledge of the data and classes, we want to find the most likely classification for an unseen instance, for example:
(Day = Weekday, Season = Winter, Fog = High, Rain = Heavy)
Conditional probabilities for the Day attribute:

Day        On Time       Late        Very Late   Cancelled
Weekday    9/14 = 0.64   1/2 = 0.5   3/3 = 1     0/1 = 0
Saturday   2/14 = 0.14   1/2 = 0.5   0/3 = 0     1/1 = 1

Conditional probabilities for the Fog attribute:

Fog        On Time       Late        Very Late   Cancelled
None       5/14 = 0.36   0/2 = 0     0/3 = 0     0/1 = 0
Naïve Bayesian Classifier
Instance: (Day = Weekday, Season = Winter, Fog = High, Rain = Heavy)
Case 1: Class = On Time : 0.70 × 0.64 × 0.14 × 0.29 × 0.14 ≈ 0.0026
Case 2: Class = Late : 0.10 × 0.5 × 1.0 × 0.5 × 0.0 = 0
Case 3: Class = Very Late : 0.15 × 1.0 × 0.67 × 0.33 × 0.67 = 0.0222
Case 4: Class = Cancelled : 0.05 × 0.0 × 0.0 × 1.0 × 1.0 = 0
Case 3 gives the largest score, so the instance is classified as Very Late.
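As a sanity check, here is a short Python sketch (not from the slides) that recomputes these case scores directly from the 20-row table by counting:

from collections import Counter

# (Day, Season, Fog, Rain, Class), transcribed from the table above
data = [
    ("Weekday", "Spring", "None", "None", "On Time"),
    ("Weekday", "Winter", "None", "Slight", "On Time"),
    ("Weekday", "Winter", "None", "None", "On Time"),
    ("Holiday", "Winter", "High", "Slight", "Late"),
    ("Saturday", "Summer", "Normal", "None", "On Time"),
    ("Weekday", "Autumn", "Normal", "None", "Very Late"),
    ("Holiday", "Summer", "High", "Slight", "On Time"),
    ("Sunday", "Summer", "Normal", "None", "On Time"),
    ("Weekday", "Winter", "High", "Heavy", "Very Late"),
    ("Weekday", "Summer", "None", "Slight", "On Time"),
    ("Saturday", "Spring", "High", "Heavy", "Cancelled"),
    ("Weekday", "Summer", "High", "Slight", "On Time"),
    ("Weekday", "Winter", "Normal", "None", "Late"),
    ("Weekday", "Summer", "High", "None", "On Time"),
    ("Weekday", "Winter", "Normal", "Heavy", "Very Late"),
    ("Saturday", "Autumn", "High", "Slight", "On Time"),
    ("Weekday", "Autumn", "None", "Heavy", "On Time"),
    ("Holiday", "Spring", "Normal", "Slight", "On Time"),
    ("Weekday", "Spring", "Normal", "None", "On Time"),
    ("Weekday", "Spring", "Normal", "Heavy", "On Time"),
]

class_counts = Counter(row[-1] for row in data)

def score(instance, cls):
    """P(cls) * prod_i P(x_i | cls), probabilities estimated by counting."""
    s = class_counts[cls] / len(data)
    for i, value in enumerate(instance):
        match = sum(1 for row in data if row[-1] == cls and row[i] == value)
        s *= match / class_counts[cls]
    return s

instance = ("Weekday", "Winter", "High", "Heavy")
for cls in class_counts:
    print(cls, round(score(instance, cls), 4))
# "Very Late" gets the highest score (~0.0222), matching Case 3 above.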
Naive Bayes

$$v_{MAP} = \arg\max_{v} P(x_1, x_2, \ldots, x_n \mid v)\, P(v)$$

By the chain rule:

$$P(x_1, x_2, \ldots, x_n \mid v_j) = P(x_1 \mid x_2, \ldots, x_n, v_j)\, P(x_2 \mid x_3, \ldots, x_n, v_j) \cdots P(x_n \mid v_j)$$

Assuming the feature values are conditionally independent given the target value (the naive Bayes assumption), this simplifies to:

$$P(x_1, x_2, \ldots, x_n \mid v_j) = \prod_{i=1}^{n} P(x_i \mid v_j)$$
Example
• Given: (Outlook = sunny; Temperature = cool; Humidity = high; Wind = strong)
• Predict: PlayTennis = ?
• Priors: P(PlayTennis = yes) = 9/14 = 0.64, P(PlayTennis = no) = 5/14 = 0.36
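The slide stops at the priors. The conditional probabilities needed to finish the example come from the standard PlayTennis training set (Mitchell's textbook data, which the counts 9/14 and 5/14 match); assuming that table, the computation completes as:

$$P(yes)\, P(sunny \mid yes)\, P(cool \mid yes)\, P(high \mid yes)\, P(strong \mid yes) = \frac{9}{14} \cdot \frac{2}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \cdot \frac{3}{9} \approx 0.0053$$

$$P(no)\, P(sunny \mid no)\, P(cool \mid no)\, P(high \mid no)\, P(strong \mid no) = \frac{5}{14} \cdot \frac{3}{5} \cdot \frac{1}{5} \cdot \frac{4}{5} \cdot \frac{3}{5} \approx 0.0206$$

so the classifier predicts PlayTennis = no.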
Naïve Bayes: Two Classes
• In the case of two classes, $v \in \{0, 1\}$, denote
$$p_i = P(x_i = 1 \mid v = 1), \qquad q_i = P(x_i = 1 \mid v = 0)$$
• We predict that $v = 1$ iff:
$$\frac{P(v = 1) \cdot \prod_{i=1}^{n} p_i^{x_i} (1 - p_i)^{1 - x_i}}{P(v = 0) \cdot \prod_{i=1}^{n} q_i^{x_i} (1 - q_i)^{1 - x_i}} > 1$$
• Equivalently, rewriting each factor as $(1 - p_i)\,(p_i / (1 - p_i))^{x_i}$:
$$\frac{P(v = 1) \cdot \prod_{i=1}^{n} (1 - p_i)\, \left( p_i / (1 - p_i) \right)^{x_i}}{P(v = 0) \cdot \prod_{i=1}^{n} (1 - q_i)\, \left( q_i / (1 - q_i) \right)^{x_i}} > 1$$
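Although not shown on the extracted slides, the usual next step is to take logarithms of the ratio above, which makes the naive Bayes decision rule linear in the features:

$$\log\frac{P(v = 1)}{P(v = 0)} + \sum_{i=1}^{n}\log\frac{1 - p_i}{1 - q_i} + \sum_{i=1}^{n}\left(\log\frac{p_i}{1 - p_i} - \log\frac{q_i}{1 - q_i}\right)x_i > 0$$

That is, a rule of the form $b + \sum_i w_i x_i > 0$: naive Bayes with Bernoulli features is a linear separator.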
Naïve Bayes: Continuous Features
• $X_i$ can be continuous
• We can still use
$$P(X_1, \ldots, X_n \mid Y) = \prod_i P(X_i \mid Y)$$
• And
$$P(Y = y \mid X_1, \ldots, X_n) = \frac{P(Y = y) \prod_i P(X_i \mid Y = y)}{\sum_j P(Y = y_j) \prod_i P(X_i \mid Y = y_j)}$$
• Naïve Bayes classifier:
$$Y = \arg\max_y P(Y = y) \prod_i P(X_i \mid Y = y)$$
• Assumption: $P(X_i \mid Y)$ has a Gaussian distribution
The Gaussian Probability Distribution
• The Gaussian probability distribution is also called the normal distribution.
• It is a continuous distribution with pdf:
$$p(x) = \frac{1}{\sigma \sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
• $\mu$ = mean of the distribution
• $\sigma^2$ = variance of the distribution
• $x$ is a continuous variable ($-\infty < x < \infty$)
• The probability of $x$ being in the range $[a, b]$ cannot be evaluated analytically; it has to be looked up in a table or computed numerically.
[Figure: plot of the Gaussian pdf p(x) against x]
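Since the slide notes that interval probabilities require a table, here is a small Python sketch (standard library only) of both the pdf and the interval probability via the error function, which is what such a table encodes:

import math

def gaussian_pdf(x, mu, sigma):
    """The pdf N(x; mu, sigma^2) from the slide above."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def gaussian_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma^2), using the CDF
    Phi(t) = (1 + erf((t - mu) / (sigma * sqrt(2)))) / 2."""
    z = lambda t: (t - mu) / (sigma * math.sqrt(2))
    return 0.5 * (math.erf(z(b)) - math.erf(z(a)))

print(gaussian_prob(-1.0, 1.0, 0.0, 1.0))  # ~0.6827: about 68% within one sigma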
Naïve Bayes: Continuous Features
• $P(X_i \mid Y)$ is Gaussian
• Training: estimate the mean and standard deviation:
$$\mu_i = E[X_i \mid Y = y], \qquad \sigma_i^2 = E[(X_i - \mu_i)^2 \mid Y = y]$$
Note that the following slides abuse notation significantly. Since $P(x) = 0$ for continuous distributions, we think of $P(X = x \mid Y = y)$ not as a classic probability distribution, but just as a function $f(x) = N(x; \mu, \sigma^2)$. $f(x)$ behaves like a probability distribution in the sense that $\forall x,\ f(x) \geq 0$ and the values integrate to 1. Also note that $f(x)$ satisfies Bayes rule, that is:
$$f_Y(y \mid X = x) = f_X(x \mid Y = y)\, f_Y(y) / f_X(x)$$
Naïve Bayes: Continuous Features
• $P(X_i \mid Y)$ is Gaussian
• Training: estimate the mean and standard deviation:
$$\mu_i = E[X_i \mid Y = y], \qquad \sigma_i^2 = E[(X_i - \mu_i)^2 \mid Y = y]$$

X1     X2    X3    Y
2      3     1     1
−1.2   2     0.4   1
1.2    0.3   0     0
2.2    1.1   0     1
Naïve Bayes: Continuous Features
• $P(X_i \mid Y)$ is Gaussian
• Training: estimate the mean and standard deviation from the table above, e.g. for $X_1$ given $Y = 1$:
$$\mu_1 = E[X_1 \mid Y = 1] = \frac{2 + (-1.2) + 2.2}{3} = 1$$
$$\sigma_1^2 = E[(X_1 - \mu_1)^2 \mid Y = 1] = \frac{(2 - 1)^2 + (-1.2 - 1)^2 + (2.2 - 1)^2}{3} = 2.43$$
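For completeness, a minimal Python sketch (not from the slides) that reproduces these maximum-likelihood estimates from the toy table:

# Toy training set from the slide: rows of (x1, x2, x3) with label y
X = [(2.0, 3.0, 1.0), (-1.2, 2.0, 0.4), (1.2, 0.3, 0.0), (2.2, 1.1, 0.0)]
y = [1, 1, 0, 1]

def fit_gaussian_nb(X, y):
    """Estimate class priors and per-feature (mean, variance) for each class."""
    params = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        prior = len(rows) / len(X)
        stats = []
        for i in range(len(X[0])):
            vals = [r[i] for r in rows]
            mu = sum(vals) / len(vals)
            var = sum((v - mu) ** 2 for v in vals) / len(vals)  # ML estimate, as on the slide
            stats.append((mu, var))
        params[label] = (prior, stats)
    return params

params = fit_gaussian_nb(X, y)
print(params[1][1][0])  # (mu, var) for X1 given Y=1 -> (1.0, 2.4266...)
# Note: class 0 has a single row, so its ML variances are 0;
# practical implementations smooth or floor the variance estimates.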