Tutorial 4
Mengye Ren
[email protected]
Mengye Ren Naive Bayes and Gaussian Bayes Classifier October 18, 2015 1 / 21
Naive Bayes
Bayes' rule:
\[ p(t \mid x) = \frac{p(x \mid t)\, p(t)}{p(x)} \]
Naive Bayes assumption:
\[ p(x \mid t) = \prod_{j=1}^{D} p(x_j \mid t) \]
Likelihood function:
\[ L(\theta) = p(x, t \mid \theta) = p(x \mid t, \theta)\, p(t \mid \theta) \]
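Bayes' rule combined with the naive Bayes assumption can be sketched in a few lines of Python. This is a minimal illustration for binary features; the function name `naive_bayes_posterior`, the class names, and all numbers are hypothetical and not taken from the slides.

```python
import math

def naive_bayes_posterior(x, priors, likelihoods):
    """Return p(t | x) for each class t under the naive Bayes assumption.

    priors[t] is p(t); likelihoods[t][j] is p(x_j = 1 | t) for binary x_j.
    """
    joint = {}
    for t, prior in priors.items():
        # p(x | t) = prod_j p(x_j | t) by conditional independence
        p_x_given_t = math.prod(
            p if xj == 1 else 1.0 - p
            for xj, p in zip(x, likelihoods[t])
        )
        joint[t] = p_x_given_t * prior      # numerator p(x | t) p(t)
    evidence = sum(joint.values())          # p(x) = sum_t p(x | t) p(t)
    return {t: v / evidence for t, v in joint.items()}

# Hypothetical numbers: two binary features, classes "spam" and "ham".
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {"spam": [0.8, 0.5], "ham": [0.1, 0.5]}
posterior = naive_bayes_posterior([1, 0], priors, likelihoods)
```

The evidence p(x) is computed by summing the numerators over classes, so the returned values always sum to one.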
Example: Spam Classification
Bernoulli Naive Bayes
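Assuming all data points $x^{(i)}$ are i.i.d. samples, and $p(x_j \mid t)$ follows a
Bernoulli distribution with parameter $\mu_{jt}$:
\[ p(x^{(i)} \mid t^{(i)}) = \prod_{j=1}^{D} \mu_{j t^{(i)}}^{x_j^{(i)}} \left(1 - \mu_{j t^{(i)}}\right)^{\left(1 - x_j^{(i)}\right)} \]
\[ p(t \mid x) \propto \prod_{i=1}^{N} p(t^{(i)})\, p(x^{(i)} \mid t^{(i)}) = \prod_{i=1}^{N} p(t^{(i)}) \prod_{j=1}^{D} \mu_{j t^{(i)}}^{x_j^{(i)}} \left(1 - \mu_{j t^{(i)}}\right)^{\left(1 - x_j^{(i)}\right)} \]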
Derivation of maximum likelihood estimator (MLE)
$\theta = [\mu, \pi]$
Want: $\arg\max_{\theta} \log L(\theta)$ subject to $\sum_k \pi_k = 1$
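For reference, the objective can be written out explicitly. The expression below is a sketch obtained by taking the log of the Bernoulli model, under the assumption that $\pi_k$ denotes the class prior $p(t = k)$:

\[ \log L(\theta) = \sum_{i=1}^{N} \left[ \log \pi_{t^{(i)}} + \sum_{j=1}^{D} \left( x_j^{(i)} \log \mu_{j t^{(i)}} + \left(1 - x_j^{(i)}\right) \log\left(1 - \mu_{j t^{(i)}}\right) \right) \right] \]

Only the terms with $t^{(i)} = k$ depend on $\mu_{jk}$ and $\pi_k$, which is why indicator functions appear in the derivatives.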
Derivation of maximum likelihood estimator (MLE)
Take derivative w.r.t. µ
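\[ \frac{\partial \log L(\theta)}{\partial \mu_{jk}} = 0 \;\Rightarrow\; \sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k] \left[ \frac{x_j^{(i)}}{\mu_{jk}} - \frac{1 - x_j^{(i)}}{1 - \mu_{jk}} \right] = 0 \]
\[ \sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k] \left[ x_j^{(i)} (1 - \mu_{jk}) - \left(1 - x_j^{(i)}\right) \mu_{jk} \right] = 0 \]
\[ \sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]\, \mu_{jk} = \sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]\, x_j^{(i)} \]
\[ \mu_{jk} = \frac{\sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]\, x_j^{(i)}}{\sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]} \]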
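The estimator for $\mu_{jk}$ is the fraction of class-$k$ examples in which feature $j$ is on. A minimal NumPy sketch follows; the function name `bernoulli_mle_mu` and the toy data are hypothetical, and the code assumes every class has at least one training example.

```python
import numpy as np

def bernoulli_mle_mu(X, t, num_classes):
    """MLE of mu[j, k] = p(x_j = 1 | t = k) from binary data.

    X: (N, D) array of 0/1 features; t: (N,) array of integer labels.
    """
    N, D = X.shape
    mu = np.zeros((D, num_classes))
    for k in range(num_classes):
        mask = (t == k)                     # indicator 1[t^(i) = k]
        # sum_i 1[t^(i) = k] x_j^(i) / sum_i 1[t^(i) = k]
        mu[:, k] = X[mask].sum(axis=0) / mask.sum()
    return mu

# Tiny hypothetical data set: 4 examples, 2 features, 2 classes.
X = np.array([[1, 0], [1, 1], [0, 0], [0, 1]])
t = np.array([0, 0, 1, 1])
mu = bernoulli_mle_mu(X, t, num_classes=2)
```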
Derivation of maximum likelihood estimator (MLE)
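Use a Lagrange multiplier $\lambda$ for the constraint on $\pi$ and set the derivative w.r.t. $\pi_k$ to zero:
\[ \pi_k = -\frac{\sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]}{\lambda} \]
Apply constraint: $\sum_k \pi_k = 1 \Rightarrow \lambda = -N$
\[ \pi_k = \frac{\sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]}{N} \]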
Spam Classification Demo
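A spam classifier built from the MLE formulas for $\pi_k$ and $\mu_{jk}$ might look like the sketch below. It is not the demo code from the tutorial: the function names, the three-word vocabulary, and the training data are hypothetical, and the clipping of $\mu$ is an added safeguard against $\log 0$ rather than part of the MLE.

```python
import numpy as np

def fit_bernoulli_nb(X, t, num_classes, eps=1e-9):
    """Return (log_pi, mu) using the MLE formulas for pi_k and mu_jk."""
    counts = np.array([(t == k).sum() for k in range(num_classes)])
    log_pi = np.log(counts / len(t))                       # pi_k = N_k / N
    mu = np.stack([X[t == k].mean(axis=0) for k in range(num_classes)], axis=1)
    # Clipping is an added safeguard (not part of the MLE) to avoid log(0).
    return log_pi, np.clip(mu, eps, 1.0 - eps)

def predict_bernoulli_nb(X, log_pi, mu):
    """Pick argmax_k of log pi_k + sum_j log p(x_j | t = k)."""
    log_lik = X @ np.log(mu) + (1 - X) @ np.log(1.0 - mu)  # shape (N, K)
    return np.argmax(log_lik + log_pi, axis=1)

# Hypothetical vocabulary: columns are ["free", "winner", "meeting"].
X_train = np.array([[1, 1, 0], [1, 0, 0], [0, 1, 0],
                    [0, 0, 1], [0, 0, 1], [1, 0, 1]])
t_train = np.array([1, 1, 1, 0, 0, 0])   # 1 = spam, 0 = not spam
log_pi, mu = fit_bernoulli_nb(X_train, t_train, num_classes=2)
pred = predict_bernoulli_nb(np.array([[1, 1, 0], [0, 0, 1]]), log_pi, mu)
```

Working in log space turns the product over features into a sum, which avoids numerical underflow when the number of features is large.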
Gaussian Bayes Classifier
Derivation of maximum likelihood estimator (MLE)
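\[ \theta = [\mu, \Sigma, \pi], \qquad Z = \sqrt{(2\pi)^D \det(\Sigma)} \]
\[ p(x \mid t) = \frac{1}{Z} \exp\left( -\frac{1}{2} (x - \mu)^T \Sigma^{-1} (x - \mu) \right) \]
With class-specific parameters $\mu_k$, $\Sigma_k$ and normalizer $Z_k$, the log-likelihood is
\[ \log L(\theta) = \sum_{i=1}^{N} \left[ \log \pi_{t^{(i)}} - \log Z_{t^{(i)}} - \frac{1}{2} \left( x^{(i)} - \mu_{t^{(i)}} \right)^T \Sigma_{t^{(i)}}^{-1} \left( x^{(i)} - \mu_{t^{(i)}} \right) \right] \]
Want: $\arg\max_{\theta} \log L(\theta)$ subject to $\sum_k \pi_k = 1$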
Derivation of maximum likelihood estimator (MLE)
\[ \mu_k = \frac{\sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]\, x^{(i)}}{\sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]} \]
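The intermediate step behind this estimator can be sketched as follows, assuming $\Sigma_k$ is symmetric and invertible. Only the quadratic term of the Gaussian log-likelihood depends on $\mu_k$, so

\[ \frac{\partial \log L}{\partial \mu_k} = \sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]\, \Sigma_k^{-1} \left( x^{(i)} - \mu_k \right) = 0 \;\Rightarrow\; \sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]\, x^{(i)} = \mu_k \sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k] \]

Multiplying through by $\Sigma_k$ removes the covariance, so the mean estimate does not depend on $\Sigma_k$.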
Derivation of maximum likelihood estimator (MLE)
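\[ \frac{\partial\, x^T A x}{\partial A} = x x^T, \qquad \Sigma^T = \Sigma \]
\[ \frac{\partial \log L}{\partial \Sigma_k^{-1}} = \sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k] \left[ -\frac{\partial \log Z_k}{\partial \Sigma_k^{-1}} - \frac{1}{2} \left( x^{(i)} - \mu_k \right) \left( x^{(i)} - \mu_k \right)^T \right] = 0 \]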
Derivation of maximum likelihood estimator (MLE)
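\[ Z_k = \sqrt{(2\pi)^D \det(\Sigma_k)} \]
\[ \frac{\partial \log Z_k}{\partial \Sigma_k^{-1}} = \frac{1}{Z_k} \frac{\partial Z_k}{\partial \Sigma_k^{-1}} = (2\pi)^{-\frac{D}{2}} \det(\Sigma_k)^{-\frac{1}{2}}\, (2\pi)^{\frac{D}{2}}\, \frac{\partial \det(\Sigma_k^{-1})^{-\frac{1}{2}}}{\partial \Sigma_k^{-1}} \]
\[ = \det(\Sigma_k^{-1})^{\frac{1}{2}} \left( -\frac{1}{2} \right) \det(\Sigma_k^{-1})^{-\frac{3}{2}} \det(\Sigma_k^{-1})\, \Sigma_k^T = -\frac{1}{2} \Sigma_k \]
\[ \frac{\partial \log L}{\partial \Sigma_k^{-1}} = \sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k] \left[ \frac{1}{2} \Sigma_k - \frac{1}{2} \left( x^{(i)} - \mu_k \right) \left( x^{(i)} - \mu_k \right)^T \right] = 0 \]
\[ \Sigma_k = \frac{\sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k] \left( x^{(i)} - \mu_k \right) \left( x^{(i)} - \mu_k \right)^T}{\sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]} \]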
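The covariance estimator is the average outer product of the centred class-$k$ points, normalized by the class count $N_k$ rather than $N_k - 1$. The NumPy sketch below checks this against `np.cov` with `bias=True`; the function name `gaussian_mle` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def gaussian_mle(X, t, k):
    """MLE mean and covariance for class k (divides by N_k, not N_k - 1)."""
    Xk = X[t == k]                       # rows with 1[t^(i) = k]
    mu_k = Xk.mean(axis=0)
    diff = Xk - mu_k
    sigma_k = diff.T @ diff / len(Xk)    # sum of outer products / N_k
    return mu_k, sigma_k

rng = np.random.default_rng(0)           # synthetic data, for illustration only
X = rng.normal(size=(50, 3))
t = rng.integers(0, 2, size=50)
mu_1, sigma_1 = gaussian_mle(X, t, k=1)

# np.cov with bias=True uses the same 1/N_k normalization as the MLE.
reference = np.cov(X[t == 1], rowvar=False, bias=True)
```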
Derivation of maximum likelihood estimator (MLE)
\[ \pi_k = \frac{\sum_{i=1}^{N} \mathbb{1}[t^{(i)} = k]}{N} \]
(Same as Bernoulli)
Gaussian Bayes Classifier Demo
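A Gaussian Bayes classifier that plugs in the MLE estimates of $\pi_k$, $\mu_k$ and $\Sigma_k$ could be sketched as below. This is not the tutorial's demo code: the function names and the synthetic two-class data are hypothetical, and each class is assumed to have enough points for $\Sigma_k$ to be invertible.

```python
import numpy as np

def fit_gaussian_bayes(X, t, num_classes):
    """MLE of (pi_k, mu_k, Sigma_k) for each class k."""
    params = []
    for k in range(num_classes):
        Xk = X[t == k]
        mu = Xk.mean(axis=0)
        sigma = (Xk - mu).T @ (Xk - mu) / len(Xk)
        params.append((len(Xk) / len(X), mu, sigma))
    return params

def log_joint(X, pi, mu, sigma):
    """log pi_k - log Z_k - 0.5 (x - mu_k)^T Sigma_k^{-1} (x - mu_k), per row."""
    D = X.shape[1]
    diff = X - mu
    maha = np.einsum("nd,nd->n", diff @ np.linalg.inv(sigma), diff)
    _, logdet = np.linalg.slogdet(sigma)
    log_Z = 0.5 * (D * np.log(2 * np.pi) + logdet)
    return np.log(pi) - log_Z - 0.5 * maha

def predict(X, params):
    """Assign each row to the class with the largest log joint probability."""
    scores = np.stack([log_joint(X, *p) for p in params], axis=1)
    return np.argmax(scores, axis=1)

# Synthetic two-class data (an assumption for illustration).
rng = np.random.default_rng(0)
X0 = rng.normal(loc=[-2.0, 0.0], scale=1.0, size=(100, 2))
X1 = rng.normal(loc=[2.0, 0.0], scale=1.0, size=(100, 2))
X = np.vstack([X0, X1])
t = np.array([0] * 100 + [1] * 100)
params = fit_gaussian_bayes(X, t, num_classes=2)
accuracy = (predict(X, params) == t).mean()
```

Comparing log joint probabilities is enough for classification because the evidence $p(x)$ is the same for every class.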
Gaussian Bayes Classifier
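With a diagonal covariance matrix $\Sigma_t$, the class-conditional density factorizes over dimensions:
\[ p(x \mid t) = \prod_{j=1}^{D} \frac{1}{\sqrt{2\pi\, \Sigma_{t,jj}}} \exp\left( -\frac{1}{2 \Sigma_{t,jj}} \left\| x_j - \mu_{jt} \right\|_2^2 \right) = \prod_{j=1}^{D} p(x_j \mid t) \]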
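The factorization can be checked numerically: a multivariate Gaussian with a diagonal covariance should equal the product of one-dimensional Gaussians whose variances are the diagonal entries. The sketch below uses hypothetical values and function names chosen for illustration.

```python
import numpy as np

def gaussian_pdf(x, mu, sigma):
    """Full multivariate Gaussian density N(x | mu, Sigma)."""
    D = len(mu)
    diff = x - mu
    Z = np.sqrt((2 * np.pi) ** D * np.linalg.det(sigma))
    return np.exp(-0.5 * diff @ np.linalg.solve(sigma, diff)) / Z

def univariate_pdf(xj, mu_j, var_j):
    """One-dimensional Gaussian density with variance var_j."""
    return np.exp(-0.5 * (xj - mu_j) ** 2 / var_j) / np.sqrt(2 * np.pi * var_j)

# Hypothetical values; the covariance is diagonal by construction.
x = np.array([0.3, -1.2, 2.0])
mu = np.array([0.0, -1.0, 1.5])
variances = np.array([0.5, 2.0, 1.0])
full = gaussian_pdf(x, mu, np.diag(variances))
product = np.prod(univariate_pdf(x, mu, variances))
```

Each one-dimensional factor carries its own $\sqrt{2\pi\,\Sigma_{t,jj}}$ normalizer, and the $D$ factors multiply to the full normalizer $\sqrt{(2\pi)^D \det(\Sigma_t)}$.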
Gaussian Bayes Classifier
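Shared covariance across classes: $p(x \mid t) = \mathcal{N}(x \mid \mu_t, \Sigma)$
Class-specific covariance: $p(x \mid t) = \mathcal{N}(x \mid \mu_t, \Sigma_t)$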
Gaussian Bayes Binary Classifier Decision Boundary
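With a shared covariance $\Sigma$, the boundary is where the two joint probabilities are equal:
\[ p(x, t = 1) = p(x, t = 0) \]
\[ \log \pi_1 - \frac{1}{2} (x - \mu_1)^T \Sigma^{-1} (x - \mu_1) = \log \pi_0 - \frac{1}{2} (x - \mu_0)^T \Sigma^{-1} (x - \mu_0) \]
Multiplying by $-2$ and writing $C = 2 \log(\pi_0 / \pi_1)$:
\[ C + x^T \Sigma^{-1} x - 2 \mu_1^T \Sigma^{-1} x + \mu_1^T \Sigma^{-1} \mu_1 = x^T \Sigma^{-1} x - 2 \mu_0^T \Sigma^{-1} x + \mu_0^T \Sigma^{-1} \mu_0 \]
\[ \left[ 2 (\mu_0 - \mu_1)^T \Sigma^{-1} \right] x - (\mu_0 - \mu_1)^T \Sigma^{-1} (\mu_0 + \mu_1) + C = 0 \]
\[ \Rightarrow a^T x - b = 0, \qquad a = 2 \Sigma^{-1} (\mu_0 - \mu_1), \quad b = (\mu_0 - \mu_1)^T \Sigma^{-1} (\mu_0 + \mu_1) - C \]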
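The linear boundary can be checked numerically. The sketch below uses one consistent choice of coefficients, $a = 2\Sigma^{-1}(\mu_0 - \mu_1)$ and $b = (\mu_0 - \mu_1)^T \Sigma^{-1} (\mu_0 + \mu_1) - 2\log(\pi_0/\pi_1)$, obtained by expanding the equality of the two log joint probabilities; the function names and parameter values are hypothetical.

```python
import numpy as np

def linear_boundary(pi0, pi1, mu0, mu1, sigma):
    """Return (a, b) with a^T x - b = 0 on the boundary (shared covariance)."""
    sigma_inv = np.linalg.inv(sigma)
    a = 2.0 * sigma_inv @ (mu0 - mu1)
    b = (mu0 - mu1) @ sigma_inv @ (mu0 + mu1) - 2.0 * np.log(pi0 / pi1)
    return a, b

def log_joint(x, pi, mu, sigma):
    """log pi - 0.5 (x - mu)^T Sigma^{-1} (x - mu); the shared log Z cancels."""
    diff = x - mu
    return np.log(pi) - 0.5 * diff @ np.linalg.solve(sigma, diff)

# Hypothetical parameters for illustration.
pi0, pi1 = 0.3, 0.7
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
a, b = linear_boundary(pi0, pi1, mu0, mu1, sigma)

# Project an arbitrary point onto the hyperplane a^T x = b.
x = np.array([5.0, -3.0])
x_on = x - (a @ x - b) / (a @ a) * a
gap = log_joint(x_on, pi1, mu1, sigma) - log_joint(x_on, pi0, mu0, sigma)
```

On the hyperplane the two log joint probabilities agree, so `gap` should be zero up to floating-point error.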
Relation to Logistic Regression
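\[ p(t = 0 \mid x) = \frac{p(x, t = 0)}{p(x, t = 0) + p(x, t = 1)} = \frac{\pi_0\, \mathcal{N}(x \mid \mu_0, \Sigma)}{\pi_0\, \mathcal{N}(x \mid \mu_0, \Sigma) + \pi_1\, \mathcal{N}(x \mid \mu_1, \Sigma)} \]
\[ = \left\{ 1 + \frac{\pi_1}{\pi_0} \exp\left[ -\frac{1}{2} (x - \mu_1)^T \Sigma^{-1} (x - \mu_1) + \frac{1}{2} (x - \mu_0)^T \Sigma^{-1} (x - \mu_0) \right] \right\}^{-1} \]
\[ = \left\{ 1 + \exp\left[ \log \frac{\pi_1}{\pi_0} + (\mu_1 - \mu_0)^T \Sigma^{-1} x + \frac{1}{2} \left( \mu_0^T \Sigma^{-1} \mu_0 - \mu_1^T \Sigma^{-1} \mu_1 \right) \right] \right\}^{-1} \]
\[ = \frac{1}{1 + \exp(-w^T x - b)}, \qquad w = \Sigma^{-1} (\mu_0 - \mu_1), \quad b = \log \frac{\pi_0}{\pi_1} + \frac{1}{2} \left( \mu_1^T \Sigma^{-1} \mu_1 - \mu_0^T \Sigma^{-1} \mu_0 \right) \]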
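The equivalence with the logistic form can be verified numerically. The sketch below assumes $w = \Sigma^{-1}(\mu_0 - \mu_1)$ and $b = \log(\pi_0/\pi_1) + \frac{1}{2}(\mu_1^T \Sigma^{-1} \mu_1 - \mu_0^T \Sigma^{-1} \mu_0)$, which follow from expanding the quadratic forms; the function names and parameter values are hypothetical.

```python
import numpy as np

def logistic_params(pi0, pi1, mu0, mu1, sigma):
    """Return (w, b) such that p(t=0 | x) = 1 / (1 + exp(-w^T x - b))."""
    sigma_inv = np.linalg.inv(sigma)
    w = sigma_inv @ (mu0 - mu1)
    b = np.log(pi0 / pi1) + 0.5 * (mu1 @ sigma_inv @ mu1 - mu0 @ sigma_inv @ mu0)
    return w, b

def posterior_t0(x, pi0, pi1, mu0, mu1, sigma):
    """p(t=0 | x) from Bayes' rule; the shared normalizer Z cancels."""
    def unnorm(pi, mu):
        diff = x - mu
        return pi * np.exp(-0.5 * diff @ np.linalg.solve(sigma, diff))
    p0, p1 = unnorm(pi0, mu0), unnorm(pi1, mu1)
    return p0 / (p0 + p1)

# Hypothetical parameters for illustration.
pi0, pi1 = 0.3, 0.7
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
sigma = np.array([[1.0, 0.3], [0.3, 2.0]])
x = np.array([0.5, 1.5])

w, b = logistic_params(pi0, pi1, mu0, mu1, sigma)
sigmoid = 1.0 / (1.0 + np.exp(-(w @ x) - b))
direct = posterior_t0(x, pi0, pi1, mu0, mu1, sigma)
```

The quadratic term $x^T \Sigma^{-1} x$ cancels only because both classes share $\Sigma$, which is what makes the posterior a logistic function of a linear score.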
Gaussian Bayes Binary Classifier Decision Boundary
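With class-specific covariances $\Sigma_0$ and $\Sigma_1$, the boundary is again where the joint probabilities are equal:
\[ p(x, t = 1) = p(x, t = 0) \]
\[ \log \pi_1 - \frac{1}{2} \log\det\Sigma_1 - \frac{1}{2} (x - \mu_1)^T \Sigma_1^{-1} (x - \mu_1) = \log \pi_0 - \frac{1}{2} \log\det\Sigma_0 - \frac{1}{2} (x - \mu_0)^T \Sigma_0^{-1} (x - \mu_0) \]
Collecting the prior and determinant terms into $C = 2 \log(\pi_1 / \pi_0) + \log\det\Sigma_0 - \log\det\Sigma_1$:
\[ x^T \left( \Sigma_1^{-1} - \Sigma_0^{-1} \right) x - 2 \left( \mu_1^T \Sigma_1^{-1} - \mu_0^T \Sigma_0^{-1} \right) x + \left( \mu_1^T \Sigma_1^{-1} \mu_1 - \mu_0^T \Sigma_0^{-1} \mu_0 \right) = C \]
\[ \Rightarrow x^T Q x - 2 b^T x + c = 0 \]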
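A numerical check of the quadratic form is sketched below. It assumes $Q = \Sigma_1^{-1} - \Sigma_0^{-1}$, $b = \Sigma_1^{-1}\mu_1 - \Sigma_0^{-1}\mu_0$, and a constant $c$ that also absorbs the prior and log-determinant terms; the function names and parameter values are hypothetical.

```python
import numpy as np

def quadratic_boundary(pi0, pi1, mu0, mu1, sigma0, sigma1):
    """Return (Q, b, c) with x^T Q x - 2 b^T x + c = 0 on the boundary."""
    s0_inv, s1_inv = np.linalg.inv(sigma0), np.linalg.inv(sigma1)
    Q = s1_inv - s0_inv
    b = s1_inv @ mu1 - s0_inv @ mu0
    # C collects the prior and log-determinant terms.
    C = (2.0 * np.log(pi1 / pi0)
         + np.linalg.slogdet(sigma0)[1] - np.linalg.slogdet(sigma1)[1])
    c = mu1 @ s1_inv @ mu1 - mu0 @ s0_inv @ mu0 - C
    return Q, b, c

def log_joint(x, pi, mu, sigma):
    """log pi - 0.5 log det Sigma - 0.5 Mahalanobis term (2*pi factor omitted)."""
    diff = x - mu
    return (np.log(pi) - 0.5 * np.linalg.slogdet(sigma)[1]
            - 0.5 * diff @ np.linalg.solve(sigma, diff))

# Hypothetical parameters for illustration.
pi0, pi1 = 0.4, 0.6
mu0, mu1 = np.array([0.0, 0.0]), np.array([1.0, 2.0])
sigma0 = np.array([[1.0, 0.2], [0.2, 0.5]])
sigma1 = np.array([[2.0, -0.3], [-0.3, 1.0]])
Q, b, c = quadratic_boundary(pi0, pi1, mu0, mu1, sigma0, sigma1)

x = np.array([0.7, -0.4])                      # any test point
quad = x @ Q @ x - 2.0 * b @ x + c
gap = log_joint(x, pi1, mu1, sigma1) - log_joint(x, pi0, mu0, sigma0)
```

For every $x$ the quadratic expression equals $-2$ times the difference of the log joint probabilities, so it is zero exactly where the two classes are equally likely.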
Thanks!