Mixture Models and EM Algorithm: S. Sumitra
p(C = i/xj) = pij = (p(C = i) p(xj/C = i)) / p(xj),   i = 1, 2, . . . , k,  j = 1, 2, . . . , N    (2)
E step
In the E step, compute the probabilities pij , i = 1, 2, . . . k, j = 1, 2, . . . N.
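The E-step computation above can be sketched in NumPy; the function name `e_step` and the array layout (X of shape (N, n), means of shape (k, n), covariances of shape (k, n, n), weights of length k) are illustrative assumptions, not notation from these notes:

```python
import numpy as np

def e_step(X, mu, Sigma, w):
    """Compute the responsibilities p_ij = p(C = i / x_j) by Bayes' rule,
    using Gaussian class-conditional densities p(x_j / C = i)."""
    N, n = X.shape
    k = len(w)
    P = np.zeros((k, N))
    for i in range(k):
        diff = X - mu[i]                                   # (N, n)
        inv = np.linalg.inv(Sigma[i])
        norm = (2 * np.pi) ** (-n / 2) * np.linalg.det(Sigma[i]) ** -0.5
        # numerator of Bayes' rule: p(x_j / C = i) * w_i
        P[i] = w[i] * norm * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1))
    # dividing by p(x_j) = sum_i p(x_j / C = i) w_i normalizes over components
    return P / P.sum(axis=0, keepdims=True)
```

Each column of the returned array sums to 1, since the pij for a fixed data point xj form a probability distribution over the k components.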
M step
Compute the new means, covariances and component weights as follows:
µi = (Σ_{j=1}^{N} pij xj) / (Σ_{j=1}^{N} pij)
[For a sure event, µi = (Σj 1{xj ∈ C = i} xj) / (Σj 1{xj ∈ C = i}). Here, we don't know whether xj is in component i; we only know p(C = i/xj).]
Σi = (Σ_{j=1}^{N} pij (xj − µi)(xj − µi)^T) / (Σ_{j=1}^{N} pij)
wi = (Σ_{j=1}^{N} pij) / N
[Compare these formulas with those of Gaussian discriminant analysis]
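The three M-step formulas above can be sketched in NumPy; the function name `m_step` and the layout of the responsibility array P (shape (k, N), with P[i, j] = pij) are illustrative assumptions:

```python
import numpy as np

def m_step(X, P):
    """Re-estimate means, covariances and weights from responsibilities.
    P[i, j] = p_ij = p(C = i / x_j); X has shape (N, n)."""
    N, n = X.shape
    k = P.shape[0]
    p_i = P.sum(axis=1)                      # soft counts: p_i = sum_j p_ij
    mu = (P @ X) / p_i[:, None]              # weighted means
    Sigma = np.zeros((k, n, n))
    for i in range(k):
        diff = X - mu[i]                     # (N, n)
        # weighted outer products: sum_j p_ij (x_j - mu_i)(x_j - mu_i)^T
        Sigma[i] = (P[i][:, None] * diff).T @ diff / p_i[i]
    w = p_i / N                              # component weights
    return mu, Sigma, w
```

When P contains only 0s and 1s (the "sure event" case), these updates reduce to the per-class sample mean and covariance, as in Gaussian discriminant analysis.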
The algorithm can be summarized as follows:
Algorithm 1 EM algorithm
Initialize µi, Σi, wi, i = 1, 2, . . . , k
Iterate until convergence:
  E Step
  for i = 1 to k do
    for j = 1 to N do
      calculate p(xj/C = i) = (1 / ((2π)^{n/2} |Σi|^{1/2})) exp(−(1/2)(xj − µi)^T Σi^{−1} (xj − µi))
      calculate pij = (p(xj/C = i) wi) / (Σ_{l=1}^{k} p(xj/C = l) wl)
    end for
    pi = Σ_{j=1}^{N} pij
  end for
  M Step
  for i = 1 to k do
    calculate µi = (Σ_{j=1}^{N} pij xj) / pi
    calculate Σi = (Σ_{j=1}^{N} pij (xj − µi)(xj − µi)^T) / pi
    set wi = pi / N
  end for
end
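Algorithm 1 can be sketched end-to-end in NumPy as follows. The function name `em_gmm`, the random initialization, the optional `mu0` argument for supplying initial means, and the small ridge added to the covariances for numerical stability are all illustrative assumptions, not part of the algorithm as stated above:

```python
import numpy as np

def em_gmm(X, k, mu0=None, iters=50, seed=0):
    """Run the EM iterations of Algorithm 1 for a k-component Gaussian mixture."""
    rng = np.random.default_rng(seed)
    N, n = X.shape
    # Initialize mu_i, Sigma_i, w_i
    mu = (np.asarray(mu0, dtype=float) if mu0 is not None
          else X[rng.choice(N, k, replace=False)])
    Sigma = np.array([np.cov(X.T).reshape(n, n) + 1e-6 * np.eye(n)] * k)
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E step: responsibilities p_ij via Bayes' rule
        P = np.zeros((k, N))
        for i in range(k):
            diff = X - mu[i]
            inv = np.linalg.inv(Sigma[i])
            norm = (2 * np.pi) ** (-n / 2) * np.linalg.det(Sigma[i]) ** -0.5
            P[i] = w[i] * norm * np.exp(-0.5 * np.sum(diff @ inv * diff, axis=1))
        P /= P.sum(axis=0, keepdims=True)    # divide by p(x_j)
        # M step: re-estimate parameters from the soft counts p_i
        p_i = P.sum(axis=1)
        mu = (P @ X) / p_i[:, None]
        for i in range(k):
            diff = X - mu[i]
            Sigma[i] = (P[i][:, None] * diff).T @ diff / p_i[i] + 1e-6 * np.eye(n)
        w = p_i / N
    return mu, Sigma, w
```

On well-separated 1-D data, a few dozen iterations from a reasonable initialization recover the component means, variances and weights.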