This document provides an introduction to hidden Markov models (HMMs). It describes the specification of an HMM using parameters like the state transition probability matrix A, observation probability distribution B, and initial state distribution π. The three central problems of HMMs are then discussed: evaluation, decoding, and learning. Efficient algorithms like the forward algorithm, Viterbi algorithm, and Baum-Welch algorithm are presented for solving these problems in polynomial time.

Hidden Markov Models
Adapted from Dr Catherine Sweeney-Reed's slides
Summary
Introduction
Description
Central problems in HMM modelling
Extensions
Demonstration
Specification of an HMM
N - the number of states
Q = {q_1, q_2, ..., q_T} - the state sequence (q_t is the state occupied at time t)
M - the number of symbols (observables)
O = {o_1, o_2, ..., o_T} - the observation sequence
Specification of an HMM
A - the state transition probability matrix
a_ij = P(q_{t+1} = j | q_t = i)
B - the observation probability distribution
b_j(k) = P(o_t = k | q_t = j),   1 ≤ k ≤ M
π - the initial state distribution

Specification of an HMM
The full HMM is thus specified as a triplet:
λ = (A, B, π)
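As a concrete illustration, the triplet λ = (A, B, π) is just three arrays. The sketch below is a minimal Python example with invented numbers (a hypothetical 2-state, 3-symbol model), not taken from these slides; the later sketches for the three central problems reuse these toy arrays.

```python
import numpy as np

# Hypothetical 2-state, 3-symbol HMM: lambda = (A, B, pi).
N, M = 2, 3

# A[i, j] = P(q_{t+1} = j | q_t = i); rows sum to 1.
A = np.array([[0.7, 0.3],
              [0.4, 0.6]])

# B[j, k] = b_j(k) = P(o_t = k | q_t = j); rows sum to 1.
B = np.array([[0.5, 0.4, 0.1],
              [0.1, 0.3, 0.6]])

# pi[i] = P(q_1 = i).
pi = np.array([0.6, 0.4])

# An example observation sequence (symbol indices 0..M-1).
obs = [0, 2, 1, 2]
```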
Central problems in HMM modelling
Problem 1
Evaluation:
Probability of occurrence of a particular
observation sequence, O = {o_1, ..., o_k}, given
the model: P(O | λ)
Complicated by the hidden states
Useful in sequence classification

Central problems in HMM modelling
Problem 2
Decoding:
Optimal state sequence to produce the given
observations, O = {o_1, ..., o_k}, given the model
Requires an optimality criterion
Useful in recognition problems

Central problems in HMM modelling
Problem 3
Learning:
Determine the optimum model, given a training
set of observations
Find λ such that P(O | λ) is maximal
Problem 1: Naïve solution
State sequence Q = (q_1, ..., q_T)
Assume independent observations:

P(O | q, λ) = Π_{t=1}^{T} P(o_t | q_t, λ) = b_{q_1}(o_1) b_{q_2}(o_2) ... b_{q_T}(o_T)

NB: Observations are mutually independent, given the
hidden states. (The joint distribution of independent
variables factorises into the marginal distributions of the
independent variables.)
Problem 1: Naïve solution
Observe that:

P(q | λ) = π_{q_1} a_{q_1 q_2} a_{q_2 q_3} ... a_{q_{T-1} q_T}

And that:

P(O | λ) = Σ_q P(O | q, λ) P(q | λ)
Problem 1: Naïve solution
Finally get:

P(O | λ) = Σ_q P(O | q, λ) P(q | λ)

NB:
- The above sum is over all state paths
- There are N^T state paths, each costing
O(T) calculations, leading to O(T N^T)
time complexity.
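For very short sequences the naïve sum can be written down directly. A sketch of the O(T N^T) enumeration, reusing the toy arrays from the specification sketch:

```python
import itertools
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
obs = [0, 2, 1, 2]

def naive_likelihood(A, B, pi, obs):
    """P(O|lambda) as a sum over all N**T state paths of P(O|q,lambda) P(q|lambda)."""
    N, T = len(pi), len(obs)
    total = 0.0
    for q in itertools.product(range(N), repeat=T):   # every possible state path
        p_path = pi[q[0]]                              # P(q|lambda) = pi_{q1} a_{q1 q2} ...
        for t in range(1, T):
            p_path *= A[q[t - 1], q[t]]
        p_obs = 1.0                                    # P(O|q,lambda) = prod_t b_{qt}(o_t)
        for t in range(T):
            p_obs *= B[q[t], obs[t]]
        total += p_path * p_obs
    return total

print(naive_likelihood(A, B, pi, obs))   # only feasible for small N and T
```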
Problem 1: Efficient solution
Define the auxiliary forward variable α:

α_t(i) = P(o_1, ..., o_t, q_t = i | λ)

α_t(i) is the probability of observing a partial sequence of
observables o_1, ..., o_t such that at time t, state q_t = i

Forward algorithm:
Problem 1: Efficient solution
Recursive algorithm:

Initialise:
α_1(i) = π_i b_i(o_1)

Calculate:
α_{t+1}(j) = [Σ_{i=1}^{N} α_t(i) a_ij] b_j(o_{t+1})
(partial obs seq to t AND state i at t) × (transition to j at t+1) × (sensor)
The sum appears because state j can be reached from any preceding state;
α incorporates the partial obs seq up to t.

Obtain:
P(O | λ) = Σ_{i=1}^{N} α_T(i)
Sum over the different ways of generating the obs seq, i.e. over the state occupied at T.

Complexity is O(N^2 T)
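The recursion above translates almost line for line into code. A minimal sketch of the forward pass (toy arrays as before; α is stored as a T×N array):

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
obs = [0, 2, 1, 2]

def forward(A, B, pi, obs):
    """Forward algorithm: returns alpha[t, i] = P(o_1..o_t, q_t = i | lambda)."""
    N, T = len(pi), len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                     # initialise: alpha_1(i) = pi_i b_i(o_1)
    for t in range(T - 1):
        # alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(o_{t+1})
        alpha[t + 1] = (alpha[t] @ A) * B[:, obs[t + 1]]
    return alpha

alpha = forward(A, B, pi, obs)
print("P(O|lambda) =", alpha[-1].sum())              # terminate: sum_i alpha_T(i)
```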
Problem 1: Alternative solution
Define the auxiliary backward variable β:

β_t(i) = P(o_{t+1}, o_{t+2}, ..., o_T | q_t = i, λ)

β_t(i) is the probability of observing the sequence of
observables o_{t+1}, ..., o_T given state q_t = i at time t, and the model λ

Backward algorithm:
Problem 1: Alternative solution
Recursive algorithm:

Initialise:
β_T(j) = 1,   1 ≤ j ≤ N

Calculate:
β_t(i) = Σ_{j=1}^{N} a_ij b_j(o_{t+1}) β_{t+1}(j),   t = T-1, ..., 1

Terminate:
P(O | λ) = Σ_{i=1}^{N} π_i b_i(o_1) β_1(i)

Complexity is O(N^2 T)
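A matching sketch for the backward pass (toy arrays again); its termination line recovers the same P(O|λ) as the forward pass, which is a useful consistency check:

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
obs = [0, 2, 1, 2]

def backward(A, B, pi, obs):
    """Backward algorithm: returns beta[t, i] = P(o_{t+1}..o_T | q_t = i, lambda)."""
    N, T = len(pi), len(obs)
    beta = np.zeros((T, N))
    beta[T - 1] = 1.0                                # initialise: beta_T(j) = 1
    for t in range(T - 2, -1, -1):
        # beta_t(i) = sum_j a_ij b_j(o_{t+1}) beta_{t+1}(j)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta

beta = backward(A, B, pi, obs)
print("P(O|lambda) =", (pi * B[:, obs[0]] * beta[0]).sum())  # sum_i pi_i b_i(o_1) beta_1(i)
```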
Problem 2: Decoding
Choose the state sequence that maximises the
probability of the observation sequence
Viterbi algorithm - an inductive algorithm that
keeps the best state sequence at each
instant
Problem 2: Decoding
State sequence to maximise P(O, Q | λ):

P(q_1, q_2, ..., q_T | O, λ)

Define the auxiliary variable δ:

δ_t(i) = max_{q_1,...,q_{t-1}} P(q_1, q_2, ..., q_t = i, o_1, o_2, ..., o_t | λ)

δ_t(i) is the probability of the most probable
path ending in state q_t = i

Viterbi algorithm:
Problem 2: Decoding
Recurrent property:

δ_{t+1}(j) = max_i (δ_t(i) a_ij) b_j(o_{t+1})

To get the state sequence, we need to keep track
of the argument that maximises this, for each
t and j. This is done via the array ψ_t(j).

Algorithm:
1. Initialise:
δ_1(i) = π_i b_i(o_1),   1 ≤ i ≤ N
ψ_1(i) = 0
Problem 2: Decoding
2. Recursion:

δ_t(j) = max_{1≤i≤N} (δ_{t-1}(i) a_ij) b_j(o_t)
ψ_t(j) = arg max_{1≤i≤N} (δ_{t-1}(i) a_ij)
2 ≤ t ≤ T,   1 ≤ j ≤ N

3. Terminate:

P* = max_{1≤i≤N} δ_T(i)
q_T* = arg max_{1≤i≤N} δ_T(i)

P* gives the state-optimised probability
Q* is the optimal state sequence (Q* = {q_1*, q_2*, ..., q_T*})
Problem 2: Decoding
4. Backtrack state sequence:

q_t* = ψ_{t+1}(q_{t+1}*),   t = T-1, T-2, ..., 1

O(N^2 T) time complexity
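Putting the four steps together, a minimal Viterbi sketch (toy arrays as in the earlier sketches; delta holds the path probabilities, psi the argmax back-pointers):

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
obs = [0, 2, 1, 2]

def viterbi(A, B, pi, obs):
    """Most probable state path Q* and its probability P*."""
    N, T = len(pi), len(obs)
    delta = np.zeros((T, N))
    psi = np.zeros((T, N), dtype=int)
    delta[0] = pi * B[:, obs[0]]                     # 1. initialise
    for t in range(1, T):                            # 2. recursion
        trans = delta[t - 1][:, None] * A            # trans[i, j] = delta_{t-1}(i) a_ij
        psi[t] = trans.argmax(axis=0)                # best predecessor of each state j
        delta[t] = trans.max(axis=0) * B[:, obs[t]]
    p_star = delta[T - 1].max()                      # 3. terminate
    q_star = [delta[T - 1].argmax()]
    for t in range(T - 1, 0, -1):                    # 4. backtrack: q_{t-1}* = psi_t(q_t*)
        q_star.insert(0, psi[t][q_star[0]])
    return q_star, p_star

print(viterbi(A, B, pi, obs))
```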
Problem 3: Learning
Training the HMM to encode an obs seq such that the HMM
should identify a similar obs seq in future
Find λ = (A, B, π), maximising P(O | λ)
General algorithm:
1. Initialise: λ_0
2. Compute the new model λ, using λ_0 and the observed
sequence O
3. Set λ_0 ← λ
4. Repeat steps 2 and 3 until:
log P(O | λ) - log P(O | λ_0) < d
Problem 3: Learning
Let ξ_t(i,j) be the probability of being in state i at time
t and in state j at time t+1, given λ and the O seq

ξ_t(i,j) = α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j) / P(O | λ)
         = α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j) / [Σ_{i=1}^{N} Σ_{j=1}^{N} α_t(i) a_ij b_j(o_{t+1}) β_{t+1}(j)]

Step 1 of the Baum-Welch algorithm:

Problem 3: Learning
Operations required for the computation
of the joint event that the system is in state
S_i at time t and state S_j at time t+1
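Given α and β, ξ_t(i,j) follows directly from the formula above. A sketch (the forward and backward passes are repeated inline so it runs on its own with the toy arrays):

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
obs = [0, 2, 1, 2]
N, T = len(pi), len(obs)

# Forward and backward passes (as in the earlier sketches).
alpha = np.zeros((T, N)); alpha[0] = pi * B[:, obs[0]]
for t in range(T - 1):
    alpha[t + 1] = (alpha[t] @ A) * B[:, obs[t + 1]]
beta = np.zeros((T, N)); beta[T - 1] = 1.0
for t in range(T - 2, -1, -1):
    beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

# xi[t, i, j] = alpha_t(i) a_ij b_j(o_{t+1}) beta_{t+1}(j) / P(O|lambda)
xi = np.zeros((T - 1, N, N))
for t in range(T - 1):
    xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1]
    xi[t] /= xi[t].sum()          # the normaliser equals P(O|lambda)
```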
Problem 3: Learning
Let γ_t(i) be the probability of being in state i at
time t, given O

γ_t(i) = Σ_{j=1}^{N} ξ_t(i,j)

Σ_{t=1}^{T-1} γ_t(i) - expected no. of transitions from state i
Σ_{t=1}^{T-1} ξ_t(i,j) - expected no. of transitions from state i to state j
Problem 3: Learning
Step 2 of the Baum-Welch algorithm (re-estimation):

π̂_i = γ_1(i)
the expected frequency of state i at time t = 1

â_ij = Σ_t ξ_t(i,j) / Σ_t γ_t(i)
ratio of expected no. of transitions from
state i to j over expected no. of transitions from state i

b̂_j(k) = Σ_{t: o_t = k} γ_t(j) / Σ_t γ_t(j)
ratio of expected no. of times in state j
observing symbol k over expected no. of times in state j
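The three re-estimation formulas map directly onto the ξ array and γ_t(i). The sketch below assumes alpha, beta, xi, obs and B from the ξ sketch above; it computes γ as α·β / P(O|λ), which agrees with γ_t(i) = Σ_j ξ_t(i,j) for t < T and also covers t = T, which the b_j(k) update needs.

```python
import numpy as np

# Assumes alpha, beta, xi, obs, B from the xi sketch above.
P_O = alpha[-1].sum()
gamma = alpha * beta / P_O                       # gamma[t, i], shape (T, N)

# pi_hat_i = gamma_1(i)
pi_hat = gamma[0]

# a_hat_ij = sum_t xi_t(i,j) / sum_t gamma_t(i), sums over t = 1..T-1
a_hat = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]

# b_hat_j(k) = sum_{t: o_t = k} gamma_t(j) / sum_t gamma_t(j)
M = B.shape[1]
b_hat = np.zeros_like(B)
for k in range(M):
    mask = np.array(obs) == k                    # time steps where symbol k was observed
    b_hat[:, k] = gamma[mask].sum(axis=0) / gamma.sum(axis=0)
```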
Problem 3: Learning
The Baum-Welch algorithm uses the forward and
backward algorithms to calculate the auxiliary
variables α and β
B-W algorithm is a special case of the EM
algorithm:
E-step: calculation of ξ and γ
M-step: iterative calculation of π̂, â_ij, b̂_j(k)
Practical issues:
Can get stuck in local maxima
Numerical problems - use logs and scaling
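One standard way to handle the numerical problems is to normalise α at every time step and accumulate the scale factors as a log-likelihood. A minimal sketch of that scaling trick (toy arrays again, with a longer sequence to show why raw α would underflow):

```python
import numpy as np

A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]])
pi = np.array([0.6, 0.4])
obs = [0, 2, 1, 2] * 50           # long sequence: unscaled alpha would underflow

def forward_scaled(A, B, pi, obs):
    """Scaled forward pass: returns log P(O|lambda)."""
    N, T = len(pi), len(obs)
    log_lik = 0.0
    alpha = pi * B[:, obs[0]]
    for t in range(T):
        if t > 0:
            alpha = (alpha @ A) * B[:, obs[t]]
        c = alpha.sum()           # scale factor c_t for time t
        alpha = alpha / c         # keep alpha in a sane numeric range
        log_lik += np.log(c)      # log P(O|lambda) = sum_t log c_t
    return log_lik

print(forward_scaled(A, B, pi, obs))
```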
Further Reading
L. R. Rabiner, "A tutorial on Hidden Markov Models and
selected applications in speech recognition,"
Proceedings of the IEEE, vol. 77, pp. 257-286, 1989.
R. Dugad and U. B. Desai, "A tutorial on Hidden Markov
models," Signal Processing and Artificial Neural
Networks Laboratory, Dept of Electrical Engineering,
Indian Institute of Technology, Bombay, Technical Report
No. SPANN-96.1, 1996.
W. H. Laverty, M. J. Miket, and I. W. Kelly, "Simulation of
Hidden Markov Models with EXCEL," The Statistician,
vol. 51, Part 1, pp. 31-40, 2002.
