FSA and HMM: LING 572, Fei Xia, 1/5/06
Outline
FSA
HMM
[Figure: an FSA with states q0 and q1 and arcs labeled a and b]
Definition of FST
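An FST is $(Q, \Sigma, \Gamma, I, F, \delta)$
$Q$: a finite set of states
$\Sigma$: a finite set of input symbols
$\Gamma$: a finite set of output symbols
$I$: the set of initial states
$F$: the set of final states
$\delta \subseteq Q \times (\Sigma \cup \{\epsilon\}) \times (\Gamma \cup \{\epsilon\}) \times Q$: the transition relation between states.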
[Figure: an FST with states q0 and q1 and arcs labeled a:x and b:y]
Operations on FSTs
Union: $x[T \cup S]y$ iff $x[T]y$ or $x[S]y$
Concatenation: $wx[T \cdot S]yz$ iff $w[T]y$ and $x[S]z$
Composition: $x[T \circ S]z$ iff $\exists y$ s.t. $x[T]y$ and $y[S]z$
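The three operations can be sketched directly on relations, without building any state machine. The snippet below is a minimal illustration that assumes a transduction is a finite set of (input, output) string pairs; the function names and the toy relations `T` and `S` are hypothetical and not from the slides.

```python
# A finite transduction is modeled as a set of (input, output) string pairs.
# This models the relation only, not the states and arcs of an FST.

def union(T, S):
    # x [T u S] y  iff  x [T] y  or  x [S] y
    return T | S

def concat(T, S):
    # wx [T . S] yz  iff  w [T] y  and  x [S] z
    return {(w + x, y + z) for (w, y) in T for (x, z) in S}

def compose(T, S):
    # x [T o S] z  iff  there is a y such that  x [T] y  and  y [S] z
    return {(x, z) for (x, y) in T for (y2, z) in S if y == y2}

# Toy relations (assumed for illustration); "" plays the role of epsilon.
T = {("a", "x"), ("b", "y")}
S = {("x", ""), ("y", "z")}

composed = sorted(compose(T, S))
print(composed)
```

Real FST toolkits perform composition on the machines themselves, which also handles infinite relations; the set-based version only works when both relations are finite.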
An example of the composition operation
[Figure: FST T with states q0 and q1 and arcs a:x and b:y; FST S with a single state q0 and arcs x:ε and y:z]
Probabilistic finite-state automata (PFA)
Informally, in a PFA, each arc is associated with a probability.
Tasks:
Given a string x, find the best path for x.
Given a string x, find the probability of x in a PFA.
Find the string with the highest probability in a PFA.
Formal definition of PFA
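A PFA is $(Q, \Sigma, I, F, \delta, P)$
$Q$: a finite set of $N$ states
$\Sigma$: a finite set of input symbols
$I: Q \to \mathbb{R}^+$ (initial-state probabilities)
$F: Q \to \mathbb{R}^+$ (final-state probabilities)
$\delta \subseteq Q \times (\Sigma \cup \{\epsilon\}) \times Q$: the transition relation between states.
$P: \delta \to \mathbb{R}^+$ (transition probabilities)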
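Constraints on the functions:
$\sum_{q \in Q} I(q) = 1$
$\forall q \in Q: F(q) + \sum_{a \in \Sigma \cup \{\epsilon\}} \sum_{q' \in Q} P(q, a, q') = 1$
Probability of a string:
$P(w_{1,n}, q_{1,n+1}) = I(q_1) \cdot F(q_{n+1}) \cdot \prod_{i=1}^{n} P(q_i, w_i, q_{i+1})$
$P(w_{1,n}) = \sum_{q_{1,n+1}} P(w_{1,n}, q_{1,n+1})$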
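The sum over state sequences does not need to be enumerated path by path. The following is a minimal sketch, assuming a PFA without ε-transitions stored as plain dictionaries; the function name `string_prob` and the two-state toy automaton are hypothetical choices made for illustration.

```python
from collections import defaultdict

def string_prob(x, I, F, P):
    """P(x) = sum over state paths of I(q1) * prod_i P(qi, wi, qi+1) * F(qn+1).

    I, F: dicts state -> probability; P: dict (q, symbol, q') -> probability.
    Assumes there are no epsilon transitions.
    """
    alpha = dict(I)  # alpha[q]: total mass of paths that read the prefix so far and end in q
    for sym in x:
        nxt = defaultdict(float)
        for (q, a, q2), p in P.items():
            if a == sym and q in alpha:
                nxt[q2] += alpha[q] * p
        alpha = nxt
    return sum(mass * F.get(q, 0.0) for q, mass in alpha.items())

# Toy PFA (assumed). Each state satisfies F(q) + sum of outgoing P = 1.
I = {"A": 1.0}
F = {"A": 0.1, "B": 0.7}
P = {("A", "a", "A"): 0.3, ("A", "a", "B"): 0.2,
     ("A", "b", "B"): 0.4, ("B", "b", "B"): 0.3}

p_ab = string_prob("ab", I, F, P)  # two paths: A-A-B (0.084) and A-B-B (0.042)
print(p_ab)
```

The string "ab" has two valid paths here, so the result is their sum, roughly 0.126, rather than the probability of a single path.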
Consistency of a PFA
Let A be a PFA.
Def: $P(x \mid A)$ = the sum of the probabilities of all the valid paths for x in A.
Def: a valid path in A is a path for some string x with probability greater than 0.
Def: A is called consistent if $\sum_x P(x \mid A) = 1$
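[Figure: a PFA with two states. q0 has final probability 0 and q1 has final probability 0.2. An arc a:1 leads from q0 to q1, and q1 has a self-loop b:0.8. I(q0) = 1.0, I(q1) = 0.0.]
$P(ab^n) = 0.2 \cdot 0.8^n$
$\sum_x P(x) = \sum_{n=0}^{\infty} P(ab^n) = 0.2 \cdot \sum_{n=0}^{\infty} 0.8^n = 0.2 \cdot \frac{0.8^0}{1 - 0.8} = 1$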
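The geometric series in this example can be checked numerically. The snippet below is a small sanity check, not part of the original slides; the helper name `p_ab_n` is hypothetical.

```python
def p_ab_n(n: int) -> float:
    # P(a b^n) = I(q0) * P(q0, a, q1) * P(q1, b, q1)^n * F(q1) = 1 * 1 * 0.8^n * 0.2
    return 0.2 * 0.8 ** n

# Partial sums over the first k strings a, ab, abb, ...
partial_sums = [sum(p_ab_n(n) for n in range(k)) for k in (1, 10, 50, 200)]

# Closed form of the infinite geometric series
closed_form = 0.2 / (1 - 0.8)

print(partial_sums, closed_form)
```

The partial sums grow toward 1 as more strings are included, which is what consistency requires for this automaton.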
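Weighted finite-state automata (WFA)
Each arc is associated with a weight.
"Sum" and "multiplication" can have other meanings: they are replaced by abstract operations $\oplus$ and $\otimes$.
$weight(x) = \bigoplus_{s,t \in Q} (I(s) \otimes P(s, x, t) \otimes F(t))$
Here $P(s, x, t)$ is the combined weight of the paths from $s$ to $t$ labeled $x$.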
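To make the idea of swapping the two operations concrete, here is a minimal sketch in which the "sum" and "multiplication" are passed in as functions. The function name `weight`, its parameters, and the toy automaton are assumptions for illustration; the sketch also assumes no ε-transitions and that the chosen operations distribute the way ordinary + and * do.

```python
import operator
from functools import reduce

def weight(x, states, I, F, P, plus, times, zero):
    """weight(x) = PLUS over s,t of I(s) TIMES P(s, x, t) TIMES F(t)."""
    # cur[q]: combined weight of all paths reading the prefix so far and ending in q
    cur = {q: I.get(q, zero) for q in states}
    for sym in x:
        nxt = {q: zero for q in states}
        for (q, a, q2), w in P.items():
            if a == sym:
                nxt[q2] = plus(nxt[q2], times(cur[q], w))
        cur = nxt
    return reduce(plus, (times(cur[q], F.get(q, zero)) for q in states), zero)

# Toy automaton (assumed)
states = ["A", "B"]
I = {"A": 1.0}
F = {"A": 0.1, "B": 0.7}
P = {("A", "a", "A"): 0.3, ("A", "a", "B"): 0.2,
     ("A", "b", "B"): 0.4, ("B", "b", "B"): 0.3}

# (+, *): total probability of the string, as in a PFA
total = weight("ab", states, I, F, P, operator.add, operator.mul, 0.0)
# (max, *): weight of the single best path
best = weight("ab", states, I, F, P, max, operator.mul, 0.0)
print(total, best)
```

With ordinary addition the result is the PFA string probability; replacing addition by `max` gives the score of the best path, which is the same change that turns the forward computation into Viterbi.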
HMM
Two types of HMMs
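State-emission HMM (Moore machine):
The emission probability depends only on the state (from-state or to-state).
Arc-emission HMM (Mealy machine):
The emission probability depends on the (from-state, to-state) pair.
[Figure: states s1, s2, ..., sN emitting output symbols such as w1, w4, w1, w3, w5, w1]
# of parameters: $O(N^2 M + N^2)$ for an arc-emission HMM with $N$ states and $M$ output symbols.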
Are the two types of HMMs equivalent?
For each state-emission $HMM_1$, there is an arc-emission $HMM_2$ such that for any sequence $O$, $P(O \mid HMM_1) = P(O \mid HMM_2)$.
Q3 and Q4 of hw1.
Definition of arc-emission HMM
An HMM is a tuple $(S, \Sigma, \Pi, A, B)$:
A set of states $S = \{s_1, s_2, \ldots, s_N\}$.
A set of output symbols $\Sigma = \{w_1, \ldots, w_M\}$.
Initial state probabilities $\Pi = \{\pi_i\}$.
State transition prob: $A = \{a_{ij}\}$.
Symbol emission prob: $B = \{b_{ijk}\}$.
$P(O_{1,n}) = \sum_{X_{1,n+1}} P(O_{1,n}, X_{1,n+1})$
Constraints
$\sum_{i=1}^{N} \pi_i = 1$
$\forall i: \sum_{j=1}^{N} a_{ij} = 1$
$\forall i: \sum_{k} \sum_{j} a_{ij} b_{ijk} = 1$
Q2 of hw1.
Properties of HMM
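Limited horizon: $P(X_{t+1} \mid X_1, X_2, \ldots, X_t) = P(X_{t+1} \mid X_t)$
$P(O) = \sum_{i=1}^{N} \alpha_i(T+1)$, where $\alpha_i(t) = P(O_{1,t-1}, X_t = i)$ is the forward probability.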
Calculating forward probability
Initialization: $\alpha_i(1) = \pi_i$
Induction:
$\alpha_j(t+1) = P(O_{1,t}, X_{t+1} = j)$
$= \sum_i P(O_{1,t}, X_t = i, X_{t+1} = j)$
$= \sum_i P(O_{1,t-1}, X_t = i) \cdot P(o_t, X_{t+1} = j \mid X_t = i)$
$= \sum_i \alpha_i(t) \, a_{ij} \, b_{ij o_t}$
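The recursion above translates almost line by line into code. This is a minimal sketch for an arc-emission HMM, assuming states and symbols are numbered from 0 and the parameters are nested lists; the function name `forward` and the toy parameter values are hypothetical.

```python
from itertools import product

def forward(obs, pi, a, b):
    """Return P(O) for an arc-emission HMM.

    obs: list of symbol indices o_1..o_T
    pi[i]: initial probability, a[i][j]: transition, b[i][j][k]: emission on arc i -> j
    """
    n = len(pi)
    alpha = list(pi)  # alpha_i(1) = pi_i
    for o in obs:
        # alpha_j(t+1) = sum_i alpha_i(t) * a_ij * b_{ij o_t}
        alpha = [sum(alpha[i] * a[i][j] * b[i][j][o] for i in range(n))
                 for j in range(n)]
    return sum(alpha)  # P(O) = sum_i alpha_i(T+1)

# Toy HMM with 2 states and 2 output symbols (assumed values)
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.3, 0.7]]]

p_single = forward([0], pi, a, b)
# Sanity check: the probabilities of all length-3 observation sequences sum to 1
total = sum(forward(list(o), pi, a, b) for o in product(range(2), repeat=3))
print(p_single, total)
```

Each time step costs O(N^2) operations, so the whole computation is O(N^2 T) instead of the exponential cost of summing over every state sequence.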
(2) Finding the best state sequence
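Given the observation $O_{1,T} = o_1 \ldots o_T$, find the state sequence $X_{1,T+1} = X_1 \ldots X_{T+1}$ that maximizes $P(X_{1,T+1} \mid O_{1,T})$.
[Figure: state chain X1, X2, ..., XT, XT+1, with output ot emitted on the arc from Xt to Xt+1]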
Viterbi algorithm
The probability of the best path that produces $O_{1,t-1}$ while ending up in state $s_i$:
$\delta_i(t) \stackrel{\text{def}}{=} \max_{X_{1,t-1}} P(X_{1,t-1}, O_{1,t-1}, X_t = i)$
Initialization: $\delta_i(1) = \pi_i$
Induction: $\delta_j(t+1) = \max_i \delta_i(t) \, a_{ij} \, b_{ij o_t}$
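The initialization and induction steps can be sketched as follows, with back-pointers added so that the best state sequence itself can be recovered. The back-pointer bookkeeping, the function names, and the toy parameters are assumptions for illustration; the brute-force helper is only there to cross-check the result on a tiny example.

```python
from itertools import product

def viterbi(obs, pi, a, b):
    """Return (probability of the best path, best state sequence X_1..X_{T+1})."""
    n = len(pi)
    delta = list(pi)  # delta_i(1) = pi_i
    back = []         # back[t][j]: best predecessor of state j after reading obs[t]
    for o in obs:
        # scores[j][i] = delta_i(t) * a_ij * b_{ij o_t}
        scores = [[delta[i] * a[i][j] * b[i][j][o] for i in range(n)]
                  for j in range(n)]
        back.append([max(range(n), key=lambda i: scores[j][i]) for j in range(n)])
        delta = [max(scores[j]) for j in range(n)]
    # Follow the back-pointers from the best final state
    last = max(range(n), key=lambda j: delta[j])
    path = [last]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    path.reverse()
    return delta[last], path

def brute_force(obs, pi, a, b):
    """Enumerate every state sequence; only feasible for tiny examples."""
    n, best = len(pi), 0.0
    for xs in product(range(n), repeat=len(obs) + 1):
        p = pi[xs[0]]
        for t, o in enumerate(obs):
            p *= a[xs[t]][xs[t + 1]] * b[xs[t]][xs[t + 1]][o]
        best = max(best, p)
    return best

# Toy HMM with 2 states and 2 output symbols (assumed values)
pi = [0.6, 0.4]
a = [[0.7, 0.3], [0.4, 0.6]]
b = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.3, 0.7]]]

obs = [0, 1, 1]
best_prob, best_path = viterbi(obs, pi, a, b)
print(best_prob, best_path)
```

The returned value is the joint probability of the best path and the observations; dividing by P(O) would give the conditional probability, but the maximizing sequence is the same either way.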
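Converting an arc-emission HMM $(S, \Sigma_1, \Pi, A, B)$ into a PFA $(Q, \Sigma_2, I, F, \delta, P)$:
$Q = S \cup \{f\}$
$\Sigma_2 = \Sigma_1 \cup \{\epsilon\}$
$\forall q \in S: I(q) = \pi(q)$, $I(f) = 0$
$\forall q \in S: F(q) = 0$, $F(f) = 1$
$\delta = \{(q_i, w_k, q_j) \mid q_i, q_j \in S \wedge (b_{ijk} \cdot a_{ij} > 0)\} \cup \{(q_i, \epsilon, f) \mid q_i \in S\}$
$P(q_i, w_k, q_j) = a_{ij} \cdot b_{ijk}$
$P(q_i, \epsilon, f) = 1$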
An HMM is a tuple $(S, \Sigma, \Pi, A, B, q_f)$:
A set of states $S = \{s_1, s_2, \ldots, s_N\}$.
A set of output symbols $\Sigma = \{w_1, \ldots, w_M\}$.
Initial state probabilities $\Pi = \{\pi_i\}$.
State transition prob: $A = \{a_{ij}\}$.
Symbol emission prob: $B = \{b_{ijk}\}$.
$q_f$ is the final state: there are no outgoing edges from $q_f$.
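Constraints
$\sum_{i=1}^{N} \pi_i = 1$
$\forall i \neq q_f: \sum_{j=1}^{N} a_{ij} = 1$
$\forall i \neq q_f: \sum_{k=1}^{M} b_{ijk} = 1$
$\forall j: a_{q_f, j} = 0$
$\forall j, k: b_{q_f, j, k} = 0$
For any HMM (under this new definition): $\sum_{O} P(O \mid HMM) = 1$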
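HMM → PFA
HMM $(S, \Sigma_1, \Pi, A, B, q_f)$ → PFA $(Q, \Sigma_2, I, F, \delta, P)$
$Q = S$
$\Sigma_2 = \Sigma_1$
$\forall q \in S: I(q) = \pi(q)$
$F(q_f) = 1$ and $\forall q \in S - \{q_f\}: F(q) = 0$
$\delta = \{(q_i, w_k, q_j) \mid q_i, q_j \in S \wedge (b_{ijk} \cdot a_{ij} > 0)\}$
$P(q_i, w_k, q_j) = a_{ij} \cdot b_{ijk}$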
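The construction above can be sketched as a short function. The dictionary-based representation, the function name `hmm_to_pfa`, and the toy HMM are assumptions made for illustration; the mapping itself follows the definitions listed above.

```python
def hmm_to_pfa(states, pi, a, b, q_f):
    """Convert an arc-emission HMM with final state q_f into PFA components (I, F, P).

    pi[q]: initial probability; a[q][q2]: transition; b[q][q2][w]: emission on arc q -> q2.
    Missing dictionary entries are treated as probability 0.
    """
    I = {q: pi.get(q, 0.0) for q in states}                  # I(q) = pi(q)
    F = {q: (1.0 if q == q_f else 0.0) for q in states}      # F(q_f) = 1, otherwise 0
    P = {}
    for qi in states:
        for qj in states:
            a_ij = a.get(qi, {}).get(qj, 0.0)
            for w, b_ijk in b.get(qi, {}).get(qj, {}).items():
                if a_ij * b_ijk > 0:                         # keep only arcs with non-zero mass
                    P[(qi, w, qj)] = a_ij * b_ijk            # P(qi, wk, qj) = a_ij * b_ijk
    return I, F, P

# Toy HMM (assumed): one ordinary state s1 and the final state qf
states = ["s1", "qf"]
pi = {"s1": 1.0, "qf": 0.0}
a = {"s1": {"s1": 0.6, "qf": 0.4}}
b = {"s1": {"s1": {"x": 0.5, "y": 0.5}, "qf": {"x": 1.0}}}

I, F, P = hmm_to_pfa(states, pi, a, b, "qf")

# PFA constraint: F(q) + total outgoing probability = 1 for every state
out_mass = {q: F[q] + sum(p for (qi, _, _), p in P.items() if qi == q) for q in states}
print(P, out_mass)
```

Because $q_f$ has no outgoing edges and final probability 1, every state of the resulting automaton satisfies the PFA constraint without any extra ε-arcs.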
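PFA → HMM
PFA $(Q, \Sigma_1, I, F, \delta, P)$ → HMM $(S, \Sigma_2, \Pi, A, B, q_f)$
$S = Q \cup \{q_f\}$
$\Sigma_2 = \Sigma_1 \cup \{\epsilon\}$
$\forall i \in Q: \pi[i] = I[i]$
$\pi[q_f] = 0$
Need to add a new final state and edges to it:
$\forall i \in Q: a_{ij} = \sum_k P(q_i, w_k, q_j)$
$a_{i, q_f} = F[i]$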
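A partial sketch of this direction is given below. It only builds the state set, the initial probabilities, and the transition matrix, since those are the parts spelled out above; the emission probabilities are deliberately left out. The function name `pfa_to_hmm`, the default name of the new final state, and the dictionary representation are assumptions.

```python
def pfa_to_hmm(Q, I, F, P, q_f="q_f"):
    """Build S, pi and A of an HMM from a PFA (emission probabilities B are not built here)."""
    S = list(Q) + [q_f]                          # S = Q plus a new final state
    pi = {q: I.get(q, 0.0) for q in Q}           # pi[i] = I[i]
    pi[q_f] = 0.0                                # pi[q_f] = 0
    a = {q: {} for q in Q}
    for (qi, _w, qj), p in P.items():
        a[qi][qj] = a[qi].get(qj, 0.0) + p       # a_ij = sum_k P(q_i, w_k, q_j)
    for q in Q:
        a[q][q_f] = F.get(q, 0.0)                # a_{i, q_f} = F[i]
    return S, pi, a

# The two-state PFA from the consistency example
Q = ["q0", "q1"]
I = {"q0": 1.0, "q1": 0.0}
F = {"q0": 0.0, "q1": 0.2}
P = {("q0", "a", "q1"): 1.0, ("q1", "b", "q1"): 0.8}

S, pi, a = pfa_to_hmm(Q, I, F, P)
row_sums = {q: sum(a[q].values()) for q in Q}
print(S, pi, a, row_sums)
```

Each row of the transition matrix sums to 1 exactly when the PFA constraint $F(q) + \sum P = 1$ holds for that state, which is why the final-state probability can be reused as the transition probability into the new final state.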
HMM