
Hidden Markov Models

Ch 3.2, 3.3 of DEKM

CpG islands
The dinucleotide CG (called CpG) is rare
The C in a CG often gets methylated, and the methylated C then tends to mutate to T.

Methylation is suppressed in some areas of the genome, called CpG islands. Such CpG islands are often found around genes. Problem: find CpG islands in a whole genome.

Markov Chains
Given a sequence x, we can calculate its probability under a CpG-island Markov chain and under a normal-genome Markov chain, and then compute the log-odds score

$$S(x) = \log \frac{P(x \mid \text{CpG Markov chain})}{P(x \mid \text{normal Markov chain})}$$
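A minimal sketch (not from the slides) of how this log-odds score could be computed. The `train_markov_chain` helper and the toy training sequences are made-up illustrations, not the CpG+/CpG- tables from DEKM.

```python
import math

ALPHABET = "ACGT"

def train_markov_chain(seqs, pseudocount=1.0):
    """Estimate first-order transition probabilities a[s][t] from training sequences."""
    counts = {s: {t: pseudocount for t in ALPHABET} for s in ALPHABET}
    for seq in seqs:
        for prev, cur in zip(seq, seq[1:]):
            counts[prev][cur] += 1
    return {s: {t: counts[s][t] / sum(counts[s].values()) for t in ALPHABET}
            for s in ALPHABET}

def log_odds(x, plus, minus):
    """S(x) = log P(x | CpG chain) - log P(x | normal chain)."""
    return sum(math.log(plus[p][c]) - math.log(minus[p][c])
               for p, c in zip(x, x[1:]))

# Toy training data, made up purely for illustration.
cpg_chain    = train_markov_chain(["CGCGGCGCCGCG", "GCGCGCGGC"])
normal_chain = train_markov_chain(["ATTATAATGCAT", "TTATAGATAAT"])

print(log_odds("CGCGCG", cpg_chain, normal_chain))  # positive: looks island-like
print(log_odds("ATATAT", cpg_chain, normal_chain))  # negative: looks like background
```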

Markov Chains
Slide a fixed-length window, say of length 100 bp. If any window x has a positive (or strongly positive) S(x), mark it as (part of) a CpG island.

Issues
Why fixed-length windows? Why 100 bp? A more satisfactory approach is to build one model for the whole sequence that incorporates both Markov chains.

[State diagram: a "CpG island" state (high CG) and a "normal" state (rare CG), with transitions between the two states.]

Two states. Each state emits sequence. The sequence emitted by the CpG island state is high in CG frequency. The concatenation of the sequence emissions = the genome.

HMM vs MC
The main difference between an HMM and a Markov chain is that in an HMM there is no one-to-one correspondence between the states and the observed symbols: you cannot tell which state emitted a symbol just by looking at the symbol.

HMM: Formal
The path or parse is the sequence of states visited by the process. This is a simple Markov chain given by the following transition probabilities:
$$a_{kl} = P(\pi_i = l \mid \pi_{i-1} = k)$$

HMM: Formal
For technical reasons, assume a begin state (0) from which the process transitions to any state $k$ with probability $a_{0k}$. Similarly, assume an end state (also 0), with transition probability $a_{k0}$ into it.

HMM: Formal
In each state other than 0, the model can emit a symbol according to some distribution. This is the emission probability distribution

$$e_k(b) = P(x_i = b \mid \pi_i = k)$$

Joint probability of sequence and path

$$P(x, \pi) = a_{0\pi_1} \prod_{i=1}^{L} e_{\pi_i}(x_i)\, a_{\pi_i \pi_{i+1}}, \qquad \text{with } \pi_{L+1} = 0$$

We may ignore $a_{0\pi_1}$ and $a_{\pi_L 0}$ for simplicity.
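As a concrete illustration, here is a small sketch of computing $P(x, \pi)$ for a hypothetical two-state HMM ("I" = CpG island, "N" = normal). All probabilities are made-up placeholders, and the end-state transition is ignored as noted above.

```python
# Toy two-state HMM; all numbers are illustrative placeholders.
a0 = {"I": 0.5, "N": 0.5}                              # begin-state transitions a_0k
a  = {"I": {"I": 0.9, "N": 0.1},                       # transitions a_kl
      "N": {"I": 0.1, "N": 0.9}}
e  = {"I": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},   # emissions e_k(b)
      "N": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def joint_probability(x, pi):
    """P(x, pi) = a_{0,pi_1} * prod_i e_{pi_i}(x_i) * a_{pi_i, pi_{i+1}},
    ignoring the transition into the end state for simplicity."""
    p = a0[pi[0]]
    for i, (state, symbol) in enumerate(zip(pi, x)):
        p *= e[state][symbol]
        if i + 1 < len(pi):
            p *= a[state][pi[i + 1]]
    return p

print(joint_probability("CGCGA", "IIIIN"))
```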

Decoding
In an HMM-generated sequence, we can't say for sure whether a particular symbol was generated by a particular state. Yet it is often the underlying states that we are most interested in finding out. This is called decoding (a term from speech recognition).

Most probable state path


Find

$$\pi^{*} = \operatorname{argmax}_{\pi}\, P(x, \pi)$$

Viterbi algorithm
Suppose $v_k(i)$ is the probability of the most probable path ending in state $k$ with observation $x_i$. Suppose $v_k(i)$ is known for all $k$. Then:

$$v_l(i+1) = e_l(x_{i+1}) \max_k \big( v_k(i)\, a_{kl} \big)$$

Initial condition: $v_0(0) = 1$.

Viterbi algorithm
Would you implement this as a recursion? No: use dynamic programming.
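A minimal dynamic-programming sketch of Viterbi for the toy two-state island/normal HMM used above (illustrative parameters; raw probabilities rather than logs, so it is only suitable for short sequences).

```python
states = ("I", "N")   # "I" = CpG island, "N" = normal (toy parameters)
a0 = {"I": 0.5, "N": 0.5}
a  = {"I": {"I": 0.9, "N": 0.1}, "N": {"I": 0.1, "N": 0.9}}
e  = {"I": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
      "N": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def viterbi(x):
    """Return the most probable state path for x, filling v_k(i) left to right."""
    v = {k: a0[k] * e[k][x[0]] for k in states}    # v_k(1)
    pointers = []                                  # best predecessor at each step
    for symbol in x[1:]:
        new_v, back = {}, {}
        for l in states:
            best_k = max(states, key=lambda k: v[k] * a[k][l])
            new_v[l] = e[l][symbol] * v[best_k] * a[best_k][l]
            back[l] = best_k
        pointers.append(back)
        v = new_v
    # Trace back from the best final state.
    last = max(states, key=lambda k: v[k])
    path = [last]
    for back in reversed(pointers):
        path.append(back[path[-1]])
    return "".join(reversed(path))

print(viterbi("CGCGCGATATAT"))   # expected: island states first, then normal
```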

Probability of sequence
Calculate the probability of the sequence by summing over all paths:

$$P(x) = \sum_{\pi} P(x, \pi)$$

The number of paths increases exponentially with the sequence length, so summing path by path is infeasible.

The Forward Algorithm


Define $f_k(i)$ as the probability of the sequence up to and including $x_i$, requiring that $\pi_i = k$. That is:

$$f_k(i) = P(x_1 \ldots x_i,\ \pi_i = k)$$

The recurrence for this is:

$$f_l(i+1) = e_l(x_{i+1}) \sum_k f_k(i)\, a_{kl}$$

$P(x)$ is simply

$$P(x) = \sum_k f_k(L)\, a_{k0}$$
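A sketch of the forward algorithm for the same toy two-state HMM (illustrative parameters; the end-state transitions $a_{k0}$ are dropped here for simplicity, so $P(x)$ is just the sum of the final forward values).

```python
states = ("I", "N")
a0 = {"I": 0.5, "N": 0.5}
a  = {"I": {"I": 0.9, "N": 0.1}, "N": {"I": 0.1, "N": 0.9}}
e  = {"I": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
      "N": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def forward(x):
    """Return the forward matrix f[i][k] = f_k(i) and P(x)."""
    f = [{k: a0[k] * e[k][x[0]] for k in states}]          # f_k(1)
    for symbol in x[1:]:
        f.append({l: e[l][symbol] * sum(f[-1][k] * a[k][l] for k in states)
                  for l in states})
    return f, sum(f[-1].values())

f, px = forward("CGCGA")
print(px)   # probability of the sequence, summed over all state paths
```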

Posterior state probabilities


We often want to know the most probable state for an observation $x_i$. This is not what the Viterbi algorithm gives us. In general, we want to calculate $P(\pi_i = k \mid x)$.

Posterior state probability


Consider the probability of producing the entire sequence while requiring that the $i$th symbol was produced by state $k$:

$$P(x,\ \pi_i = k) = P(x_1 \ldots x_i,\ \pi_i = k)\, P(x_{i+1} \ldots x_L \mid x_1 \ldots x_i,\ \pi_i = k) = P(x_1 \ldots x_i,\ \pi_i = k)\, P(x_{i+1} \ldots x_L \mid \pi_i = k)$$

The first term on the RHS is $f_k(i)$. Call the second term $b_k(i)$.

The Backward Algorithm


Define:

$$b_k(i) = P(x_{i+1} \ldots x_L \mid \pi_i = k)$$

Recurrence:

$$b_k(i) = \sum_l a_{kl}\, e_l(x_{i+1})\, b_l(i+1)$$

Posterior state probability


$$P(\pi_i = k \mid x) = \frac{f_k(i)\, b_k(i)}{P(x)}$$
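A sketch of the backward algorithm and posterior decoding for the same hypothetical two-state HMM (illustrative parameters; no end state, and no log-space rescaling, so only suitable for short sequences).

```python
states = ("I", "N")
a0 = {"I": 0.5, "N": 0.5}
a  = {"I": {"I": 0.9, "N": 0.1}, "N": {"I": 0.1, "N": 0.9}}
e  = {"I": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
      "N": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def forward(x):
    f = [{k: a0[k] * e[k][x[0]] for k in states}]
    for symbol in x[1:]:
        f.append({l: e[l][symbol] * sum(f[-1][k] * a[k][l] for k in states)
                  for l in states})
    return f

def backward(x):
    """b[i][k] = b_k(i) = P(x_{i+1} ... x_L | pi_i = k)."""
    b = [{k: 1.0 for k in states}]                          # b_k(L) = 1
    for symbol in reversed(x[1:]):
        b.insert(0, {k: sum(a[k][l] * e[l][symbol] * b[0][l] for l in states)
                     for k in states})
    return b

def posterior(x):
    """P(pi_i = k | x) = f_k(i) * b_k(i) / P(x)."""
    f, b = forward(x), backward(x)
    px = sum(f[-1].values())
    return [{k: f[i][k] * b[i][k] / px for k in states} for i in range(len(x))]

for i, post in enumerate(posterior("CGCGATAT")):
    print(i, post)
```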

Parameter Estimation for HMMs


Given x, how do we choose the transition and emission probabilities so as to maximize P(x)? Use Expectation Maximization.

Parameter Estimation for HMMs


Suppose the emission probabilities are fixed, and we want to learn the optimal transition probabilities. Recall from the E-M lecture that in each iteration we must update as follows:

$$a_{kl} = \frac{A_{kl}}{\sum_{l'} A_{kl'}}$$

where $A_{kl}$ is as defined on the next slide.

Parameter estimation for HMMs


$A_{kl}$ is the expected number of transitions from state $k$ to state $l$, under the distribution on paths given the sequence. Let $A_{kl}(\pi)$ be the number of transitions from $k$ to $l$ in path $\pi$; this is an integer.

$$A_{kl} = \sum_{\pi} A_{kl}(\pi)\, P(\pi \mid x)$$

Expectation step
Note that $A_{kl}$ is $E_{P(\pi \mid x)}\!\left[A_{kl}(\pi)\right]$. Note also that

$$A_{kl}(\pi) = \sum_{i=1}^{L} A^{i}_{kl}(\pi)$$

where $A^{i}_{kl}(\pi)$ is 1 if the state transitions from $k$ to $l$ at position $i$, and 0 otherwise. Hence

$$A_{kl} = E\!\left[A_{kl}(\pi)\right] = \sum_i E\!\left[A^{i}_{kl}(\pi)\right]$$

Expectation step
But:

$$E\!\left[A^{i}_{kl}(\pi)\right] = P\!\left(A^{i}_{kl}(\pi) = 1 \mid x\right)$$

This is given by:

$$P(\pi_i = k,\ \pi_{i+1} = l \mid x) = \frac{f_k(i)\, a_{kl}\, e_l(x_{i+1})\, b_l(i+1)}{P(x)}$$

Thus, having already run the forward and backward algorithms, we can calculate $A_{kl}$.
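A sketch of this E-step for the toy two-state HMM used earlier: it computes the expected transition counts $A_{kl}$ from the forward and backward values, then applies the M-step update for the transition probabilities (illustrative parameters, no end state).

```python
states = ("I", "N")
a0 = {"I": 0.5, "N": 0.5}
a  = {"I": {"I": 0.9, "N": 0.1}, "N": {"I": 0.1, "N": 0.9}}
e  = {"I": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
      "N": {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}}

def forward(x):
    f = [{k: a0[k] * e[k][x[0]] for k in states}]
    for s in x[1:]:
        f.append({l: e[l][s] * sum(f[-1][k] * a[k][l] for k in states)
                  for l in states})
    return f

def backward(x):
    b = [{k: 1.0 for k in states}]
    for s in reversed(x[1:]):
        b.insert(0, {k: sum(a[k][l] * e[l][s] * b[0][l] for l in states)
                     for k in states})
    return b

def expected_transition_counts(x):
    """E-step: A_kl = sum_i f_k(i) * a_kl * e_l(x_{i+1}) * b_l(i+1) / P(x)."""
    f, b = forward(x), backward(x)
    px = sum(f[-1].values())
    A = {k: {l: 0.0 for l in states} for k in states}
    for i in range(len(x) - 1):
        for k in states:
            for l in states:
                A[k][l] += f[i][k] * a[k][l] * e[l][x[i + 1]] * b[i + 1][l] / px
    return A

A = expected_transition_counts("CGCGATAT")
# M-step for transitions: a_kl <- A_kl / sum over l' of A_kl'
new_a = {k: {l: A[k][l] / sum(A[k].values()) for l in states} for k in states}
print(new_a)
```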

Also read Ch 3.6.


Numerical stability of HMM algorithms

Products of large numbers of probabilities give very small numbers that underflow. Work with logs: products become sums.
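A small sketch of the standard log-sum-exp trick that makes the forward/backward sums workable in log space (the function name and example values are just illustrative).

```python
import math

def logsumexp(log_values):
    """Numerically stable log(sum(exp(v))) over a list of log-probabilities."""
    m = max(log_values)
    if m == float("-inf"):          # all probabilities are zero
        return m
    return m + math.log(sum(math.exp(v - m) for v in log_values))

# The forward recurrence in log space becomes, for example:
#   log f_l(i+1) = log e_l(x_{i+1}) + logsumexp([log f_k(i) + log a_kl for k in states])
print(logsumexp([math.log(1e-300), math.log(2e-300)]))   # ~ log(3e-300), no underflow
```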
