HMM Bioinformatics

- A Hidden Markov Model (HMM) is a Markov model in which the states are hidden and each state has probabilities of emitting different output symbols.
- An HMM consists of states, transition probabilities between states, and emission probabilities of symbols for each state. The states are hidden and only the emitted symbols are observed.
- To analyze a sequence of emitted symbols from an HMM, the forward algorithm can be used to compute the probability of the sequence, and the Viterbi algorithm can be used to infer the most likely hidden states that produced it.


Hidden Markov Model (HMM)

Markov Chain
▪ A Markov chain is a model that tells us something about the probabilities of sequences of random variables (states), each of which can take on values from some set.
▪ A Markov chain makes a very strong assumption: if we want to predict the future in the sequence, all that matters is the current state.
▪ The states before the current state have no impact on the future except via the current state.
▪ It's as if to predict tomorrow's weather you could examine today's weather, but you weren't allowed to look at yesterday's weather.
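In symbols, this first-order Markov assumption can be written for a state sequence $q_1, q_2, \ldots$ as:

$P(q_i \mid q_1, q_2, \ldots, q_{i-1}) = P(q_i \mid q_{i-1})$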

Markov Chain: Weather Example

Markov Chain: “First-order observable Markov Model”

Hidden Markov Models
States Q = q1, q2, …, qN
Observations O = o1, o2, …, oT (a sequence of T observations)
■ Each observation is a symbol drawn from a vocabulary V = {v1, v2, …, vV}
Transition probabilities
■ Transition probability matrix A = {aij}, where aij is the probability of moving from state i to state j
Observation likelihoods
■ Output (emission) probability matrix B = {bi(k)}, the probability that state i emits symbol vk
Special initial probability vector π, where πi is the probability of starting in state i
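As a concrete illustration of these components, here is a minimal sketch (not part of the original slides) of the fair/biased-coin HMM used later in this deck, written out in Python; the uniform initial vector π is an assumption consistent with the start probabilities used in the later calculations:

# Minimal HMM specification: states Q, vocabulary V, initial vector pi,
# transition matrix A = {a_ij}, and emission matrix B = {b_i(k)}.
states = ["F", "B"]                      # Q: fair coin, biased coin
vocabulary = ["H", "T"]                  # V: observable symbols
pi = {"F": 0.5, "B": 0.5}                # initial probabilities (assumed uniform)
A = {"F": {"F": 0.8, "B": 0.2},          # transition probabilities a_ij
     "B": {"F": 0.2, "B": 0.8}}
B = {"F": {"H": 0.5, "T": 0.5},          # emission probabilities b_i(k)
     "B": {"H": 0.9, "T": 0.1}}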
A Markov model

The weather model

Practice Example
What is the probability of 4 consecutive hot days?
Sequence is: hot-hot-hot-hot

Markov Chain for Weather
What is the probability of 4 consecutive days of hot weather?
Sequence is hot-hot-hot-hot
I.e., state sequence is 1-1-1-1-1
P(1,1,1,1,1) = π1 · a11 · a11 · a11 · a11 = 0.5 × (0.5)^4 = 0.03125
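A minimal sketch of this computation in Python; since the weather figure itself is not reproduced in this text, the initial probability 0.5 and the hot→hot transition probability 0.5 are taken from the arithmetic above, and the remaining entries are placeholder assumptions:

# P(s1, ..., sn) = pi[s1] * a[s1][s2] * ... * a[s_{n-1}][s_n]
def state_sequence_probability(seq, pi, A):
    p = pi[seq[0]]
    for prev, curr in zip(seq, seq[1:]):
        p *= A[prev][curr]
    return p

pi = {"hot": 0.5, "cold": 0.5}                 # assumed initial distribution
A = {"hot": {"hot": 0.5, "cold": 0.5},         # assumed transition probabilities
     "cold": {"hot": 0.5, "cold": 0.5}}
print(state_sequence_probability(["hot"] * 5, pi, A))   # 0.5 * 0.5**4 = 0.03125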

Markov Model
A Markov model consists of three things:
• A set of states A, C, G, T
• Transition probabilities between states
• Probabilities for the starting states

[Figure: a generic two-state example with states X and Y, transition probabilities 0.2, 0.4, 0.7 and 0.1, and starting probabilities Prob(starting in X) = 1/4, Prob(starting in Y) = 1/5.]
Markov Model
A Markov model consists of three things:
• A set of states A, C, G, T
• Transition probabilities between states
• Probabilities for the starting states

Prob(starting in A) = 1/4
Prob(starting in T) = 1/4
Prob(starting in G) = 0
Prob(starting in C) = 1/2

Markov Model
A Markov model consists of three things:
• A set of states A, C, G, T
• Transition probabilities between states
• Probabilities for the starting states

Prob(starting in A) = 0.25, Prob(starting in T) = 0.25, Prob(starting in G) = 0, Prob(starting in C) = 0.5

Transition matrix (row = current state, column = next state):
        A      C      G      T
A      0.4    0.2    0.2    0.2
C      0.25   0.25   0.25   0.25
G      0.3    0.3    0.1    0.3
T      0.1    0.1    0.1    0.3


Markov Model
If you are given a Markov model like the one before and a gene, how do you figure out the probability of that gene?

Gene of interest: TGCTCAAA

Prob(starting in A) = 0.25, Prob(starting in T) = 0.25, Prob(starting in G) = 0, Prob(starting in C) = 0.5

Transition matrix (row = current state, column = next state):
        A      C      G      T
A      0.4    0.2    0.2    0.2
C      0.25   0.25   0.25   0.25
G      0.3    0.3    0.1    0.3
T      0.1    0.1    0.1    0.3


Markov Model
Gene of interest: TGCTCAAA

Prob(starting in A) = 0.25, Prob(starting in T) = 0.25, Prob(starting in G) = 0, Prob(starting in C) = 0.5

Transition matrix (row = current state, column = next state):
        A      C      G      T
A      0.4    0.2    0.2    0.2
C      0.25   0.25   0.25   0.25
G      0.3    0.3    0.1    0.3
T      0.1    0.1    0.1    0.3

What is the probability of starting with state/nucleotide T? ¼

We are in state T; What is the probability of transiting to G? 0.1
We are in state G; What is the probability of transiting to C? 0.3
We are in state C; What is the probability of transiting to T? 0.25
We are in state T; What is the probability of transiting to C? 0.1
We are in state C; What is the probability of transiting to A? 0.25
We are in state A; What is the probability of transiting to A? 0.4
We are in state A; What is the probability of transiting to A? 0.4
Markov Model
Gene of interest: TGCTCAAA

Prob(starting in A) = 0.25, Prob(starting in T) = 0.25, Prob(starting in G) = 0, Prob(starting in C) = 0.5

What is the probability of starting with state/nucleotide T? ¼
We are in state T; What is the probability of transiting to G? 0.1
We are in state G; What is the probability of transiting to C? 0.3
We are in state C; What is the probability of transiting to T? 0.25
We are in state T; What is the probability of transiting to C? 0.1
We are in state C; What is the probability of transiting to A? 0.25
We are in state A; What is the probability of transiting to A? 0.4
We are in state A; What is the probability of transiting to A? 0.4

So we want the probability that all of these things happen; this is the joint probability.
Prob(T) × Prob(T to G) × Prob(G to C) × Prob(C to T) × Prob(T to C) × Prob(C to A) × Prob(A to A) × Prob(A to A)
= 0.25 × 0.1 × 0.3 × 0.25 × 0.1 × 0.25 × 0.4 × 0.4 = 0.0000075
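The same joint-probability computation can be written as a short Python sketch, using the transition matrix and starting probabilities from this slide:

# Probability of a nucleotide sequence under the Markov model above.
start = {"A": 0.25, "C": 0.5, "G": 0.0, "T": 0.25}        # starting probabilities
A = {"A": {"A": 0.4,  "C": 0.2,  "G": 0.2,  "T": 0.2},    # transition matrix (row = from, column = to)
     "C": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
     "G": {"A": 0.3,  "C": 0.3,  "G": 0.1,  "T": 0.3},
     "T": {"A": 0.1,  "C": 0.1,  "G": 0.1,  "T": 0.3}}

def gene_probability(gene, start, A):
    p = start[gene[0]]
    for prev, curr in zip(gene, gene[1:]):
        p *= A[prev][curr]
    return p

print(gene_probability("TGCTCAAA", start, A))   # 7.5e-06, i.e. 0.0000075 as above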


Markov Model
• An HMM is a Markov model that emits symbols at each state, with different emission probabilities at each state.
• You are at a casino and the dealer has two coins:
  one coin is fair: 50% Heads and 50% Tails
  one coin is biased: 90% Heads and 10% Tails
Goal: for each coin toss, guess which coin was used.

[Figure: two states, F (fair) and B (biased). F emits H with 0.5 and T with 0.5; B emits H with 0.9 and T with 0.1. Transitions: F→F = 0.8, F→B = 0.2, B→B = 0.8, B→F = 0.2.]

State:     F F B B B B F F
Emissions: T H H H T H T H
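Because only the emissions are visible to the player, it can help to see the generative process explicitly. Below is a minimal sketch (an illustration, not from the slides) that samples a state path and its emissions from this coin HMM; the uniform start distribution is an assumption consistent with the probability calculations a few slides later:

import random

pi = {"F": 0.5, "B": 0.5}                                   # assumed start distribution
A = {"F": {"F": 0.8, "B": 0.2}, "B": {"F": 0.2, "B": 0.8}}  # transitions
B = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.9, "T": 0.1}}  # emissions

def sample(n, pi, A, B):
    """Sample n (state, emission) pairs from the HMM."""
    states, emissions = [], []
    state = random.choices(list(pi), weights=list(pi.values()))[0]
    for _ in range(n):
        states.append(state)
        emissions.append(random.choices(list(B[state]), weights=list(B[state].values()))[0])
        state = random.choices(list(A[state]), weights=list(A[state].values()))[0]
    return states, emissions

print(sample(8, pi, A, B))   # the dealer sees both lists; the player only sees the emissions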
Markov Model
• The player only sees the emissions
• The states are hidden

State: F F B B B B F F
Emissions: T H H H T H T H

Markov Model
What would be your guess for states from the following emissions?

Emissions: T H H H T H T H H H H T T H H H H
States:

[Figure: the same two-state coin HMM as before. F emits H with 0.5 and T with 0.5; B emits H with 0.9 and T with 0.1; transitions F→F = B→B = 0.8 and F→B = B→F = 0.2.]
Markov Model: Forward Algorithm
• It is easy for the dealer to compute the probability, because the dealer knows both the states and the emissions.

[Figure: the same two-state coin HMM. F emits H/T with 0.5/0.5; B emits H/T with 0.9/0.1; F→F = B→B = 0.8, F→B = B→F = 0.2.]

State:     F F B B B
Emissions: T H H H H

P(start in F) · P(emit T in F) · P(stay in F) · P(emit H in F) · P(F→B) · P(emit H in B) · P(stay in B) · P(emit H in B) · P(stay in B) · P(emit H in B)
= 0.5 × 0.5 × 0.8 × 0.5 × 0.2 × 0.9 × 0.8 × 0.9 × 0.8 × 0.9 ≈ 0.00933
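A minimal sketch of the dealer's computation (a known state path and its emissions), reusing the coin HMM parameters from the slide:

def joint_probability(path, emissions, pi, A, B):
    """P(path, emissions): start probability, then alternating
    emission and transition probabilities along the known path."""
    p = pi[path[0]] * B[path[0]][emissions[0]]
    for i in range(1, len(path)):
        p *= A[path[i - 1]][path[i]] * B[path[i]][emissions[i]]
    return p

pi = {"F": 0.5, "B": 0.5}
A = {"F": {"F": 0.8, "B": 0.2}, "B": {"F": 0.2, "B": 0.8}}
B = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.9, "T": 0.1}}
print(joint_probability("FFBBB", "THHHH", pi, A, B))   # ≈ 0.00933, as above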
Markov Model: Forward Algorithm
• But it is not easy for the player, who does not know the states.
• Since there are five emitted symbols and two states, there are 2^5 = 32 possible state sequences.

Case 1: all five tosses from the fair coin (state path F F F F F)
P(start in F) · P(emit T in F) · P(stay in F) · P(emit H in F) · P(stay in F) · P(emit H in F) · P(stay in F) · P(emit H in F) · P(stay in F) · P(emit H in F)
= 0.5 × 0.5 × 0.8 × 0.5 × 0.8 × 0.5 × 0.8 × 0.5 × 0.8 × 0.5 = 0.0064
Markov Model: Forward Algorithm
• But it is not easy for the player, who does not know the states.
• Since there are five emitted symbols and two states, there are 2^5 = 32 possible state sequences.

Case 2: state path F F F F B
P(start in F) · P(emit T in F) · P(stay in F) · P(emit H in F) · P(stay in F) · P(emit H in F) · P(stay in F) · P(emit H in F) · P(F→B) · P(emit H in B)
= 0.5 × 0.5 × 0.8 × 0.5 × 0.8 × 0.5 × 0.8 × 0.5 × 0.2 × 0.9 ≈ 0.0029
Markov Model: Forward Algorithm
• But it is not easy for the player, who does not know the states.
• Since there are five emitted symbols and two states, there are 2^5 = 32 possible state sequences.

Case 3: state path F F F B F
P(start in F) · P(emit T in F) · P(stay in F) · P(emit H in F) · P(stay in F) · P(emit H in F) · P(F→B) · P(emit H in B) · P(B→F) · P(emit H in F)
= 0.5 × 0.5 × 0.8 × 0.5 × 0.8 × 0.5 × 0.2 × 0.9 × 0.2 × 0.5 ≈ 0.0007
Markov Model: Forward Algorithm
• But it is not easy for the player, who does not know the states.
• Since there are five emitted symbols and two states, there are 2^5 = 32 possible state sequences.

Case N (the last of the 32 cases): all five tosses from the biased coin (state path B B B B B)
P(start in B) · P(emit T in B) · P(stay in B) · P(emit H in B) · P(stay in B) · P(emit H in B) · P(stay in B) · P(emit H in B) · P(stay in B) · P(emit H in B)
= 0.5 × 0.1 × 0.8 × 0.9 × 0.8 × 0.9 × 0.8 × 0.9 × 0.8 × 0.9 ≈ 0.0134
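Rather than enumerating all 32 cases by hand, the forward algorithm computes the total probability of the emission sequence by dynamic programming, summing over all possible state paths. A minimal sketch with the coin HMM parameters from these slides:

def forward(emissions, states, pi, A, B):
    """Total probability of the emissions, summed over all hidden state paths."""
    # f[s] = probability of emitting the prefix seen so far and ending in state s
    f = {s: pi[s] * B[s][emissions[0]] for s in states}
    for obs in emissions[1:]:
        f = {s: B[s][obs] * sum(f[r] * A[r][s] for r in states) for s in states}
    return sum(f.values())

states = ["F", "B"]
pi = {"F": 0.5, "B": 0.5}
A = {"F": {"F": 0.8, "B": 0.2}, "B": {"F": 0.2, "B": 0.8}}
B = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.9, "T": 0.1}}
print(forward("THHHH", states, pi, A, B))   # equals the sum of the 32 per-case probabilities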
Viterbi Algorithm
• Find the most likely state sequence
• Input: an HMM and an emission sequence
• Output: the state sequence with maximum probability
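A minimal sketch of the Viterbi algorithm in Python, keeping track of the best predecessor at each step so the state path can be recovered by backtracking; the coin HMM from the earlier slides is used as example input:

def viterbi(emissions, states, pi, A, B):
    """Most likely state path for the emissions, plus its probability."""
    v = {s: pi[s] * B[s][emissions[0]] for s in states}   # best path probability ending in s
    back = []                                             # best predecessor at each step
    for obs in emissions[1:]:
        prev = {s: max(states, key=lambda r: v[r] * A[r][s]) for s in states}
        v = {s: B[s][obs] * v[prev[s]] * A[prev[s]][s] for s in states}
        back.append(prev)
    state = max(states, key=lambda s: v[s])               # best final state
    path = [state]
    for prev in reversed(back):                           # backtrack
        state = prev[state]
        path.append(state)
    return list(reversed(path)), max(v.values())

states = ["F", "B"]
pi = {"F": 0.5, "B": 0.5}
A = {"F": {"F": 0.8, "B": 0.2}, "B": {"F": 0.2, "B": 0.8}}
B = {"F": {"H": 0.5, "T": 0.5}, "B": {"H": 0.9, "T": 0.1}}
print(viterbi("THHHH", states, pi, A, B))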
Scenario: The Occasionally Dishonest Casino Problem
• A casino uses a fair die most of the time, but occasionally switches to a loaded one
  – Fair die: Prob(1) = Prob(2) = … = Prob(6) = 1/6
  – Loaded die: Prob(1) = Prob(2) = … = Prob(5) = 1/10, Prob(6) = 1/2
  – These are the emission probabilities
• Transition probabilities
  – Prob(Fair → Loaded) = 0.01
  – Prob(Loaded → Fair) = 0.2
  – Transitions between states obey a Markov process

An HMM for the Occasionally Dishonest Casino

[Figure: two states, Fair and Loaded.
Transition probabilities a_kl: Fair→Fair = 0.99, Fair→Loaded = 0.01, Loaded→Fair = 0.2, Loaded→Loaded = 0.80.
Emission probabilities e_k(b): Fair emits 1–6 with probability 1/6 each; Loaded emits 1–5 with probability 1/10 each and 6 with probability 1/2.]
The Viterbi Algorithm: Outcome = 6, 2, 6
The start probability for each state is 0.5, and the Viterbi recursion is

$v_k(i) = e_k(x_i)\,\max_r \left[ v_r(i-1)\, a_{rk} \right]$

Viterbi table (rows = state, columns = observed roll):

          6                         2                                              6
Fair      (1/6) × (1/2) = 1/12      (1/6) × max{(1/12) × 0.99, (1/4) × 0.2}        (1/6) × max{0.01375 × 0.99, 0.02 × 0.2}
                                    = 0.01375                                      = 0.00226875
Loaded    (1/2) × (1/2) = 1/4       (1/10) × max{(1/12) × 0.01, (1/4) × 0.8}       (1/2) × max{0.01375 × 0.01, 0.02 × 0.8}
                                    = 0.02                                         = 0.008

[Figure: the same Fair/Loaded HMM as on the previous slide: Fair→Fair = 0.99, Fair→Loaded = 0.01, Loaded→Fair = 0.2, Loaded→Loaded = 0.80; Fair emits 1–6 with probability 1/6 each, Loaded emits 1–5 with probability 1/10 each and 6 with probability 1/2.]
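The table above can be reproduced with a short Python sketch (the dictionaries below encode the Fair/Loaded model from the previous slide; the printed values match the table up to floating-point rounding, with the bottom-right cell equal to (1/2) × 0.016 = 0.008):

# Viterbi columns for the outcome 6, 2, 6 with the occasionally dishonest casino HMM.
states = ["Fair", "Loaded"]
pi = {"Fair": 0.5, "Loaded": 0.5}                    # start probability 0.5 for each state
A = {"Fair": {"Fair": 0.99, "Loaded": 0.01},
     "Loaded": {"Fair": 0.2, "Loaded": 0.8}}
E = {"Fair": {r: 1 / 6 for r in range(1, 7)},        # fair die: 1/6 for every face
     "Loaded": {**{r: 1 / 10 for r in range(1, 6)}, 6: 1 / 2}}

outcome = [6, 2, 6]
v = {s: pi[s] * E[s][outcome[0]] for s in states}    # first column: 1/12 and 1/4
print(v)
for roll in outcome[1:]:
    v = {s: E[s][roll] * max(v[r] * A[r][s] for r in states) for s in states}
    print(v)                                         # remaining columns of the table

Backtracking through the maxima then recovers the most likely die sequence for the three rolls.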
