Title : Hidden Markov Model

The Hidden Markov Model is a finite set of states, each of which is associated with a
(generally multidimensional) probability distribution. Transitions among the states are
governed by a set of probabilities called transition probabilities. In a particular state an
outcome or observation can be generated, according to the associated probability distribution.
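Formally, such a model can be written as λ = (A, B, π), where A = {a_ij} holds the state transition probabilities, B = {b_j(o)} the emission probabilities, and π the initial state probabilities; this standard notation is used in the algorithm description below.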
Approach : The approach we have used is unsupervised and is similar to what we would achieve through clustering. We can train a k-state HMM on the data and, at the end of the training process, run the Viterbi algorithm on the sequence to get the most likely state associated with each input vector (or obtain this directly during the training process). This gives a clustering of the input sequence into k classes, but unlike the clustering obtained by running the data through k-means, this clustering is homogeneous on the time axis.
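
A minimal sketch of the Viterbi decoding step in Java (training is not shown, the class and method names are illustrative, and the two-state parameters are those of the worked example later in this write-up):

// Viterbi decoding sketch for a discrete-output HMM: finds the most likely
// state sequence for an observation sequence. Parameters are assumed given.
class ViterbiDemo
{
    static int[] viterbi(double[] ip, double[][] tp, double[][] ep, int[] obs)
    {
        int n = ip.length, t = obs.length;
        double[][] delta = new double[t][n];   // best path probability ending in each state
        int[][] psi = new int[t][n];           // back-pointer to the best predecessor state
        for (int j = 0; j < n; j++)
            delta[0][j] = ip[j] * ep[j][obs[0]];
        for (int i = 1; i < t; i++)
            for (int j = 0; j < n; j++)
            {
                for (int k = 0; k < n; k++)
                {
                    double p = delta[i - 1][k] * tp[k][j];
                    if (p > delta[i][j]) { delta[i][j] = p; psi[i][j] = k; }
                }
                delta[i][j] *= ep[j][obs[i]];  // emit the observed symbol in state j
            }
        int[] path = new int[t];               // backtrack from the most probable final state
        for (int j = 1; j < n; j++)
            if (delta[t - 1][j] > delta[t - 1][path[t - 1]]) path[t - 1] = j;
        for (int i = t - 1; i > 0; i--)
            path[i - 1] = psi[i][path[i]];
        return path;
    }

    public static void main(String[] args)
    {
        double[] ip = {0.5, 0.5};                   // initial probabilities
        double[][] tp = {{0.9, 0.1}, {0.8, 0.2}};   // transition probabilities
        double[][] ep = {{0.5, 0.5}, {0.25, 0.75}}; // emission probabilities
        int[] obs = {1, 1, 1};                      // the string "222" as symbol indices
        System.out.println(java.util.Arrays.toString(viterbi(ip, tp, ep, obs)));  // [1, 0, 0], i.e. S2 S1 S1
    }
}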
Algorithm :
Let α_t(i) be the probability that the partial observation sequence O_t = {o(1), o(2), ..., o(t)} is produced by all possible state sequences that end in the i-th state:

α_t(i) = P(o(1), o(2), ..., o(t), q(t) = q_i).

Then the unconditional probability of the partial observation sequence is the sum of α_t(i) over all N states. The Forward Algorithm calculates α_t(i) recursively for observation sequences of increasing length t.
First, the probabilities for the single-symbol sequence are calculated as the product of the initial i-th state probability π_i and the emission probability b_i(o(1)) of the given symbol o(1) in the i-th state:

α_1(i) = π_i b_i(o(1)).

Then the recursive formula is applied. Assume we have calculated α_t(i) for some t. To calculate α_{t+1}(j), we multiply every α_t(i) by the corresponding transition probability a_ij from the i-th state to the j-th state, sum the products over all states, and then multiply the result by the emission probability of the symbol o(t+1):

α_{t+1}(j) = [ Σ_i α_t(i) a_ij ] b_j(o(t+1)).

Iterating the process, we eventually calculate α_T(i); summing these over all states gives the required probability P(O) = Σ_i α_T(i).
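
A minimal sketch of this recursion in Java, assuming a discrete-output model stored in the same array layout as the program code below (ip for initial, tp for transition, ep for emission probabilities; class and method names are illustrative):

// Forward algorithm sketch: computes P(o(1), ..., o(T)) for a discrete-output HMM.
class ForwardDemo
{
    static double forward(double[] ip, double[][] tp, double[][] ep, int[] obs)
    {
        int n = ip.length;
        double[] alpha = new double[n];
        for (int i = 0; i < n; i++)            // base case: alpha_1(i) = pi_i * b_i(o(1))
            alpha[i] = ip[i] * ep[i][obs[0]];
        for (int t = 1; t < obs.length; t++)   // recursion over the growing prefix
        {
            double[] next = new double[n];
            for (int j = 0; j < n; j++)
            {
                double s = 0;                  // sum_i alpha_t(i) * a_ij
                for (int i = 0; i < n; i++)
                    s += alpha[i] * tp[i][j];
                next[j] = s * ep[j][obs[t]];   // times b_j(o(t+1))
            }
            alpha = next;
        }
        double p = 0;                          // termination: sum alpha_T(i) over all states
        for (int i = 0; i < n; i++)
            p += alpha[i];
        return p;
    }

    public static void main(String[] args)
    {
        // two-state parameters from the worked example below; "222" = symbol index 1 three times
        double[] ip = {0.5, 0.5};
        double[][] tp = {{0.9, 0.1}, {0.8, 0.2}};
        double[][] ep = {{0.5, 0.5}, {0.25, 0.75}};
        System.out.println(forward(ip, tp, ep, new int[]{1, 1, 1}));  // approx. 0.1790625, the sum of the eight path probabilities
    }
}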
Explanation :
The numbers in the model represent the probabilities of transitioning between the various states, or of emitting certain symbols. For example, state S1 has a 90% chance of transitioning back to itself (and a 10% chance of moving to S2); each time it is visited, there is a 50% chance that it emits a 1, and a 50% chance that it emits a 2. State S2 moves to S1 with probability 0.8 and stays in S2 with probability 0.2, and it emits a 2 with probability 0.75 (a 1 with probability 0.25). Both states are equally likely as the starting state (probability 0.5 each).
Clearly, this model can be used to produce strings of 1s and 2s that fit its parameters. But we can use it to answer a much more interesting question: given a string of 1s and 2s, which sequence of states most likely generated the string? This is why it is described as a hidden Markov model; the states that were responsible for emitting the various symbols are unknown, and we would like to establish which sequence of states is most likely to have produced the sequence of symbols.
Let's look at what might have generated the string 222. We can examine every possible combination of 3 states to establish which was most likely responsible for that string.

S1 S1 S1: 0.5 (initial probability of being in state 1) * 0.5 (probability of S1 emitting a 2) * 0.9 (probability of S1 transitioning to S1) * 0.5 (probability of S1 emitting a 2) * 0.9 (probability of S1 transitioning to S1) * 0.5 (probability of S1 emitting a 2) = 0.050625
S1 S1 S2: 0.5 * 0.5 * 0.9 * 0.5 * 0.1 * 0.75 = 0.0084375 (less likely than the previous sequence)
S1 S2 S1: 0.5 * 0.5 * 0.1 * 0.75 * 0.8 * 0.5 = 0.0075
S1 S2 S2: 0.5 * 0.5 * 0.1 * 0.75 * 0.2 * 0.75 = 0.0028125
S2 S1 S1: 0.5 * 0.75 * 0.8 * 0.5 * 0.9 * 0.5 = 0.0675
S2 S1 S2: 0.5 * 0.75 * 0.8 * 0.5 * 0.1 * 0.75 = 0.01125
S2 S2 S1: 0.5 * 0.75 * 0.2 * 0.75 * 0.8 * 0.5 = 0.0225
S2 S2 S2: 0.5 * 0.75 * 0.2 * 0.75 * 0.2 * 0.75 = 0.0084375

The largest of these values is 0.0675, so S2 S1 S1 is the state sequence most likely to have generated 222. Note that this brute-force enumeration examines N^T = 2^3 = 8 paths and grows exponentially with the length of the string; the Viterbi algorithm sketched earlier finds the same answer in time proportional to N^2 * T.

Applications :

Protein folding - Protein folding is the physical process by which a protein chain acquires its native three-dimensional structure, a conformation that is usually biologically functional, in an expeditious and reproducible manner. Hidden Markov Models can be used to model the process by which a protein folds into its characteristic and functional three-dimensional structure from a random coil.
Alignment of bio-sequences - In bioinformatics, a sequence alignment is a way of arranging the sequences of DNA, RNA, or protein to identify regions of similarity that may be a consequence of functional, structural, or evolutionary relationships between the sequences.

Conclusion :
Thus, we have implemented the Hidden Markov Model successfully.

Program Code :
import java.util.*;

class hmm
{
    public static void main(String args[])
    {
        Scanner sc = new Scanner(System.in);
        double prob[] = new double[8];           // one entry per possible 3-state path
        int l = 0;

        System.out.print("Initial probabilities: ");
        double ip[] = new double[2];
        ip[0] = sc.nextDouble();
        ip[1] = sc.nextDouble();

        System.out.println("Transition probabilities:");
        double tp[][] = new double[2][2];
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                tp[i][j] = sc.nextDouble();

        System.out.println("Emission probabilities:");
        double ep[][] = new double[2][2];
        for (int i = 0; i < 2; i++)
            for (int j = 0; j < 2; j++)
                ep[i][j] = sc.nextDouble();

        System.out.print("Enter the observed symbol index (0 or 1): ");
        int op = sc.nextInt();

        // probability of each 3-state path emitting the symbol three times
        prob[l++] = prob(op, 0, 0, 0, ip, tp, ep);   // S1 S1 S1
        prob[l++] = prob(op, 0, 0, 1, ip, tp, ep);   // S1 S1 S2
        prob[l++] = prob(op, 0, 1, 0, ip, tp, ep);   // S1 S2 S1
        prob[l++] = prob(op, 0, 1, 1, ip, tp, ep);   // S1 S2 S2
        prob[l++] = prob(op, 1, 0, 0, ip, tp, ep);   // S2 S1 S1
        prob[l++] = prob(op, 1, 0, 1, ip, tp, ep);   // S2 S1 S2
        prob[l++] = prob(op, 1, 1, 0, ip, tp, ep);   // S2 S2 S1
        prob[l++] = prob(op, 1, 1, 1, ip, tp, ep);   // S2 S2 S2

        for (int i = 0; i < l; i++)
            System.out.println(prob[i]);

        double max = 0;
        for (int i = 0; i < l; i++)
            if (prob[i] > max)
                max = prob[i];
        System.out.println("max is " + max);
    }

    // probability that the state path a -> b -> c emits the symbol e at every step
    static double prob(int e, int a, int b, int c, double ip[], double tp[][], double ep[][])
    {
        return ip[a] * ep[a][e]      // start in state a and emit e
             * tp[a][b] * ep[b][e]   // transition to b and emit e
             * tp[b][c] * ep[c][e];  // transition to c and emit e
    }
}
Output :
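A sample run with the parameters of the worked example above (initial probabilities 0.5 0.5; transition probabilities 0.9 0.1 0.8 0.2; emission probabilities 0.5 0.5 0.25 0.75; observed symbol index 1, i.e. the symbol 2) prints the eight path probabilities in the order listed in the Explanation section: 0.050625, 0.0084375, 0.0075, 0.0028125, 0.0675, 0.01125, 0.0225, 0.0084375 (possibly with small floating-point rounding in the last digits), followed by "max is 0.0675".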
