NLP Assignment 5
POORVAJA R
AIML
III YEAR
The forward algorithm computes the forward probabilities recursively. The forward variable is defined as

α_i(t) = P(o_1, o_2, ..., o_t, q_t = s_i | λ)

where α_i(t) is the probability of observing the first t observations and being in state i at time t, given the HMM λ. The forward algorithm starts by computing the base case, which is the probability of starting in state i and observing the first observation:

α_i(1) = π_i · b_i(o_1)

where π_i is the initial probability of being in state i, and b_i(o_1) is the probability of observing the first observation o_1, given that the system is in state i.
The forward algorithm then computes the forward probabilities for each time step t > 1, using the following recursive formula:

α_j(t) = [ Σ_i α_i(t-1) · a_ij ] · b_j(o_t)

where a_ij is the probability of transitioning from state i to state j, and the sum runs over all possible states i.
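A minimal sketch of the forward recursion in Python (assuming NumPy is available, that π, A, and B are stored as arrays, and that the observations are given as integer indices into the vocabulary):

import numpy as np

def forward(pi, A, B, obs):
    """Forward probabilities: alpha[t, i] = P(o_1..o_t, q_t = s_i | lambda).
    pi: (N,) initial distribution; A: (N, N) transitions A[i, j] = P(s_j | s_i);
    B: (N, M) emissions B[i, k] = P(o_k | s_i); obs: list of observation indices."""
    N, T = len(pi), len(obs)
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]                   # base case: alpha_i(1) = pi_i * b_i(o_1)
    for t in range(1, T):
        # recursion: alpha_j(t) = (sum_i alpha_i(t-1) * a_ij) * b_j(o_t)
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    return alpha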
The backward algorithm computes the backward probabilities recursively. The backward variable is defined as

β_i(t) = P(o_t+1, o_t+2, ..., o_T | q_t = s_i, λ)

where β_i(t) is the probability of observing the future observations, given that the system is in state i at time t and the HMM λ. The backward algorithm starts by computing the base case at the last time step, where there are no future observations left to account for:

β_i(T) = 1

The backward algorithm then computes the backward probabilities for each time step t < T, using the following recursive formula:

β_i(t) = Σ_j a_ij · b_j(o_t+1) · β_j(t+1)

where a_ij is the probability of transitioning from state i to state j, b_j(o_t+1) is the probability of observing the observation o_t+1, given that the system is in state j, and the sum is over all possible states j.
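Under the same array conventions as the forward sketch, the backward recursion could be written as follows (again only a sketch):

import numpy as np

def backward(A, B, obs):
    """Backward probabilities: beta[t, i] = P(o_{t+1}..o_T | q_t = s_i, lambda)."""
    N, T = A.shape[0], len(obs)
    beta = np.zeros((T, N))
    beta[T - 1] = 1.0                              # base case: beta_i(T) = 1
    for t in range(T - 2, -1, -1):
        # recursion: beta_i(t) = sum_j a_ij * b_j(o_{t+1}) * beta_j(t+1)
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta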
Once the forward and backward probabilities have been computed, the probability of being in state i at time t, given the entire observation sequence, can be computed using the following formula:

γ_i(t) = α_i(t) · β_i(t) / Σ_j α_j(t) · β_j(t)

where the denominator equals the total probability of the observation sequence, P(O | λ).
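Given the alpha and beta arrays produced by the two sketches above, the state posteriors can be obtained by combining them and normalising at each time step:

def state_posteriors(alpha, beta):
    """gamma[t, i] = P(q_t = s_i | O, lambda) from forward and backward probabilities."""
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)   # each row sums to P(O | lambda) before normalising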
The HMM-based POS tagging model can be represented using the following
components:
A set of states S = {s1, s2, ..., sn} representing the different parts of speech.
A set of observations O = {o1, o2, ..., om} representing the different words in the text.
An initial probability distribution π, which gives the probability of starting in each state.
A transition probability matrix A, which gives the probability of moving from one state to
another.
An emission probability matrix B, which gives the probability of observing each
observation given the current state.
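As a small illustration (the tag set and vocabulary here are placeholders matching the example later in this assignment), these components map naturally onto a handful of NumPy arrays:

import numpy as np

states = ["Noun", "Verb", "Article", "Preposition"]   # S: part-of-speech tags
vocab  = ["the", "cat", "sat", "on", "mat"]           # O: words observed in the text
pi = np.full(len(states), 1 / len(states))            # initial distribution (uniform here)
A  = np.zeros((len(states), len(states)))             # A[i, j] = P(next tag s_j | current tag s_i)
B  = np.zeros((len(states), len(vocab)))              # B[i, k] = P(word o_k | tag s_i)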
The transition probability matrix A and emission probability matrix B can be estimated
from a corpus of labeled data using the maximum likelihood estimation method. The
initial probability distribution π can be set to be uniform or estimated from the corpus.
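A rough sketch of that estimation step, assuming the labeled corpus is given as a list of sentences, each a list of (word, tag) pairs (the function name and data layout are assumptions for illustration):

from collections import Counter

def estimate_hmm(tagged_sentences):
    """Maximum likelihood estimates of pi, A, B from sentences of (word, tag) pairs."""
    start, trans, emit, tag_count = Counter(), Counter(), Counter(), Counter()
    for sentence in tagged_sentences:
        tags = [tag for _, tag in sentence]
        start[tags[0]] += 1
        for word, tag in sentence:
            tag_count[tag] += 1
            emit[(tag, word)] += 1
        for prev, nxt in zip(tags, tags[1:]):
            trans[(prev, nxt)] += 1
    pi = {t: start[t] / len(tagged_sentences) for t in tag_count}
    # relative frequencies; tag_count is used as the denominator (a common approximation)
    A = {(i, j): c / tag_count[i] for (i, j), c in trans.items()}
    B = {(t, w): c / tag_count[t] for (t, w), c in emit.items()}
    return pi, A, B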
Given a sequence of words, the HMM-based POS tagging algorithm computes the most
likely sequence of part-of-speech tags that would produce the observed sequence of
words. This is done using the Viterbi algorithm, which is a dynamic programming
algorithm that computes the maximum likelihood sequence of hidden states.
The Viterbi algorithm starts by computing the probability of being in each state at the first time step, given the first observation. This is done using the following formula:

δ_i(1) = π_i · b_i(o_1)

where δ_i(1) is the probability of starting in state i and observing the first observation o_1.
The algorithm then computes the maximum likelihood sequence of hidden states recursively, using the following formula:

δ_i(t) = max_j [ δ_j(t-1) · a_ji ] · b_i(o_t)

where δ_i(t) is the probability of the most likely state sequence that ends in state i at time step t and accounts for the observations up to time t, and the maximum is taken over all possible previous states j. At each step the algorithm also stores a backpointer ψ_i(t) = argmax_j [ δ_j(t-1) · a_ji ], so that the best sequence of states can be recovered by backtracking from the final time step.
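A minimal Viterbi sketch, under the same array conventions used in the forward and backward sketches above:

import numpy as np

def viterbi(pi, A, B, obs):
    """Return the most likely state sequence (as state indices) for the observations in obs."""
    N, T = len(pi), len(obs)
    delta = np.zeros((T, N))                       # delta[t, i]: best score ending in state i at time t
    psi = np.zeros((T, N), dtype=int)              # psi[t, i]: backpointer to the best previous state
    delta[0] = pi * B[:, obs[0]]                   # base case: delta_i(1) = pi_i * b_i(o_1)
    for t in range(1, T):
        scores = delta[t - 1][:, None] * A         # scores[j, i] = delta_j(t-1) * a_ji
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) * B[:, obs[t]]   # recursion: delta_i(t) = max_j(...) * b_i(o_t)
    path = [int(delta[T - 1].argmax())]            # most probable final state
    for t in range(T - 1, 0, -1):                  # follow backpointers to recover the sequence
        path.append(int(psi[t][path[-1]]))
    return list(reversed(path))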
Once the most likely sequence of hidden states has been computed using the Viterbi
algorithm, the corresponding part-of-speech tags can be assigned to the words in the
observed sequence.
HMM-based POS tagging has been used in many natural language processing
applications, such as information extraction, text summarization, and machine
translation. It is a simple and effective approach for POS tagging that can be easily
extended to handle more complex models, such as hidden semi-Markov models or
neural network models.
The Viterbi algorithm is a dynamic programming algorithm used to find the most likely
sequence of hidden states in a Hidden Markov Model (HMM). It is commonly used in
natural language processing tasks such as part-of-speech tagging, where the goal is to
assign a part of speech tag to each word in a sentence.
The Viterbi algorithm works by recursively computing the maximum likelihood probability
of each state at each time step, given the observations up to that time step. At each
time step, the algorithm keeps track of the most likely sequence of states that would
produce the observed sequence of observations up to that time step. Once the
algorithm has computed the most likely sequence of states, the corresponding output
sequence can be determined.
Let's take an example of using the Viterbi algorithm for part-of-speech tagging. Consider
the following sentence: "The cat sat on the mat". We want to assign a part of speech tag
to each word in the sentence. We can represent this problem as an HMM, where the
hidden states represent the part of speech tags and the observations represent the
words in the sentence.
We can estimate the transition probability matrix A and emission probability matrix B
from a training corpus. For simplicity, let's assume that we have estimated the matrices
as follows:
A = (rows: current tag, columns: next tag)

              Noun    Verb    Article    Preposition
Noun          0.2     0.4     0.1        0.3
Verb          0.1     0.3     0.2        0.4
Article       0.6     0.1     0.2        0.1
Preposition   0.4     0.2     0.3        0.1

B = (rows: tag, columns: word)

              the     cat     sat     on      mat
Noun          0.2     0.3     0.1     0.2     0.2
Verb          0.1     0.2     0.3     0.2     0.2
Article       0.6     0.1     0.1     0.1     0.1
Preposition   0.1     0.1     0.2     0.4     0.2
Now, we can apply the Viterbi algorithm to find the most likely sequence of
part-of-speech tags for the sentence "The cat sat on the mat".
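As a concrete run of the viterbi sketch from earlier (assuming a uniform initial distribution and the tag order Noun, Verb, Article, Preposition; both occurrences of "the" share the same emission column):

import numpy as np

tags = ["Noun", "Verb", "Article", "Preposition"]
words = ["the", "cat", "sat", "on", "mat"]
pi = np.full(4, 0.25)                              # uniform initial distribution
A = np.array([[0.2, 0.4, 0.1, 0.3],                # transition matrix from the example above
              [0.1, 0.3, 0.2, 0.4],
              [0.6, 0.1, 0.2, 0.1],
              [0.4, 0.2, 0.3, 0.1]])
B = np.array([[0.2, 0.3, 0.1, 0.2, 0.2],           # emission matrix from the example above
              [0.1, 0.2, 0.3, 0.2, 0.2],
              [0.6, 0.1, 0.1, 0.1, 0.1],
              [0.1, 0.1, 0.2, 0.4, 0.2]])

obs = [words.index(w) for w in "the cat sat on the mat".split()]
path = viterbi(pi, A, B, obs)                      # viterbi() as sketched earlier
print([tags[i] for i in path])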
At time step t=1, we start with the initial probability distribution π, which can be set to be uniform (π_i = 0.25 for each of the four tags). The probability of being in each state at the first time step, given the first word "the", is:

δ_Noun(1) = 0.25 × 0.2 = 0.05
δ_Verb(1) = 0.25 × 0.1 = 0.025
δ_Article(1) = 0.25 × 0.6 = 0.15
δ_Preposition(1) = 0.25 × 0.1 = 0.025
At time step t=2, we compute the probability of being in each state at time step t=2,
given the observations up to time step t=2, and the most likely sequence of states