
NLP END SEM QUESTION BANK

UNIT NO: 3
Q. No Question Marks
1 What is Parsing? Explain Chunking. 6
ANS
2 Write a short note on the hybrid of rule-based and probabilistic parsing. 6
ANS
3 Perform parsing using simple top-down parsing for the sentence: 6
“The dogs cried”
using the grammar given below:
S → NP VP
NP → ART N
NP → ART ADJ N
VP → V
VP → V NP
ANS
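For reference, a minimal NLTK sketch of the top-down parse, using the grammar above plus assumed lexical rules ART → 'the', N → 'dogs', V → 'cried' (RecursiveDescentParser is NLTK's top-down, depth-first parser):

import nltk

# The question's grammar, extended with the assumed lexical rules for the three words.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> ART N
NP -> ART ADJ N
VP -> V
VP -> V NP
ART -> 'the'
N -> 'dogs'
V -> 'cried'
""")

parser = nltk.RecursiveDescentParser(grammar)   # top-down, depth-first search
for tree in parser.parse("the dogs cried".split()):
    print(tree)
# (S (NP (ART the) (N dogs)) (VP (V cried)))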
4 For the CFG given below: 6
S → NP VP
VP → V NP
NP → Det N
Show the working of the Shift-Reduce parser in processing the sentence:
“The woman saw a puppy”
Use the following lexical entries to build the parser:
the | a : Det
woman | puppy : N
saw : V
ANS
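A minimal NLTK sketch of the shift-reduce parse, assuming the grammar and lexical entries above (trace=2 makes NLTK print each shift and reduce step, which approximates the trace the question asks for):

import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
VP -> V NP
NP -> Det N
Det -> 'the' | 'a'
N -> 'woman' | 'puppy'
V -> 'saw'
""")

parser = nltk.ShiftReduceParser(grammar, trace=2)   # prints each shift/reduce step
for tree in parser.parse("the woman saw a puppy".split()):
    print(tree)
# (S (NP (Det the) (N woman)) (VP (V saw) (NP (Det a) (N puppy))))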
5 Explain Statistical parsing. Describe the usage of Probabilistic Context-Free Grammar (PCFG) in NLP. 6
ANS
6 What are the different parsing techniques? Explain Dependency parsing and Constituency parsing. 6
ANS
7 What is POS tagging? Explain any one algorithm used for POS tagging. 6
ANS  POS Tagging:
 Part-of-Speech (POS) tagging is the NLP task of assigning each word in a document a particular part of speech (adverb, adjective, verb, etc.) or grammatical category.
 By adding a layer of syntactic and semantic information to the words, this procedure makes it easier to understand the sentence’s structure and meaning.
 Example of POS Tagging:
 Input: “The quick brown fox jumps over the lazy dog.”
 POS Tagging:
o “The” is tagged as determiner (DT)
o “quick” is tagged as adjective (JJ)
o “brown” is tagged as adjective (JJ)
o “fox” is tagged as noun (NN)
o “jumps” is tagged as verb (VBZ)
o “over” is tagged as preposition (IN)
o “the” is tagged as determiner (DT)
o “lazy” is tagged as adjective (JJ)
o “dog” is tagged as noun (NN)
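For reference, the same sentence can be tagged with NLTK’s pretrained tagger. A minimal sketch, assuming NLTK is installed (the resource names below can vary between NLTK versions); it returns Penn Treebank tags such as DT, JJ, NN, VBZ, IN, matching the labels above:

import nltk

# One-time downloads of the tokenizer and pretrained tagger model
# (newer NLTK releases may ask for "punkt_tab" / "averaged_perceptron_tagger_eng" instead).
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")

tokens = nltk.word_tokenize("The quick brown fox jumps over the lazy dog.")
print(nltk.pos_tag(tokens))
# Prints (word, Penn Treebank tag) pairs, e.g. ('The', 'DT'), ('quick', 'JJ'), ('fox', 'NN'), ...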

 Types of POS Tagging:


1) Rule-Based Tagging:
 Rule-based part-of-speech (POS) tagging involves assigning words their respective parts
of speech using predetermined rules, contrasting with machine learning-based POS
tagging that requires training on annotated text corpora.
 In a rule-based system, POS tags are assigned based on specific word characteristics and
contextual cues.
 For instance, a rule-based POS tagger could designate the “noun” tag to words ending in
“‑tion” or “‑ment,” recognizing common noun-forming suffixes.
 This approach offers transparency and interpretability, as it doesn’t rely on training data.
 Rule: Assign the POS tag “noun” to words ending in “-tion” or “-ment.”
 Text: “The presentation highlighted the key achievements of the project’s
development.”
 Rule based Tags:
o “The” – Determiner (DET)
o “presentation” – Noun (N)
o “highlighted” – Verb (V)
o “the” – Determiner (DET)
o “key” – Adjective (ADJ)
o “achievements” – Noun (N)
o “of” – Preposition (PREP)
o “the” – Determiner (DET)
o “project’s” – Noun (N)
o “development” – Noun (N)
 In this instance, the rule-based POS tagger follows the predetermined rule to label the words.
 The “noun” tag is applied to words like “presentation,” “achievements,” and “development” because they end in the suffixes named in the rule.
 Rule-based taggers may handle a broad variety of linguistic patterns by incorporating
different rules, which makes the tagging process transparent and comprehensible.
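A minimal Python sketch of this idea, using a small hand-written lexicon plus suffix rules (the lexicon, suffix list, and tag names are assumptions for illustration; real rule-based taggers use far larger rule sets):

# Tiny illustrative rule-based tagger: a closed-class lexicon plus suffix rules.
LEXICON = {"the": "DET", "a": "DET", "of": "PREP"}
SUFFIX_RULES = [("tion", "N"), ("ments", "N"), ("ment", "N"), ("ed", "V"), ("'s", "N")]

def rule_based_tag(words):
    tags = []
    for word in words:
        lower = word.lower()
        if lower in LEXICON:                      # rule 1: closed-class lexicon lookup
            tags.append((word, LEXICON[lower]))
            continue
        for suffix, tag in SUFFIX_RULES:          # rule 2: suffix heuristics
            if lower.endswith(suffix):
                tags.append((word, tag))
                break
        else:                                     # no rule fired
            tags.append((word, "UNK"))
    return tags

sentence = "The presentation highlighted the key achievements of the project's development"
print(rule_based_tag(sentence.split()))
# "presentation", "achievements", "development" and "project's" come out as N,
# "highlighted" as V, and words no rule covers (e.g. "key") stay UNK.

Words that no rule covers fall through as UNK, which is exactly the coverage limitation that motivates statistical taggers.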
2) Transformation Based tagging:
 Transformation-based tagging (TBT) is a part-of-speech (POS) tagging method that uses
a set of rules to change the tags that are applied to words inside a text.
 In contrast, statistical POS tagging uses trained algorithms to predict tags
probabilistically, while rule-based POS tagging assigns tags directly based on predefined
rules.
NLP END SEM QUESTION BANK
 In TBT, a set of rules for changing word tags is created based on contextual information.
 A rule could, for example, change a verb’s tag to a noun if the word comes after a determiner like “the.”
 These rules are applied to the text systematically, and the tags are updated after each transformation.
 Compared to rule-based tagging, TBT can provide higher accuracy, especially when dealing with complex grammatical structures.
 However, attaining that accuracy may require a large rule set and additional computational power.
 Transformation rule: Change a word’s tag from verb to noun if it immediately follows a determiner like “the.”
 Text: “The duck swims”.
 Initial Tags:
o “The” – Determiner (DET)
o “duck” – Verb (V)
o “swims” – Verb (V)
 Transformation rule applied:
 Change the tag of “duck” from Verb (V) to Noun (N) because it immediately follows the determiner “The.”
 Updated tags:
o “The” – Determiner (DET)
o “duck” – Noun (N)
o “swims” – Verb (V)
 In this instance, the TBT system used a transformation rule based on a contextual pattern (the preceding determiner) to correct the tag of “duck” from a verb to a noun.
 The rules are applied sequentially and the tagging is updated iteratively.
 Although this example is simple, given a well-defined set of transformation rules, TBT systems can handle more complex grammatical patterns.
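A minimal Python sketch of applying one such contextual rule to an already-tagged sentence (the apply_rule helper and tag names are just for illustration; this shows only the rule-application step, not the Brill-style procedure that learns the rules from a corpus):

def apply_rule(tagged, from_tag, to_tag, trigger_word):
    """Retag (word, from_tag) pairs whose immediately preceding word is trigger_word."""
    updated = list(tagged)
    for i in range(1, len(updated)):
        word, tag = updated[i]
        if tag == from_tag and updated[i - 1][0].lower() == trigger_word:
            updated[i] = (word, to_tag)
    return updated

# Initial (partly wrong) tags for the toy sentence above.
initial = [("The", "DET"), ("duck", "V"), ("swims", "V")]

# Rule: change Verb to Noun when the previous word is the determiner "the".
print(apply_rule(initial, "V", "N", "the"))
# [('The', 'DET'), ('duck', 'N'), ('swims', 'V')]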
3) Hidden Markov Model (HMM) POS Tagging:
 An HMM may be defined as a doubly-embedded stochastic model, in which the underlying stochastic process is hidden.
 This hidden stochastic process can only be observed through another set of stochastic
processes that produces the sequence of observations.
 Example: A sequence of hidden coin-tossing experiments is performed, and we see only the observation sequence consisting of heads and tails. The actual details of the process, such as how many coins are used and the order in which they are selected, are hidden from us. By observing this sequence of heads and tails, we can build several HMMs to explain the sequence. The following is one form of Hidden Markov Model for this problem.
[Figure: a two-state HMM in which each state corresponds to a biased coin]
 We assumed that there are two states in the HMM, and each state corresponds to the selection of a different biased coin.
 The following matrix gives the state transition probabilities:
A = | a11 a12 |
    | a21 a22 |
 Here, aij is the probability of a transition from state i to state j, with a11 + a12 = 1 and a21 + a22 = 1.
 We can also create an HMM model assuming that there are three or more coins.
 This way, we can characterize an HMM by the following elements:
o N, the number of hidden states
o M, the number of distinct observation symbols
o A, the state transition probability distribution
o B, the observation (emission) probability distribution
o π, the initial state distribution
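A minimal Python sketch (using numpy) of the two-coin HMM as explicit parameter arrays; the numeric probabilities are assumed purely for illustration, and the sampler shows that only the heads/tails sequence is observable while the coin choices stay hidden:

import numpy as np

states = ["coin1", "coin2"]            # hidden states
observations = ["H", "T"]              # observable symbols

A = np.array([[0.7, 0.3],              # state transition probabilities a_ij (assumed values)
              [0.4, 0.6]])
B = np.array([[0.9, 0.1],              # emission probabilities: coin1 biased towards heads,
              [0.2, 0.8]])             # coin2 biased towards tails (assumed values)
pi = np.array([0.5, 0.5])              # initial state distribution

rng = np.random.default_rng(0)

def sample(n_tosses):
    """Generate an observation sequence; the underlying state sequence stays hidden."""
    seq, state = [], rng.choice(2, p=pi)
    for _ in range(n_tosses):
        seq.append(observations[rng.choice(2, p=B[state])])
        state = rng.choice(2, p=A[state])
    return "".join(seq)

print(sample(10))   # e.g. 'HHTHHHHHTH' – we only ever see heads and tails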
 Use of HMM for POS Tagging:


 POS tagging is the process of finding the sequence of tags which is most likely to have generated a given word sequence.
 We can model this tagging process using a Hidden Markov Model (HMM), where the tags are the hidden states that produce the observable output, i.e., the words.
 Mathematically, in POS tagging we are always interested in finding the tag sequence C which maximizes
P(C | W)
where W is the given word sequence.
 On the other hand, reliably estimating such sequences requires a large amount of statistical data.
 However, to simplify the problem, we can apply some mathematical transformations
along with some assumptions.
 The use of an HMM for POS tagging is a special case of Bayesian inference.
 Hence, we will start by restating the problem using Bayes’ rule, which says that the above-mentioned conditional probability is equal to
P(C | W) = P(W | C) · P(C) / P(W)
 We can eliminate the denominator in all these cases because we are interested in finding
the sequence C which maximizes the above value.
This will not affect our answer. Now, our problem reduces to finding the sequence C that maximizes
P(W | C) · P(C)
 Even after reducing the problem to the above expression, estimating it would still require a large amount of data.
 We can make reasonable independence assumptions about the two probabilities in the
above expression to overcome the problem.
 P(C) is approximated by a tag bigram model: P(C) ≈ ∏ P(c_i | c_{i-1})
 P(W | C) is approximated by assuming each word depends only on its own tag: P(W | C) ≈ ∏ P(w_i | c_i)
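Under these two assumptions, the tagger searches for the tag sequence C maximizing ∏ P(w_i | c_i) · P(c_i | c_{i-1}), which the Viterbi algorithm computes efficiently. A minimal Python sketch with tiny, assumed probability tables (in practice these are estimated from a tagged corpus):

import math

# Toy probability tables with assumed values (normally estimated from a tagged corpus).
TAGS = ["DET", "N", "V"]
TRANS = {("<s>", "DET"): 0.8, ("DET", "N"): 0.9, ("N", "V"): 0.6, ("V", "DET"): 0.5}
EMIT = {("DET", "the"): 0.9, ("N", "dog"): 0.4, ("V", "barks"): 0.5, ("V", "dog"): 0.01}
FLOOR = 1e-6   # crude smoothing for unseen transitions/emissions

def logp(table, key):
    return math.log(table.get(key, FLOOR))

def viterbi(words):
    """Return the tag sequence C maximizing prod_i P(w_i | c_i) * P(c_i | c_{i-1})."""
    # best[t] = (log-probability, tag path) of the best sequence ending in tag t
    best = {t: (logp(TRANS, ("<s>", t)) + logp(EMIT, (t, words[0])), [t]) for t in TAGS}
    for w in words[1:]:
        new_best = {}
        for t in TAGS:
            score, path = max(
                (best[p][0] + logp(TRANS, (p, t)) + logp(EMIT, (t, w)), best[p][1])
                for p in TAGS)
            new_best[t] = (score, path + [t])
        best = new_best
    return max(best.values())[1]

print(viterbi(["the", "dog", "barks"]))   # ['DET', 'N', 'V']

Working in log space avoids numerical underflow on long sentences, and the 1e-6 floor is a crude stand-in for proper smoothing of unseen events.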
 Advantages of POS Tagging:


1) Text Simplification: Breaking complex sentences down into their tagged constituent parts makes the text easier to understand and to simplify.
2) Information Retrieval: Part-of-speech (POS) tagging enhances information retrieval systems by allowing more precise indexing and search based on grammatical categories.
3) Named Entity Recognition: POS tagging helps identify entities such as names, locations, and organizations within text and is a prerequisite for named entity recognition.
4) Syntactic Parsing: It facilitates syntactic parsing, which helps with phrase structure analysis and the identification of relationships between words.

 Disadvantages of POS Tagging:


1) Ambiguity: The inherent ambiguity of language makes POS tagging difficult, since a word can belong to different categories depending on the context, which can lead to tagging errors.
2) Idiomatic Expressions: Slang, colloquialisms, and idiomatic phrases can be problematic
for POS tagging systems since they don’t always follow formal grammar standards.
3) Out-of-Vocabulary Words: Out-of-vocabulary words (words not included in the training
corpus) can be difficult to handle since the model might have trouble assigning the correct
POS tags.
4) Domain Dependence: POS tagging models trained on one domain may not generalize well to other domains and typically need domain-specific training data to perform well there.

8 With suitable examples, explain Part-of-Speech HMM tagging. 6


ANS  Part-of-Speech HMM tagging:
