NLP 2

The document contains a series of questions and answers related to natural language processing (NLP), covering topics such as challenges in NLP, ambiguities, applications, and morphological analysis. It discusses the importance of preprocessing text data, the role of finite state automata in morphological analysis, and various language models including N-grams. Additionally, it differentiates between open-class and closed-class words, explains stemming algorithms, and highlights the significance of perplexity in evaluating language models.


Here are the questions from your document:

### Module 1
1. Discuss various challenges in processing natural language. (5M)
2. Differentiate between syntactic and lexical ambiguity. (5M)
3. Discuss in detail any application considering an Indian regional language of your choice.
4. Why is it necessary to preprocess text data in natural language? Explain in detail the steps of preprocessing with examples.
5. Explain different stages involved in the NLP process with examples. (5M & 10M)
6. Explain the ambiguities associated at each level for NLP.
7. Discuss the challenges in various stages of NLP.
8. Explain the applications of NLP.

Provide the answers in 7 to 8 detailed points, and for differentiate/compare questions provide a 7-to-8-point table.

ChatGPT said:

1. Discuss various challenges in processing natural language. (5M)

1. Ambiguity: Natural language is inherently ambiguous, with lexical, syntactic, and semantic ambiguities often complicating interpretation.
2. Complex Grammar Rules: Diverse languages have complex grammatical structures,
making it challenging to design universal NLP models.
3. Lack of Context: Capturing and understanding context in text is difficult, especially in
long conversations or documents.
4. Idiomatic Expressions: Phrases like "spill the beans" or "kick the bucket" don't translate
literally, complicating NLP tasks.
5. Data Scarcity: For many regional languages, limited annotated data hampers effective
model training.
6. Multilingual Challenges: Handling code-mixing or translating between languages with
different scripts or grammar rules is complex.
7. Domain-Specific Knowledge: NLP models often struggle with understanding
specialized terminology in areas like medicine or law.
8. Evolving Language: The constant evolution of language, including slang and informal
expressions, requires ongoing adaptation of models.

2. Differentiate between syntactic and lexical ambiguity. (5M)


| Aspect | Syntactic Ambiguity | Lexical Ambiguity |
|---|---|---|
| Definition | Ambiguity arising from sentence structure or grammar. | Ambiguity arising from words having multiple meanings. |
| Example | "The chicken is ready to eat" – who is eating? | "Bank" – can refer to a financial institution or a riverbank. |
| Cause | Ambiguous sentence construction. | Words with multiple senses. |
| Focus | Related to syntax and parsing. | Related to vocabulary and semantics. |
| Resolution | Requires syntactic analysis. | Requires contextual understanding. |
| Level | Operates at the sentence or phrase level. | Operates at the word level. |
| Detection Complexity | Often more complex to detect. | Easier to detect but harder to disambiguate. |
| Example Context | "Visiting relatives can be fun" (who is visiting whom?). | "Bat" (could mean an animal or sports equipment). |

3. Discuss in detail any application considering an Indian regional language of your choice.

Example: Machine Translation for Kannada Language

1. Description: Machine Translation (MT) facilitates automated translation between Kannada and other languages like English or Hindi.
2. Challenges: Kannada's rich grammar, agglutination, and lack of annotated corpora
present challenges.
3. Preprocessing: Tokenization, stemming, and transliteration handle Kannada script.
4. Approach: Neural Machine Translation (NMT) using transformer models provides
efficient sentence-level translations.
5. Applications: Enables government services, education, and e-commerce content
translation for Kannada speakers.
6. Benefits: Promotes inclusivity by reducing the digital divide.
7. Current Tools: Google Translate and Bhashini are examples of Kannada MT systems.
8. Future Prospects: Enhanced corpora and improved contextual models can boost
Kannada translation accuracy.

4. Why is it necessary to preprocess text data in natural language? Explain in detail the steps of preprocessing with examples.

Importance:

1. Ensures data consistency.


2. Reduces noise for better analysis.
3. Prepares text for NLP tasks like tokenization or vectorization.
Steps:

1. Tokenization: Break text into words or sentences.


○ Example: "Hello world!" → ["Hello", "world!"].
2. Lowercasing: Convert text to lowercase for uniformity.
○ Example: "HELLO" → "hello".
3. Stopword Removal: Eliminate common words like "is", "the".
○ Example: "This is a book" → ["book"].
4. Stemming: Reduce words to root forms using algorithms like Porter Stemmer.
○ Example: "running" → "run".
5. Lemmatization: Similar to stemming but considers context.
○ Example: "better" → "good".
6. Punctuation Removal: Strip punctuation for clean input.
○ Example: "Hello, world!" → ["Hello", "world"].
7. Normalization: Handle contractions and spelling corrections.
○ Example: "isn't" → "is not".

5. Explain different stages involved in the NLP process with examples. (5M
& 10M)

1. Tokenization: Splitting text into smaller units.


○ Example: "I love NLP." → ["I", "love", "NLP"].
2. POS Tagging: Assign parts of speech to each token.
○ Example: "I (PRON) love (VERB) NLP (NOUN)".
3. Named Entity Recognition (NER): Identifying entities like names or dates.
○ Example: "Google was founded in 1998." → ["Google" (ORG), "1998" (DATE)].
4. Parsing: Understanding sentence structure.
○ Example: Subject-Verb-Object analysis.
5. Semantics: Understanding meaning and context.
○ Example: Resolving word sense ambiguities.
6. Coreference Resolution: Linking pronouns to entities.
○ Example: "John loves his dog. He takes care of it." → "He" refers to "John".
7. Sentiment Analysis: Determining emotion or opinion in text.
○ Example: "The movie was amazing!" → Positive.
8. Text Generation: Producing coherent text.
○ Example: ChatGPT responses.

6. Explain the ambiguities associated at each level for NLP.

1. Lexical Ambiguity: Words with multiple meanings.


○ Example: "Bat" (animal vs. sports).
2. Syntactic Ambiguity: Ambiguous sentence structures.
○ Example: "Flying planes can be dangerous."
3. Semantic Ambiguity: Multiple interpretations of meaning.
○ Example: "I saw the man with the telescope."
4. Pragmatic Ambiguity: Context-dependent ambiguities.
○ Example: "Can you pass the salt?" (Request vs. ability).
5. Discourse Ambiguity: Inconsistent reference across sentences.
○ Example: "John met Tom. He was happy." (Who is "he"?)

7. Discuss the challenges in various stages of NLP.

1. Data Preparation: Handling noisy, incomplete, or unstructured data.


2. Tokenization: Dealing with compound words, contractions, and hyphenation.
3. POS Tagging: Ambiguities in parts of speech.
4. NER: Identifying rare or domain-specific entities.
5. Parsing: Complex sentence structures cause parsing errors.
6. Semantics: Capturing deep contextual relationships.
7. Pragmatics: Understanding intent and sarcasm.
8. Language Models: Handling resource-scarce languages.

8. Explain the applications of NLP.

1. Machine Translation: Automatic translation between languages.


○ Example: Google Translate.
2. Chatbots: Conversational agents for customer support.
○ Example: ChatGPT.
3. Sentiment Analysis: Understanding opinions in text.
○ Example: Analyzing product reviews.
4. Speech Recognition: Converting speech to text.
○ Example: Siri, Google Assistant.
5. Text Summarization: Condensing long texts.
○ Example: Summarizing news articles.
6. Search Engines: Understanding queries for relevant results.
○ Example: Google Search.
7. Spell and Grammar Checking: Identifying language errors.
○ Example: Grammarly.
8. Document Classification: Categorizing documents.
○ Example: Email spam filtering.

You said:
continue

ChatGPT said:

Continuation of Applications of NLP

9. Information Extraction: Automatically retrieving structured data from unstructured text.


○ Example: Extracting names, dates, and locations from articles.
10. Recommendation Systems: Providing personalized suggestions based on textual data.
○ Example: Movie or product recommendations based on reviews or preferences.
11. Language Modeling: Building models that predict text sequences.
○ Example: Autocomplete features in typing applications.
12. Optical Character Recognition (OCR): Extracting text from images or scanned
documents.
○ Example: Digitizing handwritten or printed text.
13. Emotion Detection: Understanding and categorizing emotions expressed in text.
○ Example: Analyzing social media posts for public sentiment.
14. Code Analysis: Analyzing programming languages for errors or improvement.
○ Example: GitHub Copilot assisting in code writing.
15. Personalized Education: Automating assessments and feedback generation.
○ Example: NLP-based educational tools like Duolingo for language learning.
16. Social Media Monitoring: Tracking trends and opinions on platforms.
○ Example: Brand reputation analysis.

Each application highlights the transformative impact of NLP in diverse sectors, addressing both
individual and organizational needs.


You said:
### Module 2
1. What is the role of FSA in morphological analysis? (5M & 10M)
2. What do you mean by stemming? Explain Porter’s stemming algorithm in detail with rules.
3. Explain derivational and inflectional morphology in detail with suitable examples.
4. What is a language model? Explain the N-gram model.
5. Describe open-class and closed-class words in English with examples. (5M)
6. Explain the perplexity of any language model. (5M)
7. Explain how the N-gram model is used in spelling correction. (5M)
8. Represent output of morphological analysis for regular verbs, irregular verbs, singular nouns, plural nouns, and explain the role of FST in morphological parsing with examples.
9. Explain Good-Turing Discounting. (5M)
10. Explain FSA for nouns and verbs. Also design a Finite State Automata (FSA) for the words of English numbers 1-99.
11. Illustrate the concept of tokenization and stemming in NLP. (5M)
12. Illustrate the working of the Porter Stemmer algorithm.
13. Consider a suitable training dataset and show the Bigram probability calculation for the same. (5M)
14. Numerical (Bigram Probability).

Provide answers as above.
ChatGPT said:

1. What is the role of FSA in morphological analysis? (5M & 10M)

Role of FSA:

1. Lexical Analysis: FSA identifies valid word forms by processing sequences of characters.
2. Finite Representation: Encodes morphological rules of a language into finite states and
transitions.
3. Segmentation: Splits words into root forms and affixes, crucial for tasks like stemming
or lemmatization.
4. Word Recognition: Validates whether a given word adheres to the language's
morphological rules.
5. Efficiency: Processes input with linear time complexity.
6. Parsing: Extracts features like tense, plurality, and case through transitions between
states.
7. Compatibility: Easily integrates into larger NLP pipelines for morphological analysis.

Example (10M):

● Consider the word "running":


○ Start State → Read "run" → Transition to verb state → Read "ing" → Transition
to present participle state → Accept State.

2. What do you mean by stemming? Explain Porter’s stemming algorithm in detail with rules.

Stemming:

Stemming reduces a word to its root form by removing prefixes or suffixes. Example: "running"
→ "run".

Porter’s Algorithm:

1. Rule-Based Phases: Sequential application of five phases with ordered rules.


2. Phase 1: Removes plural forms and past-tense suffixes.
○ Example: "caresses" → "caress", "running" → "run".
3. Phase 2: Replaces suffixes like “ational” → “ate”.
○ Example: "relational" → "relate".
4. Phase 3: Removes “ness” and “ful”.
○ Example: "hopefulness" → "hopeful".
5. Phase 4: Removes additional suffixes if the word exceeds a certain length.
○ Example: "entire" → "entir".
6. Phase 5: Final cleanup by removing "e" or simplifying double consonants.
○ Example: "derivative" → "deriv".

3. Explain derivational and inflectional morphology in detail with suitable examples.

| Aspect | Derivational Morphology | Inflectional Morphology |
|---|---|---|
| Definition | Forms new words by adding affixes to roots. | Changes grammatical aspects of a word. |
| Purpose | Alters meaning or part of speech. | Maintains original word category. |
| Examples | "Beauty" → "Beautiful" (Noun → Adjective). | "Cat" → "Cats" (Singular → Plural). |
| Types of Changes | Significant meaning changes. | Grammatical variations. |
| Affixes | Prefixes and suffixes (e.g., -ness, un-). | Suffixes only (e.g., -s, -ed). |
| Count | Many derivational affixes exist. | Only a few inflectional affixes exist. |
| Semantic Impact | Creates new lexemes. | Keeps core meaning intact. |
| Examples in Context | "Happy" → "Happiness" (Noun formation). | "Walk" → "Walked" (Past tense). |

4. What is a language model? Explain the N-gram model.

Language Model:

A statistical model predicting the next word based on a sequence of previous words. Used in
tasks like autocomplete and machine translation.

N-gram Model:

1. Definition: An approximation where a word’s probability depends on the previous n-1 words.
2. Unigram Model: Each word is independent.
○ Example: $P(w_1, w_2) = P(w_1) \times P(w_2)$.
3. Bigram Model: Depends on one preceding word.
○ Example: $P(w_n \mid w_{n-1}) = \frac{\text{Count}(w_{n-1}\, w_n)}{\text{Count}(w_{n-1})}$.
4. Trigram Model: Depends on two preceding words.
5. Benefits: Captures context; improves predictions.
6. Limitations: Suffers from sparsity in large vocabularies.

5. Describe open-class and closed-class words in English with examples. (5M)

| Aspect | Open-Class Words | Closed-Class Words |
|---|---|---|
| Definition | Words that accept new entries. | Words with fixed entries. |
| Categories | Nouns, Verbs, Adjectives, Adverbs. | Pronouns, Prepositions, Articles, Conjunctions. |
| Examples | "Computer", "Run", "Quickly". | "He", "Of", "And". |
| Additions | New words can be added (e.g., “selfie”). | Rarely accept new words. |
| Meaning | Carry core semantic content. | Provide grammatical structure. |
| Frequency | Less frequent but diverse. | More frequent but limited. |
| Adaptability | Evolve with language trends. | Stable over time. |
| Importance | Crucial for content in sentences. | Necessary for sentence structure. |

6. Explain the perplexity of any language model. (5M)

1. Definition: A measure of how well a language model predicts a test dataset.

2. Formula: $PP(W) = P(w_1 w_2 \ldots w_N)^{-1/N}$, where $N$ is the number of words in the test set.
3. Interpretation: Lower perplexity indicates better performance.
4. Example: A bigram model predicting sentences with probabilities of 0.2 and 0.1 has
higher perplexity than one predicting probabilities of 0.5 and 0.6.
5. Significance: Helps compare different language models.
6. Challenges: Affected by vocabulary size and data sparsity.
7. Practical Use: Optimizing speech recognition and text generation models.
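
A small illustrative calculation of this formula; the per-word probabilities below are invented for a 4-word test sentence:

```python
import math

# Hypothetical probabilities a language model assigns to each word
# of a 4-word test sentence.
word_probs = [0.2, 0.1, 0.05, 0.3]

sentence_prob = math.prod(word_probs)      # P(w_1 ... w_N)
N = len(word_probs)
perplexity = sentence_prob ** (-1 / N)     # PP(W) = P(w_1 ... w_N)^(-1/N)
print(round(perplexity, 2))                # ~7.6; lower is better
```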

Remaining questions will be answered in subsequent responses. Let me know if you want me to
continue!


You said:
continue

ChatGPT said:

7. Explain how the N-gram model is used in spelling correction. (5M)

Steps for Spelling Correction Using N-grams:

1. Candidate Generation: Generate possible corrections for a misspelled word.


○ Example: For "recieve", possible candidates are "receive", "deceive", etc.
2. Contextual Probability: Use the N-gram model to find the most probable correction
based on the surrounding context.
○ Example: In the phrase "I will ___ it", the bigram probabilities for candidates like
"receive" might be higher than "deceive".
3. Calculation: Compute the probability of the sentence with each corrected word (e.g., using the bigram model).
4. Best Candidate: Choose the word that maximizes the overall probability.

Example:

For "I recieve your email":

● Candidates for "recieve": {"receive", "retrieve"}.


● Contextual Probability:
○ P("I receive your email"): High.
○ P("I retrieve your email"): Low.
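
A minimal sketch of the candidate-scoring step as code; the bigram/unigram counts and the candidate list are invented for illustration:

```python
from collections import Counter

# Hypothetical counts harvested from a training corpus.
bigram_counts = Counter({("i", "receive"): 40, ("receive", "your"): 35,
                         ("i", "retrieve"): 5,  ("retrieve", "your"): 2})
unigram_counts = Counter({"i": 100, "receive": 50, "retrieve": 10})

def bigram_prob(w_prev, w):
    # Unsmoothed relative-frequency estimate of P(w | w_prev).
    return bigram_counts[(w_prev, w)] / unigram_counts[w_prev]

def score(candidate):
    # Probability of the candidate in the fixed context "I ___ your".
    return bigram_prob("i", candidate) * bigram_prob(candidate, "your")

candidates = ["receive", "retrieve"]     # corrections generated for "recieve"
print(max(candidates, key=score))        # -> receive
```
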
8. Represent output of morphological analysis for regular verbs, irregular
verbs, singular nouns, plural nouns, and explain the role of FST in
morphological parsing with examples.

Examples of Morphological Analysis:

1. Regular Verbs:
○ Input: "walked".
○ Output: Root = "walk", Tense = "past".
2. Irregular Verbs:
○ Input: "went".
○ Output: Root = "go", Tense = "past".
3. Singular Nouns:
○ Input: "dog".
○ Output: Base Form = "dog", Number = "singular".
4. Plural Nouns:
○ Input: "dogs".
○ Output: Base Form = "dog", Number = "plural".

Role of Finite State Transducers (FST):

1. Parsing: Maps surface forms (e.g., "walked") to lexical forms (e.g., "walk + past").
2. Bidirectionality: Can generate surface forms from lexical forms.
3. Efficiency: Linear time complexity.
4. Example:
○ Transition for regular verbs: "walk" → "walk" + "ed".

9. Explain Good-Turing Discounting. (5M)
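
A brief reference sketch of the standard Good-Turing estimate; the counts in the comments are invented for illustration:

```latex
% Good-Turing discounting re-estimates the count c of an n-gram using N_c,
% the number of distinct n-grams that occur exactly c times in the training data:
\[
  c^{*} = (c + 1)\,\frac{N_{c+1}}{N_{c}},
  \qquad
  P_{GT}(\text{unseen}) = \frac{N_{1}}{N}.
\]
% Example: if N_1 = 100 bigrams occur once and N_2 = 30 occur twice,
% a singleton bigram's adjusted count is c* = 2 * 30 / 100 = 0.6,
% and the freed probability mass is reassigned to unseen bigrams.
```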


10. Explain FSA for nouns and verbs. Also design a Finite State Automata
(FSA) for the words of English numbers 1–99.

FSA for Nouns and Verbs:

1. Nouns:
○ Singular → Add "s" for plural.
■ Example: "cat"→"cats"\text{"cat"} \rightarrow \text{"cats"}"cat"→"cats".
2. Verbs:
○ Base → Add "ed" for past tense, "ing" for progressive.
■ Example: "walk"→"walked"\text{"walk"} \rightarrow
\text{"walked"}"walk"→"walked".

FSA for Numbers 1–99:

1. States:
○ Start → Tens → Units → End.
2. Transitions:
○ Start → "Twenty" → Units (One to Nine).
○ Example: Input = "Thirty-two".
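
A compact Python sketch of such an automaton over word-per-token input (e.g., ["thirty", "two"]); the state names are illustrative:

```python
UNITS = {"one", "two", "three", "four", "five", "six", "seven", "eight", "nine"}
TEENS = {"ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"}
TENS  = {"twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"}

def accepts(tokens):
    """Return True if the token sequence spells a valid English number 1-99."""
    state = "START"
    for tok in tokens:
        if state == "START" and tok in UNITS | TEENS:
            state = "FINAL"                  # e.g. "seven", "fifteen"
        elif state == "START" and tok in TENS:
            state = "TENS"                   # e.g. "thirty" (may take a unit)
        elif state == "TENS" and tok in UNITS:
            state = "FINAL"                  # e.g. "thirty" + "two"
        else:
            return False                     # no valid transition
    return state in {"FINAL", "TENS"}        # "thirty" alone is also accepted

print(accepts(["thirty", "two"]), accepts(["twelve"]), accepts(["ninety", "ten"]))
# -> True True False
```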

11. Illustrate the concept of tokenization and stemming in NLP. (5M)

Tokenization:

● Definition: Breaking text into smaller units.


● Example:
○ Input: "Cats are running".
○ Output Tokens: ["Cats", "are", "running"].

Stemming:

● Definition: Reducing words to their root form.


● Example:
○ Input Tokens: ["Cats", "running"].
○ Stems: ["Cat", "run"].

12. Illustrate the working of the Porter Stemmer algorithm.

Example Word: "running"

1. Phase 1a: Remove plural forms.


○ Input: "running".
○ Output: No change.
2. Phase 1b: Remove “ing”.
○ Input: "running".
○ Output: "run".
3. Phase 2: Apply rules like “ational” → “ate”.
○ Input: "relational".
○ Output: "relate".
4. Final Output: Stem = "run".
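
The behaviour can be checked with NLTK's implementation (assuming nltk is installed); note that the full algorithm applies all phases, so "relational" ends up as "relat" after the final e-removal:

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["running", "caresses", "ponies", "relational"]:
    # Each word passes through the ordered Porter phases internally.
    print(word, "->", stemmer.stem(word))
# running -> run, caresses -> caress, ponies -> poni, relational -> relat
```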

13. Consider a suitable training dataset and show the Bigram probability
calculation for the same. (5M)

Training Data:

Sentence: "I love NLP. I love Python."


Bigram Probabilities:

1. Count Pairs:
○ C("I love")=2C(\text{"I love"}) = 2C("I love")=2, C("love NLP")=1C(\text{"love
NLP"}) = 1C("love NLP")=1.
2. Probability:
P("love"∣"I")=C("I love")C("I")=22=1P(\text{"love"}|\text{"I"}) = \frac{C(\text{"I
love"})}{C(\text{"I"})} = \frac{2}{2} = 1P("love"∣"I")=C("I")C("I love")​=22​=1

14. Numerical (Bigram Probability):

Sentence: "I eat pizza."

● Bigram: P("eat"∣"I")P(\text{"eat"}|\text{"I"})P("eat"∣"I"),
P("pizza"∣"eat")P(\text{"pizza"}|\text{"eat"})P("pizza"∣"eat").
● Counts: C("I eat")=3C(\text{"I eat"}) = 3C("I eat")=3, C("eat pizza")=2C(\text{"eat pizza"})
= 2C("eat pizza")=2.
● Probabilities: P("eat"∣"I")=35,P("pizza"∣"eat")=24P(\text{"eat"}|\text{"I"}) = \frac{3}{5},
\quad P(\text{"pizza"}|\text{"eat"}) = \frac{2}{4}P("eat"∣"I")=53​,P("pizza"∣"eat")=42​

Let me know if you need further explanation or diagrams!


You said:
### Module 3
1. How is HMM used for POS tagging? Explain in detail.
2. Explain the use of CFG in NLP with a suitable example. (5M)
3. What is POS tagging? Discuss various challenges faced by POS tagging. (5M & 10M)
4. What are the limitations of the Hidden Markov Model? (5M)
5. Compare the top-down and bottom-up approaches of parsing with examples.
6. Numerical (CYK algorithm).
7. Explain the Maximum Entropy Model for POS tagging.
8. What are rule-based and stochastic part-of-speech taggers? (5M)
9. Explain the use of Probabilistic Context-Free Grammar (PCFG) in NLP with an example.
10. Explain how Conditional Random Field (CRF) is used for sequence labeling.
11. Demonstrate the concept of CRF in NLP.

Provide answers as above in 7 to 8 points.

ChatGPT said:

1. How is HMM used for POS tagging? Explain in detail.

Steps in HMM-based POS Tagging:


1. POS Tagging as a Sequence Labeling Problem: Assigns tags (e.g., noun, verb) to
words in a sentence based on their likelihood and context.
2. Hidden States and Observations:
○ Hidden States: POS tags (e.g., Noun, Verb).
○ Observations: Words in the sentence.
3. HMM Components:
○ Transition Probabilities ($P(T_i \mid T_{i-1})$): Likelihood of transitioning between POS tags.
○ Emission Probabilities ($P(W_i \mid T_i)$): Likelihood of a word given a specific POS tag.
4. Input Sentence: Example: “I love NLP”.
5. Decoding (Viterbi Algorithm): Finds the most probable sequence of POS tags.
○ Step 1: Compute probabilities for all tag sequences.
○ Step 2: Select the sequence with the highest probability.
6. Training HMM: Uses labeled corpora like Penn Treebank to estimate probabilities.
7. Example:
○ Sentence: "I love NLP".
○ Predicted Tags: "I/PRON love/VERB NLP/NOUN".
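
A compact Viterbi sketch over hand-set tables; all probabilities below are invented for the "I love NLP" example and are not estimated from a corpus:

```python
tags = ["PRON", "VERB", "NOUN"]
start = {"PRON": 0.6, "VERB": 0.1, "NOUN": 0.3}
trans = {"PRON": {"PRON": 0.1, "VERB": 0.7, "NOUN": 0.2},
         "VERB": {"PRON": 0.2, "VERB": 0.1, "NOUN": 0.7},
         "NOUN": {"PRON": 0.3, "VERB": 0.4, "NOUN": 0.3}}
emit = {"PRON": {"i": 0.8, "love": 0.0, "nlp": 0.0},
        "VERB": {"i": 0.0, "love": 0.9, "nlp": 0.1},
        "NOUN": {"i": 0.1, "love": 0.1, "nlp": 0.8}}

def viterbi(words):
    # v[t] = best probability of a tag sequence ending in tag t; back stores pointers.
    v = {t: start[t] * emit[t][words[0]] for t in tags}
    back = []
    for w in words[1:]:
        prev_v, v, ptr = v, {}, {}
        for t in tags:
            best_prev = max(tags, key=lambda p: prev_v[p] * trans[p][t])
            v[t] = prev_v[best_prev] * trans[best_prev][t] * emit[t][w]
            ptr[t] = best_prev
        back.append(ptr)
    seq = [max(tags, key=lambda t: v[t])]        # best final tag
    for ptr in reversed(back):                   # follow back-pointers
        seq.append(ptr[seq[-1]])
    return list(reversed(seq))

print(viterbi(["i", "love", "nlp"]))   # -> ['PRON', 'VERB', 'NOUN']
```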

2. Explain the use of CFG in NLP with a suitable example. (5M)

Context-Free Grammar (CFG) in NLP:

1. Definition: A set of rules that define valid syntactic structures in a language.


2. Structure: Composed of non-terminal symbols, terminal symbols, a start symbol, and
production rules.
3. Example Rules:
○ $S \rightarrow NP\ VP$
○ $NP \rightarrow Det\ N$
○ $VP \rightarrow V\ NP$
4. Parse Tree:
○ Sentence: “The cat sleeps.”
○ Parse Tree:
■ $S$
■ $NP$: "The cat".
■ $VP$: "sleeps".
5. Applications: Parsing, grammar checking, and sentence generation.
6. Example Use: Parsing sentences to determine if they are syntactically correct.
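
A short illustration with NLTK's chart parser; the grammar mirrors the rules above, except that the VP rule is simplified to VP → V so the intransitive example sentence parses:

```python
import nltk

grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V
Det -> 'The'
N -> 'cat'
V -> 'sleeps'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse(["The", "cat", "sleeps"]):
    print(tree)   # (S (NP (Det The) (N cat)) (VP (V sleeps)))
```
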
3. What is POS tagging? Discuss various challenges faced by POS tagging.
(5M & 10M)

POS Tagging Definition:

Assigning a part-of-speech tag to each word in a sentence (e.g., noun, verb, adjective).

Challenges (5M):

1. Ambiguity: Words can have multiple POS tags (e.g., "book" as a noun or verb).
2. Unknown Words: Handling out-of-vocabulary words.
3. Context Dependence: Tagging depends on surrounding words.
4. Idiomatic Expressions: Words may not follow standard rules (e.g., "kick the bucket").

Additional Challenges (10M):

5. Data Sparsity: Lack of sufficient annotated data for low-resource languages.


6. Domain-Specific Language: Variations in tagging for specialized fields like medicine or
law.
7. Morphologically Rich Languages: Complex inflections in languages like Hindi or
Finnish.
8. Proper Noun Recognition: Differentiating proper nouns from common nouns.

4. What are the limitations of the Hidden Markov Model? (5M)

1. Independence Assumptions: Assumes observations depend only on the current state, ignoring long-range dependencies.
2. Data Sparsity: Struggles with unseen data or rare word-tag pairs.
3. Fixed Order: Considers only adjacent states, limiting contextual understanding.
4. Non-Discriminative: Cannot use overlapping or arbitrary features.
5. Hard Decoding: Produces a single best path, ignoring alternative tag sequences.
6. Training Complexity: Requires large annotated datasets for accurate transition and
emission probabilities.
7. Inflexibility: Struggles with morphologically rich or highly inflected languages.

5. Compare the top-down and bottom-up approaches of parsing with examples.

| Aspect | Top-Down Parsing | Bottom-Up Parsing |
|---|---|---|
| Definition | Starts from the root and tries to derive the sentence. | Starts from the input and builds up to the root. |
| Direction | Root-to-leaf approach. | Leaf-to-root approach. |
| Efficiency | May explore unnecessary subtrees. | Eliminates invalid paths early. |
| Ambiguity | May lead to redundant computations. | Resolves ambiguities during construction. |
| Example Algorithm | Recursive Descent Parsing. | CYK Algorithm. |
| Example Sentence | Parsing "The cat sleeps" using grammar rules. | Constructing the parse tree from terminal symbols. |
| Use Case | Useful for predictive parsers. | Suitable for unambiguous grammars. |
| Challenges | Struggles with left-recursive grammars. | Requires a preprocessed grammar. |

6. Numerical (CYK Algorithm)

Given Grammar:

1. $S \rightarrow NP\ VP$
2. $NP \rightarrow Det\ N$
3. $VP \rightarrow V\ NP$
4. $Det \rightarrow \text{"The"}$
5. $N \rightarrow \text{"cat"} \mid \text{"dog"}$
6. $V \rightarrow \text{"chases"}$

Input Sentence: "The cat chases the dog" (an object NP is needed so that the $VP \rightarrow V\ NP$ rule applies).

1. CYK Table Construction:


○ Step 1: Fill the bottom row of the table with the non-terminals that derive each word.
○ Step 2: Build up higher cells by combining adjacent spans using the grammar rules.
2. Output: The final cell contains $S$, proving the sentence is valid.
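
A minimal CYK sketch for the grammar above (already in Chomsky Normal Form); table indexing and names are illustrative:

```python
from itertools import product

binary = {("NP", "VP"): "S", ("Det", "N"): "NP", ("V", "NP"): "VP"}
lexical = {"the": "Det", "cat": "N", "dog": "N", "chases": "V"}

def cyk(words):
    n = len(words)
    # table[i][j] = set of non-terminals that derive words[i..j]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i].add(lexical[w.lower()])
    for span in range(2, n + 1):                     # span length
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):                    # split point
                for a, b in product(table[i][k], table[k + 1][j]):
                    if (a, b) in binary:
                        table[i][j].add(binary[(a, b)])
    return table[0][n - 1]

print(cyk("The cat chases the dog".split()))         # -> {'S'}
```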

7. Explain the Maximum Entropy Model for POS tagging.

Key Points:
1. Definition: A probabilistic model that predicts POS tags based on contextual features.
2. Feature Representation: Uses overlapping, non-independent features.
○ Example: Word suffix, previous tag, capitalization.
3. Mathematical Model: $P(T \mid W) = \frac{1}{Z} \exp\left(\sum_i \lambda_i f_i(T, W)\right)$, where $Z$ is the normalization constant.
4. Training: Optimizes the feature weights ($\lambda_i$).
5. Advantages: Handles diverse and arbitrary features.
6. Example:
○ Sentence: “He runs fast.”
○ Predicted Tags: “He/PRON runs/VERB fast/ADV”.

8. What are rule-based and stochastic part-of-speech taggers? (5M)


| Aspect | Rule-Based Tagger | Stochastic Tagger |
|---|---|---|
| Definition | Uses linguistic rules for tagging. | Uses probabilities to predict tags. |
| Mechanism | Handcrafted rules (e.g., suffix analysis). | Statistical models like HMM or CRF. |
| Flexibility | Rigid; requires constant updating. | Learns from data and adapts. |
| Example | Assigning "Noun" to words ending in "tion". | Bigram POS tagging. |
| Accuracy | High for specific domains. | Generalizable across domains. |
| Dependency | Requires domain expertise. | Relies on large annotated datasets. |
| Example Tagger | Brill’s Tagger. | HMM Tagger. |
| Limitations | Struggles with ambiguity. | Requires computational resources. |

9. Explain the use of Probabilistic Context-Free Grammar (PCFG) in NLP with an example.

1. Definition: CFG with probabilities assigned to production rules.


2. Example:
○ Rule: $S \rightarrow NP\ VP$.
○ Probability: $P(S \rightarrow NP\ VP) = 0.8$.
3. Usage: Determines the likelihood of parse trees.
○ Example Sentence: "The cat sleeps".
○ Likelihood = Product of probabilities for applied rules.
4. Applications:
○ Sentence disambiguation.
○ Speech recognition.

10. Explain how Conditional Random Field (CRF) is used for sequence
labeling.

1. Definition: A discriminative model for predicting sequences like POS tags.


2. Feature Inclusion: Incorporates both current and contextual features.
3. Graphical Model: Represents dependencies as undirected graphs.
4. Training: Maximizes conditional likelihood of labels given features.
5. Advantages: Avoids independence assumptions.
6. Example: Predicting POS tags for “Cats are cute”.

11. Demonstrate the concept of CRF in NLP.

Example:

● Input Sentence: “He eats quickly”.


● Features: Word shape, suffix, previous tags.
● Output Tags: He/PRON, eats/VERB, quickly/ADV.

Let me know if any topic needs further elaboration!


You said:
### Module 4
1. What is WordNet? How is "sense" defined in WordNet? Explain with an example. (5M)
2. What is Word Sense Disambiguation? Illustrate with an example how the dictionary-based approach identifies the correct sense of an ambiguous word.
3. What are lexicon and lexeme? Explain different types of relations that hold between lexemes with examples.
4. Explain with suitable examples the following relationships between word meanings: Homonymy, Polysemy, Synonymy, Antonymy.
5. Explain Yarowsky's bootstrapping approach of semi-supervised learning.
6. What is WSD? Discuss the Dictionary-based approach (Lesk Algorithm) for WSD with an example.
7. Explain with suitable examples the following relationships between word meanings: Hyponymy, Hypernymy, Meronymy, Holonymy. (5M)
8. Describe semantic analysis in NLP. (5M)
9. Demonstrate lexical semantic analysis using an example.

Provide answers as above in 7 to 8 points.

ChatGPT said:

1. What is WordNet? How is "sense" defined in WordNet? Explain with an example. (5M)

WordNet:

1. Definition: WordNet is a lexical database of the English language, organized into synonym sets (synsets), where each synset represents a unique sense or meaning of a word.
2. Synsets: Each synset contains synonyms that share a common meaning.
3. Senses in WordNet: A "sense" refers to a specific meaning of a word within a context. A
single word may have multiple senses based on its usage.
4. Example: The word “bank” has multiple senses in WordNet:
○ Sense 1: A financial institution (synset: [bank, depository financial institution]).
○ Sense 2: The side of a river (synset: [bank, riverbank]).
5. Relations Between Senses: WordNet also defines relationships such as hyponymy
(specific meaning) and hypernymy (general meaning) between senses.
6. Applications: WordNet is used in tasks such as Word Sense Disambiguation (WSD),
machine translation, and information retrieval.
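
The senses can be inspected directly with NLTK's WordNet interface (assuming the wordnet corpus is downloaded); the printed glosses below are indicative:

```python
from nltk.corpus import wordnet as wn

for synset in wn.synsets("bank")[:3]:
    print(synset.name(), "-", synset.definition())
# e.g. bank.n.01 - sloping land (especially the slope beside a body of water)
#      depository_financial_institution.n.01 - a financial institution that accepts deposits ...
```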

2. What is Word Sense Disambiguation? Illustrate with an example how the dictionary-based approach identifies the correct sense of an ambiguous word.

Word Sense Disambiguation (WSD):

1. Definition: WSD is the process of determining which sense of a word is used in a given
context.
2. Challenge: Words like “bank”, “bark”, or “set” can have multiple meanings depending on
the context.
3. Dictionary-Based Approach: Relies on external resources like dictionaries or WordNet.
It identifies the correct sense of a word by comparing the surrounding context with
definitions.
4. Example:
○ Sentence: “He sat by the bank of the river.”
○ Dictionary Sense: The correct sense here is "bank" as the edge of a river.
○ Process: The algorithm compares the context words like "river", "edge", and
"water" with the possible senses in WordNet, and selects the most relevant
sense (the riverbank sense).
5. Steps in Dictionary-Based WSD:
○ Extract context from the sentence.
○ Compare context with dictionary definitions.
○ Choose the sense with the highest overlap with the context.

3. What are lexicon and lexeme? Explain different types of relations that
hold between lexemes with examples.

Lexicon and Lexeme:

1. Lexicon: A lexicon is a collection or inventory of words and their meanings in a particular language. It is a dictionary or database that stores the vocabulary of a language.
2. Lexeme: A lexeme is the fundamental unit of meaning, representing a set of forms of a
word that all have a similar meaning.
○ Example: The lexeme “run” includes “run”, “runs”, “ran”, and “running”.
3. Relations Between Lexemes:
○ Synonymy: Lexemes with similar meanings.
■ Example: “happy” and “joyful”.
○ Antonymy: Lexemes with opposite meanings.
■ Example: “hot” and “cold”.
○ Hyponymy: A lexeme whose meaning is more specific than another.
■ Example: “rose” is a hyponym of “flower”.
○ Meronymy: A part-whole relationship between lexemes.
■ Example: “wheel” is a meronym of “car”.
○ Holonymy: The inverse of meronymy, where one lexeme represents a whole that
consists of parts.
■ Example: “car” is a holonym of “wheel”.

4. Explain with suitable examples the following relationships between word meanings: Homonymy, Polysemy, Synonymy, Antonymy.

| Relationship | Definition | Example |
|---|---|---|
| Homonymy | A single word that has multiple meanings, often unrelated. | "Bank" (financial institution) vs. "bank" (side of a river). |
| Polysemy | A single word with multiple meanings that are related by extension. | "Mouth" (opening in a face) vs. "mouth" (opening of a river). |
| Synonymy | Words that have the same or similar meanings. | "Big" and "large". |
| Antonymy | Words that have opposite meanings. | "Hot" and "cold". |

5. Explain Yarowsky's bootstrapping approach of semi-supervised learning.

Yarowsky's Bootstrapping Approach:

1. Definition: A semi-supervised learning technique that uses a small amount of labeled data and a large amount of unlabeled data to create a more accurate model.
2. Process:
○ Seed Words: Start with a small set of labeled examples.
○ Propagation: The algorithm iteratively labels the unlabeled data by propagating
labels from the seed set.
○ Bootstrapping: After labeling some of the data, the newly labeled instances are
added to the seed set, refining the model.
3. Example:
○ Start with a few words with known senses (e.g., "bank" as a financial institution
and "bank" as a river edge).
○ Use these seed words to label similar words in unlabeled data.
○ After each iteration, the model gets more accurate and can handle more
ambiguous cases.
4. Applications: Used in tasks like Word Sense Disambiguation (WSD) and Named Entity
Recognition (NER).
5. Advantages: Reduces the need for large amounts of labeled data.
6. Challenges: Initial seed data must be carefully chosen for effective learning.

6. What is WSD? Discuss the Dictionary-based approach (Lesk Algorithm) for WSD with an example.

Word Sense Disambiguation (WSD):

1. Definition: WSD is the process of determining which sense of a word is used in a specific context.
specific context.
2. Lesk Algorithm: A dictionary-based approach that selects the sense of a word that
overlaps most with the context words.
3. Steps in Lesk Algorithm:
○ For each sense of the ambiguous word, find its dictionary definition (or gloss).
○ Compare the definitions (glosses) with the surrounding words in the context.
○ Select the sense whose definition shares the most words with the context.
4. Example:
○ Sentence: “The bank refused the loan.”
○ Word “bank” has two senses:
■ Sense 1: Financial institution.
■ Sense 2: The side of a river.
○ The gloss for the financial institution sense includes terms like "money", "loan",
and "transaction", while the river sense includes terms like "river" and "shore".
○ Since "loan" appears in the context, the algorithm selects the financial institution
sense.
5. Applications: Used for disambiguating words in NLP tasks like machine translation.
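
NLTK ships a simplified Lesk implementation that can be tried on this example (assuming nltk and the WordNet data are installed); the expected behaviour is indicative:

```python
from nltk.wsd import lesk
from nltk.tokenize import word_tokenize

sentence = "The bank refused the loan."
sense = lesk(word_tokenize(sentence), "bank", pos="n")
print(sense, "-", sense.definition() if sense else "no sense found")
# Expected to prefer a financial sense of "bank" because its gloss overlaps with "loan".
```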

7. Explain with suitable examples the following relationships between word meanings: Hyponymy, Hypernymy, Meronymy, Holonymy. (5M)

| Relationship | Definition | Example |
|---|---|---|
| Hyponymy | A more specific term that falls under a broader category. | "Rose" (hyponym) is a type of "flower" (hypernym). |
| Hypernymy | A broader term that encompasses more specific terms. | "Animal" (hypernym) encompasses "dog", "cat", etc. |
| Meronymy | A part-whole relationship where a term refers to a part of something larger. | "Wheel" is a meronym of "car". |
| Holonymy | The inverse of meronymy, where a term refers to the whole that contains parts. | "Car" is a holonym of "wheel". |

8. Describe semantic analysis in NLP. (5M)

Semantic Analysis in NLP:

1. Definition: The process of extracting meaning from text by analyzing word meanings,
sentence structure, and context.
2. Goals: To represent the meaning of sentences and resolve ambiguities.
3. Steps:
○ Word Sense Disambiguation (WSD): Identifying the correct sense of words.
○ Named Entity Recognition (NER): Identifying entities like names, dates, or
locations.
○ Sentiment Analysis: Determining the sentiment conveyed by the text.
4. Techniques:
○ Vector Space Models: Representing words as vectors in multi-dimensional
space.
○ Latent Semantic Analysis (LSA): Capturing relationships between words and
contexts.
5. Applications: Machine translation, question answering, information retrieval, and
chatbots.

9. Demonstrate lexical semantic analysis using an example.

1. Definition: Lexical semantic analysis involves understanding the meaning of individual words and how they contribute to the overall meaning of a sentence.
2. Example Sentence: “The bat flew out of the cave.”
○ Polysemy: The word “bat” can refer to a flying mammal or a piece of sports
equipment.
○ Contextual Disambiguation: By analyzing the surrounding words ("flew",
"cave"), the correct sense (flying mammal) is selected.
3. Steps:
○ Extract words and identify potential meanings.
○ Disambiguate meanings using context.
○ Represent meanings in a formal system (e.g., using WordNet).
4. Applications: Useful in information retrieval, machine translation, and text
summarization.

Let me know if you'd like further clarification or details!


You said:
### Module 5
1. Explain Hobbs' algorithm for pronoun resolution.
2. Explain three types of referents that complicate the reference resolution problem. (5M)
3. Explain discourse reference resolution in detail.
4. Describe in detail the Centering Algorithm for reference resolution.
5. What is reference resolution? (5M)
6. Illustrate the reference phenomena for solving the pronoun problem.
7. Explain anaphora resolution using Hobbs' and Centering algorithms.

Provide 7 to 8 point answers as above.

ChatGPT said:

1. Explain Hobbs' algorithm for pronoun resolution.


Hobbs' Algorithm for Pronoun Resolution:

1. Definition: Hobbs' algorithm is a method for resolving pronouns (like "he", "she", "it") to
their appropriate antecedents in a sentence or discourse.
2. Core Idea: The algorithm relies on the syntactic structure of the sentence to determine
which noun phrase a pronoun refers to.
3. Steps:
○ Identify the pronoun in the sentence.
○ Traverse the sentence structure to find the closest noun phrase.
○ Check whether the noun phrase is grammatically compatible with the pronoun.
○ Select the closest noun phrase as the antecedent.
4. Example:
○ Sentence: "John went to the store. He bought some milk."
○ Hobbs' algorithm identifies "John" as the antecedent for "He" based on the
syntactic structure and proximity.
5. Challenges:
○ Ambiguities in pronoun reference (e.g., “he” can refer to multiple male
individuals).
○ Handling long-distance anaphora or reference across multiple sentences.
6. Strengths: Effective in syntactically simple cases where the antecedent is close to the
pronoun.
7. Limitations: Struggles with complex cases, such as when the antecedent is implicit or
involves vague reference.

2. Explain three types of referents that complicate the reference resolution problem. (5M)

Three Types of Referents in Reference Resolution:

1. Ambiguous Referents:
○ Definition: A referent can be ambiguous if a pronoun could refer to multiple
entities.
○ Example: "John saw James, and he smiled." The pronoun "he" could refer to
either John or James, creating ambiguity.
2. Implicit Referents:
○ Definition: Sometimes the referent is not explicitly stated in the text, making it
hard to identify the antecedent.
○ Example: "She will come tomorrow." Without prior context, it’s unclear who "she"
refers to.
3. Long-Distance Referents:
○ Definition: The referent may be far from the pronoun, either within the same
sentence or across multiple sentences.
○ Example: "The manager asked Sarah to handle the presentation. She was
excellent at it." The pronoun "She" refers to Sarah, but it’s far from the mention of
Sarah.
4. Coreference in Complex Sentences: Sentences with embedded clauses or multiple
subjects can confuse reference resolution.
5. Pronominal vs. Definite Noun Phrases: Identifying whether a pronoun refers to a
definite noun phrase (e.g., "the president") or an indefinite one (e.g., "a man") can
complicate resolution.

3. Explain discourse reference resolution in detail.

Discourse Reference Resolution:

1. Definition: Discourse reference resolution refers to identifying the antecedent of a referring expression (pronouns, noun phrases) within a larger discourse context, which may span multiple sentences.
may span multiple sentences.
2. Core Idea: Unlike sentence-level reference resolution, discourse resolution requires
understanding relationships across sentences or even paragraphs.
3. Techniques:
○ Cohesion Analysis: Identifying relationships between words or phrases that
help connect different parts of the discourse (e.g., "therefore", "he" refers to a
previous person).
○ Salience and Proximity: More salient or recently mentioned entities are likely to
be the referents of pronouns.
4. Example:
○ Discourse: "Mary went to the store. She bought apples."
○ The pronoun "She" refers to "Mary" in the previous sentence.
5. Challenges:
○ Ambiguities over which noun phrase is the most salient or prominent referent.
○ Understanding cross-sentence references that require deeper semantic analysis.
6. Methods:
○ Linking: Connecting pronouns with their antecedents using syntactic, semantic,
or discourse-based cues.
○ Centering Theory: Uses the concept of "centers" to track discourse entities,
focusing on the most salient entity at any given moment.
7. Applications: Essential in machine translation, text summarization, and dialogue
systems, where coherence across sentences is critical.

4. Describe in detail the Centering Algorithm for reference resolution.

Centering Algorithm for Reference Resolution:


1. Definition: The Centering algorithm is a rule-based approach designed to resolve
anaphora (pronoun or noun phrase reference) by using syntactic and semantic cues.
2. Core Idea: The algorithm resolves references by examining the grammatical structure
and the discourse context, such as sentence boundaries and the salience of
antecedents.
3. Steps:
○ Identify potential referents in the discourse (noun phrases or previous pronouns).
○ Evaluate each candidate based on syntactic proximity (sentence structure) and
semantic compatibility (whether the candidate fits the meaning of the pronoun).
○ Apply rules to eliminate incompatible referents, such as gender or number
agreement.
4. Example:
○ Sentence: “John loves his dog. He feeds it every day.”
○ The algorithm resolves "He" to "John" and "it" to "dog" based on context and
syntactic structure.
5. Features:
○ Centering focuses on both syntactic relationships (e.g., noun phrase agreement)
and discourse-based relationships (e.g., entity salience).
○ It uses a fixed set of rules for resolving references.
6. Challenges:
○ Centering can struggle with long-distance references or cases involving implicit
antecedents.
7. Applications: Used in systems requiring accurate pronoun resolution in complex texts
or dialogue systems.

5. What is reference resolution? (5M)

Reference Resolution:

1. Definition: Reference resolution is the task of determining what a referring expression (e.g., a pronoun or noun phrase) refers to in a sentence or discourse.
2. Types of Reference:
○ Anaphora: Where a pronoun refers to a previous noun or noun phrase.
○ Cataphora: Where a noun or noun phrase refers forward to something
mentioned later.
3. Importance: Crucial for tasks like machine translation, text summarization, and
information retrieval, where understanding which entities are being talked about is key.
4. Challenges:
○ Ambiguous references: Pronouns like “he”, “it”, or “they” can refer to multiple
entities.
○ Complex sentence structures: Long-distance references can make resolution
difficult.
5. Methods:
○ Rule-based approaches: Using syntactic, semantic, and contextual rules to
identify antecedents.
○ Statistical methods: Machine learning techniques to model and predict referent
choices.
6. Example: In the sentence "He bought a car. It is red," "He" refers to the most salient
person (probably the subject of the previous sentence), and "It" refers to the car.
7. Applications: Important in natural language understanding tasks like question
answering, dialogue systems, and summarization.

6. Illustrate the reference phenomena for solving the pronoun problem.

Reference Phenomena for Solving the Pronoun Problem:

1. Gender Agreement: A pronoun must match the gender of its antecedent.


○ Example: "John is happy. He loves football." The pronoun "He" refers to John.
2. Number Agreement: A pronoun must match the number (singular or plural) of its
antecedent.
○ Example: "Mary and Ann are friends. They are going to the mall." "They" refers to
Mary and Ann.
3. Proximity: The closest noun phrase to a pronoun is often selected as its antecedent.
○ Example: "Mary gave a book to John. He liked it." "He" refers to John, as he is
the closest noun phrase.
4. Co-reference Resolution: Identifying when two or more noun phrases refer to the same
entity.
○ Example: "The car broke down. It was old." "It" refers to "The car."
5. Salience: More prominent or previously mentioned entities are more likely to be referred
to by pronouns.
6. Syntactic Cues: Analyzing sentence structure (e.g., subject vs. object position) to
determine the antecedent.
7. Contextual Clues: Using the broader context, including previous sentences, to resolve
ambiguous pronouns.

7. Explain anaphora resolution using Hobbs' and Centering algorithms.

Anaphora Resolution Using Hobbs' and Centering Algorithms:

1. Hobbs’ Algorithm:
○ Method: Focuses on the syntactic proximity and grammatical structure of the
sentence.
○ Process: It identifies the nearest noun phrase to a pronoun and checks its
compatibility with the pronoun in terms of gender, number, and syntactic role.

Example: "John came in. He sat down." Hobbs' algorithm identifies "He" as
referring to "John."
2. Centering Algorithm:
○ Method: A more rule-based approach that incorporates both syntactic and
semantic analysis. It resolves references by evaluating syntactic proximity,
number and gender agreement, and discourse-based salience.
○ Process: The algorithm uses a set of predefined rules to link pronouns with
antecedents based on context.
○ Example: "John and Mary arrived. She was happy." Cantering resolves "She" to
"Mary" based on semantic rules and proximity.

These algorithms are complementary, with Hobbs focusing on syntactic clues and Centering combining more complex discourse-based rules.


You said:
### Module 6
1. Compare Information Retrieval with an Information Extraction System. (5M & 10M)
2. What is Information Retrieval and Machine Translation in applications?
3. Write a note on text summarization.
4. What is rule-based machine translation? (5M)
5. Explain the different steps in text processing for Information Retrieval. (5M)
6. Explain machine translation approaches in NLP. (5M)
7. Explain the statistical approach for machine translation. (5M)
8. Explain a Question Answering System (QAS) in detail.
9. Demonstrate the working of a machine translation system.
10. Explain the Information Retrieval System.

Provide 7 to 8 point answers as above.

ChatGPT said:

1. Compare Information Retrieval with an Information Extraction System. (5M & 10M)

Information Retrieval vs. Information Extraction:

1. Definition:
○ Information Retrieval (IR) refers to the process of searching and retrieving
documents that are relevant to a user’s query from a large collection of text.
○ Information Extraction (IE) refers to the process of identifying and extracting
specific information from unstructured or semi-structured text, such as named
entities, relationships, or facts.
2. Objective:
○ IR aims to return relevant documents or text segments based on the user's query.
○ IE aims to extract specific data (e.g., names, dates, events) from documents.
3. Process:
○ IR involves searching, ranking, and presenting documents based on relevance.
○ IE involves parsing, entity recognition, and relation extraction.
4. Result:
○ IR returns a set of documents or text snippets.
○ IE returns structured data or facts (e.g., a list of people, places, or dates).
5. Data Type:
○ IR operates on entire documents or text collections.
○ IE operates on specific parts of text, such as sentences or paragraphs, to extract
structured data.
6. Focus:
○ IR focuses on retrieving relevant information regardless of its structure.
○ IE focuses on extracting particular pieces of information from the text.
7. Use Cases:
○ IR is widely used in search engines, digital libraries, and document management
systems.
○ IE is used in tasks like news aggregation, medical data extraction, and legal
document analysis.

2. What is Information Retrieval and Machine Translation in applications?

Information Retrieval in Applications:

1. Definition: Information retrieval involves searching and retrieving relevant data or documents from a large corpus based on user queries.
2. Applications:
○ Search Engines: Google, Bing, etc., rely on IR to provide relevant results.
○ Document Management Systems: IR is used to locate documents based on
keywords or topics.
○ Digital Libraries: Helps in finding research papers, books, and other materials.

Machine Translation in Applications:

1. Definition: Machine translation (MT) refers to the automatic translation of text from one
language to another using computational techniques.
2. Applications:
○ Google Translate: A widely used tool for translating text across multiple
languages.
○ Localization: MT helps in translating software, websites, and content for global
audiences.
○ Cross-Language Information Retrieval: MT is used in searching documents in
different languages.
3. Write a note on text summarization.

Text Summarization:

1. Definition: Text summarization is the process of generating a shorter version of a text while retaining the essential information and meaning.
2. Types:
○ Extractive Summarization: Selects important sentences or phrases directly
from the original text to form the summary.
○ Abstractive Summarization: Generates new sentences that paraphrase the
original content, providing a more coherent summary.
3. Applications:
○ News Summarization: Automatic generation of brief news articles.
○ Document Summarization: Summarizing long reports or research papers for
quicker reading.
○ Product Reviews: Summarizing customer feedback to extract key opinions.
4. Techniques:
○ Statistical Methods: Based on frequency of words and sentence importance.
○ Deep Learning Models: Use neural networks to understand the context and
generate summaries.
5. Challenges:
○ Maintaining Meaning: Ensuring that important information is not lost in the
summary.
○ Coherence: Creating a summary that is coherent and understandable.

4. What is rule-based machine translation? (5M)

Rule-Based Machine Translation (RBMT):

1. Definition: Rule-based machine translation is a method where translation is performed based on linguistic rules and dictionaries for both source and target languages.
2. Components:
○ Lexical Database: A dictionary of words and their meanings in both languages.
○ Grammar Rules: Set of syntactic rules that define how words and phrases
should be translated.
3. Process:
○ Analysis: The source language text is parsed using syntax and semantics rules.
○ Transfer: The structure of the source language is mapped to that of the target
language.
○ Generation: The translated text is generated in the target language using the
defined rules.
4. Advantages:
○ Accuracy: Can achieve high accuracy if the rules are carefully designed.
○ Control: The translation process is fully controlled by linguists.
5. Challenges:
○ Complexity: Requires extensive linguistic knowledge and a large set of rules.
○ Scalability: Difficult to scale for multiple languages due to the need for extensive
manual work.

5. Explain the different steps in text processing for Information Retrieval. (5M)

Text Processing Steps for Information Retrieval:

1. Tokenization: The process of breaking down the text into individual terms or tokens
(words, phrases, or characters).
2. Stop-word Removal: Common words such as “the”, “a”, and “in” are removed because
they do not contribute to the search’s meaning.
3. Stemming: Reducing words to their base or root form (e.g., "running" to "run").
4. Lemmatization: Similar to stemming but more sophisticated, reducing words to their
dictionary form (e.g., "better" to "good").
5. Indexing: Creating an index of terms and their locations in the document corpus to allow
for efficient searching.
6. Query Processing: The user’s search query is processed, and terms are matched with
the indexed terms in the corpus.
7. Ranking: Based on relevance, the documents are ranked and presented to the user.

6. Explain machine translation approaches in NLP. (5M)

Machine Translation Approaches:

1. Rule-Based Machine Translation (RBMT):


○ Uses predefined linguistic rules and dictionaries for translation.
○ Requires deep linguistic knowledge for each language pair.
2. Statistical Machine Translation (SMT):
○ Relies on statistical models trained on large parallel corpora to translate between
languages.
○ Learns translation probabilities from the data and can generalize to new text.
3. Neural Machine Translation (NMT):
○ Uses deep learning, particularly neural networks, to learn end-to-end translation
tasks.
○ Produces more fluent and context-aware translations compared to SMT.
4. Example-Based Machine Translation (EBMT):
○ Translates using stored translations from a corpus of examples.
○ Selects and adapts the most relevant translation examples to the current input.

7. Explain the statistical approach for machine translation. (5M)

Statistical Machine Translation (SMT):

1. Definition: SMT uses statistical models that are trained on large bilingual corpora to
translate text from one language to another.
2. Corpus-based Approach: The system learns translation probabilities from large parallel
corpora (i.e., text aligned in both languages).
3. Components:
○ Phrase Table: A collection of phrase pairs and their associated translation
probabilities.
○ Language Model: A statistical model that predicts the fluency of a sentence in
the target language.
4. Process:
○ Training: The system learns the translation probabilities from a corpus of aligned
text.
○ Decoding: The system generates the target language translation by choosing
the most probable phrase alignments.
5. Advantages:
○ Can work with any language pair given sufficient training data.
○ Allows for automatic learning of translation patterns from data.
6. Challenges:
○ Requires large parallel corpora.
○ Does not handle context as well as rule-based or neural approaches.

8. Explain a Question Answering System (QAS) in detail.

Question Answering System (QAS):

1. Definition: A QAS is designed to answer user queries by automatically retrieving information from a database or corpus of text.
2. Types of QAS:
○ Closed-domain QAS: Answers questions related to a specific topic (e.g., a
medical question answering system).
○ Open-domain QAS: Can answer questions on a wide range of topics (e.g.,
general knowledge questions).
3. Process:
○ Question Processing: The system analyzes the input question to understand its
structure and intent.
○ Information Retrieval: The system searches for relevant documents or text that
can answer the question.
○ Answer Generation: The system selects the best answer from the retrieved
documents, often extracting the answer or generating a response.
4. Techniques:
○ Information Retrieval: To retrieve relevant documents or passages.
○ Natural Language Processing: To extract the answer from the retrieved text.
○ Machine Learning: To improve the accuracy of the system over time.
5. Challenges:
○ Ambiguity: Users may phrase questions in ambiguous ways.
○ Complexity: Some questions may require reasoning or combining information
from multiple sources.

9. Demonstrate the working of a machine translation system.

Working of a Machine Translation System:

1. Input: A sentence in the source language (e.g., English: "I love programming").
2. Preprocessing: The sentence is tokenized, and stop words are removed.
3. Translation: The system uses either rule-based, statistical, or neural methods to
translate the sentence into the target language (e.g., French: "J'aime la
programmation").
4. Postprocessing: The translated sentence may undergo further adjustments for fluency
and grammar.
5. Output: The translated sentence is presented to the user in the target language.
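
A toy sketch of these steps, assuming a hand-written English-to-French lexicon and a single contraction rule; real systems replace the lookup table with statistical or neural models:

# Toy dictionary-based English -> French pipeline (illustrative only; the lexicon
# and the contraction rule are assumptions, not a real MT system).
lexicon = {"i": "je", "love": "aime", "programming": "la programmation"}

def translate(sentence):
    tokens = sentence.lower().rstrip(".!?").split()      # preprocessing: normalize + tokenize
    target = [lexicon.get(tok, tok) for tok in tokens]   # "translation": word-by-word lookup
    output = " ".join(target)
    output = output.replace("je aime", "j'aime")         # postprocessing: fluency fix-up
    return output[0].upper() + output[1:] + "."

print(translate("I love programming"))   # -> J'aime la programmation.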

10. Explain the Information Retrieval System.

Information Retrieval System:

1. Definition: An information retrieval system is designed to store, index, and retrieve


relevant documents from a large corpus based on user queries.
2. Components:
○ Document Collection: A corpus of text or documents.
○ Indexing System: An index that maps terms to their locations in the documents.
○ Query Processor: Analyzes and processes user queries to match with indexed
documents.
○ Ranking System: Ranks the documents based on relevance to the query.
3. Process:
○ Inverted Indexing: The system creates an inverted index, mapping words to the
documents in which they appear.

○ Query Matching: The user’s query is tokenized, processed, and compared to
the index.
○ Ranking: Documents are ranked based on relevance using algorithms like
TF-IDF or BM25.
4. Applications:
○ Search Engines: Google, Bing, etc.
○ Digital Libraries: Retrieval of academic papers or books.
○ Enterprise Search: Retrieval of internal documents from large databases.
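
The inverted-index and TF-IDF ideas above can be sketched in a few lines of Python; the three documents and the scoring details are illustrative assumptions (no stemming, no length normalization, no BM25), not a full IR engine:

import math
from collections import Counter, defaultdict

docs = {
    1: "natural language processing with python",
    2: "information retrieval and search engines",
    3: "language models for information retrieval",
}

# Build per-document term frequencies and the inverted index (term -> doc ids).
tf = {d: Counter(text.split()) for d, text in docs.items()}
index = defaultdict(set)
for d, counts in tf.items():
    for term in counts:
        index[term].add(d)

def idf(term):
    df = len(index[term])                 # document frequency of the term
    return math.log(len(docs) / df) if df else 0.0

def search(query):
    scores = defaultdict(float)
    for term in query.lower().split():
        for d in index.get(term, ()):     # only documents that contain the term
            scores[d] += tf[d][term] * idf(term)
    return sorted(scores.items(), key=lambda item: -item[1])

print(search("information retrieval"))    # docs 2 and 3 score highest on this toy corpus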


You said:
### Extras 1. Define affixes. Explain different types of affixes. (5M) 2. Numerical (Transition and
Emission Probability Matrix). 3. What are five types of referring expressions? Explain with
examples. Provide more detailed answers, 7 to 8 points each.

ChatGPT said:
ChatGPT

1. Define affixes. Explain different types of affixes. (5M)

Affixes:

1. Definition: Affixes are morphemes (the smallest meaningful units in language) that are
attached to a root word to alter its meaning or function. They do not stand alone as
words but modify the base word.
2. Types of Affixes:
○ Prefix: An affix that is added at the beginning of a root word.
■ Example: "Un" in "unhappy" (negates the meaning of "happy").
○ Suffix: An affix that is added at the end of a root word.
■ Example: "ness" in "happiness" (turns an adjective into a noun).
○ Infix: An affix that is inserted within a root word, typically in languages other than
English.
■ Example: In Tagalog, "um" is inserted into "takbo" (run) to form
"tumakbo" (ran / to run).
○ Circumfix: An affix that surrounds the root word, with parts added at the
beginning and end.
■ Example: In German, "ge-" and "-t" surround the verb "spiel" (play) to
form "gespielt" (played).
○ Transfix: An affix inserted within the root, common in languages like Arabic.
■ Example: "ktb" becomes "katab" (he wrote) when combined with affixes.
3. Function of Affixes:
○ Derivational Affixes: Create new words or change the meaning of the base
word (e.g., "unhappy" from "happy").
○ Inflectional Affixes: Modify a word to express grammatical features such as
tense, number, or case, without changing the word's meaning (e.g., "dogs" from
"dog").

2. Numerical (Transition and Emission Probability Matrix)

Transition and Emission Probability Matrix:

In the context of Hidden Markov Models (HMM) used in NLP for tasks such as Part-of-Speech
(POS) tagging, two key components are the Transition Probability Matrix and the Emission
Probability Matrix.

1. Transition Probability Matrix:


○ Represents the probability of moving from one state (e.g., one POS tag) to
another state in a sequence.
○ Notation: P(State_i → State_j), which is the probability of transitioning from state
i to state j.
○ Example: If "Noun" is state 1 and "Verb" is state 2, then the probability of
transitioning from "Noun" to "Verb" might be 0.3.
2. Transition Matrix Example:
From/To Noun Verb Adjective

Noun 0.4 0.3 0.2

Verb 0.1 0.5 0.2

Adjective 0.2 0.1 0.5

3. Emission Probability Matrix:
○ Represents the probability of an observation (e.g., a word) being emitted from a
particular state (e.g., a specific POS tag).
○ Notation: P(Observation | State), which is the probability of observing a
particular word given a state.
○ Example: If the POS tag is "Noun", the emission probability of the word "dog"
could be 0.6.
4. Emission Matrix Example:
Word/Tag Noun Verb Adjective

dog 0.6 0.1 0.05

run 0.1 0.7 0.1


happy 0.05 0.05 0.8

These matrices are used in HMMs to compute the most likely sequence of states (POS tags)
given a sequence of observations (words).
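
A small sketch of how the two matrices might be stored and combined to score one tagged sequence, using the illustrative numbers from the tables above; the start-state row "<s>" is an added assumption:

# Scoring a (word, tag) sequence with illustrative transition and emission matrices.
transition = {            # P(next_tag | current_tag); "<s>" start row is assumed
    "<s>": {"Noun": 1.0},
    "Noun": {"Noun": 0.4, "Verb": 0.3, "Adjective": 0.2},
    "Verb": {"Noun": 0.1, "Verb": 0.5, "Adjective": 0.2},
}
emission = {              # P(word | tag), values from the example emission table
    "Noun": {"dog": 0.6},
    "Verb": {"run": 0.7},
}

def sequence_probability(words, tags):
    prob, prev = 1.0, "<s>"
    for word, tag in zip(words, tags):
        prob *= transition[prev].get(tag, 0.0) * emission[tag].get(word, 0.0)
        prev = tag
    return prob

print(sequence_probability(["dog", "run"], ["Noun", "Verb"]))
# 1.0*0.6 * 0.3*0.7 = 0.126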

3. What are five types of referring expressions? Explain with examples.

Types of Referring Expressions:

1. Pronouns:
○ Definition: Words used to refer to entities without explicitly naming them. They
depend on context for their interpretation.
○ Example: "John went to the store. He bought some milk."
○ Here, "He" refers to "John" based on prior context.
2. Definite Descriptions:
○ Definition: Phrases that refer to a specific entity known to both the speaker and
listener.
○ Example: "The president of the company."
○ Refers to a specific person who is known to both parties.
3. Indefinite Descriptions:
○ Definition: Phrases that refer to an unspecified entity.
○ Example: "A man walked into the room."
○ Refers to any man, not a specific one.
4. Demonstratives:
○ Definition: Words that indicate specific entities based on proximity or context,
often used with a noun.
○ Example: "This book is interesting."
○ "This" refers to a specific book in close proximity.
5. Proper Nouns:
○ Definition: Specific names of people, places, or things, used to refer to a unique
entity.
○ Example: "Albert Einstein was a physicist."
○ Refers specifically to the individual "Albert Einstein."

These referring expressions are essential in establishing who or what is being discussed, and
their correct interpretation relies on the context and discourse in which they are used.


You said:
Discuss the challenges in various stages of natural language processing. [10]
Consider the following corpus:
<s> the/DT students/NN pass/V the/DT test/NN </s>
<s> the/DT students/NN wait/V for/P the/DT result/NN </s>
<s> teachers/NN test/V students/NN </s>
Compute the emission and transition probabilities for a bigram HMM. Also decode the following
sentence using the Viterbi algorithm: “The students wait for the test”

ChatGPT said:
ChatGPT

Challenges in Various Stages of Natural Language Processing (NLP) [10]

NLP involves multiple stages, each presenting unique challenges that impact the accuracy and
efficiency of processing text. The following are key challenges encountered at various stages:

1. Tokenization:
○ Challenge: The task of segmenting text into words or tokens is complicated by
punctuation, contractions, and the existence of multi-word expressions.
○ Example: "I can't go" should be tokenized as ["I", "can't", "go"], but the
apostrophe in "can't" may complicate segmentation.
2. Part-of-Speech (POS) Tagging:
○ Challenge: Words often have multiple meanings depending on the context
(lexical ambiguity), making it difficult to assign the correct POS tag.
○ Example: The word "bank" can be a noun (riverbank) or a verb (to bank money).
Context is needed to determine the correct tag.
3. Named Entity Recognition (NER):
○ Challenge: Identifying and classifying entities such as names of people,
organizations, and locations can be difficult due to variations in naming
conventions and domain-specific terms.
○ Example: “Apple” can refer to the fruit or the tech company, which poses a
challenge for NER systems.
4. Syntactic Parsing:
○ Challenge: Understanding sentence structure is complex due to ambiguity in
sentence construction and syntactic structures that can differ between
languages.
○ Example: The sentence "The old man the boats" requires proper parsing to
avoid misinterpretation.
5. Semantic Analysis:
○ Challenge: Understanding the meaning of words and sentences involves dealing
with issues like polysemy (a single word having multiple meanings) and word
sense disambiguation.
○ Example: "The bank of the river" vs. "The bank of America" – both have different
meanings for "bank."
6. Coreference Resolution:
○ Challenge: Determining which words or phrases refer to the same entity
(pronouns, etc.) is difficult, especially in long texts or texts with complex
structures.
○ Example: In "John went to the store. He bought apples," resolving that "He"
refers to "John" is not trivial in all contexts.
7. Word Sense Disambiguation (WSD):
○ Challenge: Choosing the correct meaning of a word based on context is a
significant challenge.
○ Example: The word "bat" can refer to a flying mammal or a piece of sports
equipment, and its correct sense depends on the surrounding words.
8. Machine Translation:
○ Challenge: Translating between languages involves handling differences in
syntax, idiomatic expressions, and cultural nuances.
○ Example: Phrasal verbs in English may not have a direct equivalent in other
languages, making translation harder.
9. Speech Recognition:
○ Challenge: Converting spoken language into text is affected by accents,
background noise, and speech variations.
○ Example: Misunderstanding of homophones (e.g., "pair" vs. "pear") or slurred
speech can lead to incorrect transcriptions.
10. Text Summarization:
○ Challenge: Condensing long texts while preserving important information is
difficult, especially for abstractive summarization that generates new sentences.
○ Example: Extractive methods may miss key insights, while abstractive methods
may introduce errors or inaccuracies.
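
As a small illustration of the first challenge, compare naive whitespace splitting with a simple regex tokenizer; the regex below is an illustrative pattern, not a complete tokenizer:

# Whitespace splitting vs. a small regex tokenizer (illustrative only).
import re

text = "I can't go, John."
print(text.split())
# ['I', "can't", 'go,', 'John.']          <- punctuation stays glued to the tokens
print(re.findall(r"\w+'\w+|\w+|[^\w\s]", text))
# ['I', "can't", 'go', ',', 'John', '.']  <- contraction kept whole, punctuation separated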

Compute the Emission and Transition Probabilities for a Bigram HMM

Given the following corpus:

<s> the/DT students/NN pass/V the/DT test/NN </s>
<s> the/DT students/NN wait/V for/P the/DT result/NN </s>
<s> teachers/NN test/V students/NN </s>

Step 1: Identify Tokens and Tags

● Tokens: the, students, pass, test, wait, for, result, teachers


● POS Tags: DT (determiner), NN (noun), V (verb), P (preposition)

Step 2: Calculate Transition Probabilities

The Transition Probability is the probability of moving from one POS tag to another. It's
calculated as:
\( P(\text{Tag}_i \mid \text{Tag}_{i-1}) = \dfrac{\text{Count}(\text{Tag}_{i-1} \to \text{Tag}_i)}{\text{Count}(\text{Tag}_{i-1})} \)

For the corpus:

● <s> → DT = 2 (sentences 1 and 2 begin with "the/DT")
● <s> → NN = 1 (sentence 3 begins with "teachers/NN")
● DT → NN = 4 ("the students" ×2, "the test", "the result")
● NN → V = 3 ("students pass", "students wait", "teachers test")
● NN → </s> = 3 ("test", "result", "students" end their sentences)
● V → DT = 1 ("pass the")
● V → P = 1 ("wait for")
● V → NN = 1 ("test students")
● P → DT = 1 ("for the")

Totals per starting tag: <s> = 3, DT = 4, NN = 6, V = 3, P = 1.

Transition Matrix (probabilities calculated from the counts above):

From/To DT NN V P </s>

<s> 0.67 0.33 0 0 0

DT 0 1.0 0 0 0

NN 0 0 0.5 0 0.5

V 0.33 0.33 0 0.33 0

P 1.0 0 0 0 0

Step 3: Calculate Emission Probabilities

The Emission Probability is the probability of a word being generated from a particular POS
tag. It's calculated as:

\( P(\text{Word}_i \mid \text{Tag}_i) = \dfrac{\text{Count}(\text{Tag}_i, \text{Word}_i)}{\text{Count}(\text{Tag}_i)} \)

For the corpus:

● DT (4 occurrences): "the" occurs 4 times →
P(the | DT) = 4/4 = 1.0
● NN (6 occurrences): "students" 3, "test" 1, "result" 1, "teachers" 1 →
P(students | NN) = 3/6 = 0.5; P(test | NN) = P(result | NN) = P(teachers | NN) = 1/6 ≈ 0.17
● V (3 occurrences): "pass" 1, "wait" 1, "test" 1 →
P(pass | V) = P(wait | V) = P(test | V) = 1/3 ≈ 0.33
● P (1 occurrence): "for" occurs 1 time →
P(for | P) = 1/1 = 1.0

Emission Matrix:

Word/Tag DT NN V P

the 1.0 0 0 0

students 0 0.5 0 0

teachers 0 0.17 0 0

test 0 0.17 0.33 0

result 0 0.17 0 0

pass 0 0 0.33 0

wait 0 0 0.33 0

for 0 0 0 1.0
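
The same counts can be reproduced programmatically; a minimal sketch, assuming the tagged corpus is written out as (word, tag) pairs with explicit start/end markers:

# Estimate transition and emission probabilities from the small tagged corpus.
from collections import Counter

sentences = [
    [("the", "DT"), ("students", "NN"), ("pass", "V"), ("the", "DT"), ("test", "NN")],
    [("the", "DT"), ("students", "NN"), ("wait", "V"), ("for", "P"), ("the", "DT"), ("result", "NN")],
    [("teachers", "NN"), ("test", "V"), ("students", "NN")],
]

trans, emit, tag_totals = Counter(), Counter(), Counter()
for sent in sentences:
    tags = ["<s>"] + [t for _, t in sent] + ["</s>"]
    for prev, cur in zip(tags, tags[1:]):
        trans[(prev, cur)] += 1          # bigram tag transition counts
    for word, tag in sent:
        emit[(tag, word)] += 1           # word emitted by tag
        tag_totals[tag] += 1

prev_totals = Counter(prev for prev, _ in trans.elements())
trans_prob = {k: c / prev_totals[k[0]] for k, c in trans.items()}
emit_prob = {k: c / tag_totals[k[0]] for k, c in emit.items()}

print(trans_prob[("DT", "NN")])       # 1.0
print(emit_prob[("NN", "students")])  # 0.5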

Step 4: Decode the Sentence using the Viterbi Algorithm

The sentence to decode: “The students wait for the test”

● Input: Words = ["the", "students", "wait", "for", "the", "test"]


● Candidate tags: DT, NN, V, P

1. Initialization: for the first word "the", only the DT path is non-zero:
δ1(DT) = P(DT | <s>) × P(the | DT) = 2/3 × 1.0 ≈ 0.67
2. Recursion: for each following word, multiply the best surviving path by the transition and emission probabilities:
○ "students": δ2(NN) = 0.67 × P(NN | DT) × P(students | NN) = 2/3 × 1.0 × 1/2 ≈ 0.33
○ "wait": δ3(V) = 0.33 × P(V | NN) × P(wait | V) = 1/3 × 1/2 × 1/3 ≈ 0.056
○ "for": δ4(P) = 0.056 × P(P | V) × P(for | P) = 1/18 × 1/3 × 1.0 ≈ 0.019
○ "the": δ5(DT) = 0.019 × P(DT | P) × P(the | DT) = 1/54 × 1.0 × 1.0 ≈ 0.019
○ "test": δ6(NN) = 0.019 × P(NN | DT) × P(test | NN) = 1/54 × 1.0 × 1/6 ≈ 0.0031
3. Termination and backtracking: all competing paths have probability 0 because their emission probabilities vanish, so tracing back gives:

● The → DT
● students → NN
● wait → V
● for → P
● the → DT
● test → NN

This is the most probable POS tag sequence for the sentence under the estimated HMM.
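
A compact Viterbi sketch using the probabilities estimated above; the matrices are hard-coded from the tables, zero-probability paths are dropped rather than smoothed, and the final NN → </s> transition is left out for simplicity:

# Viterbi decoding of "the students wait for the test" with the HMM estimated above.
trans = {
    "<s>": {"DT": 2/3, "NN": 1/3},
    "DT": {"NN": 1.0},
    "NN": {"V": 0.5, "</s>": 0.5},
    "V": {"DT": 1/3, "NN": 1/3, "P": 1/3},
    "P": {"DT": 1.0},
}
emit = {
    "DT": {"the": 1.0},
    "NN": {"students": 3/6, "test": 1/6, "result": 1/6, "teachers": 1/6},
    "V": {"pass": 1/3, "wait": 1/3, "test": 1/3},
    "P": {"for": 1.0},
}

def viterbi(words):
    # trellis maps each surviving state to (path probability, tag path so far)
    trellis = {"<s>": (1.0, [])}
    for word in words:
        nxt = {}
        for prev, (p, path) in trellis.items():
            for tag, tp in trans.get(prev, {}).items():
                score = p * tp * emit.get(tag, {}).get(word, 0.0)
                if score > 0 and score > nxt.get(tag, (0.0, None))[0]:
                    nxt[tag] = (score, path + [tag])
        trellis = nxt
    best = max(trellis, key=lambda t: trellis[t][0])
    return trellis[best][1], trellis[best][0]

tags, prob = viterbi("the students wait for the test".split())
print(tags, prob)   # ['DT', 'NN', 'V', 'P', 'DT', 'NN'] with probability ~0.0031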


You said:
For the given corpus, N: Noun [Martin, Justin, Will, Spot, Pat], M: Modal verb [can, will],
V: Verb [watch, spot, pat]. Create the Transition Matrix & Emission Probability Matrix. The statement is
“Justin will spot Will”. Apply the Hidden Markov Model and do POS tagging for the given statement.

ChatGPT said:
ChatGPT

Step 1: Define the Corpus and Part-of-Speech Tags

Given the POS categories and vocabulary from the problem, here’s the categorization of words:

● Nouns (N): [Martin, Justin, Will, Spot, Pat]


● Modal Verbs (M): [can, will]
● Verbs (V): [watch, spot, pat]

Corpus Example:

● Sentence: "Justin will spot Will"


○ Words: ["Justin", "will", "spot", "Will"]
○ POS Tags: [N, M, V, N]

Step 2: Create Transition Matrix

The transition matrix represents the probabilities of moving from one POS tag to another (i.e.,
the probability of a tag given the previous tag). We calculate this matrix based on the corpus,
but since we don’t have a full corpus of transitions, we'll assume that each transition between
tags is equally likely (for simplicity). In real-world cases, we'd calculate the transition
probabilities based on word-tag co-occurrence in a large corpus.

Assumed transition probabilities (we'll normalize later to make sure they sum to 1):

● From Noun (N):


○ N → M: 0.5
○ N → V: 0.5
○ N → N: 0 (assumed here for simplicity; in practice noun–noun sequences do occur)
● From Modal verb (M):
○ M → V: 1.0 (A modal verb is most likely followed by a verb)

You said:
<S> Martin Justin can watch Will <E> <S> Spot will watch Martin <E> <S> Will Justin spot
Martin <E> <S> Martin will pat Spot <E> For given above corpus, N: Noun [Martin, Justin, Will,
Spot, Pat] M: Modal verb [can , will] V:Verb [ watch, spot, pat] Create Transition Matrix &
Emission Probability Matrix Statement is “Justin will spot Will” Apply Hidden Markov Model and
do POS tagging for given statements

ChatGPT said:
ChatGPT

Let's break down the problem step by step and apply the Hidden Markov Model (HMM) to the
given task.

Given Information:

● Corpus:
○ <S> Martin Justin can watch Will <E>
○ <S> Spot will watch Martin <E>
○ <S> Will Justin spot Martin <E>
○ <S> Martin will pat Spot <E>
● POS Tags:
○ N: Noun [Martin, Justin, Will, Spot, Pat]
○ M: Modal verb [can, will]
○ V: Verb [watch, spot, pat]

Step 1: Define Transition Matrix

The transition matrix shows the probability of transitioning from one POS tag to another. To
build this, we'll count the transitions between the POS tags in the given corpus.

Corpus Analysis:

Based on the corpus, we have the following possible transitions:

● From Noun (N):

○ N → N (e.g., Martin → Justin, Will → Justin)
○ N → M (e.g., Justin → can, Spot → will, Martin → will)
○ N → V (e.g., Justin → spot)
○ N → <E> (every sentence ends with a noun)
● From Modal (M):
○ M → V (e.g., can → watch, will → watch, will → pat)
● From Verb (V):
○ V → N (e.g., watch → Will, spot → Martin, pat → Spot)

Transition Counts (including the start marker <S> and end marker <E>):
1. Sentence 1: <S> Martin Justin can watch Will <E>
○ <S> → N → N → M → V → N → <E>
2. Sentence 2: <S> Spot will watch Martin <E>
○ <S> → N → M → V → N → <E>
3. Sentence 3: <S> Will Justin spot Martin <E>
○ <S> → N → N → V → N → <E>
4. Sentence 4: <S> Martin will pat Spot <E>
○ <S> → N → M → V → N → <E>

Transition Probabilities:

● From <S>: every sentence starts with a noun, so P(N | <S>) = 4/4 = 1
● From Noun (N) — 10 noun tokens in total:
○ N → N = 2/10 = 0.2 (Martin → Justin, Will → Justin)
○ N → M = 3/10 = 0.3 (Justin → can, Spot → will, Martin → will)
○ N → V = 1/10 = 0.1 (Justin → spot)
○ N → <E> = 4/10 = 0.4 (every sentence ends with a noun)
● From Modal (M):
○ M → V = 3/3 = 1 (a modal verb is always followed by a verb in this corpus)
● From Verb (V):
○ V → N = 4/4 = 1 (a verb is always followed by a noun in this corpus)

Transition Matrix:

From/To N M V <E>

<S> 1 0 0 0

N 0.2 0.3 0.1 0.4

M 0 0 1 0

V 1 0 0 0

Step 2: Define Emission Matrix

The emission matrix gives the probability of a word being associated with a specific POS tag.
We calculate this based on the frequency of each word appearing with a given POS tag.

Emission Counts (capitalized "Will"/"Spot" are nouns; lowercase "will"/"spot" are the modal verb and verb):

1. Martin → N (appears 4 times)
2. Justin → N (appears 2 times)
3. Will → N (appears 2 times)
4. Spot → N (appears 2 times)
5. can → M (appears 1 time)
6. will → M (appears 2 times)
7. watch → V (appears 2 times)
8. spot → V (appears 1 time)
9. pat → V (appears 1 time)

Total occurrences of each POS tag:

● Nouns (N): 10 tokens (Martin 4, Justin 2, Will 2, Spot 2; "Pat" never appears as a noun)
● Modal verbs (M): 3 tokens (can 1, will 2)
● Verbs (V): 4 tokens (watch 2, spot 1, pat 1)

Emission Probabilities:

● For Nouns (N):
○ P(Martin | N) = 4/10 = 0.4
○ P(Justin | N) = 2/10 = 0.2
○ P(Will | N) = 2/10 = 0.2
○ P(Spot | N) = 2/10 = 0.2
● For Modal Verbs (M):
○ P(can | M) = 1/3
○ P(will | M) = 2/3
● For Verbs (V):
○ P(watch | V) = 2/4 = 0.5
○ P(spot | V) = 1/4
○ P(pat | V) = 1/4

Emission Matrix:

Word/Tag N M V

Martin 0.4 0 0

Justin 0.2 0 0

Will 0.2 0 0

Spot 0.2 0 0

can 0 0.33 0

will 0 0.67 0

watch 0 0 0.5

spot 0 0 0.25

pat 0 0 0.25

Step 3: Apply the Viterbi Algorithm for POS Tagging

Now, we’ll use the Viterbi algorithm to find the most likely POS tag sequence for the given
sentence:
“Justin will spot Will”

Initialization (word 1, "Justin"):

● δ1(N) = P(N | <S>) × P(Justin | N) = 1 × 0.2 = 0.2
● δ1(M) = δ1(V) = 0 (the emission probability of "Justin" is zero for M and V)

Step 1: Process the word "will" (word 2):

● δ2(M) = δ1(N) × P(M | N) × P(will | M) = 0.2 × 0.3 × 2/3 = 0.04
● δ2(N) = δ1(N) × P(N | N) × P(will | N) = 0.2 × 0.2 × 0 = 0
● δ2(V) = δ1(N) × P(V | N) × P(will | V) = 0.2 × 0.1 × 0 = 0

Step 2: Process the word "spot" (word 3):

● δ3(V) = δ2(M) × P(V | M) × P(spot | V) = 0.04 × 1 × 0.25 = 0.01
● δ3(N) = δ3(M) = 0 (no non-zero path from the previous step reaches N or M for this word)

Step 3: Process the word "Will" (word 4):

● δ4(N) = δ3(V) × P(N | V) × P(Will | N) = 0.01 × 1 × 0.2 = 0.002
● δ4(M) = δ4(V) = 0

(Including the end marker, the full path probability is 0.002 × P(<E> | N) = 0.002 × 0.4 = 0.0008.)
Final POS Tag Sequence:

The most likely sequence of POS tags is:

● Justin → Noun (N)


● will → Modal (M)
● spot → Verb (V)
● Will → Noun (N)

Conclusion:

The POS tagging for the sentence “Justin will spot Will” based on the Viterbi algorithm is:

N → M → V → N (Noun, Modal, Verb, Noun)
