NLP 2

The document explains the differences between syntactic ambiguity and lexical ambiguity, highlighting that syntactic ambiguity arises from sentence structure while lexical ambiguity stems from words with multiple meanings. It also defines affixes, their types, and their relevance in natural language processing. Additionally, it discusses open and closed class words, rule-based machine translation, and relationships between word meanings such as homonymy, polysemy, synonymy, and antonymy.

Q: Differentiate between Syntactic ambiguity and Lexical Ambiguity.

Sure! Here's a clear, exam-ready answer for Differentiating between Syntactic Ambiguity and
Lexical Ambiguity, perfect for your Mumbai University BE NLP exams.

Difference Between Syntactic Ambiguity and Lexical Ambiguity

Definition
• Syntactic Ambiguity: Ambiguity caused by the structure or grammar of a sentence.
• Lexical Ambiguity: Ambiguity caused by a single word having multiple meanings.

Source of Ambiguity
• Syntactic: Sentence structure leads to more than one possible interpretation.
• Lexical: Word meaning leads to confusion because the word has different senses.

Focus
• Syntactic: Focuses on syntax (sentence formation).
• Lexical: Focuses on semantics (word meanings).

Example 1
• Syntactic: “I saw the man with the telescope.” Who has the telescope? You, or the man?
• Lexical: “He went to the bank.” Which bank? Riverbank or financial bank?

Example 2
• Syntactic: “Flying planes can be dangerous.” Planes that are flying, or the act of flying planes?
• Lexical: “The bat flew out of the cave.” Bat (animal), or bat (used in cricket)?

Resolution Technique
• Syntactic: Handled by parsing techniques and syntactic disambiguation rules.
• Lexical: Handled by Word Sense Disambiguation (WSD) techniques.

Relevance in NLP
• Syntactic: Important in syntactic parsing and sentence structure analysis.
• Lexical: Important in semantic analysis and lexical processing.

Short Definitions:

• Syntactic Ambiguity:
Occurs when the structure of a sentence allows for multiple interpretations.

• Lexical Ambiguity:
Occurs when a word in a sentence has more than one meaning.
Summary with Example Sentences:

Syntactic Ambiguity Example:

Sentence: “I saw her duck.”


Meaning 1: You saw her pet duck (duck as a noun).
Meaning 2: You saw her quickly lower her head (duck as a verb).

Lexical Ambiguity Example:

Word: “Crane”
Meaning 1: A type of bird.
Meaning 2: A construction machine.

Quick Revision Tip:

• Syntactic = Sentence Structure Confusion

• Lexical = Word Meaning Confusion

Let me know if you want more examples or diagrams to help memorize this! Ready for the next
question?

Q: Define affixes. Explain the types of affixes.

Sure! Here's an exam-ready answer for your Mumbai University BE NLP paper on Affixes.

Affixes and Their Types

Definition of Affixes:

An affix is a morpheme (the smallest unit of meaning in a language) that is attached to a root word
or base word to form a new word or change its grammatical function.

Affixes are added before, after, within, or around a word.

Purpose of Affixes:

• To modify the meaning of a word.

• To change the grammatical category (noun → verb, verb → adjective).

• To create new words (word formation).


Types of Affixes:

Prefix: Added before the root word. Examples: un- in unhappy, re- in rewrite.

Suffix: Added after the root word. Examples: -ness in kindness, -ing in running.

Infix: Inserted inside the root word. Rare in English; informal example: fan-bloody-tastic.

Circumfix: Added both before and after a word. Rare in English; German example: ge-...-t, as in "geliebt" (loved).

Explanation of Each Type:

1⃣ Prefix

• Added at the beginning of a word.

• Changes the meaning of the word.


Example:

• Mis- + understand → Misunderstand

• Dis- + agree → Disagree

2⃣ Suffix

• Added at the end of a word.

• Changes the form or tense of the word.


Example:

• Quick + -ly → Quickly (adjective → adverb)

• Friend + -ship → Friendship (noun formation)

3️⃣ Infix

• Added inside a word.

• Rare in English, more common in other languages.


Example (informal English):
• Abso-bloody-lutely

• Tagalog language (Philippines):

o Sulat (write) → Sumulat (wrote)

4️⃣ Circumfix

• Two-part affix added around a word.

• Not commonly used in English but found in languages like German and Dutch.
Example (German):

• spiel (play) → gespielt (played)

o ge- (prefix) + -t (suffix)

Diagram Summary:

Root Word: happy

Prefix: un- + happy → unhappy

Suffix: happy + -ness → happiness

Circumfix: ge- + spiel + -t → gespielt (German)

Infix: abso + -bloody- + lutely → abso-bloody-lutely

Importance of Affixes in NLP:

• Helps in Morphological Analysis.

• Important for stemming and lemmatization.

• Used to analyze word formation and meaning extraction.

Quick Revision Tip:

• Prefix: Front

• Suffix: End

• Infix: Inside

• Circumfix: Around
Let me know if you want examples for morphological analysis or diagrams for your notes!
Ready for the next question?

Q: Describe open class words and closed class words in English with examples.

Sure! Here's a clear and exam-ready answer on Open Class Words and Closed Class Words in
English, ideal for your Mumbai University BE NLP exam.

Open Class Words and Closed Class Words in English

1. Open Class Words

Definition:

Open class words are content words that carry the main meaning in a sentence.

• They accept new words easily as the language evolves.

• The class is "open", meaning new words are regularly added.

Types of Open Class Words:

1. Nouns (names of people, places, things, ideas)

2. Verbs (actions or states)

3. Adjectives (describing words)

4. Adverbs (describe actions, adjectives, or other adverbs)

Examples:

Part of Speech Examples

Nouns book, computer, Mumbai, happiness

Verbs run, code, eat, analyze

Adjectives fast, intelligent, beautiful

Adverbs quickly, silently, very

Why Called "Open Class"?

• New words are frequently added, especially with technology and trends.
Example:

• Selfie (new noun)

• Google (verb, "to google something")


2. Closed Class Words

Definition:

Closed class words are function words that help in the grammatical structure of a sentence.

• They rarely accept new words.

• The class is "closed" because new members are not easily added.

Types of Closed Class Words:

1. Pronouns (he, she, it, they)

2. Prepositions (in, on, at, by)

3. Conjunctions (and, but, or, because)

4. Articles/Determiners (the, a, an, this, that)

5. Auxiliary Verbs (is, are, have, do)

Examples:

Part of Speech Examples

Pronouns he, she, it

Prepositions in, on, under

Conjunctions and, but, because

Articles the, a, an

Auxiliary Verbs is, are, have

Why Called "Closed Class"?

• They are fixed and stable, not open to new additions.


Example:

• The word "and" or "the" has existed for centuries without change.

Key Differences Between Open and Closed Class Words:

Aspect Open Class Words Closed Class Words


Meaning Carry main content/meaning Serve grammatical function

Change New words frequently added Rarely change; stable set

Examples Nouns, Verbs, Adjectives, Adverbs Pronouns, Prepositions, Conjunctions

Function Express concepts and ideas Link and organize words in sentences

Other Name Content Words Function Words

Quick Revision Examples:

Open Class:

• Noun: Student

• Verb: Study

• Adjective: Brilliant

• Adverb: Quickly

Closed Class:

• Pronoun: She

• Preposition: On

• Conjunction: And

• Article: The

Importance in NLP:

• Open class words are important in information extraction and semantic analysis.

• Closed class words are crucial in syntactic parsing and grammar structure.

Let me know if you need a diagram or mnemonics to memorize this!


Want to go for the next question?

Q: What is rule base machine translation?

Here’s an exam-friendly answer for Rule-Based Machine Translation (RBMT), suitable for your
Mumbai University BE NLP paper:
Rule-Based Machine Translation (RBMT)

Definition:

Rule-Based Machine Translation (RBMT) is a method of automatically translating text from one
language (source language) to another (target language) using linguistic rules and bilingual
dictionaries.

It is based on grammatical rules, morphological analysis, and syntactic structures of both languages.

How RBMT Works:

1. Analysis Phase:

o The source language text is analyzed morphologically and syntactically.

o Identifies parts of speech, sentence structure, word meanings, etc.

2. Transfer Phase:

o The system translates the analyzed text into the target language structure.

o Grammar rules and translation dictionaries are applied.

3. Generation Phase:

o The translated structure is generated in the target language, following its grammar
and syntax.

o Produces grammatically correct sentences.

Types of Rule-Based Machine Translation:

1. Direct Translation:

o Word-for-word translation.

o Simplest method but often ignores grammar.


Example:

▪ English: How are you?


▪ Hindi (direct): कैसे हो आप? (Incorrect order)

2. Transfer-Based Translation:

o Uses intermediate representations (syntax trees, etc.) for better structure.


Example:

▪ English → Intermediate → Hindi

▪ English: How are you?

▪ Hindi: आप कैसे हैं ? (Correct structure)

3. Interlingua-Based Translation:

o Converts the source language into a universal language-independent meaning representation.

o Then translates it into the target language.


Example:

▪ English → Interlingua → Hindi

▪ This approach allows multi-language translation.

Advantages of RBMT:

• Provides high-quality translations for well-defined languages.

• Translations are consistent, following predefined grammar rules.

• Useful for specialized domains where rules are well-defined (legal, medical).

Disadvantages of RBMT:

• Requires a lot of linguistic knowledge and manual effort to create rules and dictionaries.

• Difficult to scale to many languages due to complexity.

• Translations may be rigid and lack natural fluency.

Example of RBMT Translation:

English Sentence:
The student reads a book.

RBMT Process:
• Morphological Analysis:

o Noun (student), Verb (reads), Noun (book)

• Syntactic Structure:

o Subject → Verb → Object

• Transfer Rules (English → Hindi):

o Subject → Object → Verb


Hindi Output:
छात्र एक किताब पढ़ता है।
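As a rough illustration of the analysis, transfer, and generation phases (not a real MT system), the following Python sketch applies a single SVO to SOV transfer rule with a tiny hypothetical bilingual dictionary; the dictionary entries and the hard-coded sentence pattern are assumptions made only for this example:

bilingual_dict = {            # hypothetical English -> Hindi dictionary
    "student": "छात्र",
    "reads": "पढ़ता है",
    "a": "एक",
    "book": "किताब",
}

def analyze(sentence):
    # Very rough analysis: assumes the pattern "The <subject> <verb> a <object>."
    words = sentence.rstrip(".").split()
    return {"subject": words[1], "verb": words[2], "object": words[4]}

def transfer_and_generate(parts):
    # Transfer rule: English SVO -> Hindi SOV, then generate the Hindi words
    return "{} {} {} {}।".format(
        bilingual_dict[parts["subject"]],
        bilingual_dict["a"],
        bilingual_dict[parts["object"]],
        bilingual_dict[parts["verb"]],
    )

print(transfer_and_generate(analyze("The student reads a book.")))
# छात्र एक किताब पढ़ता है।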

Applications of RBMT:

• Google Translate (early versions)

• SYSTRAN (French-English translation system)

• Government translation systems (e.g., Anusaaraka in India)

Comparison with Other MT Approaches:

Basis: RBMT uses grammar rules + dictionaries; SMT uses statistical patterns in data; NMT uses deep learning and neural networks.

Flexibility: RBMT is low (rule-based); SMT is medium (based on data); NMT is high (context understanding).

Manual effort: RBMT high; SMT medium; NMT low (requires data instead).

Fluency: RBMT medium; SMT medium; NMT high.

Summary Line (For Revision):


"RBMT uses hand-crafted linguistic rules and bilingual dictionaries to perform direct, transfer-
based, or interlingua-based machine translation."

Let me know if you want an easy-to-remember flow diagram for the RBMT process! Ready for the
next question?
Q: Explain with suitable example following relationships between word meanings. Homonymy,
Polysemy, Synonymy, Antonymy

Absolutely! Here's a clear and concise answer on the relationships between word meanings,
covering Homonymy, Polysemy, Synonymy, and Antonymy, perfect for your Mumbai University BE
NLP exams.

Relationships Between Word Meanings

1. Homonymy

Definition:

Homonymy refers to two or more words that sound the same or are spelled the same, but have
different meanings and different origins.

Homonyms can be:

• Homophones (same pronunciation, different meaning/spelling)

• Homographs (same spelling, different meaning/pronunciation)

Example:

• Bank

o Bank (Financial institution): I deposited money in the bank.

o Bank (Side of a river): He sat on the river bank.

Explanation:
Same word “bank”, but different meanings and no relation between them.

2. Polysemy

Definition:

Polysemy refers to a single word that has multiple related meanings.

Example:

• Mouth

o Mouth (Part of the body): She opened her mouth to speak.

o Mouth (Opening of a cave/river): They reached the mouth of the river.

Explanation:
The meanings are related (both refer to an opening).
3. Synonymy

Definition:

Synonymy refers to two or more words that have similar or identical meanings.

Example:

• Begin and Start

o Please begin your work.

o Please start your work.

Explanation:
Both “begin” and “start” have similar meanings (to initiate something).

4. Antonymy

Definition:

Antonymy refers to two words that have opposite meanings.

Example:

• Hot ↔ Cold

o The tea is hot.

o The water is cold.

Explanation:
The words “hot” and “cold” are opposites in temperature.

Quick Comparison Table

Homonymy: Same spelling/pronunciation, unrelated meanings. Example: Bat (animal) / Bat (cricket bat)

Polysemy: One word with multiple related meanings. Example: Foot (of a person) / Foot (of a mountain)

Synonymy: Words with similar meanings. Example: Big / Large

Antonymy: Words with opposite meanings. Example: Up / Down
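These relations can also be explored programmatically. A small sketch using NLTK's WordNet interface is shown below; it assumes nltk is installed and the WordNet corpus has been downloaded with nltk.download('wordnet'):

from nltk.corpus import wordnet as wn

# Homonymy/polysemy: "bank" has several distinct senses in WordNet
for syn in wn.synsets("bank")[:4]:
    print(syn.name(), "->", syn.definition())

# Synonymy: lemmas that share a synset are synonyms
print(wn.synsets("begin")[0].lemma_names())

# Antonymy: WordNet stores antonym links on lemmas
for syn in wn.synsets("hot", pos=wn.ADJ):
    for lemma in syn.lemmas():
        if lemma.antonyms():
            print(lemma.name(), "<->", lemma.antonyms()[0].name())
            break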


Importance in NLP:

• Understanding these relationships helps in semantic analysis, word sense disambiguation, machine translation, etc.

Let me know if you want diagrams or mnemonics to memorize these! Ready for the next one?

Q: Explain perplexity of any language model.

Here's a simple and exam-friendly explanation of Perplexity in Language Models, tailored for your
Mumbai University BE NLP paper:

Perplexity of a Language Model

Definition:

Perplexity is a measurement used to evaluate how well a language model predicts a sample.
It shows how confused or uncertain the model is when it makes predictions.

Lower perplexity means the model is better at predicting the next word in a sequence.
Higher perplexity means the model is less accurate (more confused).

Intuition Behind Perplexity:

Think of perplexity as a “surprise score”:


If the model is less surprised by the actual next word, its perplexity is low.
If it’s very surprised, the perplexity is high.

Mathematical Formula:

For a probability distribution given by a language model over a test set, perplexity (PP) is calculated
as:

PP(W) = P(w_1, w_2, ..., w_N)^(-1/N)

Or equivalently:

PP(W) = 2^{H(W)}

Where:

• H(W) = cross-entropy of the language model on the test data

• N = number of words in the test data

• P(w_1, w_2, ..., w_N) = probability assigned by the model to the test sequence

Example (Easy to Understand):

Suppose a language model is predicting the next word out of 4 possibilities:

• If the model is 100% sure, perplexity = 1 (no confusion).

• If it’s equally uncertain about all 4 choices, perplexity = 4 (maximum confusion).
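The same quantity can be computed directly from the per-word probabilities a model assigns. A minimal sketch (the probability values are made up for illustration):

import math

# Hypothetical probabilities a model assigns to each word of a 5-word test text
word_probs = [0.2, 0.1, 0.25, 0.05, 0.4]

# PP(W) = P(w_1 ... w_N)^(-1/N), computed in log space to avoid underflow
N = len(word_probs)
log_prob = sum(math.log2(p) for p in word_probs)   # log2 of P(w_1 ... w_N)
perplexity = 2 ** (-log_prob / N)                  # equivalently 2^H(W)

print(round(perplexity, 2))   # about 6.3: roughly 6 equally likely choices per word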

Interpretation:

Perplexity = Average number of choices the model has at each step.


Lower is better!

Perplexity and Language Models:

• A bigram model (predicts word based on the previous one) may have higher perplexity.

• A trigram model (predicts word based on the previous two) usually has lower perplexity,
because it captures more context.

Example in NLP tasks:

• Perplexity is used to evaluate models like N-gram models, Neural language models, and
GPT-like models.

Key Points to Remember:

Concept Explanation

Perplexity measures How well the model predicts the next word

Low perplexity Good prediction, low uncertainty

High perplexity Poor prediction, high uncertainty

Ideal perplexity Close to 1 (perfect prediction, least confusion)

Summary Line (For Revision):


"Perplexity is a measure of a language model's uncertainty in predicting the next word; lower
perplexity indicates a better model."
Let me know if you want a sample calculation for perplexity! Ready for the next question?

Q: Explain the role of FSA in morphological analysis?

Here's a clear and exam-ready answer to “Explain the role of FSA in Morphological Analysis”,
perfect for your Mumbai University BE NLP paper!

Role of Finite State Automata (FSA) in Morphological Analysis

What is Morphological Analysis?

Morphological analysis is the process of analyzing the structure of words and breaking them
down into morphemes (smallest units of meaning), such as roots, prefixes, and suffixes.

Example:

• Word: “unhappiness”

• Morphemes: un- (prefix) + happy (root) + -ness (suffix)

What is FSA (Finite State Automata)?

An FSA is a computational model made up of states and transitions, used to recognize patterns
or sequences of inputs.

In NLP, FSAs are used to model word structures, morphological rules, and to process text
efficiently.

Role of FSA in Morphological Analysis:

1. Lexical Recognition (Word Recognition):

o FSA can recognize valid word forms by traversing states that represent morphemes
(prefix, root, suffix).

o It checks whether a given word conforms to the morphological rules of a language.

2. Example:
FSA can recognize that “cats” is composed of cat + s (root + plural marker).

3. Word Generation (Morphological Generation):


o FSA can generate valid words from their morphemes by following state transitions.

o Example: Adding -ed, -ing, -s, etc., to base words.

4. Morphological Parsing (Decomposition):

o FSA breaks a word into its constituent morphemes.

o Example:
Input = “replayed”
Output = re- (prefix) + play (root) + -ed (suffix)

5. Efficiency and Speed:

o FSAs are fast and simple, suitable for processing large dictionaries and efficient
morphological analysis.

6. Handling Inflectional and Derivational Morphology:

o FSAs can model both:

▪ Inflectional (walk → walked)

▪ Derivational (happy → happiness)

Example:

For the word “playing”:

• Start → play (root) → -ing (suffix) → Accept state

• FSA accepts the word as a valid combination of root + suffix.

Diagram (For understanding):

Start → [Root: play] → [-ing] → Accept
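A minimal Python sketch of this acceptor is given below; instead of an explicit state-transition table it recognizes the same root + suffix language, and the tiny root and suffix lists are assumptions for illustration only:

ROOTS = {"play", "walk", "talk"}          # lexicon of known roots
SUFFIXES = {"", "s", "ed", "ing"}         # "" allows the bare root

def accepts(word):
    # Accept if the word splits into a known root followed by a known suffix
    for i in range(1, len(word) + 1):
        if word[:i] in ROOTS and word[i:] in SUFFIXES:
            return True                   # reached the accept state
    return False

for w in ["playing", "walked", "talk", "runing"]:
    print(w, "->", "accepted" if accepts(w) else "rejected")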

Types of FSAs Used in Morphology:

Type Use
Finite State Acceptors Recognize whether a word is valid

Finite State Transducers (FST) Map between surface forms and lexical/morphological forms

Finite State Transducers (FST) in Morphology:

FSTs extend FSAs by producing output along with recognizing input.


Example: Input = cats, FST maps it to cat + N + PLURAL.

Key Points for Exams:

• FSA is used for efficient morphological recognition and generation.

• It analyzes and generates word forms by modeling morpheme combinations.

• FSAs are simple, fast, and suitable for rule-based morphology systems.

• FSTs enhance FSAs by mapping between forms and features.

Summary Sentence (For quick revision):


"FSAs play a vital role in morphological analysis by recognizing, generating, and parsing word forms
through a simple, efficient state-transition model."

Let me know if you want more examples or a diagram explanation! Ready for the next question?

Q: Explain Different stage involved in NLP process with suitable example.

Here’s a simple and exam-ready answer to “Explain Different Stages Involved in NLP Process with
Suitable Example”, tailored for your Mumbai University BE NLP exam!
I’ve broken it down into clear points so you can remember and write easily.

Stages Involved in Natural Language Processing (NLP)

Natural Language Processing (NLP) involves a series of stages to analyze, understand, and generate
human language by machines.

1. Lexical Analysis (Tokenization)

What Happens?
The text is divided into smaller units such as words, sentences, or tokens.
Purpose:
Identify basic units (tokens) for further processing.

Example:
Input: “NLP is interesting.”
Tokens: [NLP], [is], [interesting], [.]

2. Morphological Analysis

What Happens?
It analyzes the structure of words and breaks them into morphemes (smallest meaningful units).

Purpose:
Understand roots, prefixes, suffixes, etc.

Example:
Word: “unhappiness” → un- + happy + -ness

3. Syntactic Analysis (Parsing)

What Happens?
Check the grammar and structure of the sentence based on grammar rules.

Purpose:
Generate a parse tree that shows the syntactic structure.

Example:
Sentence: “The cat sits on the mat.”

• Noun Phrase (NP): The cat

• Verb Phrase (VP): sits on the mat

4. Semantic Analysis

What Happens?
Understand the meaning of words and sentences.

Purpose:
Map syntactic structures to their meanings.

Example:
“The cat sits on the mat.”
→ It understands that there is a cat that is sitting on a mat.

5. Discourse Analysis
What Happens?
Analyzes relationships between sentences in a paragraph or conversation.

Purpose:
Ensure contextual consistency and reference resolution.

Example:
“Ravi went to the shop. He bought apples.”
→ Understands “He” refers to Ravi.

6. Pragmatic Analysis

What Happens?
Interpret the intended meaning in context, including speaker intention and real-world knowledge.

Purpose:
Derive meaning beyond literal words.

Example:
“Can you pass the salt?”
→ It’s not a question about ability, but a request.

7. Speech Processing (Optional in NLP systems)

What Happens?
Deals with speech-to-text and text-to-speech conversions.

Purpose:
Understand and generate spoken language.

Example:
Voice input: “What is the weather today?”
System converts speech to text → processes it → responds.

Example Showing All Stages (Simplified):

Input Sentence:
“John gave Mary a book.”

Lexical Analysis: Tokens: [John], [gave], [Mary], [a], [book]

Morphological Analysis: Identifies "gave" as the past tense of "give"

Syntactic Analysis: Subject: John, Verb: gave, Objects: Mary, book

Semantic Analysis: Understands John is the giver, Mary is the receiver

Discourse Analysis: Links to prior text or context if mentioned

Pragmatic Analysis: Considers intention, situation, etc.
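A small sketch of the first two stages on this sentence, using NLTK (assumes nltk is installed and the punkt and wordnet resources have been downloaded):

import nltk
from nltk.stem import WordNetLemmatizer

sentence = "John gave Mary a book."

# 1. Lexical analysis (tokenization)
tokens = nltk.word_tokenize(sentence)
print(tokens)   # ['John', 'gave', 'Mary', 'a', 'book', '.']

# 2. Morphological analysis (lemmatizing the verb to its base form)
lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("gave", pos="v"))   # 'give'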

Summary Table for Quick Revision:

Lexical Analysis: Split text into tokens. Example: "NLP is interesting." → [NLP], [is], [interesting], [.]

Morphological Analysis: Analyze word structure. Example: "unhappiness" → un- + happy + -ness

Syntactic Analysis: Check grammatical structure (parsing). Example: "The cat sat on the mat."

Semantic Analysis: Extract meanings. Example: Understand "cat sits on mat"

Discourse Analysis: Understand sentence relations. Example: "He" refers to "Ravi"

Pragmatic Analysis: Understand intention & context. Example: "Can you pass the salt?" → Request

Summary Sentence (For revision):


"NLP consists of stages like lexical, morphological, syntactic, semantic, discourse, and pragmatic
analysis to process and understand human language effectively."

Let me know if you need diagrams, flowcharts, or more examples! Ready for the next question?

Q: Consider the following corpus: <s> I tell you to sleep and rest </s> <s> I would like to sleep for an hour </s> <s> Sleep helps one to relax </s>. List all possible bigrams. Compute conditional probabilities and predict the next word for the word “to”.

Let's break it down step by step in a clear, easy-to-write format for your Mumbai University NLP
exam!
Given Corpus:

<s> I tell you to sleep and rest </s>

<s> I would like to sleep for an hour </s>

<s> Sleep helps one to relax </s>

Step 1: List All Possible Bigrams

Bigram = Sequence of 2 consecutive words.

From Sentence 1:

<s> I

I tell

tell you

you to

to sleep

sleep and

and rest

rest </s>

From Sentence 2:

<s> I

I would

would like

like to

to sleep

sleep for

for an

an hour

hour </s>

From Sentence 3:


<s> Sleep

Sleep helps

helps one

one to

to relax

relax </s>

Step 2: List of All Unique Bigrams (Total List)

<s> I

I tell

tell you

you to

to sleep

sleep and

and rest

rest </s>

I would

would like

like to

sleep for

for an

an hour

hour </s>

Sleep helps

helps one

one to

to relax

relax </s>
Step 3: Count Bigram Occurrences

Bigram Count

<s> I 2

<s> Sleep 1

I tell 1

I would 1

tell you 1

you to 1

would like 1

like to 1

to sleep 2

sleep and 1

sleep for 1

Sleep helps 1

helps one 1

one to 1

to relax 1

and rest 1

rest </s> 1

for an 1

an hour 1

hour </s> 1

relax </s> 1

Step 4: Compute Conditional Probabilities

Formula:

P(w_i | w_{i-1}) = Count(w_{i-1} w_i) / Count(w_{i-1})

Focus on the word “to” (w_{i-1} = “to”)

What are the bigrams starting with "to"?

• to sleep → Count = 2

• to relax → Count = 1

Total occurrences of "to" as the first word in bigrams:

• "to sleep" (2 times)

• "to relax" (1 time)


Total = 3

Probabilities for the next word after “to”:

• sleep: P(sleep | to) = 2 / 3 = 0.6667 (66.67%)

• relax: P(relax | to) = 1 / 3 = 0.3333 (33.33%)

Step 5: Prediction for the Next Word After "to"

Based on probabilities, “sleep” has the highest probability (66.67%).


So, the predicted next word after “to” is:

**sleep**
Final Answer (For your exam):

List of all bigrams:

<s> I, I tell, tell you, you to, to sleep, sleep and, and rest, rest </s>,
<s> I, I would, would like, like to, to sleep, sleep for, for an, an hour, hour </s>,
<s> Sleep, Sleep helps, helps one, one to, to relax, relax </s>

Conditional Probabilities for "to":

• P(sleep | to) = 2/3 = 0.6667

• P(relax | to) = 1/3 = 0.3333

Prediction:

The next word after "to" is sleep, as it has the highest probability.
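The same counts and probabilities can be reproduced with a few lines of Python (a minimal sketch; smoothing is ignored):

from collections import Counter

corpus = [
    "<s> I tell you to sleep and rest </s>",
    "<s> I would like to sleep for an hour </s>",
    "<s> Sleep helps one to relax </s>",
]

unigrams, bigrams = Counter(), Counter()
for sent in corpus:
    words = sent.split()
    unigrams.update(words)
    bigrams.update(zip(words, words[1:]))

# P(w | "to") = Count("to" w) / Count("to")
for (w1, w2), c in sorted(bigrams.items()):
    if w1 == "to":
        print(f"P({w2} | to) = {c}/{unigrams['to']} = {c / unigrams['to']:.4f}")

# Predict the most likely word after "to"
best = max((b for b in bigrams if b[0] == "to"), key=bigrams.get)
print("Predicted next word:", best[1])   # sleep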

Let me know if you want me to explain unigram probabilities or build the full probability table!

Q: Explain Yarowsky bootstrapping approach of semi supervised learning

Here's a clear and easy-to-remember explanation of the Yarowsky Bootstrapping Approach in Semi-Supervised Learning, perfect for your Mumbai University NLP exams:

Yarowsky Bootstrapping Approach of Semi-Supervised Learning

Introduction:

• The Yarowsky Algorithm is a semi-supervised learning technique.

• It was proposed by David Yarowsky in 1995.

• Mainly used for word sense disambiguation (WSD) but also applicable to text classification
tasks.

Key Ideas Behind Yarowsky's Approach:

1. Few Labeled Data + Lots of Unlabeled Data:

1. Starts with a small set of labeled examples.

2. Uses unlabeled data to improve the model.

2. Two Main Assumptions:

1. One Sense per Collocation


A word will likely have the same sense in the same context (collocation).
2. One Sense per Discourse
A word generally keeps the same meaning throughout the entire document.

Step-by-Step Working of Yarowsky Bootstrapping Algorithm:

Step Explanation

1 Start with a small set of labeled data (seed examples).

2 Build an initial classifier (based on features like collocations, context words, etc.).

3 Apply the classifier to the unlabeled data to assign labels to high-confidence examples.

4 Add these high-confidence predictions to the labeled dataset (bootstrapping).

5 Retrain the classifier on the new labeled set (old + new examples).

6 Repeat steps 3-5 until no more new high-confidence predictions are found (or convergence
is reached).

Example of Yarowsky Bootstrapping in Word Sense Disambiguation:

Example Word: "Plant"

• Sense 1: Factory

• Sense 2: Living organism (tree, flower)

Step 1:

• Label a few examples:


“The plant produces electricity” → Factory
“Water the plant daily” → Tree

Step 2:

• Find collocations and context words:

o Factory: produces, workers, electricity

o Tree: water, leaf, garden

Step 3:

• Use these patterns to label unlabeled examples with high confidence.

Step 4:
• Add high-confidence predictions to training data.

• Repeat the process to improve accuracy.
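A generic skeleton of this bootstrapping loop is sketched below; the classifier, the confidence function, and the 0.9 threshold are placeholders (assumptions for illustration), not part of Yarowsky's original specification:

def bootstrap(labeled, unlabeled, train_classifier, predict_with_confidence,
              threshold=0.9, max_iters=10):
    # labeled: list of (example, label) pairs; unlabeled: list of examples
    clf = None
    for _ in range(max_iters):
        clf = train_classifier(labeled)                 # (re)train on current seed set
        newly_labeled = []
        for x in list(unlabeled):
            label, confidence = predict_with_confidence(clf, x)
            if confidence >= threshold:                 # keep only confident predictions
                newly_labeled.append((x, label))
                unlabeled.remove(x)
        if not newly_labeled:                           # convergence: nothing new was added
            break
        labeled.extend(newly_labeled)                   # grow the labeled set
    return clf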

Advantages of Yarowsky's Bootstrapping Approach:

1. Works well with very little labeled data.

2. Efficient for word sense disambiguation.

3. Can be applied to other classification tasks.

Disadvantages:

1. Depends on initial labeled seeds.

2. Can propagate errors if wrong predictions are added.

Diagram (For easy understanding in exams):

Small Labeled Data → Initial Classifier → Predict Unlabeled Data

↑ ↓

←←←←←←←←←←←←← Add High-Confidence Predictions ←←←←←←←←←

Final Answer for Exams (Short Revision):

Yarowsky Bootstrapping is a semi-supervised learning algorithm that starts with a small set of
labeled examples, builds a classifier, and then iteratively labels unlabeled data by adding high-
confidence predictions to the training data. It works on the assumptions of one sense per
collocation and one sense per discourse, and was initially designed for word sense disambiguation
tasks.

Let me know if you want a flowchart or MCQs based on this topic for revision!

Q: What is POS tagging? Discuss various challenges faced by POS tagging.

Here's a clear and structured answer for POS Tagging and its challenges, perfect for your
Mumbai University NLP exams:

What is POS Tagging?


POS (Part-of-Speech) Tagging is the process of assigning a part of speech label (like noun, verb,
adjective, etc.) to each word in a sentence based on its definition and context.

Example:

Sentence: "The cat sat on the mat."

POS Tags:

The (Determiner)

cat (Noun)

sat (Verb)

on (Preposition)

the (Determiner)

mat (Noun)

Common POS Tags:

POS Meaning

NNP Proper noun

NN Common noun

VB Verb (base)

JJ Adjective

RB Adverb

IN Preposition

DT Determiner
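As a quick illustration, NLTK ships an off-the-shelf tagger (assumes nltk plus the punkt and averaged_perceptron_tagger resources are installed); the tags in the comment are the typical output:

import nltk

tokens = nltk.word_tokenize("The cat sat on the mat.")
print(nltk.pos_tag(tokens))
# Typical output:
# [('The', 'DT'), ('cat', 'NN'), ('sat', 'VBD'), ('on', 'IN'),
#  ('the', 'DT'), ('mat', 'NN'), ('.', '.')]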

Why is POS Tagging Important?

• Helps in syntactic parsing.

• Useful for Named Entity Recognition (NER).

• Important for Information Retrieval, Machine Translation, and Speech Recognition.


Challenges in POS Tagging

1. Ambiguity of Words: Many words can have multiple parts of speech depending on context. Example: "Book" can be a noun (Read a book) or a verb (Book a ticket).

2. Unknown Words (Out-of-Vocabulary): New words, slang, technical terms, or foreign words may not be present in the training data, making them hard to tag.

3. Context Dependency: The same word can have different POS tags depending on sentence structure. Example: "Fast": He runs fast (adverb); He drives a fast car (adjective).

4. Free Word Order Languages: Languages like Hindi have flexible word order, making POS tagging harder compared to fixed word order languages like English.

5. Complex Sentence Structures: Long sentences, nested clauses, and coordination increase complexity.

6. Multiword Expressions (MWEs): Phrases like "kick the bucket" or "give up" act as a single semantic unit, but tagging individual words can cause confusion.

7. Domain Adaptation: A POS tagger trained on news articles may not work well on medical texts, social media, or legal documents due to domain-specific vocabulary.

8. Lack of Annotated Data: For low-resource languages, there may be insufficient labeled corpora to train accurate POS taggers.

Short Answer for Quick Revision:

POS Tagging assigns parts of speech (noun, verb, adjective, etc.) to words in a sentence based on
context.
Challenges include word ambiguity, unknown words, context dependency, free word order,
complex sentence structure, multiword expressions, domain adaptation, and lack of annotated
data.

Let me know if you want a diagram, flowchart, or MCQ practice for this topic!
Q: What are the limitations of Hidden Markov Model?

Here’s a clear, exam-friendly answer for the limitations of Hidden Markov Models (HMMs) in
Natural Language Processing (NLP)—perfect for your Mumbai University BE exams in Artificial
Intelligence & Machine Learning!

Limitations of Hidden Markov Models (HMMs)

Hidden Markov Models (HMMs) are widely used for tasks like POS tagging, speech recognition, and
sequence labeling in NLP.
However, they have several limitations, as listed below:

1. Strong Independence Assumptions

• HMM assumes that the current state depends only on the previous state (first-order Markov
assumption).
This ignores long-range dependencies between words in a sentence.

• In natural language, context often depends on more than just the previous word.

2. Limited Feature Incorporation

• HMMs cannot easily handle multiple features like:


Word prefixes, suffixes
Capitalization
POS context beyond one state

• Only observations (typically words) are used for predictions.

• Difficult to include rich features without complicating the model.

3. Generative Model Limitation

• HMM is a generative model, meaning it models joint probabilities:


P(observations, states)

• It tries to model the entire data distribution, which can be complex and inefficient,
especially when discriminative models like Conditional Random Fields (CRFs) focus directly
on prediction (P(states | observations)).

4. Data Sparsity

• Requires large amounts of labeled training data to estimate:


Transition probabilities
Emission probabilities
• With limited data, HMMs can suffer from data sparsity problems.

5. Assumes Fixed Number of States

• You must predefine the number of hidden states.


If you choose too few, the model underfits.
If you choose too many, the model overfits.

6. Inability to Handle Complex Dependencies

• HMMs struggle with non-Markovian dependencies:


Example: Subject-verb agreement in a sentence might depend on earlier parts of the
sentence, not just the previous word.

7. Time Complexity

• For large vocabularies and many states, the computation of probabilities (using algorithms
like Viterbi) can become slow and expensive.

Short Summary for Revision:

Limitations of HMMs:

• Strong independence assumptions (ignores long dependencies).

• Limited feature integration.

• Generative model (less effective for classification tasks).

• Data sparsity issues.

• Requires predefined number of states.

• Can't handle complex dependencies.

• Computational complexity for large datasets.

Bonus Tip (for Extra Marks!):

Alternative models like CRFs and Neural Networks (LSTMs, BERT) overcome many of these
limitations.

Let me know if you want a comparison table between HMM and CRF, or examples of real-world
HMM applications!

Q: Explain the different steps in text processing for Information Retrieval


Here's a clear and structured answer to “Different steps in text processing for Information
Retrieval (IR)”, perfect for your Mumbai University BE - AI/ML NLP exams!

Text Processing in Information Retrieval (IR)

Text processing is a pre-processing stage in an Information Retrieval System (IRS). It involves preparing raw text data for efficient storage, indexing, and retrieval.

Steps in Text Processing for IR:

1. Document Collection / Corpus Preparation

• Collect a large set of documents (web pages, articles, books, etc.).

• This corpus becomes the data source for indexing and searching.

2. Tokenization

• Break down text into smaller units called tokens (words, phrases, symbols).
Example:

Text: "Information Retrieval is powerful."

Tokens: ["Information", "Retrieval", "is", "powerful"]

3. Stop Words Removal

• Common, frequent words that do not add significant meaning are removed.
Example:

• Stop words: “is”, “the”, “and”, “a”, etc.

• Reduces index size and speeds up search.

4. Normalization / Case Folding

• Convert all text to lowercase to ensure uniformity.


Example:

• “Retrieval” → “retrieval”

• Helps to avoid case sensitivity issues during search.

5. Stemming and Lemmatization


• Stemming: Reduces words to their root form by chopping off suffixes.
Example:

• “running”, “runs”, “ran” → “run”

• Uses algorithms like Porter Stemmer.

• Lemmatization: Reduces words to their dictionary form (lemma) using morphological analysis.
Example:

• “better” → “good”

• More accurate than stemming but slower.

6. Indexing

• Create an inverted index mapping terms to document IDs where they appear.
Example:

Term: "retrieval" → Docs: [1, 3, 7]

• Enables fast searching and ranking.

7. Term Weighting (TF-IDF)

• Assign weights to terms to measure their importance in documents.

• TF (Term Frequency): Frequency of a term in a document.

• IDF (Inverse Document Frequency): How rare a term is across all documents.

• TF-IDF score ranks documents based on relevance.
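A short sketch of TF-IDF weighting with scikit-learn (assumes scikit-learn is installed; the three toy documents are made up):

from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "information retrieval is powerful",
    "retrieval of documents uses an inverted index",
    "search engines rank documents by relevance",
]

vectorizer = TfidfVectorizer(stop_words="english")   # also removes stop words
tfidf = vectorizer.fit_transform(docs)               # documents x terms matrix

# Weight of each term in the first document
terms = vectorizer.get_feature_names_out()
for idx in tfidf[0].nonzero()[1]:
    print(terms[idx], round(tfidf[0, idx], 3))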

8. Query Processing
• User query goes through same preprocessing steps (tokenization, stop word removal,
stemming).

• Query terms are matched with the inverted index to retrieve documents.

9. Ranking and Retrieval

• Retrieved documents are ranked based on relevance scores (TF-IDF, BM25, etc.).

• Top-ranked documents are displayed to the user.

10. Relevance Feedback (Optional)

• The system may use user feedback (clicks, ratings) to improve future searches.

Short Answer for Revision:

The main steps in text processing for IR are:


Corpus preparation → Tokenization → Stop words removal → Normalization →
Stemming/Lemmatization → Indexing → Term weighting (TF-IDF) → Query processing → Retrieval
& Ranking → Relevance feedback (optional).

Diagram Request? Want a flowchart or diagram to make this easier to remember?


Or need examples of indexing and TF-IDF computation? Just let me know!

Q: Compare top-down and bottom-up approach of parsing with example.

Here's a clear, structured answer comparing Top-Down and Bottom-Up Parsing—perfect for
your Mumbai University BE exams in NLP!

Comparison of Top-Down and Bottom-Up Parsing

In Natural Language Processing (NLP), parsing is the process of analyzing a sentence to understand
its syntactic structure, typically represented by a parse tree.

Two main parsing strategies are:

• Top-Down Parsing

• Bottom-Up Parsing

Direction of Parsing
• Top-Down: Starts from the root (start symbol) and goes down to the leaves (input words).
• Bottom-Up: Starts from the leaves (input words) and goes up to the root (start symbol).

Approach
• Top-Down: Predictive; tries to predict the structure from the grammar.
• Bottom-Up: Constructive; builds the structure from the input tokens.

Process
• Top-Down: Begins with the start symbol (e.g., S) and applies production rules to generate the sentence.
• Bottom-Up: Begins with the input sentence and applies reductions to reach the start symbol.

Goal
• Top-Down: Tries to derive the sentence from the start symbol.
• Bottom-Up: Tries to reduce the sentence into the start symbol.

Type of Grammar
• Top-Down: Works best with context-free grammars that are non-left-recursive.
• Bottom-Up: Can handle left recursion better and works well with ambiguous grammars.

Example (Sentence: "The cat sleeps", Rule: S → NP VP)
• Top-Down: Start with S; expand S → NP VP; expand NP → Det N ("The cat"); expand VP → V ("sleeps"). Prediction first, match later.
• Bottom-Up: "The" → Det; "cat" → N; combine Det + N → NP; "sleeps" → V; combine NP + V → S. Build parts first, combine later.

Advantages
• Top-Down: Simpler and intuitive; easy to implement.
• Bottom-Up: Efficient for complex grammars; handles ambiguity better.

Disadvantages
• Top-Down: Struggles with left recursion; may explore unnecessary possibilities (inefficient).
• Bottom-Up: Can be less intuitive; more complex to implement in some cases.

Example for Both Approaches

Grammar:

S → NP VP
NP → Det N

VP → V

Det → "The"

N → "cat"

V → "sleeps"

Top-Down Parsing (Predictive)

• Start with S

• Expand: S → NP VP

• Expand: NP → Det N

• Match: Det → The, N → cat

• Expand: VP → V

• Match: V → sleeps

Bottom-Up Parsing (Constructive)

• Start with words: "The", "cat", "sleeps"

• "The" → Det

• "cat" → N

• Combine: Det + N → NP

• "sleeps" → V

• Combine: NP + V → S
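A minimal top-down (recursive-descent style) recognizer for this toy grammar, written as a Python sketch; a real parser would also build the parse tree and backtrack over alternative rules:

DET, N, V = {"The", "the"}, {"cat"}, {"sleeps"}

def parse_NP(tokens):                # NP -> Det N
    if len(tokens) >= 2 and tokens[0] in DET and tokens[1] in N:
        return tokens[2:]            # return the unconsumed remainder
    return None

def parse_VP(tokens):                # VP -> V
    if tokens and tokens[0] in V:
        return tokens[1:]
    return None

def parse_S(tokens):                 # S -> NP VP
    rest = parse_NP(tokens)
    if rest is None:
        return False
    rest = parse_VP(rest)
    return rest == []                # success only if every token is consumed

print(parse_S("The cat sleeps".split()))   # True
print(parse_S("The cat".split()))          # False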

Short Summary for Quick Revision

Top-Down Bottom-Up

Start from S, predict sentence Start from words, build up to S

Predictive parsing Shift-reduce parsing

Can't handle left recursion well Can handle left recursion better

Example: Expand rules first Example: Combine tokens first


Let me know if you want me to explain specific parsing algorithms (like Recursive Descent, CYK,
or Earley) or draw parse trees!

Q: What do you mean by word sense disambiguation (WSD)? Discuss dictionary based approach for
WSD.

Here's a clear and exam-friendly answer for Word Sense Disambiguation (WSD) and its
Dictionary-Based Approach, tailored for your Mumbai University BE AI/ML NLP paper!

What is Word Sense Disambiguation (WSD)?

Word Sense Disambiguation (WSD) is the task of identifying the correct meaning (sense) of a word
in context when the word has multiple meanings (polysemy).

Example:

• Word: “Bank”

o Meaning 1: A financial institution

o Meaning 2: The side of a river

Sentence: “I deposited money in the bank.”


Bank → Financial institution

Sentence: “He sat on the bank of the river.”


Bank → Riverbank

WSD helps machines understand and process natural language accurately by selecting the
correct sense of words.

Dictionary-Based Approach for WSD

The dictionary-based approach uses lexical resources like dictionaries, thesauri, or WordNet to
identify the correct sense of a word.

Main Idea:

• It compares the context of the ambiguous word with the definitions (glosses) of each
possible sense in the dictionary.

• Selects the sense that best matches the context.

Types of Dictionary-Based Approaches:

1. Lesk Algorithm (Most Common)


• Compares the dictionary definitions (glosses) of each sense of the target word with the
context words.

• The sense whose gloss has the largest overlap with the context is selected.

Example:
Target Word: “Plant”

• Gloss 1: "An industrial building where manufactured goods are produced."

• Gloss 2: "A living organism lacking the power of locomotion."


Context sentence: "The workers left the plant after the shift."
Context words such as "workers" and "shift" are more related to Gloss 1 (industrial building), so that sense is selected.
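A toy gloss-overlap (Lesk-style) sketch is shown below. The glosses are the two quoted above; the context sentence is altered slightly so that a raw word overlap is visible, since full Lesk variants also compare example sentences and the glosses of neighbouring words:

glosses = {
    "factory":  "an industrial building where manufactured goods are produced",
    "organism": "a living organism lacking the power of locomotion",
}

def simple_lesk(context_sentence):
    # Pick the sense whose gloss shares the most words with the context
    context = set(context_sentence.lower().rstrip(".").split())
    scores = {sense: len(context & set(gloss.split()))
              for sense, gloss in glosses.items()}
    return max(scores, key=scores.get), scores

print(simple_lesk("The plant produced manufactured goods for industrial use"))
# ('factory', {'factory': 4, 'organism': 1})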

2. Extended Lesk Algorithm

• In addition to glosses of the target word, it also compares glosses of neighboring/context


words.

• Increases the overlap comparison, providing better accuracy.

3. Use of Semantic Networks (WordNet)

• WordNet provides synonyms, definitions, hypernyms, hyponyms, etc.

• Algorithms match these semantic relations with context to choose the correct sense.

Advantages of Dictionary-Based WSD

• Easy to implement.

• No training data required (unsupervised).

• Useful when corpora are not available.

Limitations

• Dependent on the quality of the dictionary.

• Glosses may be short, causing limited overlaps.

• Not as accurate as machine learning or deep learning methods.

Summary for Revision:

Concept Word Sense Disambiguation (WSD)


Definition Identifying the correct meaning of a word in context.

Dictionary-Based Approach Uses dictionaries (WordNet) to compare glosses and select sense.

Common Method Lesk Algorithm (overlap of gloss and context words).

Pros Simple, unsupervised, no training data needed.

Cons Limited by dictionary quality and gloss length.

Let me know if you want me to explain the Lesk Algorithm step-by-step, or show
diagrams/flowcharts for better understanding!

Q: Explain Hobbs algorithm for pronoun resolution.

Here's a clear, exam-oriented answer for Hobbs Algorithm used in Anaphora (Pronoun)
Resolution, tailored for your Mumbai University BE AI/ML NLP exam.

What is Anaphora Resolution?

Anaphora resolution is the task of finding the antecedent (reference) of a pronoun or anaphoric
expression in a sentence.

Example:

• "John went to the park. He played football there."


Here, "He" refers to "John".

Hobbs Algorithm for Pronoun Resolution

Definition:

Hobbs Algorithm is a syntactic, structure-based algorithm used for pronoun resolution.


It searches parse trees to find the antecedent for pronouns (like he, she, it, they, etc.).

Steps of Hobbs Algorithm

1. Start at the NP node immediately dominating the pronoun.

2. Go up the tree to the first NP or S node. Call this node X, and the path you took P.

3. Traverse all branches below node X to the left of path P, in left-to-right, breadth-first order.
Any NP encountered is a potential antecedent.
4. If X is the highest S node, continue to step 5.
Otherwise:

o Go up to the next NP or S node, call it X, and repeat step 3.

5. If you reach the highest S node without finding a candidate,

o Continue with previous sentences and search in left-to-right breadth-first order


through their parse trees.

Explanation with Example

Example Sentence:

“John went to the park. He played football there.”

Parse Trees:

• S
├─ NP (John)
└─ VP (went ...)

• S
├─ NP (He)
└─ VP (played ...)

Pronoun: "He"
Start at NP (He)

• Move up to S

• Traverse left of He, but nothing is there

• Move up to previous sentence

• Search left to right, find NP (John)


So "He" refers to John.

Advantages of Hobbs Algorithm

• Simple and deterministic

• Works well for syntactic pronoun resolution

• Uses syntactic parse trees, making it systematic

Limitations

• Relies heavily on correct parse trees

• Doesn't handle semantic or pragmatic cues


• May fail when context understanding is required

Summary for Quick Revision:

Feature Details

Algorithm Type Syntax-based, Tree Traversal

Main Goal Resolve pronouns (find antecedents)

Method Traverses parse trees in a systematic order

Key Steps Go up the tree → Traverse left branches → Move to previous sentences

Best For Simple, syntactic pronoun resolution

Limitation No semantic or world knowledge use

Let me know if you want a diagram of tree traversal for better understanding or examples with
complex sentences!

Q: Explain Text summarization in detail.

Here's a simple, clear, and exam-friendly answer for Text Summarization, perfect for your
Mumbai University BE AI/ML NLP exam.

Text Summarization in NLP

Definition:

Text Summarization is the process of automatically generating a shorter version of a long text
document while preserving its key information and meaning.

It aims to reduce the size of the content without losing important points.

Types of Text Summarization

1. Extractive Summarization

• Selects important sentences, phrases, or paragraphs directly from the original text.

• These sentences are extracted and combined to create the summary.

Example:
Original text:
"Natural Language Processing is a field of AI. It deals with interaction between computers and
human languages."
Extractive summary:
"Natural Language Processing is a field of AI."

How it works:

• Scoring sentences based on factors like:

o Frequency of keywords

o Sentence position

o Similarity to the title

• Top scored sentences are picked for the summary.
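A minimal frequency-based extractive summarizer is sketched below; the stop-word list and the short text are simplified assumptions, and real systems would also use sentence-position and title-similarity scores:

from collections import Counter
import re

text = ("Natural Language Processing is a field of AI. "
        "It deals with interaction between computers and human languages. "
        "Many applications such as chatbots use NLP.")

STOP = {"is", "a", "of", "it", "with", "and", "such", "as", "the", "use"}

sentences = [s.strip() for s in text.split(".") if s.strip()]
words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP]
freq = Counter(words)

def score(sentence):
    # Sentence score = summed frequency of its non-stop words
    return sum(freq[w] for w in re.findall(r"[a-z]+", sentence.lower())
               if w not in STOP)

# One-sentence extractive summary: keep the highest-scoring sentence
print(max(sentences, key=score))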

2. Abstractive Summarization

• Generates a new summary by paraphrasing and rewriting the content.

• Uses deep learning models to understand the text and reproduce key ideas in its own
words.

Example:

Original text:
"Natural Language Processing deals with the interaction between computers and human
languages."
Abstractive summary:
"NLP enables computers to understand human language."

How it works:

• Uses Seq2Seq models, Transformer models like BERT, GPT, T5, etc.

• Understands the content at a deeper level and generates human-like summaries.

Techniques Used in Text Summarization

Frequency-Based: Calculates the frequency of words; higher frequency = more important.

TF-IDF: Gives importance to words that are frequent in the document but rare elsewhere.

Graph-Based (TextRank): Builds a graph where sentences are nodes and edges represent similarity; sentences are ranked and the top-ranked ones are selected.

Deep Learning: Uses RNNs, LSTMs, Transformers for abstractive summarization.

Applications of Text Summarization

1. News Summarization (short summaries of long news articles)

2. Search Engines (displaying summaries of search results)

3. Document Management (summarizing legal or medical documents)

4. Chatbots & Virtual Assistants (summarizing user queries or responses)

5. Research Paper Summarization

Advantages of Text Summarization

• Saves time and effort

• Provides quick access to key information

• Useful in content curation and information retrieval

Challenges in Text Summarization

• Understanding context and semantics

• Maintaining coherence and fluency

• Handling domain-specific language

• Abstractive summarization needs complex models and large datasets

Quick Summary for Revision

Concept Details

Definition Automatic generation of a summary

Types Extractive and Abstractive

Techniques Frequency, TF-IDF, TextRank, Transformers


Applications News, Search Engines, Legal, Medical

Challenges Coherence, Context understanding

Let me know if you want examples of algorithms like TextRank or Transformer-based models
explained!

Q: Explain Porter Stemming algorithm in detail.

Here's a clear, exam-oriented answer for the Porter Stemming Algorithm, perfect for your
Mumbai University BE AI/ML NLP exams (NLP Subject).

Porter Stemming Algorithm

Definition:

The Porter Stemming Algorithm is a rule-based algorithm used for reducing words to their
stem/root form.
It helps in removing suffixes from words, thereby simplifying words to a common base form.

Example:

• Connection, Connected, Connecting ➜ Connect

Purpose of Stemming:

• Reduce words to their base form (stem) to improve text processing tasks like:

o Information retrieval

o Text classification

o Search engines

Developed By:

• Martin Porter in 1980

• One of the most widely used stemming algorithms in NLP.

How It Works (High-Level Overview):

• Applies a set of rules in five steps.

• Each rule identifies common suffix patterns and applies conditions to decide whether to
remove or replace them.
• The algorithm processes the word from end to start, chopping off suffixes like -ing, -ed, -es, -
tion, etc.

Detailed Steps of the Porter Stemming Algorithm:

Step 1: Remove Plurals and Past Tense

• Suffixes like -s, -es, -ed, -ing

Example:

• caresses ➜ caress

• ponies ➜ poni

• caressed ➜ caress

• meeting ➜ meet

Step 2: Mapping Double Suffixes to Single Ones

• Converts double suffixes like -ational, -izer, etc.

Example:

• relational ➜ relate

• conditional ➜ condition

• rational ➜ rational

• digitizer ➜ digitize

Step 3: Deal with Suffixes like -ic, -ful, -ness

• Removes suffixes like -icate, -ful, -ness, etc.

Example:

• electrical ➜ electric

• hopefulness ➜ hope

• goodness ➜ good

Step 4: Remove Additional Suffixes

• Removes suffixes like -ant, -ence, -ment, -ion, etc.

Example:
• agreement ➜ agree

• replacement ➜ replace

Step 5: Remove a Final -e

• If a word ends with -e, remove it under specific conditions.

Example:

• probate ➜ probat

• rate ➜ rate (keeps it because of the condition)

• cease ➜ ceas

Example Walkthrough:

Word: "Caresses"

• Step 1a: "caresses" ➜ "caress"

• No further steps apply


Final stem: "caress"

Word: "Relational"

• Step 2: "relational" ➜ "relate"


Final stem: "relate"
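For reference, NLTK provides a ready-made implementation (assumes nltk is installed); a quick sketch:

from nltk.stem import PorterStemmer

stemmer = PorterStemmer()
for word in ["caresses", "ponies", "running", "connection", "university"]:
    print(word, "->", stemmer.stem(word))
# caresses -> caress, ponies -> poni, running -> run,
# connection -> connect, university -> univers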

Advantages of Porter Stemmer:

• Simple and efficient

• Reduces computational complexity in NLP tasks

• Widely used in search engines and information retrieval

Limitations of Porter Stemmer:

• It’s rule-based, so it may over-stem or under-stem

• Sometimes removes too much or too little


Example:

o University ➜ Univers (Not an actual word)

• Does not handle irregular words (ex: "ran" ➜ no stemming)


Applications of Porter Stemming Algorithm:

1. Information Retrieval

2. Search Engines (like Google, Lucene)

3. Text Mining & Classification

4. Chatbots and NLP Pipelines

Quick Summary Table for Revision

Feature Details

Developed by Martin Porter (1980)

Type Rule-based suffix stripping

Steps 5 sequential steps with rules

Goal Reduce words to root/stem form

Advantages Simple, Fast, Efficient

Limitations Over-stemming/Under-stemming

Applications IR systems, NLP, Search Engines

Let me know if you want to see Porter Stemmer pseudocode or an implementation example in
Python!
