NLP - Sem
Aspect | Inflectional Morphology | Derivational Morphology
Effect on Word Meaning | Does not usually change the core meaning of the word; only adds grammatical distinctions. | Often changes the core meaning or the syntactic category of the word.
Morphological Affixes | Includes suffixes such as "-s," "-ed," "-ing," "-est," etc. | Includes affixes like "-er," "-ly," "-ness," "-ful," etc.
Examples of Suffixes | -s (plural), -ed (past tense), -ing (present participle), -est (superlative) | -er (worker), -ly (quickly), -ness (happiness), -ful (careful)
2. Describe the Penn Treebank tag set and its role in NLP.
The Penn Treebank (PTB) Tag Set is a widely used Part-of-Speech (POS) tagging system developed as part
of the Penn Treebank Project. It is used in NLP to label words with their corresponding grammatical roles,
enabling accurate syntactic and semantic analysis. It consists of 45 POS tags, categorizing words into
grammatical types such as nouns, verbs, adjectives, adverbs, prepositions, pronouns, determiners, and
conjunctions.
Example: "The quick brown fox jumps over"
• The → Determiner (DT)
• quick → Adjective (JJ)
• brown → Adjective (JJ)
• fox → Noun (NN)
• jumps → Verb (VBZ)
• over → Preposition (IN)
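A minimal lookup-based illustration of these tags (the mini-lexicon below is hand-built for this one sentence; real taggers, such as NLTK's `pos_tag`, learn tag assignments from annotated corpora like the Penn Treebank itself):

```python
# Hand-built mini-lexicon mapping words to Penn Treebank tags.
PTB_LEXICON = {
    "the": "DT",     # determiner
    "quick": "JJ",   # adjective
    "brown": "JJ",
    "fox": "NN",     # noun, singular
    "jumps": "VBZ",  # verb, 3rd person singular present
    "over": "IN",    # preposition
}

def tag_sentence(sentence):
    """Return (word, PTB tag) pairs; unknown words fall back to 'NN'."""
    return [(w, PTB_LEXICON.get(w.lower(), "NN")) for w in sentence.split()]

print(tag_sentence("The quick brown fox jumps over"))
```

A pure lookup like this cannot resolve ambiguity (e.g. "jumps" as noun vs. verb), which is exactly why statistical taggers are needed.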
Aspect | Rule-Based Tagging | Statistical Tagging
Training Data Requirement | No training data needed | Needs a large annotated corpus
Adaptability | Limited (new rules must be manually added) | More flexible (adapts to different domains and languages)
4. Explain the challenges of multiple tags and unknown words in POS tagging.
1. Challenges of Multiple Tags (Ambiguity in POS Tagging)
In POS tagging, a single word can have multiple possible tags depending on the context. This issue, known
as ambiguity, occurs when a word can function as more than one part of speech.
Types of Ambiguity:
• Lexical Ambiguity: A word can belong to multiple POS categories.
o Example: "Can"
▪ Noun: I bought a can of soda.
▪ Verb: Can you help me?
• Syntactic Ambiguity: The sentence structure leads to multiple interpretations.
o Example: "He saw the man with the telescope."
▪ Did he have the telescope, or did the man have it?
Solutions to Handle Multiple Tags:
• Context-Based Disambiguation: Using neighboring words to determine the correct tag.
• Statistical Models: Hidden Markov Models (HMM) or Conditional Random Fields (CRF) predict the
most likely tag.
• Deep Learning: Neural networks learn from large annotated datasets to resolve ambiguity.
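The HMM approach mentioned above can be sketched with a toy Viterbi decoder. The transition and emission probabilities below are illustrative values chosen by hand, not estimates from a real corpus; the point is only to show how context ("you help" following) pushes the ambiguous word "can" toward the modal-verb tag:

```python
# Toy HMM for POS tagging. States: MD (modal), NN (noun), PRP (pronoun), VB (verb).
# All probabilities are illustrative, not corpus-estimated.
start_p = {"MD": 0.1, "NN": 0.3, "PRP": 0.4, "VB": 0.2}
trans_p = {  # P(next tag | current tag)
    "MD":  {"MD": 0.01, "NN": 0.09, "PRP": 0.40, "VB": 0.50},
    "NN":  {"MD": 0.20, "NN": 0.30, "PRP": 0.20, "VB": 0.30},
    "PRP": {"MD": 0.40, "NN": 0.10, "PRP": 0.10, "VB": 0.40},
    "VB":  {"MD": 0.10, "NN": 0.40, "PRP": 0.30, "VB": 0.20},
}
emit_p = {  # P(word | tag); "can" is ambiguous between MD and NN
    "MD":  {"can": 0.30},
    "NN":  {"can": 0.01, "soda": 0.20},
    "PRP": {"you": 0.50},
    "VB":  {"help": 0.20},
}

def viterbi(words):
    """Return the most likely tag sequence for `words` under the toy HMM."""
    tags = list(start_p)
    # best[i][t] = (probability, backpointer) of the best path ending in tag t
    best = [{t: (start_p[t] * emit_p[t].get(words[0], 0.0), None) for t in tags}]
    for w in words[1:]:
        layer = {}
        for t in tags:
            prob, prev = max(
                (best[-1][p][0] * trans_p[p][t] * emit_p[t].get(w, 0.0), p)
                for p in tags)
            layer[t] = (prob, prev)
        best.append(layer)
    # Trace back from the highest-probability final tag.
    tag = max(best[-1], key=lambda t: best[-1][t][0])
    path = [tag]
    for layer in reversed(best[1:]):
        tag = layer[tag][1]
        path.append(tag)
    return list(reversed(path))

print(viterbi(["can", "you", "help"]))  # → ['MD', 'PRP', 'VB']
```

Here the high emission probability of "you" under PRP and the MD→PRP transition jointly resolve "can" to the modal reading.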
4. Sentiment Analysis
Goal: Identify emotion or opinion in text by labeling words as positive, negative, or neutral.
• Example: "The movie was absolutely amazing!"
o amazing → POSITIVE
• Importance: Used in customer feedback analysis, social media monitoring, and product reviews.
Unit- 4
1. What is lexical semantics? Provide examples.
Lexical semantics is a subfield of linguistics and natural language processing (NLP) that focuses on the
meaning of words and their relationships with each other in a given language. It helps in understanding
how words convey meaning based on their context, structure, and usage.
Key Aspects of Lexical Semantics
1. Word Meaning and Sense
o Words can have multiple senses or meanings depending on context.
o Example:
▪ "She wore a diamond ring." (ring = jewelry)
▪ "I heard the phone ring." (ring = sound)
2. Synonymy (Similar Meaning Words)
o Words with similar meanings but different forms.
o Example: big ≈ large, buy ≈ purchase
3. Antonymy (Opposite Meaning Words)
o Words that have opposite meanings.
o Example: hot ↔ cold, happy ↔ sad
4. Hyponymy and Hypernymy (Word Hierarchies)
o Hyponym: A specific word under a broader category.
o Hypernym: A general category that includes hyponyms.
o Example:
▪ Dog (hyponym) → Animal (hypernym)
▪ Rose (hyponym) → Flower (hypernym)
5. Polysemy (Multiple Related Meanings of a Word)
o A single word with multiple related meanings.
o Example:
▪ Bank (financial institution)
▪ Bank (side of a river)
6. Homonymy (Same Word, Unrelated Meaning)
o Words that sound or look alike but have different meanings.
o Example:
▪ Bat (flying mammal)
▪ Bat (sports equipment)
7. Collocations (Commonly Co-Occurring Words)
o Words that naturally appear together.
o Example:
▪ Fast food (not quick food)
▪ Heavy rain (not big rain)
8. Lexical Ambiguity
o A word with multiple possible interpretations.
▪ Example: "He saw the bat." (bat = animal or sports equipment?)
Aspect | Homonymy | Polysemy | Synonymy
Meaning Relationship | No semantic connection between meanings. | Meanings are related in some way. | Meanings are closely related or identical.
Example Words | Bat (animal) vs. Bat (sports equipment) | Head (of a person) vs. Head (leader of a company) | Big ↔ Large, Happy ↔ Joyful
Example Sentences | He hit the ball with a bat. (sports) / A bat was flying in the cave. (animal) | She is the head of the department. (leader) / He was hit on the head. (body part) | He owns a big house. / He owns a large house.
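These lexical relations can be demonstrated with a hand-built mini-lexicon; the entries below are assumptions for illustration only, while real systems typically query WordNet (e.g. via NLTK's `nltk.corpus.wordnet`):

```python
# Toy lexical-relation lookups over a hand-built mini-lexicon.
HYPERNYMS = {"dog": "animal", "cat": "animal", "rose": "flower", "tulip": "flower"}
SYNONYMS = {"big": {"large", "huge"}, "buy": {"purchase"}, "happy": {"joyful", "glad"}}
SENSES = {
    "bank": ["financial institution", "side of a river"],  # polysemy
    "bat": ["flying mammal", "sports equipment"],          # homonymy
}

def hypernym(word):
    """Return the broader category the word belongs to, if known."""
    return HYPERNYMS.get(word)

def are_synonyms(a, b):
    """True if either word lists the other as a synonym."""
    return b in SYNONYMS.get(a, set()) or a in SYNONYMS.get(b, set())

def is_ambiguous(word):
    """True if the word has more than one listed sense."""
    return len(SENSES.get(word, [])) > 1

print(hypernym("dog"), are_synonyms("big", "large"), is_ambiguous("bat"))
# → animal True True
```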
Unit – 5
1. What is text summarization, and how does LexRank work?
Text summarization is the process of condensing a large text into a shorter, meaningful version while
preserving its key information. It helps in quick information retrieval and is widely used in news
aggregation, search engines, and document summarization.
There are two main types of text summarization:
1. Extractive Summarization
o Selects important sentences/phrases from the original text.
o Example: LexRank, TextRank.
2. Abstractive Summarization
o Generates a summary by paraphrasing and reinterpreting the text.
o Example: Transformer-based models like BERTSUM, T5, GPT.
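LexRank scores sentences by their centrality in a sentence-similarity graph: sentences are nodes, cosine similarities are edge weights, and a PageRank-style power iteration finds the most "central" sentences. A simplified sketch (cosine over raw word counts rather than TF-IDF, and no similarity threshold, both of which full LexRank uses):

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    num = sum(a[w] * b[w] for w in a if w in b)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def lexrank(sentences, damping=0.85, iters=50):
    """Return a centrality score per sentence via PageRank-style iteration."""
    vecs = [Counter(s.lower().split()) for s in sentences]
    n = len(sentences)
    sim = [[0.0 if i == j else cosine(vecs[i], vecs[j]) for j in range(n)]
           for i in range(n)]
    # Row-normalize into a transition matrix; dangling rows become uniform.
    for i in range(n):
        total = sum(sim[i])
        sim[i] = [v / total for v in sim[i]] if total else [1.0 / n] * n
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [(1 - damping) / n
                  + damping * sum(scores[j] * sim[j][i] for j in range(n))
                  for i in range(n)]
    return scores

docs = [
    "The cat sat on the mat",
    "The cat lay on the mat",
    "Stock prices rose sharply today",
]
scores = lexrank(docs)
# The two similar "cat" sentences reinforce each other's centrality, so one
# of them outranks the unrelated sentence and would be extracted first.
print(docs[scores.index(max(scores))])
```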
Metric | Description | Strengths | Weaknesses
BLEU (Bilingual Evaluation Understudy) | Measures precision of n-gram matches between system & reference summaries. | Useful for machine translation-based summarization. | Prefers short summaries; does not capture meaning variations well.
Information Retrieval | Does the summary help find relevant documents faster? | Measures usefulness for search engines. | —
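The clipped n-gram precision at the heart of BLEU can be sketched as follows (real BLEU combines precisions for n = 1 through 4 with a brevity penalty; this shows only a single n):

```python
from collections import Counter

def ngram_precision(candidate, reference, n=1):
    """Modified (clipped) n-gram precision of candidate against reference."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    # Clip counts: a candidate n-gram is credited at most as often as it
    # appears in the reference, so repeating a matched word gains nothing.
    overlap = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = sum(cand_ngrams.values())
    return overlap / total if total else 0.0

cand = "the cat sat on the mat"
ref = "the cat is on the mat"
print(ngram_precision(cand, ref, n=1))  # 5 of 6 unigrams match → ≈ 0.833
```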
Example:
• Spam Detection: Classifying emails as spam or not spam
• Sentiment Analysis: Categorizing customer reviews as positive, negative, or neutral
• Topic Categorization: Classifying news articles into sports, politics, entertainment, etc.
7. What are affective lexicons, and how are they used in sentiment analysis?
Affective lexicons are specialized dictionaries of words and phrases that are assigned emotional or
sentiment scores based on their meaning and intensity. These lexicons help determine the affective
(emotional) state conveyed in a text by mapping words to emotions such as happiness, anger, sadness,
fear, surprise, and disgust.
Key Features of Affective Lexicons:
• Contain words labeled with sentiment polarity (positive, negative, neutral).
• Assign emotion intensity scores to words.
• Used in rule-based and hybrid sentiment analysis approaches.
How Are Affective Lexicons Used in Sentiment Analysis?
1. Word-Level Sentiment Scoring
• Each word in a text is matched with its sentiment score from the lexicon.
• Example: “happy” → Positive (+0.9), “terrible” → Negative (-0.8).
2. Sentence-Level Sentiment Analysis
• Aggregates word sentiment scores to determine overall sentence sentiment.
• Example:
o "The movie was fantastic, but the ending was sad."
o Words: fantastic (+0.9), sad (-0.5) → Overall sentiment: Neutral/Positive.
3. Emotion Detection
• Lexicons help classify text into specific emotions like joy, anger, or fear.
• Example: "I am thrilled about the new project!" → Emotion: Joy.
4. Aspect-Based Sentiment Analysis (ABSA)
• Helps identify sentiment towards specific aspects of a product/service.
• Example: "The battery life is excellent, but the camera is poor."
o Battery life → Positive, Camera → Negative.
Example of Sentiment Analysis Using Affective Lexicons
Sentence: "I love this new laptop! The screen is amazing, but the battery drains fast."
Lexicon-Based Sentiment Score:
• love (+0.9) → Positive
• amazing (+0.8) → Positive
• drains (-0.6) → Negative
• fast (neutral) → No sentiment
Overall Sentiment: Positive
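The lexicon-based scoring above can be sketched in a few lines. The word scores are the illustrative values from the example, not values drawn from a real affective lexicon such as VADER, SentiWordNet, or AFINN:

```python
import re

# Illustrative mini-lexicon of sentiment scores (not a real lexicon).
LEXICON = {"love": 0.9, "amazing": 0.8, "fantastic": 0.9,
           "terrible": -0.8, "drains": -0.6, "sad": -0.5}

def sentiment(text):
    """Sum per-word lexicon scores and map the total to a polarity label."""
    words = re.findall(r"[a-z]+", text.lower())
    score = sum(LEXICON.get(w, 0.0) for w in words)
    label = "Positive" if score > 0.1 else "Negative" if score < -0.1 else "Neutral"
    return label, score

label, score = sentiment(
    "I love this new laptop! The screen is amazing, but the battery drains fast.")
print(label, round(score, 2))  # → Positive 1.1
```

Note that pure word-level scoring misses negation ("not amazing") and intensity ("very sad"), which rule-based systems handle with extra heuristics.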
8. Explain the concept of aspect-based sentiment analysis.
Aspect-Based Sentiment Analysis (ABSA) is an advanced form of sentiment analysis that focuses on
identifying the sentiment polarity (positive, negative, or neutral) towards specific aspects or features of an
entity, rather than analyzing the overall sentiment of a text.
Key Idea: Instead of classifying an entire review as positive or negative, ABSA determines which part of
the entity (product, service, etc.) is being praised or criticized.
Example of ABSA
Customer Review:
"The hotel room was spacious and clean, but the WiFi was slow."
ABSA Breakdown:
• Hotel room → Positive ("spacious," "clean")
• WiFi → Negative ("slow")
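A toy rule-based ABSA sketch for this kind of review follows. The aspect terms and sentiment words are hand-picked assumptions for this example; real ABSA systems extract aspects and polarities with learned models rather than fixed lists:

```python
import re

# Hand-picked aspect terms and sentiment words (assumptions for this demo).
ASPECTS = ["room", "wifi", "battery", "camera", "screen"]
SENTIMENT_WORDS = {"spacious": "Positive", "clean": "Positive", "slow": "Negative",
                   "excellent": "Positive", "poor": "Negative"}

def absa(review):
    """Split the review into clauses and map each aspect found in a clause
    to the polarity of a sentiment word in that same clause."""
    results = {}
    for clause in re.split(r",|\bbut\b", review.lower()):
        aspect = next((a for a in ASPECTS if a in clause), None)
        polarity = next((p for w, p in SENTIMENT_WORDS.items() if w in clause), None)
        if aspect and polarity:
            results[aspect] = polarity
    return results

print(absa("The hotel room was spacious and clean, but the WiFi was slow."))
# → {'room': 'Positive', 'wifi': 'Negative'}
```

Splitting on contrastive markers like "but" is a common heuristic because sentiment usually flips across them.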
1. Rule-Based Methods
• Uses patterns & dictionaries (e.g., regex for identifying dates).
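For instance, a regex for identifying dates (a deliberately simplified pattern covering only two common formats; production systems handle many more):

```python
import re

# Matches dates like 12/05/2024 (D/M/Y or M/D/Y) and 2024-05-12 (ISO).
DATE_PATTERN = re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4}|\d{4}-\d{2}-\d{2})\b")

text = "The invoice dated 12/05/2024 was paid on 2024-06-01."
print(DATE_PATTERN.findall(text))  # → ['12/05/2024', '2024-06-01']
```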
1. Multiple Interpretations
• NLP models often struggle to determine the intended meaning of words, phrases, or sentences.
• Machines lack human intuition and world knowledge to resolve ambiguity effectively.
Example (Lexical Ambiguity):
• "The bank is closed today."
o Bank (financial institution)
o Bank (riverbank)
Challenge: Without context, NLP models may misinterpret "bank."
2. Complexity in Sentence Structure
• Syntactic ambiguity arises when a sentence can be parsed in multiple ways, leading to different
meanings.
Example (Syntactic Ambiguity):
• "I saw the man with a telescope."
o Did I use a telescope to see the man?
o Or did the man have a telescope?
Challenge: Parsing algorithms may struggle to determine the correct sentence structure.
3. Meaning Changes Based on Context
• Semantic ambiguity occurs when words or phrases have different meanings depending on the
context.
Example (Semantic Ambiguity):
• "He is looking for a match."
o Match (for lighting a fire)
o Match (sports event)
o Match (romantic partner)
Challenge: NLP models need context-aware mechanisms (e.g., transformer-based models like BERT) to
infer the correct meaning.
4. Pronoun Reference Issues
• Anaphoric ambiguity occurs when pronouns can refer to multiple entities, making coreference
resolution difficult.
Example (Anaphoric Ambiguity):
• "John told Mike that he won the game."
o Who won? John or Mike?
Challenge: NLP systems must accurately link pronouns to the correct nouns using coreference
resolution techniques.
5. Real-World Knowledge & Pragmatics
• Pragmatic ambiguity arises when sentences require external world knowledge to interpret
correctly.
Example (Pragmatic Ambiguity):
• "Can you pass the salt?"
o Literal Meaning: Are you physically capable of passing the salt?
o Intended Meaning: Please pass me the salt.
Challenge: NLP models must infer speaker intentions using context and common sense reasoning.
Impact of Ambiguity in NLP Applications
Machine Translation Words with multiple meanings can lead to incorrect translations.
Chatbots & Virtual Assistants Ambiguous inputs may result in irrelevant responses.