
NLP Mod 1 SEE


MODULE 1

Q1) Differences Between NLU (Natural Language Understanding) and NLG (Natural Language
Generation). (IAT)

Ans. The differences include:

| Aspect | NLU (Natural Language Understanding) | NLG (Natural Language Generation) |
|---|---|---|
| Definition | Interprets and analyzes human language to derive meaning. | Generates human-like language based on structured data. |
| Focus | Understanding the input language. | Producing output language. |
| Input | Text or speech from a human. | Structured data or machine-generated concepts. |
| Output | Structured data, intents, or entities. | Grammatically correct sentences or paragraphs. |
| Key Tasks | Entity recognition, sentiment analysis, intent detection. | Text summarization, report generation, dialogue response. |
| Techniques Used | Parsing, semantic analysis, dependency tree construction. | Templates, language models, sequence-to-sequence models. |
| Role in NLP Pipeline | Typically an early-stage processing step. | Final-stage processing for response or output generation. |
| Examples | Translating a question to a query for a database. | Generating a weather report from meteorological data. |
| Complexity | Requires handling ambiguity and context. | Requires grammatical accuracy. |
| Applications | Chatbots, search engines, sentiment analysis. | AI writers, report generators, automated storytelling. |
| Dependency | Relies on semantic and syntactic parsing for accuracy. | Depends on structured data or NLU output for context. |
| Error Impact | Misunderstanding user input leads to incorrect responses. | Poorly generated language affects user trust and readability. |
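
The input/output contrast between NLU and NLG can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the intent patterns, the city list, and the function names `understand` and `generate` are hypothetical, and a real system would typically use trained models rather than keyword rules and a fixed template.

```python
import re

# Hypothetical intent patterns; a real NLU component would use a trained model.
INTENT_PATTERNS = {
    "get_weather": re.compile(r"\b(weather|forecast|temperature)\b", re.IGNORECASE),
    "greeting": re.compile(r"\b(hello|hi|hey)\b", re.IGNORECASE),
}
KNOWN_CITIES = ["Mumbai", "Delhi", "Bengaluru"]  # illustrative entity list


def understand(text):
    """NLU: free text in, structured data (intent + entities) out."""
    intent = "unknown"
    for name, pattern in INTENT_PATTERNS.items():
        if pattern.search(text):
            intent = name
            break
    cities = [c for c in KNOWN_CITIES if c.lower() in text.lower()]
    return {"intent": intent, "entities": {"city": cities[0] if cities else None}}


def generate(report):
    """NLG: structured data in, a grammatical sentence out (template-based)."""
    template = "The weather in {city} is {condition} with a high of {high_c} degrees Celsius."
    return template.format(**report)


# NLU direction: human text -> intent and entities.
parsed = understand("What is the weather in Mumbai today?")
print(parsed)

# NLG direction: structured (made-up) weather data -> sentence.
sentence = generate({"city": "Mumbai", "condition": "sunny", "high_c": 31})
print(sentence)
```

The two functions mirror the Input and Output rows of the table: `understand` consumes text and returns structure, while `generate` consumes structure and returns text.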

Q2) List the generations of NLP and the advantages and disadvantages of NLP. (IAT)

Ans. Generations of NLP:


1. First Generation - Rule-based systems using manually crafted rules for analyzing and
understanding language.
2. Second Generation - Statistical models utilizing probabilistic techniques for language
processing.
3. Third Generation - Machine learning models integrating supervised learning techniques
for text analysis.
4. Fourth Generation - Neural networks employing deep learning for context-aware and
complex NLP tasks.
5. Fifth Generation - Transformer-based AI with large-scale pre-trained models like GPT and
BERT for advanced NLP.

Advantages of NLP:
1. Enhanced Communication - Bridges human-computer interaction using natural
language.
2. Automation - Reduces manual effort by automating tasks like sentiment analysis and
translation.
3. Speed and Efficiency - Processes vast text data quickly and accurately.
4. Real-time Assistance - Enables applications like chatbots for immediate query
resolution.
5. Personalization - Customizes recommendations based on user behavior and language.
6. Multi-language Support - Supports diverse languages for global applicability.
7. Data Extraction - Extracts relevant information from unstructured text.
8. Cost-Effective - Lowers operational costs with automation and data insights.

Disadvantages of NLP:
1. Complexity - Requires sophisticated algorithms and extensive training data.
2. Ambiguity - Struggles with uncertainty and contextual understanding in sentences.
3. Bias Risks - Inherits biases from training datasets.
4. Dependency on Data - Performance depends heavily on the quality of training data.
5. Error Propagation - Mistakes in early stages can affect subsequent analysis.
6. Cultural Sensitivity - May misinterpret cultural significance and expressions.
7. Computational Cost - High processing power requirements for large-scale models.

Q3) Differences between tokenization and stemming. (IAT)


Ans. See Module 4, Q5.

Q4) Why is NLP difficult? Give suitable examples. (IAT)

Ans. Challenges in NLP with Explanations and Examples:


1. Ambiguity in Language: Words can have multiple meanings depending on usage.
Example: Words like "bank" can mean a financial institution or a riverbank.

2. Sarcasm and Irony: Statements often mean the opposite of their literal wording.
Example: "Oh great, another traffic jam!" implies frustration, not happiness.

3. Context Dependence: Words require context to convey the intended meaning.


Example: "He is at the bank" (could mean riverbank or financial bank).

4. Polysemy: A single word can have multiple related meanings.

Example: "Head" (part of the body) vs. "head" (leader of a department).

5. Synonyms: Different words can share a meaning yet differ subtly in usage.
Example: "Big" and "large" both mean sizable, but "big sister" (older sister) cannot be replaced by "large sister".

6. Grammar Variability: Different sentence structures can convey the same idea.
Example: "She read the book" vs. "The book was read by her."
7. Idiomatic Expressions: Phrases cannot be understood from literal word meanings.
Example: "Spill the tea" means to reveal a secret, not literally spilling tea.

8. Regional Dialects and Slang: Varying regional usage adds complexity.


Example: "Flat" (UK) means apartment, while in the US it might mean a punctured tire.

9. Out-of-Vocabulary Words: New or rare words may not exist in training data.
Example: Words like "NFT" or "Metaverse" might not be recognized initially.

10. Sentence Segmentation: Identifying sentence boundaries can be tricky.


Example: "Dr. Smith arrived. She was late" (Period in "Dr." doesn't end a sentence).

11. Domain-Specific Knowledge: Jargon varies greatly across fields like medicine or law.
Example: "BP" in medicine refers to blood pressure, but in finance, it means basis points.

12. Handling Multilingual Texts: Different languages have unique syntax and semantics.
Example: In Hindi, verbs change forms based on gender, making translation complex.

13. Temporal Understanding: Understanding time references in text can be challenging.


Example: "I will meet you next Monday" (requires knowing today’s date for clarity).
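
Two of the challenges above (sentence segmentation and temporal understanding) can be made concrete with a short sketch. The function names are hypothetical, and `next_monday` encodes only one possible reading of the phrase (the first Monday strictly after the given date); other readings exist.

```python
import datetime


def naive_sentence_split(text):
    """Split on '. ' only; abbreviations such as 'Dr.' are wrongly treated as sentence ends."""
    parts = [p.strip() for p in text.split(". ")]
    return [p for p in parts if p]


def next_monday(today):
    """Return the first Monday strictly after `today` (one reading of 'next Monday')."""
    days_ahead = (0 - today.weekday()) % 7  # Monday is weekday 0
    if days_ahead == 0:
        days_ahead = 7  # if today is Monday, move a full week ahead
    return today + datetime.timedelta(days=days_ahead)


# Item 10: the period in "Dr." is mistaken for a sentence boundary.
pieces = naive_sentence_split("Dr. Smith arrived. She was late")
print(pieces)

# Item 13: the same phrase resolves to different dates depending on when it is said.
from_wednesday = next_monday(datetime.date(2024, 1, 3))  # 3 January 2024 was a Wednesday
from_monday = next_monday(datetime.date(2024, 1, 8))     # 8 January 2024 was a Monday
print(from_wednesday, from_monday)
```

The naive splitter returns three pieces instead of two sentences, and the date function shows why "next Monday" cannot be interpreted without knowing the current date.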

Q5) NLP Pipeline.

Ans. To build an effective NLP pipeline, follow these structured steps:

1. Sentence Segmentation:
a. Breaks the text into individual sentences.
b. Example:
i. Input: "Independence Day is one of the important festivals for every Indian
citizen. It is celebrated on the 15th of August each year."
ii. Output:
1. "Independence Day is one of the important festivals for every Indian
citizen."
2. "It is celebrated on the 15th of August each year."
2. Word Tokenization:
a. Divides sentences into individual words or tokens.
b. Example:
i. Input: "JavaTpoint offers Corporate Training, Summer Training, Online
Training, and Winter Training."
ii. Output: ["JavaTpoint", "offers", "Corporate", "Training", ",", "Summer",
"Training", ",", "Online", "Training", ",", "and", "Winter", "Training", "."]
3. Stemming:
a. Reduces words to their root form, though the root may not be a meaningful word.
b. Example:
i. Input: "celebrates", "celebrated", "celebrating"
ii. Output: "celebr", "celebr", "celebr"
4. Lemmatization:
a. Converts words to their base form (lemma), which is a meaningful word.
b. Example:
i. Input: "studies", "studying", "studied"
ii. Output: "study", "study", "study"
5. Identifying Stop Words:
a. Filters out common words that add little value to the analysis (e.g., "is", "and",
"the").
b. Example:
i. Input: "He is a good boy."
ii. Output (after removing stop words): "good", "boy"
6. Dependency Parsing:
a. Determines how words in a sentence are related to each other grammatically.
b. Example:
i. Sentence: "She eats an apple."
ii. Output: "eats" is the root verb, with "She" as its subject (nsubj) and "apple" as its
direct object (dobj); "an" is the determiner of "apple".
7. POS Tagging:
a. Assigns parts of speech (noun, verb, adjective, etc.) to each word in a sentence.
b. Example:
i. Sentence: "Google is a tech company."
ii. Output: Google (NNP), is (VBZ), a (DT), tech (NN), company (NN)
8. Named Entity Recognition (NER):
a. Identifies and classifies named entities like people, organizations, or locations.
b. Example:
i. Sentence: "Steve Jobs introduced iPhone at the Macworld Conference in San
Francisco, California."
ii. Output: Steve Jobs (Person), iPhone (Product), Macworld Conference
(Event), San Francisco (Location), California (Location)
9. Chunking:
a. Groups tokens into chunks based on their syntactic roles, such as noun phrases or
verb phrases.
b. Example:
i. Sentence: "The quick brown fox jumps over the lazy dog."
ii. Output: [The quick brown fox] (NP), [jumps] (VP), [over] (PP), [the lazy dog] (NP)
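
The first few pipeline steps (sentence segmentation, word tokenization, stop-word removal, stemming) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not a production pipeline: the stop-word set and suffix list are tiny hypothetical samples, the sentence splitter does not handle abbreviations such as "Dr.", and `toy_stem` is a crude suffix stripper rather than the Porter algorithm, so it yields "celebrat" where the stemming example above shows "celebr".

```python
import re

STOP_WORDS = {"he", "is", "a", "an", "the", "and", "it", "of", "on", "for"}  # tiny illustrative set
SUFFIXES = ("ing", "ed", "es", "s")  # crude suffix list, not the Porter algorithm


def split_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace and a capital letter."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())


def tokenize(sentence):
    """Return words and punctuation marks as separate tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def remove_stop_words(tokens):
    """Drop stop words and punctuation, comparing case-insensitively."""
    return [t for t in tokens if t.isalnum() and t.lower() not in STOP_WORDS]


def toy_stem(word):
    """Strip the first matching suffix; the result need not be a real word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


text = (
    "Independence Day is one of the important festivals for every Indian citizen. "
    "It is celebrated on the 15th of August each year."
)
sentences = split_sentences(text)           # step 1: sentence segmentation
tokens = tokenize("He is a good boy.")      # step 2: word tokenization
content_words = remove_stop_words(tokens)   # step 5: stop-word removal
stems = [toy_stem(w) for w in ["celebrates", "celebrated", "celebrating"]]  # step 3: stemming

print(sentences)
print(tokens)
print(content_words)
print(stems)
```

Later steps such as lemmatization, POS tagging, dependency parsing, NER, and chunking generally rely on dictionaries or trained models, which is why they are usually delegated to an NLP library rather than written by hand.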

Q6) Phases of NLP.


Ans. Phases of NLP
1. Lexical Analysis and Morphological Processing:
a. Breaks down text into tokens (words, sentences) and analyzes word structures like
roots and affixes.
b. Additional Point: Involves removing stop words and reducing words to their base
form through stemming/lemmatization.
2. Syntactic Analysis (Parsing):
a. Checks that sentences follow the grammar of the language and identifies the
relationships between words.
b. Additional Point: Generates parse trees that visually represent the grammatical
structure of sentences.
3. Semantic Analysis:
a. Focuses on the meaning of words and sentences to ensure logical coherence.
b. Additional Point: Includes Named Entity Recognition (NER) and Semantic Role
Labeling (SRL) for identifying roles.
4. Discourse Integration:
a. Maintains context and coherence by considering relationships between consecutive
sentences.
b. Additional Point: Coreference Resolution identifies when different words refer to
the same entity in text.
5. Pragmatic Analysis:
a. Interprets intended meaning using context, social, and cultural norms.
b. Additional Point: Speech Act Recognition determines the purpose behind
statements like commands, requests, or questions.
