NLP Assignment Solution

Natural Language Processing (NLP) is an AI field that enables computers to analyze and generate human language, with applications including machine translation and question answering systems. Levels of language analysis in NLP include phonology, morphology, syntax, semantics, and pragmatics, with detailed discussions on morphology and pragmatics. The document also covers text classification using supervised machine learning and Naïve Bayes sentiment classification, illustrating key concepts with examples and mathematical expressions.

Uploaded by

Maymoon Irfan


Question 1: Define Natural Language Processing. What are the significant application areas of Natural Language Processing? Discuss any two application areas in detail.
Definition:
Natural Language Processing (NLP) is a field of artificial intelligence that focuses on enabling
computers to analyze, understand, and generate human language. It draws from computational
linguistics, cognitive science, psycholinguistics, and statistics to process language at various
levels (phonology, morphology, syntax, semantics, and pragmatics).

Significant Application Areas of NLP:

• Machine Translation
• Question Answering Systems
• Information Retrieval and Extraction
• Text Categorization and Classification
• Speech Recognition and Text-to-Speech
• Sentiment Analysis
• Spelling and Grammar Checking
• Plagiarism Detection
• Dialogue Systems
• Language Learning and Teaching Tools

Detailed Discussion of Two Applications:

1. Machine Translation:
o Converts text from one language to another automatically.
o Example: Google Translate translating a daily newspaper article from Japanese to
English.
o Challenges: Ambiguity, syntactic differences, cultural expressions.
2. Question Answering:
o Systems designed to retrieve specific answers to user queries.
o Example: A system answering "Who is the first Taiwanese president?" from a
large document corpus.
o Involves NLP tasks such as named entity recognition, parsing, and semantic
matching.

Question 2: Define the different levels of language analysis and discuss two of them in
detail with real-life examples.

Levels of Language Analysis in NLP:

1. Phonology
2. Morphology
3. Syntax
4. Semantics
5. Pragmatics
Detailed Explanation:

1. Morphology:
o Deals with the structure and formation of words from morphemes.
o Morpheme: Smallest meaningful unit (e.g., "dog" or the plural suffix "-s").
o Example:
 dogs = dog (free morpheme) + -s (bound morpheme)
 unhappiness = un- + happy + -ness
2. Pragmatics:
o Concerned with how context influences the interpretation of meaning.
o Examples:
 “Do you know the time?” is often a request, not a yes/no question.
 “We gave the monkeys the bananas because they were hungry” → 'they' refers to the monkeys;
“...because they were overripe” → 'they' refers to the bananas.

Question 3: Define the three types of representation in descriptive statistics.

Frequency Distributions:

• Tabular display showing how often each value appears in a dataset.

Graphical Representations:

• Visual tools such as bar charts, pie charts, and histograms.


• Help in quick understanding of patterns and distributions.

Summary Statistics:

• Condense data using key values:


o Mean: Arithmetic average.
o Median: Middle value.
o Mode: Most frequent value.
o Variance/Standard Deviation: Measure of spread.
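
The summary statistics listed above can be illustrated with Python's standard `statistics` module; the data values below are a hypothetical example, not from the document:

```python
import statistics

# Hypothetical example dataset
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)          # arithmetic average
median = statistics.median(data)      # middle value of the sorted data
mode = statistics.mode(data)          # most frequent value
variance = statistics.pvariance(data) # population variance (spread)
stdev = statistics.pstdev(data)       # population standard deviation

print(mean, median, mode, variance, stdev)
```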

Question 4: Differentiate between semantics and syntax with reference to the levels of
language analysis.

Feature          | Syntax                                  | Semantics
Focus            | Sentence structure                      | Meaning of words and sentences
Concerned with   | Grammar rules, sentence parsing         | Interpretation of phrases and sentences
Type of Analysis | Syntactic parsing (tree structures)     | Meaning representation
Example          | “Colorless green ideas sleep furiously” is syntactically valid | The same sentence is semantically meaningless

Question 5: Draw the systematic diagram where NLP fits in CS Taxonomy.

Question 6: Discuss the text classification method using supervised machine learning with
the help of mathematical expression.

Text Classification using Supervised Learning involves:

• Training Set: labeled documents (d₁, c₁), …, (dₘ, cₘ)
• Input: an unlabeled document d
• Output: the predicted class ĉ ∈ C

Naïve Bayes Classifier:

ĉ_MAP = argmax_{c ∈ C} P(c | d) = argmax_{c ∈ C} P(d | c) P(c)

Under the Bag-of-Words model and conditional independence:


P(d | c) = ∏_{i=1}^{n} P(fᵢ | c)  ⇒  ĉ_NB = argmax_{c ∈ C} P(c) ∏_{i=1}^{n} P(fᵢ | c)
To avoid underflow, compute in log space:
ĉ_NB = argmax_{c ∈ C} [ log P(c) + Σ_{i=1}^{n} log P(fᵢ | c) ]
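
The log-space decision rule can be sketched in a few lines of Python. The priors and likelihoods below are hypothetical, pre-estimated parameters; in practice they would be estimated from counts on the labeled training set:

```python
import math

# Hypothetical pre-estimated model parameters (not from the document)
priors = {"pos": 0.5, "neg": 0.5}
likelihoods = {
    "pos": {"good": 0.4, "bad": 0.1},
    "neg": {"good": 0.1, "bad": 0.4},
}

def classify(features):
    # c_hat = argmax_c [ log P(c) + sum_i log P(f_i | c) ]
    best_class, best_score = None, -math.inf
    for c, prior in priors.items():
        score = math.log(prior)
        for f in features:
            score += math.log(likelihoods[c][f])
        if score > best_score:
            best_class, best_score = c, score
    return best_class

print(classify(["good", "good", "bad"]))
```

Summing logs instead of multiplying probabilities avoids the floating-point underflow that long documents would otherwise cause.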

Question 7: A fair coin is tossed 3 times. What is the likelihood of 2 heads?

• Sample Space:
Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} → |Ω| = 8

• Event A: outcomes with exactly 2 heads
A = {HHT, HTH, THH} → |A| = 3

• Uniform Distribution:
P(A) = |A| / |Ω| = 3/8 = 0.375
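
The computation above can be verified by enumerating the sample space directly:

```python
from itertools import product

omega = list(product("HT", repeat=3))              # all 2^3 = 8 outcomes
event_a = [o for o in omega if o.count("H") == 2]  # exactly 2 heads

p = len(event_a) / len(omega)
print(len(omega), len(event_a), p)  # 8 3 0.375
```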

Question #8: Naïve Bayes Sentiment Classification

• Positive (+): "Excellent product", "Affordable and Reliable", "Very satisfied with the purchase", "Highly recommended"
• Negative (−): "Very disappointed", "Not worthy"

Step 1: Preprocess and Build the Vocabulary

1. Normalization: Convert all text to lowercase and tokenize the words.


o Positive Documents (after normalization):
 Doc1: "excellent product" → excellent, product
 Doc2: "affordable and reliable" → affordable, and, reliable
 Doc3: "very satisfied with the purchase" → very, satisfied, with,
the, purchase
 Doc4: "highly recommended" → highly, recommended
o Negative Documents:
 Doc5: "very disappointed" → very, disappointed
 Doc6: "not worthy" → not, worthy
Vocabulary V: The union of unique words from all documents:

V = {excellent, product, affordable, and, reliable, very, satisfied, with, the, purchase, highly, recommended, disappointed, not, worthy}

• Vocabulary size: |V| = 15
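
Step 1 can be sketched directly from the six training documents listed above:

```python
# Normalize (lowercase), tokenize on whitespace, and build the vocabulary
docs = [
    "Excellent product",
    "Affordable and Reliable",
    "Very satisfied with the purchase",
    "Highly recommended",
    "Very disappointed",
    "Not worthy",
]

tokens = [w for d in docs for w in d.lower().split()]
vocab = sorted(set(tokens))
print(len(vocab))  # 15
```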

Step 2: Compute Class Priors

• Total documents = 6
• Positive Prior: P(+) = 4/6 ≈ 0.667
• Negative Prior: P(−) = 2/6 ≈ 0.333

Step 3: Preprocess the Test Document

• Test Document: “Disappointed quality, not recommended”


• Normalized (lowercase, tokenized):
Test Words={disappointed, quality, not, recommended}

Step 4: Calculate Likelihoods with Laplace (Add-One) Smoothing


For any word w given class c:

P(w | c) = (count(w, c) + 1) / (total words in c + |V|)

Step 5: Compute the Posterior for Each Class


The Naïve Bayes classification rule in product space is:

ĉ = argmax_{c ∈ {+, −}} P(c) ∏_{w ∈ Test Document} P(w | c)

For the Positive Class:

Score(+) = P(+) × P(disappointed | +) × P(quality | +) × P(not | +) × P(recommended | +)
= 0.667 × (1/27) × (1/27) × (1/27) × (2/27)
= 0.667 × 2/27⁴
= 0.667 × 2/531441
≈ 0.667 × 3.76 × 10⁻⁶
≈ 2.51 × 10⁻⁶

For the Negative Class:

Score(−) = P(−) × P(disappointed | −) × P(quality | −) × P(not | −) × P(recommended | −)
= 0.333 × (2/19) × (1/19) × (2/19) × (1/19)
= 0.333 × 4/19⁴
= 0.333 × 4/130321
≈ 0.333 × 3.07 × 10⁻⁵
≈ 1.02 × 10⁻⁵

Step 6: Compare the Scores

• Positive Class Score: ≈ 2.51 × 10⁻⁶
• Negative Class Score: ≈ 1.02 × 10⁻⁵

Since 1.02 × 10⁻⁵ > 2.51 × 10⁻⁶, the Naïve Bayes model predicts that the test document has
negative sentiment.
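
The six steps above can be reproduced end to end in a short script. This is a sketch of the worked example, using the training documents from the table and add-one smoothing as defined in Step 4:

```python
from collections import Counter

# Training documents by class, from the table in Question 8
train = {
    "+": ["Excellent product",
          "Affordable and Reliable",
          "Very satisfied with the purchase",
          "Highly recommended"],
    "-": ["Very disappointed",
          "Not worthy"],
}

# Step 1: normalize, tokenize, count words per class; build vocabulary
counts = {c: Counter(w for doc in docs for w in doc.lower().split())
          for c, docs in train.items()}
vocab = set().union(*counts.values())
total_docs = sum(len(docs) for docs in train.values())

def score(test_words, c):
    # Step 2: class prior P(c)
    s = len(train[c]) / total_docs
    # Steps 4-5: multiply add-one-smoothed likelihoods P(w | c)
    n_c = sum(counts[c].values())  # total words in class c
    for w in test_words:
        s *= (counts[c][w] + 1) / (n_c + len(vocab))
    return s

# Step 3: preprocess the test document
test = "disappointed quality not recommended".split()

# Step 6: compare the class scores
pos, neg = score(test, "+"), score(test, "-")
print(pos, neg, "+" if pos > neg else "-")
```

Running this recovers the hand-computed values: roughly 2.51 × 10⁻⁶ for the positive class, 1.02 × 10⁻⁵ for the negative class, so the prediction is negative.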
