End Sem Answer Key 2023
Question 1.
Initialization:
D(i,0) = i
D(0,j) = j
Recurrence Relation:
For each i = 1…M
For each j = 1…N
             D(i-1,j) + 1                        (deletion)
D(i,j) = min D(i,j-1) + 1                        (insertion)
             D(i-1,j-1) + 2   if X(i) ≠ Y(j)     (substitution)
             D(i-1,j-1) + 0   if X(i) = Y(j)     (match)
Termination:
D(M,N) is the edit distance between X (of length M) and Y (of length N).
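A minimal Python sketch of this dynamic program, assuming the cost scheme above (insertion and deletion cost 1, substitution costs 2):

```python
def min_edit_distance(X, Y):
    """Minimum edit distance with insert/delete cost 1, substitute cost 2."""
    M, N = len(X), len(Y)
    D = [[0] * (N + 1) for _ in range(M + 1)]
    for i in range(M + 1):
        D[i][0] = i                               # initialization: D(i,0) = i
    for j in range(N + 1):
        D[0][j] = j                               # initialization: D(0,j) = j
    for i in range(1, M + 1):
        for j in range(1, N + 1):
            # X[i-1] is X(i) in the 1-indexed notation above
            sub = 0 if X[i - 1] == Y[j - 1] else 2
            D[i][j] = min(D[i - 1][j] + 1,        # deletion
                          D[i][j - 1] + 1,        # insertion
                          D[i - 1][j - 1] + sub)  # substitution / match
    return D[M][N]

print(min_edit_distance("intention", "execution"))  # -> 8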
Lexical variation: a difference in which segments are used to represent the word in the lexicon.
Example: because can be pronounced either as monosyllabic 'cause or as bisyllabic because.
Allophonic variation: a difference in how the individual segments change their value in different contexts, due to the influence of the surrounding sounds, syllable structure, etc.
Example: because can be pronounced as [b iy k ah z], [b iy k ah zh], [b iy k ah s], or [b iy k aa z].
Question 2.
Machine Translation:
P(high winds tonite) > P(large winds tonite)
Spell Correction:
The office is about fifteen minuets from my house
P(about fifteen minutes from) > P(about fifteen minuets from)
Speech Recognition:
P(I saw a van) > P(eyes awe of an)
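All three comparisons come from scoring word sequences with a language model. As a sketch, here is a maximum-likelihood bigram model over a tiny invented corpus (the sentences and counts are purely illustrative, and sentence-boundary markers are ignored):

```python
from collections import Counter

# Toy corpus, invented purely for illustration.
corpus = [
    "high winds tonite".split(),
    "high winds expected".split(),
    "large crowds tonite".split(),
]
bigrams = Counter(b for sent in corpus for b in zip(sent, sent[1:]))
unigrams = Counter(w for sent in corpus for w in sent)

def bigram_prob(sentence):
    """Chain-rule score with MLE bigram estimates P(w2|w1) = C(w1 w2) / C(w1)."""
    words = sentence.split()
    p = 1.0
    for w1, w2 in zip(words, words[1:]):
        p *= bigrams[(w1, w2)] / unigrams[w1]
    return p

print(bigram_prob("high winds tonite"))   # 0.5
print(bigram_prob("large winds tonite"))  # 0.0: "large winds" was never seen
```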
Types: the number of distinct words in a corpus, i.e. the size of the vocabulary V.
Tokens: the total number of running words, N.
“They picnicked by the pool, then lay back on the grass and looked at the stars.” This sentence has 16 word tokens and 14 word types (not counting punctuation).
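These counts can be reproduced with a few lines of Python, assuming punctuation is stripped and case is folded:

```python
import re

sentence = ("They picnicked by the pool, then lay back on the grass "
            "and looked at the stars.")
tokens = re.findall(r"[a-z]+", sentence.lower())  # fold case, drop punctuation
types = set(tokens)
print(len(tokens), "word tokens,", len(types), "word types")
# -> 16 word tokens, 14 word types  ("the" occurs three times)
```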
The Switchboard corpus has 2.4 million wordform tokens and approximately 20,000 wordform types.
The Brown corpus (about a million wordform tokens) contains 61,805 wordform types.
Brown et al. (1992): 583 million wordform tokens, including 293,181 different wordform types.
Question 3.
• Entropy: expected surprise (over p):
H(p) = E_p[ log2 (1/p_x) ] = −Σ_x p_x log2 p_x
• x log x is convex
• Σ_x x log x is convex (a sum of convex functions is convex), so its negation, the entropy −Σ_x p_x log2 p_x, is concave.
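A small numeric check of the definition (terms with p_x = 0 are conventionally treated as contributing 0):

```python
import math

def entropy(p):
    """H(p) = -sum_x p_x log2 p_x, in bits."""
    return -sum(px * math.log2(px) for px in p if px > 0)

print(entropy([0.5, 0.5]))  # 1.0 bit: a fair coin is maximally surprising
print(entropy([0.9, 0.1]))  # ~0.47 bits: a biased coin surprises us less
print(entropy([1.0]))       # 0.0: a certain outcome carries no surprise
```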
Issues of Scale
• Lots of features:
• NLP maxent models can have well over a million features.
• Even storing a single array of parameter values can have a substantial memory cost.
• Lots of sparsity:
• Many features seen in training will never occur again at test time.
• Overfitting is very easy.
• Optimization problems:
• Feature weights can be infinite, and iterative solvers can take a long time to get to those infinities.
Question 4.
A. It can tell us how the word is pronounced:
The noun is CONtent and the adjective is conTENT.
OBject (noun) vs. obJECT (verb),
DIScount (noun) vs. disCOUNT (verb).
Knowing a word’s part of speech can help tell us which morphological affixes it can take.
It gives a significant amount of information about the word and its neighbors.
Knowing whether a word is a possessive pronoun or a personal pronoun can tell us what words are
likely to occur in its vicinity.
It is also useful in language models for speech recognition.
B. Parts of speech can be divided into two broad supercategories:
Open class: grows continuously as new words are coined.
4 major open classes: nouns, main verbs, adjectives, and adverbs.
Closed class: has relatively fixed membership.
Examples of English closed classes: prepositions, determiners, pronouns, conjunctions, auxiliary verbs, and particles.
Function words: tend to be very short, occur frequently, and play an important role in grammar.
Examples: of, it, and, you, etc.
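As an illustration of such POS ambiguity, a hypothetical check with NLTK's off-the-shelf tagger (this assumes the nltk package and its punkt and averaged_perceptron_tagger resources are installed; the exact tags depend on the trained model):

```python
import nltk  # assumes 'punkt' and 'averaged_perceptron_tagger' are downloaded

sentence = "I object to that object."
print(nltk.pos_tag(nltk.word_tokenize(sentence)))
# A Penn Treebank tagger should tag the first "object" as a verb (VB*)
# and the second as a noun (NN), disambiguating them by context.
```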
Question 5.
Yesterday, I bought a Nokia phone and my girlfriend bought a moto phone. We called each other when we got home.
The voice on my phone was not clear. The camera was good. My girlfriend said the sound of her phone was clear. I
wanted a phone with good voice quality. So I was satisfied and returned the phone to BestBuy yesterday.
Challenges
Contrasts with standard text-based categorization
Domain dependent
Sarcasm
Sometimes people express their negative feelings using positive or intensified positive words in the
text.
Thwarted expressions
The sentences/words that contradict the overall sentiment of the text are in the majority, as the sketch after this list illustrates.
Example: “The actors are good, the music is brilliant and appealing.
Yet, the movie fails to strike a chord.”
Consolidation of conflicting sentiments (e.g., in the review above, the camera is praised while the voice quality is criticized, and these must be combined into an overall judgment).
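To see why thwarted expressions are hard, consider a deliberately naive lexicon-based scorer (the word lists here are invented for illustration):

```python
POSITIVE = {"good", "brilliant", "appealing", "clear", "satisfied"}
NEGATIVE = {"fails", "not"}

def lexicon_polarity(text):
    """Count positive minus negative words: a deliberately naive baseline."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

review = ("The actors are good, the music is brilliant and appealing. "
          "Yet, the movie fails to strike a chord.")
print(lexicon_polarity(review))  # -> 2: the lexicon says positive,
                                 #    but the review is actually negative
```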
Question 6.
Dotted rule: we use a dot within the right-hand side of a state's grammar rule to indicate the progress made in recognizing it.
Operation of Earley parser
March through the N+1 sets of states in the chart in a left-to-right fashion.
At each step, one of three operators is applied to a single state, deriving new states from it.
Predictor: S→.VP, [0, 0] => VP→.Verb, [0, 0] & VP→.Verb NP, [0, 0]
Scanner: VP→.Verb NP, [0, 0] => VP→Verb.NP, [0,1]
Completer: NP→Det Nominal., [1,3] & VP→Verb.NP, [0,1] => VP→Verb NP., [0,3]
This results in the addition of new states to the end of either the current or next set of states in the chart.
The presence of a state S → α., [0,N] in the list of states in the last chart entry indicates a successful parse.
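A compact Earley recognizer sketch in Python implementing the three operators above. The grammar and lexicon are a toy fragment chosen for illustration (no ε-rules), and a dummy start state GAMMA → .S plays the role of S above:

```python
from collections import namedtuple

# A state is a dotted rule lhs -> rhs[:dot] . rhs[dot:] spanning [start, end].
State = namedtuple("State", "lhs rhs dot start end")

GRAMMAR = {                          # toy grammar; terminals are POS tags
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Det", "Nominal"]],
    "Nominal": [["Noun"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}
LEXICON = {"book": "Verb", "that": "Det", "flight": "Noun"}

def earley_recognize(words):
    n = len(words)
    chart = [[] for _ in range(n + 1)]       # N+1 sets of states

    def enqueue(state, k):
        if state not in chart[k]:
            chart[k].append(state)

    enqueue(State("GAMMA", ("S",), 0, 0, 0), 0)   # dummy start state
    for k in range(n + 1):                        # march left to right
        i = 0
        while i < len(chart[k]):                  # chart[k] grows as we go
            st = chart[k][i]
            i += 1
            if st.dot < len(st.rhs):
                nxt = st.rhs[st.dot]
                if nxt in GRAMMAR:                # PREDICTOR
                    for rhs in GRAMMAR[nxt]:
                        enqueue(State(nxt, tuple(rhs), 0, k, k), k)
                elif k < n and LEXICON.get(words[k]) == nxt:
                    # SCANNER: the next word has the expected part of speech
                    enqueue(State(st.lhs, st.rhs, st.dot + 1, st.start, k + 1),
                            k + 1)
            else:                                 # COMPLETER: st is finished
                for old in chart[st.start]:
                    if old.dot < len(old.rhs) and old.rhs[old.dot] == st.lhs:
                        enqueue(State(old.lhs, old.rhs, old.dot + 1,
                                      old.start, st.end), k)
    # GAMMA -> S . spanning [0, n] in the last chart entry means success
    return any(s.lhs == "GAMMA" and s.dot == 1 for s in chart[n])

print(earley_recognize("book that flight".split()))  # -> True
```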
Question 7.
Information Extraction tasks are characterized by two properties:
1. the desired knowledge can be described by a relatively simple and fixed template (frame) with slots that need to be filled in with material from the text;
2. only a small part of the information in the text is relevant for filling in this frame; the rest can be ignored.
Precision is a measure of how much of the information that the system returned is actually correct.
Precision = # of correct answers given by the system / # of answers given by the system
Recall is a measure of how much of the relevant information the system has extracted from the text.
Recall = # of correct answers given by the system / total # of possible correct answers in the text
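A small sketch of the two measures, modeling answers as sets of (slot, filler) pairs; the example data below is invented:

```python
def precision_recall(system_answers, gold_answers):
    """Precision and recall over sets of (slot, filler) pairs."""
    correct = len(system_answers & gold_answers)
    precision = correct / len(system_answers) if system_answers else 0.0
    recall = correct / len(gold_answers) if gold_answers else 0.0
    return precision, recall

# Invented example: the system returns 4 slot fillers, 3 of them correct,
# while the text supports 5 correct fillers in total.
system = {("victim", "Jones"), ("city", "Boston"),
          ("date", "Monday"), ("weapon", "knife")}
gold = {("victim", "Jones"), ("city", "Boston"), ("date", "Monday"),
        ("weapon", "gun"), ("perpetrator", "Smith")}
print(precision_recall(system, gold))  # -> (0.75, 0.6)
```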
Question 8.
Question 9.
The classic search model
Inverted index
• We need variable-size postings lists
– On disk, a continuous run of postings is normal and best
– In memory, can use linked lists or variable length arrays
• Some tradeoffs in size/ease of insertion
Positional indexes
• In the postings, store, for each term, the position(s) in which tokens of it appear:
<term, number of docs containing term;
doc1: position1, position2 … ;
doc2: position1, position2 … ;
etc.>
Example:
<be: 993427;
1: 7, 18, 33, 72, 86, 231;
2: 3, 149;
4: 17, 191, 291, 430, 434;
5: 363, 367, …>
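A minimal in-memory sketch of building such a positional index (real systems use compressed on-disk postings, per the tradeoffs above):

```python
from collections import defaultdict

def build_positional_index(docs):
    """Map term -> {doc_id: [positions...]}, the structure sketched above."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index

docs = {1: "to be or not to be", 2: "be quick or be dead"}
index = build_positional_index(docs)
# number of docs containing "be", then its per-document position lists
print(len(index["be"]), dict(index["be"]))  # -> 2 {1: [1, 5], 2: [0, 3]}
```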
Question 10.
Extractive summaries are created by reusing portions (words, sentences, etc.) of the input text verbatim.
For example, search engines typically generate extractive summaries from webpages.
Most of the summarization research today is on extractive summarization.
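A naive frequency-based extractive summarizer sketch (one of many possible scoring schemes; the key property is that sentences are returned verbatim, which is what makes it extractive):

```python
import re
from collections import Counter

def extractive_summary(text, n=1):
    """Return the n sentences with the highest average word frequency."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = lambda s: re.findall(r"[a-z]+", s.lower())
    freqs = Counter(w for s in sentences for w in words(s))
    # Average (not total) frequency, to avoid favoring long sentences.
    ranked = sorted(sentences,
                    key=lambda s: sum(freqs[w] for w in words(s))
                                  / max(len(words(s)), 1),
                    reverse=True)
    return " ".join(ranked[:n])

doc = ("The pool was warm. They picnicked by the pool. "
       "Then they lay back on the grass.")
print(extractive_summary(doc))  # -> "They picnicked by the pool."
```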