NLP QB
2 What is the minimum edit distance between two words? Calculate the minimum edit distance between the words “small” and “smell” using the dynamic programming algorithm. Assume the costs for insertion, deletion, and substitution are 1, 1, and 2, respectively. (12 marks, Section-I)
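A minimal dynamic-programming sketch of minimum edit distance using the question's costs (insertion 1, deletion 1, substitution 2); the function name and structure are illustrative, not prescribed:

    def min_edit_distance(src, tgt, ins=1, dele=1, sub=2):
        # dp[i][j] = cheapest way to turn src[:i] into tgt[:j]
        m, n = len(src), len(tgt)
        dp = [[0] * (n + 1) for _ in range(m + 1)]
        for i in range(1, m + 1):
            dp[i][0] = i * dele          # delete all of src[:i]
        for j in range(1, n + 1):
            dp[0][j] = j * ins           # insert all of tgt[:j]
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                cost = 0 if src[i - 1] == tgt[j - 1] else sub
                dp[i][j] = min(dp[i - 1][j] + dele,       # deletion
                               dp[i][j - 1] + ins,        # insertion
                               dp[i - 1][j - 1] + cost)   # substitution or match
        return dp[m][n]

    print(min_edit_distance("small", "smell"))  # 2: one substitution (a -> e) at cost 2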
3 What is the difference between non-word and real-word spelling correction? What is perplexity? Estimate the perplexity of the following corpus under a unigram language model: “the man is a thief but the man is a good man” (12 marks, Section-I)
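A small sketch of unigram perplexity on the corpus from the question; it estimates the unigram probabilities by maximum likelihood from the same text, which is one common classroom convention:

    import math
    from collections import Counter

    tokens = "the man is a thief but the man is a good man".split()
    counts = Counter(tokens)
    N = len(tokens)

    # Unigram MLE: P(w) = count(w) / N; perplexity = exp(-1/N * sum_i log P(w_i))
    log_prob = sum(math.log(counts[w] / N) for w in tokens)
    print(round(math.exp(-log_prob / N), 3))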
5 What is morphology in NLP? What are morphemes? What are bound and free morphemes? Explain with examples. What is stemming, and how is it different from lemmatization? (12 marks, Section-I)
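A quick contrast of stemming and lemmatization using NLTK (assuming the WordNet data has been downloaded); the word list is just an illustration:

    from nltk.stem import PorterStemmer, WordNetLemmatizer
    # Requires the WordNet data: nltk.download('wordnet') on first use.

    stemmer = PorterStemmer()
    lemmatizer = WordNetLemmatizer()

    for word in ["studies", "running", "flies"]:
        # Stemming chops suffixes by rule; lemmatization maps to a dictionary form.
        print(word, stemmer.stem(word), lemmatizer.lemmatize(word, pos="v"))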
7 Explain the Noisy Channel Model for spelling correction and N-gram language models. (12 marks, Section-I)
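A toy sketch of the noisy channel intuition for spelling correction, choosing the candidate w that maximizes P(x|w)·P(w); the typo, candidate set, and probabilities below are all made-up placeholders:

    # Noisy channel: best correction w* = argmax_w P(x | w) * P(w).
    observed = "acress"
    candidates = {
        # word: (channel model P(observed | word), language model P(word))
        "actress": (1e-4, 3e-5),
        "across":  (1e-5, 2e-4),
        "acres":   (2e-5, 1e-5),
    }
    best = max(candidates, key=lambda w: candidates[w][0] * candidates[w][1])
    print(best)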
9 Apply different smoothing techniques to a language model and analyze their impact on performance. (12 marks, Section-I)
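A minimal sketch of add-one (Laplace) smoothing for bigram probabilities, one of the techniques the question likely covers; the corpus and variable names are illustrative:

    from collections import Counter

    tokens = "the man is a thief but the man is a good man".split()
    V = len(set(tokens))                 # vocabulary size
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    def p_laplace(w1, w2):
        # Add-one smoothing: (c(w1, w2) + 1) / (c(w1) + V)
        return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + V)

    print(p_laplace("the", "man"))    # seen bigram
    print(p_laplace("man", "thief"))  # unseen bigram still gets nonzero mass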
13 What is POS tagging? Find the POS tags for the phrase “the light book” using the Viterbi algorithm in a Hidden Markov tagging model with the following information. (12 marks, Section-II)
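The question's transition and emission tables are not reproduced here, so the sketch below runs Viterbi on made-up probabilities for “the light book” with tags D/A/N; every number is a hypothetical placeholder:

    # Viterbi decoding for a tiny HMM; all probabilities below are made up.
    words = ["the", "light", "book"]
    tags = ["D", "A", "N"]
    start = {"D": 0.8, "A": 0.1, "N": 0.1}
    trans = {"D": {"D": 0.05, "A": 0.45, "N": 0.5},
             "A": {"D": 0.05, "A": 0.15, "N": 0.8},
             "N": {"D": 0.3, "A": 0.2, "N": 0.5}}
    emit = {"D": {"the": 0.9, "light": 0.0, "book": 0.0},
            "A": {"the": 0.0, "light": 0.6, "book": 0.1},
            "N": {"the": 0.0, "light": 0.3, "book": 0.7}}

    # v[i][t] = (probability of the best path ending in tag t at word i, backpointer)
    v = [{t: (start[t] * emit[t][words[0]], None) for t in tags}]
    for i in range(1, len(words)):
        v.append({})
        for t in tags:
            prev = max(tags, key=lambda p: v[i - 1][p][0] * trans[p][t])
            v[i][t] = (v[i - 1][prev][0] * trans[prev][t] * emit[t][words[i]], prev)

    # Backtrace from the most probable final tag.
    best = max(tags, key=lambda t: v[-1][t][0])
    path = [best]
    for i in range(len(words) - 1, 0, -1):
        best = v[i][best][1]
        path.append(best)
    print(list(zip(words, reversed(path))))  # D, A, N under these toy numbers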
14 What is the difference between real words and non-words? What is an FSA, and how can inflections in words be represented using an FSA? Explain with an example. (12 marks, Section-II)
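A tiny FSA sketch that accepts a base verb plus the inflections -s and -ed (walk, walks, walked); the states and alphabet are a simplified illustration, since real morphological FSAs factor the stem lexicon and affixes into separate sub-automata:

    transitions = {
        ("q0", "w"): "q1", ("q1", "a"): "q2", ("q2", "l"): "q3", ("q3", "k"): "q4",
        ("q4", "s"): "q5",                     # walk + -s
        ("q4", "e"): "q6", ("q6", "d"): "q5",  # walk + -ed
    }
    accepting = {"q4", "q5"}  # bare stem or inflected form

    def accepts(word):
        state = "q0"
        for ch in word:
            state = transitions.get((state, ch))
            if state is None:
                return False
        return state in accepting

    for w in ["walk", "walks", "walked", "walkd"]:
        print(w, accepts(w))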
15 What are the problems of the Hidden Markov Model in predicting POS tags for a given sentence or phrase? Explain how the Baum-Welch algorithm learns the parameters: the transition matrix, the observation matrix, and the initial state distribution. (12 marks, Section-II)
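A compact sketch of one Baum-Welch (forward-backward) iteration for a two-state HMM; the observation sequence and initial parameters are arbitrary placeholders:

    import numpy as np

    obs = [0, 1, 0, 0, 1]            # observation indices (toy data)
    pi = np.array([0.6, 0.4])        # initial state distribution
    A = np.array([[0.7, 0.3],        # transition matrix A[i][j] = P(j | i)
                  [0.4, 0.6]])
    B = np.array([[0.8, 0.2],        # observation matrix B[i][k] = P(symbol k | state i)
                  [0.3, 0.7]])
    T, N = len(obs), len(pi)

    # E-step: forward (alpha) and backward (beta) probabilities.
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta = np.zeros((T, N))
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])

    likelihood = alpha[-1].sum()
    gamma = alpha * beta / likelihood        # P(state at t | observations)
    xi = np.zeros((T - 1, N, N))             # P(state i at t, state j at t+1 | obs)
    for t in range(T - 1):
        xi[t] = alpha[t][:, None] * A * B[:, obs[t + 1]] * beta[t + 1] / likelihood

    # M-step: re-estimate pi, A, B from expected counts.
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.zeros_like(B)
    for k in range(B.shape[1]):
        B_new[:, k] = gamma[np.array(obs) == k].sum(axis=0) / gamma.sum(axis=0)
    print(pi_new, A_new, B_new, sep="\n")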
16 What is smoothing in a language model? What are the advantages of smoothing? Find the Good-Turing smoothed counts for the following sentence:
“he is he is good man” (12 marks, Section-II)
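A worked sketch of simple Good-Turing counts for the question's sentence, where Nc is the number of word types seen c times and c* = (c+1)·N(c+1)/Nc:

    from collections import Counter

    tokens = "he is he is good man".split()
    counts = Counter(tokens)        # he: 2, is: 2, good: 1, man: 1
    Nc = Counter(counts.values())   # N1 = 2, N2 = 2

    def good_turing(c):
        # Adjusted count c* = (c + 1) * N(c+1) / Nc; real implementations
        # smooth the Nc values so that N(c+1) = 0 does not zero out c*.
        return (c + 1) * Nc.get(c + 1, 0) / Nc[c]

    for word, c in sorted(counts.items()):
        print(word, c, "->", good_turing(c))
    # Words seen once get c* = 2 * N2 / N1 = 2 * 2 / 2 = 2.0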
18 Establish why the maximum entropy model is better than the Hidden Markov Model. How is POS tagging achieved in a maximum entropy model? What is beam search? Explain in detail. (12 marks, Section-II)
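A minimal beam search sketch over per-word tag scores, in the style used by MEMM taggers; the log-scores here are fabricated for illustration:

    words = ["the", "light", "book"]
    scores = [{"D": -0.1, "A": -3.0, "N": -3.0},   # hypothetical per-position
              {"D": -4.0, "A": -0.7, "N": -1.2},   # log-scores for each tag
              {"D": -4.0, "A": -2.3, "N": -0.4}]

    beam_width = 2
    beam = [([], 0.0)]  # (partial tag sequence, cumulative log-score)
    for pos in range(len(words)):
        candidates = [(seq + [tag], lp + s)
                      for seq, lp in beam
                      for tag, s in scores[pos].items()]
        # Keep only the top-k hypotheses at each step.
        beam = sorted(candidates, key=lambda x: x[1], reverse=True)[:beam_width]

    print(beam[0])  # best tag sequence kept by the beam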
19 How is uniformity maintained in a maximum entropy model? Write down the maximum entropy model principles. (12 marks, Section-II)
20 Consider the maximum entropy model for POS tagging, where you want to estimate P(tag|word). In a hypothetical setting, assume that tag can take the values D, N, and V (short forms for Determiner, Noun, and Verb). The variable word could be any member of a set V of possible words, where V contains the words a, man, and sleeps, as well as additional words. The distribution should give the following probabilities:
P(D|a) = 0.9
P(N|man) = 0.9
P(V|sleeps) = 0.9
P(D|word) = 0.6 for any word other than a, man, or sleeps
P(N|word) = 0.3 for any word other than a, man, or sleeps
P(V|word) = 0.1 for any word other than a, man, or sleeps
It is assumed that all other probabilities, not defined above, can take any values such that Σ_tag P(tag|word) = 1 is satisfied for any word in V.
a. Define the features of your maximum entropy model that can model this distribution. Mark your features as f1, f2, and so on. Each feature should have the same format as explained in class.
b. For each feature fi, assume a weight λi. Now write expressions for the following probabilities in terms of your model parameters:
P(D|cat)
P(N|laughs)
P(D|man)
c. What values do the parameters in your model take to give the distribution described above (i.e., P(D|a) = 0.9, and so on)? (12 marks, Section-II)
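Without giving away the exercise, a sketch of the log-linear form the answer plugs into, P(tag|word) = exp(Σi λi·fi(word, tag)) / Z(word); the feature and weight below are placeholders, not the solution:

    import math

    def maxent_prob(word, tag, features, weights, tags=("D", "N", "V")):
        # P(tag | word) = exp(sum_i lambda_i * f_i(word, tag)) / Z(word)
        def score(t):
            return math.exp(sum(lam * f(word, t) for f, lam in zip(features, weights)))
        return score(tag) / sum(score(t) for t in tags)

    # Placeholder feature, not the exercise's answer: fires for (word=a, tag=D).
    f1 = lambda word, tag: 1.0 if word == "a" and tag == "D" else 0.0
    print(maxent_prob("a", "D", [f1], [2.0]))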
21 What is syntax? What is parsing? What is the difference between a derivation and a parse tree? What is constituency? Write down the different forms of constituency with examples. What is the significance of the “head” of a constituent? Explain. (12 marks, Section-III)
22 What is the difference between top-down and bottom-up parsing? Apply the CYK algorithm to parse the sentence “a pilot likes flying planes” with the given grammar. (12 marks, Section-III)
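The question's grammar is not reproduced here, so the sketch below runs CYK recognition on “a pilot likes flying planes” with a hypothetical CNF grammar whose rules are invented for illustration:

    from itertools import product

    grammar = {                    # binary CNF rules: (B, C) -> A means A -> B C
        ("NP", "VP"): "S",
        ("Det", "N"): "NP",
        ("V", "NP"): "VP",
        ("Adj", "N"): "NP",
    }
    lexicon = {"a": {"Det"}, "pilot": {"N"}, "likes": {"V"},
               "flying": {"Adj"}, "planes": {"N"}}

    words = "a pilot likes flying planes".split()
    n = len(words)
    # table[i][j] = set of nonterminals deriving words[i..j]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, w in enumerate(words):
        table[i][i] = set(lexicon[w])

    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):              # split point
                for B, C in product(table[i][k], table[k + 1][j]):
                    if (B, C) in grammar:
                        table[i][j].add(grammar[(B, C)])

    print("S" in table[0][n - 1])  # True iff the sentence is derivable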
23 What is inside-outside probability? Apply the CYK algorithm to parse the sentence “a pilot likes flying planes” with the given probabilistic context-free grammar to find the most probable parse tree. Find the inside probabilities for each word of the following sentence:
“Astronomers saw stars with ears” (12 marks, Section-III)
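A sketch of the inside-probability computation over CYK spans for that sentence, using a toy PCFG loosely in the style of the classic textbook example; all rule probabilities here are illustrative placeholders, not the question's grammar:

    from collections import defaultdict

    rules = {   # (B, C) -> list of (A, prob) for binary rules A -> B C
        ("NP", "VP"): [("S", 1.0)],
        ("V", "NP"): [("VP", 0.7)],
        ("VP", "PP"): [("VP", 0.3)],
        ("P", "NP"): [("PP", 1.0)],
        ("NP", "PP"): [("NP", 0.4)],
    }
    lexicon = { # word -> list of (A, prob) for unary rules A -> word
        "astronomers": [("NP", 0.1)],
        "saw": [("V", 1.0), ("NP", 0.04)],
        "stars": [("NP", 0.18)],
        "with": [("P", 1.0)],
        "ears": [("NP", 0.18)],
    }

    words = "astronomers saw stars with ears".split()
    n = len(words)
    # inside[(i, j)][A] = total probability that A derives words[i..j]
    inside = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for A, p in lexicon[w]:
            inside[(i, i)][A] += p

    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                for B, pb in inside[(i, k)].items():
                    for C, pc in inside[(k + 1, j)].items():
                        for A, pr in rules.get((B, C), []):
                            inside[(i, j)][A] += pr * pb * pc

    print(dict(inside[(0, n - 1)]))  # inside probability of S over the whole sentence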
32 What is a word space? Write down the steps to create a word space, explain them with an example, and show how it can be used to measure word similarities. (12 marks, Section-IV)
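A minimal word-space sketch: build a co-occurrence matrix from a tiny corpus and compare words by cosine similarity; the corpus and window size are arbitrary:

    import math
    from collections import defaultdict

    corpus = ["the cat sat on the mat",
              "the dog sat on the rug",
              "a cat chased a dog"]
    window = 2

    # Step 1: count how often each word co-occurs with its neighbours.
    cooc = defaultdict(lambda: defaultdict(int))
    for sent in corpus:
        toks = sent.split()
        for i, w in enumerate(toks):
            for j in range(max(0, i - window), min(len(toks), i + window + 1)):
                if j != i:
                    cooc[w][toks[j]] += 1

    # Step 2: compare context vectors with cosine similarity.
    def cosine(u, v):
        dot = sum(u[k] * v.get(k, 0) for k in u)
        nu = math.sqrt(sum(x * x for x in u.values()))
        nv = math.sqrt(sum(x * x for x in v.values()))
        return dot / (nu * nv)

    print(cosine(cooc["cat"], cooc["dog"]))  # similar contexts -> higher score
    print(cosine(cooc["cat"], cooc["on"]))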
33 How can weights be measured based on context? Derive the formulation for weight measurement. What is the difference between attributional and relational similarity? (12 marks, Section-IV)
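One standard context-based weighting is pointwise mutual information, PMI(w, c) = log[P(w, c) / (P(w)·P(c))]; a sketch over raw co-occurrence counts, where the count data is a made-up placeholder:

    import math

    count = {("cat", "pet"): 20, ("cat", "the"): 40,
             ("dog", "pet"): 5, ("dog", "the"): 55}
    total = sum(count.values())
    w_tot, c_tot = {}, {}
    for (w, c), n in count.items():
        w_tot[w] = w_tot.get(w, 0) + n
        c_tot[c] = c_tot.get(c, 0) + n

    def pmi(w, c):
        # PMI(w, c) = log( P(w, c) / (P(w) * P(c)) )
        return math.log((count[(w, c)] / total) /
                        ((w_tot[w] / total) * (c_tot[c] / total)))

    print(pmi("cat", "pet"))  # informative context weighted above frequent "the"
    print(pmi("cat", "the"))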
34 What is one-hot encoding? How can words be represented using one-hot encoding? Explain with an example. What are the limitations of one-hot encoding? Explain with an example. (12 marks, Section-IV)
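A quick one-hot sketch over a small vocabulary; note that every pair of distinct one-hot vectors is orthogonal, which is exactly the limitation the question asks about:

    vocab = ["cat", "dog", "mat", "sat"]
    index = {w: i for i, w in enumerate(vocab)}

    def one_hot(word):
        vec = [0] * len(vocab)
        vec[index[word]] = 1
        return vec

    print(one_hot("cat"))  # [1, 0, 0, 0]
    print(one_hot("dog"))  # [0, 1, 0, 0]
    # The dot product of any two distinct one-hot vectors is 0, so
    # "cat" looks exactly as unrelated to "dog" as it does to "mat".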
35 What is CBOW? How is CBOW used to embed words? Explain with an example. What is the difference between skip-gram and CBOW? (12 marks, Section-IV)
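A brief CBOW sketch using gensim's Word2Vec, where sg=0 selects CBOW and sg=1 would train skip-gram; the toy corpus and hyperparameters are arbitrary:

    from gensim.models import Word2Vec

    sentences = [["the", "cat", "sat", "on", "the", "mat"],
                 ["the", "dog", "sat", "on", "the", "rug"],
                 ["a", "cat", "chased", "a", "dog"]]

    # sg=0 -> CBOW: predict the centre word from its context window;
    # sg=1 -> skip-gram: predict the context from the centre word.
    model = Word2Vec(sentences, vector_size=20, window=2, min_count=1, sg=0, epochs=50)
    print(model.wv.most_similar("cat", topn=2))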
42 What are the main stages of text summarization? How can salient words be identified? How can sentences be weighted? (12 marks, Section-V)
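A minimal extractive sketch: score words by frequency as a simple notion of salience, then weight sentences by the scores of the words they contain; the document and stopword list are illustrative:

    from collections import Counter

    document = ["the model improves accuracy on the benchmark",
                "we describe the model architecture",
                "weather was pleasant during the conference"]
    stopwords = {"the", "on", "we", "was", "during"}

    # Salient words: frequent content words across the document.
    word_freq = Counter(w for s in document for w in s.split() if w not in stopwords)

    # Sentence weight: sum of its words' salience scores.
    def weight(sentence):
        return sum(word_freq[w] for w in sentence.split() if w not in stopwords)

    ranked = sorted(document, key=weight, reverse=True)
    print(ranked[0])  # the top-weighted sentence would head an extractive summary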
43 How can sentences be simplified? Explain with an example. How can summarization systems be evaluated? What is ROUGE, and how is it used for system evaluation? (12 marks, Section-V)
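A hand-rolled ROUGE-1 recall sketch (unigram overlap between a system summary and a reference); real evaluations use the official ROUGE toolkit, this just shows the idea:

    from collections import Counter

    def rouge1_recall(system, reference):
        sys_counts = Counter(system.split())
        ref_counts = Counter(reference.split())
        # Clipped unigram matches divided by the reference length.
        overlap = sum(min(c, sys_counts[w]) for w, c in ref_counts.items())
        return overlap / sum(ref_counts.values())

    reference = "the cat sat on the mat"
    system = "the cat lay on the mat"
    print(rouge1_recall(system, reference))  # 5 of 6 reference unigrams matched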
44 What is text classification? What kinds of problems can be solved using text classification? How can text classification problems be solved? (12 marks, Section-V)
45 Discuss the different types of text classification tasks, including binary, multi-class, and hierarchical classification. (12 marks, Section-V)
46 Discuss the evaluation of text classifiers using metrics like accuracy, precision, recall, and F1-score. (12 marks, Section-V)
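A quick look at the four metrics on toy predictions using scikit-learn; the labels are fabricated:

    from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

    y_true = [1, 0, 1, 1, 0, 1, 0, 0]   # hypothetical gold labels
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]   # hypothetical classifier output

    print("accuracy :", accuracy_score(y_true, y_pred))
    print("precision:", precision_score(y_true, y_pred))
    print("recall   :", recall_score(y_true, y_pred))
    print("f1       :", f1_score(y_true, y_pred))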
49 Describe the application of machine learning algorithms like Naive Bayes, support vector machines (SVMs), and random forests in text classification. (12 marks, Section-V)
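A compact scikit-learn sketch comparing Naive Bayes, a linear SVM, and a random forest on a tiny bag-of-words task; the six training texts and their labels are invented placeholders:

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.svm import LinearSVC
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.pipeline import make_pipeline

    texts = ["great movie loved it", "terrible film waste of time",
             "wonderful acting superb plot", "boring and awful",
             "enjoyed every minute", "worst movie ever"]
    labels = [1, 0, 1, 0, 1, 0]   # 1 = positive, 0 = negative (toy data)

    for clf in (MultinomialNB(), LinearSVC(), RandomForestClassifier()):
        model = make_pipeline(CountVectorizer(), clf)
        model.fit(texts, labels)
        print(type(clf).__name__, model.predict(["loved the plot", "awful film"]))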