NLP Endsem 2016
1. (a) Use dynamic programming to compute the edit distance between words
WRONG and WINGS, assuming that the costs of insertion and deletion are 2 and
the cost of substitution is 1. [6]
(b) Assume that there are only two senses of the word bank (one pertaining to the
financial sense, and the other to the “river bank” sense) in WordNet. In a given
piece of raw text, the distributional neighbours of bank are {account, deposit,
river}. Given that each of these neighbours can have multiple senses as well, how
would you go about assigning dominance-based ranks to the two senses of bank?
Show the steps in detail. [4]
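Note on 1(a): the dynamic-programming table can be checked with a short sketch such as the one below. It assumes the usual Wagner-Fischer recurrence with the costs stated in the question (insertion 2, deletion 2, substitution 1, match 0). The function name edit_distance_table and its parameters are illustrative and not part of the question.

```python
def edit_distance_table(source, target, ins_cost=2, del_cost=2, sub_cost=1):
    """Return the full DP table; table[i][j] is the cheapest way to turn
    source[:i] into target[:j] under the given operation costs."""
    rows, cols = len(source) + 1, len(target) + 1
    table = [[0] * cols for _ in range(rows)]

    # First column: delete every character of the source prefix.
    for i in range(1, rows):
        table[i][0] = i * del_cost
    # First row: insert every character of the target prefix.
    for j in range(1, cols):
        table[0][j] = j * ins_cost

    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            table[i][j] = min(
                table[i - 1][j] + del_cost,                      # deletion
                table[i][j - 1] + ins_cost,                      # insertion
                table[i - 1][j - 1] + (0 if same else sub_cost)  # match or substitution
            )
    return table
```

Calling edit_distance_table("WRONG", "WINGS") gives the table row by row; the bottom-right cell is the edit distance.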
2. Consider a Machine Translation parallel corpus having three sentence pairs. The
first sentence pair is “go there fast”/“jaldi udhar jaao”. The second sentence pair is
“go there”/“udhar jaao”. The third sentence pair is “go”/“jaao”. (a) Show how the
first few iterations of EM are useful in learning word alignments from this corpus.
Make clear any simplifying assumptions on top of IBM Model 3. (b) How is extra
knowledge “getting generated” in successive iterations of EM? [8+2]
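Note on 2(a): one possible set of simplifying assumptions is to drop fertility, distortion and the NULL word, which reduces the model to IBM Model 1 with only lexical translation probabilities t(f|e). The sketch below runs EM under that assumption on the three sentence pairs. The names CORPUS and em_model1 are illustrative, and this is only one of several acceptable simplifications.

```python
from collections import defaultdict

# The three sentence pairs from the question: (English, Hindi).
CORPUS = [
    ("go there fast".split(), "jaldi udhar jaao".split()),
    ("go there".split(), "udhar jaao".split()),
    ("go".split(), "jaao".split()),
]

def em_model1(corpus, iterations=10):
    """Estimate t(f|e) with EM, assuming IBM Model 1 (no NULL, no fertility,
    no distortion). Returns a dict keyed by (f, e)."""
    e_vocab = {e for es, _ in corpus for e in es}
    f_vocab = {f for _, fs in corpus for f in fs}
    # Uniform initialisation: no alignment knowledge at the start.
    t = {(f, e): 1.0 / len(f_vocab) for f in f_vocab for e in e_vocab}

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e being aligned at all
        # E-step: distribute each f over the e words of its own sentence.
        for es, fs in corpus:
            for f in fs:
                norm = sum(t[(f, e)] for e in es)
                for e in es:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalise the expected counts per English word.
        for (f, e) in t:
            t[(f, e)] = count[(f, e)] / total[e]
    return t
```

Inspecting t after one, two and three iterations shows how the one-word pair “go”/“jaao” sharpens t(jaao|go) first, and how that in turn pushes probability mass towards the remaining pairings in the longer sentences.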
3. What limitations of the basic parsing techniques does the CYK parser address? Are
there assumptions on the grammar rules that CYK can deal with? If yes, what
are they? Given the grammar below and the input sentence “w = (()(()))”, show the
steps in chart parsing using CYK. Alongside your charts showing each step,
state clearly which rule(s), if any, are used to advance to that step from
the previous one. [1.5 + 1.5 + 7]
S → SS
S → (S1
S1 → S)
S → ()
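Note on question 3: the chart can be cross-checked with a sketch like the one below. It assumes that every rule has exactly two right-hand-side symbols (as in the grammar above) and that each length-1 cell holds the input terminal itself, so that rules mixing terminals and non-terminals can be applied without first introducing pre-terminal rules. The names RULES and cyk_chart are illustrative.

```python
# The grammar from the question, as (left-hand side, (first, second)) pairs.
RULES = [
    ("S", ("S", "S")),
    ("S", ("(", "S1")),
    ("S1", ("S", ")")),
    ("S", ("(", ")")),
]

def cyk_chart(tokens, rules=RULES, start="S"):
    """Fill a CYK chart; chart[i][k] holds the symbols spanning tokens[i:k].
    Returns (accepted, chart)."""
    n = len(tokens)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]

    # Spans of length 1: seed each cell with the terminal itself (assumption).
    for i, tok in enumerate(tokens):
        chart[i][i + 1].add(tok)

    # Longer spans: try every split point j and every binary rule.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            k = i + span
            for j in range(i + 1, k):
                for lhs, (left, right) in rules:
                    if left in chart[i][j] and right in chart[j][k]:
                        chart[i][k].add(lhs)

    return start in chart[0][n], chart
```

Calling cyk_chart("(()(()))") returns the acceptance flag together with the chart, whose cells can be compared span by span against a hand-drawn chart.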
== The End ==