NLP Assignment-7 Solution
Assignment 7
Type of Question: MCQ
===================================================
Question 1: Suppose you have a raw text corpus and you compute a word co-occurrence matrix from it. Which of the following algorithm(s) can you utilize
to learn word representations? (Choose all that apply) [1 mark]
a. CBOW
b. SVM
c. PCA
d. Bagging
Answer: a, c
Solution: CBOW learns word embeddings directly from context windows, and PCA applied to the co-occurrence matrix yields dense word representations. SVM and Bagging are supervised classification/ensemble methods, not representation learners.
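As a minimal sketch of the PCA option: apply SVD to a (centered) co-occurrence matrix and keep the top principal directions as low-dimensional word vectors. The vocabulary and counts below are toy values chosen for illustration, not real corpus data.

```python
import numpy as np

# Toy 5-word co-occurrence matrix (symmetric; illustrative counts only)
words = ["cat", "dog", "pet", "car", "road"]
C = np.array([
    [0, 4, 5, 0, 0],
    [4, 0, 6, 1, 0],
    [5, 6, 0, 0, 1],
    [0, 1, 0, 0, 7],
    [0, 0, 1, 7, 0],
], dtype=float)

# PCA: center the rows, then project onto the top-k principal directions
k = 2
C_centered = C - C.mean(axis=0)
U, S, Vt = np.linalg.svd(C_centered, full_matrices=False)
embeddings = C_centered @ Vt[:k].T  # one k-dimensional vector per word

for w, e in zip(words, embeddings):
    print(f"{w:>5}: {np.round(e, 2)}")
```

Each word ends up with a k-dimensional vector; related words (e.g., cat/dog/pet vs. car/road) land near each other because their co-occurrence rows are similar.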
===================================================
Question 2: Using word vectors, what is the method for solving word analogy questions of the form A : B :: C : D, where A, B, and D are given and C is to be found? [1 mark]
a. vc = va + (vb − vd), then use cosine similarity to find the closest word of vc.
b. vc = va + (vd − vb) then do dictionary lookup for vc
c. vc = vd + (va − vb) then use cosine similarity to find the closest word of vc.
d. vc = vd + (va − vb) then do dictionary lookup for vc.
e. None of the above
Answer: c
Solution: A : B :: C : D implies vb − va = vd − vc, i.e.,
vc = vd + (va − vb); then use cosine similarity to find the word closest to vc.
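The solution above can be sketched in code: form vc = vd + (va − vb) and return the vocabulary word with the highest cosine similarity to it, excluding the query words. The toy embeddings are assumed values for illustration, not vectors from a trained model.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

def solve_analogy(va, vb, vd, vocab, exclude=()):
    """Return the word whose vector is closest (by cosine) to vc = vd + (va - vb)."""
    vc = vd + (va - vb)
    best_word, best_sim = None, -1.0
    for word, vec in vocab.items():
        if word in exclude:
            continue
        sim = cosine(vc, vec)
        if sim > best_sim:
            best_word, best_sim = word, sim
    return best_word

# Toy embeddings (illustrative values only)
vocab = {
    "king":  np.array([0.8, 0.9, 0.1]),
    "man":   np.array([0.7, 0.1, 0.1]),
    "queen": np.array([0.2, 0.9, 0.8]),
    "woman": np.array([0.1, 0.1, 0.8]),
    "apple": np.array([0.5, 0.5, 0.5]),
}

# A = man, B = king, D = queen; expect C = woman
print(solve_analogy(vocab["man"], vocab["king"], vocab["queen"],
                    vocab, exclude=("man", "king", "queen")))  # woman
```

Excluding the three query words is standard practice, since vc is usually closest to one of them.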
===================================================
Question 3: What is the value of PMI(w1, w2) for C(w1) = 250, C(w2) = 1000,
C(w1, w2) = 160, N = 100000?
N: total number of documents.
C(wi): number of documents in which wi appears.
C(wi, wj): number of documents in which both words appear.
Note: Use base 2 in the logarithm. [1 mark]
a. 4
b. 5
c. 6
d. 5.64
Answer: c
Solution: PMI(w1, w2) = log2( P(w1, w2) / (P(w1) · P(w2)) ) = log2( (C(w1, w2)/N) / ((C(w1)/N) · (C(w2)/N)) ) = log2( (160 × 100000) / (250 × 1000) ) = log2(64) = 6
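The arithmetic can be checked with a few lines of Python; the N's partially cancel, leaving a single ratio inside the log:

```python
import math

def pmi(c_w1, c_w2, c_w1w2, n):
    """PMI(w1, w2) = log2( P(w1, w2) / (P(w1) * P(w2)) ), from document counts.

    Equivalent to log2( C(w1, w2) * N / (C(w1) * C(w2)) ) after the N's cancel.
    """
    return math.log2((c_w1w2 * n) / (c_w1 * c_w2))

print(pmi(250, 1000, 160, 100_000))  # 6.0
```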
===================================================
a. 6/11, 3/8
b. 10/11, 5/6
c. 4/9, 2/7
d. 5/9, 5/8
Answer: a
Solution:
===================================================
a. 4.704, 1.720
b. 1.692, 0.553
c. 2.246, 1.412
d. 3.213, 2.426
Answer: c
Solution:
===================================================
a. 0.773, 0.412
b. 0.881, 0.764
c. 0.987, 0.914
d. 0.897, 0.315
Answer: c
Solution:
cosine-sim(w1, w2) = (2*4 + 8*9 + 5*7) / (√(2² + 8² + 5²) * √(4² + 9² + 7²)) = 115 / (√93 * √146) ≈ 0.987
cosine-sim(w1, w3) = (2*1 + 8*2 + 5*3) / (√(2² + 8² + 5²) * √(1² + 2² + 3²)) = 33 / (√93 * √14) ≈ 0.914
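The two similarities can be verified with a small script using the vectors w1 = (2, 8, 5), w2 = (4, 9, 7), w3 = (1, 2, 3) from the solution:

```python
import math

def cosine_sim(u, v):
    """Cosine similarity: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

w1, w2, w3 = [2, 8, 5], [4, 9, 7], [1, 2, 3]
print(cosine_sim(w1, w2))  # ≈ 0.987
print(cosine_sim(w1, w3))  # ≈ 0.914
```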
===================================================
Answer: 1
Solution: Word vectors learned using CBOW or Skip-gram models cannot disambiguate antonyms or polysemous words.
===================================================