Assignment 11
Type of Questions: MCQ
Question 1: What are the ideal qualities of a summary in automatic extractive text
summarization?
Answer: 4
Solution: Maximum relevance to the theme is the basic goal of any summarization
method. Redundancy among the information included in a summary should be minimal
so that the summary stays brief.
1.
\[
\tilde{M} = \begin{bmatrix}
0.000 & 0.500 & 0.400 & 0.200 & 0.000 \\
0.500 & 0.000 & 0.020 & 0.000 & 0.100 \\
0.400 & 0.020 & 0.000 & 0.000 & 0.400 \\
0.200 & 0.000 & 0.000 & 0.000 & 0.300 \\
0.000 & 0.100 & 0.400 & 0.300 & 0.000
\end{bmatrix}
\]
2.
\[
\tilde{M} = \begin{bmatrix}
0.000 & 0.455 & 0.364 & 0.182 & 0.000 \\
0.806 & 0.000 & 0.032 & 0.000 & 0.161 \\
0.488 & 0.024 & 0.000 & 0.000 & 0.488 \\
0.400 & 0.000 & 0.000 & 0.000 & 0.600 \\
0.000 & 0.125 & 0.500 & 0.375 & 0.000
\end{bmatrix}
\]
3.
\[
\tilde{M} = \begin{bmatrix}
0.000 & 0.806 & 0.488 & 0.400 & 0.000 \\
0.455 & 0.000 & 0.024 & 0.000 & 0.125 \\
0.364 & 0.032 & 0.000 & 0.000 & 0.500 \\
0.182 & 0.000 & 0.000 & 0.000 & 0.375 \\
0.000 & 0.161 & 0.488 & 0.600 & 0.000
\end{bmatrix}
\]
Answer: 2
Solution: To make M row-stochastic, normalize each row by the sum of its elements:
\[
\tilde{M}_{i,j} = \frac{M_{i,j}}{\sum_{k} M_{i,k}}
\]
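The row normalization can be checked with a few lines of Python. This is a minimal sketch, not the course's reference solution: the matrix M is assumed to be the one listed under option 1 (the original Equation 1 is not reproduced in this text), and `row_normalize` is a hypothetical helper name.

```python
# Similarity matrix M, ASSUMED to be the matrix listed under option 1.
M = [
    [0.000, 0.500, 0.400, 0.200, 0.000],
    [0.500, 0.000, 0.020, 0.000, 0.100],
    [0.400, 0.020, 0.000, 0.000, 0.400],
    [0.200, 0.000, 0.000, 0.000, 0.300],
    [0.000, 0.100, 0.400, 0.300, 0.000],
]


def row_normalize(matrix):
    """Divide every entry by the sum of its row so that each row sums to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        if total == 0:
            raise ValueError("cannot normalize an all-zero row")
        normalized.append([value / total for value in row])
    return normalized


M_tilde = row_normalize(M)
for row in M_tilde:
    # Print with three decimals, the precision used in the options.
    print(" ".join(f"{value:.3f}" for value in row))
```

Under that assumption, the printed rows can be compared entry by entry against the three options.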
Question 3: Find the PageRank values for the 5 sentence nodes mentioned in the above
question (with similarity values as per the matrix in Equation 1). Use µ = 1.0.
[HINT: Use an online matrix multiplier or MATLAB/Python to solve this problem.]
1. 0.286 0.161 0.214 0.130 0.208
2. 0.251 0.173 0.201 0.163 0.212
3. 0.061 0.072 0.226 0.324 0.317
4. 0.310 0.031 0.207 0.348 0.104
Answer: 1
Solution: Initial values of v can be any randomly initialized probability vector.
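For µ = 1.0 the answer can also be cross-checked by hand. The sketch below rests on two assumptions that this text does not state explicitly: that the update has the common form v = µ M̃ᵀ v + ((1 − µ)/N) 1, and that M is the symmetric matrix listed under option 1 of Question 2. With µ = 1 and a symmetric M, the stationary vector is proportional to the row sums d_i of M.

```latex
% Assumed update: v = \mu \tilde{M}^{\top} v + \frac{1-\mu}{N}\mathbf{1}, with \mu = 1.
% For symmetric M with row sums d_i = \sum_k M_{i,k}:
\[
\sum_i \frac{d_i}{\sum_k d_k}\,\tilde{M}_{i,j}
  = \sum_i \frac{d_i}{\sum_k d_k}\cdot\frac{M_{i,j}}{d_i}
  = \frac{\sum_i M_{i,j}}{\sum_k d_k}
  = \frac{d_j}{\sum_k d_k}
\]
% Row sums of the assumed M: d = (1.10, 0.62, 0.82, 0.50, 0.80), total 3.84.
\[
v = \frac{1}{3.84}\,(1.10,\ 0.62,\ 0.82,\ 0.50,\ 0.80)
  \approx (0.286,\ 0.161,\ 0.214,\ 0.130,\ 0.208)
\]
```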
Question 4: Find the PageRank values for the 5 sentence nodes mentioned in the above
question (with similarity values as per the matrix in Equation 1). Use µ = 0.5.
[HINT: Use an online matrix multiplier or MATLAB/Python to solve this problem.]
1. 0.286 0.161 0.214 0.130 0.208
2. 0.251 0.173 0.201 0.163 0.212
3. 0.061 0.072 0.226 0.324 0.317
4. 0.310 0.031 0.207 0.348 0.104
Answer: 2
Solution: Initial values of v can be any randomly initialized probability vector.
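The hint suggests solving this numerically; a power-iteration sketch in plain Python follows. It is an illustration under stated assumptions rather than the lecture's exact procedure: the update is assumed to be v ← µ M̃ᵀ v + (1 − µ)/N, M is assumed to be the matrix listed under option 1 of Question 2, and `pagerank`, `tol` and `max_iter` are hypothetical names. The iteration starts from the uniform vector, which is one valid choice of initial probability vector.

```python
# Similarity matrix M, ASSUMED to be the matrix listed under option 1 of Question 2.
M = [
    [0.000, 0.500, 0.400, 0.200, 0.000],
    [0.500, 0.000, 0.020, 0.000, 0.100],
    [0.400, 0.020, 0.000, 0.000, 0.400],
    [0.200, 0.000, 0.000, 0.000, 0.300],
    [0.000, 0.100, 0.400, 0.300, 0.000],
]


def pagerank(similarity, mu, tol=1e-12, max_iter=100_000):
    """Power iteration for the ASSUMED update v <- mu * M~^T v + (1 - mu) / N."""
    n = len(similarity)
    # Row-normalize the similarity matrix so that each row sums to 1.
    m_tilde = [[value / sum(row) for value in row] for row in similarity]
    v = [1.0 / n] * n  # uniform initial probability vector
    for _ in range(max_iter):
        new_v = [
            mu * sum(m_tilde[i][j] * v[i] for i in range(n)) + (1.0 - mu) / n
            for j in range(n)
        ]
        change = max(abs(a - b) for a, b in zip(new_v, v))
        v = new_v
        if change < tol:
            break
    return v


for mu in (1.0, 0.5):
    print(mu, [round(x, 3) for x in pagerank(M, mu)])
```

With µ = 1.0 the graph is nearly bipartite, so convergence is slow; the generous `max_iter` is there for that case. The printed vectors can be compared against the listed options for each value of µ.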
1. 0.251, 0.012
2. 0.251, 0.212
3. 0.286, 0.208
4. 0.324, 0.317
Answer: 2
Solution: Refer to Lecture 52
Question 6: What are the goals of using the PageRank algorithm and Maximal Marginal
Relevance for summarization?
Answer: 2
Solution:
Question 7: Consider the following test case for summarizing a document with 4
sentences, t1, t2, t3, t4. Their relevance scores are 3, 3, 2.8, and 2.8, respectively. The
redundancy scores between each pair of sentences are given by the matrix below.
\[
\mathrm{Red} = \begin{bmatrix}
0   & 0.5 & 0.5 & 0.5 \\
0.5 & 0   & 0.2 & 0.1 \\
0.5 & 0.2 & 0   & 0   \\
0.5 & 0.1 & 0   & 0
\end{bmatrix}
\]
If the length of each sentence is L and we want a summary of length 2L, find out the
optimal extractive summary.
1. t1, t3
2. t3, t4
3. t2, t4
4. t1, t2
Answer: 3
Solution: To find the best two-sentence summary, enumerate all pairs of sentences and
compute each pair's summary score with the formula
\[
s(S) = \sum_{t_i \in S} \mathrm{Rel}(i) \; - \sum_{t_i, t_j \in S;\ i < j} \mathrm{Red}(i, j)
\]
The pair t2, t4 has the highest summary score: 3 + 2.8 − 0.1 = 5.7.
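The enumeration is small enough to carry out by hand, but it can also be sketched in Python. The relevance scores and redundancy matrix are taken from the question; the variable and function names (`relevance`, `red`, `summary_score`) are illustrative only.

```python
from itertools import combinations

names = ["t1", "t2", "t3", "t4"]
relevance = [3.0, 3.0, 2.8, 2.8]
red = [
    [0.0, 0.5, 0.5, 0.5],
    [0.5, 0.0, 0.2, 0.1],
    [0.5, 0.2, 0.0, 0.0],
    [0.5, 0.1, 0.0, 0.0],
]


def summary_score(indices):
    """s(S) = sum of Rel(i) over S minus sum of Red(i, j) over pairs i < j in S."""
    rel = sum(relevance[i] for i in indices)
    # combinations() on an increasing index tuple yields each pair once with i < j.
    redundancy = sum(red[i][j] for i, j in combinations(indices, 2))
    return rel - redundancy


# A summary of length 2L holds exactly two sentences of length L.
scores = {pair: summary_score(pair) for pair in combinations(range(len(names)), 2)}
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print([names[i] for i in pair], round(score, 2))

best = max(scores, key=scores.get)
best_names = tuple(names[i] for i in best)
```

Exhaustive enumeration is only practical for tiny inputs like this one; it is used here because it mirrors the solution's reasoning directly.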
Question 8: Consider the system generated summary (S) and the reference sum-
maries as follows:
S : the cat was sleeping.
R1 : the cat was sleeping under the bed
R2 : the cat was found under the bed
R3 : the cat was under the bed, sleeping.
What are the ROUGE-1 and ROUGE-2 recall values for the given summary with
respect to the references?
1. 1.0, 1.0
2. 0.524, 0.389
3. 0.524, 0.611
4. 0.579, 0.366
Answer: 2
Solution:
\[
\text{ROUGE-1} = \frac{4 + 3 + 4}{7 + 7 + 7} = \frac{11}{21} \approx 0.524
\]
\[
\text{ROUGE-2} = \frac{3 + 2 + 2}{6 + 6 + 6} = \frac{7}{18} \approx 0.389
\]
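The counts in the fractions above can be reproduced with a short Python sketch. It assumes lowercase word tokens with punctuation stripped and clipped n-gram matching (a reference n-gram is matched at most as many times as it occurs in the system summary); `ngrams` and `rouge_n_recall` are hypothetical helper names, not a library API.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Count the word n-grams of a text (lowercased, punctuation dropped)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n):
    """Matched reference n-grams divided by total reference n-grams, over all references."""
    cand_counts = ngrams(candidate, n)
    matched = 0
    total = 0
    for reference in references:
        ref_counts = ngrams(reference, n)
        # Clip each match by the n-gram's count in the candidate summary.
        matched += sum(min(count, cand_counts[gram]) for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total


S = "the cat was sleeping."
references = [
    "the cat was sleeping under the bed",
    "the cat was found under the bed",
    "the cat was under the bed, sleeping.",
]

rouge_1 = rouge_n_recall(S, references, 1)
rouge_2 = rouge_n_recall(S, references, 2)
print(round(rouge_1, 3), round(rouge_2, 3))
```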