Quiz IRS17 3A Answers
Name __________________________________________________________________________________
Student ID______________________________________________________________________________
Multiple Choice Questions: Please choose only one answer per question.
Q3 Select the most efficient processing order for the Boolean query Q.
Q: trees AND marmalade AND eyes.
Term         Doc. Freq
eyes         213,312
marmalade    107,913
trees        316,812
(marmalade AND eyes) first, then merge with trees.
(marmalade AND trees) first, then merge with eyes.
(trees AND eyes) first, then merge with marmalade.
Any combination would result in the same number of operations.
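Worked note for Q3: a common heuristic for conjunctive queries is to intersect posting lists in order of increasing document frequency, so that intermediate results stay as small as possible. The sketch below applies that heuristic to the table above. The function names and the toy posting lists are hypothetical and serve only as illustration.

```python
def processing_order(doc_freq):
    """Return the terms sorted by ascending document frequency."""
    return sorted(doc_freq, key=doc_freq.get)


def intersect(p1, p2):
    """Linear merge of two sorted posting lists (lists of doc IDs)."""
    i = j = 0
    result = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return result


# Document frequencies taken from the table in Q3.
doc_freq = {"eyes": 213312, "marmalade": 107913, "trees": 316812}
print(processing_order(doc_freq))  # ['marmalade', 'eyes', 'trees']

# Toy posting lists (hypothetical doc IDs) to show the merge itself.
print(intersect([1, 3, 5, 7], [3, 4, 7, 9]))  # [3, 7]
```

Under this heuristic the two shortest lists (marmalade and eyes) are merged first, and the result is then merged with trees.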
Introduction to Information Retrieval (CS 121 / Inf 141)
Quiz #3 Permutation A - 05/23/2017 WITH ANSWERS
Q6 Find the Jaccard coefficient (Jc) for the query and documents below.
Query: top university (set q)
Doc 1: university of California (set d1)
Doc 2: best university in USA (set d2)
Jc(q,d1)=1/4, Jc(q,d2)=1/5
Jc(q,d1)=0, Jc(q,d2)=1/6
Jc(q,d1)=1/5, Jc(q,d2)=1/6
Jc(q,d1)=1/5, Jc(q,d2)=0
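Worked note for Q6: the Jaccard coefficient of two sets is the size of their intersection divided by the size of their union. The sketch below computes it for the query and documents above, assuming lowercased whitespace tokenization (the quiz does not specify a tokenizer). The function name `jaccard` is illustrative.

```python
from fractions import Fraction


def jaccard(text_a, text_b):
    """Jaccard coefficient of the term sets of two strings."""
    # Assumption: terms are lowercased, whitespace-separated tokens.
    set_a = set(text_a.lower().split())
    set_b = set(text_b.lower().split())
    union = set_a | set_b
    if not union:
        return Fraction(0)
    return Fraction(len(set_a & set_b), len(union))


q = "top university"
d1 = "university of California"
d2 = "best university in USA"

print(jaccard(q, d1))  # 1/4
print(jaccard(q, d2))  # 1/5
```

In both cases the intersection is {university}. The union has 4 terms for d1 and 5 terms for d2.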
Q8 Mark the false statement with regard to term frequency (tf).
The tf is the number of times that a term occurs in a document.
Relevance of a term in a document increases proportionally with its tf.
The tf of a query is the sum of the tf of each of the terms in the query.
The tf of a query is 0 if none of the query terms is present in the document.
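Worked note for Q8: one widely used way to dampen raw counts is log-frequency weighting, w = 1 + log10(tf) for tf > 0 and w = 0 otherwise. The sketch below shows that under this weighting a tenfold increase in tf does not give a tenfold increase in weight. The function names and the term counts in `doc_tf` are hypothetical.

```python
import math


def log_tf_weight(tf):
    """Log-frequency weight: 1 + log10(tf) if tf > 0, else 0."""
    return 1 + math.log10(tf) if tf > 0 else 0.0


def overlap_score(query_terms, doc_tf):
    """Sum of log-tf weights over the query terms found in the document."""
    return sum(log_tf_weight(doc_tf.get(term, 0)) for term in query_terms)


# Hypothetical term counts for a single document.
doc_tf = {"university": 10, "california": 1}

for tf in (0, 1, 10, 100):
    print(tf, log_tf_weight(tf))

print(overlap_score(["top", "university"], doc_tf))
print(overlap_score(["marmalade"], doc_tf))  # no query term present
```

A query whose terms are all absent from the document scores 0, since every term contributes a weight of 0.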
Q9 Mark the false statement with regard to document frequency (df).
Rare terms are more informative than frequent terms.
The df of a term t can be found as the length of the posting list of t.
Frequent terms are more informative than rare terms.
The df of a term t refers to the number of documents that contain t.
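Worked note for Q9: the df of a term equals the length of its posting list, and inverse document frequency, idf = log10(N / df), gives rarer terms a higher weight. The sketch below illustrates both points. The collection size `N` and the toy index are assumptions made for illustration and are not given in the quiz.

```python
import math


def document_frequency(index, term):
    """df of a term: the length of its posting list (0 if absent)."""
    return len(index.get(term, []))


def idf(df, n_docs):
    """Inverse document frequency, log10(N / df); 0 for unseen terms."""
    return math.log10(n_docs / df) if df > 0 else 0.0


# Hypothetical inverted index: term -> sorted list of doc IDs.
index = {"marmalade": [2, 9], "the": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]}
N = 10  # assumed collection size

for term in index:
    df = document_frequency(index, term)
    print(term, df, idf(df, N))
```

The rare term (marmalade) receives a higher idf than the term that occurs in every document.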
Q10 Which of the following statements is false with regard to vector space similarity?
Terms are axes of the space, which results in a high-dimensional space.
Documents and queries can be represented as points or vectors in the space.
The Euclidean distance between query and document is a good measure for ranking similarity.
Documents can be ranked according to their proximity to the query in the space.
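Worked note for Q10: Euclidean distance is sensitive to vector length, so a long document with the same term distribution as the query can end up far from it, while cosine similarity compares only the angle between vectors. The sketch below contrasts the two on hypothetical two-term count vectors. The vectors and function names are made up for illustration.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine(u, v):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical term-count vectors over a two-term vocabulary.
q = [1, 1]
d1 = [3, 3]  # same term distribution as q, but a longer document
d2 = [1, 0]  # contains only one of the two query terms

print(euclidean(q, d1), euclidean(q, d2))
print(cosine(q, d1), cosine(q, d2))
```

Here Euclidean distance places d2 closer to the query than d1, whereas cosine similarity ranks d1 first because it points in the same direction as q.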