CAIM: Cerca i Anàlisi d'Informació Massiva (Search and Analysis of Massive Information): FIB, Grau en Enginyeria Informàtica
Fall 2018
http://www.cs.upc.edu/~caim
4. Evaluation and Relevance Feedback
Evaluation of Information Retrieval Usage, I
What exactly are we trying to do?
Evaluation of Information Retrieval Usage, II
Then, what exactly should we optimize?
Notation:
- D: the set of all our documents, on which the user asks one query;
- A: the answer set: the documents that the system retrieves as its answer;
- R: the relevant documents: those that the user actually wishes to see in the answer.
  (But no one knows this set, not even the user!)
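With these sets, recall and precision reduce to simple set operations. A minimal sketch; the document IDs below are illustrative, not from the slides:

```python
# Recall and precision as set operations (illustrative document IDs).
A = {1, 2, 3, 4}   # answer set: documents the system retrieved
R = {2, 3, 5}      # relevant set: documents the user actually wanted

recall = len(R & A) / len(R)      # |R ∩ A| / |R|
precision = len(R & A) / len(A)   # |R ∩ A| / |A|
```

Here recall is 2/3 (one relevant document was missed) and precision is 2/4 = 0.5 (half the answer was noise).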
The Recall and Precision measures
Recall and Precision, II
Example: a test for tuberculosis (TB)

Recall
% of true TB cases that test positive = 35 / 50 = 70 %

Precision
% of positives that really have TB = 35 / 40 = 87.5 %

- Large recall: few sick people go undetected
- Large precision: few people are scared unnecessarily (few false alarms)
Recall and Precision, III. Confusion matrix
Equivalent definition
Confusion matrix:

                         Answered
                    relevant   not relevant
Reality   relevant     tp           fn
      not relevant     fp           tn

- |R| = tp + fn
- |A| = tp + fp
- |R ∩ A| = tp
- Recall = |R ∩ A| / |R| = tp / (tp + fn)
- Precision = |R ∩ A| / |A| = tp / (tp + fp)
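The same measures computed directly from confusion-matrix counts, using the tuberculosis numbers from the earlier example:

```python
# Recall and precision from confusion-matrix counts.
# Numbers are the TB example: 50 true cases, 40 positives, 35 correct.
tp, fn, fp = 35, 15, 5

recall = tp / (tp + fn)      # 35 / 50 = 0.70
precision = tp / (tp + fp)   # 35 / 40 = 0.875
```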
How many documents to show?
Rank-recall and rank-precision plots
A single “precision and recall” curve
x-axis for recall, and y-axis for precision.
(Similar to, and related to, the ROC curve in predictive models.)
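One way to obtain the points of such a curve is to walk down the ranked answer list and record a (recall, precision) pair each time a relevant document appears. A minimal sketch; the relevance judgments and the total of 5 relevant documents are made-up illustrative data:

```python
# Precision-recall points from a ranked result list.
# rels[i] is 1 if the i-th retrieved document is relevant (toy data).
rels = [1, 0, 1, 1, 0, 0, 1, 0]
total_relevant = 5            # assume 5 relevant documents exist overall

points = []
hits = 0
for k, r in enumerate(rels, start=1):
    hits += r
    if r:  # record a point at each relevant document found
        points.append((hits / total_relevant, hits / k))  # (recall, precision)
```

The resulting points start at (0.2, 1.0) and end at (0.8, 4/7): precision tends to drop as we push recall higher by reading deeper into the ranking.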
Other measures of effectiveness, II
- Coverage:
  |relevant & known & retrieved| / |relevant & known|
- Novelty:
  |relevant & retrieved & unknown| / |relevant & retrieved|
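Both measures are again straightforward set computations. A sketch with illustrative document IDs:

```python
# Coverage and novelty as set operations (illustrative document IDs).
relevant  = {1, 2, 3, 4, 5}
known     = {1, 2, 3}        # relevant documents the user already knew
retrieved = {2, 3, 5, 9}

# fraction of the known relevant documents that were retrieved
coverage = len(relevant & known & retrieved) / len(relevant & known)
# fraction of the relevant retrieved documents that were new to the user
novelty = len((relevant & retrieved) - known) / len(relevant & retrieved)
```

Here coverage is 2/3 and novelty is 1/3: the system found most of what the user knew, and one of its three relevant answers was new.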
Relevance Feedback, I
Going beyond what the user asked for
1. Get a query q
2. Retrieve relevant documents for q
3. Show top k to user
4. Ask user to mark them as relevant / irrelevant
5. Use answers to refine q
6. If desired, go to 2
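The six steps above can be sketched as a loop; `retrieve`, `mark`, and `refine` are hypothetical stand-ins for the real components, and the toy documents are illustrative:

```python
# Hedged sketch of the relevance-feedback loop.
def feedback_loop(q, retrieve, mark, refine, k=3, rounds=2):
    for _ in range(rounds):
        answer = retrieve(q)[:k]           # steps 2-3: retrieve, show top k
        judgments = mark(answer)           # step 4: user marks relevance
        q = refine(q, answer, judgments)   # step 5: refine the query
    return q

# toy components: documents and the query are term sets
docs = [{"tb", "lung"}, {"tb", "test"}, {"cat"}]
retrieve = lambda q: sorted(docs, key=lambda d: -len(q & d))
mark = lambda ans: [("tb" in d) for d in ans]   # simulated user judgments
def refine(q, ans, js):                          # absorb terms of liked docs
    for d, rel in zip(ans, js):
        if rel:
            q = q | d
    return q

final_q = feedback_loop({"tb"}, retrieve, mark, refine)
```

After two rounds the query has grown from {"tb"} to include the terms of the documents the (simulated) user liked.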
Relevance Feedback, II
How to create the new query?
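A common textbook answer, not spelled out on this slide, is Rocchio's rule: move the query vector toward the centroid of the marked-relevant documents and away from the centroid of the irrelevant ones. A hedged sketch; the weights alpha, beta, gamma and the vectors are illustrative:

```python
import numpy as np

# Sketch of Rocchio-style query refinement (illustrative weights).
def rocchio(q, relevant, irrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    q_new = alpha * q
    if relevant:
        q_new = q_new + beta * np.mean(relevant, axis=0)
    if irrelevant:
        q_new = q_new - gamma * np.mean(irrelevant, axis=0)
    return np.maximum(q_new, 0.0)   # negative term weights are usually clipped

# toy term-weight vectors over a 3-term vocabulary
q = np.array([1.0, 0.0, 0.5])
rel = [np.array([0.8, 0.4, 0.0])]
irr = [np.array([0.0, 1.0, 0.0])]
q2 = rocchio(q, rel, irr)
```

The refined query gains weight on terms shared with relevant documents and loses weight on terms of irrelevant ones.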
Relevance Feedback, III
In practice, often:
- good improvement of recall in the first round,
- marginal improvement in the second round,
- almost none beyond.
Relevance Feedback, IV
... as Query Expansion
Pseudorelevance feedback
Pseudorelevance feedback, II
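The usual idea behind pseudo-relevance feedback is to skip the user entirely: assume the top-k retrieved documents are relevant and expand the query with their most frequent new terms. A minimal sketch; `search` and the toy corpus are hypothetical stand-ins for a real retrieval system:

```python
from collections import Counter

# Hedged sketch of pseudo-relevance feedback via query expansion.
def pseudo_feedback(query_terms, search, k=10, extra=5):
    top_docs = search(query_terms)[:k]     # pretend the top k are relevant
    counts = Counter(t for doc in top_docs for t in doc)
    for t in query_terms:
        counts.pop(t, None)                # keep only terms new to the query
    expansion = [t for t, _ in counts.most_common(extra)]
    return query_terms + expansion

# toy corpus standing in for a real index; docs are token lists
docs = [["tb", "test", "lung"], ["tb", "lung", "xray"], ["cat", "dog"]]
def toy_search(q):
    return sorted(docs, key=lambda d: -len(set(q) & set(d)))

new_q = pseudo_feedback(["tb"], toy_search, k=2, extra=2)
```

No user judgment is needed, at the risk of "query drift" when the top-k answers were not actually relevant.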