Solutions to Exercises

Chapter 1 – Information Retrieval Models

Djoerd Hiemstra

1.1(c) The Venn diagrams of Figure 1.2 show exactly 8 disjoint subsets of documents, including
the area around the diagram. Whatever the final result of a Boolean query, each subset is
either selected or not, so in total 2^8 = 256 subsets can be defined.
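
The count can be checked by brute force. The sketch below assumes three query terms (which would give the 8 regions of the Venn diagram); the term names are hypothetical and not taken from Figure 1.2.

```python
from itertools import product

# Assumption: three query terms, so each document lies in one of 2**3 regions,
# identified by which of the terms it contains (including "none of them").
terms = ("a", "b", "c")  # hypothetical term names
regions = list(product((False, True), repeat=len(terms)))

# Any Boolean query result amounts to choosing, per region, selected or not.
results = set(product((False, True), repeat=len(regions)))

num_regions = len(regions)
num_results = len(results)
print(num_regions, num_results)
```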
1.2(b) Vector spaces are metric spaces, i.e., a set of objects equipped with a distance, where the
distance from q to d is the same as from d to q, so if we change q for d and d for q,
the distance should remain the same. In practice, however, many term weighting
algorithms do not use the same weights for queries and documents. In such a case, the
similarity might not be equal to 0.08. One might argue that such a model is not a vector
space model, though.
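
A small sketch of this point: cosine similarity is symmetric when both vectors are weighted the same way, but swapping roles can change the score once queries and documents are weighted differently. The asymmetric scheme here (raw tf on the query side, tf times idf on the document side) and all numbers are made up for illustration; they are not the weights from the exercise.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def asymmetric_score(query_tf, doc_tf, idf):
    """Hypothetical scheme: raw tf for the query, tf * idf for the document."""
    doc_weighted = [tf * w for tf, w in zip(doc_tf, idf)]
    return cosine(query_tf, doc_weighted)

a = [1, 0, 1]          # term frequencies of one text (made up)
b = [2, 1, 0]          # term frequencies of another text (made up)
idf = [1.0, 2.0, 3.0]  # made-up idf values

same_weights = (cosine(a, b), cosine(b, a))
mixed_weights = (asymmetric_score(a, b, idf), asymmetric_score(b, a, idf))
print(same_weights, mixed_weights)
```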
1.3(c) If we add a single document, the document frequencies of terms that occur in the added
document will increase by 1. Furthermore, the number of documents N changes, which
affects the idfs of all other terms.
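
The effect can be seen on a toy collection. The sketch assumes idf = log(N / df); the exact form of Equation (1.5) is not reproduced here, so treat the formula and the documents as illustrative.

```python
import math

def idf_table(docs):
    """idf per term, assuming idf = log(N / df) (an assumed form)."""
    n = len(docs)
    df = {}
    for doc in docs:
        for term in set(doc):
            df[term] = df.get(term, 0) + 1
    return {term: math.log(n / count) for term, count in df.items()}

# Made-up collection of three documents.
docs = [
    ["information", "retrieval"],
    ["retrieval", "models"],
    ["language", "models"],
]
before = idf_table(docs)
after = idf_table(docs + [["information", "models"]])

# Terms in the new document change through df and N; the others through N alone.
changed = [term for term in before if before[term] != after[term]]
print(sorted(changed))
```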
1.4(b) Assuming that the idf is calculated using the number of documents N as shown in Equation
(1.5), all weights need to change, as is the case in Exercise 1.3.
1.5(a) For each term, the probabilistic model considers either presence of the term in the document,
or absence of the term in the document. So, per term, there are two cases, hence 2^3 = 8
different scores in the case of three query terms.
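
As an illustration, the presence/absence patterns for three query terms can be enumerated directly. The term weights below are hypothetical and chosen so that all sums differ; with other weights some of the 8 scores could coincide.

```python
from itertools import product

# Hypothetical per-term weights for three query terms.
weights = (1.0, 2.0, 4.0)

# Each pattern records presence (1) or absence (0) of each query term.
patterns = list(product((0, 1), repeat=len(weights)))

# A document's score is the sum of the weights of the terms it contains.
scores = [sum(w for w, present in zip(weights, p) if present) for p in patterns]
print(len(patterns), sorted(set(scores)))
```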
1.6(c) As said above, the model only considers presence or absence, but in the case of presence
it does not consider the number of occurrences of the term in the document. If the term is
present in both D and E, then they will be assigned the exact same score. If the system
needs to provide a total ranking, then the implementation has to determine which document
is ranked first.
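
A minimal sketch of such a tie, with one possible tie-breaking rule. The documents, weights, and the choice to break ties by document name are all assumptions made for illustration.

```python
def binary_score(doc_terms, weights):
    """Sum the weights of query terms present in the document; counts are ignored."""
    present = set(doc_terms)
    return sum(w for term, w in weights.items() if term in present)

# Hypothetical query-term weights and documents; E repeats "retrieval" five times.
weights = {"retrieval": 1.5, "models": 0.5}
docs = {
    "D": ["retrieval", "models"],
    "E": ["retrieval"] * 5 + ["models"],
}

scores = {name: binary_score(terms, weights) for name, terms in docs.items()}

# Assumed tie-break: higher score first, then document name in ascending order.
ranking = sorted(scores, key=lambda name: (-scores[name], name))
print(scores, ranking)
```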
1.7(b) Without smoothing, the probability is a simple fraction of the number of occurrences, divided
by the total number of terms in the document. Of course, language models do not need an
additional term weighting algorithm.
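
The unsmoothed estimate is easy to compute by hand; the sketch below uses a made-up four-term document.

```python
from collections import Counter

def unsmoothed_probability(term, doc):
    """Occurrences of the term divided by the total number of terms in the document."""
    return Counter(doc)[term] / len(doc)

doc = ["information", "retrieval", "models", "retrieval"]  # made-up document
p_retrieval = unsmoothed_probability("retrieval", doc)
p_language = unsmoothed_probability("language", doc)
print(p_retrieval, p_language)
```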
1.8(c) If λ = 0, the model does not use smoothing. Without smoothing, terms that do not occur
in the document are assigned zero probability. The score of a document is determined by
multiplying the probabilities of the single terms. If one of them is zero, the final score of
the document is zero.
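
The sketch below illustrates this with linear interpolation smoothing. It assumes λ is the weight on the collection model, which matches the statement that λ = 0 means no smoothing; the documents and query are made up.

```python
def lm_score(query, doc, collection, lam):
    """Product over query terms of (1 - lam) * P(t|doc) + lam * P(t|collection).

    Assumption: lam weights the collection model, so lam = 0 disables smoothing.
    """
    score = 1.0
    for term in query:
        p_doc = doc.count(term) / len(doc)
        p_col = collection.count(term) / len(collection)
        score *= (1 - lam) * p_doc + lam * p_col
    return score

doc = ["information", "retrieval"]       # made-up document
collection = doc + ["language", "models"]  # made-up collection text
query = ["retrieval", "models"]           # "models" does not occur in doc

unsmoothed = lm_score(query, doc, collection, lam=0.0)
smoothed = lm_score(query, doc, collection, lam=0.5)
print(unsmoothed, smoothed)
```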

Information Retrieval: Searching in the 21st Century Edited by A. Göker & J. Davies
© 2009 John Wiley & Sons, Ltd. ISBN: 978-0-470-02762-2
