Sheet 1

The document presents exercises on efficiently evaluating Boolean queries over inverted indexes. Topics include: 1) whether queries of the form Brutus AND NOT Caesar or Brutus OR NOT Caesar can still be evaluated in linear time; 2) how to extend the postings merge algorithm to arbitrary Boolean queries, and its time complexity; 3) rewriting queries into disjunctive normal form, and whether this makes evaluation more or less efficient; 4) recommending a processing order for a sample complex query given postings list sizes; 5) handling negation when determining the best evaluation order for a sample query; 6) whether processing postings lists in order of size is guaranteed to be optimal.

Uploaded by

Ahmed gamal ebied

Exercise 1
For the queries below, can we still run through the intersection in time O(x + y),
where x and y are the lengths of the postings lists for Brutus and Caesar? If not, what
can we achieve?
a. Brutus AND NOT Caesar
b. Brutus OR NOT Caesar
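
For reference, the O(x + y) intersection that this exercise starts from can be sketched as below. This is a minimal illustration, assuming each postings list is a Python list of docIDs sorted in increasing order; the function name and the sample docIDs are made up for the example and are not part of the sheet.

```python
def intersect(p1, p2):
    """Intersect two sorted postings lists in O(len(p1) + len(p2)) time."""
    answer = []
    i = j = 0
    # Each loop iteration advances at least one pointer,
    # so there are at most len(p1) + len(p2) iterations.
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])  # docID occurs in both lists
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1  # p1[i] cannot match anything later in p2
        else:
            j += 1  # p2[j] cannot match anything later in p1
    return answer


# Hypothetical postings lists, for illustration only.
brutus = [1, 2, 4, 11, 31, 45, 173, 174]
caesar = [1, 2, 4, 5, 6, 16, 57, 132]
print(intersect(brutus, caesar))  # [1, 2, 4]
```

The questions above ask how this loop has to change, and what happens to the bound, once one operand is negated.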

Exercise 2
Extend the postings merge algorithm to arbitrary Boolean query formulas. What is
its time complexity? For instance, consider:
a. (Brutus OR Caesar) AND NOT (Antony OR Cleopatra)
Can we always merge in linear time? Linear in what? Can we do better than this?

Exercise 3
We can use distributive laws for AND and OR to rewrite queries.
a. Show how to rewrite the query in Exercise 2 into disjunctive normal form using the
distributive laws.
b. Would the resulting query be more or less efficiently evaluated than the original form
of this query?
c. Is this result true in general or does it depend on the words and the contents of the
document collection?

Exercise 4
Recommend a query processing order for
a. (tangerine OR trees) AND (marmalade OR skies) AND (kaleidoscope OR eyes)
given the following postings list sizes:

Term            Postings size
eyes            213312
kaleidoscope    87009
marmalade       107913
skies           271658
tangerine       46653
trees           316812
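
One common heuristic, sketched here as a possible starting point rather than the definitive answer, is to estimate the size of each OR clause conservatively as the sum of its terms' postings sizes and then process the clauses in increasing order of that estimate. The variable and function names below are assumptions made for the illustration; the sizes come from the table above.

```python
# Postings list sizes from the table above.
sizes = {
    "eyes": 213312,
    "kaleidoscope": 87009,
    "marmalade": 107913,
    "skies": 271658,
    "tangerine": 46653,
    "trees": 316812,
}

# The three OR clauses of the query.
clauses = [
    ("tangerine", "trees"),
    ("marmalade", "skies"),
    ("kaleidoscope", "eyes"),
]


def or_upper_bound(clause):
    """Upper bound on the size of a disjunction: the sum of its postings sizes."""
    return sum(sizes[term] for term in clause)


# Process the clause with the smallest estimated result first.
order = sorted(clauses, key=or_upper_bound)
for clause in order:
    print(" OR ".join(clause), or_upper_bound(clause))
# kaleidoscope OR eyes 300321
# tangerine OR trees 363465
# marmalade OR skies 379571
```

The sum is only an upper bound: if the two terms of a clause often occur in the same documents, the real union is smaller, so the estimated order is not guaranteed to be the best one.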

Exercise 5
If the query is:
a. friends AND romans AND (NOT countrymen)
how could we use the frequency of countrymen in choosing the best query evaluation order? In
particular, propose a way of handling negation in determining the order of query processing.

Exercise 6
For a conjunctive query, is processing postings lists in order of size guaranteed to be
optimal? Explain why it is, or give an example where it isn’t.

Exercise 7
Write out a postings merge algorithm, for an x OR y query.
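
One possible shape for such an algorithm is sketched below, assuming both postings lists are Python lists of docIDs sorted in increasing order. The function name `union` and the sample lists are illustrative choices, not taken from the sheet.

```python
def union(p1, p2):
    """Merge two sorted postings lists for x OR y in O(len(p1) + len(p2)) time."""
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            answer.append(p1[i])  # docID in both lists: emit it once
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            answer.append(p1[i])  # docID only in p1 so far
            i += 1
        else:
            answer.append(p2[j])  # docID only in p2 so far
            j += 1
    # One list is exhausted; everything left in the other belongs to the result.
    answer.extend(p1[i:])
    answer.extend(p2[j:])
    return answer


print(union([1, 3, 5], [2, 3, 6, 7]))  # [1, 2, 3, 5, 6, 7]
```

Unlike intersection, the loop cannot stop early when one list runs out, because the remainder of the other list is still part of the answer.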

Exercise 8
How should the Boolean query x AND NOT y be handled? Why is naive evaluation
of this query normally very expensive? Write out a postings merge algorithm that
evaluates this query efficiently.
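
A sketch of one way to do this follows, assuming sorted Python lists of docIDs; the name `and_not` and the sample lists are hypothetical. The idea is to avoid ever building the postings list for NOT y, which would contain almost every document in the collection, and instead to walk x while using y only to skip the docIDs that must be excluded.

```python
def and_not(p1, p2):
    """Return the docIDs in p1 that are not in p2, in O(len(p1) + len(p2)) time."""
    answer = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            i += 1  # docID is in y as well: exclude it
            j += 1
        elif p1[i] < p2[j]:
            answer.append(p1[i])  # p2 has already moved past this docID: keep it
            i += 1
        else:
            j += 1  # advance p2 until it catches up with p1[i]
    # p2 is exhausted, so nothing left in p1 can be excluded.
    answer.extend(p1[i:])
    return answer


print(and_not([1, 2, 4, 11], [2, 3, 11, 20]))  # [1, 4]
```

The running time depends only on the lengths of the two postings lists, not on the size of the collection.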

Exercise 9
1. Why don’t we use grep for information retrieval?
2. Why don’t we use a relational database for information retrieval?
3. In constructing the index, which step is most expensive/complex?
