Module 4-Boolean Retrieval Models
Module 4
• Boolean Model – Term Vocabulary and Posting Lists – Dictionaries and Tolerant
Retrieval – Index Construction – Index Compression – Vector Space Model – Scoring
and Term Weighting – Evaluation in Information Retrieval
Introduction to Information Retrieval
Outline
❶ Introduction
❷ Inverted index
❹ Query optimization
Information Retrieval
Sec. 1.1
Basic assumptions of Information Retrieval
Boolean retrieval
Incidence vectors
Answers to query
Bigger collections
Inverted Index
dictionary postings
Generate posting
Sort postings
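The two steps named above, generating postings and then sorting them, can be sketched as follows. This is a minimal illustration, not the deck's own procedure: the function name `build_inverted_index`, the whitespace tokenizer, and the three toy documents are all assumptions made for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Build a term -> sorted list of docIDs mapping from {doc_id: text}."""
    # Step 1: generate (term, docID) postings with a naive whitespace tokenizer.
    pairs = []
    for doc_id, text in docs.items():
        for token in text.lower().split():
            pairs.append((token, doc_id))

    # Step 2: sort the postings by term, then by docID.
    pairs.sort()

    # Step 3: group by term, dropping duplicate docIDs for the same term.
    index = defaultdict(list)
    for term, doc_id in pairs:
        postings = index[term]
        if not postings or postings[-1] != doc_id:
            postings.append(doc_id)
    return dict(index)


# Toy collection (hypothetical data for illustration only).
docs = {
    1: "new home sales top forecasts",
    2: "home sales rise in july",
    3: "increase in home sales in july",
}
index = build_inverted_index(docs)
```

The dictionary keys play the role of the dictionary, and each list of docIDs is that term's postings list.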
Boolean queries
• The Boolean retrieval model can answer any query that is a Boolean expression.
• Boolean queries use AND, OR, and NOT to join query terms.
• The model views each document as a set of terms.
• It is precise: a document either matches the condition or it does not.
• It was the primary commercial retrieval tool for three decades.
• Many professional searchers (e.g., lawyers) still like Boolean queries, because you know exactly what you are getting.
• Many search systems you use are also Boolean: Spotlight, email, intranet search, etc.
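Because the model treats each document as a set of terms, AND, OR, and NOT map directly onto set intersection, union, and difference. The sketch below illustrates this on a toy collection; the documents, the helper name `having`, and the example queries are assumptions made for the example.

```python
# Toy collection: each document is a set of terms (hypothetical data).
docs = {
    1: {"brutus", "caesar", "calpurnia"},
    2: {"brutus", "caesar"},
    3: {"caesar", "calpurnia"},
    4: {"brutus"},
}
all_ids = set(docs)


def having(term):
    """Return the set of docIDs whose term set contains `term`."""
    return {doc_id for doc_id, terms in docs.items() if term in terms}


# brutus AND caesar AND NOT calpurnia
and_not_result = (having("brutus") & having("caesar")) - having("calpurnia")

# brutus OR calpurnia
or_result = having("brutus") | having("calpurnia")

# NOT caesar (complement with respect to the whole collection)
not_result = all_ids - having("caesar")
```

A document is either in the result set or not, which is the sense in which the model is precise: there is no ranking and no partial match.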
Westlaw: Comments
• Proximity operators: /3 = within 3 words, /s = within the same sentence, /p = within the same paragraph.
• A space means disjunction, not conjunction! (This was the default in search pre-Google.)
• Queries are long and precise, and they are developed incrementally, unlike web search queries.
• Why professional searchers often like Boolean search: precision, transparency, control.
• When are Boolean queries the best way of searching? That depends on the information need, the searcher, the document collection, ...
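The proximity and space-as-disjunction behaviour described above can be illustrated with a small sketch. It is not Westlaw's implementation: the function names `within_k` and `matches_any` are hypothetical, and reading "/k" as "token positions differ by at most k" is an assumption about the counting convention.

```python
def within_k(tokens, term_a, term_b, k):
    """True if some occurrence of term_a lies within k positions of term_b.

    Assumption: "/k" is read as an absolute position difference of at most k.
    """
    pos_a = [i for i, tok in enumerate(tokens) if tok == term_a]
    pos_b = [i for i, tok in enumerate(tokens) if tok == term_b]
    return any(abs(i - j) <= k for i in pos_a for j in pos_b)


def matches_any(query, tokens):
    """Space-separated query terms are treated as a disjunction (OR)."""
    return any(term in tokens for term in query.split())


# Hypothetical example sentence, tokenized on whitespace.
tokens = "the employer must disclose the trade secret to the court".split()

near = within_k(tokens, "disclose", "secret", 3)   # positions 3 and 6
far = within_k(tokens, "employer", "secret", 3)    # positions 1 and 6
either = matches_any("court judge", tokens)        # "court" occurs
neither = matches_any("attorney lawyer", tokens)   # no term occurs
```

Evaluating proximity this way requires token positions, which is why systems that support such operators typically store positions in the postings rather than docIDs alone.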
Query optimization
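The slide body for this topic is not available here, so the following is only a sketch of one common heuristic for conjunctive (AND) queries: intersect the postings lists in order of increasing length, so that intermediate results stay as small as possible. The function names and the sample postings lists are assumptions made for the example.

```python
def intersect(p1, p2):
    """Merge two sorted postings lists in O(len(p1) + len(p2)) time."""
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            out.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1
        else:
            j += 1
    return out


def intersect_many(index, terms):
    """AND together several terms, shortest postings list first."""
    lists = sorted((index.get(t, []) for t in terms), key=len)
    if not lists:
        return []
    result = lists[0]
    for postings in lists[1:]:
        if not result:  # nothing left to match; stop early
            break
        result = intersect(result, postings)
    return result


# Hypothetical postings lists for illustration.
index = {
    "brutus": [1, 2, 4, 11, 31, 45, 173, 174],
    "caesar": [1, 2, 4, 5, 6, 16, 57, 132],
    "calpurnia": [2, 31, 54, 101],
}
result = intersect_many(index, ["brutus", "caesar", "calpurnia"])
```

Starting with the shortest list means every later intersection works on a result no longer than that list, which usually reduces the total number of comparisons.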