Introduction to Information Retrieval (IR) Models
At the end of this chapter, every student must be able to:
• Define what a model is
• Describe why a model is needed in information retrieval
• Differentiate between the types of information retrieval models:
– Boolean model
– Vector space model
– Probabilistic model
• Calculate the similarity of documents to a given query
• Identify term frequency, document frequency, inverse document frequency, term weight, and similarity measurements
What is a model?
• A model is an idealization or abstraction of an actual process (i.e., something that happens in the real world)
There are two good reasons for having models of IR
OR
NOT
• Boolean relevance prediction (R)
• Consider a set of five documents and assume that they contain the terms shown in the table:
Doc.   Terms
D1     algorithm, information, retrieval
D2     retrieval, science
D3     algorithm, information, science
D4     pattern, retrieval, science
D5     science, algorithm
– Information need
– Query/Boolean expression
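The Boolean model can be tried out on the five documents in the table with a few lines of Python. This is only a minimal sketch: the function names (`having`, `relevance`) and the two queries are hypothetical and chosen for illustration, they are not taken from the slides. Each term is mapped to the set of documents that contain it, so AND, OR and NOT become set intersection, union and difference, and the relevance prediction R is 1 for a matching document and 0 otherwise.

```python
# The five-document collection from the table (terms lowercased).
docs = {
    "D1": {"algorithm", "information", "retrieval"},
    "D2": {"retrieval", "science"},
    "D3": {"algorithm", "information", "science"},
    "D4": {"pattern", "retrieval", "science"},
    "D5": {"science", "algorithm"},
}
all_docs = set(docs)


def having(term):
    """Return the set of document ids whose term list contains `term`."""
    return {doc_id for doc_id, terms in docs.items() if term in terms}


def relevance(matching):
    """Boolean relevance prediction: R = 1 if the document matches, else 0."""
    return {doc_id: int(doc_id in matching) for doc_id in sorted(all_docs)}


# Hypothetical query 1: information AND NOT retrieval
# AND is set intersection; NOT is the complement within the collection.
query1 = having("information") & (all_docs - having("retrieval"))

# Hypothetical query 2: algorithm OR pattern
# OR is set union.
query2 = having("algorithm") | having("pattern")

print(sorted(query1))
print(sorted(query2))
print(relevance(query1))
```

Under these assumptions the first query matches only D3, and the second matches every document except D2. A document either satisfies the Boolean expression or it does not, so the model gives no ranking among the matches.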
= 0.2 * 0.6 / 0.4 = 0.3
b. p(NR/Y) + p(R/Y) = 1
p(R/Y) = 1 - p(NR/Y)
= 1 - 0.3
= 0.7
So, the document is relevant, since p(R/Y) = 0.7 is greater than p(NR/Y) = 0.3.
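The arithmetic above can be checked with a short Python sketch. The slides give only the numbers 0.2, 0.6 and 0.4, so reading them as p(Y/NR), p(NR) and p(Y) in Bayes' rule is an assumption made here for illustration, and the variable names are hypothetical.

```python
import math

# ASSUMPTION: the three numbers are read as the terms of Bayes' rule,
# p(NR/Y) = p(Y/NR) * p(NR) / p(Y). The slides do not label them.
p_y_given_nr = 0.2  # assumed: p(Y/NR)
p_nr = 0.6          # assumed: p(NR)
p_y = 0.4           # assumed: p(Y)

# Step a: probability that the document is non-relevant given Y.
p_nr_given_y = p_y_given_nr * p_nr / p_y

# Step b: the two posteriors sum to 1, so p(R/Y) is the complement.
p_r_given_y = 1.0 - p_nr_given_y

# Decision: predict "relevant" when p(R/Y) is the larger of the two.
is_relevant = p_r_given_y > p_nr_given_y

print(round(p_nr_given_y, 2), round(p_r_given_y, 2), is_relevant)
```

With these values the sketch reproduces p(NR/Y) = 0.3 and p(R/Y) = 0.7, so the document is predicted relevant.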
Question
• If the probability that a document is relevant is equal to the probability that it is non-relevant, how can we decide whether the document is relevant or not?
Exercise
Comparison of IR models