
Extended Boolean Model


Uploaded by

aklamos
Copyright
© All Rights Reserved

Extended Boolean model

The Extended Boolean model was described in a 1983 Communications of the ACM article by Gerard Salton, Edward A. Fox, and Harry Wu. Its goal is to overcome the drawbacks of the Boolean model used in information retrieval. The Boolean model does not consider term weights in queries, and the result set of a Boolean query is often either too small or too large. The extended model makes use of partial matching and term weights, as in the vector space model. It combines the characteristics of the vector space model with the properties of Boolean algebra and ranks documents by their similarity to the query. A document that matches only some of the query terms can therefore be judged somewhat relevant and returned as a result, whereas under the Standard Boolean model it would not be.[1]

Thus, the extended Boolean model can be considered a generalization of both the Boolean and vector space models; the two are special cases under suitable settings and definitions. Research has also shown that retrieval effectiveness improves relative to standard Boolean query processing, and that relevance feedback and query expansion can be integrated with extended Boolean query processing.

Contents
Definitions
The 2 Dimensions Example
Generalizing the idea and P-norms
Examples
Improvements over the Standard Boolean Model
Further reading
See also
References

Definitions
In the Extended Boolean model, a document is represented as a vector, as in the vector space model. Each dimension i corresponds to a separate term associated with the document.

The weight of term Kx associated with document dj is measured by its normalized term frequency and can be defined as:

\[ w_{x,j} = f_{x,j} \cdot \frac{Idf_x}{\max_i Idf_i} \]

where f_{x,j} is the (normalized) frequency of term Kx in document dj and Idfx is the inverse document frequency of Kx.

The weight vector associated with document dj can be represented as:

\[ \mathbf{v}_{d_j} = [w_{1,j}, w_{2,j}, \ldots, w_{t,j}] \]
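As a rough illustration of the weighting just described, the sketch below computes w = f * Idf / max(Idf) for the terms of one document. The function name `term_weights`, the dictionary layout, and the sample numbers are all assumptions made for this example; the term frequencies are assumed to be already normalized to [0, 1].

```python
def term_weights(term_freqs, idfs):
    """Weight each term as f * idf / max(idf) for a single document.

    term_freqs: term -> normalized term frequency in [0, 1] (assumed input)
    idfs:       term -> inverse document frequency over the collection
    Terms missing from term_freqs get weight 0.
    """
    max_idf = max(idfs.values())
    if max_idf <= 0:
        raise ValueError("at least one term needs a positive idf")
    return {
        term: term_freqs.get(term, 0.0) * idf / max_idf
        for term, idf in idfs.items()
    }


# Hypothetical collection statistics and one hypothetical document.
idfs = {"boolean": 1.0, "model": 0.5, "retrieval": 2.0}
doc_freqs = {"boolean": 1.0, "model": 0.5}
weights = term_weights(doc_freqs, idfs)
```

Dividing by the largest Idf keeps every weight in [0, 1] as long as the frequencies are, which is what the similarity formulas of the model rely on.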

The 2 Dimensions Example


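Considering the space composed of two terms Kx and Ky only, the corresponding term weights are w1 and w2.[2] Thus, for query qor = (Kx ∨ Ky), we can calculate the similarity with the following formula:

\[ sim(q_{or}, d) = \sqrt{\frac{w_1^2 + w_2^2}{2}} \]

For query qand = (Kx ∧ Ky), we can use:

\[ sim(q_{and}, d) = 1 - \sqrt{\frac{(1 - w_1)^2 + (1 - w_2)^2}{2}} \]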
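The two-term case can be sketched in a few lines. This assumes the usual formulas of the model: for OR, the normalized Euclidean distance of the point (w1, w2) from the origin (0, 0), and for AND, one minus its normalized distance from the ideal point (1, 1). The function names are illustrative only.

```python
import math


def sim_or_2d(w1, w2):
    """OR similarity: normalized distance of (w1, w2) from (0, 0)."""
    return math.sqrt((w1 ** 2 + w2 ** 2) / 2)


def sim_and_2d(w1, w2):
    """AND similarity: 1 minus the normalized distance of (w1, w2) from (1, 1)."""
    return 1 - math.sqrt(((1 - w1) ** 2 + (1 - w2) ** 2) / 2)
```

A document containing only one of the two terms with full weight, (w1, w2) = (1, 0), scores about 0.71 for the OR query and about 0.29 for the AND query, so it is ranked rather than simply accepted or rejected.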

Generalizing the idea and P-norms

[Figure 1: The similarities of q = (Kx ∨ Ky) with documents dj and dj+1.]
[Figure 2: The similarities of q = (Kx ∧ Ky) with documents dj and dj+1.]

We can generalize the previous 2D extended Boolean model example to higher t-dimensional space using Euclidean distances.

This can be done using P-norms, which extend the notion of distance to include p-distances, where 1 ≤ p ≤ ∞ is a new parameter.[3]

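A generalized conjunctive query is given by:

\[ q_{and} = k_1 \wedge^p k_2 \wedge^p \cdots \wedge^p k_t \]

The similarity of q_and and dj can be defined as:

\[ sim(q_{and}, d_j) = 1 - \left( \frac{(1 - w_1)^p + (1 - w_2)^p + \cdots + (1 - w_t)^p}{t} \right)^{1/p} \]

A generalized disjunctive query is given by:

\[ q_{or} = k_1 \vee^p k_2 \vee^p \cdots \vee^p k_t \]

The similarity of q_or and dj can be defined as:

\[ sim(q_{or}, d_j) = \left( \frac{w_1^p + w_2^p + \cdots + w_t^p}{t} \right)^{1/p} \]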
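A minimal sketch of the p-norm similarities follows, assuming the standard forms: OR is (mean of w^p)^(1/p) and AND is 1 - (mean of (1 - w)^p)^(1/p). The names `pnorm_or` and `pnorm_and` are hypothetical, and p = ∞ is handled as the limiting case (maximum and minimum of the weights).

```python
import math


def pnorm_or(weights, p=2.0):
    """Disjunctive similarity: (sum(w^p) / t)^(1/p), with max(w) at p = inf."""
    if not weights or p < 1:
        raise ValueError("need at least one weight and p >= 1")
    if math.isinf(p):
        return max(weights)
    return (sum(w ** p for w in weights) / len(weights)) ** (1.0 / p)


def pnorm_and(weights, p=2.0):
    """Conjunctive similarity: 1 - (sum((1-w)^p) / t)^(1/p), with min(w) at p = inf."""
    if not weights or p < 1:
        raise ValueError("need at least one weight and p >= 1")
    if math.isinf(p):
        return min(weights)
    return 1 - (sum((1 - w) ** p for w in weights) / len(weights)) ** (1.0 / p)
```

With p = 1 both functions reduce to the plain average of the weights, so AND and OR are no longer distinguished and the ranking behaves like a simple vector-space style sum. With p = ∞ they reduce to min and max, which is the strict Boolean behaviour on 0/1 weights. Intermediate values of p trade off between the two.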

Examples
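Consider the query q = (K1 ∧ K2) ∨ K3. The similarity between query q and document d can be computed using the formula:

\[ sim(q, d) = \left( \frac{\left( 1 - \left( \frac{(1 - w_1)^p + (1 - w_2)^p}{2} \right)^{1/p} \right)^p + w_3^p}{2} \right)^{1/p} \]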
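A nested query such as this one can be scored by evaluating the inner operator first and feeding its result to the outer operator as if it were a term weight. The sketch below does this recursively; the tuple-based query representation and the function names are assumptions made for illustration, not part of the model's definition.

```python
def pnorm_or(scores, p):
    """Disjunctive p-norm similarity of a list of scores in [0, 1]."""
    return (sum(s ** p for s in scores) / len(scores)) ** (1.0 / p)


def pnorm_and(scores, p):
    """Conjunctive p-norm similarity of a list of scores in [0, 1]."""
    return 1 - (sum((1 - s) ** p for s in scores) / len(scores)) ** (1.0 / p)


def evaluate(query, weights, p=2.0):
    """Score a query tree against one document.

    query:   a term name, or a tuple ("and" | "or", sub-query, sub-query, ...)
    weights: term -> weight of that term in the document (missing terms count as 0)
    """
    if isinstance(query, str):
        return weights.get(query, 0.0)
    op, *operands = query
    scores = [evaluate(sub, weights, p) for sub in operands]
    if op == "and":
        return pnorm_and(scores, p)
    if op == "or":
        return pnorm_or(scores, p)
    raise ValueError(f"unknown operator: {op!r}")


# q = (K1 AND K2) OR K3, scored for a hypothetical document.
q = ("or", ("and", "K1", "K2"), "K3")
score = evaluate(q, {"K1": 1.0, "K2": 1.0, "K3": 0.0}, p=2.0)
```

Here the document fully satisfies the conjunction but lacks K3, so the inner AND evaluates to 1 and the outer OR gives roughly 0.71.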

Improvements over the Standard Boolean Model


Lee and Fox[4] compared the Standard and Extended Boolean models on three test collections: CISI, CACM and INSPEC. Using P-norms, they obtained average precision improvements of 79%, 106% and 210% over the Standard model for the CISI, CACM and INSPEC collections, respectively.
The P-norm model is computationally expensive because of the number of exponentiation operations it requires, but it achieves much better results than the Standard model and even Fuzzy retrieval techniques. The Standard Boolean model remains the most efficient.
Further reading
Adaptive Feedback Methods in an Extended Boolean Model, by Dr. Jongpill Choi
Interpolation of the extended Boolean retrieval model
Fox, E.; Betrabet, S.; Koushik, M.; Lee, W. (1992), Information Retrieval: Algorithms and Data structures; Extended Boolean model, Prentice-Hall, Inc.
Skorkovská, Lucie; Ircing, Pavel (2009), Experiments with Automatic Query Formulation in the Extended Boolean Model, Springer Berlin / Heidelberg

See also
Information retrieval

References
1. Salton, Gerard; Fox, Edward A.; Wu, Harry (1983), Extended Boolean information retrieval (http://portal.acm.org/citation.cfm?id=358466), Communications of the ACM, Volume 26, Issue 11
2. Lusheng Wang (http://www.cs.cityu.edu.hk/~cs5286/Lectures/Lwang.ppt)
3. Garcia, Dr. E., The Extended Boolean Model - Weighted Queries: Term Weights, p-Norm Queries and Multiconcept Types. Boolean OR Extended? AND that is the Query (http://www.miislita.com/term-vector/term-vector-6-boolean-model.html)
4. Lee, W. C.; Fox, E. A. (1988), Experimental Comparison of Schemes for Interpreting Boolean Queries (http://eprints.cs.vt.edu/archive/00000112/01/TR-88-27.pdf) (PDF)

Retrieved from "https://en.wikipedia.org/w/index.php?title=Extended_Boolean_model&oldid=799696563"

This page was last edited on 9 September 2017, at 08:38.

Text is available under the Creative Commons Attribution-ShareAlike License; additional terms may apply. By using this site, you agree to the Terms of Use and Privacy Policy. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc., a non-profit organization.
