Extended Boolean Model
The Extended Boolean model was described in a 1983 Communications of the ACM article by Gerard Salton, Edward
A. Fox, and Harry Wu. Its goal is to overcome the drawbacks of the Boolean model that has been
used in information retrieval. The Boolean model does not consider term weights in queries, and the result set of a Boolean query is
often either too small or too big. The idea of the extended model is to make use of partial matching and term weights, as in the vector
space model. It combines the characteristics of the Vector Space Model with the properties of Boolean algebra and ranks
documents by their similarity to the query. This way a document that matches only some of the queried terms may be judged somewhat relevant
and be returned as a result, whereas in the Standard Boolean model it would not be.[1]
Thus, the extended Boolean model can be considered a generalization of both the Boolean and vector space models; those two are
special cases if suitable settings and definitions are employed. Further, research has shown that retrieval effectiveness improves relative to
standard Boolean query processing. Other research has shown that relevance feedback and query expansion can be integrated with extended
Boolean query processing.
Contents
Definitions
The 2 Dimensions Example
Generalizing the idea and P-norms
Examples
Improvements over the Standard Boolean Model
Further reading
See also
References
Definitions
In the Extended Boolean model, a document is represented as a vector (as in the vector space model). Each dimension i
corresponds to a separate term associated with the document.
The weight of term K_x associated with document d_j is measured by its normalized term frequency and can be defined as:
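One commonly cited form of this weight is sketched below. It is stated as an assumption, since the symbols $f_{x,j}$ (normalized frequency of term K_x in document d_j) and $Idf_x$ (inverse document frequency of K_x) are not defined in the text above:

```latex
w_{x,j} = f_{x,j} \cdot \frac{Idf_x}{\max_i Idf_i}
```

Under this form each weight lies between 0 and 1, so a document can be treated as a point in the unit hypercube with one axis per term.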
The idea can be generalized using P-norms, which extend the notion of distance to include p-distances, where 1 ≤ p ≤ ∞ is a new parameter.[3]
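A minimal Python sketch of the p-norm similarity for purely disjunctive and purely conjunctive queries may make the role of p concrete. It assumes unweighted query terms and document term weights already scaled to the range 0 to 1; the function names `sim_or` and `sim_and` are hypothetical and chosen only for illustration.

```python
import math
from typing import Sequence


def sim_or(weights: Sequence[float], p: float) -> float:
    """p-norm similarity of a document to the query K1 OR K2 OR ... OR Kt.

    `weights` holds the document's weight for each query term, each in [0, 1].
    The score grows with the distance from the all-zeros point.
    """
    if math.isinf(p):
        # Limit p -> infinity: behaves like the maximum of the weights.
        return max(weights)
    t = len(weights)
    return (sum(w ** p for w in weights) / t) ** (1.0 / p)


def sim_and(weights: Sequence[float], p: float) -> float:
    """p-norm similarity of a document to the query K1 AND K2 AND ... AND Kt.

    The score shrinks with the distance from the all-ones point.
    """
    if math.isinf(p):
        # Limit p -> infinity: behaves like the minimum of the weights.
        return min(weights)
    t = len(weights)
    return 1.0 - (sum((1.0 - w) ** p for w in weights) / t) ** (1.0 / p)


if __name__ == "__main__":
    doc = [1.0, 0.0]  # the document contains K1 fully and K2 not at all
    for p in (1, 2, math.inf):
        print(p, sim_or(doc, p), sim_and(doc, p))
```

With p = 1 both functions reduce to the plain average of the weights, so AND and OR are indistinguishable, as in a vector-style ranking. As p grows toward infinity, OR approaches the maximum and AND the minimum of the weights, which reproduces strict Boolean behaviour for 0/1 weights.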
Examples
Consider the query q = (K1 ∧ K2) ∨ K3. The similarity between query q and document d can be computed using the formula:
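A reconstruction of that formula under the p-norm definitions is sketched below. It assumes unweighted query terms and writes $w_1$, $w_2$, $w_3$ for the weights of K1, K2 and K3 in document d (this notation is introduced here for illustration). The inner term scores the conjunction K1 ∧ K2, and the outer p-norm combines that score with $w_3$ as a disjunction:

```latex
sim(q, d) = \left( \frac{ \left( 1 - \left( \frac{(1 - w_1)^p + (1 - w_2)^p}{2} \right)^{1/p} \right)^{p} + w_3^{\,p} }{2} \right)^{1/p}
```

For example, with p = 1 the expression collapses to the average of $w_3$ and the mean of $w_1$ and $w_2$.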
See also
Information retrieval
References
1. Salton, Gerard; Fox, Edward A.; Wu, Harry (1983), Extended Boolean information retrieval (http://portal.acm.org/citation.cfm?id=358466), Communications of the ACM, Volume 26, Issue 11
2. Lusheng Wang (http://www.cs.cityu.edu.hk/~cs5286/Lectures/Lwang.ppt)
3. Garcia, Dr. E., The Extended Boolean Model - Weighted Queries: Term Weights, p-Norm Queries and Multiconcept Types. Boolean OR Extended? AND that is the Query (http://www.miislita.com/term-vector/term-vector-6-boolean-model.html)
4. Lee, W. C.; Fox, E. A. (1988), Experimental Comparison of Schemes for Interpreting Boolean Queries (http://eprints.cs.vt.edu/archive/00000112/01/TR-88-27.pdf) (PDF)