
Chapter #4: Query Languages

This document discusses different types of query languages used for information retrieval. It covers keyword-based queries including single-word, context, Boolean, and natural language queries. It also discusses pattern matching queries and allowing errors. Finally, it discusses structural queries for fixed, hypertext, and hierarchical structures.

Chapter #4

Query Languages

Information Retrieval in Practice


By: Eng. Ali Hassan Ahmed
Keyword-Based Querying
A query is the formulation of a user's information need
Keyword-based queries are popular

1. Single-Word Queries
2. Context Queries
3. Boolean Queries
4. Natural Language

(The list runs from data retrieval at the top to information retrieval at the bottom.)
Single-Word Queries
• A query is formulated by a single word
• A document is formed by a long sequence of words
• A word is a sequence of letters surrounded by separators
• What are letters and separators? e.g. ‘on-line’
• The division of the text into words is not arbitrary
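
The effect of the separator choice can be sketched in a few lines of Python. This is only an illustration: the function name `tokenize`, the sample sentence, and the two separator patterns are assumptions, not part of the slides.

```python
import re

def tokenize(text, separators=r"[^A-Za-z]+"):
    """Split text into lowercase words; whatever matches `separators` divides words."""
    return [w.lower() for w in re.split(separators, text) if w]

text = "On-line retrieval systems answer single-word queries."

# Hyphen treated as a separator: 'on-line' becomes the two words 'on' and 'line'.
words_split = tokenize(text)

# Hyphen treated as a letter: 'on-line' stays a single word.
words_joined = tokenize(text, separators=r"[^A-Za-z-]+")

def matches(query_word, words):
    """A single-word query matches if the word occurs in the document."""
    return query_word.lower() in words
```

With the first tokenization the query `line` matches the document; with the second it does not, while `on-line` does.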
Context Queries
• Definition
- Search for words in a given context
• Types
- Phrase
> a sequence of single-word queries
> e.g. enhance retrieval
- Proximity
> a sequence of single words or phrases, together with a maximum allowed distance between them
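
A minimal sketch of both context query types, assuming the document is already split into words and that distance is measured as the difference between word positions (the slides do not fix the unit). The function names and the sample document are illustrative.

```python
def positions(words, term):
    """All positions at which `term` occurs in the word list."""
    return [i for i, w in enumerate(words) if w == term]

def phrase_match(words, phrase):
    """True if the words of `phrase` occur consecutively in `words`."""
    n = len(phrase)
    return any(words[i:i + n] == phrase for i in range(len(words) - n + 1))

def proximity_match(words, first, second, max_distance):
    """True if `first` and `second` occur within `max_distance` positions, in either order."""
    return any(abs(i - j) <= max_distance
               for i in positions(words, first)
               for j in positions(words, second))

doc = "methods that enhance the retrieval of documents".split()

is_phrase = phrase_match(doc, ["enhance", "retrieval"])        # 'the' intervenes
is_near = proximity_match(doc, "enhance", "retrieval", 2)      # positions 2 and 4
```

Here the phrase query fails because another word sits between the two terms, while the proximity query with a maximum distance of 2 succeeds.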
Boolean Queries
• Definition
- A syntax composed of atoms that retrieve documents, and of Boolean operators that work on their operands
- e.g. translation AND syntax OR syntactic
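
One way to picture this is with sets of document identifiers: each atom retrieves a set, and the operators combine sets. The sketch below uses a small made-up collection and a hypothetical helper `atom`. The example query does not say how the operators group, so both readings are evaluated.

```python
# Hypothetical four-document collection.
docs = {
    1: "translation of syntax trees",
    2: "syntactic analysis",
    3: "machine translation with syntactic features",
    4: "translation memory",
}

# Inverted index: word -> set of ids of the documents containing it.
index = {}
for doc_id, text in docs.items():
    for word in text.split():
        index.setdefault(word, set()).add(doc_id)

def atom(word):
    """An atom retrieves the set of documents containing the word."""
    return index.get(word, set())

# Reading 1: translation AND (syntax OR syntactic)
grouped_or = atom("translation") & (atom("syntax") | atom("syntactic"))

# Reading 2: (translation AND syntax) OR syntactic
grouped_and = (atom("translation") & atom("syntax")) | atom("syntactic")
```

The two readings return different answers on this collection (documents 1 and 3 versus documents 1, 2 and 3), which is why a query syntax needs parentheses or a fixed operator precedence.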
Natural Language

A query is an enumeration of words and context queries
All the documents matching a portion of the user query are retrieved
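
A rough sketch of partial matching: every document that contains at least one query word is retrieved. Ordering the results by the number of matched words is an assumption added for illustration, as are the function name and the sample collection.

```python
def natural_language_query(query, docs):
    """Return ids of documents matching at least one query word, best matches first."""
    terms = set(query.lower().split())
    scores = {}
    for doc_id, text in docs.items():
        matched = terms & set(text.lower().split())
        if matched:
            # Assumed ranking criterion: number of distinct query words matched.
            scores[doc_id] = len(matched)
    return sorted(scores, key=lambda d: (-scores[d], d))

docs = {
    1: "query languages for information retrieval",
    2: "boolean query syntax",
    3: "cooking with garlic",
}

ranked = natural_language_query("information retrieval query", docs)
```

Document 1 matches all three query words, document 2 matches one, and document 3 matches none, so it is not retrieved.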
Pattern Matching
• Data retrieval
• A pattern is a set of syntactic features that must occur in a text segment
• Types
- Words
- Prefixes
  e.g. ‘comput’ -> ‘computer’, ‘computation’, ‘computing’, etc.
- Suffixes
  e.g. ‘ters’ -> ‘computers’, ‘testers’, ‘painters’, etc.
- Substrings
  e.g. ‘tal’ -> ‘coastal’, ‘talk’, ‘metallic’, etc.
- Ranges
  e.g. between ‘held’ and ‘hold’ -> ‘hoax’ and ‘hissing’
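
The pattern types above can be sketched against a small sorted vocabulary. The word list and function names are illustrative, and treating the range bounds as inclusive is an assumption.

```python
from bisect import bisect_left, bisect_right

# Hypothetical vocabulary, kept sorted so that ranges can use binary search.
vocabulary = sorted([
    "coastal", "computation", "computer", "computers", "held", "hissing",
    "hoax", "hold", "metallic", "painters", "talk", "testers",
])

def prefix(p):
    """Words that start with p."""
    return [w for w in vocabulary if w.startswith(p)]

def suffix(s):
    """Words that end with s."""
    return [w for w in vocabulary if w.endswith(s)]

def substring(s):
    """Words that contain s anywhere."""
    return [w for w in vocabulary if s in w]

def word_range(low, high):
    """Words w in lexicographic order with low <= w <= high (inclusive bounds assumed)."""
    return vocabulary[bisect_left(vocabulary, low):bisect_right(vocabulary, high)]
```

For instance, `word_range("held", "hold")` returns the two bounds plus ‘hissing’ and ‘hoax’, since both fall between them in lexicographic order.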
Allowing errors
• Retrieve all text words which are ‘similar’ to the given word
• Edit distance:
the minimum number of character insertions, deletions, and replacements needed to make two strings equal, e.g. ‘flower’ and ‘flo wer’ (distance 1: one inserted space)
• Maximum allowed edit distance:
the query specifies the maximum number of allowed errors for a word to match the pattern
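
The edit distance defined above can be computed with the standard dynamic-programming recurrence. The sketch below is one common formulation, not necessarily the one the course uses. The helper `similar_words` and the word list are illustrative.

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions, and replacements turning a into b."""
    prev = list(range(len(b) + 1))          # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,            # delete ca
                            curr[j - 1] + 1,        # insert cb
                            prev[j - 1] + cost))    # replace ca by cb, or keep it
        prev = curr
    return prev[-1]

def similar_words(query, words, max_errors):
    """Words whose edit distance to the query is at most max_errors."""
    return [w for w in words if edit_distance(query, w) <= max_errors]

candidates = ["flower", "flo wer", "flowers", "floor", "tower"]
within_one = similar_words("flower", candidates, 1)
```

With a maximum of one error, ‘flower’, ‘flo wer’ and ‘flowers’ match, while ‘floor’ and ‘tower’ each need two edits and are rejected.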
Structural Queries
• Mixing contents and structure in queries
- contents: words, phrases, or patterns
- structural constraints: containment, proximity, or other restrictions on structural elements
• Three main structures
- Fixed structure
- Hypertext structure
- Hierarchical structure
Fixed Structure
Document: a fixed set of fields
Ex: a mail has a sender, a receiver, a date, a subject, and a body field
Query example: search for the mails sent to a given person with “Notes” in the Subject field
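
The mail example can be sketched with each document represented as a record with the same five fields. The sample mails, names, and the function `search_mails` are made up for illustration.

```python
# Every document has the same fixed set of fields (hypothetical data).
mails = [
    {"sender": "omar", "receiver": "sara", "date": "2021-03-01",
     "subject": "Lecture Notes", "body": "Slides attached."},
    {"sender": "sara", "receiver": "omar", "date": "2021-03-02",
     "subject": "Re: Lecture Notes", "body": "Thanks."},
    {"sender": "lina", "receiver": "sara", "date": "2021-03-03",
     "subject": "Meeting", "body": "Bring your notes."},
]

def search_mails(mails, receiver, subject_word):
    """Mails sent to `receiver` whose subject field contains `subject_word`."""
    wanted = subject_word.lower()
    return [m for m in mails
            if m["receiver"] == receiver
            and wanted in m["subject"].lower().split()]

hits = search_mails(mails, "sara", "Notes")
```

Only the first mail is returned: the third is addressed to the same person but has the word in its body, not in its subject field.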
Hypertext
A hypertext is a directed graph where nodes hold some text (text contents)
The links represent connections between nodes or between positions inside nodes (structural connectivity)
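
As a rough sketch, such a graph can be held in two dictionaries: one for the text of each node and one for its outgoing links. The node names, texts, and helper functions are hypothetical, and links between positions inside nodes are left out for brevity.

```python
# Text contents: node name -> text held by the node.
nodes = {
    "home": "query languages overview",
    "boolean": "boolean queries and operators",
    "pattern": "pattern matching queries",
}

# Structural connectivity: node name -> nodes it links to (directed).
links = {
    "home": ["boolean", "pattern"],
    "boolean": ["home"],
    "pattern": [],
}

def nodes_containing(word):
    """Search by content: nodes whose text contains the word."""
    return [name for name, text in nodes.items() if word in text.split()]

def linked_from(node):
    """Browse by structure: nodes reachable from `node` by following one link."""
    return links.get(node, [])
```

Searching uses only the text contents, while browsing uses only the links, so the two kinds of access stay separate in this representation.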
Hypertext : WebGlimpse

WebGlimpse: combines browsing and searching on the Web
Hierarchical Structure
WAIS (Wide Area Information Servers)
• Dates from the beginning of the 1990s
• Queries databases through the Internet
Lists of References
• Overlapping and nesting are not allowed
• All elements must be of the same type, e.g. only sections, or only paragraphs
• A reference is a pointer to a region of the database
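
The two constraints can be checked mechanically if a reference is modelled as a typed region with start and end offsets. That representation, the `Reference` type, and the function name are assumptions made for this sketch.

```python
from typing import NamedTuple

class Reference(NamedTuple):
    """Pointer to a region of the database (offsets are an assumed representation)."""
    kind: str    # e.g. "section" or "paragraph"
    start: int   # first position of the region
    end: int     # position just past the region (exclusive)

def is_valid_reference_list(refs):
    """True if all references share one type and no two regions overlap or nest."""
    if len({r.kind for r in refs}) > 1:
        return False
    ordered = sorted(refs, key=lambda r: r.start)
    # After sorting by start, any overlap or nesting shows up between neighbours.
    return all(a.end <= b.start for a, b in zip(ordered, ordered[1:]))

flat = [Reference("section", 0, 100), Reference("section", 100, 250)]
nested = [Reference("section", 0, 100), Reference("section", 20, 40)]
mixed = [Reference("section", 0, 100), Reference("paragraph", 100, 120)]
```

Of the three sample lists, only `flat` satisfies both constraints: `nested` contains one region inside another, and `mixed` combines two element types.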
Proximal Nodes
• This model tries to find a good compromise between expressiveness and efficiency.
• It does not define a specific language, but a model in which it is shown that a number of useful operators can be included while achieving good efficiency.
