INFORMATION RETRIEVAL SYSTEMS (IRS)
Course Instructor
P.Veera Swamy
Assistant Professor
UNIT-5 SYLLABUS
Text Search Algorithms:
⚫ Introduction
⚫ Software text search algorithms
⚫ Hardware text search systems
Information System Evaluation:
⚫ Introduction
⚫ Measures used in system evaluation
⚫ Measurement example – TREC results.
OVERVIEW
Three classical techniques have been defined for
organizing items in a textual database, for rapidly
identifying the relevant items, and for eliminating items
that do not satisfy a search:
1. Full text scanning (streaming) - see the sketch below
2. Word inversion
3. Multi-attribute retrieval
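For illustration, a minimal sketch of the first technique, full text scanning, in Python; the function name and sample data are hypothetical, not taken from any particular system:

```python
# Hypothetical sketch of full text scanning: every item's raw text is
# scanned for the query term at search time; no index is built.

def full_text_scan(items, term):
    """Return the ids of items whose text contains `term` (case-insensitive)."""
    hits = []
    for item_id, text in items.items():
        if term.lower() in text.lower():  # brute-force streaming scan
            hits.append(item_id)
    return hits

database = {
    1: "Text search algorithms scan every item in the database.",
    2: "Word inversion builds an index instead of scanning.",
}
print(full_text_scan(database, "scan"))  # -> [1, 2]
```

Word inversion, by contrast, trades this per-query scanning cost for an index built once, when items arrive.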
[Figure: the user's information need drives a search; the results are returned to the user and evaluated.]
INTRODUCTION TO INFORMATION SYSTEM EVALUATION
Early evaluations focused primarily on the effectiveness
of search algorithms. The creation of the annual Text
REtrieval Conference (TREC), sponsored by the Defense
Advanced Research Projects Agency (DARPA) and the
National Institute of Standards and Technology (NIST),
changed the standard process of evaluating information
systems.
The conference has been held every year since 1992,
usually in the fall. It provides academic researchers and
commercial companies with a standard database
consisting of gigabytes of test data, search statements,
and the expected results of the searches, for testing
their systems.
In recent years the evaluation of Information Retrieval
Systems and of techniques for indexing, sorting,
searching and retrieving information has become
increasingly important.
This growth in interest is due to two major reasons:
the growing number of retrieval systems in use and an
additional focus on the evaluation methods themselves.
There are many reasons to evaluate the
effectiveness of an Information Retrieval System
(Belkin-93, Callan-93):
⚫ To aid in the selection of a system to procure
⚫ To monitor and evaluate system effectiveness
⚫ To evaluate the query generation process for improvements
⚫ To provide inputs to cost-benefit analysis
of an information system
⚫ To determine the effects of changes made
to an existing information System
EVALUATION CRITERIA
Effectiveness
⚫ System-only, human+system
Efficiency
⚫ Retrieval time, indexing time, index size
Usability
⚫ Learnability, novice use, expert use
From an academic perspective, measurements are
focused on the specific effectiveness of a system and
usually are applied to determining the effects of
changing a system's algorithms or comparing
algorithms among systems.
From a commercial perspective, measurements are
also focused on availability and reliability.
The most important evaluation metrics of
information systems will always be biased by human
subjectivity.
This problem arises from the specific data collected
to measure the resources a user expends in locating
relevant information.
A factor in most metrics for determining how well a
system is working is the relevancy of items.
Relevancy of an item, however, is not a binary
judgment but a continuous function between an
item's being exactly what is being looked for and its
being totally unrelated.
To discuss relevancy, it is necessary to define the context
under which the concept is used. From a human judgment
standpoint, relevancy can be considered:
Subjective - depends upon a specific user's judgment
Situational - relates to a user's requirements
Cognitive - depends on human perception and
behavior
Temporal - changes over time
Measurable - observable at a point in time
In a dynamic environment, each user has his own
understanding of the requirement and the threshold on
what is acceptable. Based upon his cognitive model of
the information space and the problem, the user judges
a particular item. Some users consider information they
already know to be non-relevant to their information
need.
⚫ Example: An article that the user wrote does not provide
"new" relevant information to answer the user's query,
although the article may be very relevant to the search
statement. Also, the judgment of relevance can
vary over time. Retrieving information on an "XT" class of
PCs is not of significant relevance to personal computers in
1996, but would have been valuable in 1992. Thus, relevance
judgment is measurable at a point in time constrained by the
particular users and their thresholds on acceptability of
information.
Another way of specifying relevance is from
information, system and situational views.
1. Information View
The information view is subjective in nature and
pertains to human judgment of the conceptual
relatedness between an item and the search.
It involves the user's personal judgment of the
relevancy (aboutness) of the item to the user's
information need.
When reference experts (librarians, researchers,
subject specialists, indexers) assist the user, it is
assumed they can reasonably predict whether certain
information will satisfy the user's needs.
Ingwersen categorizes the information view into four
types of "aboutness" (Ingwersen-92):
1. Author Aboutness - determined by the author's language
as matched by the system in natural language retrieval
2. Indexer Aboutness - determined by the indexer's
transformation of the author's natural language into a
controlled vocabulary
3. Request Aboutness - determined by the user's or
intermediary's processing of a search statement into a
query
4. User Aboutness - determined by the indexer's attempt to
represent the document according to presuppositions
about what the user will want to know
2. System View
The system view relates to a match between query
terms and terms within an item. It can be objectively
observed, manipulated and tested without relying on
human judgment because it uses metrics associated
with the matching of the query to the item (Barry-94,
Schamber-90).
The semantic relatedness between queries and items
is assumed to be inherent in the index terms, which
represent the semantic content of the item in a
consistent and accurate fashion.
3. The Situation View
The situation view pertains to the relationship
between information and the user's information
problem situation. It assumes that only users can
make valid judgments regarding the suitability of
information to solve their information need.
Lancaster and Warner refer to information and
situation views as relevance and pertinence
respectively (Lancaster-93). Pertinence can be
defined as those items that satisfy the user's
information need at the time of retrieval.
MEASURES USED IN SYSTEM EVALUATIONS
To define the measures that can be used in
evaluating Information Retrieval Systems, it is useful
to define the major functions associated with
identifying relevant items in an information system.
Items arrive in the system and are automatically or
manually transformed by "indexing" into searchable
data structures.
The user determines what his information need is
and creates a search statement. The system processes
the search statement, returning potential hits. The
user selects those hits to review and accesses them.
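The flow just described can be illustrated with a minimal Python sketch: items are "indexed" into a searchable data structure (here an inverted index), a search statement is processed, and potential hits are returned. All names and sample data are illustrative assumptions, not from a specific system.

```python
# Sketch of the pipeline: index items on arrival, then match a
# search statement against the index to produce potential hits.

from collections import defaultdict

def index_items(items):
    inverted = defaultdict(set)          # term -> set of item ids
    for item_id, text in items.items():
        for term in text.lower().split():
            inverted[term].add(item_id)
    return inverted

def search(inverted, statement):
    terms = statement.lower().split()
    # potential hits: items containing every term of the search statement
    sets = [inverted.get(t, set()) for t in terms]
    return set.intersection(*sets) if sets else set()

items = {
    1: "evaluation of information retrieval systems",
    2: "text search algorithms and hardware systems",
}
idx = index_items(items)
print(search(idx, "retrieval systems"))  # -> {1}
```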
Identifying Relevant Items
Measurements can be made from two
perspectives:
⚫ User Perspective and
⚫ System Perspective
1. User Perspective
The user takes a personal view: an item's relevancy is
judged against the user's own information need, as
reflected in the four types of "aboutness" described
earlier.
2. System Perspective
The system perspective is based upon aggregate
functions, whereas the user takes a more personal view.
If a user's PC is not connecting to the system, then,
from that user's view, the system is not operational.
Techniques for collecting measurements can also
be objective or subjective.
⚫ An objective measure is one that is well-defined and
based upon numeric values derived from the system
operation.
⚫ A subjective measure can produce a number, but is based
upon an individual user's judgments.
MEASURES ASSOCIATED WITH SYSTEM EVALUATIONS
1. Search Process
2. Response Time
3. Consistency
4. Quality of the Search
5. Fallout
6. Unique Relevance Recall (URR)
7. Novelty Ratio
8. Coverage Ratio
9. Sought Recall
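Fallout and the recall-based ratios above are typically defined as ratios over sets of retrieved and relevant items. A minimal sketch, assuming the standard set-based definitions of precision, recall, and fallout; the sample sets are hypothetical:

```python
# Hedged sketch of the standard set-based formulas.
# n_total is the number of items in the collection.

def precision(retrieved, relevant):
    return len(retrieved & relevant) / len(retrieved)

def recall(retrieved, relevant):
    return len(retrieved & relevant) / len(relevant)

def fallout(retrieved, relevant, n_total):
    non_relevant = n_total - len(relevant)   # non-relevant items in collection
    return len(retrieved - relevant) / non_relevant

retrieved = {1, 2, 3, 4}        # ids returned by the search
relevant  = {2, 4, 5}           # ids judged relevant
print(precision(retrieved, relevant))    # 2/4 = 0.5
print(recall(retrieved, relevant))       # 2/3 ~ 0.67
print(fallout(retrieved, relevant, 10))  # 2/7 ~ 0.29
```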
SEARCH PROCESS
This is associated with a user creating a new search
or modifying an existing query. In creating a search,
an example of an objective measure is the time
required to create the query, measured from when
the user enters into a function allowing query input
to when the query is complete.
Completeness is defined as when the query is
executed. Although of value, the possibilities for
erroneous data (except in controlled environments)
are so great that data of this nature are not collected
in this area in operational systems.
Example: The erroneous data comes from the user
performing other activities in the middle of creating
the search such as going to get a cup of coffee.
RESPONSE TIME
Response time is a metric frequently collected to
determine the efficiency of the search execution. Response
time is defined as the time it takes to execute the search.
The ambiguity in response time originates from the
possible definitions of the end time.
The beginning is always correlated to when the user tells
the system to begin searching. The end time is affected by
the difference between the user's view and a system view.
From a user's perspective, a search could be considered
complete when the first result is available for the user to
review, especially if the system has new items available
whenever the user needs to see the next item. From a
system perspective, system resources are being used until
the search has determined all hits.
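A small Python sketch, assuming a hypothetical streaming search that yields hits one at a time, makes the difference between the two end times concrete:

```python
# User view: response time ends at the first available hit.
# System view: resources are used until all hits are determined.

import time

def streaming_search(items, term):
    for item_id, text in items.items():
        time.sleep(0.01)                 # stand-in for per-item work
        if term in text:
            yield item_id

items = {i: f"document {i} about retrieval" for i in range(100)}

start = time.perf_counter()
results = streaming_search(items, "retrieval")
first = next(results)                    # user view: first hit available
t_first = time.perf_counter() - start
remaining = list(results)                # system view: all hits determined
t_all = time.perf_counter() - start

print(f"first result after {t_first:.2f}s, all hits after {t_all:.2f}s")
```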
CONSISTENCY
To ensure consistency, response time is usually associated
with the completion of the search. This is one of the most
important measurements in a production system.
Determining how well a system is working answers the
typical concern of a user: "the system is working slow
today."
It is difficult to define objective measures on the process of
a user selecting hits for review and reviewing them. The
problems associated with search creation apply to this
operation.
Using time as a metric does not account for reading and
cognitive skills of the user along with the user performing
other activities during the review process.
Data are usually gathered on the search creation and
hit-file review processes by subjective techniques, such
as questionnaires, to evaluate system effectiveness.
QUALITY OF THE SEARCH