Information Retrieval System
Summary:
Much of the research in Information Retrieval has concerned improvements to similarity
computations, statistics gathering, and term extraction, with the goal of improving effectiveness.
However, a simple examination of user characteristics readily shows that the method of
computing similarity is less important than the behavior of the system interface and
environmental factors. It is hypothesised that there must be knowledge of the relationship
between a query, its user, the environment, and the query and user instantiation in the real world.
This hypothesis and others are demonstrated. With facilities for interaction and feedback
appropriately incorporated, effectiveness of 100% can be achieved.
Introduction:
Information Retrieval is the science of locating, from a large document collection, those
documents that fulfil a specified information need [1, 2, 3, 4]. Much of Information Retrieval
research is concerned with proposing and testing methodologies intended to perform this
function. To perform such tests it is necessary to make assumptions about the behavior of users
and the properties of text. For reasons of experimental design (following the assumption that
"good" experiments should not have lots of variables), the user is often assigned the role of reader
with no part in the process that produces answers from the document collection.
It might be thought that a formal model of the relationships between queries, documents,
meaning, and relevance could be used as a foundation for information retrieval. It is argued that
there can be no such model: humans cannot be left out of the equation, yet cannot be modelled.
(This paper does not consider the information needs of non-humans, such as RoboCup
competitors.) This paper considers the basis and aims of information retrieval, examines its
assumptions and, on the basis of these observations, describes user experiments showing just
how much effectiveness can be improved. These experiments justify great optimism for future
system measurement and design, with full or at least 100% effectiveness easily achieved.
Language and text and their impact on information retrieval are considered first, followed by
an examination of the interaction of users, their environment, and relevance. The suggested
system design and experiments are then reported.
Definition: