Information Retrieval System Assignment-1


INFORMATION RETRIEVAL SYSTEM

ASSIGNMENT-1
Q1) Define IRS and its goals?

A1) An information retrieval system (IRS) is a system that is capable of storing, retrieving, and maintaining information.

The information may be an image, audio, or any other multimedia object.

Information retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources.

An information retrieval system may also be defined as a software program that deals with the organization, storage, retrieval, and evaluation of information from documents.

** Goals of IRS:

1) A goal of information retrieval is to optimize the speed of the query.

(i) To identify the sources of information relevant to the areas of interest of the target users’ community

(ii) To analyze the contents of the sources (documents)

(iii) To represent the contents of the analyzed sources for matching with the users’ queries

(iv) To match the search statement with the stored database

2Q) Describe the objectives of IRS?

A2) The two major measures are:

1) Precision: The ability to retrieve top-ranked documents that are mostly relevant.

2) Recall: The ability of the search to find all the relevant items in the corpus.
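
The two measures can be illustrated with a small calculation. This is a minimal sketch, assuming the usual set-based definitions (precision = relevant retrieved / total retrieved, recall = relevant retrieved / total relevant). The function name `precision_recall` and the document IDs are made up for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: IDs of the documents the system returned (hypothetical data)
    relevant:  IDs of the documents that are actually relevant in the corpus
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents that were retrieved

    # Guard against division by zero when either set is empty.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query result: 4 documents returned, 3 relevant in the corpus.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d1", "d3", "d5"]

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```

Here two of the four retrieved documents are relevant, so precision is 2/4, and two of the three relevant documents were found, so recall is 2/3.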
3Q) Explain Browser capabilities?

A3)
4Q) Explain Catalog and Indexing?

A4)

**INDEXING:

The function of indexing in libraries and information retrieval systems is to indicate the whereabouts or absence of items relevant to a request. It is essentially a time-saving mechanism. Theoretically, we can always find the relevant items by an exhaustive search through the whole collection.

Since this is economically impossible, the size of the store to be examined is reduced by classification, using this term in its very broadest sense, i.e., as the recognition of useful similarities between documents and the establishment of useful document groups based on these similarities. So documents, or document surrogates, are assigned to a limited number of classes according to certain criteria, in particular, their subject content.

Most library indexes, other than those to imaginative works, are aimed ultimately at the retrieval of subject information.

Indexing is of two types:

1) Traditional

2) Automatic

** Cataloging:

Cataloging displays a list of relevant items.

Only the items that are necessary are retrieved.

This procedure reduces the workload on the system.

The information is used properly because only the required amount of data is handled and retrieved.

E.g.: images, names, video.


5Q) Explain signature file structure.

A5) Today, one or more of the following four techniques are frequently used: full text searching, B-trees, inversion, and the signature file. Full text searching imposes no space overhead but requires long response time. In contrast, B-trees, inversion, and the signature file work quickly, but need a large intermediary representation structure (index), which provides direct links to relevant data.

The signature technique can be used not only in document databases but also in relational and object-oriented databases.

SIGNATURE FILES:
Intuitively, a signature file can be considered as a set of bit strings, which are called signatures. Compared to the inverted index, the signature file is more efficient in handling new insertions and queries on parts of words.

But the scheme introduces information loss.
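
The idea of a signature as a bit string can be sketched in code. This is a minimal illustration and not a definitive implementation: it assumes one common construction, superimposed coding, in which each word is hashed to a few bit positions and the word patterns are OR-ed into one document signature. The text above does not say which construction is meant, and the names `SIGNATURE_BITS`, `BITS_PER_WORD`, `word_signature`, `document_signature`, `candidates`, and `search` are all hypothetical.

```python
import hashlib

SIGNATURE_BITS = 64  # assumed signature width
BITS_PER_WORD = 3    # assumed number of bit positions set per word


def word_signature(word):
    """Hash a word to a bit pattern with at most BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode()).digest()
        position = int.from_bytes(digest[:4], "big") % SIGNATURE_BITS
        sig |= 1 << position
    return sig


def document_signature(text):
    """OR the word patterns together into a single document signature."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word)
    return sig


def candidates(query_word, signatures):
    """Documents whose signature contains every bit of the query pattern.

    Different words can set the same bits, so this list may contain
    documents that do not hold the word. That is the information loss.
    """
    q = word_signature(query_word)
    return [doc_id for doc_id, sig in signatures.items() if (sig & q) == q]


def search(query_word, documents, signatures):
    """Filter by signature, then check the text to discard false matches."""
    return [
        doc_id
        for doc_id in candidates(query_word, signatures)
        if query_word.lower() in documents[doc_id].lower().split()
    ]


# Hypothetical two-document collection.
documents = {
    "doc1": "signature files store bit strings",
    "doc2": "inverted files store keyword lists",
}
signatures = {doc_id: document_signature(text) for doc_id, text in documents.items()}

print(search("signature", documents, signatures))
print(search("store", documents, signatures))
```

A new document is added by computing one more signature, which is why insertions are cheap. The signature check can only rule documents out, so the candidates are verified against the actual text.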


The signature file is often compared with the inverted file structure.
The concept of the inverted file type of index is as follows. Assume a set of
documents. Each document is assigned a list of keywords or attributes, with optional
relevance weights associated with each keyword (attribute). An inverted file is then
the sorted list (or index) of keywords (attributes), with each keyword having links to
the documents containing that keyword (see Figure 3.1) . This is the kind of index
found in most commercial library systems. 
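
The inverted file described above can be sketched as a small data structure. This is a minimal illustration under assumed data: the documents, keywords, and weights are invented, and the function name `build_inverted_file` is hypothetical.

```python
from collections import defaultdict


def build_inverted_file(doc_keywords):
    """Build a sorted keyword index from per-document keyword lists.

    doc_keywords maps a document ID to {keyword: relevance weight}.
    The result maps each keyword to a list of (document ID, weight) links.
    """
    postings = defaultdict(list)
    for doc_id, keywords in doc_keywords.items():
        for keyword, weight in keywords.items():
            postings[keyword].append((doc_id, weight))
    # Keep the keywords in sorted order, as in the definition above.
    return {keyword: sorted(postings[keyword]) for keyword in sorted(postings)}


# Hypothetical documents with weighted keywords.
doc_keywords = {
    "doc1": {"retrieval": 0.8, "index": 0.5},
    "doc2": {"index": 0.9, "signature": 0.7},
}

inverted_file = build_inverted_file(doc_keywords)
for keyword, links in inverted_file.items():
    print(keyword, links)
```

Looking up a keyword such as `inverted_file["index"]` returns the links to every document that contains it, without scanning the documents themselves.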
