IRS Objective

The document outlines key concepts and terminologies related to Information Retrieval (IR) systems, including precision, recall, indexing techniques, and the roles of various algorithms. It discusses the importance of structuring data, the capabilities of IR systems, and the processes involved in indexing and searching for information. Additionally, it highlights the differences between structured and unstructured data formats, as well as the applications of IR in various fields.

Uploaded by

fastestbhau18

UNIT - 1

1. The ability to retrieve top-ranked documents that are mostly relevant

a. Recall

b. Precision ✓

c. Ranking

d. Zoning

2. ______ follows the structured data format

a. IR

b. DBMS ✓

c. Both

d. None of the Above

3. ______ follows the unstructured data format

a. IR ✓

b. DBMS

c. Both

d. None of the Above

4. Mapping between a user’s specified need and the items in the IR system is done by the _________ capability

a. Search ✓

b. Browse

c. Miscellaneous

d. All of the Above

5. Users can retrieve information according to their needs in an IRS using _________ capabilities.

a. Search

b. Browse

c. Miscellaneous

d. All of the Above ✓

6. The general objective of an IR system is

a. to minimize the overhead of a user

b. locating needed information

c. to maximize the overhead of a user

d. Both a & b ✓

7. For a search performed by the user, the IRS produces the results in ____ ways.

a. 2

b. 3

c. 4 ✓

d. 5

8. Precision ranges from ________

a. 0-1 ✓

b. 1-1

c. 1-0

d. Any Range

9. Recall ranges from _______

a. 0-1 ✓

b. 1-1

c. 1-0

d. Any Range
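
As a worked illustration of questions 8 and 9, the sketch below computes both measures for a single query. Precision is the fraction of retrieved items that are relevant and recall is the fraction of relevant items that were retrieved, so both fall between 0 and 1. The function name and the item ids are assumptions made for the example.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: set of item ids returned by the search
    relevant:  set of item ids that actually satisfy the need
    """
    hits = len(retrieved & relevant)  # relevant items that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical item ids, for illustration only.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d7"}
p, r = precision_recall(retrieved, relevant)
print(p, r)
```

Here two of the four retrieved items are relevant (precision 0.5) and two of the three relevant items were found (recall about 0.67).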

10. In an IRS, identification of processing tokens takes place in the ______ process

a. Item Normalization ✓

b. Selective Dissemination of Information

c. Archival Document Database Search

d. Index Database Search + Automatic File Build Process

11. Logical restructuring of items is ___________

a. Ranking

b. Zoning ✓

c. Classification

d. None

12. The capability to create private and public index files is frequently implemented via a

a. structured Database Management System ✓

b. Information Retrieval Systems

c. Digital Library

d. Data Warehouse

13. In the__________ process, the user can logically store an item in a file along with
additional index terms and descriptive text.

a. Zoning

b. Index ✓

c. Classification

d. Ranking

14. The _________ algorithm saves system resources by eliminating from the set of searchable processing tokens those that have little value to the search

a. Stemming

b. Identify Processing Tokens

c. Stop Algorithm ✓

d. Characterize Tokens
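
The idea behind question 14 can be sketched in a few lines: tokens that appear on a stop list are dropped before they reach the searchable index. The stop list and the function name below are assumptions for illustration; production systems use longer, tuned lists.

```python
# Assumed stop list; real systems use much longer, tuned lists.
STOP_WORDS = {"a", "an", "and", "of", "the", "to", "in", "is"}


def remove_stop_words(text):
    """Split text into lowercase tokens and drop those on the stop list."""
    tokens = text.lower().split()
    return [t for t in tokens if t not in STOP_WORDS]


tokens = remove_stop_words("The retrieval of items in the database")
print(tokens)
```

Only "retrieval", "items" and "database" survive, so the index stores three tokens instead of seven.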

15. _______ provides the capability to dynamically compare newly received items in the
information system against standing statements of interest of users and deliver the item to
those users whose statement of interest matches the contents of the items.

a. Item Normalization

b. Selective Dissemination of Information ✓

c. Archival Document Database Search

d. Index Database Search + Automatic File Build Process
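
A minimal sketch of the process described in question 15: each newly received item is compared against every user's standing statement of interest and delivered to the users whose statement matches. The profile format (a set of required terms) and all names are simplifying assumptions; real profiles are usually full queries.

```python
def disseminate(item_text, profiles):
    """Return the users whose standing profile matches a new item.

    profiles maps a user name to a set of interest terms; a profile
    matches when every one of its terms occurs in the item
    (a simplifying assumption, real profiles are richer queries).
    """
    words = set(item_text.lower().split())
    return sorted(user for user, terms in profiles.items() if terms <= words)


# Hypothetical users and interests.
profiles = {
    "alice": {"oil", "prices"},
    "bob": {"football"},
}
recipients = disseminate("Oil prices rise again", profiles)
print(recipients)
```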

16. To assist the users in generating indexes, the system provides a process called ________

a. Item Normalization

b. Selective Dissemination of Information

c. Automatic File Build ✓

d. None of the Above

17. ________ will restrict the distance allowed within an item between two search terms.

a. Boolean Operators

b. Proximity ✓

c. Contiguous Word Phrase

d. Term Masking

18. “United States of America” is an example of a

a. Boolean Operators

b. Proximity

c. Contiguous Word Phrase ✓

d. Term Masking
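
Questions 17 and 18 contrast two positional constraints. The sketch below, with assumed function names and a made-up sentence, checks a proximity constraint (two terms at most a given number of words apart) and a contiguous word phrase (words adjacent and in order).

```python
def positions(tokens, term):
    """Return every index at which term occurs in the token list."""
    return [i for i, t in enumerate(tokens) if t == term]


def within(tokens, term_a, term_b, max_distance):
    """True if term_a and term_b occur at most max_distance words apart."""
    return any(
        abs(i - j) <= max_distance
        for i in positions(tokens, term_a)
        for j in positions(tokens, term_b)
    )


def contains_phrase(tokens, phrase):
    """True if the words of phrase occur adjacent and in order."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


tokens = "the united states of america signed the treaty".split()
print(within(tokens, "united", "treaty", 3))
print(within(tokens, "united", "treaty", 6))
print(contains_phrase(tokens, ["united", "states", "of", "america"]))
```

"united" and "treaty" are six words apart, so the first check fails and the second succeeds; the phrase check succeeds because the four words are adjacent.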

19. Natural language queries tend to

a. Decrease recall / Improve precision

b. Improve recall / Improve precision

c. Decrease recall / Decrease precision

d. Improve recall / Decrease precision ✓

20. The capability to name a query and store it to be retrieved and executed during a later
user session is called ________

a. Natural Language Queries

b. Proximity

c. Contiguous Word Phrase

d. Canned or Stored queries ✓

21. Using the results of the previous search as a constraining list for a new query, rather than typing in a complete new query, is called _________

a. Canned or Stored queries

b. Proximity

c. Iterative Search ✓

d. Canned or Stored queries

22. Term masking is useful when applied to_________

a. words, and works for finding ranges of numbers or numeric dates

b. words, but does not work for finding ranges of numbers or numeric dates ✓

c. neither words nor ranges of numbers or numeric dates

d. finding ranges of numbers or numeric dates, but does not work for words

23. Applications of information retrieval are

a. Personal

b. Educational

c. Career

d. All of the above ✓

24. ______ is used to map between a user’s specified need and the items in the IR system that will answer that need.

a. Search Capabilities ✓

b. Browse Capabilities

c. Miscellaneous Capabilities

d. All of the Above

25. ______ is used when a search returns a Hit file containing many more items than the user wants to review

a. Item Search

b. Data Search

c. Iterative Search ✓

d. All

26. The capability to name a query and store it to be retrieved and executed during a later
user session is called________

a. User Query

b. Ask Query

c. Search Query

d. Canned Query ✓

27. Term masking is useful when applied to words, but does not work for finding ranges of numbers or numeric dates.

a. Sometimes

b. Always True ✓

c. Never True

d. None of the Above
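
To make questions 22 and 27 concrete, the sketch below applies masked patterns to a small vocabulary using Python's standard `fnmatch` module. The wildcard syntax (`*` for any run of characters) and the vocabulary are assumptions for the example, not a syntax prescribed by any particular IR system.

```python
from fnmatch import fnmatchcase


def mask_search(pattern, vocabulary):
    """Return vocabulary terms matching a masked pattern.

    "*" stands for any run of characters and "?" for exactly one,
    following fnmatch conventions (an assumed syntax for this sketch).
    """
    return [term for term in vocabulary if fnmatchcase(term, pattern)]


vocabulary = ["compute", "computer", "computing", "minicomputer", "125", "1250"]
print(mask_search("comput*", vocabulary))    # suffix masking
print(mask_search("*computer", vocabulary))  # prefix masking
print(mask_search("12*", vocabulary))        # digits matched as text only
```

The last call shows the limitation: `12*` matches both "125" and "1250" because masking compares characters, so it cannot express a numeric range such as 125 to 300.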

28. _______ restricts the distance allowed within an item between two search terms.

a. Proximity ✓

b. Term Masking

c. Boolean Logic

d. Fuzzy Searches

29. Boolean operators in a search will improve _________

a. Recall

b. Precision ✓

c. Both

d. None

30. Fuzzy searches will improve ______

a. Recall ✓

b. Precision

c. Both

d. None

31. ________ lets the user quickly focus on the potentially relevant parts of the text to scan for item relevance.

a. Highlighting ✓

b. Zoning

c. Ranking

d. All

32. WAIS stands for Wide Area Information Servers. ✓

UNIT - 2

33. __________ represents the concepts within an item to help the user find relevant information.

a. IRS

b. Indexing ✓

c. DBMS

d. Data Mining

34. The full-text searchable data structures for items in the Document File provide a new class of indexing called _____

a. full document indexing

b. total document indexing ✓

c. document indexing

d. None

35. _____ defines what level of detail the subject index will contain.

a. Scope of the indexing ✓

b. Scope of the Data Base

c. Scope of the Information

d. Scope of the Text

36. _______ is the extent to which the different concepts in the item are indexed.

a. Scope

b. Specificity

c. Exhaustivity ✓

d. All

37. ________ is the preciseness of the index terms used in indexing.

a. Scope

b. Specificity ✓

c. Exhaustivity

d. All

38. Low exhaustivity has an adverse effect on ____________

a. Precision

b. Recall

c. Both ✓

d. None

39. ______ is an attempt to place a value on the index term’s representation of its associated
concept in the document.

a. Weighted Automatic Indexing ✓

b. Unweighted Automatic Indexing

c. Both

d. None

40. The process of creating term linkages at index creation time is called ______.

a. Post Coordination

b. Pre-coordination ✓

c. Specificity

d. Exhaustivity

41. ______ provides searchers with ways of finding morphological variants of search terms.

a. Pre-coordination

b. STOP Algorithm

c. Stemming ✓

d. All

42. Stemming Improves________

a. Precision

b. Recall ✓

c. Both

d. None
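
Questions 41 and 42 can be illustrated with a deliberately simple suffix stripper. It is not the Porter or K-Stem algorithm; the suffix list, the minimum stem length and the function name are all assumptions chosen to show how morphological variants conflate to one stem, which is why stemming tends to raise recall.

```python
# Assumed suffix list, longest first; this is not the Porter or K-Stem algorithm.
SUFFIXES = ("ations", "ation", "ing", "ers", "er", "ed", "es", "s")


def toy_stem(word, min_stem=3):
    """Strip the first matching suffix if at least min_stem letters remain."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


words = ["computer", "computers", "computing", "computed", "computation"]
stems = {w: toy_stem(w) for w in words}
print(stems)
```

All five words reduce to "comput", so a query on any one of them would retrieve items indexed under the others.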

43. _______ determines a canonical set of concepts based on a test set of terms and uses them
as a basis for indexing all items.

a. Indexing by Term

b. Indexing By Concept ✓

c. Multimedia Indexing

d. All

44. ______ is coordinating terms at search time by ANDing index terms together, which only
finds indexes that have all of the search terms.

a. Post Coordination ✓

b. Pre-coordination

c. Specificity

d. Exhaustivity
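
Post-coordination as described in question 44 amounts to intersecting the item sets of the search terms at query time. The index contents and the function name in this sketch are hypothetical.

```python
# Hypothetical index: each term maps to the set of items indexed under it.
index = {
    "oil": {1, 2, 5},
    "wells": {2, 3, 5},
    "mexico": {2, 4},
}


def post_coordinate(index, terms):
    """AND the terms together at search time; an unknown term yields no items."""
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings) if postings else set()


print(post_coordinate(index, ["oil", "wells"]))
print(post_coordinate(index, ["oil", "wells", "mexico"]))
```

Adding a third term narrows the result from items 2 and 5 to item 2 alone, since only items carrying every search term are returned.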

45. Stemming causes problems for natural language processing.

a. True ✓

b. False

c. Not Determined

d. Not Related

46. ______ Stem ends with letter ________.

a. Any

b. X ✓

c. All

d. None

47. The _______ system uses the K-Stem stemming technique.

a. CONVECTIS

b. INFORMIX

c. INQUERY ✓

d. All

48. The K-Stem algorithm uses ______ data files to control and limit the stemming process.

a. 7

b. 5

c. 4

d. 6 ✓

49. ______ is the most common data structure used in both database management systems and IRS.

a. Inverted File Structure ✓

b. N-Gram Data Structure

c. PAT DS

d. All

50. _______ will provide the optimum performance in searching large databases.

a. Inversion List ✓

b. N-Grams

c. Sistring

d. Signature
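
Questions 49 and 50 refer to the inverted file structure and its inversion lists. A minimal sketch, assuming whitespace tokenization and a made-up three-item document file, builds one inversion list per word so that a term lookup does not need to scan the documents themselves.

```python
from collections import defaultdict


def build_inversion_lists(documents):
    """Map each word to the sorted list of document ids that contain it."""
    inverted = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            inverted[word].add(doc_id)
    return {word: sorted(ids) for word, ids in inverted.items()}


# Hypothetical three-item document file.
documents = {
    1: "computer bit byte",
    2: "memory byte",
    3: "computer bit memory",
}
inversion = build_inversion_lists(documents)
print(inversion["computer"])
print(inversion["byte"])
```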

51. The _______ data structure is used to retrieve information from continuous text.

a. N-Gram

b. PAT

c. Signature

d. Both a & b ✓

52. The substring in a PAT DS is ______.

a. Sstring

b. Sistring ✓

c. Substring

d. Sting-sub
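
The two structures named in questions 51 and 52 both break continuous text into substrings. The sketch below, with assumed function names, generates fixed-length n-grams of a word and the sistrings (one suffix per starting position) that a PAT structure would index. A true sistring is treated as semi-infinite; this simplified version stops at the end of the input.

```python
def ngrams(word, n):
    """Return the overlapping fixed-length n-grams of a word."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def sistrings(text):
    """Return every suffix of text, one per starting position.

    A true sistring is conceptually padded to be semi-infinite; the
    sketch simply stops at the end of the input.
    """
    return [text[i:] for i in range(len(text))]


print(ngrams("phase", 3))
print(sistrings("home"))
```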

53. Fuzzy Searches are easy to implement using PAT DS.

a. True

b. False ✓

c. Not Determined

d. Not Related

54. _______ DS eliminates the majority of items that are not related to a query.

a. Inverted File

b. N-Gram

c. PAT DS

d. Signature ✓
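
The filtering role described in question 54 can be sketched with superimposed coding: each word hashes to a few bits, an item's signature is the bitwise OR of its word signatures, and a query word whose bits are not all present rules the item out. The signature width, bits per word and the use of MD5 as the hash are assumptions of this example, not parameters from the source.

```python
import hashlib

SIGNATURE_BITS = 64   # assumed signature width
BITS_PER_WORD = 3     # assumed number of bits set per word


def word_signature(word):
    """Hash a word to a bit pattern with up to BITS_PER_WORD bits set."""
    digest = hashlib.md5(word.lower().encode("utf-8")).digest()
    signature = 0
    for byte in digest[:BITS_PER_WORD]:
        signature |= 1 << (byte % SIGNATURE_BITS)
    return signature


def item_signature(text):
    """Superimpose (bitwise OR) the signatures of all words in an item."""
    signature = 0
    for word in text.split():
        signature |= word_signature(word)
    return signature


def may_contain(item_sig, query_word):
    """False: the item definitely lacks the word. True: check the full text."""
    q = word_signature(query_word)
    return (item_sig & q) == q


sig = item_signature("computer science retrieval")
print(may_contain(sig, "computer"))
```

A `True` result can be a false hit, because different words may set the same bits, so surviving items still need a text check; a `False` result is always safe to discard.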

55. Indexing is the oldest technique for identifying the contents of an item to assist in their
retrieval.

56. In statistical techniques, the calculation of weights uses statistical information such as the frequency of occurrence of words and their distributions in the searchable database.

57. Stemming can be used to reduce the size of index files.

58. Automatic indexing is the capability to automatically determine the index terms to be
assigned to an item.

59. Sistring is a semi-infinite string.

60. Over-stemming leads to the conflated retrieval of non-relevant documents.

61. Under-stemming prevents related terms from being conflated, and relevant documents
will not be retrieved.

62. Specificity is the preciseness of the index terms used in indexing.

63. Total document indexing (uncontrolled vocabulary) is fast indexing but a difficult
search process.

64. The process of linking index terms together in a single index for a particular concept is
called Pre-Coordination and Linkages.

65. Inverted File Structure is the most common data structure used in both database and
IRS.

UNIT - 3

66. _______ is the process of analyzing an item to extract the information to be kept
permanently in an index.

a. Class Indexing

b. Automatic Indexing ✓

c. Manual Indexing

d. Any

67. _______ is used mostly in commercial systems.

a. Statistical ✓

b. Natural Language

c. Concept

d. Hypertext Linkages

68. _______ indexing stores the information that is used in calculating a probability that a
particular item satisfies a particular query.

a. Probabilistic ✓

b. Bayesian

c. Vector Space

d. Neural Net

69. _______ approaches store information used in generating a relative confidence level of an
item's relevance to a query.

a. Bayesian

b. Vector Space

c. Both ✓

d. None

70. _______ are dynamic learning structures that are discussed under concept indexing,
where they are used to determine concept classes.

a. Probabilistic

b. Bayesian

c. Vector Space

d. Neural Net ✓

71. _______ indexing uses words within an item to correlate to concepts discussed in the
item.

a. Statistical

b. Natural Language

c. Concept ✓

d. Hypertext Linkages

72. _______ approach is based upon the direct application of the theory of probability to IRS.

a. Probabilistic ✓

b. Natural Language

c. Concept

d. Hypertext Linkages

73. _______ produces efficient results when data is retrieved from multiple databases.

a. Probabilistic ✓

b. Natural Language

c. Concept

d. Hypertext Linkages

74. _______ processing is used to add semantic information in addition to statistical
information to enhance the indexing of the item.

a. Probabilistic

b. Natural Language ✓

c. Concept

d. Hypertext Linkages

75. Tagged Text Parser structure allows for identification of potential term phrases based
upon _______ identification.

a. Verb

b. Noun ✓

c. Adjective

d. All

76. _______ processing will use the DR-LINK System.

a. Probabilistic

b. Natural Language ✓

c. Concept

d. Hypertext Linkages

77. _______ system attempts to introduce a higher level of abstraction indexing on top of the
statistical processes.

a. Probabilistic

b. Natural Language ✓

c. Concept

d. Hypertext Linkages

78. _______ indexing is a statistical technique whose goal is to determine a canonical
representation of a concept.

a. Probabilistic

b. Natural Language

c. Concept ✓

d. Hypertext Linkages

79. _______ techniques have very powerful representation.

a. Binary

b. Vector ✓

c. Both

d. None

80. In _______ indexes, pages at each Internet site are indexed automatically.

a. Automatically generated ✓

b. Manually generated

c. Crawlers

d. All

81. In _______, users define search terms, and it goes to various sites searching for the
desired information.

a. Automatically generated

b. Manually generated

c. Crawlers ✓

d. All

82. _______ is an example of a crawler.

a. WebCrawler

b. OpenText

c. Pathfinder

d. All ✓

83. Term Frequency TFij is the frequency of occurrence of a term Ti in a document Dj.

84. Total Term Frequency TTFi is the frequency of occurrence of a term Ti in the entire
collection.

85. Document Frequency (DFi) is the number of unique documents in the collection that
contain a term Ti.
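
The three statistics defined in statements 83 to 85 can be computed directly from a tokenized collection. The sketch below uses a made-up three-document collection and whitespace tokenization, both of which are assumptions for illustration.

```python
from collections import Counter

# Hypothetical collection: document id -> text.
docs = {
    "D1": "oil wells in mexico oil",
    "D2": "oil refinery",
    "D3": "mexico city",
}

# TF[j][i]: occurrences of term i in document j.
tf = {doc_id: Counter(text.split()) for doc_id, text in docs.items()}

# TTF[i]: occurrences of term i in the entire collection.
ttf = sum(tf.values(), Counter())

# DF[i]: number of distinct documents that contain term i.
df = Counter(term for counts in tf.values() for term in counts)

print(tf["D1"]["oil"], ttf["oil"], df["oil"])
```

For the term "oil" this gives a term frequency of 2 in D1, a total term frequency of 3 and a document frequency of 2, which shows that TTF counts occurrences while DF counts documents.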

86. Tagged Text Parser structure allows for identification of potential term phrases based
upon Noun identification.

87. Automatic Indexing is the process of analyzing an item to extract the information to be
kept permanently in an index.

88. Manually generated (e.g., Yahoo!) pages are indexed manually into a linked hierarchy (an
“index”). Users browse in the hierarchy by following links.

89. Automatically generated (e.g., Alta Vista) pages at each Internet site are indexed
automatically (creating a “searchable data structure”).

90. Automatically generated structures are used for querying, rather than browsing.

91. Crawlers (e.g., WebCrawler) have no a priori indexing.

92. Crawlers (e.g., WebCrawler) allow users to define search terms, and the crawler goes to
various sites searching for the desired information.

93. Hypertext Linkages provide virtual threads of concepts between items versus directly
defining the concepts within an item.

94. The SMART system uses the Vector Model.
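
In a vector model such as the one named in statement 94, items and queries are represented as term-weight vectors and compared numerically. The sketch below uses cosine similarity over raw term counts, which is one common choice and an assumption here; it does not reproduce the SMART system's actual weighting scheme.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine of the angle between two sparse term-weight vectors."""
    dot = sum(weight * b.get(term, 0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Raw term counts stand in for weights here, an assumption of the sketch.
query = Counter("oil mexico".split())
doc_a = Counter("oil wells in mexico".split())
doc_b = Counter("football results".split())
print(cosine(query, doc_a))
print(cosine(query, doc_b))
```

The item sharing both query terms scores about 0.71 and the unrelated item scores 0, so ranking by this value puts the more similar item first.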
